Categories
python

Import gotchas

  • Something is wrong with your organization or how you’re importing the code.
    • You need to separate out different pieces of code but they’re interdependent
    • You need to organize code to reuse key pieces in other code with minimum reduplication of your effort
    • You need to have a single area to maintain and update key pieces of code but there’s initialization and setup code also needed
  • import gotchas – from Westra Chapter 7
    • name masking
      • If you name your package with the same name as another package, then depending on the search order the interpreter will find one or the other, and you may introduce confusing, hard-to-find bugs.
      • If you name a python script with the same name as a different module, you can also mask the name and run into confusing difficulties. If you try to import the module you can end up erroneously importing your file.
      • The fix: you need to identify and give unique names, and you can use the package hierarchy to help
    • double import problems
      • sys.path issues
        • If we add a package directory to sys.path instead of the directory containing the package, then the interpreter can import modules both directly and as part of the package. It will then initialize the module twice, which can create subtle errors.
        • The fix: sys.path should include the directory containing the package and not any of its subdirectories
      • If you have executable top-level run-once code in a script A and then write some additional functions in a script B that imports A, then A’s top-level code runs when B imports A as well as when A is run as a script, so the run-once code can end up running twice.
        • The fix: it is best to move code that you want the interpreter to run into an if __name__ == "__main__" block, which allows both
          • 1) the functions in the module to be imported without triggering the code and
          • 2) the module to be run at the command line
  • You can check out Westra’s Modular Programming book Chapter 7 for more on this, and sign up for more ideas on fixing complex bugs
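The name-masking check and the __main__ guard above can be sketched as follows. This is a minimal illustration, not code from the book: module_origin and main are hypothetical names, and json only stands in for whatever module you suspect is being masked.

```python
import importlib.util


def module_origin(name):
    """Return the file the interpreter would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def main():
    # If this prints a path inside your own project instead of the standard
    # library, a local file or package is masking the module you meant to import.
    print(module_origin("json"))


# Run-once code lives behind the guard, so importing this file has no side effects.
if __name__ == "__main__":
    main()
```

Importing this file from another script gives access to module_origin without printing anything, while running it at the command line calls main().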
Categories
blog

Setting up a blog

  • Setting Up WordPress, DNS and Email
    • You want to set up a blog and get a mailing list, and have experience with tech in various forms, such as programming skills or familiarity with Linux. But even with some tech skills there are so many options out there. How can we post content, and, yes, tweak things, while avoiding unproductive rabbit-holes of configuring servers, DNS, WordPress etc.?
    • While keeping an eye on the cost?
  • There are a number of different features that we will consider – for me these are, from most to least important:
    • Custom domain name – for professionalism
    • E-mail – actually 2 pieces
      • E-mail list system to allow people to join and build audience
      • Send and receive email from the domain name (and not e.g. our personal gmail account)
    • WordPress software – everywhere and has many plugins
    • Linux, Apache, MySQL, and PHP (LAMP) server – I don’t care about this, it is more of an implementation detail and terminology I saw in setting up Linode.
  • The basic outline is:
  • More setup details coming!
Categories
python

Modular Programming with Python – review of first 3 chapters

Review Post – Modular Python by Erik Westra, Chapters 1-3

  • Functions are one of the most powerful tools in programming. As with any powerful tool, they also raise many questions and issues. Python has a specific terminology around functions, along with how to organize them into files that define Python modules. This book teaches, firstly, Python modules as files that hold Python functions, and secondly, modular programming in general. Modular programming is organizing large programs into pieces that you can reuse as requirements change. In this post I review how the book explains Python modules from the perspective of files of Python functions, for those of us who want to learn how Python modules work.
  • Chapter 1: Introducing Modular Programming
    • The introduction has some good basic goals. However, the very first modular program uses slightly advanced features that are irrelevant for the purpose of explaining modules, for example module-level global variables and exceptions. That said, for programmers who are not beginners, this is fine.
    • The next example, though somewhat simpler, similarly glosses over concepts which Westra hasn’t really introduced yet. The example packages a group of animal modules (toy Python functions in toy Python files) into an animal package, along with its initialization file – although we are still trying to learn what a module is. Next is an example of a larger program that might tell us why and how modularization improves our codebase and workflow.
    • Westra next shows us how to implement a module involving a cache. He gives us the definition of the module-level global variable we first saw, along with some public and private module functions. Overall this first chapter doesn’t teach us anything specific, and I recommend skimming it just to get an idea of his approach.
  • Chapter 2: Writing your first Modular Program
    • This chapter involves an inventory control system with modules for a report generator, data storage, and a user interface. The code covers a number of details that help us understand the size and scope of a program that begins to benefit from a modular approach.
    • Though seeing a big program is useful, the program again goes into many technical details that aren’t relevant to the modular programming concept – Westra acknowledges this and clearly says “don’t worry too much about the details of these functions”
    • In the implementation of the main program we see how to import the modules we created – this is the important point. This too is a chapter we can skim.
  • Chapter 3: Using Modules and Packages
    • Here we begin to actually learn how to create and use a directory (called a package) of Python module files.
    • From the outset Westra discusses packages within packages – this is a potentially confusing concept that he illustrates well, early, and consistently so that we can understand the principles clearly.
    • He diagrams directory trees and how these relate to packages which makes these clear. He next discusses initialization of modules – giving specific code examples, presented well and clearly.
    • After that, Westra goes into what the import statement does – likewise, specific, succinct and thorough.
    • Relative imports are likewise complex and so a major source of confusion. But again Westra diagrams the situation helpfully and clearly. He gives straightforward diagrams, and then writes the exact code that executes the depicted relative import. This method lets us quickly understand how the relative import system works.
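To make the package and relative-import ideas from Chapter 3 concrete, here is a small self-contained sketch. It is not taken from the book: the animals package and its cat and sounds modules are hypothetical names, and the files are written to a temporary directory only so the example can run on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, created in a temporary directory:
#   animals/__init__.py
#   animals/sounds.py
#   animals/cat.py      (uses "from . import sounds")
root = Path(tempfile.mkdtemp())
pkg = root / "animals"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "sounds.py").write_text("def meow():\n    return 'meow'\n")
(pkg / "cat.py").write_text(
    "from . import sounds  # '.' means the package this module lives in\n"
    "\n"
    "def speak():\n"
    "    return sounds.meow()\n"
)

# Put the directory that CONTAINS the package on sys.path, not the package itself.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

cat = importlib.import_module("animals.cat")
print(cat.speak())
```

Because cat.py is imported as animals.cat, the relative import resolves against the animals package. Running cat.py directly as a script would fail, since a script has no parent package.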
Categories
python

While loops: 5 things to know

Number 1: what is a while loop:

Loops cause computers to repeat certain calculations in certain ways. While loops are one of the basic building blocks of programs. They can be a simple way to cause the program to repeat a particular set of instructions while a specific condition holds.

For example, we may want to repeatedly add numbers, sales or units, to find a total. While loops tell a computer to repeatedly perform such an instruction while a certain condition holds. Continuing the example, we may have a sales log file. Whenever a sale occurs, a program may add a row with the date, order number, quantity, price, and part number.

We may want to write a new program that instructs the computer to read rows of the logfile and sum up the quantity*price. The while loop would help us total the amount over the whole logfile, reading and summing while there are rows of data.

But we can do more complex tasks, such as finding sales up to Jan 1 of this year, by adding a condition to the while that the date is before Jan 1 of this year. Similarly we can sum while the subtotal is below $20,000, which would tell us how long it took and what transactions led to that amount.
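As a rough sketch of the sales-log idea, the code below sums quantity*price with a while loop, first over all rows and then only while the subtotal is below a limit. The rows in sales_log are made-up illustration data, and total_sales and sum_until are hypothetical helper names.

```python
# Hypothetical rows: (date, order number, quantity, price, part number)
sales_log = [
    ("2023-12-28", 1001, 2, 4000.0, "A-1"),
    ("2023-12-30", 1002, 1, 7000.0, "B-7"),
    ("2024-01-02", 1003, 3, 2500.0, "A-1"),
    ("2024-01-05", 1004, 1, 1000.0, "C-3"),
]


def total_sales(rows):
    """Sum quantity * price while there are rows left to read."""
    total = 0.0
    i = 0
    while i < len(rows):
        _, _, quantity, price, _ = rows[i]
        total += quantity * price
        i += 1
    return total


def sum_until(rows, limit):
    """Sum rows while the subtotal is below `limit`; return (subtotal, rows used)."""
    subtotal = 0.0
    i = 0
    while i < len(rows) and subtotal < limit:
        _, _, quantity, price, _ = rows[i]
        subtotal += quantity * price
        i += 1
    return subtotal, i


print(total_sales(sales_log))
print(sum_until(sales_log, 20000))
```

Note that sum_until stops after the row that takes the subtotal to or past the limit, so the returned subtotal can exceed the limit.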

Though powerful, we face a learning curve when starting to write while loops. Here we cover 2 main categories of issues.

While Condition

Number 2: Never stopping

The while loop repeats the instructions while a certain condition holds, and stops repeating when that condition no longer holds. So the condition in the while loop needs to correctly capture all situations when the instructions are supposed to

  • run, and
  • stop

The classic error is the runaway while loop, where the stop is never reached because i is never updated:

i = 0
while i < 10:
    print(i)
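For comparison, here is a minimal sketch of a loop that does stop, because the body updates i on every pass. The function name count_up is hypothetical, and the values are collected in a list rather than printed so the result is easy to inspect.

```python
def count_up(limit):
    """Collect 0, 1, ..., limit - 1 using a while loop that is guaranteed to stop."""
    values = []
    i = 0
    while i < limit:
        values.append(i)
        i += 1  # the update that the runaway loop is missing
    return values


print(count_up(10))
```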

Number 3: Never starting

Another classic is the loop that never runs to begin with, because the condition is already false at the start:

i = 10
while i < 10:
    print(i)

Issues with updating

Number 4: updating issues

The other key issue is that the while loop typically needs to update some aspect of the data, such as a variable or a file. Often some arithmetic or complex logic is involved. These updates lead to many issues and need to be thoroughly understood. For example,

i = 0
while i <= 10:
    print(i)
    i = i * 1.1

will never end because multiplying 0 by any number gives 0, so i never changes.
Similarly

from math import sqrt

i = 0.1
while i <= sqrt(i):
    i = sqrt(i)
    print(i)

will never terminate: although i gets larger each time through the loop, sqrt(x) < 1 whenever 0 < x < 1, so i never exceeds 1 and the condition i <= sqrt(i) keeps holding.
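One way to study updates like these without hanging the interpreter is to add a safety cap on the number of passes. The sketch below is only an illustration: run_capped and its max_steps parameter are hypothetical and are not part of the loops above.

```python
from math import sqrt


def run_capped(start, update, condition, max_steps=1000):
    """Run `i = update(i)` while `condition(i)` holds, giving up after max_steps passes."""
    i = start
    steps = 0
    while condition(i) and steps < max_steps:
        i = update(i)
        steps += 1
    return steps, i


# Starting from 0, multiplying never changes i, so only the cap stops the loop.
stuck_steps, stuck_value = run_capped(0, lambda i: i * 1.1, lambda i: i <= 10)

# Repeated square roots creep up towards 1 but never get past it.
sqrt_steps, sqrt_value = run_capped(0.1, sqrt, lambda i: i <= sqrt(i))

# Starting from 1 instead of 0, the multiplication loop stops on its own.
ok_steps, ok_value = run_capped(1, lambda i: i * 1.1, lambda i: i <= 10)

print(stuck_steps, stuck_value)
print(sqrt_steps, sqrt_value)
print(ok_steps, ok_value)
```

If steps comes back equal to max_steps, the loop was still running when the cap cut it off, which is a strong hint that the update never makes the condition false.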

Number 5: The off-by-1 error

The last is the off-by-1. Often loops have a small issue with the
update or with the while condition that causes errors that are
either 1 more or 1 less than the desired outcome in some way.
For example we may want to print 10 numbers but get 11:

i = 0
while i <= 10:
    print(i)
    i += 1

or we may get 9 numbers:

i = 1
while i < 10:
    print(i)
    i += 1

There are likewise 2 ways to get 10 numbers, only one of which might
be the desired outcome:

i = 1
while i <= 10:
    print(i)
    i += 1

or

i = 0
while i < 10:
    print(i)
    i += 1
Categories
Data Science

Predicting based on timeseries

  • Suppose we have a dataset of phone pickups and timestamps, and we want to know if the chance of pickup varies with time. A question on the Internet asks whether to use linear regression, and whether 1-hot encoding would be useful.
  • Linear regression finds the best-fit line between the input and the target variable. It works when the target varies linearly with the input. But the probability of picking up a phone call does not vary linearly with the hour of day on a 24-hour basis – 8am might have a high rate, 9am not, and 12pm again have a high rate.
  • We can think about 1-hot encoding. We have timestamps and 0/1 for pickup or no pickup. If we 1-hot encode the hour, we’ll get 24 independent variables that take on values of 0 or 1. For the regression problem we’ll be fitting 24 coefficients for these 24 variables to predict call pickup. If we fit without an intercept, each coefficient can be interpreted as the probability of pickup during the corresponding hour. Time is the independent variable, the 1-hot encoding represents the hour of that timestamp, and the prediction of the regression is just the probability of pickup based on the dataset.
  • To find out if this is a good approach, we need to think about what we’re trying to do – the objective. If the objective is to know whether the probability of pickup varies over the hours, then the variation in the coefficients tells us how the hour predicts the probability of pickup.
  • But using linear regression here is overly complex – we get the same result by grouping the data by hour and then finding the fraction of pickups over total calls.
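The equivalence can be sketched in plain Python. The calls list is made-up illustration data and both function names are hypothetical. The regression is fitted without an intercept. With 1-hot columns the normal equations decouple, so each coefficient is simply pickups divided by calls for that hour.

```python
from collections import defaultdict

# Hypothetical data: (hour of day, 1 if picked up else 0)
calls = [(8, 1), (8, 1), (8, 0), (9, 0), (9, 0), (9, 1), (9, 0), (12, 1), (12, 1)]


def pickup_rate_by_hour(records):
    """Group by hour and return the fraction of calls that were picked up."""
    totals = defaultdict(int)
    pickups = defaultdict(int)
    for hour, picked_up in records:
        totals[hour] += 1
        pickups[hour] += picked_up
    return {hour: pickups[hour] / totals[hour] for hour in totals}


def one_hot_ols(records):
    """Least-squares coefficients for 1-hot hour columns with no intercept.

    X^T X is diagonal (calls per hour) and X^T y holds pickups per hour,
    so the normal equations can be solved one column at a time.
    """
    hours = sorted({hour for hour, _ in records})
    X = [[1 if hour == h else 0 for h in hours] for hour, _ in records]
    y = [picked_up for _, picked_up in records]
    coefficients = {}
    for j, h in enumerate(hours):
        xtx = sum(row[j] * row[j] for row in X)
        xty = sum(row[j] * target for row, target in zip(X, y))
        coefficients[h] = xty / xtx
    return coefficients


print(pickup_rate_by_hour(calls))
print(one_hot_ols(calls))
```

Only hours that appear in the data get a column here. An hour with no calls has no defined pickup rate under either approach.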
Categories
debian Uncategorized

screensaver

found a nice screen blurring command that relies on ImageMagick


#!/bin/sh -e

# Take a screenshot
#scrot /tmp/screen_locked.png
import -window root /tmp/screen_locked.png

# Pixellate it 10x
mogrify -scale 10% -scale 1000% /tmp/screen_locked.png

# Lock screen displaying this image.
i3lock -i /tmp/screen_locked.png

# Turn the screen off after a delay.
sleep 60; pgrep i3lock && xset dpms force off

then we need xautolock, but there is no xautolock. so instead we try
https://github.com/fgsch/xidle

which locks only from the command line, not from the .i3/config. so we compile xautolock from source from https://github.com/l0b0/xautolock

finally we need to add this line to an X startup script like .xsessionrc or .i3/config

exec xautolock -time 1 -locker '~/.local/bin/fuzzy_lock.sh' &

Categories
debian Uncategorized

volume from mint command line

pactl set-sink-volume 0 +15%

pactl set-sink-volume 0 -5dB
Categories
Uncategorized

Excel: unmerge when unmerge is greyed out using format paintbrush

one thing that may work is to use the format paintbrush to brush in from an unmerged cell

Categories
Uncategorized

Outlook locks up, “Research Pane”

Once every couple of weeks, my Outlook 2010 freezes up.  Whatever I click, I get “Research Pane,” and nothing I type shows up.  I wondered if it was a keyboard issue, but other MS applications worked.  So after some Googling I found a solution.  Need to open VB (Alt-F11) and then the immediate window (Ctrl-G), and then type:

Application.Explorers(1).CommandBars("Research").Enabled = false

Originally found this as one of the results here:

http://superuser.com/questions/56265/stop-the-research-pane-appearing-in-microsoft-office

Categories
git

Useful git aliases

Here are some useful git configurations and aliases.  The way you make a git alias is as follows:

$ git config --global alias.co checkout
$ git config --global alias.br branch
$ git config --global alias.ci commit
$ git config --global alias.st status

You can also use git config to set other options, such as diff.tool. With the setting below, git difftool opens diffs in gvimdiff, which makes them much more visually understandable since you can scroll and see more context. The settings are listed in the key=value form that git config --list prints:

color.ui=auto
diff.tool=gvimdiff
alias.loga=log --oneline -15 --graph --decorate --all
alias.rev=remote -v
alias.lst=ls-tree --abbrev -l HEAD
core.pager=less -r