Friday, November 21, 2014

OSX Terminal: Page Up, Page Down and terminal history search

It's taken me literally years to join the dots on this one. Most tutorials cover only half of this material properly. By the end of this, page up and page down will function as they do on Linux: if you type part of a command, page up and page down will cycle through previously issued commands from the shell history.

Part One: Send the escape sequences appropriately from the terminal app
Part Two: Interpret them correctly at the other end
Part Three: Know how to use Page Up on an Apple keyboard


Part One


Part one involves opening up your Terminal preferences and finding the "Keyboard" section. At the time of writing, this is in the Profiles area, in the "Keyboard" tab.


You may or may not already have an entry for "Page Up" and "Page Down". It looks like a kind of double up arrow. If you've never been into these settings before, you shouldn't have one.

Hit the plus sign. You can now choose your key, your modifier, and your action. Page Up should be an available choice. The tricky part (which turns out not to be so tricky) is sending the right control codes. The first character is produced by hitting the escape key in the text box; it should produce something like \033. Following that (without any whitespace), type [5~. This is the control code for page up.

To add page down, the procedure is identical, except that the control code is [6~.

You have now completed part one. I chose not to use a modifier, because I don't feel the need to use page up and down for scrolling through the buffer (its normal use). You might like to use a modifier like control or apple-key.

Part Two


This is also covered by existing tutorials, but not usually properly connected to Part One. What you need to do here is "bind" the control codes to the history search action inside the shell. This is done by editing special files. The tutorial I read had these edits inside the ".profile" file in the user home directory, but I found this didn't work effectively and I had to use the ".bash_profile" file instead. You may already have some text in these files; it should be safe to just make the following additions at the end of the file.

The instructions contained here fully describe the process: http://www.macworld.com/article/1146015/termhistory.html

Basically, use the command 'nano' to edit your bash profile:
> nano ~/.bash_profile
then add the following lines:

bind '"^[[A":history-search-backward'
bind '"^[[B":history-search-forward' 

BUT! You have to do it without copy and paste, because there are special characters in there. Taken directly from the tutorial: Copy and paste the first part of each line above (bind '"), and then press Escape-V. When you do, you'll see a little tag at the bottom of the window that reads [ Verbatim Input ]. Now press the Up Arrow (or Down Arrow, depending on the line), and you'll see the above codes appear (and you'll exit Verbatim Input mode when you press the arrow key). After that, just copy and paste the rest of each line, and you're done.
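
Note that those two lines bind the up and down arrow keys. To connect this properly to Part One, you can also bind the Page Up and Page Down codes the terminal is now sending (\033[5~ and \033[6~). This is my own addition rather than part of the tutorial, and these lines can be pasted in as-is, because they use readline's \e escape instead of a literal escape character:

bind '"\e[5~": history-search-backward'
bind '"\e[6~": history-search-forward'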

Save the file by pressing Control-X (for exit), Y (for yes to save changes), and Return (to accept the filename).

Part Three

I have an external keyboard, with page up and page down keys. However, my laptop doesn't have those keys! Oh noes! 

Fortunately, you can trigger this by hitting the "fn" key and the up arrow. Phew.

Thursday, August 7, 2014

PyCon AU 2014 Writeup

I recently attended PyCon AU in Brisbane, Australia. It was an amazing conference, and I wanted to record my thoughts. I will organise this post in time order.

Videos are coming out, and will all eventually be published at https://www.youtube.com/user/PyConAU/videos.

Friday Miniconfs


The first day consisted of "miniconfs", which are independently organised, focused streams on specific topics. I attended the "Science and Data" miniconf. It is clear that this is a huge and growing component of the Python community. However, science and data still suffer from a lack of integration with the general Python community. The tools being put in place do appear to be having a transformative effect on the scientists who adopt them (notable technologies include the IPython Notebook, scipy, numpy, and efforts such as Software Carpentry). However, general best practices around software design, team workflow, testing, version control and code review have not been so enthusiastically adopted. Going the other way, data-oriented techniques and self-measurement have not been widely adopted within open source.

One of the major "new" tools is "pandas", which provides extremely strong data management for row/column-based data. This tool is a few years old, but is really coming into its own. It supports very strong indexing and data-relation methods, some basic statistical techniques for handling missing data, and basic plots. More advanced techniques and plots can be achieved with the existing Python libraries for those purposes, by fetching the pandas data structures as numpy arrays.
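
As a quick sketch of the kind of thing pandas makes easy (the column names here are invented for illustration):

import numpy as np
import pandas as pd

# row/column data with a missing value
df = pd.DataFrame({"time": [0, 1, 2, 3],
                   "dist": [1.0, np.nan, 2.5, 4.0]})

df = df.set_index("time")              # strong label-based indexing
df["dist"] = df["dist"].interpolate()  # basic missing-data handling

# hand the underlying numpy array to other scientific libraries
values = df["dist"].values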

Saturday: Main Conference Day One


The main conference was opened by a keynote from Dr James Curran, who gave an inspiring presentation discussing the new Australian national curriculum. This is to include coding from the early years through to year ten as a standard part of the education given to all Australians. This is an amazing development for software and computing, and it looks likely that Python may have a strong role to play in it.

I presented next on the topic of "Verification: Truth in Statistics". I can't give an unbiased review, but as a presenter, I felt comfortable with the quality of the presentation and I hope I gave the audience value.

I attended "Graphs, Networks and Python: The Power of Interconnection" by Lachlan Blackhall, which included an interesting presentation of applying the NetworkX library to a variety of network-based computing problems.

For those looking for a relevant introduction, "IPython parallel for distributed computing" by Nathan Faggian was a good overview.

"Record linkage: Join for real life" by Rhydwyn Mcguire gave an interesting discussion of techniques for identity matching, in this case for the purpose of cross-matching partially identified patients in the medical system to reduce errors and improve medical histories.

"The Quest for the Pocket-Sized Python" by Christopher Neugebauer was an informative refresh of my understanding on Python for developing mobile applications. Short version: still use Kivy.

Sunday: Main Conference Day Two

The keynote on day two was given by Katie Cunningham on the topic of "Accessibility: Myths and Delusions". This was a fantastically practical, interesting and well-thought-out presentation, and I highly recommend that everyone watch it. It left a strong impression on many members of the audience, as would be shown later during the sprint sessions.

"Software Carpentry in Australia: current activity and future directions" by Damien Irving further addressed many of the issues hinted at during the data and science miniconf. It covered familiar ground for me in that I am very much working at the intersection of software, systems and science anyway. One of the great tips for helping to break down some of the barriers when presenting software concepts to scientists was to work directly with existing work teams, as those scientists will be more comfortable working together where they have a good understanding of their colleagues work practises and levels of software experience. In a crowd of strangers, it can be much more confronting to talk about unfamiliar areas. It strikes me that the reverse is also probably true when talking about improving scientific and mathematical skills for developers.

"Patents and Copyright and Trademarks… Oh, why!?" by Andrea Casillas gave a very thorough and informative introductory talk on legal issues in open source IP management. She was involved with http://www.linuxdefenders.org/, a group of legal activities which protect IP for open source projects.

"PyPy.js: What? How? Why?" by Ryan Kelly was a surprisingly practical-sounding affair after you get over the initial surprise and implementing a Python interpreter in Javascript. One argument for doing this rather than a customised web browser is for uniformity of experience across browsers. If a reasonably effective Python implementation can be delivered via javascript, that could help to pave the way for more efficient solutions later.

The final talk was one of the highlights of the conference: "Serialization formats aren't toys" by Tom Eastman. It highlighted the frankly wide-open security vulnerabilities that come from integrating XML or JSON (and presumably a variety of other serialisation formats) without a high degree of awareness. Many ingesters will interpret parts of a document as executable code, and will let people execute arbitrary commands against your system if they can inject a document into it. For example, if you allow the uploading of XML or JSON, a naive implementation of reading that data can allow untrusted, arbitrary code execution. I think this left a big impression on a lot of people.
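
To illustrate the class of problem with an example of my own (using Python's pickle module rather than XML or JSON, but the failure mode is the same), deserialising untrusted input can run attacker-supplied code:

import os
import pickle

class Evil(object):
    def __reduce__(self):
        # pickle consults __reduce__ to learn how to rebuild the object;
        # returning (os.system, ("...",)) makes unpickling run that command
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Evil())
pickle.loads(payload)  # prints "pwned" -- arbitrary code ran on load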

Monday and Tuesday: Developer Sprints

One of the other conference attendees (Nick Farrell) was aware of my experience in natural language generation, and suggested I help him put together a system for providing automatic text descriptions of graphs. These descriptions can be consumed by screen readers, giving (among others) the visually impaired access to information not otherwise available to them.

Together with around eight other developers over the course of the next two days, I provided coordination and an initial design for a system which could do this. The approach taken is a combination of standard NLG design patterns (data transformation --> feature identification --> language realisation) and a selection of appropriate modern Python tools. We utilised "Jinja2", a templating language usually used for rendering dynamic web page components, for the language realisation step. This had the distinct advantage of being a familiar technology to the developers present at the sprint, and provided a ready-to-go system for text generation. I believe this approach has significant limitations around complexity which may become a problem later; however, it was an excellent choice for getting the initial prototype built quickly.
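
To give a flavour of the realisation step (a minimal sketch with invented feature names, not the actual wordgraph templates):

from jinja2 import Template

# features identified from the data, feeding the realisation step
features = {"name": "response time", "trend": "rising", "peak": 340}

template = Template(
    "The graph shows {{ name }} with a {{ trend }} trend, "
    "peaking at {{ peak }}.")

print(template.render(**features))
# -> The graph shows response time with a rising trend, peaking at 340.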

You can find the code at https://github.com/tleeuwenburg/wordgraph and the documentation at https://wordgraph.readthedocs.org/en/latest/. Wordgraph is the initial working name chosen quickly during the sprints -- it may be that a more specific name should be chosen at some point.  The documentation provides the acknowledgments for all the developers who volunteered their time over this period.

It was very exciting working so fast with an amazing group of co-contributors. We were able to complete a functional proof-of-concept in just two days, capable of producing an English-language, paragraph-length description of data sets produced by "graphite", a standard systems-metrics web application which produces time-series data. The wordgraph design is easily extensible to other formats and other kinds of description. If the system proved to be of wider use, there is a lot of room to grow. However, there is also a long way to go before the system could be said to be truly generally useful.

Concluding Remarks

This was a fantastic achievement by the organising committee, and a strong set of presentations made it highly worthwhile and valuable for anyone who might be considering attending in future. It sparked a great deal of commentary among attendees, I have a lot of ideas for the future, and I am sure my work practices will also benefit.

The conference vibe was without doubt the friendliest I have ever experienced, improving even further on previous years' commitment to openness and welcoming new people to the community. This was no doubt partially a result of the indigenous "Welcome to Country" which opened the first day, setting a tone of acceptance, welcoming and diversity for the remainder of the event. The dinners and hallway conversations were a true highlight.

I hope that anyone reading this may be encouraged to come and participate in future years. There are major parts of the conference that I haven't even mentioned yet, including the pre-conference workshops, the Django Girls event, organised icebreaker dinners and all the associated activities. It is suitable for everyone from those who have never programmed before through to experienced developers looking for highly technical content. It is a conference, as far as I am concerned, for anybody at all who is interested or even merely curious.

Finally, I would just like to extend my personal thank you to everyone that I met, talked to, ate with, drank with or coded with. I'd also like to thank those people I didn't encounter who were volunteering, presenting to others, or in any way making the event happen. PyCon AU is basically the highlight of my year from a personal and professional development perspective, and this year was no exception.

Thursday, March 6, 2014

Pushing state through the URL efficiently

I am building a web app. I want to be able to share URLs to particular things, and more specifically, to parameter-tuned views of particular things. The kind of tuning might be a database query, e.g. to restrict a date range. Or, it might be setting the X and Y axes to use for a chart.

Either way, I needed to gather the state necessary for that somehow. Doing it server-side or using session state was out of the question, since that made it hard to email a URL to a friend.

One option would be to use something like a URL shortener to store the config on the server, and share the key to that set of configuration through the URL. That would work fine, but it has a few downsides:
  (1) The state is not user-readable
  (2) You can't blow away the server data and start again without affecting URLs
  (3) Remember, cool URIs don't change

For those reasons, I thought something like json would be perfect. It's well-described, human-readable, and very standard. However, it makes your URLs look a bit ... meh. I wanted an alternative which to some degree hid what was going on, but was still reverse-engineerable.

So I hit upon encoding the data. Python 2 supports, e.g., some_string.encode('hex'). This meets some of the brief -- it happily turns a string into a hexadecimal representation which can be trivially converted back again. This can be used to encode config in a less visibly clumsy way of passing state. It just tends to be a bit long.
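
For example, at the Python 2 interactive prompt (where str.encode accepts codec names like 'hex'):

>>> 'abc'.encode('hex')
'616263'
>>> '616263'.decode('hex')
'abc'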

I then hit the tubes to see how one could more efficiently pack data. There were a lot of good answers for really long strings which provided an efficient encoding, but few examples for short strings of ascii. What people were doing, however, was minifying the json.

I ended up using the following process to achieve my goals:
  -- minify the json
  -- call base64.urlsafe_b64encode(minified.encode('bz2'))

This first packs the json down into an efficient number of ascii characters, then applies a bz2 compression technique, and then packs that into a url-safe parameter which can be easily interpreted by the server (or anyone else). It also puts the JSON config data into a fairly safe packet. There's not a lot of risk on the server-side that poor decoding of the url component will result in some kind of security exception.
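
Putting that together (a minimal sketch using only the standard library; the separators argument to json.dumps does the minifying, and bz2/base64 handle the rest):

import base64
import bz2
import json

config = {"title": "Thunderstorm track error", "x_axis": "time"}

# minify: no whitespace after separators
minified = json.dumps(config, separators=(',', ':'))

# compress, then make the result URL-safe
token = base64.urlsafe_b64encode(bz2.compress(minified.encode('utf-8')))

# the server reverses each step to recover the config
recovered = json.loads(bz2.decompress(base64.urlsafe_b64decode(token)))
assert recovered == config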

So, how does it perform? Well, here is the non-minified json snippet:

{
    "title": "Thunderstorm track error",
    "x_axis": "time",
    "x_labels": "time_labels",
    "y_axis": "dist",
    "y_labels": "dist_labels",
    "series_one": "blah"
}

The input json snippet was 177 characters long.
The minified json was 140 characters long.
The bz2 data was 126 'characters' long.
The base-64 url encoding was 168 characters long.

For larger json files, the saving from minifying is even greater. For much larger json files, I would expect the saving from the bz2 compression to be a much higher proportion as well.

The final url string was slightly shorter than the original string. It's not a big saving, but at least it's not larger. By contrast, if I just hex encode the minified string, the length is 280 characters. Each step of the process is important to keeping the shared string as short as possible while still keeping a sensible transport format.

I'd be curious whether anyone else has done any work on passing shortish ASCII configuration strings via URL parameters.

Tuesday, February 4, 2014

Help required: Python importing (2.7 but could be convinced to change)

I am writing an application which also incorporates the ability for users to run -- and test -- their own custom sub-projects. These sub-projects are called "experiments". The structure on disk is as follows:

/tld
   -- application/
          __init__.py
          foo.py
          bar.py
          tests/
              test1.py
              test2.py
   -- data/
          experiments/
              first_experiment/
                  incoming_data/
                      foo.csv
                      bar.csv
                  scripts/
                      get_foo.py
                      make_bar.py
                  tests/
                      test_get_foo.py
                      test_get_bar.py

At the moment, test_get_foo.py includes sys.path.append('../scripts') in order to find the scripts which are being tested.

If I run py.test (or python -m unittest) from the data/experiments/first_experiment/tests/ directory, that's fine. If I run the application tests from the tld, that is fine. But I can't run the experiments tests from the tld using, e.g., py.test or python -m unittest discover. I'm in module importing hell, because '../scripts' is evaluated relative to the executing directory (tld), not relative to the test file's directory.
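
For what it's worth, a sketch of one workaround (not necessarily the right answer) is to anchor the path to the test file itself rather than to the working directory:

import os
import sys

# __file__ is the test module's own location, so this works from any CWD
HERE = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(HERE, '..', 'scripts'))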

What is the right thing to do here?


Saturday, December 15, 2012

Please help support the lamp with a LAMP stack... and here's why

I'm about to engage in blatant fanboi marketing, so if you don't want to experience that, stop reading now.

"The Light by MooresCloud" is the name of an amazing product. It's a computer, inside a lamp. The lamp is attractive, and would be worthy of the $100 price even if it just sat there like a rock making your room light up.

But what's truly amazing about it goes far deeper...

Perhaps you have heard about the "Internet of Things". This refers to the idea that everyday appliances will be internet connected. We are starting down that path already. Our phones are internet connected, and they became computers almost overnight. Now they are channels and platforms, delivering not just phone calls, but text messages, emails, movies, web pages, notifications, shopping transactions and limitless other information exchanges. Our televisions are going the same way -- they don't just suck down sound and images from the sky any more. They give us internet TV, apps and more.

This is only the beginning.

Software freedom true believers and bleeding-heart optimists will know that the beating heart of the internet is software built by volunteers, for free, for the love of the game. People who cared sat down, figured out how to make a million computers talk to each other efficiently and at great distances, and then just gave it all away. They mostly had day jobs, because creating the internet out of nothing didn't earn them a paycheck. It was an essentially creative exercise, a solution to an out-of-context problem which nobody knew existed. They probably didn't even know what they were building.

Nobody really wants their bedside lamp to do all of these things.  At least, not exactly. But it could certainly do with some upgrades. Like, maybe it could turn on in the morning automatically when the alarm clock goes off, so you don't have to fumble for the switch... and maybe some more...

This is the Internet of Things. Not powerful phones, or powerful televisions, delivering the same content. Rather, it is the seamless and intelligent integration of tiny appliances, operating in concert based on our intentions. For example, it's 2am. You bump your lamp on. Its onboard computer notifies the Philips Hue LED lamp down the corridor to the bathroom. They both recognise the 2am timestamp, and light dimly rather than blazing 60 watts straight into your sleepy eyes.

The Cathedral and the Bazaar is a seminal work on the economics of open source software. It discusses the traditional, capitalist, business-based model of invention and monetary return. It accepts that by creating intellectual property, protecting it, and extracting a return, one can make invention profitable. But it also outlines another approach. Not all work is profitable. Some work is done simply to address costs. For example, if you are in the business of selling fishing lines, you don't care much about phones. You'll pay to get a better one, but you don't mind whether that improvement goes only to you, or to everyone at the same time. Imagine a world where every time you paid for something, *everyone in the world* got the benefit. That's open source. Imagine if every time you paid to get a software bug fixed, it got fixed for everyone. And imagine if, every time, anywhere in the world, someone else paid to fix a software bug, your world got automatically better, for free. That's the key. Imagine if you could concentrate on the business you were really in, while everything else just got better for free.

Moore's Cloud have done something amazing. They will sell you a light (well, reward you with one at the kickstarter stage). But they will give you everything else for free. Including instructions for building your own light. The software. Oh, and their business model. You can simply download their financial documents and business plan. Just like that. Why? Because they don't care about that. They believe they can do a better job of developing the leading edge than anyone else, and that open developments will drive out closed developments in the short and long run. Nobody can steal their ideas because everybody can have them for free.

So, how does the rubber hit the road? Open source software is still largely a volunteer exercise, although major corporations invest in it for precisely the reasons outlined in The Cathedral and the Bazaar. Google doesn't want to own your web browser and compete against Microsoft. They want to own your search results, and make browser competition irrelevant. Which they pretty much have. Many pieces of software cost money, representing substantial intellectual property and value, and kudos to their inventors. But just as many are free, getting quietly and continually better for free, like a rising tide lifting all boats.

Moore's Cloud live at the intersection of the Open Source movement, the modern startup innovation culture, a commercial business and the obvious strategic trend toward an Internet of Things. Like the early internet pioneers, those people participating in this space are solving an out-of-context problem for the 99%. In twenty years, when the world around us is profoundly inter-connected, and this profound interconnection becomes the environment in which we live, this movement will seem every bit as profound as any other major innovation in our built environment.

Building the internet and building open-source software takes trust, commitment and skill. It takes people working together at a distance, with little direct obligation. It takes time and it takes money. It takes donations. It requires a business model which will allow the makers and dreamers to try, fail and succeed. It needs your help. For the price of any other piece of quality industrial design, why not also take part in the revolution?

Check out their kickstarter pitch. Let them tell you their story in their own words. Here's the trick: if they don't reach their funding goal, backing them costs you nothing. You can help with as little as a $1.00 contribution. For $100, one of the lights can be yours, and you can own a part of history. And get a bedside lamp to be proud of.

http://www.kickstarter.com/projects/cloudlight/light-1/ 


Footnotes:
  -- This post was made without consultation with the team behind Moore's Cloud
  -- I'm definitely not making any money out of this. I've backed them, but I have no vested interest.
  -- I've probably made lots of mistakes. This is a blog post on the internet, get over it. I did it in a rush.
  -- That said, I'll make any and all corrections required / desired

Thursday, December 13, 2012

[solved] LG LM7600 Wifi Connection Password not accepted

Hi all,

Some breadcrumbs for anyone else experiencing this problem.

SYMPTOM:
   The LG LM7600 will not connect to the wireless network. It appears not to accept your wireless password, but you're sure it's correct.

PROBLEM:
   Your password may have spaces in it. The LG LM7600 is too stupid to recognise a password with a space in it.

SOLUTION:
   Change your wireless password to not have any spaces in it.