Friday, November 13, 2009

Resolver Spreadsheet = Awesome (ish)

Here are some reasons why Resolver Spreadsheet is awesome:

1. It produces Python code. There's no particular reason I couldn't effectively cut and paste something from the spreadsheet into my application as an algorithm. Conversely, there's no reason I couldn't "prove" an algorithm to a co-worker or supervisor by setting up some sample runs and graphing them.

2. It produces Python code. You can edit it to change the semantics. If you like coding, you can do it!

Things that are not as awesome:

1. It only runs on Windows.

2. There's a 31-day evaluation period. I understand why, but I still find evaluation periods annoying and too short. If I could have a 6-month evaluation period, I'd be happy.

Thursday, September 24, 2009

py3k fail

Okay, it's not py3k that's broken. It's just that I can't load jpegs (I think) because I can't use PIL to create a TkSomething graphics canvas (at least, that's how it goes in the tutorial). Which means I can't try out what I wanted to try out using py3k. Hum.

So I've reconfigured the project to Py26. It's a bit disappointing not to be able to use the latest version of my favourite language, but at least I can use the most exciting language features in Py26.
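
For the record, here's roughly the kind of thing that works under Py26. This is a minimal sketch, assuming PIL is installed and that there's a 'photo.jpg' lying around:

# Python 2.6: display a jpg in a Tkinter window via PIL's ImageTk.
import Tkinter
import Image, ImageTk  # PIL's modules -- these won't import under py3k

root = Tkinter.Tk()
img = ImageTk.PhotoImage(Image.open('photo.jpg'))
label = Tkinter.Label(root, image=img)
label.pack()
root.mainloop()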

-T

Thursday, August 27, 2009

Really Powerful AI: how close is YOUR system?

I read a lot about AI. Many people think AI is, one day, possibly soon, going to achieve a level of competency such that it will revolutionise the planet and destabilise humanity as the only intelligent force. Some people point to a number of identifiable trends in computing, usually trends in computing power, and claim this indicates a progression towards that stage.

Some such people are crackpots, while others are insightful individuals with the knowledge and experience to know what they're talking about -- certainly well enough to give a valid and sensible opinion.

However, I suspect that many people aren't directly working on a system which demonstrates such capability or promise.

So here's my question to anyone working in AI (or any related area). How close is the system that you personally work on, right now, to demonstrating the kind of ability which would contribute significantly towards an independent, thinking machine?

Cheers,
-T

Friday, August 21, 2009

A few days of WingIDE; a few years of Eclipse

I thought I would just quickly blog about my experience giving Wing IDE a fair trial, using it exclusively to fix one particular bug in my system. The short version: it's okay. It needs a visual diff.

Okay, so now for the long version. My application is a hybrid Python/C application, with my particular turf being almost exclusively in the Python code. The team has basically adopted Eclipse + PyDev as a kind of default environment, but it's not mandated. Eclipse works, there's no doubt about it. It supports pretty much the workflow I want, does a reasonable job of editing, and a reasonable job of communicating with our SVN repository. It's flexible, so I can do C work in it, hook into our ant targets, change the repository and manage multiple repositories. It's popular, so there is a wealth of information on teh internets. On the downside, I find that it's slow, the code folding is flaky (try folding a file, then editing just one method), and it's prone to crashes.

I've long known that Wing IDE exists, and for extra kudos it's written in Python. It seems to be a genuine competitor, and I thought I'd take it for a spin.

The first thing I had to get used to was how it treats projects. It's probably better actually, but different. It creates project files, in which it stores (apparently) a list of files and directories which are included in the project, plus (presumably) various project settings and configuration files. However, it doesn't actually make your directories for you. You have to do that yourself, either during 'save as', or externally and add them to the project. Well, okay, I'll do that. But what's weird is that there's no capacity (that I could find) to create a project from a code repository. Checking the code out is legwork you have to do yourself, before you create the Wing IDE project. Well, okay...

So I did that. Once the project is created, and the directory added, you can then enable version control for the project and then you finally get some GUI help with your repository. However, this is basically where Wing IDE starts to fall down, and unfortunately it's one of the things I ran into first. So, I'll hold off on criticizing this functionality while I talk about the good points.

It's quick and responsive -- much nicer than Eclipse for doing the job of typing out code. Wing IDE is definitely a nicer editing experience.

Its folding is mostly better. It doesn't randomly unfold while you're typing, which is a big win. Unfortunately (for my tastes) it folds all the methods directly next to each other, so there's no whitespace in between. I don't know if I like that, and I'm definitely not used to it. I prefer a gap so I can start typing in between two methods if I want to. Still, overall it's a win over Eclipse.

Its two-pane view is a little better, and again way faster. Eclipse throws a shoe sometimes when you have the same file open in two panes. It would be neat if I could flow one file into a kind of two-column layout with integrated scrolling, but Eclipse doesn't do that either.

Its search is better and easier to use.

In short, if it weren't for the code repository issues, to which I will now return, I'd much prefer Wing IDE. So now for the show-stoppers.

In Eclipse, as soon as I change a file, the project browser clearly marks that I have diverged from the latest repository version. Wing doesn't.

In Eclipse, there is a *great* visual diff tool. Wing IDE will just show you the text diff file.

Eclipse has a built-in repository browser. Wing doesn't.

Anyway, I've run out of time to write down any more thoughts, but that was my experience trying out Wing. It needs to be more capable with code repositories before it's functionally at the same level as Eclipse. When it gets there, its speed will put it in first place (for me).

-T

Tuesday, July 28, 2009

I know I should know how to do this, but...

I'm convinced I have read about why not to do what I'm in the middle of doing. It even feels like a bad idea, but I'm having a brain failure. I'm refactoring. I have a bunch of methods which basically take a list of, say, 20 similar objects and merge the similar ones together into new objects. Imagine having a list of 20 numbers. Any numbers within 1 of each other are 'similar', and should be removed from the list, with their average added back in. After the process, you might end up with, say, 4 'representative' numbers.
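
To make the numbers example concrete, here's a minimal sketch of that process (names are mine, not from the real codebase):

def merge_similar(numbers, threshold=1.0):
    # Repeatedly pull out a pair within 'threshold' of each other and
    # add back their average, until no similar pair remains.
    result = list(numbers)
    merged = True
    while merged:
        merged = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                if abs(result[i] - result[j]) <= threshold:
                    a, b = result[i], result[j]
                    result.remove(a)
                    result.remove(b)
                    result.append((a + b) / 2.0)
                    merged = True
                    break
            if merged:
                break
    return result

# e.g. merge_similar([1, 1.5, 10, 10.8, 20]) -> [20, 1.25, 10.4]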

Okay, now I'm not dealing with numbers, but with complex objects, but the principle is the same. Rather than having like 6 different methods doing basically the same thing, I thought I would implement just *one* mergeList method which would take a pairwise mergeMethod which would then get applied to each pair. Okay, great. So I did that, but now I've realised that I'm going to end up with like 6 different pairwise mergeMethods, when really most of *those* are still pretty similar. In fact, some of them are already in my codebase and are written such that they take a modal switch as an argument.

So now I'm in this situation where I have a generic merge process, but I need to pass modal arguments down to the pairwise merge method. Now for the evil bit.

So, there's this thing called **kwargs. I can write my generic merge process such that it will accept arbitrary keyword arguments, which it can then pass directly to the pairwise merge methods. I could then call my generic method with a pairwise mergeMethod and additional arguments to be passed to that mergeMethod.
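
In code, the idea looks something like this (all the names here are invented for illustration -- the real methods merge complex objects, not numbers):

def merge_list(items, merge_pair, **kwargs):
    # Generic merge process. Any extra keyword arguments are passed
    # straight through, untouched, to the pairwise merge method.
    result = list(items)
    merged = True
    while merged:
        merged = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                combined = merge_pair(result[i], result[j], **kwargs)
                if combined is not None:
                    del result[j]  # delete the later index first
                    del result[i]
                    result.append(combined)
                    merged = True
                    break
            if merged:
                break
    return result

def merge_numbers(a, b, threshold=1.0, mode='mean'):
    # A pairwise merge method with a modal switch. Returns None when
    # the pair isn't similar enough to merge.
    if abs(a - b) > threshold:
        return None
    return (a + b) / 2.0 if mode == 'mean' else max(a, b)

# The modal arguments ride along via **kwargs:
# merge_list([1, 1.5, 10, 10.8, 20], merge_numbers, mode='mean')

The obvious cost is that merge_list's signature no longer documents which arguments are legal -- that's hidden inside whichever pairwise method you pass.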

Is that evil? Should I be using inheritance instead of modal arguments?

Also, in self-defence, a lot of the constraints here come from dealing with a mature codebase. I'm just trying to work out where I should decide to draw the line and get things Done.

Cheers,
-T

Friday, July 3, 2009

HELP: How do you do Test Driven Design and Prototyping?

Here's where I fall over with TDD. Let's imagine a standard day in my life...

I have some programming problem. I need to build a Thingy to do Stuff. I don't already have anything that does something similar, so I sit down and think about the problem. Along the way, I figure out some approaches to the problem. I don't really believe in BDUF, so mostly I'll just start coding. This kind of exploration is what helps me think, and so I'll build 2 or 3 partial programs before I start to converge on something approaching a solution. Let's dot-point the process so far:

* Problem. Solution?
* Analyse
* Maybe scrawl out a flowchart
* Write a program that for some simple input, generates something like the right output
* Gather up more input data sets, and pump them through the program, extending and fixing as I go
* Reach workable solution

Okay, now a few background points. This isn't how I'd approach a big, team project. But it's how I approach anything I have to solve by myself. I can't just navel-gaze and come up with a great program design, and if we're being honest, I'll bet you can't either. To reach a decent design, I basically need to build 2 or 3 mediocre attempts first.

Now, as far as I understand it, TDD goes hand in hand with unit testing, which is all about small, well-tested, re-usable components. Well, that's great if your fundamental starting point as a designer/developer is the component. But really, it's not. Your starting point is the problem, and the process is one of decomposition and analysis.

Some problems lend themselves to an easy decomposition. A problem which lends itself to a decomposition will immediately make you think "hey, I know how to solve this. If only I had a sorter, a comparison algorithm, some kind of message generator and an input parser, this would be a cakewalk!". That kind of problem isn't so hard, and is made out of nice, well-defined objects whose roles are well understood.

Other problems make you think "uh-oh. This one's going to take some coffee, a whiteboard and a fair bit of muttering." Some part of me thinks that the better and more experienced you get, the more new problems should tend to fall into the first category, but in fact I just tend to get given harder and harder problems (or so I think!)

So this is a question to TDD experts. What is the design process that should be followed when confronted with a new problem?

Cheers,
-Tennessee

Tuesday, June 30, 2009

Call for interest: should I run a python AI competition?

This is just a shoutout to see whether there would be any interest in my running a Python AI competition. There are a few Pythoneers who are into AI that I know of, and it occurred to me that one thing which could be done to serve the community would be to offer a competition.

At this stage, I'm thinking either of challenges which relate to basic AI algorithms, or perhaps building a chatbot/twitterbot. The first challenge should probably have a low barrier to entry, so perhaps I will put together a multi-stage or multi-challenge competition so that people can choose their own level of competition.

Leave a comment with (+0) if you think it would be neat, (+1) if you would take part, (-0) if you're not really in favour, or (-1) if you would sabotage the competition :)

Cheers,
-T

Thursday, June 18, 2009

Python, running in a browser

http://lackingrhoticity.blogspot.com/2009/06/python-standard-library-in-native.html

I have a dream: one day, I will be able to not know Javascript. Thanks Mark!

-T

Wednesday, June 17, 2009

Driving unit test takeup with code coverage

For anyone who has not fully gotten organised with unit testing and code coverage, this is for you! :) My project involves a large inherited codebase which has good black-box testing, but little unit testing and no coverage metrics. Tackling unit testing always seemed impossible to me -- how do you take 96,000 lines of Python and 1,000,000 lines of C, and build unit tests? (n.b. the lines-of-C count seems high, but it's what 'wc' said.)

The general advice is not to try -- but to get traction by just writing one unit test for the next bit of new code you write. Then the next bit, and so on. Eventually, you will have figured out unit testing and can then make an appropriate judgment about what to do with the body of untested code.

I have typically found this to be quite a large hill to climb. I work only on a subset of the code, which practically requires me to invoke most of the code just to get to my part. Most of my methods require such a lot of setup that it seemed quite infeasible to tackle unit testing without doing some more thinking about how to do this in a sane way. Setting up and tearing down my application was just not going to be feasible if I were going to put a lot of unit tests in place -- I reckon the setup would have cost between 1 and 7 minutes per test!

This got relegated to the too-hard basket, and set aside until now. Here's how I found a way to get traction.

What turns out to be pretty straightforward is integrating coverage testing. You more-or-less just switch it on, and it will record to a file across multiple program invocations. This can be used to count coverage across test scripts, user testing, development-mode use, or indeed in operations.
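
For the curious, the hook-in is about this much code. A minimal sketch, assuming Ned Batchelder's coverage.py (the 3.x API) and a hypothetical entry point into the real application:

import coverage

# auto_data=True makes coverage accumulate results into its data
# file (.coverage) across multiple program invocations.
cov = coverage.coverage(auto_data=True)
cov.start()

run_my_application()  # hypothetical -- whatever launches the real app

cov.stop()
cov.save()
cov.report()  # print a per-file summary of statements covered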

I ran this through about half my black-box tests, and found I was sitting at around 62% code coverage. That's not too shabby, I reckon! I know for a fact that there are quite large chunks of code which contain functionality which is not part of any operational code path, but is preserved for possible future needs. I estimate 25% of our code falls into that category; excluding it from the count lifts the effective coverage to around 83% (0.62 / 0.75) for the sub-area of code which I work on. Now I've got numbers that look like a challenge, rather than an insoluble problem!

I think that's the key... make lifting code metrics an achievable challenge, and then it will seem more attractive. It's probably important not to target a particular number. I know '100% or bust' may be what some advocates would put forward, but for anyone new to the concept, I personally feel that simply measuring the current coverage, then understanding where that number comes from and what it means, is the more important achievement.

What is clear is that I'm not going to easily lift my coverage metrics beyond a certain point simply through tactical black-box testing. I'm going to have to write tests which go through the code paths which aren't part of the operational configuration. I'm going to have to write tests with very specific setup conditions in order to get to lines of code which are designed to handle very specific conditions. All of a sudden, I've got achievable and well-defined goals for unit testing.

I call that a win!

Cheers,
-T

Anyone been to an Australasian Joint Conference on Artificial Intelligence?

Dear lazyweb,

Has anyone been to one of these conferences in the past? Did you find it valuable as an attendee? Please feel free to email me directly at tleeuwenburg@gmail.com if you would prefer not to leave your comments publicly.

Cheers,
-T

Thursday, June 11, 2009

The Pinball Principle

I read a fair bit about different development methods. Agile is pretty popular amongst developer advocates, but I would argue almost everyone in charge of development *actually* just runs with a grab-bag of "what works for them" taken from Waterfall, Agile, what their peers think, what happened on the last project etc. They'd be mad not to -- can you imagine running development based on what doesn't work for you?

Some people talk about bottom-up vs top-down, inversion of control, directed vs undirected, the importance of use cases, test-driven development, etc. Usually, the argument is that if you tick whatever list is being advocated, better software will get written, in a reasonable amount of time, delivering what was ordered. But, somehow, it never seems to work out that way.

There is a huge complicating factor, which is that most development activity is done due to an obligation owed by an employee to an employer, client or other stakeholder. The developer ends up responsible not just for managing their own development practises, but for managing their clients through the process as well. If there are just a few stakeholders, that's easier (unless they're very difficult stakeholders). Issues come up when there are multiple stakeholders, and when there is conflict between those stakeholders. The best possible course of action ceases to be "build and deliver approximately X" and becomes "make all these people happy to the extent possible".

How does making people happy fit in with SDLC practises? Where do you stand firm, and where are you flexible? When do you tell your manager "we shouldn't do that" and when do you just fit in? How do you sell a use-case-driven Agile practise to a control-oriented manager? What are the ethics of accepting your manager's control-oriented activities if you think things could be done better another way? When do you take a decision on your own shoulders? How do you actually adopt new design practises? How do they fit into the development team and the organisational culture?

Most people in an organisation run according to what I call the 'pinball principle', which is to say they get batted around by their key stakeholders, bouncing off obstacles, temporary victories, other groups etc like the objects in a pinball game. To some extent you can control your own fate and direction, but there are a *lot* of things which you can't control.

In a large development team, a kind of 'surface' will form around the team, representing that team's culture, which can protect those inside of it. However, some people either don't have a protective surface, or they are right on the edge of it. In these cases, you can't simply adopt a practise and hope it will result in better outcomes, you need to use personal insight to negotiate the situation you find yourself in, and no amount of 'best practise' will prevent your key stakeholders looking you in the eye and demanding action, their way.

The question is whether there is anything we can learn from the pinball principle. If our environment is so complex and demanding, is there anything about that which should inform our behaviour? Should we acquiesce to all incoming requests, devolving responsibility and concentrating instead on doing the best job we can? Or will something like the Agile methodology allow us to write great componentry which will then turn out to be just the thing for the next request? Should we attempt to educate stakeholders in appropriate negotiation processes, or should we instead rely on meetings and face-to-face conversations to take care of things? What is the cost of attempting to enforce process?

If project direction is likely to be set by pinball mechanics, how can we best write software knowing that will happen?

I don't know the answers -- not all of them anyway. But here is *my* list of things which I think will help keep the development *direction* on-track, and best serve long-term development success. I have just come up with them now, but they are based on ideas which I have been refining for some time now. I'd love feedback on the list and this post!

1. Get all of the stakeholders into the same room at least once a month. Make them review their priorities in front of everyone else, and get agreement on the priorities for the development team.
2. Maintain a whiteboard or wiki with the top four development priorities. Focus prioritisation on the top four issues.
3. Email out a bulletin every fortnight without fail to all stakeholders and users who are interested, including the current priority list and progress achieved.
4. Hold 'air your grievances' sessions regularly (say twice a year), especially if there is a layer of bureaucracy between you and your end users. Don't minute them.
5. Identify what makes your stakeholders actually happy (i.e. fewer user complaints, positive feedback from other departments, etc.) rather than what they say they want (more progress for less money). You will only make other people happy by reducing the pressure they get from the pinball board. That's only sometimes the same as what they're asking you to do.
6. Pursue a two-layer development methodology. Make the bottom layer whatever you like (Agile, TDD, Waterfall), and make sure the developers understand it. Keep on top of new ideas here. Make the top layer As Simple As Humanly Possible. This is your 'interface' process, and keeping it simple is the only way that others will follow it. If you can't explain it at the beginning of every meeting in two slides, then it's too complicated.
7. Document Everything.
8. Publishing = profile = perception. Make sure you nail the following items: project web page, prototype web page, email bulletins, conference/journal publications (if relevant), regular meetings/presentations

Regards,
-Tennessee

Friday, May 29, 2009

Multi-party videoconferencing

Is it practical yet to have multi-party videoconferences? What free software should I use? What will work for both Windows and Linux?

Tuesday, May 19, 2009

Cobra -- Next Big Thing (in a few years time)

Thanks to Simon Wittber for alerting me to this language via his blog post, "Cobra vs Python".

The Cobra homepage is available here.
The Cobra v Python overview is available here.

I'm going to make the ludicrously early call that this language will be the next big thing, eventually. I will not be moving away from Python any time soon, since (a) my job depends on it, and (b) it's fully operational now. However, I really do think that Cobra is, if not simply 'better', an important step forward in the evolution of dynamic languages generally. Watching Cobra develop will provide insights for all languages, I have no doubt!

It has the features that make Python great, plus important features from other languages which will make it even more popular and even more palatable. It allows the kind of productivity you can only get from being able to 'throw something together' using elegant syntax and high-level semantics with dynamic typing. But it *also* allows optional static typing, so you can get the most out of compile-time checking: early warning of type errors, and an extra layer of guarantee of program functionality. It also includes contracts which, while they could in general terms be implemented in any language using a lot of assert statements, are supported by Cobra with a neat syntax which will encourage their use. I can well imagine using this as a strong argument in its favour -- developers could prototype a system or server using dynamic typing, then go back and tighten up the screws with static typing as appropriate, plus the introduction of contracts.
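
For Pythoneers who haven't met contracts: here's the assert-statement flavour I'm alluding to, as a rough sketch (the decorator is my own invention, not Cobra's syntax):

def with_contract(requires, ensures):
    # Poor man's design-by-contract: wrap a function in precondition
    # and postcondition assertions.
    def decorator(func):
        def wrapper(*args, **kwargs):
            assert requires(*args, **kwargs), "precondition failed"
            result = func(*args, **kwargs)
            assert ensures(result), "postcondition failed"
            return result
        return wrapper
    return decorator

@with_contract(requires=lambda x: x >= 0,
               ensures=lambda r: r >= 0)
def square_root(x):
    return x ** 0.5

Cobra builds this in with dedicated syntax, which is exactly why I think people will actually use it there.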

It deploys to all major platforms, including .NET.

For me personally, the best features are:
* Decimal arithmetic by default (i.e. a fractional literal like 5.1 becomes a decimal, not a float)
* Optional static typing
* Contracts
* That all the best features (for me) of Python are included

Cheers,
-Tennessee

Thursday, May 14, 2009

Two big thumbs up for OpenGoo

http://www.opengoo.org/

It's like Google Docs on steroids, but open-source and installable on your intranet right now.

I've only spent about 3 minutes exploring this software, but it appears to be absolutely amazing.

Two thumbs up, OpenGoo!!!!

Monday, April 27, 2009

Pro bono activities

Do any IT firms have explicit pro bono programs or open-source contribution programs which might mirror the pro bono activities of legal firms? The legal profession obviously has a long-standing culture of engaging in pro bono activities with the imprimatur of the firm.

A few IT companies obviously house and/or champion open-source projects. However, I'm not sure how these are really structured. Does 'the system' adequately encourage and/or support these activities (beyond simple competitive self-interest, or enlightened corporate social responsibility)? Are there tax breaks for companies engaging in open-source or pro bono activities such as ethical or charitable software development?

I just don't know. Perhaps the answer is very well-known, but I didn't turn up much in 20 minutes of looking. I just felt like blogging my curiosity on the topic...

-T

Thursday, April 9, 2009

Ramblings on AI

As some know, I work on a natural language generation system for weather forecast reporting. I also have an interest in general AI. This post represents little more than the idle thinkings from my lunch break...

The ultimate question of AI is, of course, can we build a machine that thinks (for various definitions of thinks)? One response to that question by anyone interested in programming is to imagine how such a thing might be constructed.

I have presented on the topic of the structure of an AI agent before. I have been trying to flesh out some aspects of what many might call the most fundamental component, the reasoning and learning engine. (As I write that, it occurs to me that these could of course be two separate systems, but I will continue as though they were one.) By learning here, I don't just mean laying down and accessing memories, but rather the act of mapping a new situation onto an existing conceptual framework, either by creating new conceptual substructures or by recognising the applicability of an existing one.

Along the way, a number of people have offered their insights as to what may be the best way to perform reasoning and learning, but I'm not convinced they have solved the issue of conceptual structure yet.

I have come to think that any AI system will need, in addition to generalised learning tools, a number of pre-written or pre-constructed concepts and processes for understanding information. An example of a conceptual construct may be a Bayesian Network. The agent might go -- oh hey look, here's a situation with a bunch of discrete inputs which I can recognise, and a few output states! Great, I can map this new situation onto a Bayesian Net and learn some appropriate responses.

Going further, the system may be able to recognise a new situation as being related to a known situation. Oh great -- this situation is really a variation on the Poker Card Game! I'll copy that network and get a head start on my learning. I'll just set these probabilities to .5 since they're unknowns and away I go.

However, what's not clear to me is how an agent could possibly go about choosing the appropriate input schema, come up with its own output states, or infer network structure. This problem extends to many kinds of reasoning engine -- rule-based systems, ANNs and others.

Clearly, there is no One True Network to rule them all. I don't know whether anyone can conceive of any structure which is inherently able to perform all thinking. The human brain doesn't appear to be built that way either. To my understanding, it is born with some inherent structures plus the ability to learn. It demonstrates remarkable plasticity and regenerative capacity, but it's still the case that there are certain physical areas which strongly tend to be responsible for particular kinds of thinking.

It is also probably true that no-one is ever going to have the kind of direct insight to 'just come up with' a fully generalisable reasoning engine capable of learning how to deal with any situation.

However, it does seem to me to be possible to proceed along the following path:
* Identify some situations
* Write conceptual structures which can reason about those situations
* Write additional software which attempts to map new situations onto existing situations
* Write software which is capable of evolving new conceptual structures to some extent

It seems as though learning new things needs a structure to cling to -- like evolution. It's very difficult to cross certain functional divides solely through a process of evolution. But unlike organisms, which live, reproduce and die, conceptual structures can do more than that. We can use our insight as humans to build more advanced structures more quickly than they might arise through chance. If we see the mind itself as consisting of a low-level, always-on processing algorithm in which a multitude of conceptual structures exist, I think that could help. We should be able to give any AI a head-start by building some specific conceptual structures while still allowing others to evolve and grow.

...

well, I'll think more about that later.
-T

Friday, April 3, 2009

Call a function recursively

At the risk of public embarrassment, I didn't actually know I could do this:

>>> def recursive(list):
...     if len(list) == 1:
...         print list[0]
...     else:
...         recursive([list[0]])
...         recursive(list[1:])
...
>>> recursive('hello')
h
e
l
l
o

It seems perfectly obvious now -- it's not like I've never written code like this in other languages, but for some reason I just hadn't been thinking about it.

Thursday, April 2, 2009

Open Science: Good for Research, Good for Researchers

http://scholcomm.columbia.edu/open-science-good-research-good-researchers


Quote: '''
Open science refers to information-sharing among researchers and encompasses a number of initiatives to remove access barriers to data and published papers, and to use digital technology to more efficiently disseminate research results. Advocates for this approach argue that openly sharing information among researchers is fundamental to good science, speeds the progress of research, and increases recognition of researchers. Panelists: Jean-Claude Bradley, Associate Professor of Chemistry and Coordinator of E-Learning for the School of Arts and Sciences at Drexel University; Barry Canton, founder of Gingko BioWorks and the OpenWetWare wiki, an online community of life science researchers committed to open science that has over 5,300 users; Bora Zivkovic, Online Discussion Expert for the Public Library of Science
'''

Friday, March 27, 2009

Call for Reviewers to The Python Papers

Hi all,

This is a call for assistance from the Python community to register as a potential reviewer for The Python Papers. We are currently unable to process our academic articles as effectively as we would like due to a small list of reviewers. It would be wonderful to have some more people register with ojs.pythonpapers.org and flag their availability as reviewers for our academic stream. You can flag this in the "reviewing interests" section of your profile.

Reviewing a paper need not be intimidating or difficult, and is a great opportunity to contribute back towards the community in the form of supporting academic work which involves Python.

For those who do not feel they have an academic bent, we also make use of our list of registered users when exploring articles in the technical stream.

Thanks very much,
-Tennessee Leeuwenburg
(Co-Chief Editor, The Python Papers)

Friday, March 13, 2009

Wheelchair tech

I was walking about my city recently, and observed a woman in a wheelchair using some form of touch-screen interface, on a screen integrated into the chair. In addition to the mobility difficulties, she appeared to suffer from a more generalised difficulty with effective use of her limbs. I do not know what she was doing, nor what the functionality of the interface was. Here is what I think *should* have been on it. What do you think?

1.) Integrated GPS system displaying directions overlaid on a map
2.) A large-buttoned interface tied into a telephone dialing system, allowing ease-of-access to Taxi services, personal contacts and any relevant emergency services that may be required
3.) If necessary for safety, a GPS tracking device. For some individuals with a carer who is partially responsible for their wellbeing, the capacity for that carer to easily contact and locate the impaired individual could substantially increase freedom of travel
4.) Additional geographic information systems covering accessibility of local public transport; accessible footpaths and eating establishments; accessible public toilets and other facilities

Any other suggestions for this somewhat naive list?

Friday, January 2, 2009

Watermarks, signatures and security

I just had this idea... I was thinking about computerising organisational processes which require an approval signature. Suppose manager X needs to approve request R. If he has a drawing tablet, he could digitally sign a document -- that is to say, associate his signature with the document, either by overlaying it or attaching it. However, that's somewhat susceptible to hacking: someone could take an image of the signature and associate it with some other document. That is to say, it's hard to guarantee that the associated signature is 100% confirmed on the document being signed.

Supposing, instead, that I had software which did this:
1.) Take signature input from drawing pad
2.) Generate a hash from the signature image
3.) Use the hash as a salt for a watermark
4.) Create a watermarked document
5.) Overlay the signature image onto the watermarked document image
6.) Lodge the image hash in some collision detection database

It seems to me like I should be able to encode the signature hash into the watermark document somehow. Then, it seems like I should be able to guarantee that the watermarked document was created using the signature that's on the document.
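
In Python terms, a minimal sketch of steps 2-4 and 6 might look like the following, using an HMAC keyed by the signature hash as a stand-in for the actual watermarking (which would really be steganographic, and which I'm waving my hands over):

import hashlib
import hmac

seen_hashes = set()  # stand-in for the collision detection database

def sign_document(signature_image_bytes, document_bytes):
    # Step 2: hash the signature image.
    sig_hash = hashlib.sha256(signature_image_bytes).hexdigest()
    # Step 6: lodge the hash; a repeat means a re-used signature image.
    if sig_hash in seen_hashes:
        raise ValueError("signature hash collision -- please re-sign")
    seen_hashes.add(sig_hash)
    # Steps 3-4: the signature hash keys a 'watermark' over the document
    # bytes, binding this signature to this document and no other.
    return hmac.new(sig_hash, document_bytes, hashlib.sha256).hexdigest()

def verify_document(signature_image_bytes, document_bytes, watermark):
    sig_hash = hashlib.sha256(signature_image_bytes).hexdigest()
    expected = hmac.new(sig_hash, document_bytes, hashlib.sha256).hexdigest()
    return expected == watermark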

A forger/hacker could not then take the signature and do the same thing to another document, because the duplicate hash would get picked up by the collision detection database. Further, each colliding document could be identified by its watermark and signature hash. All anyone using the system needs to do is check that the document is watermarked. The system guarantees that all watermarked documents are properly authorised.

Should a *real* hash collision occur, the authorising supervisor could just re-sign the document.

This relies on every signature being subtly different. That is, the tablet must have quite high resolution to capture the slight differences of pen pressure and letter shape in each signature. This doesn't seem beyond the realms of possibility however.

And, once all is said and done, there exists a system where:
* The approving manager can just sign digital documents completely straightforwardly
* All authorised documents are stored and tracked
* All digitally watermarked documents have a full history

What does the blogosphere think?