Optimize keyboard layout


I'm sure you, too, know the history of the Sholes ("QWERTY") keyboard, which was designed so that fast typists would not jam mechanical typewriters. You probably also know that many consider the Dvorak layout faster and more efficient.

When I was trying to learn the Dvorak keyboard, a friend of mine asked me: "Why don't you write a program to measure your keyboard usage, and then derive a personalized layout?" In a moment of craziness, I decided to do just that.

The first program is freq.pl, which reads the files given on the command line and produces a matrix of character-pair frequencies, writing it to the file named in the $MATRFN variable (default /tmp/freq.matr). In effect, it treats the text as a Markov process over the characters with memory 1 and extracts the transition matrix. Note: the elements are frequencies, not probabilities. The optional normalization is left as an exercise for the reader.
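To make the idea concrete, here is a minimal Python sketch of counting character transitions. It is not the freq.pl source: the function names are hypothetical, and it assumes every character is counted as-is (no case folding or filtering), which the original script may or may not do.

```python
from collections import Counter

def transition_counts(text):
    """Count how often each character is immediately followed by another."""
    counts = Counter()
    for prev, cur in zip(text, text[1:]):
        counts[(prev, cur)] += 1
    return counts

def normalize(counts):
    """Turn raw counts into per-row probabilities (the optional normalization)."""
    row_totals = Counter()
    for (prev, _), n in counts.items():
        row_totals[prev] += n
    return {pair: n / row_totals[pair[0]] for pair, n in counts.items()}

counts = transition_counts("abac")   # pairs: a->b, b->a, a->c
probs = normalize(counts)            # 'a' is followed by 'b' half of the time
```

The counts alone are enough for comparing layouts; the normalization only matters if you want true transition probabilities.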

The second program is freqdump.pl, which is probably useless. I wrote it (in three or four different versions) to get an idea of the frequencies. Have a look at it if you want.

The most useful program (I hope) is optkeyb.pl. Starting from the matrix (as usual, the filename is hardcoded) and the QWERTY layout, it searches for a better layout by a form of stochastic descent (random pairwise swaps; no actual gradient is computed). In other words, it computes a value for the layout (the sum, over all pairs of keys, of their distance times the frequency of that pair), then tries exchanging two random keys, looking for a better result. To escape local minima (there are a lot of them), it starts by randomly exchanging $PRE_SHUFFLE pairs; if it fails to find a better layout in $STARVATION tries, it appends the locally optimal layout to the file /tmp/layouts and starts again. It uses curses through the corresponding Perl module, Curses.pm.
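The search described above can be sketched in Python roughly as follows. This is an illustration under several assumptions, not the optkeyb.pl code: distances are taken to be Euclidean, each restart begins again from the initial layout, the number of restarts is finite, and the best layout is returned rather than appended to a file. The parameters pre_shuffle and starvation mirror $PRE_SHUFFLE and $STARVATION; all other names are hypothetical.

```python
import math
import random

def layout_cost(positions, freq):
    """Sum over character pairs of key distance times pair frequency."""
    total = 0.0
    for (a, b), n in freq.items():
        if a in positions and b in positions:
            (xa, ya), (xb, yb) = positions[a], positions[b]
            total += n * math.hypot(xa - xb, ya - yb)
    return total

def swap(positions, a, b):
    """Exchange the physical positions of two characters, in place."""
    positions[a], positions[b] = positions[b], positions[a]

def optimize(start, freq, pre_shuffle=2, starvation=200, restarts=3, seed=0):
    rng = random.Random(seed)
    keys = sorted(start)
    best, best_cost = dict(start), layout_cost(start, freq)
    for _ in range(restarts):
        current = dict(start)                # assumption: restart from the start
        for _ in range(pre_shuffle):         # random kick before descending
            swap(current, *rng.sample(keys, 2))
        cost = layout_cost(current, freq)
        failures = 0
        while failures < starvation:
            a, b = rng.sample(keys, 2)
            swap(current, a, b)
            new_cost = layout_cost(current, freq)
            if new_cost < cost:
                cost, failures = new_cost, 0
            else:
                swap(current, a, b)          # undo a swap that did not help
                failures += 1
        if cost < best_cost:
            best, best_cost = dict(current), cost
    return best, best_cost

# Toy example: four keys on one row, where 'a' and 'd' are typed together often.
start = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}
freq = {("a", "d"): 10, ("b", "c"): 1}
start_cost = layout_cost(start, freq)        # 10*3 + 1*1 = 31
best, best_cost = optimize(start, freq)
```

Only improving swaps are accepted, so each descent ends in a local minimum; the pre-shuffle and the restarts are what give the search a chance to land in a different, possibly better one.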

To avoid unwanted results, such as digits scattered among the other keys, you can list in the %locked hash the keys that must not be moved.
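One simple way to honor locked keys is to draw swap candidates only from the unlocked ones. The Python sketch below assumes this approach; the function pick_swap is hypothetical, and the set named locked merely plays the role of the %locked hash.

```python
import random

def pick_swap(keys, locked, rng):
    """Pick two distinct keys to exchange, never touching a locked one."""
    movable = [k for k in keys if k not in locked]
    if len(movable) < 2:
        raise ValueError("need at least two unlocked keys to swap")
    return tuple(rng.sample(movable, 2))

locked = set("1234567890")              # keep the digit row where it is
keys = list("1234567890qwertyuiop")
rng = random.Random(1)
a, b = pick_swap(keys, locked, rng)     # both a and b are letters
```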

To give you an idea of the results, after some hours of computation the best layout was:

` 1 2 3 4 5 6 7 8 9 0 ; =
   - x w h t s a l b y ' j q
   z \ [ c i e r u p . ,
    k v f d n o m g ] /

Bear in mind that I use an IBM U.S. keyboard, and where the q ended up there is usually the backslash/pipe key, which is larger than the others, so I should have locked it...

To give an idea of the optimization: the value for the QWERTY layout is 10,190,280, while for the layout above it is 6,797,370, a reduction of about 33% in the distance travelled by the fingers while typing.
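The reduction can be checked directly from the two values quoted above; it comes out at roughly a third. The variable names in this short Python check are hypothetical.

```python
qwerty_cost = 10_190_280
optimized_cost = 6_797_370

# Relative reduction in the layout value (total weighted finger travel).
reduction = 1 - optimized_cost / qwerty_cost
print(f"{reduction:.1%}")   # prints 33.3%
```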

Dates

Created: 2003-01-28 10:09:25
Last modification: 2023-02-10 12:45:24