Each year for the last four years there has been a programming contest associated with the International Conference on Functional Programming. The contest runs for 72 hours, is open to all comers and entries may be made using any programming language.
Previous year's contest tasks have ranged from a 3D Ray Tracer to playing a board game in a round-robin tournament against the other entries.
The contest judging heavily emphasizes program correctness with there often being a "torture test" round to try to eliminate programs before the contest proper. Any crash, failure to terminate, or incorrect ouput results in the program being immediately out of the contest.
Thus this contest can't be won by indiscriminate "hacking". It emphasises correctness, safety, and robustness first, rapid design and implementation of sophisticated and clever algorithms second, and raw speed a distant third. A brute force method might do well, but it's likely to be beaten by a more clever algorithm.
This year there were a record 171 entries. The elimination round removed 112 buggy programs, leaving 59 entries for the main event.
The Dylan Hackers believe that the Dylan programming language is one of the best yet designed for general-purpose programming. Unfortunately, not everyone agrees with us (yet).
We are eager to buy and use Dylan products from companies such as Apple or Harlequin but we've learned from bitter experience that the best way to ensure we can always program in the language of our choice is to have the source code for the compiler. And so, as well as working on our real jobs, we all put some time into maintaining and enhancing the Gwydion Dylan system.
There are few of us, and much work to be done. If you find Dylan interesting please consider joining us!
This year the Dylan Hackers team consisted of:
- Andreas Bogk, Germany
- Chris Double, New Zealand
- Bruce Hoult, New Zealand (Captain)
- Peter Housel, USA
As we live widely spread around the world, we coordinated our efforts via the CVS server at the Chaos Computer Club Berlin and the #dylan IRC channel at Open Projects Network. Our thanks go to both those organisations for their continued support of Open Source software.
Our entries were compiled using Gwydion Dylan 2.3.5 and egcs-2.91.66 on RedHat Linux. Various team members use the same Dylan compiler on other x86 Linux's, FreeBSD, LinuxPPC, MacOS X and Cygwin. Chris also used Functional Developer on MS Windows.
The ICFP contest has several important differences from other contests of which we are aware:
- It is distributed, making it cheap and easy to enter (though, strangely, no team that is itself geographically distributed has ever to our knowledge won a prize).
- It is held over a significant (but not onerous) period of time, allowing for quite complex and potentially useful tasks to be solved. Who would have thought a high performance 3D Ray Tracer with a built in scene generation language, procedural textures, and Constructive Solid Geometry modelling engine could be written in three days? A number of teams managed to do just that last year.
- Contestents are positively encouraged to use a wide variety of programming languages and to publish their solutions. It is pretty hard to find solutions to non-toy problems written in different languages so this is a very valuable resource.
- Many programming contests set problems that can be completely solved (at least in principle) in the time available. Thus far, the ICFP contests have set more challenging problems. Usually it is possible to create some sort of solution in 24 hours (Unix "cat" was a valid entry in 1999 and again ths year), but even 72 hours is not enough to completely conquer it.
In terms of the Gwydion Dylan effort itself, it seems to be worthwhile to enter the contest for a number of reasons:
- Build team spirit with a concentrated effort on a task of limited duration and scope.
- Create a freely-distributable Dylan program of significant size for use as example code. The ICFP2000 Ray Tracer task in particular makes a fine sample program for the Gwydion Dylan distribution.
- Exercise and stress test parts of the compiler and runtime system that we might not normally hit. Last year we discovered a problem with the compiler's lexer. This year we found that the read-to-end() function in the I/O library exhibits O(N^2) performance on large files and that this may be influencing the speed of the compiler itself.
- Get an idea of how our compiler stands relative to the implementations of other languages. Of course this is masked to some extent by the choice of algorithms and by programmer skill, but you do tend to get a number of entries with similar strategies and algorithms.
A major design goal of Dylan is to produce a language in which complex programs can be rapidly prototyped in the same way as with dynamically-typed languages such as Smalltalk or Lisp or "scripting" languages, while at the same time enabling performance comparable to statically typed languages such as C or ML. This is acheived by the selective use of type declarations and "sealing" so that there is a seamless transition between quick and dirty prototyping code and delivery code.
So we should be able to do well in a time-limited contest situation.
- Get a little more visibility for our favourite language. And if we happen to win, get a lot more visibility :-)
- So far, no dynamically typed language has ever won a prize at an ICFP contest. In particular, no Lisp family language has ever come near a prize. The results have been totally dominated by statically typed languages such as the ML dialects, Haskell and the C dialects. It would be nice for us if Dylan was the first to prove that static typing isn't necessary for high performance, correct, programs.
- Have fun!
This year the task turned out to be optimizing an HTML-like language for minimum size. On the whole it was quite a simple version of HTML with boolean attributes for Bold, Emphasis, Italic, Strong and Typewriter, and multi-valued attributes for underline level, text size and colour. There were however a number of tricky points that neeeded watching:
- The <EM> tag toggles the Emphasis attribute rather than simply setting it.
- The underline level clamps at triple-underlining
- There is a Plain tag that quickly resets everything except text size and colour to the initial settings without having to close other attribute tags first
- The Strong and Emphasis tags interact
- Unlike the other attributes, you can switch directly to the desired text size and colour without closing the previous tag. However the initial values are not specified so there is no way to return to them without closing all open size and/or colour tags.
- only Typewriter, Underline, Size and Colour are important for whitespace, and Colour is only important if the spaces are underlined.
We immediately realized that there were two totally different reasonable approaches to the problem:
The tree-rewriting option seemed like it would probably be quite fast and perform well on typical documents, but the synthesis option seemed likely to hold out the possibility for a much bigger win, especially on the sadistically pessimised documents that we expected the judges to throw at it. On the other hand, there is the possibility of not being able to come up with something even as good as you were given in the first place, thereby running the risk of being eliminated for "stupidity".
- Perform pattern-matching based rearrangements of the document's tree structure
- Discard the existing tree structure and synthesise a new, hopefully better, structure
We quickly decided to go for the big win, with a backup option of outputting the original input if it was better or if we ran out of time. The overall strategy was to be:
- Read the whole input into a string and keep it in case we couldn't do any better
- Parse it into an internal data structure consisting of an array of attributed text runs, simulating the effects on the document state so as to perform space compression and redundent tag elimination virtually for free.
- Synthesise a document from the parsed structure using a simple one-pass deterministic "greedy" algorithm that tries only one option at each stage, guided by simple heuristics. Keep the result if it's smaller than the original input, otherwise throw it away.
- Run a more sophisticated optimizer, possibly trying multiple optimizers and/or iterating until time is almost up.
We finished the first three steps in the first 24 hours, thus giving us a plausible "Lightning" entry and leaving the last two days to write and test more sophisticated approaches. Once the basic framework was in place we decided to have some internal competition and Andreas and Bruce independently wrote their own optimizers with different philosophies.
In the end we submitted four entries:
- A lightning entry, submitted towards the end of the first 24 hours. Unfortunately a bug was created in previously working code at the last minute, resulting in the EM tag not toggling properly.
The optimizer works by making a single pass through the list of attributed text-runs, opening or closing tags to get to the desired state. It is totally deterministic, trying only a single option at each point, guided by some simple heuristics. I think it's worth showing the optomizer code here, in full.define method generate-output(input :: <stretchy-object-vector>) => strings :: <list>; let state = make(<generator-state>, text-runs: input);Basically, it tries pretty hard to find an excuse to close the most recently opened tag or tags, and when there are no more plausible tags to close it opens the needed additional tags in some random order.
while(state.remaining-text-runs.size > 0) check-timeout(); let from :: <attribute> = state.from; let to :: <attribute> = state.to;
if(#f) print-state(state); end if;
local method can-pop-to-attr(name) from.name ~== to.name & member?(to, state.attribute-stack, test: method(x :: <attribute>, y :: <attribute>) x.name == y.name; end); end method can-pop-to-attr;
let pop = (state.attribute-stack ~== #() & state.attribute-stack.tail ~== #() & to.value = state.attribute-stack.second.value) | (from.bold & ~to.bold) | (from.italic & ~to.italic) | (from.strong & ~to.strong) | (from.typewriter & ~to.typewriter) | (from.underline > to.underline) | (from.font-size & ~to.font-size) | (from.color & ~to.color) | can-pop-to-attr(emphasis) | can-pop-to-attr(font-size) | can-pop-to-attr(color);
if(from.value = to.value) // same attributes .. just output the text emit-text!(state); elseif(pop) pop-tag!(state); else do(curry(push-tag!, state), applicable-tags(state)); end if; end while;
while(state.open-tag-stack.head ~== #()) pop-tag!(state); end while;
reverse!(state.output-tokens); end method generate-output;
Clearly there are plenty of cases in which this is going to produce far from the optimum result, but in practice it does surprisingly well from a very small investment in time and in particular it all the examples in the specification document that do not require the use of the "plain" tag.
- We thought the lightning entry was actually pretty good, and that it was worth finding out how it might have done, had it not contained the silly error. So we submitted a fixed but otherwise unchanged version in the main contest under the name "Dylan Lightning". It appears that its only serious competition in the lightning division would have been Erlightning.
- An entry under the name "Dylan Hacker" (it was supposed to be "Reanimator", but the submission process got rather confused in the last seconds. Never mind.) This version uses an optimizer written by me (Bruce), with fallback to the "lightning" optimizer in case of runing out of time, an error being thrown, or the result being worse. Interestingly, this fallback was actually activated on the "006-exhaustive.txt" test. On this 1.2 MB file Dylan Hacker takes 38 minutes on a 700 MHz Athlon. Presumably on a 1 GHz machine (such as the judges', if Intel is comparable to AMD) it should take about 28 minutes. The timeout set turned out to be 30 minutes, and the Dylan Hacker result was identical to the Dylan Lightning result. It must have been only seconds away from producing its output -- which would have been within three bytes of the much better Haskell Carrots result on this test.
Such are the hazards of an all-or-nothing approach.
The algorithm is a quite simple generalization of the one used in Dylan Lightning. It makes a single pass through the file, but tries a number of possible outputs in parallel. Where Dylan Lighting just uses the first reasonable thing it finds, Dylan Hacker tries to follow all possibilities. Of course this requires exponeential time and space in general, so there are several levels of pruning of possibilities.
- First, not all possibilities are actually tried. I'm still not sure if this is a good thing to do -- some ultimate compression is lost, but there is some speedup, and I don't know where the balance lies.
- Second, I imposed an arbitrary limit of 2000 "moves" generated from any given point in the text. This is not triggered very often except on large files. It can result in the best moves being mistakenly discarded, although I tried to generate them in a plausible order.
- Third, after move generation, there is an attempt to discard redundant possibilities.
- if two different sequences of operations leave you at the same point in the text with the same set of open tags then the future options are identical and you can forget the one that has generated the longest text output so far -- or if they are the same length then discard one at random. This requires a sort O(n.lg n) followed by a linear pass.
- find the shortest output to the current point in the text, including closing all open tags. Then check if it is possible from this shortest state to close all open tags and open a completely new set to get into the same set of open tags as some other state and still be shorter than it. This requires two O(n) passes.
This is actually a compromise filter, developed near the end of the contest. I had developed a much stronger test which could find the minimum number of tags to close and open instead of opening and closing everything (see can-jaunt-state?() in the source code), but I found that in practice this trimmed very few additional possibilities while taking much longer to run.
- Sort the states into a (hopefully) reasonable "goodness" order so that the moves most likely to be good are generated first in the next round.
- Delete all but the first 1000 states. Again this can result in wrongly discarding good positions, but it's necessary to keep the RAM usage and processing time down.
The idea of this combination of lossless and lossy filters is to try to follow the entire exponential tree of possibiliites as long as this is cheap in time and space, and then to (hopefully) gracefully degrade to a mode in which the processing time is purely linear in the length of the input file.
I choose the arbitrary fudge factors (2000 moves generated, and 1000 moves kept) by experimentation using two guidelines:
- I let the program run on simple input files (up to and including 000-example.txt) with no constraints at all, and then tried to keep the contraints loose enough to still find that best result.
- I made a best guess about what time limit the judges might impose on large and difficult inputs, and tightened up the constraints until the program would meet those time limits.
Having two parameters to adjust made it easier to meet both guidelines at the same time. The second guideline was the riskier one, and I obviously got it just a little wrong with the 006-exhaustive.txt file. Fortunately, there was the quite acceptable Dylan Lighting result to fall back onto. One otherwise pretty good team (Equipe-de-merd) managed no compression at all on exhaustive, presumably because they ran out of time and their only available fallback was to the original input.
- An entry under the name "Beam Search". This version used an optimizer written by Andreas, once again with fallback to the Dylan Lightning optimizer in case of trouble.
Once again, this optimizer makes a linear pass through the text generating all likely moves and then pruning all but the best. In this case, however the pruning is severe with initially only the best eight states kept at each point in the text. It is then re-run with larger and larger "beam widths" until either time runs out or else there is no point at which pruning was necessary during a pass (in which case the optimum has been found).
The obvious question presents itself: why so many versions?
I think it is good when you can afford it to have multiple competing versions within the team, and then submit the best one. But in this case we really weren't all that sure exactly how the judges were going to trade off speed and compression level in their evaluations. Should we go for something that gets quite useful compression in never more than a couple of seconds? Or should we always use all the time and try for the absolute best compression? Or should we put in a fixed but significant level of work, hoping to get within a few percent of perfect compression but without using up all the time?
We couldn't guess what the judges really wanted so we did all three.
My own guess is, naturally, that Dylan Hackers' strategy of trying hard but not too hard is the right one. But I may well be quite wrong... Here is an evaluation of how the results would go if I was the judge.
Everyone else was tied up at work and/or asleep for the first part of the contest so it was left to me (Bruce) to develop the data structure definitions and write the parser.
Andreas wrote the Dylan Lightning optimizer, ably assisted and advised by Chris.
Peter wrote a random test case generator, and I wrote a validator so we could have fast automated test instead of relying on the judges' web page validator.
Andreas and I each wrote our optimizers. Chris popped in from time to time to help with running test cases, tracking down a few bugs in Harlequin's/Functional Objects' superior IDE debugger.
Andreas and I wrote the bulk of the code but the help from the others was invaluable in saving us time.
It would have been nice to also have an optimizer using a totally different technique, such as Dynamic Programming. I didn't know what that was before this contest, but in a way I'm glad because none of the teams that have said they used DP managed to get finished during the contest. They turned out some pretty awesome programs and results completed after the contest. I'm especially impressed by Tom Rokicki's effort.
I realized in the closing stages of the contest that now that I had large files going in linear time, I could easily predict how tight time was going to be and lighten or relax the arbitrary fudge factors accordingly. This would (hopefully) prevent using all the time to no avail as I did on 006-exhaustive.txt, while also giving a better chance of finding the optimum solution for smaller files where there was plenty of time. The settings I actually used resulted in a maximum RAM use of about 15 MB (it simply took too much time to proceess more data than that on large inputs), so I would probably have made initial parameters limiting RAM usage to a 100 MB or so (we know the judges' machine had 256 MB RAM) and tighten that down if there appeared to be a danger of running into time trouble.
I realised right in the last minutes of the contest that it was silly to be generating all permutations of the tags that needed to be opened at a given point and then pruning them later. It would be much better to keep a single data item saying "open these tags, in some order" and then only choose a concrete order when you later discovered which tag you wanted to close first. If you need to open four or five tags then this could result in a speed-up of a couple of orders of magnitude.
Must remember next year to leave time to optomize the code itself. Speed counts!
We'd like to thank the organizers for running such a fun and interesting contest.
We'd like to thank the great people (formerly) at Apple, Harlequin and CMU for designing such a great programming language, and to CMU for donating their implementation to the Open Source movement.
We'd like to thank the other contestants for providing such great opposition. We'd particuarly like to thank Camls R Us for running the contest this year so we didn't have to try to beat them.
I personally would like to thank my partner, Svetlana, for keeping me so well supplied with food and drink and so scantily supplied with interruptions during the contest.