# Another Blog

## A Different Kind of Obscurity

Learning Haskell is probably the hardest thing I’ve tried regarding computers. It requires giving up all previous conceptions about programming and adopting a different, more pragmatic and heavily mathematical approach. I’m currently struggling with basic monads and haven’t even touched on monad transformers, arrows, and who knows what more Haskell has in store for me.

You may be wondering why I find Haskell interesting, despite its initial difficulty to understand (and despite a serious lack of time on my part). The main reason is that, after I have written or otherwise fully understood a code snippet, I frequently find myself seemlessly pausing to admire its elegance and simplicity. I value simplicity above all, and clearly distinguish it from easiness. Haskell is simple in that it is governed by simple and few rules, but it is by no means easy to learn or master. I often find myself asking silly, if not outright stupid, questions on #haskell or #xmonad (both are hosted on the Freenode network). The people there are, however, kind enough not to complain about it.

The `obscurity’ in the tile comes from the difference in Haskell’s syntax, compared to what I have used so far (and continue to use and study). C, C#, Java, Python. I’ve been hyped up about all of these at one time so far, and I still consider C and Python to be useful and not broken. Haskell, however, is different in both philosophy and, inherently, syntax. Programs end up being unusually terse, but pack a lot of meaning in every line of code. I estimate that I can understand about 10% of all the Haskell code, based on what I have seen, but that 10% (maybe even less) I understand into such depths that I can easily explain them to another person, if they have the appropriate prior knowledge. Or at least so I hope.

As to writing programs myself, I’ve stuck to simple ones, emphasizing on understanding the various language features. It’s quite common that a program that I would normally write in an hour in C takes up to one or two days in Haskell (allowing time for daily routine and documentation), but ends up being a lot shorter, clearer and more meaningful than the C version. As such, the obscurity I have mentioned is beneficial — people communicate more efficiently when their language is more precisely defined and they think more efficiently when they needn’t consider underlying details (the last part is valid for all high-level programming languages). Which is not to say Haskell won’t let you touch those details, though.

I also have a code snippet to show off; it is probably inefficient and could easily have been written more elegantly by one with more experience, but I am happy that I made it work. Any suggestions or commentary will be very appreciated. But before the code, the problem (it is taken from my Algorithm Analysis course):

Given an undirected graph $G = (V, E)$ and an integer $k \le card(V)$, find all the complete subgraphs of G that have exactly k elements.

The problem was proved to be NP-Complete, and we solved it using C during a class. Now, the interesting part in implementing it in Haskell was using the Writer monad, something I had never done before. The representation of the graph is by all means faulty (vertices are numbered 1 through nrNodes, there may be pairs containing numbers outside that range, etc.), but the algorithm works for correct input, which is clearly the key element. The terseness and expressiveness of the code still amaze me, although I must reiterate: it’s probably faulty and ugly in the eyes of an expert. The lack of I/O is compensated by excellent interactive support (I use ghci). So, without further ado, the code, in all its (perceived) glory:

```import Control.Monad.Writer

data Graph = Graph { nrNodes :: Int, edges :: [(Int, Int)] } deriving Show

isConnected :: Graph -> Int -> Int -> Bool
isConnected g a b = or [(a,b) `elem` (edges g), (b,a) `elem` (edges g), a == b]

isComplete :: Graph -> [Int] -> Bool
isComplete g vs = let
isConnectedToAll x = all (isConnected g x) vs
in and (map isConnectedToAll vs)

generateCompleteSubgraphs :: Graph -> Int -> [[Int]]
generateCompleteSubgraphs g k = map reverse \$ execWriter \$ f 1 []
where
f n set
| n == k+1  = if isComplete g set then tell [set] else return ()
| otherwise = mapM_ (\x -> f (n+1) (x:set)) (filter (not . (`elem` set)) validChoices)
where validChoices = if n == 1 then [1..(nrNodes g)] else
if head set == nrNodes g then [] else [1 + (head set) .. (nrNodes g)]
```

PS: If anyone knows of a better way to include Haskell source code in WordPress, please let me know.

December 14, 2007

## A busy week

I actually wrote my first lines of code for cspay these days, but they were mere headers for the library I am writing. I started reading parts of the standard and making a rough sketch of what will have to be included in an ods file; the standard is huge, but hopefully things can be simplified to a bare minimum, stock proto-spreadsheet. I am having difficulties deciding what to take for granted and what to expect from the user. libspreadconv has to strike a balance between being easy to use and widely applicable; so, while we have to be able to customise styles, they shouldn’t have to be specified if one wants a plain sheet. I will probably ask my friends for help on the mailing list.

In a totally different direction, I took part in some student things which I can’t really translate in English. One for Physics, about the quantity of information transmitted by measurements (again, translation may have ruined the meaning), and in Mathematics, about the coding of information by neuronal spike trains. The latter was slightly more interesting, but we only had to do a translation — and the professor practically forced it down our throats, but all in all both events were useful and failry exciting (intellectually, mind you).

I also helped RobyC with a very interesting piece of homework: compressing a bmp file into jpeg. Most of the program was already done, including the headers and file input and output, all we had to do was encode the information. This proved difficult because of not having read the homework specification thoroughly enough. We spent hours debugging, with hex editors and all, only to have someone suggest a detail which we had left out. Infinitely stressing, especially since I had an exam the following day, but also very interesting; you kind of get that warm feeling of accomplishment when you see it’s actually a stadards-compliant jpeg file.

Busy as I was, I got into some serious Armagetron Advanced with the boys these days, causing me to see coloured walls in my sleep and to miss this mornings Data Structures course, which I heard was surprisingly interesting. Heading home tomorrow, with that guilty feeling of leaving Roxi behind and skipping two days of school, but also happy I’ll finally get to see my family (6 weeks is apparently quite a lot by my standards). I’ll have a lot of work to do for Numerical Methods and other projects when I return, but I have to get it over with somehow.

Damn anal wordpress added extra line breaks and my text (pasted from vim) looked like shite.

May 19, 2007

## Starting the Yet-to-Be-Named Project

I got rejected for Google Summer of Code, but that was to be expected. As easy as Plan 9 is and as much as I loved working with it at the beginning, such matters ar too serious to be covered within a week. Perhaps next year I will be better prepared in a field; I would of course like to try my hand at it again.

But Razvan (my Operating System Usage teacher in the first semmester) came up with a proposal to write a system that generates spreadsheets for hourly paid course assistants in our faculty. I naturally agreed and a team was quickly formed. We met today for the first time. While the project itself doesn’t sound like much, we decided to do it properly. We are using Razvan’s server for the entire development process (I got a new IMAP email address courtesy of him), including mailing lists, RCS (to be decided), web page, wiki and testing.

Everyone is very excited and we’ve already outlined a brief design: a C library for reading a configuration file and actually outputting the spreadsheets, a (probably PHP, which I’m not happy about) web interface, a minimal, console based program, which will be called by the web interface, and classic, offline programs for both Windows and Linux. We are considering GTK or wxWidgets. The latter looks better, but I’m reluctant to use C++ and it has no C bindings. The spreadsheets will be XML-based, using ODF and Open Office XML, but we’re planning to keep the config file simple (no XML, as parsing it would probably be harder than generating the documents).

For the moment, we still need to find a name, although a mailing list is in place. A wiki and versioning system will follow. I’ve chosen to read up on creating dynamic libraried and to think of the configuration and console program format. Razvan also suggested using a lexer, which is an entirely new concept to me (actually it was until a couple of hours ago).

Amazing how much a little planning can save; we probably would have switched ideas a lot of times and still wouldn’t have come up with a design close to this one. Still, as good as it seems, it will probably be subject to change once coding gets off the ground.

April 17, 2007 Posted by | cspay, programming, rosedu, school | | 2 Comments

## The File Compress Utility — Coolest So Far

So I helped RobyC and D00kie with their Shannon-Fano algorithm implementation this past week and it was by far the most interesting thing I have ever programmed. Only two days for the initial program in C, two more for a basic C# port and another day and a half for a Visual C++ port. The latter ended up very messy, with unmanaged C code that was being called from a .NET interface. Again, the productivity vim and Ion3 offered me was no match even for the awesome IntelliSense (as much as I disapprove of Microsoft, Visual C# is just too darn easy and fun to use).

Most problems with the inital C code were based, as usual, on lack of attention in writing code, although the printf debugger (tm afaik) helped me do away with them quickly. I also got in trouble by forgetting about endianness — I spent at least half an hour looking at a hex dump and cursing because bytes were ordered differently in words. I was plain stupid at that time. But overall, the satisfaction of “diff infile outfile” returning nothing on a 80+MiB tarball was unique thus far. I was especially impressed at my apparent skill with pointers. Maybe the classes help, too.