Idiomdrottning’s homepage

A simple mess

It often feels like many programs are about converting from one format to another. It was true in the seventies and it’s true now. From database to web, from app user input to file, from code to binary, from JPEG to PNG.

Identify the formats you have, and their structures, and figure out a straightforward way to convert between them. They figured this out in the seventies.

I accidentally did something right when I designed the call-tables to be able to access the underlying hash table directly, both for reading and writing.

It’s rarely needed, but when it is, it saves a lot of effort.

“But, that’s messy? But, that’s a leaky abstraction?” That’s the point, yeah. Because the job of my program isn’t to be a new hard format. A call-table is just a convenience closure to make working with hash tables a li’l easier in the world of functions. It’s not meant to be an entire new be-all, end-all data structure.
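Here’s a rough sketch of that idea in Python rather than Scheme. It is not the real call-table interface: the name make_call_table and the calling convention (no arguments hands back the underlying dict, one argument reads, two arguments write) are assumptions made up for illustration. The point is only that the closure is a thin convenience and the plain hash table stays within reach.

```python
def make_call_table():
    """Hypothetical stand-in for a call-table: a closure over a plain dict."""
    table = {}  # the underlying hash table, deliberately not hidden

    def call(*args):
        if len(args) == 0:
            return table               # no arguments: hand back the dict itself
        if len(args) == 1:
            return table.get(args[0])  # one argument: look up a key
        if len(args) == 2:
            key, value = args
            table[key] = value         # two arguments: store a value
            return value
        raise TypeError("expected at most two arguments")

    return call


colors = make_call_table()
colors("sky", "blue")           # write through the convenience closure
colors()["grass"] = "green"     # write straight to the underlying dict
print(colors("grass"))          # read it back through the closure
print(sorted(colors().keys()))  # anything dict-shaped still works
```

Leaky on purpose: whenever the closure doesn’t cover something, you reach for the dict instead of inventing more closure.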

This is also something people keep getting wrong about Markdown as originally presented. Markdown isn’t a format. It’s a convenience tool that helps you write some of the boringest and commonest parts of HTML easier, and you can easily drop into more wonky HTML at any time.

That way, call-tables and Markdown both avoid a common mistake we programmers make (I’ve done it too, a bunch of times): introducing unnecessary intermediary formats. Full formats, with corner cases and completeness and specs and flowers and a wedding dress.

In CSS — Back to Basics, I wrote:

Ever see those puzzles for kids that are like a list of things on the left like “Donald, Superman, Kirk” and on the right you’ll see a list like “Spock, Daisy, Lois” and you’re supposed to draw lines between them?

Those lists are neat on their own but when you draw the lines they cross each other and look all tangled and messy.

Tangled lines connect Kirk to Spock, Donald to Daisy, and Superman to Lois.

That’s the job of those lines, though. They are tangled and crossed and crooked and overlapping but they keep the lists themselves neat.

Semantics and presentation are two formats and between them it’s OK that there’s a mess called CSS. That’s its job.

The trap we have a tendency to fall into, sometimes not even consciously, is to be like “hmm, I have a problem. This format of [queries/​widgets/​templates] is so messy, if only there was a simpler format of [objects/​trees/​functions] on top that I could interact with instead. I’ll make one.”

And now we have two problems. We keep introducing format after format after format.

We have this tendency to want our functions to be simple and clean so we make the formats messy, and I’m thinking that might be backwards. We’re just procrastinating the actual solution.

“I need to turn A into B. Well, there’s this clean, easy transformation loop that’ll get me to A.5, and then there’s this other beautiful nice li’l function I can make that’ll turn that into A.75, and then, ooh, this was super hairy but if I call the entire team in and we struggle with an all-nighter we can kinda sorta maybe turn it into A.875, and then…”

Instead, recognize that the formats you’re dealing with are what they are. Sometimes you work on W3C or whatever and it’s your job to shape the web, but for most apps, it’s not your job to change the formats so they make sense. It’s just to move data between them.
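To make “just move data between them” a bit more concrete, here’s a minimal sketch, assuming the source format is database-style rows (a list of dicts) and the target is an HTML table. The function name rows_to_html and the row shape are invented for the example. The function is a little tangled, and that’s fine: the tangle lives in the one function and no new format gets introduced.

```python
from html import escape


def rows_to_html(rows):
    """Turn database-style rows (a list of dicts) straight into an HTML table."""
    if not rows:
        return "<table></table>"
    columns = list(rows[0])  # column order is taken from the first row
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    out = ["<table>", f"<tr>{header}</tr>"]
    for row in rows:
        # Missing values become empty cells; everything gets HTML-escaped.
        cells = "".join(f"<td>{escape(str(row.get(c, '')))}</td>" for c in columns)
        out.append(f"<tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)


print(rows_to_html([{"name": "Kirk", "friend": "Spock"}]))
```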

That’s not to say that we never should introduce new formats. An obvious exception comes to mind: When you have many-to-many. GCC and Pandoc both do benefit from their intermediary AST format.
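A toy sketch of why many-to-many is the exception: with a shared intermediate form, each input format needs one reader and each output format needs one writer, instead of one converter per pair. Everything here is hypothetical, including the “list of key/value pairs” intermediate and the tiny ini and env dialects; the real ASTs in GCC and Pandoc are far richer than this.

```python
import json


# Readers: each input format -> the intermediate form (a list of (key, value) pairs).
def read_json(text):
    return list(json.loads(text).items())


def read_ini(text):
    pairs = []
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            pairs.append((key.strip(), value.strip()))
    return pairs


# Writers: the intermediate form -> each output format.
def write_json(pairs):
    return json.dumps(dict(pairs))


def write_env(pairs):
    return "\n".join(f"{key.upper()}={value}" for key, value in pairs)


READERS = {"json": read_json, "ini": read_ini}
WRITERS = {"json": write_json, "env": write_env}


def convert(text, source, target):
    """Any reader can be paired with any writer via the intermediate form."""
    return WRITERS[target](READERS[source](text))


print(convert("name = frog\ncolor = green\n", "ini", "env"))
```

Two readers and two writers cover four conversions here. With only one format on each side, the intermediate would just be an extra format to maintain.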

So, take heed, library writers.

In one sense, there are three kinds of libraries.

This third kind is kind of fraught.

In one sense, you can claim you’re invoking my “many-to-many” exception: you hope your library will be used in many databases and many apps.

Then it becomes the job of the next poor sucker down the line to move data between that format and the next.