Idiomdrottning’s homepage

In praise of SXML’s @-marker

So I was talking to some friends about SXML and how much I appreciated the @-list as a solution to a problem that has vexed lispers confronted with SGML since the dawn of time. What is structurally the difference between attributes and elements? Why are li entries not instead attributes of ol? Why are href attributes not instead elements in the a? What is “written on the box” vs “in the box”?

I was rereading some older texts and it struck me that even though the @ seems like such a perfect solution now, so that it seems weird that we didn’t come up with it right away… it wasn’t always so. Some other completely cockamamie solutions were suggested, like Enamel’s weird “attribute separator symbol”. (Also that text was written before Markdown was commonplace. We’ve come so far…)

An XML spec editor and good friend of mine then suggested:

It would have been simpler to say that the second Lisp element of an XML element is an attribute-alist that must always be present even if it is (). That allows completely uniform processing.

And here is my reply♥︎

Alist would've broken uniformity if it had been dotted, which elements aren't and can't be. I know the non-dotted version uses additional conses, but that's part of uniformity. (I later remembered that alists aren't necessarily dotted. I'm glad that wasn't my only point♥︎)

Secondly, it’d have required some processors to keep additional state in order to detect if they were in the second element or not. With the @-notation, xpath becomes very elegant. Now, some of my SXML-processing is already stateful in that way, but not all of it is. Even other tree-finding tools benefit: grabbing all @ is as simple as grabbing all span.
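A rough sketch of that second point, in Python rather than Scheme. Nested lists stand in for S-expressions and plain strings stand in for both symbols and text. That modelling, the `find_all` name and the sample `doc` are all assumptions made for illustration, not part of any SXML library. The thing to notice is that the search keeps no "am I at the second element?" state, and that `@` gets no special case.

```python
# SXML-like trees modelled as nested Python lists (an assumption for this
# sketch): the first item is the tag, the rest are children. Strings that
# are not in first position are text.

def find_all(tree, tag):
    """Return every sublist whose first item is `tag`, in document order."""
    found = []
    if isinstance(tree, list) and tree:
        if tree[0] == tag:
            found.append(tree)
        # Every child is visited the same way; position carries no meaning.
        for child in tree[1:]:
            found.extend(find_all(child, tag))
    return found

# Hypothetical sample document.
doc = ["p", ["@", ["class", "intro"]],
       "Hello ",
       ["span", ["@", ["lang", "en"]], "world"],
       ["a", ["@", ["href", "https://example.org/"]], "link"]]

spans = find_all(doc, "span")      # elements
attr_lists = find_all(doc, "@")    # attribute lists, found by the same code
```

With a mandatory attribute list in second position, the same search would need to know which position it was looking at in order to tell attributes from children.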

Thirdly, pre-post-order-splice, which is a thing of wonder, would have to have different semantics since there would now be mandatory empty lists in the tree. Not to say “I like what we have and I don’t like change”, just that splicing is a useful albeit non-uniform shortcut when using processing tools.

Fourth, the current definition makes many trees be “unintentionally SXML” which is a fantastic side-benefit. The mandatory element adds a restriction on what is SXML beyond just being a tagged tree.
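To make the “unintentionally SXML” point concrete, here is a loose sketch in Python, again with nested lists standing in for S-expressions. Both checks are simplifications and assumptions of this example: `is_tagged_tree` is far laxer than the real SXML grammar, and `follows_mandatory_alist_rule` is only one possible reading of the always-present attribute list proposal. The `config` tree is a made-up piece of data that was never meant as markup.

```python
def is_tagged_tree(node):
    """Text, or a non-empty list whose first item is a tag and whose
    remaining items are themselves tagged trees."""
    if isinstance(node, str):
        return True
    return (isinstance(node, list) and len(node) > 0
            and isinstance(node[0], str)
            and all(is_tagged_tree(child) for child in node[1:]))

def is_attr_pair(item):
    """A two-item [name, value] list of strings."""
    return (isinstance(item, list) and len(item) == 2
            and all(isinstance(part, str) for part in item))

def follows_mandatory_alist_rule(node):
    """Hypothetical stricter rule: every element's second item must be a
    (possibly empty) list of attribute pairs."""
    if isinstance(node, str):
        return True
    if not (isinstance(node, list) and len(node) >= 2
            and isinstance(node[0], str)):
        return False
    attrs = node[1]
    if not (isinstance(attrs, list) and all(is_attr_pair(a) for a in attrs)):
        return False
    return all(follows_mandatory_alist_rule(c) for c in node[2:])

# Made-up data that just happens to be a tagged tree.
config = ["config", ["server", ["port", "8080"], ["host", "localhost"]]]

# An element with an @-list is an ordinary tagged tree too.
link = ["a", ["@", ["href", "/"]], "home"]

# The same link written for the hypothetical mandatory-alist rule.
link_strict = ["a", [["href", "/"]], "home"]
```

Here `config` and `link` pass the tagged-tree check, while only `link_strict` passes the stricter rule, so the accidental overlap with everyday tagged trees is what the mandatory list would cost.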

Lasker said of the board game go (a.k.a. baduk):

While the Baroque rules of Chess could only have been created by humans, the rules of Go are so elegant, organic, and rigorously logical that if intelligent life forms exist elsewhere in the universe they almost certainly play Go.

The game feels more discovered than concocted.

S-expressions (not including any particular representational details, such as parens vs angle brackets vs whatever space letters they have on planet Aurabesh) feel the same way. cons, car, cdr feel heaven sent. “God wrote in Lisp.” Linked lists and trees share this property and adding a “tag” feels like a very short step. (Even the polyseme with other data structural uses of “tag” rings true, even if the space word for “tag” might be “banana” for all I care.)

Now, the @ is completely arbitrary. It could just as easily have been any other symbol. Its placement is also arbitrary: the @-list, if present, has to be the second element, but a tagged tree schema could’ve been implemented where the @ is last, for example. But that’s true for any other tag. In my worldview I see the @ as almost part of the schema more than part of “what is valid SXML?” For example, li has to have an ol or ul parent, while href has to have an @ parent. And head can’t come after body in an html element. All of those structural semantics are arbitrary but that’s fine.

Inserting a mandatory () in all elements, in a specific position, is mandating decisions that should be schema-level onto the structure level.
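The schema-level view can be sketched as a single table of parent constraints, in Python with nested lists standing in for S-expressions. The `ALLOWED_PARENTS` table, the `parent_violations` name and both sample trees are hypothetical, and the table is a toy, not a real HTML schema. The point is only that “href needs an @ parent” sits in the same table, checked by the same code, as “li needs an ol or ul parent”.

```python
# Toy schema (an assumption for this sketch): tag -> set of allowed parents.
ALLOWED_PARENTS = {
    "li": {"ol", "ul"},
    "href": {"@"},
}

def parent_violations(tree, parent=None):
    """Return (tag, parent) pairs that break ALLOWED_PARENTS, in document order."""
    problems = []
    if isinstance(tree, list) and tree:
        tag = tree[0]
        allowed = ALLOWED_PARENTS.get(tag)
        if allowed is not None and parent not in allowed:
            problems.append((tag, parent))
        # "@" is walked like any other tag; the schema table does the rest.
        for child in tree[1:]:
            problems.extend(parent_violations(child, tag))
    return problems

good = ["ul", ["li", ["a", ["@", ["href", "/"]], "home"]]]
bad = ["p", ["li", "stray"], ["a", ["href", "/"], "home"]]

good_problems = parent_violations(good)   # no violations
bad_problems = parent_violations(bad)     # li under p, href under a
```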

In summary: @-lists in SXML are not only awesome but also more uniform in several ways.