Idiomdrottning’s homepage

In defense of ReStructuredText

OK, everyone knows RST sucks compared to Markdown or even org-mode (and that’s a low bar since org-mode’s format sucks. Org-mode is a wonderful app that I love, but the app is good in spite of its format, not because of it), but they can’t help it since RST preceded Markdown by three years and org-mode by two. RST came out before these good formats and it was all they had to work with at the time.

Now, could they have made better decisions like `a link format that wasn't <from://hell>`_ or a header format that doesn’t require monospace fonts? Sure, that would’ve been nice, but since they didn’t, the format is what it is and I believe in “don’t fix what’s not broken”. Trying to overly NIH old formats is one of the big reasons we have such a messy and crufty tech stack.

That’s right! This post isn’t just an excuse to snipe at RST’s design decisions; I’m trying to make a larger point about overly wanting to improve formats and create layers of specs upon specs upon specs upon specs.

RST is part of the Python ecosystem just like POD is for Perl and roff is for manpages, and it’s better to let it remain that way than to try to change it.

Sometimes simple and direct is the best.

They correctly point out on their webpage that:

reStructuredText is designed for extensibility for specific application domains

That’s great. They’re not selling it as a universal format.

On the other hand:

reStructuredText is a revision and reinterpretation of the StructuredText and Setext lightweight markup systems.

So they already tried to NIH it! But OK, the damage is done, two wrongs don’t make a right and so on.

But what about Markdown?

I love Markdown, I use Markdown, I write in Markdown. But as I’ve said many times: the reason it’s great is because it’s not a “standard” or a “format”, it’s a tool. The exchange format on Gruber’s blog is still HTML, he just uses Markdown to write that HTML more comfortably. Same here on this page, I publish HTML, I just use Markdown to write it.

But what about MyST?

Jonny wrote in suggesting MyST!

I hadn’t heard of that before, thank you ♥︎

I totally love their approach of bridges, conversions, interoperability between Markdown and RST 👍🏻 but the MyST extension syntax does not look very good 🙋🏻‍♀️

For example:

Some **text**!

:::{admonition} Here's my title
:class: tip

Here's my admonition content.{sup}`1`
:::

I hate writing like that, I’d even prefer writing:

Some **text**!

<aside class="admonition tip">
  <p class="admonition-title">Here's my title</p><!-- Although why in the heck is a title a <p>? -->
  <p>Here's my admonition content.<sup>1</sup></p>
</aside>

When you overload your “lightweight” syntax with arbitrary custom grawlixes (like “::: in this context means an aside and :class: in this other context means adding an attribute” etc) it becomes very hard to remember and use. Some people have zero issue, they love writing languages like that, and can’t fathom why others struggle. But I’ve dealt with enough users who struggled with various wiki formats and “lightweight” formats over the years, including myself among those strugglers, to know it’s a problem for many.

The “passthrough” feature of Markdown was one of its core brilliancies.
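As a rough sketch of what passthrough means in practice, here is a toy in Python. It is not how Markdown.pl or any real implementation works, and the name `tiny_markdown` is made up for illustration. Blocks that already start with a tag are left alone; everything else gets a little sugar.

```python
import re


def tiny_markdown(source: str) -> str:
    """Toy converter: sugar for paragraphs and *emphasis*, raw HTML passes through.

    Illustration only (hypothetical helper). It skips escaping, lists,
    headers and everything else a real Markdown implementation handles.
    """
    html_blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        if block.lstrip().startswith("<"):
            # Passthrough: a block that starts with a tag is already HTML.
            html_blocks.append(block)
        else:
            # Sugar: *text* becomes <em>text</em>, and the block becomes a <p>.
            block = re.sub(r"\*([^*\n]+)\*", r"<em>\1</em>", block)
            html_blocks.append(f"<p>{block}</p>")
    return "\n\n".join(html_blocks)


page = """Some *text*!

<aside>
  <p>Here's my aside.</p>
</aside>"""

print(tiny_markdown(page))
```

The aside comes out exactly as typed, and only the first paragraph is touched. No new syntax had to be invented for the aside.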

XML, SGML, sexps, JSON, ini, TOML, and even YAML (sorta) all are structured. They express a nested structure of explicitly marked up data using a small handful of specially blessed characters. Angle brackets for XML, parens and quoting for sexps, lists and maps for JSON and YAML. (I even think some of these go too far, but that’s just me. Most other programmers can handle it. And I can console myself with paredit and emacs.)

Text is not structured in the same way. Text is a list of characters while these other formats are designed to express tree data.

But MyST tries to add structure with some ad hoc ASCII art shenanigans. “People can remember # headers and * lists and *emphasis* so why can’t we add a few more to that?” Because we can remember a few but not all. We can remember our best friend’s phone number but not memorize the phonebook.

In linguistics there’s this idea of “open class” words and “closed class” words. There can be an endless number of nouns and verbs and even adjectives, and language still makes sense without any major uncromulent frobnications. But there can’t be a whole lot of new particles and articles like “the” and “a” and stuff, because then foo frotz quux xyzzy baz fubar. That’s what these “I’ll just add ooooone more thing” projects don’t get. Our memories are full already and can’t handle any more!

Gruber left curly braces free which makes template languages (like Liquid among many others) a match made in heaven for Markdown (it’s also great for “fallthrough” TeX/ConTeXt/Sile). It lets the text just be text and lets us use templating languages to add anything else.
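A minimal sketch of why the free curly braces matter, assuming a stand-in template step written in Python. The `expand` function below is hypothetical and far simpler than Liquid; it only borrows the `{{ name }}` placeholder shape. The template pass can run over the text first without stepping on any Markdown syntax.

```python
import re


def expand(template: str, values: dict) -> str:
    """Replace {{ name }} placeholders; a stand-in for a real template engine."""

    def lookup(match):
        # group(1) is the placeholder name without braces or padding.
        return values[match.group(1)]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lookup, template)


source = "# {{ title }}\n\nWritten by *{{ author }}*."
print(expand(source, {"title": "In defense of ReStructuredText", "author": "someone"}))
```

The `#` and the `*` survive untouched for the Markdown step that follows, because the two layers don’t compete for the same characters.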

I guess that MyST is not that confusing for people who managed to understand the triple backtick fence thing. I never did. I still use the “indent by four spaces” thing.

Once you’ve understood the triple backtick fences it’s not that big of a leap to also add a directivename in curlies and a YAML block of params. And once you’ve understood that, you might want to start using the shorthand for some of those YAML params.

But then nesting those backticks becomes pretty ugly since you add backticks to the outer forms (instead of what would’ve made more sense, coming from org (or from atx headers), which is to add them to the inner forms). And then they have the colon option which requires a command line flag to even work.

It’s almost as if using backticks and colons for this, as opposed to using a symmetric, nestable pair like [] or <> or () or {} or <foo></foo>, was a bad idea.
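To make the nesting complaint concrete, here is a small Python sketch of the bookkeeping that backtick fences push onto whoever (or whatever) writes them. The helper names `outer_fence` and `wrap` are made up for illustration. The rule it leans on is that a fence is only closed by a run of backticks at least as long as the one that opened it, so the outer fence has to be longer than anything inside.

```python
import re


def outer_fence(content: str, minimum: int = 3) -> str:
    """Return a backtick fence long enough to wrap `content`.

    Conservative choice: longer than the longest backtick run found
    anywhere in the content, and at least `minimum` backticks.
    """
    runs = re.findall(r"`+", content)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(minimum, longest + 1)


def wrap(content: str, info: str = "") -> str:
    """Wrap `content` in a fence that its own backticks cannot close."""
    fence = outer_fence(content)
    return f"{fence}{info}\n{content}\n{fence}"


inner = wrap("Here's my admonition content.", "{admonition} Inner title")
outer = wrap(inner, "{admonition} Outer title")
print(outer)
```

The inner block gets three backticks and the outer one has to grow to four, so adding a level of nesting means going back and editing the enclosing fence. With a symmetric pair like `<aside></aside>` there is nothing to count.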

I use asides all the time. By typing <aside>. I also use <i> and <cite> the same way, reserving * and _ for emphasis and not for mislabeling all uses of italic text.

With the kramdown/Textile/RST approach, you have to think <figure><img/> etc etc while what you write is:

![foo](/bar.img "baz"){: standalone loading=lazy }

You’re not thinking “I wanna make an image here! Here we go and make an image in the most convenient image writing syntax possible”, you’re thinking “OK, how can I get Kramdown to generate <figure><img/> etc etc etc.”

I.e. you’re still writing HTML but through a bell jar with oven mittens on and juggling chopsticks, backwards and in heels. It’s not really an “alternate document format”, you still have to think in terms of HTML. And given that, that’s where I choose the Markdown approach of “my .md file is basically a .html file except that I have some sugar for the most common everyday things”.

If we were truly free to design a beautiful document format, that’d be another question entirely, but given that the goal of these texts is to fit browsers, web pages, web page templates, it’s hard to get away from having to think in HTML to some degree. That’s horrible. That sucks. But given that we’re stuck with that premise, that’s where I prefer the original approach of Markdown with plenty of raw HTML as opposed to new special grawlix syntax.

I look kindly on attempts at making full HTML less verbose to write while keeping all of the full document structure, like Haml or SXML. That’s not what I’m slagging here; those are cool.

It’s the bolted-on nature of “wait a minute, I guess we want a way to make figures also, so let’s see where we can squeeze that in” that bugs me.