7off is a Chicken Scheme program that can convert from Markdown to Gemini’s text format, based on lowdown.scm.
It can be used as a library, with a procedure named 7off that reads current-input-port and writes to current-output-port. There are three keyword arguments: allow-wack-headers, default-alt, and polish.
A stand-alone, command-line binary named 7off is also included.
By default, it reads from standard in and prints to standard out, but
there are also the --input-file (a.k.a. -i) and --output-file (a.k.a. -o) options to set those.
7off --input-file my-snazzy-example.md --output-file my-drab-output.gmi
There are five other options.
By default, it refuses to convert documents where there are skipped
header levels. Use --allow-wack-headers (a.k.a. -w) to allow these.
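To make "skipped header levels" concrete, here is a rough Python sketch of such a check. It is not 7off's actual code (that is Scheme), and the function name is made up for illustration. It assumes that "skipped" means a header more than one level deeper than the header before it, that a document starting at H2 or deeper also counts, and that only `#`-style headers matter.

```python
import re

# Hypothetical helper, not part of 7off: find headers that jump more than
# one level deeper than the previous header (e.g. H2 straight to H4).
def skipped_header_levels(markdown):
    skips = []
    previous = 0  # assumption: a document starting at H2 also counts as a skip
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = re.match(r"(#{1,6})\s", line)
        if not match:
            continue
        level = len(match.group(1))
        if level > previous + 1:
            skips.append((number, previous, level))
        previous = level
    return skips
```

Each returned tuple is (line number, previous level, offending level). Going back up several levels at once, say from H4 to H2, is not treated as a skip here.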
It also warns when it has flattened any lists, when there are H4 or
deeper headers, and when reference link references are missing. Use
--disable-warnings (a.k.a. -q) to disable these warnings.
Changing straight quotes to curly quotes and consecutive hyphens to
various dashes is off by default. Use --polish (a.k.a. -p) to enable it.
You can have a list of specific URLs to swap out in a file that contains, as an s-expression, a list of dotted pairs where each car is a URL to change and each cdr is what to change it to.
Like this:
(("http://boring.and-so-on/here" . "gemini://much-more-inter.esting/stuff-here")
("https://another-quite-boring.tedious/url" . "gemini://much-better-mirror-for.gemini-capsule/here"))
And point to that file with --swap-urls (a.k.a. -s).
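As an illustration of the file format, here is a small Python sketch that reads such a list of dotted pairs and applies it. This is not how 7off does it (7off reads the s-expression natively in Scheme); the function names are hypothetical. The sketch assumes that URLs are swapped only on an exact match and that the strings contain no escaped quotes.

```python
import re

# Matches one dotted pair of two double-quoted strings: ("from" . "to")
PAIR_RE = re.compile(r'\(\s*"([^"]*)"\s+\.\s+"([^"]*)"\s*\)')

# Hypothetical helper: turn the s-expression text into a lookup table.
def read_swaps(text):
    return dict(PAIR_RE.findall(text))

# Hypothetical helper: swap a URL if it is listed, otherwise keep it.
def swap_url(url, swaps):
    return swaps.get(url, url)
```

URLs that are not in the list pass through unchanged.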
Finally, the default alt tag for source code snippets is “Code” without the preceding space.
```Code
Use --default-alt (a.k.a. -a) to change that.
The plan in the future is to, I dunno, use tags in YAML preamble or something to be able to set specific alt tags, to support arbitrary ASCII art.
Standard Markdown rules: Hardwrap text, and 7off will softwrap it,
and please use double space at the end of lines where you want to
preserve a hardwrap.
This is also true inside blockquotes.
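The wrapping rule can be sketched in a few lines of Python. This is only an illustration, not 7off's implementation, and the function name is made up. It assumes the input is a single paragraph with no blank lines in it.

```python
# Hypothetical helper: join hard-wrapped lines into long Gemini text lines,
# but keep a break wherever the source line ends in two spaces.
def softwrap(paragraph):
    out = []
    current = []
    for line in paragraph.split("\n"):
        keep_break = line.endswith("  ")
        if line.strip():
            current.append(line.strip())
        if keep_break and current:
            out.append(" ".join(current))
            current = []
    if current:
        out.append(" ".join(current))
    return out
```

Each item in the returned list would become one line of Gemini text.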
Nested lists, it flattens (and prints a warning that it did so).
As for nested blockquotes, it currently does pass them through. The plan is to at least start printing a warning when it does this. Please don’t start giving nested lists and nested blockquotes special support in clients. Better practice is to break up the quote into separate blocks, giving attribution to each.
Gemini text supports only one level of list or blockquote.
I realized that Markdown links do have two qualities that Markdown to Gemini translators can make use of.
The first is something that HTML really doesn’t do, normally. It’s a reference location separate from the inline link.
The other is shared by HTML, but a kind of rarely used feature, and it’s a title distinct from the element’s text. Both inline links (a.k.a. “explicit links”) and reference links can have a title.
Putting those two things together, it becomes kind of natural to turn
Hi, my [link] that I just casually mention
[link]: gemini://my.boring/url "I like this link"
to
Hi, my link that I just casually mention
=> gemini://my.boring/url I like this link
I.e. use the reference location to determine where the link line should go, and the link title to determine what it should be called.
So reference links have their Gemini semantics kind of given.
When there is no title, the prose element text is used.
Hi, my [link] that I just casually mention
[link]: gemini://my.boring/url
to
Hi, my link that I just casually mention
=> gemini://my.boring/url link
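Both reference link examples above can be reproduced with a short Python sketch. It is a simplification for illustration, not 7off's code, and the names are hypothetical. It assumes only the shortcut form `[link]` is used in prose (not `[text][label]`) and that each definition sits on its own line.

```python
import re

# A reference definition: [label]: url "optional title"
DEF_RE = re.compile(r'^\[([^\]]+)\]:\s*(\S+)(?:\s+"([^"]*)")?\s*$')
# A shortcut reference in prose: [label] not followed by "("
REF_RE = re.compile(r'\[([^\]]+)\](?!\()')

# Hypothetical helper: definitions become link lines where they stand,
# named by their title (or by the label when there is no title);
# references in prose are reduced to their plain text.
def convert_reference_links(markdown):
    out = []
    for line in markdown.splitlines():
        match = DEF_RE.match(line)
        if match:
            label, url, title = match.groups()
            out.append(f"=> {url} {title or label}")
        else:
            out.append(REF_RE.sub(r"\1", line))
    return "\n".join(out)
```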
Inline links, by contrast, are links like this:
Hi, my [link](gemini://my.boring/url) that I just casually mention
In a short text line or list line with just one link, let’s turn the entire line into the link.
=> gemini://my.boring/url Hi, my link that I just casually mention
Otherwise, “extract” the link (keep the prose text in there), and then extracted links can show up before the next header (i.e. at the end of the section), or before the next non-extracted link (i.e. preceding them in the same link list), or at the end of the document.
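Here is a rough Python sketch of that rule for a single line. It is not 7off's implementation. The function name, the 80-character cutoff for "short", and placing extracted links directly after the line (rather than at the end of the section or document) are all assumptions made to keep the example small.

```python
import re

# An inline link: [text](url)
INLINE_RE = re.compile(r'\[([^\]]+)\]\(([^)\s]+)\)')

# Hypothetical helper: convert one Markdown line to Gemini lines.
def convert_inline_links(line, max_len=80):
    links = INLINE_RE.findall(line)       # list of (text, url)
    text = INLINE_RE.sub(r"\1", line)     # prose with link markup removed
    if len(links) == 1 and len(text) <= max_len:
        # Short line with exactly one link: the whole line becomes the link.
        return [f"=> {links[0][1]} {text}"]
    # Otherwise keep the prose and extract the links as separate link lines.
    return [text] + [f"=> {url} {label}" for label, url in links]
```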
7off currently supports a much narrower range of markdown than any other markdown to gemini converter I know. It currently doesn’t support any extensions compared to Gruber style basic Markdown.
It doesn’t even support the ``` thing, ironically for something you
want to publish as gemini text. You need to indent pre blocks by four
spaces. (The git repo has a simple Unix text filter to help with that, anti-backticks.scm. It’s just a small stdin/stdout toy; pipe to sponge if you want to edit in place.)
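To show what a filter of that kind does, here is a minimal Python sketch. It is not a port of anti-backticks.scm, whose exact behaviour I am not reproducing here, and the function name is hypothetical. It assumes fences start at the beginning of a line and are already surrounded by blank lines, since indented pre blocks need those in Markdown.

```python
# Hypothetical helper: drop the fence lines and indent the fenced content
# by four spaces, leaving everything else untouched.
def unfence(text):
    out = []
    in_fence = False
    for line in text.split("\n"):
        if line.startswith("```"):
            in_fence = not in_fence  # fence lines themselves are dropped
            continue
        out.append("    " + line if in_fence else line)
    return "\n".join(out)
```

Any alt text after the opening fence is thrown away in this sketch.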
This is not a philosophy statement on my end—I use the heck out of such extensions when available, and backtick support would solve the alt text problem. It’s just that the upstream library, lowdown.scm, doesn’t support them yet.
That said, lowdown.scm has good support for HTML elements in the markdown text. I plan to develop the support for that further.
Currently, as far as HTML elements go, primarily I properly strip some
of the inlines like <cite>, <i> etc, so you can freely use them in
your source document.
I support the <h1>, <h2> etc series, <del> just because I think
it’s cute (it emits the matching number of ^W digraphs to indicate
deleted text), and <table> with <th>, <tr> and <td> (although
it currently can’t understand colspan).
This version just outputs such tables as tab-separated values. That’s
in one sense a step back from the beautiful Unicode tables that
md2gemini supports. Hopefully this is more accessible for low-vision
technology until browsers that make it easier to skip pre blocks can
catch up.
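The table handling can be illustrated with a small Python sketch using the standard library's html.parser. It is not 7off's code, and the class and function names are made up. Like 7off, it ignores colspan. It also assumes a simple, non-nested table.

```python
from html.parser import HTMLParser

# Hypothetical parser: collect the text of each <th>/<td> cell, row by row.
class TableToTsv(HTMLParser):
    def __init__(self):
        super().__init__()
        self.rows = []
        self.row = None   # cells of the row being read, or None outside <tr>
        self.cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("th", "td") and self.row is not None:
            self.cell = []

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self.cell is not None:
            # Collapse whitespace so stray tabs or newlines cannot break the TSV.
            self.row.append(" ".join("".join(self.cell).split()))
            self.cell = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None

# Hypothetical helper: one output line per table row, cells joined by tabs.
def table_to_tsv(html):
    parser = TableToTsv()
    parser.feed(html)
    parser.close()
    return "\n".join("\t".join(row) for row in parser.rows)
```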
In the future, I want to also support <dl>, <dt>, <dd>, <a>,
and <img> elements.
The biggest flaw in 7off’s markdown support currently is that it, unlike Gruber markdown but like kramdown, requires a blank line before blockquotes. In other words, it won’t recognize this:
Sandra wrote:
> Whaddayamean, I thought Markdown's syntax was inspired
> by how people used to write email in the nineties?
To sorta compensate for this, in a sort of half-thought-through, iffy decision, I decided to remove blockquote-preceding blank lines in the Gemini output. If we can’t have beautiful input, we shall at least have beautiful output.
Note that blank lines are considered part of paragraphs in Gemini text semantics. It’s more idiomatic in Gemini to not need blank lines everywhere.
For example, this is also fine
* non-link lines
=> /page and link lines
=> /home all mixed up
* together
To force extra blank lines, you can use a markdown hr (three or more hyphens on a line).
Taking all that together, you see that I try to be very strict and drab in what I output. I remove inline markup such as emphasis.
The strict and drab version of Gemini text I support here is that way for a reason—accessibility primarily—but this strictness is not something for Gemini clients to emulate.
To restate that: clients should not expect, want, or care about all documents being as strict and nerdy as the ones created here.
For example, clients and scrapers must not care about, or rely on, there not being any wack header levels.
That’s part of the niceness of Gemini, it’s really hard to mess up documents. Four supported line types, and optionally three advanced ones, and that’s it. Seven types.
Documents that don’t conform to the strictness that 7off aspires to are not wrong.
It’s not spec-breaking to put *asterisks* around a word for example, even if that’s something 7off deliberately removes.
The key to a successful protocol language is to make as few and simple demands as possible on each other.
Client writers, please never support, for example, *asterisked* words by highlighting or bolding them (that’d be “embracing and extending”), but, also, please don’t bork on them. It’s just text. It’s just seven line types.
People can do whatever. It’s fine.
This is my third attempt at this; for the longest time I used md2gemini, with some contributions by me (uh, that got lost in their git history somehow) and I started tacking on more and more preprocessing and postprocessing.
Then, I tried making a Gemini writer for pandoc, in Lua. I didn’t get very far with that approach and never put it in actual production.
Finally, I made 7off. There are still issues to fix, but this is something I use in practice on hundreds of pages.
The source code, including a license file (AGPL) and a “Hacking” text (explaining the architecture, the separate parsing passes etc) is available via
git clone https://idiomdrottning.org/7off
The name is sort of a, uh, it’s a reference to Gemini having seven line types and to the original “markdown” name being a pun on discount pricing.