Idiomdrottning’s homepage

Digital Rain

Still from the Matrix (1999) opening sequence

I’ve always wondered about what a goofy world it’d be if the Wachowskis were right, and code encoded in base124, represented as mirrored kana & numbers, with each execution thread running downwards separately like raindrops on a fluorescent CRT, were the most efficient way to inspect running processes.

Obv I don’t think that the encoding is syncopated relative to the represented symbols.

There’s, most likely, an N:M ratio of symbols in the underlying process language to symbols in the display language, where either or both of N and M may be variables but they are both integers and each underlying string of process symbols has one unique set of display symbols.

Uh, that got a bit abstract. A more concrete counterexample…

The base64 representation of utf-8 text gets syncopated such that 4 display symbols are 3 process bytes, and each underlying symbol may be anywhere from 1 byte (in the case of the 7-bit alphabet) to 29 bytes (in the case of some big emoji). The underlying symbols don’t match up at all with symbols in the display layer.

Even if every symbol is within the 7-bit subset, there’s syncopation.
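Here’s a rough sketch of that in Python, using only the standard library. The sample symbols, the `b64` helper and the little word "see" are just picks for illustration. It first counts utf-8 bytes per symbol, then shows the same plain 7-bit word coming out as different base64 symbols depending on which byte offset it starts at.

```python
import base64

# Bytes per underlying symbol in utf-8 (labels are ASCII so this prints anywhere).
samples = {
    "ASCII letter": "a",
    "e with acute": "\u00e9",
    "euro sign": "\u20ac",
    "grinning face": "\U0001F600",
    # Four emoji glued together with three zero-width joiners.
    "family (ZWJ sequence)": "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466",
}
byte_counts = {label: len(s.encode("utf-8")) for label, s in samples.items()}
for label, count in byte_counts.items():
    print(f"{label}: {count} bytes")


def b64(text: str) -> str:
    """Base64 of the utf-8 bytes of text, as a plain string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# The same 7-bit word at byte offsets 0, 1, 2 and 3.
# Only offsets that are multiples of 3 line up with the 4-symbol groups.
for prefix in ["", "-", "--", "---"]:
    print(repr(prefix + "see"), "->", b64(prefix + "see"))
```

The word only shows up as the same four display symbols when it happens to start on a multiple of three bytes. At the other two offsets its bits get smeared across neighboring symbols.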

Uh, let me try to phrase that again.

Let’s say we have this sentence:

“Hi, nice to see you!” in base64 is “SGksIG5pY2UgdG8gc2VlIHlvdSE=”, while “Hey, nice to see you!” is “SGV5LCBuaWNlIHRvIHNlZSB5b3Uh”—it’s not gonna be fun for human brains to pattern match or learn to “read” base64 data on the fly especially when raining down the screen.
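To check that for yourself, here’s a small Python sketch (the `b64` helper and variable names are mine) that encodes both sentences and lists the positions where the two base64 strings happen to agree.

```python
import base64


def b64(text: str) -> str:
    """Base64 of the utf-8 bytes of text, as a plain string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


a = b64("Hi, nice to see you!")
b = b64("Hey, nice to see you!")

# Positions where both encodings show the same display symbol.
same_position = [i for i, (x, y) in enumerate(zip(a, b)) if x == y]

print(a)
print(b)
print(len(same_position), "of", len(a), "positions match")
```

One extra letter near the start shifts every later byte by one, so “, nice to see you!” is shared between the two sentences but looks unrelated in the two encodings.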

So back when I first saw The Matrix, I was like “wow, they’re pretty baller to be able to decode this in realtime”. I was in high school and we had just learned to turn two hex symbols or three octal symbols into one byte symbol or eight bit symbols, and I imagined the Matrix dorks doing something similar except instead of base 16 or 8, it was base 124. (Or I guess base 66 if there’s no hiragana.)
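The high school version of that trick, sketched in Python. The letter "H" is just an arbitrary example byte.

```python
# One byte, written out in three different bases.
value = ord("H")

as_bits = format(value, "08b")   # eight bit symbols
as_hex = format(value, "02x")    # two hex symbols, 4 bits each
as_octal = format(value, "03o")  # three octal symbols, 3 bits each

print(as_bits, as_hex, as_octal)

# All three spellings read back to the same byte.
assert int(as_bits, 2) == int(as_hex, 16) == int(as_octal, 8) == value
```

Hex lines up neatly because 16 is a power of two and two digits are exactly 8 bits. Three octal digits cover 9 bits, so the leading digit of a byte only ever goes from 0 to 3. 124 isn’t a power of two at all, so a base 124 symbol wouldn’t sit on bit boundaries in the first place.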

That’s what I thought all these years but today I realized that more likely that’s not how it works. More likely it’s the case that sets of process symbols turn into sets of display symbols and vice versa in a more consistent mapping. Like APL basically or like any other pretty-printer.

You see three symbols and you can “read” what that triplet of symbols always means (of course, syntax means that the surrounding context matters, but the surrounding context is also “readable”). Or one symbol, or four. Maybe it’s variable arity (like how the interned visual representations of Lisp symbols have variable byte lengths) or maybe it’s sorta like opcodes and there are only a few possible instruction widths, but, either way, it’s likely not that they use a syncopated base encoding.
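A toy sketch of what I mean, in Python. Everything here is made up for illustration: the `GLYPHS` table, the token names and the glyph choices are hypothetical, not anything from the movie. Each process token always turns into the same group of display glyphs, with variable width, wherever it lands in the stream.

```python
# Hypothetical display table: one fixed glyph group per process token.
# The groups start with different glyphs, so a stream can be split unambiguously.
GLYPHS = {
    "load": "ｱ7",
    "store": "ｶ",
    "jump": "ﾐ2ﾈ",
}


def render(tokens):
    """Turn process tokens into one flat stream of display glyphs."""
    return "".join(GLYPHS[token] for token in tokens)


def read_back(stream):
    """Split a glyph stream back into tokens by matching known groups."""
    tokens, i = [], 0
    while i < len(stream):
        for token, glyphs in GLYPHS.items():
            if stream.startswith(glyphs, i):
                tokens.append(token)
                i += len(glyphs)
                break
        else:
            raise ValueError(f"unknown glyph at position {i}")
    return tokens


program = ["load", "jump", "store", "load"]
stream = render(program)
print(len(stream), read_back(stream))
```

Unlike the base64 case, "load" looks the same at the start and at the end of the stream, so a human could plausibly learn to recognize it.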

I mean yes, the, hmm, I don’t wanna spoil the movie or the nature of the antagonists, but let’s just say that in the context of the movie it’s likely that they do pass binary data around. But the dorks aboard the Nebuchadnezzar would most likely “decompile” for their displays rather than looking at a syncopated base rep, is what I’m saying.

And, for some reason, they really dig on the mirrored numbers & kana to represent that.