Idiomdrottning’s homepage

Against Tree Taxonomy Models

It’s easy to get the impression that a tree hierarchy is the solution to the lumper vs splitter problem.

You know, how a splitter would wanna put Bach and Sabbath in two different categories (“baroque” vs “rock”), while a lumper would put them in the same category (“music”). A common solution to their dilemma is to use a tree model. Using sub-genres. Putting “baroque” and “rock” as separate sub-genres of music. And then you find a third band, Metallica let’s say, and the splitter wants to put it in a separate category from Sabbath, while the lumper is saying “whaddayamean? You already made the ‘rock’ sub-genre?”. Again, tree branching to the rescue: make more sub-genres under rock: “heavy rock” for Sabbath and “thrash metal” for Metallica.

And that’s the model a lot of us are stuck with, from school. But a tree is not enough. (Maybe this post is gonna get a li’l bit timecubey here.) Like, a barbershop, a post-rock quartet, and the Beatles would all fall under three different branches in the music genre tree if sorting by instrumentation and intent, but they do have the property of “having four band members” in common. Similarly, a butterfly, a sparrow, and a fruitbat are all completely different in the tree of biology but they all can flap wings to fly.

Instead, have a data structure where you can record specific traits as tags, and then apply operators to those tags, such as building sets of traits and organizing them into various trees according to the sets-of-sets you find. You could generate a music tree from “orchestra” vs “band”, and then split band into “quartet” vs “trio” vs “duo” etc. Or whichever way you want to slice it today.
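
Here’s a rough sketch of what that could look like in Python. None of this is a real library: the item names, the tags, and the helpers `with_tags` and `build_tree` are all made up for illustration. The idea is just that the tags are the stored data, and any tree is something you generate from them on demand.

```python
from collections import defaultdict

# Hypothetical catalogue: every item is just a name plus a flat set of tags.
TAGS = {
    "The Beatles": {"band", "quartet", "rock"},
    "a barbershop group": {"band", "quartet", "a cappella"},
    "a post-rock quartet": {"band", "quartet", "post-rock"},
    "a power trio": {"band", "trio", "rock"},
    "a symphony orchestra": {"orchestra"},
}


def with_tags(tags, *wanted):
    """Return the sorted names of items that carry every wanted tag."""
    wanted = set(wanted)
    return sorted(name for name, have in tags.items() if wanted <= have)


def build_tree(tags, levels, names=None):
    """Generate a nested dict from the tags, one tree level per tag list.

    `levels` is a list of lists of alternative tags, for example
    [["orchestra", "band"], ["quartet", "trio", "duo"]].
    Items carrying none of a level's tags land under "(other)".
    """
    if names is None:
        names = sorted(tags)
    if not levels:
        return list(names)
    first, rest = levels[0], levels[1:]
    groups = defaultdict(list)
    for name in names:
        # Use the first tag of this level that the item carries.
        key = next((t for t in first if t in tags[name]), "(other)")
        groups[key].append(name)
    return {key: build_tree(tags, rest, members) for key, members in groups.items()}


# A cross-cutting query that a single fixed tree can't answer directly.
quartets = with_tags(TAGS, "quartet")

# Two different trees generated from the same tags.
by_lineup = build_tree(TAGS, [["orchestra", "band"], ["quartet", "trio", "duo"]])
by_genre = build_tree(TAGS, [["rock", "post-rock", "a cappella"]])
```

The “four band members” question becomes a plain subset check, and `by_lineup` and `by_genre` are two equally valid slicings of the same items. Picking the first matching tag per level is a simplification; an item with two tags from the same level would need a rule for that, or would have to show up in both branches.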

Of course, that’s what modern biology does do, since it’s recording the genome.