Life as a Taxonomist

Before I started working in controlled vocabularies myself, I didn’t know the field even existed. Unlike many taxonomists, I didn’t start with a Library Science degree (MLIS). I had a B.A. and M.A. in English, and my first career was as a teacher. When I got my first job as an Assistant Thesaurus Editor, I didn’t know what the job entailed. That’s where I learned the basics of controlled vocabulary construction and maintenance, and it’s where I got my initial taste of how you can drive a taxonomist crazy.

Now, I understand we’re all just trying to do our jobs. No one, I’m sure, sets out to make their taxonomist suppress internal paroxysms of frustration. However, as with any expert role, those who know their industry like to share their stories of dealing with those who don’t.

For your reading pleasure, hopefully, I have assembled a very click-baity titled list of ways to drive your taxonomist crazy.

1. Everything Is Miscellaneous

I’ve always liked the title Everything Is Miscellaneous, by David Weinberger, though I must confess I’ve never actually read the book. The reason I like the title so much is that it pops immediately to mind when users want long, unwieldy, flat lists of concepts, freewheeling user-generated folksonomy keywords, or categories called “General”, “Other”, or “Miscellaneous”. To be fair, there are taxonomists who create such categories themselves for the concepts that just don’t seem to fit anywhere in a taxonomy. 

I’m of the opinion that if you need a “General”, “Other”, or “Miscellaneous” category, either a structure is missing from the taxonomy or the concept is too “young” to be clearly defined. I don’t see a problem with a placeholder category or taxonomy to hold user-suggested or new concepts which can’t find a home in the current taxonomy. I tend to use something like “Suggested Concepts” as a completely separate scheme to house those concepts which need more research or simply don’t seem to have a place to go in the existing taxonomy.

Taxonomists will tell you, however, that nothing is miscellaneous. Every concept has a home.

2. Fix This Miscellany

The only role I can think of who might dislike a messy spreadsheet dropped into their inbox more than a taxonomist is a data scientist. I haven’t had this happen to me very often, but when it does, the time spent cleaning data and fixing formatting can be daunting. Someone has downloaded, built, or otherwise collected a list in which everything seemingly is miscellaneous and asks the taxonomist to “taxonomize this”.

Ok, so maybe this is a part of a taxonomist’s job. In fact, many taxonomists might revel in the challenge of cleaning up that mess into a coherent and beautifully structured taxonomy. Understand, though, this is a lot of work!

3. A Miscellany by Any Other Name

Taxonomy is not SEO keywording. Not every possible concept variant and misspelling is a synonym. It’s true that taxonomies support search and make it better, but there are often more appropriate places to store search variants and misspellings, such as in the search appliance itself or in text analytics software rules.

Taxonomies can include synonyms (alternate labels) and variants (hidden labels), but brainstorming a list of every possible spelling mistake or concept phrase is not really taxonomy work.
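To make the distinction concrete, here is a minimal Python sketch of a concept carrying a preferred label, a true synonym, and a single common variant. The `Concept` class, its field names, and the sample labels are all assumptions made up for illustration; they loosely mirror SKOS-style preferred, alternate, and hidden labels rather than any particular tool's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One taxonomy concept with SKOS-style labels (illustrative names only)."""
    pref_label: str
    alt_labels: set = field(default_factory=set)     # true synonyms
    hidden_labels: set = field(default_factory=set)  # a few known variants, not displayed


def build_label_index(concepts):
    """Map every lowercased label to the preferred label of its concept."""
    index = {}
    for concept in concepts:
        labels = {concept.pref_label} | concept.alt_labels | concept.hidden_labels
        for label in labels:
            index[label.lower()] = concept.pref_label
    return index


minutes = Concept(
    pref_label="Meeting Minutes",
    alt_labels={"Minutes of Meeting"},
    hidden_labels={"meeting minuets"},
)
label_index = build_label_index([minutes])

print(label_index.get("meeting minuets"))  # Meeting Minutes (stored hidden label)
print(label_index.get("meetng minits"))    # None (not stored in the taxonomy)
```

The second lookup fails on purpose: an arbitrary misspelling like that is arguably better handled by fuzzy matching in the search layer than by an ever-growing list of hidden labels.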

So, please, don’t send a huge list of variant terms to your taxonomist.

4. What Miscellany Is This?

There is a guiding principle for controlled vocabulary developers called literary warrant. Essentially, ask whether the concept you want to add to the taxonomy exists in the literature. Taking this to a broader scope, is the concept you want represented in print or used as a term of art in your field?

Or, maybe, is it some jargon-y, made-up concept you wish to popularize? Marketing creativity can often be at odds with taxonomy practices by attempting to include ephemeral or ahead-of-the-curve terms the marketing department hopes to develop into actual concepts. Put these in the “Miscellaneous” bucket and wait for literary warrant to catch up.

5. Every Miscellany Is Special

There are several items which may fall into the “Everything Is Special” category, but I’m going to focus on pre-coordinated terms, that is, terms which combine several concepts into a single label. In the world of indexing, and, in fact, in many systems which retrieve information, concepts need to be pre-coordinated to be effective.

In most taxonomy applications, however, there is no need for pre-coordinated concepts, as multiple concepts can be applied to content or Boolean operators can be used in search for retrieval purposes. Following the notion that “everything is special”, there is also the common feeling that “everyone is special”, including the concepts any given user needs in the taxonomy. The result, in my experience, is requests for specialty terms which needn’t be that special. For example, how do “Marketing Meeting Minutes” differ from “Financial Meeting Minutes” and “Engineering Meeting Minutes”? The fact is, they are all a document type called “meeting minutes”. The group which creates the meeting minutes really should be irrelevant if content is tagged with other concepts such as the group names. The level of specificity, in most cases, is unnecessary.
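The meeting minutes example can be sketched in a few lines of Python. The document IDs, tags, and the `find_all` helper are hypothetical; the point is only that a document type tag and a group tag, combined with a Boolean AND at search time, retrieve the same content a pre-coordinated “Marketing Meeting Minutes” term would.

```python
# Hypothetical content items, each tagged with independent concepts.
documents = {
    "doc-1": {"Meeting Minutes", "Marketing"},
    "doc-2": {"Meeting Minutes", "Engineering"},
    "doc-3": {"Budget Report", "Marketing"},
}


def find_all(required_tags):
    """Return IDs of documents carrying every required tag (Boolean AND)."""
    required = set(required_tags)
    return sorted(doc_id for doc_id, tags in documents.items() if required <= tags)


print(find_all(["Meeting Minutes", "Marketing"]))  # ['doc-1']
print(find_all(["Meeting Minutes"]))               # ['doc-1', 'doc-2']
```

With two small vocabularies (document types and groups) the combinations come for free, whereas pre-coordination would need one new term for every pairing.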

6. If the Miscellany Fits…

Taxonomists are artists. They have built a beautiful taxonomy and now the real world has rudely intruded on its brilliance. Often the culprit is a consuming system which cold-heartedly doesn’t recognize the glory of a well-structured taxonomy. These systems want to consume taxonomies as flat lists or pathways, strip away semantically rich relationships (SharePoint, what say you?), limit the taxonomy depth (again, SharePoint?), or remove concept attributes. 

I believe a taxonomy should always be built agnostic to the tool in which it is developed or the tools which consume it. On the other hand, a beautiful taxonomy isn’t worth anything if it isn’t operationalized. Sometimes, you just have to make the miscellany fit.
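As a rough idea of what “making it fit” can involve, the sketch below flattens a small hierarchy into path strings and optionally cuts it off at a maximum depth. The nested-dict structure, the sample concepts, and the `flatten` function are assumptions for illustration, not the import format of SharePoint or any other system.

```python
# Hypothetical taxonomy as nested dicts: each key is a concept, each value its children.
taxonomy = {
    "Documents": {
        "Meeting Minutes": {},
        "Reports": {"Budget Report": {}, "Status Report": {}},
    }
}


def flatten(tree, parents=(), max_depth=None):
    """Yield one ' > '-joined path per concept, stopping below max_depth if given."""
    for label, children in tree.items():
        path = parents + (label,)
        yield " > ".join(path)
        # Only descend while the current path is shallower than the depth limit.
        if max_depth is None or len(path) < max_depth:
            yield from flatten(children, path, max_depth)


for line in flatten(taxonomy, max_depth=2):
    print(line)
# Documents
# Documents > Meeting Minutes
# Documents > Reports
```

Keeping the full hierarchy as the source of truth and generating the flattened view on export means the depth limit belongs to the delivery step, not to the taxonomy itself.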

7. Everything Is Everything

Words rarely have just one meaning. The point of taxonomy is to unambiguously identify concepts so we have clarity of meaning. Polyhierarchy is a valid taxonomy construction in which one clearly defined concept occupies several places in a taxonomy structure.

It’s very easy to abuse polyhierarchy by adding a concept in every conceivable location, often with the idea that the concept is a “place” the user wants to go to and, hence, should have multiple ways to get there through hierarchy. I call this the “conceptual end-cap” based on my experience in e-commerce taxonomies in which product owners wanted to sell their products in relation to other products. That’s a fine and achievable goal, but not through mapping every conceivable place a product could live in a hierarchy. Rather, use relationships and faceted taxonomies to get there.
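One way to picture the alternative is a minimal Python sketch in which each concept keeps a single broader concept and cross-selling is expressed through a separate “related” link. The product names and the `broader` and `related` dictionaries are made up for illustration.

```python
# Each concept has ONE broader concept; cross-selling uses "related" links instead.
broader = {
    "Tortilla Chips": "Snacks",
    "Salsa": "Condiments",
}
related = {
    "Tortilla Chips": {"Salsa"},
}


def related_concepts(concept):
    """Return related concepts, treating the relationship as symmetric."""
    found = set(related.get(concept, set()))
    found.update(source for source, targets in related.items() if concept in targets)
    return sorted(found)


print(related_concepts("Salsa"))  # ['Tortilla Chips']
print(broader["Salsa"])           # Condiments
```

Salsa still lives in exactly one place in the hierarchy, yet a product page for tortilla chips can surface it through the relationship.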

Everything is not everything.

8. Everything Is Everywhere All at Once

I solicited feedback for this blog post on LinkedIn and learned that another way to make a taxonomist crazy is to request geography-based taxonomies. My interpretation of this is a taxonomy which organizes concepts based on geography, even if those concepts are potentially cross-geography or are regional variants.

An example of this may be a product taxonomy including products which are available in North America and then creating a separate taxonomy or branch for products available in Asia (or even by specific countries). Another example may be including “crisps” in a UK regional taxonomy and using the concept “chips” or “potato chips” in the US taxonomy rather than making them synonymous concepts which are driven by localized web page experiences.
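The crisps and chips case might be modelled as one concept with locale-specific labels, as in this minimal Python sketch. The concept ID, the locale codes, and the `display_label` helper are hypothetical, and the fallback to a default locale is an assumption about how a localized page might behave.

```python
# One concept, several locale-specific labels (ID and locales are made up).
concepts = {
    "snack-001": {"en-US": "Potato Chips", "en-GB": "Crisps"},
}


def display_label(concept_id, locale, fallback="en-US"):
    """Pick the label for the page's locale, falling back to a default locale."""
    labels = concepts[concept_id]
    return labels.get(locale, labels[fallback])


print(display_label("snack-001", "en-GB"))  # Crisps
print(display_label("snack-001", "en-AU"))  # Potato Chips
```

Content is tagged once with `snack-001`, and the regional wording is decided at display time instead of being duplicated across regional branches.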

Geography-based taxonomies risk being much like the previously described polyhierarchical structures in which every concept is everywhere all at once.

9. Everything Is Over-Engineered

If you are a taxonomist who has ever over-engineered a taxonomy, raise your hand. I’ll speak for myself when I say “guilty as charged”. 

I was of the mindset that a single, monolithic, deep, and faceted taxonomy was the way to ensure that terminology remained consistent. The advantage of a taxonomy of this nature is that it is a single maintenance job. The disadvantages are that it is difficult to maintain, as structures and concepts are easily buried under layers of hierarchy and risk being paralleled in different sections using a different organizational premise. Additionally, providing proper access permissions to a complex taxonomy can be challenging.

Perhaps the worst thing about an over-engineered taxonomy is delivery. As mentioned above, your epic taxonomy will likely be flattened or otherwise distorted by a consuming system. Creating ways to filter and deliver a taxonomy for use by end users can become a governance nightmare as APIs start requiring different IDs, collections, or statuses to deliver the right concepts to the right UI. While this is achievable and desirable, it can be more difficult with a single, complex taxonomy rather than using shallower, inter-related schemes to get the same result.

When everything is over-engineered, everything becomes more complex.

10. Every Miscellany Is Hilarious

Go on, do it. Make a taxidermy joke. Ask if taxonomy has anything to do with taxes. Nudge your taxonomist in the ribs and ask whether there’s another name for a thesaurus. Send an available job opening for an oncologist to an ontologist. Ask your ontologist about the nature of being.

You may just get an earful of polyhierarchy or an instance right in your class. 
