Information Panopticon


Themes and Trends from Taxonomy Boot Camp London

[Image: Tower of London (Pixabay): https://pixabay.com/photos/london-tower-of-london-england-4395918/]

“And history was the reason why she would never go to London. She saw it as dominated by the Bloody Tower, Fleet Street full of demon barbers, as well as dangerous escalators everywhere.” – Anthony Burgess, Inside Mr. Enderby

After a six-year hiatus, begun by the Great Plague of 2020 and extended by the loss of a conference venue, Taxonomy Boot Camp London returned in 2026 at the America Square Conference Center. The event was co-located with the KMWorld Europe conference, allowing attendees to come together for keynote sessions while choosing between two tracks for each conference. Just blocks from the Tower of London, and literally encompassing part of the ancient walls of the City of Londinium, this steadfast and ancient venue hosted an audience immersed in the rapidly changing world of knowledge organization systems and artificial intelligence.

As with past conferences, I’m going to sum up some of the key themes and trends of Taxonomy Boot Camp as I heard them.

Working with AI

A prevalent theme across all sessions was working with AI rather than against it. While there is common concern that AI will replace jobs and remove the human from the work equation, most sessions focused on using AI as a tool for tasks that are repetitive, time-consuming, or inconsistent: identifying and extracting entities for taxonomy building or inclusion, summarizing large quantities of text, and automatically classifying content using taxonomy values.

Another key focus was on the probabilistic notion of machine learning models:

At a high level, probabilistic AI models uncertainty and provides outcomes based on likelihoods. This means that it doesn’t always offer one definitive answer but instead provides a range of possibilities with associated probabilities. Deterministic AI, on the other hand, is rule-based, designed to yield specific, predictable outcomes without room for variability once given a particular input. (Decision Point Advisors)

Machine learning models may generate different answers to the same question, and can present incorrect output with confidence, a phenomenon often termed a “hallucination”. Grounding machine learning models in knowledge bases, including deterministic structures like knowledge graphs built from taxonomies and ontologies, creates a neuro-symbolic AI approach that provides more consistent answers.
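To make the neuro-symbolic idea concrete, here is a minimal sketch, not drawn from any session: the taxonomy, the `mock_model_labels` stand-in, and its scores are all invented for illustration. A probabilistic model proposes labels with likelihoods, and a deterministic controlled vocabulary filters and normalizes them.

```python
# Illustrative only: a tiny taxonomy mapping entry terms to preferred paths.
TAXONOMY = {
    "automobile": "Vehicles > Cars",
    "car": "Vehicles > Cars",
    "sports car": "Vehicles > Cars > Sports Cars",
    "spy": "People > Occupations > Intelligence",
}

def mock_model_labels(text: str) -> list[tuple[str, float]]:
    """Stand-in for a probabilistic model: candidate labels with likelihoods."""
    return [("sports car", 0.91), ("race track", 0.72), ("spy", 0.87)]

def grounded_labels(text: str, threshold: float = 0.6) -> list[str]:
    """Keep only candidates that resolve to a controlled-vocabulary node."""
    results = []
    for label, score in mock_model_labels(text):
        if score >= threshold and label in TAXONOMY:
            results.append(TAXONOMY[label])  # normalized preferred path
    return results

print(grounded_labels("Bond drove an Aston Martin."))
# "race track" is dropped: confidently scored, but not in the vocabulary
```

The deterministic layer cannot stop the model from guessing, but it keeps unvetted guesses out of the final answer, which is the consistency benefit the sessions described.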

Content and Training

Working with AI also means curating the training data made available to machine learning models. Publicly available large language models (LLMs) are trained on large, easily accessible data sets: primarily content available on the Internet. As we well know, content quality on the Internet is as varied as the people who create it and make it available. While LLMs get the benefit of varied input, they also inherit the biases of that input. Supplementing with synthetic training data, or grounding responses at query time with retrieval-augmented generation (RAG), can improve results. In particular, drawing on organization-specific knowledge bases can yield responses that are more applicable to your domain, with fewer erroneous answers.
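As a rough illustration of the RAG pattern, here is a sketch in which the knowledge base, retriever, and prompt format are all invented; real systems typically use embeddings and a vector store rather than keyword overlap, but the shape is the same: retrieve domain passages, then hand them to the model as context.

```python
# Invented organization-specific passages standing in for a knowledge base.
KNOWLEDGE_BASE = [
    "A controlled vocabulary is an organized list of preferred terms.",
    "An ontology defines classes and the relationships between them.",
    "Retrieval-augmented generation supplies retrieved context at query time.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank passages by how many query words they share."""
    words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda p: len(words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved passages so the model answers from our domain."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is a controlled vocabulary?"))
```

The point of the pattern is that the LLM itself is unchanged; the organization-specific knowledge arrives in the prompt, which is why curating that knowledge base becomes part of the taxonomist's job.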

Deciding what training data to use, and how taxonomy and ontology structures become part of that training data, falls partly within the taxonomist's purview, so familiarity with which LLMs are in use, and for which use cases, will be an important part of the taxonomist's changing role.

Language Consistency

While this is nothing new, many sessions focused on keeping the “controlled” in controlled vocabularies. Since nearly every session linked back to AI in one way or another, the basic tenet of control in a controlled vocabulary was emphasized as continually pertinent. Inconsistent language across one or more semantic models puts machine learning outcomes at risk. As machine learning finds new areas of application, we are also venturing into areas involving more risk, like mental health and medical advice, legal enforcement, and financial decisions. Consistent language in semantic models, and in the metadata applied to training content, is now more important than ever.

Among the use cases pertinent to this consistency is tagging documents at more granular levels, including inline tagging and tagging content chunks. Again, this is nothing new; it has been a practice in DITA for 20 years. However, consistently and accurately created training data that balances probabilistic large language models with deterministic knowledge can counter machine learning hallucinations and create more trustworthy AI agents.
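A toy sketch of chunk-level tagging might look like the following; the vocabulary, synonyms, and document are all invented, and production pipelines would use real NLP matching rather than substring checks.

```python
# Invented controlled vocabulary: preferred term -> entry terms/synonyms.
VOCABULARY = {
    "Espionage": ["spy", "secret agent", "espionage"],
    "Vehicles": ["car", "sports car", "vehicle"],
}

def tag_chunks(document: str) -> list[dict]:
    """Split on blank lines and attach matching preferred terms per chunk."""
    records = []
    for chunk in document.split("\n\n"):
        text = chunk.lower()
        tags = [
            term
            for term, synonyms in VOCABULARY.items()
            if any(s in text for s in synonyms)
        ]
        records.append({"chunk": chunk, "tags": tags})
    return records

doc = "Bond is a secret agent.\n\nHe drives a sports car."
for record in tag_chunks(doc):
    print(record["tags"])
```

Because each chunk carries its own tags, a retrieval step can pull back exactly the passages that match a query's concepts, which is what makes the granular tagging valuable as training or grounding data.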

“Bond”ing

And, although not a taxonomy or knowledge management theme, I did notice another commonality across at least two presentations: James Bond as an example. Perhaps it was the London venue that led presenters to use our favorite secret agent, but there he was, connected semantically to movies, sports cars, and identification numbers. I myself created a simple Bondtology for illustration purposes in past workshops and webinars. It is fitting that a profession built on deception should turn up at the intersection of semantic models, which establish deterministic truth, and artificial intelligence, which we are trying to keep from deceiving us.