KMWorld 2025 Themes and Trends
“But I’m a creep / I’m a weirdo / What the hell am I doin’ here? / I don’t belong here.” – Radiohead, Creep
I attended the KMWorld Conference and two of the four co-located sub-conferences, Taxonomy Boot Camp and Enterprise AI World, in Washington, D.C. last week. It was the 20th Anniversary of Taxonomy Boot Camp, and although I raised my hand when asked if I had attended 10 or more, I had never actually bothered to count how many years I’ve been to the conference. It turns out, I have attended KMWorld and the co-located conferences for 13 years out of the last 18, with my first being the 2008 Enterprise Search Summit West when it was still hosted on the West Coast in San Jose, California. Over this span of time, I’ve seen trends rise and fall, fears realized and alleviated, promises promised and promises kept.
Every year there seems to be at least one theme I can carry away from the conference, and for several years, I’ve written about the themes and trends from the conference as I see them. Some of these blogs are hosted on the Synaptica website and on my personal blog at Information Panopticon. I will go into some of those themes and trends as I did last month in my recap of the Henry Stewart Semantic Data Conference and HS DAM.
After many discussions at the event with colleagues in the industry, I can confidently state there was one overarching theme: things are weird.
Things Are Weird
What’s weird in 2025? Seemingly, everything. While the wider world may be as weird as it’s ever been, the state of knowledge management, taxonomy, and the related fields covered by the conference has been supremely weirded, mostly by AI. There has been a rapid shift from feelings of dread that AI is coming for all of our jobs to an embracing acceptance that those who use AI as a work tool will maintain their place, or even excel, in the industry.
Despite this acceptance, the industry is still weird. In the world of taxonomy, we all understand and believe that AI and its need for clean, curated, semantic data will necessitate good taxonomists and taxonomy systems. However, while the bottom hasn’t completely fallen out of the market, there seem to be fewer jobs, more unemployed taxonomists, and lower salary and contract rates. Maybe it’s the generalized uncertainty in the tech sector driving the slump, but many people I spoke with at the conference think it’s a misunderstanding of AI’s foundational needs that is causing companies to lay off taxonomists and invest more heavily in AI.
What else is weird is how excited everyone is about AI and yet how few companies have managed to successfully bake AI practices into their processes. We’ve seen incredibly successful use of AI in text summarization and generation, particularly for meeting minutes and identifying who said what and which action items were decided upon. Agentic AI has had some success, and taxonomists seem to be skilling up in prompt engineering, both using existing taxonomies and ontologies in prompts and generating new taxonomy options from generative AI prompts.
What was very weird for me: when you hear people in taxonomy, myself included, who for years said that the results of autogenerated taxonomies were mediocre at best now talking about auto-generating taxonomies, you know there’s been a seismic shift in the industry.
AI in Everything, Everywhere, All at Once
Like every other conference I’ve attended this year, this one focused on AI. Something that was different this year, in my opinion, was the number of practical applications and use cases that were presented. While AI still hasn’t found its stride when it comes to maximizing ROI across a variety of use cases, there were very informative sessions on the skills knowledge management and taxonomy specialists need to master in order to work in an AI world. I have seen this reflected in the job market. There are still taxonomy and ontology roles being posted, but many more of them now require technical skills like Python as part of the role. In essence, companies seem to be looking for people with both the business skills to build and manage semantic models and the technical skills to implement them in AI applications. While I know these people exist, I don’t know how common they are. People at the conference felt that there will always be a need for taxonomists who do not have deep technical skills, but acknowledged that getting skilled up in prompt engineering is a differentiator.
Several presentations in both the KMWorld and Taxonomy Boot Camp conferences focused on AI readiness. What does an organization need in order to identify appropriate AI use cases, guardrail the data within the organization so that only what is appropriate is available to AI tools, and determine how humans fit into the process? Since generative AI can extract and generate candidate metadata concepts, it’s important to understand the results in order to judge their trustworthiness and value. Some of these concepts are net new and should be added to taxonomies as concepts, properties, or relationships. Others are net new but ephemeral or trending; they can be used in search, for example, but aren’t yet established enough to be codified in semantic models. Other concepts will already be in the taxonomy and can be discarded, unless they are used to identify which content includes taxonomy values and should be tagged accordingly. Finally, some generated concepts are garbage or nonsense and should be fed back to inform and improve the machine learning model. All of these questions need to be answered, and processes decided upon and put into practice.
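To make that triage concrete, here is a minimal sketch of how a team might route AI-generated candidate concepts into the buckets described above. It assumes a toy in-memory taxonomy and hypothetical confidence and trend signals; none of this comes from any conference presentation, and real implementations would plug into an actual taxonomy management system and human review queue.

```python
# Minimal sketch of triaging AI-generated candidate concepts against an
# existing taxonomy. The taxonomy contents, thresholds, and the
# confidence/trend_score signals are hypothetical placeholders.

EXISTING_TAXONOMY = {"knowledge management", "taxonomy", "ontology", "metadata"}

def triage_candidate(concept: str, confidence: float, trend_score: float) -> str:
    """Route a generated concept to one of four handling buckets."""
    normalized = concept.strip().lower()

    if not normalized or confidence < 0.2:
        # Garbage or nonsense: feed back to improve the model, don't store it.
        return "model_feedback"
    if normalized in EXISTING_TAXONOMY:
        # Already modeled: discard, or use it as a tagging signal for the content.
        return "tag_content"
    if trend_score > 0.8:
        # Net new but ephemeral/trending: useful for search, not yet codified.
        return "search_only"
    # Net new and stable: candidate for addition as a concept, property,
    # or relationship, pending human review.
    return "taxonomy_candidate"

if __name__ == "__main__":
    print(triage_candidate("agentic ai", confidence=0.9, trend_score=0.95))  # search_only
    print(triage_candidate("ontology", confidence=0.9, trend_score=0.1))     # tag_content
```

The point of a sketch like this is less the thresholds than the fact that each bucket has an explicit, agreed-upon downstream process with a human in the loop.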
Another common theme was the use of taxonomies and ontologies to inform and refine prompts and to support Retrieval Augmented Generation (RAG), with or without labeled content as part of a knowledge graph. Taxonomies and ontologies supply the domain specificity required to augment large language models (LLMs), informing them of the organization’s particular perspective. In short, using internal semantic models is a shortcut to providing the book of knowledge about the organization. What was once a topic in only a few presentations seemed much more prevalent this year.
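As an illustration of that idea, here is a small sketch of grounding a RAG-style prompt with taxonomy context. The taxonomy structure, term names, and prompt wording are all invented for the example; in practice the vocabulary would come from a taxonomy or graph store and the prompt would be sent to whatever LLM client the organization uses.

```python
# Minimal sketch of using a taxonomy to ground a RAG-style prompt.
# The taxonomy entries and prompt template are hypothetical placeholders.

TAXONOMY = {
    "customer onboarding": {
        "broader": ["customer lifecycle"],
        "narrower": ["identity verification", "welcome sequence"],
        "related": ["KYC"],
    },
}

def taxonomy_context(term: str) -> str:
    """Render a concept's neighborhood as plain text for the prompt."""
    entry = TAXONOMY.get(term.lower(), {})
    lines = [f"Preferred term: {term}"]
    for rel, values in entry.items():
        lines.append(f"{rel}: {', '.join(values)}")
    return "\n".join(lines)

def build_prompt(question: str, retrieved_passages: list[str]) -> str:
    """Combine the user question, retrieved content, and taxonomy context."""
    context = "\n\n".join(retrieved_passages)
    terms = "\n".join(taxonomy_context(t) for t in TAXONOMY if t in question.lower())
    return (
        "Answer using only the provided context.\n\n"
        f"Organizational vocabulary:\n{terms}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    print(build_prompt(
        "How does customer onboarding work here?",
        ["Onboarding starts with identity verification..."],
    ))
```

Injecting the preferred terms and their relationships is one simple way to give the model the organizational perspective the presenters described, without retraining anything.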
As I mentioned earlier, organizations are now in the phase between conducting proofs of concept with AI and needing to productionize AI in organizational workflows and processes and demonstrate its ROI. If the AI bubble is stretching, continued low adoption and the inability to prove ROI will cause it to burst. The AI industry will only continue to grow and expand if there are tangible benefits to the bottom line of organizations that have adopted the practices and scaled them into production.
Taxonomy, Data Governance…and Where Is the Data?
At least two presentations focused on the intersection of semantics and data governance. Both stood out to me because I have experienced the intersection of semantic models and highly governed data in data lakes myself; there are frequent conversations about who owns which data and what the ultimate source of truth is. How do internally developed taxonomies and ontologies fit into the overall data governance model? Who owns which values in which systems? How are the practices similar, and how are they different?
In his keynote, Malcolm Hawker boiled the two disciplines down to a key aspect of each. In data governance, one of the main concerns is accuracy and precision for measurement. In taxonomy, the focus is on meaning, or semantics. Bridging the two disciplines means bringing measurement and meaning together. AI is a motivating factor in reconciling them, given its need for clean, structured data and its use in tagging unstructured content. Although the approaches of relational-database-focused data management and graph-based semantic data are different, they are complementary functions that ensure the best data quality in an organization.
Beyond who creates and owns the data and in which system it lives, organizations may be trending towards moving back to “walled gardens” of data to use in AI. The difference between public tools like ChatGPT and using LLMs within an organization puts the emphasis on what data is shared with the model and which users can see it. Since many AI tools use publicly available information on the Internet, organizations may protect their data and bring more of it in house so it is not available to such tools externally. Similarly, the concept of having a sovereign cloud, in which data is housed in the cloud but designed to meet regional data requirements and regulations, is probably gaining traction. Where there have been so many efforts to make data and content freely available, there is also recognition of which data should be available to machine learning models both to produce reasonable outcomes and to protect organizations and individuals. As states, provinces, countries, and regions pass additional laws about data, privacy, and AI, parsing out which data is subject to which regulations is becoming more difficult. Separating data in sovereign cloud environments may be a key to addressing this challenge.
In Summary
Things are weird, but the foundational value of semantics has not disappeared. The focus has changed and the toolsets we use to create and productionize semantic models have expanded, but the need for clean, accurate, source of truth metadata has only grown in importance.
