KMWorld 2025 Themes and Trends
“But I’m a creep / I’m a weirdo / What the hell am I doin’ here? / I don’t belong here.” – Radiohead, Creep
I attended the KMWorld Conference and two of the four co-located sub-conferences, Taxonomy Boot Camp and Enterprise AI World, in Washington, D.C. last week. It was the 20th Anniversary of Taxonomy Boot Camp, and although I raised my hand when asked if I had attended 10 or more, I had never actually bothered to count how many years I’ve been to the conference. It turns out, I have attended KMWorld and the co-located conferences for 13 years out of the last 18, with my first being the 2008 Enterprise Search Summit West when it was still hosted on the West Coast in San Jose, California. Over this span of time, I’ve seen trends rise and fall, fears realized and alleviated, promises promised and promises kept.
Every year there seems to be at least one theme I can carry away from the conference, and for several years I’ve written about the themes and trends as I see them. Some of these posts are hosted on the Synaptica website and others on my personal blog, Information Panopticon. Below, I’ll go into this year’s themes and trends, as I did last month in my recap of the Henry Stewart Semantic Data Conference and HS DAM.
After many discussions at the event with colleagues in the industry, I can confidently state there was one overarching theme: things are weird.
Things Are Weird
What’s weird in 2025? Seemingly, everything. While the wider world may be as weird as it’s ever been, the state of knowledge management, taxonomy, and the related fields covered by the conference has been supremely weirded, mostly by AI. There has been a rapid shift from feelings of dread that AI is coming for all of our jobs to an embracing acceptance that those who use AI as a work tool will maintain their place, or even excel, in the industry.
Despite this acceptance, the industry is still weird. In the world of taxonomy, we all understand and believe that AI and its need for clean, curated, semantic data will necessitate good taxonomists and taxonomy systems. However, while the bottom hasn’t completely fallen out of the market, there seem to be fewer jobs, more unemployed taxonomists, and lower salary and contract rates. Maybe it’s the generalized uncertainty in the tech sector driving the slump, but many people I spoke with at the conference think that it’s a misunderstanding of AI foundational needs that’s causing companies to lay off taxonomists and invest in AI more heavily.
What else is weird is how excited everyone is about AI but how few companies have managed to successfully bake AI practices into their processes. We’ve seen the incredibly successful use of AI in text summarization and generation, particularly when it comes to meeting minutes and the identification of who said what and what action items were decided upon. Agentic AI has had some success, and taxonomists seem to be working at skilling up in prompt engineering using already existing taxonomies and ontologies and generating new taxonomy options from generative AI prompts.
What was very weird for me: when you hear people in taxonomy (myself included) who for years said that the results of auto-generated taxonomies were mediocre at best now talking seriously about auto-generation, you know there’s been a seismic shift in the industry.
AI in Everything, Everywhere, All at Once
As at every other conference I’ve attended this year, the focus was on AI. What was different this year, in my opinion, was the number of practical applications and use cases presented. While AI still hasn’t found its stride when it comes to maximizing ROI across a variety of use cases, there were very informative sessions on what skills knowledge management and taxonomy specialists need to master in order to work in an AI world. I have seen this reflected in the job market. There are still taxonomy and ontology roles being posted, but many more now require technical skills like Python as part of the role. In essence, companies seem to be looking for people with both the business skills to build and manage semantic models and the technical skills to implement them in AI applications. While I know these people exist, I don’t know how common they are. People at the conference felt that there will always be a need for taxonomists who do not have deep technical skills, but acknowledged that getting skilled up in prompt engineering is a differentiator.
Several presentations in both the KMWorld and Taxonomy Boot Camp conferences focused on AI readiness. What does an organization need in order to identify appropriate AI use cases, guardrail the data within the organization so only what is appropriate is available to AI tools, and determine how humans fit into the process? Since generative AI can extract and generate candidate metadata concepts, it’s important to understand the results in order to interpret their trustworthiness and value. Some of these concepts are net new and should be added to taxonomies as concepts, properties, or relationships. Others are net new but ephemeral or trending; they can be used in search, for example, but aren’t yet established enough to be codified in semantic models. Others will already be in the taxonomy and can be discarded, unless they are used to signal which content contains taxonomy values and should be tagged accordingly. Finally, some generated concepts are garbage or nonsense and should be fed back to inform and improve the machine learning model. All of these questions need to be answered, and processes decided upon and put into practice.
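To make that triage concrete, here is a minimal sketch in Python of routing generated candidate concepts into the four buckets described above. The taxonomy contents, trending list, confidence threshold, and triage function are all hypothetical illustrations, not any particular product’s API.

```python
# A minimal sketch of triaging AI-generated candidate concepts against an
# existing taxonomy. All names and thresholds here are assumptions.

EXISTING_TAXONOMY = {"machine learning", "metadata", "data governance"}
TRENDING_TERMS = {"agentic ai"}   # ephemeral terms tracked outside the taxonomy
MIN_CONFIDENCE = 0.5              # below this, treat the concept as model noise

def triage(candidate: str, confidence: float) -> str:
    """Route a generated concept to one of four outcomes."""
    term = candidate.strip().lower()
    if confidence < MIN_CONFIDENCE:
        return "feedback"   # garbage: log it to improve the model
    if term in EXISTING_TAXONOMY:
        return "tag"        # already modeled: use it to tag the content
    if term in TRENDING_TERMS:
        return "search"     # ephemeral: boost search, don't codify yet
    return "review"         # net new: queue for taxonomist review

for cand, conf in [("Metadata", 0.9), ("Agentic AI", 0.8), ("xyzzy", 0.2)]:
    print(cand, "->", triage(cand, conf))
```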
Another common theme was the use of taxonomies and ontologies to inform and refine prompts and to support Retrieval Augmented Generation (RAG), with or without labeled content as part of a knowledge graph. Taxonomies and ontologies supply the domain specificity required to augment large language models (LLMs), informing them of how the organization sees its domain. In short, using internal semantic models is a shortcut to providing the book of knowledge about the organization. What was once a topic in only a few presentations seemed to be much more prevalent this year.
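As a minimal sketch of this pattern, assume an internal taxonomy and a prompt template like the one below; a production RAG pipeline would retrieve concepts from a graph database and pass the assembled prompt to an actual LLM API, both of which are stubbed out here. The taxonomy entry and prompt format are illustrative assumptions.

```python
# A hedged sketch of grounding an LLM prompt in internal semantics.
# The taxonomy entry below is hypothetical example data.

TAXONOMY = {
    "churn": {
        "prefLabel": "Customer Churn",
        "broader": "Customer Retention",
        "related": ["Subscription Cancellation", "Customer Lifetime Value"],
    },
}

def build_prompt(question: str, concept_key: str) -> str:
    """Prepend organizational semantics so the model uses our terminology."""
    c = TAXONOMY[concept_key]
    context = (
        f"In our organization, '{c['prefLabel']}' is a narrower kind of "
        f"'{c['broader']}' and is related to: {', '.join(c['related'])}."
    )
    return f"{context}\n\nAnswer using these definitions:\n{question}"

print(build_prompt("Why did churn rise last quarter?", "churn"))
```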
As I mentioned earlier, organizations are now in the phase between conducting proofs of concept and the need to productionize AI in organizational workflows and processes and prove its ROI. If the AI bubble is stretching, continued low adoption and an inability to prove ROI will cause the bubble to burst. The AI industry will only continue to grow and expand if there are tangible benefits increasing the bottom line of organizations that have adopted the practices and scaled them into production.
Taxonomy, Data Governance…and Where Is the Data?
At least two presentations focused on the intersection of semantics and data governance. Both stood out to me because I have experienced this intersection myself: where semantic models meet highly governed data in data lakes, there are frequent conversations about who owns which data and what the ultimate source of truth is. How do internally developed taxonomies and ontologies fit into the overall data governance model? Who owns which values in which systems? How are the practices similar and different?
In his keynote, Malcolm Hawker distilled a key aspect of each discipline. In data governance, one of the main concerns is accuracy and precision for measurement. In taxonomy, the focus is on meaning, or semantics. Bridging the two disciplines means bringing together measurement and meaning. AI is a motivating factor in reconciling the two due to its need for clean, structured data and its use in tagging unstructured content. Although the approaches of relational-database-focused data management and graph-based semantic data are different, they are complementary functions ensuring the best data quality in an organization.
Beyond who creates and owns the data and in which system it lives, organizations may be trending back toward “walled gardens” of data to use in AI. The difference between public tools like ChatGPT and using LLMs within an organization puts the emphasis on what data is shared with the model and which users can see it. Since many AI tools use publicly available information on the Internet, organizations may protect their data and bring more of it in-house so it is not available to such tools externally. Similarly, the concept of a sovereign cloud, in which data is housed in the cloud but designed to meet regional data requirements and regulations, appears to be gaining traction. While there have been many efforts to make data and content freely available, there is also growing recognition of which data should be available to machine learning models, both to produce reasonable outcomes and to protect organizations and individuals. As states, provinces, countries, and regions pass additional laws about data, privacy, and AI, parsing out which data is subject to which regulations is becoming more difficult. Separating data in sovereign cloud environments may be a key to addressing this challenge.
In Summary
Things are weird, but the foundational value of semantics has not disappeared. The focus has changed and the toolsets we use to create and productionize semantic models have expanded, but the need for clean, accurate, source of truth metadata has only grown in importance.
Semantic Data 2025 Themes and Trends

“Trust in me, just in me / Shut your eyes and trust in me” – The Jungle Book
I attended the Henry Stewart Semantic Data Conference, co-located with HS DAM, in New York City a few weeks ago. As I’ve done with KMWorld in the past, I’m going to summarize some themes and trends I took away from both conferences, with an emphasis on Semantic Data.
Inevitable AI
The most common theme of both conferences, unsurprisingly, was artificial intelligence (AI) in all of its forms, applications, and impacts. Broadly speaking, the key takeaway across all of the presentations and discussions was: this is happening. Whether it’s baked into digital asset management (DAM) systems (hint: it is), thrown wildly at use cases until something sticks, or carefully governed, with guardrails to protect the organization, its people, and the people they serve, and measured to understand the effectiveness of different large language models (LLMs), AI is happening. So what do we, as digital asset and semantic data professionals, do about it? What is our role in the use of AI in the organization and in the public sphere? What are our responsibilities?
From the Semantic Data Conference, several themes emerged:
- Organizations are going to experiment with generative AI models to develop workable pipelines with humans in the loop;
- Context is key, and organizations can develop domain-specific and constrained semantic models to be used in conjunction with external LLMs;
- It’s incumbent upon all of us to develop valid, organizationally-specific and curated training data sets to provide machine learning models the context to output reasonable results.
Themes from the Digital Asset Management Conference included:
- AI can speed up the generation of assets and the automated application of metadata to those assets;
- Access to clean, curated metadata is critical, both from taxonomies and sources like data lakes;
- Metadata as a source of truth for embedded AI can lead to better analytics;
- Asset provenance is essential for usage and rights management, especially when AI is involved.
Metadata Is Critical
That’s it. That’s the story. Metadata is critical. It has been, and it will continue to be. But, maybe, organizations are more aware of the importance of metadata because of the lightning-fast rise of AI. Metadata is critically important as applied to digital assets, and semantic metadata powers better asset connections, discovery, personalization, and analytics.
Core to the importance of metadata is the importance of trust. Metadata quality must be trusted. The data and content to which metadata is applied must be trusted. Quality, trusted data leads to quality, trusted content and training sets which can feed into AI pipelines. Similarly, legal and reputational risks can be mitigated by ensuring the quality of information and data, especially as applied as compliance and usage rights.
Since semantic models are a source of truth for quality metadata, taxonomies and ontologies naturally accrete complexity over time as they grow to support more use cases. Complexity sounds like a negative, but the world is complex, and semantic models are meant to represent organizational domains, which are by necessity complex. Complex semantic models support a variety of use cases, even if they take more conscientious planning, development, and governance, and within them are fit-for-purpose structures addressing specific needs.
As with AI processes, developing, managing, and governing metadata in all its forms involves humans in the loop. Even as the identification, extraction, and application of metadata improves with AI, humans need to be involved in the process to add, remove, and quality check automatically applied metadata. As pipeline processes improve, reaching a specified threshold of metadata accuracy may reduce the need for human intervention and review.
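As a minimal sketch of what such a threshold might look like in practice, assume a tagger that emits (tag, confidence) pairs; the 0.9 cutoff below is an arbitrary illustration, not a recommended production setting.

```python
# A hedged sketch of threshold-based review routing for auto-applied metadata.

AUTO_ACCEPT_THRESHOLD = 0.9  # above this, skip human review (assumed value)

def route_tags(auto_tags: list[tuple[str, float]]) -> dict[str, list[str]]:
    """Split model-applied tags into auto-accepted and human-review queues."""
    accepted, review = [], []
    for tag, confidence in auto_tags:
        (accepted if confidence >= AUTO_ACCEPT_THRESHOLD else review).append(tag)
    return {"accepted": accepted, "needs_review": review}

print(route_tags([("budget report", 0.97), ("fiscal policy", 0.62)]))
```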
Context and Trust
If I had to boil the conference down to two keywords (or, maybe, if I could only apply two metadata tags to the conference) they would be context and trust. Data and content require context, and semantic models are one way to provide it, whether for machine learning pipelines or for direct human interaction with content.
The AI Bot Wars

“A robot may not injure a human being or, through inaction, allow a human being to come to harm.” – Isaac Asimov, I, Robot
Someday in the distant future, automated artificial intelligence bots will wage misinformation, disinformation, fake news, and propaganda (University of Montana) campaigns directly against each other as a form of information and psychological warfare aimed at civilian populations. These campaigns will serve to erode trust, sow confusion, and create chaos within an enemy’s society. Hot wars, waged by humans or by drones and robots, will only be necessary as mop-up operations to consolidate power and assert authority. These wars will let people’s own interpretations and imaginations weaponize messaging against their fellow citizens until they destroy themselves from the inside. A fictionalized account of this type of hybrid warfare mixing misinformation campaigns, cyberattacks on infrastructure, and conventional military was the plot of the recent movie Leave the World Behind.
Bot wars are not the fiction of the distant future, however. They are here today and they are improving just as rapidly as the quality of artificial intelligence. Long gone are the days of blurry photos of Nessie and shaky video of Bigfoot. Misinformation created by generative AI was a key component in the Iran-Israel conflict of 2024-2025 (EDMO) and has been central to Russia’s online propaganda campaigns (NATO).
Today’s generated images and videos are hyperrealistic and can only be determined to be fake by 1) knowing the context or content to be untrue, or 2) having access to metadata which has not been tampered with. How do we combat this onslaught of misinformation? What role do semantic professionals, including taxonomists and ontologists, have in the war for truth?
Evolution of Bot Wars
Today’s artificial intelligence wars are mostly fought by people generating content. Easy access to cheaper, faster, and better artificial intelligence tools allows any user to generate new images and text rapidly with little to no skill in video or content editing necessary. Already existing content creation and social media sharing platforms have expedited and expanded the range and audience for user-generated content, real or not. Most of these platforms can’t keep up with content review and provide no mechanism for viewing the content source, including the metadata which may reveal whether the content is real or generated using AI tools. The democratization of content generation tools has meant an explosion of content (hence the term “content creators” as, seemingly, a professional occupational title). These tools have been praised for their ability to allow users to document, in real time, true events unfolding around them. These same tools allow users to document, in real time, unreal events manufactured by them with the same ease as documenting reality. Science fiction will just be fiction, the only science involved being the technical tools used to create the fiction.
I believe the next step in the misinformation wars will be an advancement in bot-on-bot directed counter-misinformation campaigns. In fact, these wars may already be happening, given the number of fabricated online personas generating content in response to comments which, in turn, may also be the product of fake online personas. Whenever one bot posts generated content, another bot will respond, countering and confusing the messaging. There may be truth in some of the counter-messaging, posting real content in direct response to fictional content. But, really, why bother with the truth at all? One bot can simply respond with equally outrageous content rebutting or retaliating against the first. Since artificial intelligence can generate content so quickly, why not take it a step further and do what any good marketer would do: segment and personalize content to audiences based on their previous social interactions, including posts, likes, and network relationships? Not only can misinformation be generated quickly, it can be tailored to segmented audiences to trigger the most resonant and visceral reactions: fear, rage, mistrust, joy. Eventually, without any direct human intervention at all, people’s confidence in truth erodes and their already held beliefs and biases are reinforced. We already talk about echo chambers; the next echo chambers will be bots talking to bots, with segmented human audiences receiving exactly the messaging they would like to hear. Even as I say “will”, these trends are emerging on social media platforms today.
Recursion
I think “recursion”, “a computer programming technique involving the use of a procedure, subroutine, function, or algorithm that calls itself one or more times until a specified condition is met at which time the rest of each repetition is processed from the last one called to the first” (Merriam-Webster) is a great way to describe the more general content feedback loop we currently, and will increasingly, find ourselves in.
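For readers who don’t write code, here is a minimal illustration of the technique in Python: the function keeps calling itself until a base condition is met, at which point the stacked calls unwind from the last one back to the first.

```python
def countdown(n: int) -> None:
    """A function that calls itself until a specified condition is met."""
    if n == 0:            # the specified condition that stops the recursion
        print("done")
        return
    print(n)
    countdown(n - 1)      # the function calls itself
    # anything placed here runs as the repetitions unwind, last to first

countdown(3)  # prints 3, 2, 1, done
```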
The recycling of original sources into various new content forms is happening increasingly in media as authoritative, unbiased news sources are replaced by opinionated, subjective, and polarized “news” platforms. Algorithms on popular social media platforms are weighted toward content with more interactions, positive or negative, and this content drowns out everything else. The number of memes and video clips I see repeated (or, rather, regurgitated) in my social media feeds gives a false impression that only a narrow range of topics is being covered. The breadth is shaved off at the long tails, and only the highest middle of the bell curve is spit out into our feeds. Of course, these feeds are shaped by the content with which we interact, creating an echo chamber of reinforced, narrowly focused subject areas. Even as the overall amount of content expands exponentially, our exposure is limited to what we already think…or, rather, believe. Because belief is replacing authoritative fact. Our friends and feeds reinforce the notion that unpleasant or dissonant facts are a matter of belief rather than any measurable objective truth.
The recursive, or regurgitative, nature of our content sources is going to have long-term effects on the bot wars. As AI bots create more and more content, they will seek out public sources of information and, eventually, feed their own previously created content into the self-guided learning models. Endless loops of self-referencing, recursive, regurgitated, manufactured information will act as the source of truth for new information; an endless entanglement of un-cited, untraceable, unverifiable information. As the bots play out their battles, the information will become so convoluted and unprovable that the only thing left will be belief. Even without the bot wars, we are finding ourselves here today. Belief over science or fact, individual belief over public sentiment, personal fictions over established facts.
The Battle for Semantics
From the early days of my career working on an academic thesaurus to the present, the overwhelming mission of establishing “Truth” when so many concepts are only contextually true has haunted me. Fundamental, existential questions of being are, of course, at the heart of semantic modeling. Ontology is “the philosophical study of being” (Wikipedia), after all. As we watch truth and untruth blend into a bizarre miasma of half-truth in real time, I wonder if other people in the semantic field feel the way I do. I have seen the frustration of scientists dismissed as fraudsters somehow tricking the public into believing humans landed on the moon, vaccines can prevent disease, and fluoride is good for your teeth. Are taxonomy and ontology practitioners feeling the same level of dispirited frustration as they face the daunting task of asserting truth in a postmodern, truthless world? Will the AI bots win?
In the spirit of never giving up in the face of seemingly insurmountable odds, I offer up the following calls to action for semantic professionals which will, at least partially, address the coming AI bot wars:
- Lobby for increased use of semantic practices and technologies (taxonomies, ontologies, graph databases) in your organization. The use cases for semantics are real and can be clearly defined. The real work comes in convincing the C-suite that a rather insignificant financial investment in graph databases and taxonomy and ontology management software can indeed provide a large ROI.
- Taxonomists and ontologists need to engage directly with subject matter experts to ensure that semantic models accurately reflect the domain(s) they cover. Ongoing data ownership, quality assurance, and SME relationships should be an integrated part of the semantic model governance process.
- Similarly, semantic experts need to seek out and be involved with AI and machine learning activities in the organization. As foundational source-of-truth data for machine learning training sets, ensuring semantic models are accurate and are appropriately used in AI projects will help these projects be more successful with less risk to the organization.
- Target the most sensitive use cases. Semantic truth is the most convincing in areas in which the organization experiences risk. Find legal use cases tied to public content or product statements. Understand what risks threaten the company and which practical use cases semantic models can address.
- Design transparency into semantic models, including read-only access to taxonomies and ontologies in a variety of visualizations, so end users can understand and utilize them better. A significant part of any taxonomist’s job is helping users understand what taxonomies and ontologies deliver. Allowing end users to explore for themselves is a part of this work.
- Fight for the same transparency in content UIs in which metadata can be viewed by end users to understand the origin of the content, including whether it was generated by AI.
- If politically inclined, lobby for AI regulation and policies at the national and international level. Establishing regulations guiding the use, and particularly the transparency, of AI for all users will help to ensure that there are consistent best practices in how we implement and interact with AI and its generated content. In 2024, the European Union passed the AI Act, and more national governments and international organizations should follow suit.
- Because AI is a new technology, end users need to understand how it works, at least at a fundamental level. There need to be more programs aimed at providing media literacy for the general public so people can learn how to identify and distinguish truth from untruth, especially when it comes to AI-generated content.
- In support of media literacy and metadata transparency, publicly available AI-generated media detection tools need to be more common and easily usable by a general audience. These tools should have the ability to flag and identify misinformation for others.
The fight for truth will be partisan, political, frustrating, and even violent. We live in a postmodern world, but the death of truth will benefit those who create the most convincing and appealing misinformation the fastest. Counteracting these misinformation campaigns may very well be the last bastion in the defense of democracy.
KMWorld Themes and Trends

“At a certain level of machine-ness, of immersion in virtual machinery, there is no more distinction between man and machine.” – Jean Baudrillard, Violence of the Virtual and Integral Reality
I attended the KMWorld and co-located conferences held in Washington, D.C., November 18th – 21st. Across the conferences and sessions that I attended and the conversations between sessions, there were several themes which resonated. I’ll detail some of those themes and highlight any talks in particular I thought inspired or captured those trends. I couldn’t attend everything, so please forgive me if I’ve overlooked great talks, of which there were many.
Taxonomies as an Enterprise Strategy
The importance of taxonomies and ontologies as a foundational business and technical program was particularly evident at Taxonomy Boot Camp, but the theme also came up in talks in the other conferences as well. I presented on the topic in my session, “Stand Still Like the Hummingbird: Enterprise Taxonomy Strategy When Nothing Stands Still”. Likewise, Thomas Stilling in his Monday keynote, “Be the Change: Your Taxonomy Expertise Can Help Drive Organizational Transformation”, emphasized metadata as an organizational strategic asset and discussed how to position taxonomy to align with strategic objectives. Similarly, Lindsay Pettai’s session, “Empowering Your Enterprise With a Dynamic Taxonomy Program”, discussed the importance of having an enterprise taxonomy program.
These are just a few examples. Throughout the conferences, the notion that taxonomies and ontologies provide structured, source of truth values for navigation, search, and content tagging was prevalent. The main theme, however, was how taxonomies and ontologies are critical to the success of machine learning (ML) model training and programs. In talk after talk on artificial intelligence (AI) and ML, speakers covered what needs to be in place to make projects viable and successful; taxonomies and ontologies were among those foundational requirements.
As a 16-year veteran of KMWorld associated conferences–my first being Enterprise Search Summit West in 2008–I have seen the more recent embrace and understanding of enterprise taxonomies and ontologies beyond simple use cases like navigation. Even just a few years ago, taxonomies seemed to be misunderstood or undervalued by many conference attendees. Now, talk of their use as enablers of AI was ubiquitous.
Artificial Intelligence Is Here to Stay
I don’t believe I saw a talk which didn’t include the keywords “artificial intelligence”, “machine learning”, “AI”, or “ML”. I was unable to attend KMWorld last year, but in years past, any conversation on the role of AI/ML was frequently met with scoffs, eyerolls, or fear. In KMWorld’s Thursday keynote, “KM, Experts & AI: Learning From KM Leaders”, Kim Glover and Cindy Hubert captured something important when they said that AI is driving a lot of emotions, among them excitement, skepticism, and fear. Addressing these emotions is going to be essential to adopting the technologies.
The overarching theme is that AI/ML is here to stay, so what are you going to do about it? Continue to reject it and fall behind the curve? Embrace it without question? Embracing ML models, including LLMs, with guidance and guardrails seemed to be the message most were conveying. While many organizations are still in the proof of concept (PoC) project stage, adoption seems to have already arrived or is pending, so get ready to embrace the change to be successful.
The exponential growth in ML has been fueled by cheaper storage, faster technologies, more robust and accessible models such as ChatGPT, and built-in technologies like Microsoft Copilot. Not only are these tools more accessible, they are more accurate than they have ever been in the past…for the right use cases. And this was another big takeaway: AI/ML is here to stay, but use these technologies for the right use cases. Document summarization and generative AI text generation are big winners while tools used for critical decision making are still improving and must be approached with caution.
Human-in-the-Loop, Cyborg, or Sigh…Borg?
Overwhelmingly across sessions, the consensus was on the importance of the human in the use of AI/ML technologies. For both the input and the output, to avoid garbage in and garbage out, human beings need to curate and review ML training sets and the information that an ML model outputs. As noted above, document summarization may be low-risk depending on the industry, but decision-making in areas like law and medicine is high-risk and requires humans-in-the-loop.
In his session, “Evolving KM: AI, Agile, & Answers”, the inestimable Dave Snowden discussed storytelling (or, rather, talked about storytelling through the art of storytelling) as an intimately human activity. He noted the deeply contextual nature of human knowledge and the need for knowledge management to generate knowledge rather than store it as codified data. If knowledge cannot be fully codified as data, it is difficult or impossible to transfer that level of human knowing into machines and their models. His weaving of knowledge and deep metaphor was evident in my three takeaways from his session: his notion of a tri-opticon integrating knowledge between three areas of an organization and its connection to the Welsh Triads; forming teams or roles of threes, fives, or sevens, Welsh rugby, and the number of players on a rugby sevens team; and assemblages, mycorrhizae, and rhizomatic networks in Deleuze and Guattari’s “A Thousand Plateaus”. These kinds of connections are weak use cases for machine learning and emphasize the importance of human knowledge and inclusion in AI technologies.
And, speaking of human knowledge, knowledge retention and transfer were of course key topics in an era when an aging workforce and work-from-home job mobility are creating more job turnover than ever. While human knowledge is difficult to capture, using AI agents to capture knowledge and ML models to parse textual information will help with the sheer scale and increasing pace necessary to retain and transfer knowledge.
The human-in-the-loop, the cyborg, and, sigh…Borg, are all going to rely on knowledge and ethics in order to create a human-machine interactive paradigm. If humans don’t use our own ethical boundaries to curate machine content, then bad data, and bad ethics, will spread throughout our information systems.
While there was some fear and loathing in D.C., there was a stronger current of hope, optimism, and curiosity in the many ways we can use taxonomies and ontologies, AI/ML, and human knowledge management together to guide us toward a brighter future of technology use.
