Taxonomies and the Fall of the House of Escher

“I know not how it was–but, with the first glimpse of the building, a sense of insufferable gloom pervaded my spirit.” – Edgar Allan Poe, The Fall of the House of Usher
I consider constructing well-formed semantic models akin to constructing a foundationally sound, well-architected, and visually appealing building. Not to be hyperbolic and melodramatic, because [swoon] that just isn’t me, but has any taxonomist looked upon the works of others and despaired? Upon getting that “first glimpse of the building” that is the organizational semantic structure, have you suddenly felt “a sense of insufferable gloom” pervading the spirit? Boy howdy, have I.
To be fair, there are a number of factors at work leading to semantic debt: compromises in semantic integrity that violate best practices in taxonomy construction and are left for some future taxonomist to unwind. These factors can include strong cultural pushback within the organization against accepting taxonomies in structure or content, internal politics, or designing so that consuming systems can ingest the data. In any case, violations of taxonomy best practices can compound over time, leaving the semantic models in their current state not particularly semantic at all.
While it is mission-critical to gather input from business stakeholders to build, implement, and maintain taxonomies, it is also critical to allow the respective subject matter experts in taxonomy and business to do their work according to the best practices of their domains. Like a building drafted by M.C. Escher and constructed by Edgar Allan Poe & Associates, taxonomies can become circuitous, recurring, and not very meaningful if they stray too far from best practices.
Escherian Design
I have frequently been involved in taxonomy design projects in which the stakeholder input into the semantic structures follows a line of thinking mirroring the work the business users do. In fairness, taxonomies should support whatever use cases assist end users in performing their jobs. However, business stakeholders are not necessarily taxonomists and so their recommendations may not follow taxonomy design principles. Here are some taxonomy design suggestions I have seen.
Taxonomies as virtual end caps. In this scenario, product owners try to mirror their product placement in the physical world as taxonomy structures in the virtual world. So you may get suggestions to build taxonomies like this:
Lumber > Deck building materials > Nails
Roof building materials > Shingles > Nails > Roofing nails
In essence, the concepts representing the objects in the physical world are placed in the same locations in taxonomies as they would be in the layout of the store. The terms become conceptual end caps, quick items to throw in your cart because they are related to the products you are purchasing. In this case, I need nails for specific reasons, like building a deck or putting on a roof. For convenience, I put an end cap display of nails in the lumber department or by the stacks of shingles so buyers don’t need to hit every department to complete a project.
Taxonomies as navigational structures. While taxonomies can absolutely be used as navigational structures on the front end, the proposal here is that taxonomies exist this way in the back end taxonomy management system. Taxonomies may then be built like this:
Apparel > Men’s > Basketball shoes
Apparel > Women’s > Basketball shoes
From an access perspective, these are easy-to-understand navigational pathways. They lead directly to a set of products I can then filter by size, color, or brand to see what’s available, and they make it easy to make a purchase.
Taxonomies as processes, stages, or funnels. Here, taxonomies are built to follow process steps or stages, or to capture marketing user journeys through the funnel, so structures can look like this:
Awareness > Consideration > Conversion > Loyalty
Planning > Design > Prototype > Design for manufacturing > Manufacturing > Post-manufacturing
In this case, the sequential steps or stages are nested as a hierarchy as if to illustrate the progression through the process as a ladder or directional move through the concepts.
These are just a few of the examples I’ve experienced when working with stakeholders in the taxonomy design process. What’s wrong with giving the end users what they want by designing enterprise taxonomies to adhere to some of these patterns?
Maurits “Context” Escher
“If everything means everything, then nothing means anything.” Like Rick from Rick and Morty, I’m trying to build a following around a catchphrase. I made this same point in my blog Polyhierarchy and the Dissolution of Meaning. Repeating the same concept in multiple locations, whether trying to mirror real-world endcaps or to capture new contextual meanings from hierarchical placement, is a big taxonomic no-no and for good reason. If concepts become contextually dependent, then the individual subjects and objects within a semantic model lose their crisp focus. The point of taxonomies is to disambiguate concepts and ensure that each item is clearly defined in meaning and scope. Of course there are concepts that really can exist in more than one location in a polyhierarchical structure, but these occurrences should be minimal and should not force different contextual meanings onto the concept.
Using the above examples, “nails” violate the “is a…” principle in that they are not semantic children of their parents. They are necessary items to complete a deck or a roof, but they are not decks or roofs themselves. We can easily build separate, mutually exclusive taxonomy schemes or branches and connect them with semantic relationships to include all of the items necessary to build a deck or put a roof in place. Nesting them in contextual proximity does not follow taxonomy best practices and, ultimately, causes ingestion confusion when the concepts are stripped from context. More practically, repeated concepts will likely break a consuming application when the system finds the same label (and, if built properly in a taxonomy management system, the same URI) showing up in two different locations. The repeated concepts are often ignored on ingestion because the system cannot resolve the entities.
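As a rough sketch of that alternative, nails can live once in a fasteners branch and be linked to project concepts by a named relationship rather than by nesting. The concept IDs, the two branches, and the "usedFor" relationship name are all hypothetical choices of mine for illustration, not the vocabulary of any particular taxonomy tool.

```python
# Hypothetical illustration: the hierarchy stays strictly "is a", and project
# context is expressed as a named associative relationship instead of nesting.

# child ID -> parent ID ("is a" hierarchy); None marks a top concept
BROADER = {
    "fasteners": None,
    "nails": "fasteners",
    "roofing-nails": "nails",
    "projects": None,
    "deck-building": "projects",
    "roofing": "projects",
}

# (subject, relationship, object) triples; "usedFor" is an assumed name
RELATIONS = [
    ("nails", "usedFor", "deck-building"),
    ("roofing-nails", "usedFor", "roofing"),
]


def ancestors(concept_id):
    """Walk the "is a" hierarchy upward and return every broader concept."""
    result = []
    parent = BROADER[concept_id]
    while parent is not None:
        result.append(parent)
        parent = BROADER[parent]
    return result


def items_for(project_id):
    """Return the concepts linked to a project by the usedFor relationship."""
    return sorted(s for s, rel, o in RELATIONS if rel == "usedFor" and o == project_id)


print(ancestors("roofing-nails"))
print(items_for("roofing"))
```

Each concept appears exactly once, so a consuming system never sees the same ID in two places, yet a "what do I need for a roof?" question can still be answered by following the relationship.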
Building Codes and Accessibility
If you’ve seen any architectural drawings by Escher, you probably know that not only would they be very difficult to build in the real world even by Edgar Allan Poe & Associates, they would never pass local and state building codes for accessibility. Look at all those stairs! Not a ramp or elevator in sight!
Providing accessibility to products (or content) using navigational taxonomies is an excellent way to assist users in getting to what they are looking for. While there are more searchers than navigators in the world, simple drilldowns to products or content in hierarchies, in conjunction with filters, are still useful as an additional means of locating products or information. Navigational taxonomies rely on their contextual construction to provide signposts for users to know exactly where they are in the product structure and in the potentially very large “store” they are trying to navigate. Pretty self-explanatory name for these types of taxonomies.
Navigational taxonomies can be built directly in front-end applications to serve retail and information finding use cases. If possible, the values can come from back end taxonomy management systems to ensure consistent concepts and messaging across the organization. In these cases, the front end system may consume values from across the taxonomy schemes and hierarchies and display them in a different contextual hierarchy or as filtered values in left-hand navigations. It may also be possible that the taxonomy management system allows for the construction of semantic master schemes which can be reassembled in the tool or through the API into navigational hierarchies. Using our example above, the taxonomies behind the scenes may look like this:
Products > Apparel > Footwear > Basketball shoes
People > Demographics > Men’s
In this case, only the values needed to construct a navigational taxonomy are pulled from their respective schemes and reassembled. The advantage to this methodology is that one best, preferred concept label and its unique ID are used in all locations. Any tagging to product images, copy, web pages, or concepts used in navigational structures or filters can be used for a variety of analytics including clicks on navigational nodes or filters, clicks on product images, analysis of products added to carts, etcetera, without having to reconcile the same or similar values for analysis.
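A minimal sketch of that reassembly might look like the following. The concept IDs, the navigation definition, and both function names are hypothetical; the point is only that the navigation references IDs from the master schemes instead of copying labels into new locations.

```python
# Hypothetical master schemes: each concept has one ID and one preferred label.
CONCEPTS = {
    "prod-001": "Basketball shoes",
    "demo-001": "Men's",
    "demo-002": "Women's",
}

# A navigation definition only references IDs; it never duplicates a concept.
NAVIGATION = [
    ("demo-001", "prod-001"),
    ("demo-002", "prod-001"),
]


def build_nav_paths(navigation, concepts, root="Apparel"):
    """Resolve each tuple of concept IDs into a display path for a front end."""
    return [" > ".join([root] + [concepts[cid] for cid in node]) for node in navigation]


def count_clicks(click_log):
    """Aggregate clicks per concept ID, regardless of which path was displayed."""
    counts = {}
    for concept_id in click_log:
        counts[concept_id] = counts.get(concept_id, 0) + 1
    return counts


for path in build_nav_paths(NAVIGATION, CONCEPTS):
    print(path)
```

Because clicks are logged against the single ID for Basketball shoes, analytics need no reconciliation step across the men's and women's paths.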
Temporal Ladders
Taxonomy structures typically follow a parent-child “is a” structure in which the children are instances of their parent concepts. It is also possible to construct whole-part relationships (called meronymy in linguistics) in which the children are a part of the parent concept.
While it is possible to model temporal or sequential events in taxonomies and ontologies, it typically requires advanced skills in ontology modeling, can be challenging to implement, and can be subject to change when trying to mirror processes. Processes are not only sequential, but can change frequently as well. Changing a foundational semantic structure to keep pace with changes in marketing funnels or manufacturing processes may not be worth the effort if the steps can be created as taxonomy concepts independent of any hierarchical structure.
That all said, using relationships to define sequence rather than hierarchical structure can be one simple way to create a semantic sense of order. For example, using a relationship like has predecessor could link books, films, or process steps in order to model sequence.
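Here is a small sketch of the idea using the marketing funnel from earlier. The dictionary layout and the function name are my own assumptions; the steps are flat concepts, and the order lives entirely in the has predecessor statements.

```python
# Hypothetical "has predecessor" statements: step -> the step that precedes it.
HAS_PREDECESSOR = {
    "Consideration": "Awareness",
    "Conversion": "Consideration",
    "Loyalty": "Conversion",
}


def in_sequence(has_predecessor):
    """Order the steps by following predecessor links from the first step."""
    steps = set(has_predecessor) | set(has_predecessor.values())
    # The first step is the only one with no predecessor of its own.
    first = [s for s in steps if s not in has_predecessor]
    if len(first) != 1:
        raise ValueError("expected exactly one starting step")
    # Invert the links so we can walk forward from the start.
    successor = {pred: step for step, pred in has_predecessor.items()}
    ordered = [first[0]]
    while ordered[-1] in successor:
        ordered.append(successor[ordered[-1]])
    return ordered


print(" > ".join(in_sequence(HAS_PREDECESSOR)))
```

If the funnel changes, only the relationship statements change; no concept has to be moved to a new parent.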
It’s all a Question of Time
As context graphs gain momentum, at least in understanding if not yet in implementation, we will likely see more ways to bridge the gap between taxonomy modeling and temporal processes. In the meantime, adhering to foundational taxonomy best practices is the best bet to ensure that your semantic models are ready for the next evolution: capturing temporal events to provide additional context to the graph.
In short, maintaining “is a” or whole-part taxonomy structures as base semantic models while developing more complex ontological designs and connected data as part of a context graph will potentially provide a good combination to avoid Escherian design practices and Gothic horror in your semantic structures.
Polyhierarchy and the Dissolution of Meaning
“Everything is everything/What is meant to be, will be.” – Lauryn Hill
Polyhierarchy
Polyhierarchy is “a controlled vocabulary structure in which some terms belong to more than one hierarchy. For example, rose might be a narrower term under both flowers and perennials in a horticulture vocabulary” (ANSI/NISO Z39.19-2005 (R2010), Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies).
While the ANSI/NISO Z39.19-2005 (R2010) standard is still my go-to for foundational taxonomy principles and may provide validation for using concepts in more than one location, I try to avoid polyhierarchy as much as possible. I see it as a construct necessary only in rare situations and because many systems are unable to consume taxonomical concepts in any other way than their actual location in a hierarchy. Specifically, I don’t like polyhierarchy that is 1) abused out of necessity to suit use cases consuming systems cannot otherwise meet, or 2) used to solve many, differing use cases. To me, polyhierarchy is the enemy of specificity; it is the forward slash of the taxonomy world…the imprecision and indecision of the either/or.
There is a conflict between the construction of one or more taxonomies for semantic accuracy and how those taxonomies are displayed because of the inability to transform and restructure taxonomies to meet different, real-world use cases. If the use case demands a concept be more than one thing in more than one place, it must be put in all of those locations in the originating taxonomies to suit navigational needs.
My former colleague and contemporary taxonomy practitioner, Bob Kasenchak, wrote in his blog post “On Polyhierarchy”, “The most common misuse of polyhierarchy is overuse: the tendency to give terms multiple parents without sufficient reason.” I agree. This statement gets to my main objection with polyhierarchy in that when it is overused, semantic precision is diluted. When everything is everything, nothing is anything.
Polyhierarchy in Navigational and Information Access Taxonomies
People have different ways of searching for information and, in an online world in which a user can start in any number of locations and expect to get to the information they want, polyhierarchical taxonomies facilitate navigating to information through multiple pathways.
A common and familiar use case for polyhierarchy is in navigational taxonomies used in online retail. Consumers may require multiple entry points in product hierarchies to find what they are looking for. Using a search engine to get to a product display page in the first place is a common scenario in findability, while searching directly on the retailer’s website is often a consumer’s next choice. However, once on a website, users may use navigational structures and filters to get to specific products. Even if the navigational browse taxonomy is displayed as a flat list rather than a hierarchy, having multiple points of entry is going to lead consumers to the product they are seeking.
For example, one might expect to find Basketball shoes under Men, Women, Unisex, AND Kids. One may also expect to find Basketball shoes under Sports > Basketball. Given the current trends in athleisure apparel, one might also expect to locate Basketball shoes under Casual or Lifestyle. These divergences in meaning account for both a consumer’s individual browsing paths and competing notions of what Basketball shoes are worn to do. For a consumer, Basketball shoes may be just as easily in one category as another without any conflicting meanings.
Supporting this use case in one or more back end systems powering a front end experience may demand a concept be placed in more than one location in a taxonomy management system because the downstream system(s) can only consume concepts exactly as they appear in a hierarchy. In this scenario, you are forced to set up taxonomies that look like the following:
Kids’ shoes
    Basketball shoes
Men’s shoes
    Basketball shoes
Unisex shoes
    Basketball shoes
Women’s shoes
    Basketball shoes
Sports
    Basketball
        Basketball shoes
In the Basketball shoes example, the concept isn’t inherently a member of all the locations it is listed, but is listed in all locations as a way to facilitate user access to products through navigation. Even in this oversimplified taxonomy model, the repetition of the concept is becoming unwieldy.
Sometimes products really are two different things which can’t, or shouldn’t, be reconciled. The Z39.19 standard provides the example that a piano is both a percussion and a stringed instrument. Therefore, on a website which sells many kinds of musical instruments, listing pianos under both seems sensible. Similarly, for a retailer selling toasters, ovens, and toaster ovens, we might expect to see Toaster ovens listed under concepts like Ovens and Countertop appliances.
The same principle applies when accessing informational content. A country can be a part of a continent and also a part of a designated geographical region spanning more than one continent. For example, Denmark is a part of both Europe and EMEA (Europe, Middle East, and Africa). In a hierarchy, the construction may look like this:
Continents
    Europe
        Denmark
Geographical Regions
    EMEA
        Denmark
These use cases illustrate a need for polyhierarchy even in cases in which the back end systems may not support the need well.
Polyhierarchy in Semantic Taxonomies
Taxonomies which adhere to more stringent guidelines, which I will term semantic taxonomies, are those which follow taxonomy construction and maintenance standards in an attempt to arrive at more regular, logical structures to reduce or eliminate ambiguity. Building logical, semantic taxonomies has several long-term advantages.
First, adhering to the simple principle of placing a concept in its single best location mitigates problems with system interoperability. In some cases, downstream systems consuming from a taxonomy management system can only recognize a single instance of a concept, most likely because they cannot reconcile repeated labels made up of exactly the same string of characters. Another potential issue is that consuming systems won’t allow a concept with the same GUID, whatever its label, to exist in more than one location. In well-structured semantic models, any polyhierarchical concept should have only one GUID or URI and should not be a unique instance with exactly the same label but a different identifier in each location. In this situation, the system receives the above example taxonomy hierarchy Kids’ shoes > Basketball shoes first on import and ignores each subsequent instance as it reconciles matching label strings.
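To make that failure mode concrete, here is a minimal sketch of a consumer that keys concepts by label string. The importer, its first-one-wins rule, and the row layout are assumptions for illustration only; real consuming systems will differ in the details.

```python
# Hypothetical export rows from a taxonomy management system: (parent, label).
ROWS = [
    ("Kids' shoes", "Basketball shoes"),
    ("Men's shoes", "Basketball shoes"),
    ("Unisex shoes", "Basketball shoes"),
    ("Women's shoes", "Basketball shoes"),
]


def naive_import(rows):
    """Keep the first path seen for each label and ignore later duplicates."""
    accepted = {}  # label -> parent under which the label was first seen
    ignored = []   # rows dropped because the label was already known
    for parent, label in rows:
        if label in accepted:
            ignored.append((parent, label))
        else:
            accepted[label] = parent
    return accepted, ignored


accepted, ignored = naive_import(ROWS)
print(accepted)
print(len(ignored), "rows ignored")
```

In this sketch, three of the four navigational placements silently disappear downstream, which is exactly the kind of loss that is hard to spot from inside the taxonomy tool.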
Second, maintaining models requiring many polyhierarchical concepts becomes more difficult as more instances, and more semantically different domains, are covered by the taxonomies. Using the same form for a concept label with a single URI or GUID for multiple purposes can eventually cause a maintenance breakdown in which the concept loses semantic precision and scope and appears in locations with different logical underpinnings, especially using relationships with unique semantic meanings.
Finally, building semantic taxonomies supports the root purpose of taxonomic structures and ontologies: to define concepts so they are unambiguous. My taxonomy 101 go-to is the “is a…” principle. As a fundamental premise, I reject that a concept in most cases cannot be placed in one, single best location expressing its intrinsic meaning. Is a toaster an appliance? Yes. Is an oven an appliance? Yes. Based on this, it’s easy enough to put toasters and ovens in their place.
Polyhierarchy also has acceptable use in semantic taxonomies. A concept can truly be a member of two categories which are overlapping or mutually exclusive. Our Denmark example above is a case in which a concept is a member of two categories. A homograph, like Mercury, is an example of a concept which has several, mutually exclusive, meanings.
However, in both cases, there are modeling choices that avoid polyhierarchy, though they depend on having the right functionality available. If the taxonomy tool supports associative relationships and consuming systems can use both hierarchical and associative relationships, the modeling may include a semantically named relationship in place of a standard hierarchical relationship. The associative relationship “is part of geographical region” can be used to create a specific semantic relationship to the concept EMEA, allowing Denmark to be a child of Europe but not of EMEA.
Continents
    Europe
        Denmark is part of geographical region EMEA
Geographical Regions
    EMEA
In the Mercury example, the Z39.19 standard suggests the use of parenthetical qualifiers so the concept appears in mutually exclusive domains which may very well all appear in one thesaurus:
Planets
    Mercury (planet)
Metals
    Mercury (metal)
Space vehicles
    Mercury (space vehicle)
One of the challenges, especially in retail taxonomy concepts, is that concepts are rarely a single term. Returning to our Toasters and Ovens example, the concept Toaster oven is intrinsically two concepts, not one, because we have introduced a pattern of stacking nouns (toaster + oven) to create a new, compound concept. Even more frequently, adjectives modify nouns so that a label includes more than one independent, atomic concept. For the concept Men’s basketball shoes, the pattern is gender + sport + product. Sticking with our notion of a semantic taxonomy, the three separate concepts can easily belong to three, mutually exclusive schemes covering Gender, Sports, and Products. When the new concept is created, it’s easy to see how concepts find polyhierarchical locations in different schemes to support navigation.
What a thing is versus what it is used for can also be problematic and demands a shift in thinking or, rather, an exact definition of the modeling approach used across a set of taxonomies to maintain consistent semantic principles. Again, I stick with what a thing is. My favorite example is James Bond’s exploding pen from GoldenEye. Is the pen a writing utensil? Yes. Is the pen a weapon? Well…in this case it is. In the narrow perspective of spycraft, perhaps a pen is a weapon, but it is not inherently a weapon. In the Bond universe, a pen could very well appear in a taxonomy of weapons, but, as above, there are concept form and modeling choices which would alleviate the confusion. Rather than Pen, would it not then be entered as Exploding pen? Similarly, Bond has used a Rocket pen and a Poison pen. Once we modify these concepts, they then can find themselves in one best place in a taxonomy of weapons.
Why consider alternate modeling practices to avoid polyhierarchy if the standards and tool functionality allow it? In addition to the reasons noted in this section, there is also the need to plan for unknown domain expansions in an attempt to future-proof taxonomies for additional, currently unknown use cases.
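One way to picture the gender + sport + product pattern is to tag each product with one atomic concept per scheme and assemble the compound label only for display. The product records, field names, and helper functions below are hypothetical, a sketch of the approach and not a prescribed data model.

```python
# Hypothetical product records tagged with one atomic concept per scheme.
PRODUCTS = [
    {"sku": "A1", "gender": "Men's", "sport": "Basketball", "product": "Shoes"},
    {"sku": "B2", "gender": "Women's", "sport": "Basketball", "product": "Shoes"},
    {"sku": "C3", "gender": "Men's", "sport": "Running", "product": "Shoes"},
]


def compound_label(item):
    """Assemble a display label following the gender + sport + product pattern."""
    return f"{item['gender']} {item['sport'].lower()} {item['product'].lower()}"


def filter_by(products, **facets):
    """Return SKUs whose tags match every requested scheme value."""
    return [p["sku"] for p in products if all(p[k] == v for k, v in facets.items())]


print(compound_label(PRODUCTS[0]))
print(filter_by(PRODUCTS, sport="Basketball"))
print(filter_by(PRODUCTS, gender="Men's", sport="Basketball"))
```

The compound concept never has to be stored under several parents; any navigational entry point is just a different combination of filters over the three schemes.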
Polyhierarchy across a Graph
A fundamental problem in modeling taxonomies is trying to serve two masters by including both semantic structures following logical rules and the useful, though typically less semantically precise, structures required for navigation. By trying to model for both purposes, there are inevitable conflicts which cause compromises in structure and meaning.
Different types of polyhierarchical instances living in the same domain attempting to address conflicting use cases cause the hierarchical taxonomies and the ontologies which provide logical modeling practices for the overall graph to experience semantic drift. While the human mind can understand seeing Dog food as a narrower term for both Pet food and Dogs, a system can only accept the strings it is given.
Using inconsistent modeling practices, like using different types of hierarchical or associative relationships for the same concept, causes concepts to drift from tightly bound semantic meaning, structural context, and scope. As the meaning expands to address more use cases, the precision wanes. As I said earlier, when everything is everything, nothing is anything. In other words, concept meanings become less precise and eventually concepts shift to mean what they are, what they are used for, where they are located in a navigational taxonomy virtual folder structure, who owns the concept, and on and on. The meaning erodes.
So what? We can see the concept in context and figure out what the meaning is, right? So why bother being so tightly bound to the concept meaning? A good use case example is using taxonomies to build machine learning models. The imprecision of having Basketball shoes under multiple parents to provide specific paths for gender navigation, while also having the concept nested under sports, requires that the model be trained to understand that a basketball shoe is not a sport but is used for the sport of basketball. The more connections a concept has to other concepts through hierarchical and associative relationships, the more imprecise it becomes across the graph. While hierarchical structures are useful, graphs are even more so, providing the logical underpinnings for machine learning models, knowledge graphs, recommendation systems, semantic search, etc. Precise meaning becomes more important with each use case.
Polyhierarchy isn’t necessarily to be forbidden in semantic structures, but I propose using it sparingly, when a concept has truly more than one meaning, and for semantic structures which can then be transformed to provide concepts in any hierarchical structure for consuming systems and navigational use.
