In my previous post on this topic I discussed how using ontological structures allows more precise modeling of relationships between concepts in a controlled vocabulary (knowledge organization system).
Another defining feature of knowledge graphs is the ability to connect concepts to external resources using Linked Data. Rather than discuss the technical features of RDF and Linked Data, it seems more useful to examine some use cases.
Let’s say we have a large vocabulary describing concepts (and their relationships) in some knowledge domain; the vocabulary is used to tag documents for retrieval (a very common use of subject metadata). For example, consider a large collection of articles about science tagged with relevant subject metadata allowing a user to find all content related to a topic:
(This could include relevance weighting or other information; for now let’s just assume it describes articles about a topic.)
This is essentially a self-contained structure; the vocabulary (and therefore its topics) is related to the content in our repository, and only to that content: we have a graph of documents related to our topic.[1] As an example, we can refine our diagram to be more specific:
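In code, this self-contained structure could be sketched roughly as follows. This is only an illustration: the article identifiers and the tags other than "Optics" are made up, and a real repository would keep this in a database or search index rather than a dictionary.

```python
from collections import defaultdict

# Hypothetical article identifiers and subject tags, for illustration only.
article_tags = {
    "article-001": ["Optics", "Lasers"],
    "article-002": ["Optics"],
    "article-003": ["Thermodynamics"],
}


def build_topic_index(tags_by_article):
    """Invert article -> topics into topic -> sorted list of article ids."""
    index = defaultdict(list)
    for article_id, topics in tags_by_article.items():
        for topic in topics:
            index[topic].append(article_id)
    return {topic: sorted(ids) for topic, ids in index.items()}


topic_index = build_topic_index(article_tags)

# Everything we know about "Optics" is the set of documents tagged with it.
print(topic_index["Optics"])
```

Note that nothing in this index points outside the repository; that is the limitation the rest of the post addresses.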
Now let’s say we want to add some information to our structure: a definition for each topic. Perhaps we want to offer a mouse-over definition for terms, either in the content or in a vocabulary-based search/discovery portal, or something similar. One obvious way to do this is to look up (or write from scratch) a definition for each term and include it in our vocabulary as a field; this approach is prohibitively time-consuming for vocabularies that include thousands of terms. Besides, this work has already been done and exists in shareable data formats in repositories we can leverage, so why do it again?
Using a URI, we can include in our vocabulary structure a reference to an external data source about each topic[2], then extract the definitions that already exist there and add them to our structure:
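One possible way to record such a reference is sketched below. The `Concept` class and the `https://example.org/vocab/optics` URI are hypothetical, and the choice of `skos:exactMatch` as the linking property is my assumption; a vocabulary could just as well use a looser mapping property.

```python
from dataclasses import dataclass, field

# SKOS mapping property, assumed here as the way to "equate" two concepts.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


@dataclass
class Concept:
    """A vocabulary term plus links to equivalent external resources."""
    uri: str
    pref_label: str
    exact_matches: list = field(default_factory=list)

    def to_ntriples(self):
        """Serialize each external link as one N-Triples statement."""
        return [
            f"<{self.uri}> <{SKOS_EXACT_MATCH}> <{target}> ."
            for target in self.exact_matches
        ]


optics = Concept(
    uri="https://example.org/vocab/optics",  # hypothetical local URI
    pref_label="Optics",
    exact_matches=["http://dbpedia.org/resource/Optics"],
)

for statement in optics.to_ntriples():
    print(statement)
```

The point of the sketch is only that the link is a single extra statement per term, which is far cheaper to maintain than a hand-written definition.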
Since we know that DBpedia pages include a property called “dbo:abstract” (the part of the DBpedia ontology that holds a short definition of a topic), by equating our topic “Optics” with the URI for the same concept in DBpedia we can query this information and add it to our vocabulary, making it available as hover-text or for some other use we imagine in our repository about science. This repository is now more than just a store of content: it includes information related to that content.
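As a rough sketch of what that query might look like, the snippet below builds a SPARQL query for `dbo:abstract` and pulls the text out of a response in the standard SPARQL JSON results format. The helper names are my own, the sample response is hand-written placeholder data rather than real DBpedia output, and `fetch_abstract` is defined but not called because it needs network access to DBpedia's public endpoint.

```python
import json
import urllib.parse
import urllib.request

DBPEDIA_SPARQL = "https://dbpedia.org/sparql"  # DBpedia's public endpoint


def build_abstract_query(resource_uri, lang="en"):
    """Build a SPARQL query for the abstract of one resource in one language."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <{resource_uri}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )


def extract_abstract(results):
    """Return the first abstract in a SPARQL JSON result, or None if empty."""
    bindings = results.get("results", {}).get("bindings", [])
    return bindings[0]["abstract"]["value"] if bindings else None


def fetch_abstract(resource_uri, lang="en"):
    """Query the endpoint over HTTP. Not called here: requires network access."""
    params = urllib.parse.urlencode({"query": build_abstract_query(resource_uri, lang)})
    request = urllib.request.Request(
        f"{DBPEDIA_SPARQL}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_abstract(json.load(response))


# Hand-written placeholder response, shaped like SPARQL JSON results.
sample_response = {
    "head": {"vars": ["abstract"]},
    "results": {
        "bindings": [
            {
                "abstract": {
                    "type": "literal",
                    "xml:lang": "en",
                    "value": "Placeholder definition text.",
                }
            }
        ]
    },
}

print(build_abstract_query("http://dbpedia.org/resource/Optics"))
print(extract_abstract(sample_response))
```

In practice the returned text would be stored as a definition field on the "Optics" term, so the hover-text does not depend on a live query each time.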
Perusing the DBpedia page about Optics, we see that there’s actually a host of other useful information about this topic available, including:
- Links to societies that publish on this topic
- Journals in this discipline
- Related subjects (in the DBpedia ontology)
- Near matches to subjects in other vocabularies
- Well-known members of the field
- Related images on Wikimedia
…and other information that we can use, extract, or link to, further enhancing our repository as a source of information and not just of content.
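To explore what else is attached to a topic, one option is to ask for every property and value of the resource and group the answers by property. The sketch below assumes a response in the SPARQL JSON results format; the function names are mine and the sample values are invented `example.org` placeholders, not actual DBpedia data.

```python
from collections import defaultdict


def build_properties_query(resource_uri, limit=500):
    """Ask for every property/value pair attached to one resource."""
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }} LIMIT {limit}"


def group_by_predicate(results):
    """Group the values in a SPARQL JSON result by the property that holds them."""
    grouped = defaultdict(list)
    for row in results["results"]["bindings"]:
        grouped[row["p"]["value"]].append(row["o"]["value"])
    return dict(grouped)


# Invented placeholder response, for illustration only.
sample_response = {
    "head": {"vars": ["p", "o"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://www.w3.org/2002/07/owl#sameAs"},
                "o": {"type": "uri", "value": "http://example.org/other-vocab/optics"},
            },
            {
                "p": {"type": "uri", "value": "http://purl.org/dc/terms/subject"},
                "o": {"type": "uri", "value": "http://example.org/category/physics"},
            },
            {
                "p": {"type": "uri", "value": "http://purl.org/dc/terms/subject"},
                "o": {"type": "uri", "value": "http://example.org/category/light"},
            },
        ]
    },
}

for predicate, values in group_by_predicate(sample_response).items():
    print(predicate, len(values))
```

Scanning the grouped properties is a quick way to decide which ones are worth pulling into the vocabulary and which are better left as outbound links.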
If we like, we can include suggestions for additional sources of information beyond our specific repository for users looking for more resources; by leveraging and connecting to external resources via Linked Data, we can enrich our vocabulary (and imaginary website) and transform our content repository into a clearinghouse for information.
Bob Kasenchak, Senior Manager Client Solutions
Follow Bob on Twitter @TaxoBob