Synaptica talks with the Archivist at the Defense Imagery Management Operations Center (DIMOC), based at Fort Meade, Maryland. DIMOC is part of Defense Media Activity (DMA), the Department of Defense’s direct line of communication for news and information to U.S. forces worldwide. Managing the DIMOC Controlled Vocabulary (CV) and training its users is one of the archivist’s main roles. The DIMOC CV is a reference tool designed to help users enter relevant, standardized keywords for the long-term archiving and accessibility of Department of Defense photos and videos.
Would you give me a quick overview of DMA and your role?
DIMOC Archivist: Here at DIMOC, we collect and organize all the physical and digital assets the U.S. military creates. Physical assets include negatives, prints, slides and video tapes. Digital assets include still, video and audio content. These assets are used throughout the Department for a variety of purposes, including training, operations and military exercises. DIMOC’s role is to ensure this visual record is preserved as a historical record of the Department of Defense.
The archival personnel process this visual information before transferring it to the National Archives for long-term preservation. There is a detailed process using taxonomy and thesaurus management for each asset to ensure the metadata, keywords, and categorization are all correct. We use Synaptica KMS to manage our controlled vocabulary and thesaurus as part of this process.
Currently, we are involved in a major digitization project, ensuring the taxonomy is accurate. This workflow involves making sure all the metadata and terms used are technically sound, including their spelling and formatting.
A brief introduction to the Defense Imagery Management Operations Center (DIMOC), its mission and capabilities. DIMOC is the operational arm of Defense Visual Information, a component of Defense Media Activity at Fort Meade, Maryland.
Tell us more about the National Archives and DMA.
We are aligned with the National Archives’ requirements for metadata and with government records standards. There is certain information the National Archives requires of all government records, and with the help of the controlled vocabulary we make sure we meet these requirements. The National Archives receives assets from a variety of sources. They don’t set the standards for government records, but they do have certain requirements.
Another project we are looking at is integrating two existing systems. We have an archive system and a production system, and we are about to merge them into one, moving from individual storage to a joint arrangement. Part of this work is making sure the metadata is consistent, because there is potential for duplication and inconsistency among the unique assets. Having a controlled vocabulary helps us resolve this across all our metadata fields.
Which Synaptica product are you using and how?
We use Synaptica KMS to maintain our standards and keywords. Each image has a caption, usually describing what is going on or where it was taken. We extract this information by reading the caption backwards to collect all the nouns first, then match them against our keywords. For example, if the caption contains the phrase ‘aircraft carrier’, the system looks for a positive match on the whole phrase. The word ‘carrier’ on its own would not be found or approved, but the phrase ‘aircraft carrier’ would be approved. KMS helps us validate the terms and apply the entity extraction and automation accurately.
This is a way of maintaining our vocabulary standards without relying on human data entry. We have additional fields that help with this process, such as geographic locations, and we use the U.S. military hierarchy and organization in fields such as combatant command. Using these standards helps us ensure consistent formatting, accurate spelling and capitalization, and fewer typos. It also offers a direct parallel to the organization of the U.S. military and its visual records.
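As an illustration only, the caption matching described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the KMS implementation: the three approved terms, the function name match_caption and the three-word phrase limit are all hypothetical, and the sketch skips noun detection entirely, simply scanning the caption from its last word to its first and preferring the longest approved phrase.

```python
import re

# Hypothetical approved terms for illustration; not the real DIMOC vocabulary.
APPROVED_TERMS = {"aircraft carrier", "flight deck", "sailor"}
MAX_PHRASE_WORDS = 3  # assumed upper bound on phrase length


def match_caption(caption, approved=APPROVED_TERMS, max_words=MAX_PHRASE_WORDS):
    """Scan a caption from its last word backwards and return approved phrases.

    At each position the longest candidate phrase is tried first, so
    'aircraft carrier' is matched as a whole phrase rather than word by word.
    """
    words = re.findall(r"[a-z0-9/-]+", caption.lower())
    matches = []
    end = len(words)
    while end > 0:
        for size in range(min(max_words, end), 0, -1):
            phrase = " ".join(words[end - size:end])
            if phrase in approved:
                matches.append(phrase)
                end -= size  # skip past the words just matched
                break
        else:
            end -= 1  # no approved phrase ends here; move one word back
    return matches


if __name__ == "__main__":
    print(match_caption("A sailor signals on the flight deck of an aircraft carrier."))
    print(match_caption("The carrier left port."))
```

With these sample terms, a caption that mentions only ‘carrier’ produces no match, while ‘aircraft carrier’ is returned as a single keyword, which mirrors the approval rule described in the interview.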
Do other people have access to KMS?
DIMOC receives submissions of both digital and physical assets. We also digitize a lot of our physical media, such as DVDs or 16mm film. The process varies for the different media types, but once they are transferred to the archive they are brought together. Today’s military photographers submit almost everything digitally. Once an asset is submitted to us, the metadata is checked, edited and enhanced accordingly. This curation process helps increase the accessibility of these visual records during search. This operational imagery or newsworthy content is transmitted into our operational system, which has public access, primarily for use among the military services as required for their strategic communication purposes. It takes 180 days for a new digital image or video to move into the DoD archive after entering the operational system.
We have a group of fifteen people who help with the curation process. They view the image or video, then complete the keywords using Synaptica KMS as a guide. Government personnel also monitor this stage.
This is when we can also add new terms to the system. Each month, we produce a report that helps validate any new emerging keywords. We can also add to the archive. This way, when an editor enters a term it will automatically generate as another search option.
Our system also uses predictive text: it recognizes text as it is typed and constantly learns from its users. We also get reports on searches within the system, which we use to align the terminology with the users’ language. This applies specifically to the use of related terms. For example, “fighter jet” is a common term in searches for fighter aircraft, so we connect the two terms within Synaptica and within our archival system to improve the user’s search results. We also hope that users notice the system changing their input of “jet” to “aircraft” and providing specific fighter aircraft results such as the F/A-18. This creates a learning environment and is a passive benefit of enforcing a vocabulary.
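The related-terms idea can be illustrated with a small lookup. This is a hedged sketch rather than how Synaptica KMS or the DIMOC archive actually stores relationships: the dictionaries USE_INSTEAD and NARROWER_TERMS and the function resolve_query are hypothetical names, and the only entries are the examples mentioned in the interview.

```python
# Hypothetical mapping from a searcher's wording to the preferred term.
USE_INSTEAD = {"fighter jet": "fighter aircraft"}

# Hypothetical narrower terms offered once the preferred term is known.
NARROWER_TERMS = {"fighter aircraft": ["F/A-18"]}


def normalize(term):
    """Lowercase the term and collapse repeated whitespace."""
    return " ".join(term.lower().split())


def resolve_query(query):
    """Return the preferred term for a query plus any narrower terms."""
    key = normalize(query)
    preferred = USE_INSTEAD.get(key, key)
    return preferred, NARROWER_TERMS.get(preferred, [])


if __name__ == "__main__":
    print(resolve_query("Fighter Jet"))
    print(resolve_query("tank"))
```

A query that is already a preferred term, or that has no entry at all, passes through unchanged, so the lookup only redirects the wordings it has been told about.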
What was your approach to selection? What were you looking for?
The archival personnel were involved in the selection of the taxonomy management software. As the lead technical advisors, we worked with two others who assisted with IP network and security considerations. We spent six months looking at options from a number of organizations to review their taxonomy standards and approach to classification, including the Library of Congress, the Smithsonian, the Metropolitan Museum of Art and the Getty Museums. Knowing the nomenclature was important, as was understanding how these organizations categorize different types of records and artifacts across different disciplines. We focused on the theory and figured out the best approach for our needs.
We wanted visualization provided with our vocabulary so that it was easy to understand for a common (non-military) user while offering integration capabilities across various systems.
A lot of your role involves training users. Can you tell us how you approach it?
We have a public-facing page as part of the process for uploading assets. Part of the archival role is presenting to and guiding users. The radial map (see image) shows the different terms and helps users understand the relationships between them. We avoid taxonomy language and use parent or child terms for users, which is more readily understandable to someone who is going to use our vocabulary infrequently. If we are working with a more technically minded audience, then we may introduce more advanced language. When you get to this level of depth and understand how content is organized in just one system, you need to be able to speak at different levels, because most people take for granted that a search will provide what they want. This is a good thing: users want what they want, and content providers have to adjust their search, discoverability and training/learning environments to meet these needs.
You often need to walk them through the process. We may use a vehicle as an example – is it an air, land or a water vehicle? Are we looking at an airplane or a helicopter, land construction vehicles, tanks, trucks, ships or boats? Then we could be referring to sub-surface versus surface vehicles. This method makes it relevant to the user. After that, we can introduce abstract keywords within KMS using a concept (recruiting, morale or welfare). A common challenge for us is to demonstrate how military personnel do their jobs. For example, communication is a conceptual term and a core function of much of the military training preparation for operations. Communication is a common theme within our imagery and is a good example of using keywords to convey concepts and to create collections of imagery around the same word. The DIMOC Controlled Vocabulary is available online to the public.
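The parent and child walk-through above can be sketched as a simple lookup table. This is only an illustration: the PARENT table is a hypothetical fragment assembled from the vehicle examples in the interview, not an export of the DIMOC Controlled Vocabulary, and the helper names are assumptions.

```python
# Hypothetical child -> parent fragment built from the vehicle example.
PARENT = {
    "air vehicle": "vehicle",
    "land vehicle": "vehicle",
    "water vehicle": "vehicle",
    "airplane": "air vehicle",
    "helicopter": "air vehicle",
    "construction vehicle": "land vehicle",
    "tank": "land vehicle",
    "truck": "land vehicle",
    "ship": "water vehicle",
    "boat": "water vehicle",
}


def path_to_root(term):
    """Return the term followed by each of its parents, up to the top term."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def children(term):
    """Return the direct child terms of a parent, in alphabetical order."""
    return sorted(child for child, parent in PARENT.items() if parent == term)


if __name__ == "__main__":
    print(path_to_root("helicopter"))
    print(children("vehicle"))
```

Walking from a specific term up to its top-level parent is the same question a trainer asks aloud: is it an air, land or water vehicle?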
What would you advise others who are approaching a similar project?
Manually draw up your organization, its terminology and the language it uses. Make sure your plan is visual, and think through all the possible relationships. KMS allows us the flexibility to take a different approach to these categories and relationships, and we did so through many trial-and-error scenarios. We had a whiteboard and a big piece of paper, and we mapped out a number of terms within the various organizational structures this way to work out what would work in the final vocabulary. Don’t look at just one product or one system if you can avoid it, because that can limit the potential of your vocabulary application.
What are the biggest challenges on the horizon for your industry?
It’s all about access to the information when you need it. There is no point having a file with no metadata, because you will never find it again. What’s the point if you can’t find it?
Our taxonomy, nomenclature and classification are all related to search functionality. Why are we doing this? Why do we care that a dog is part of the canine family? To take that example, there are reasons for it beyond scientific classification. It is not just about why you want to do a search, but what you expect to find in a search. There is information as data, and on top of that building block is information as wisdom. Taxonomies and vocabularies serve as the integration between those two blocks, leading to a smooth user experience, which is measured by the success of the access.
Synaptica Insights is our regular series of case studies sharing stories, news and learning from our customers, partners, influencers and colleagues.