For the latest in our Insights series, Synaptica talks with Paula McCoy, Senior Manager, Content, at ProQuest LLC. Paula is based in Louisville, Kentucky, where she manages Editorial curation, including indexing and vocabulary strategy. She led the company’s implementation of auto-categorization, and her areas of expertise include taxonomies and controlled vocabularies, metadata and vocabulary mapping, and search and browse functionality. Paula is past chair of the Taxonomy Division of the Special Libraries Association; she is currently chair of the division’s Professional Development Committee. ProQuest is based in Ann Arbor, Michigan, with offices around the world. It is a global leader in the provision of specialized information resources and technologies.
Tell us more about your organization
McCoy: ProQuest is committed to empowering researchers and librarians around the world. Our information content and technologies increase the productivity of students, scholars, professionals and the libraries that serve them.
Through partnerships with content holders, we preserve rich, vast and varied information – whether historical archives or today’s scientific breakthroughs – and package it with digital technologies that enhance its discovery, sharing and management.
In our office, we receive thousands of new articles every day, which are available immediately for customers. To ensure searchability of this diverse content, we abstract and index articles according to established standards, using our own specialized controlled vocabularies which have been honed over many years.
Tell us about your role
McCoy: My title is Senior Manager of Content, overseeing Editorial operations for ProQuest’s academic platform. Our job is to manage curation of content as it comes in, and that means ensuring accuracy and quality in metadata and indexing. We’re talking high volumes of new articles every day; for example, thousands of newspaper articles from international sources.
These major volumes mean we need to prioritize our content workflow. We look at things like source and content type, discipline and database, and publication title. We consider what level of attention a document needs once it goes through the automated indexing process. Since we receive so much information from newspapers, we tend to review that at a broader level, focusing on the top stories and on entities like people, places, and organizations.
You deal with intense volumes of content; tell us more about how this is managed
McCoy: Our editorial team is a little over 50 Content Editors working in teams. I prefer to think of them as content curators. We’re organized by discipline: news, business, health and medical, science and technology, social sciences, arts and literature, and interdisciplinary, niche content. Our editors are subject matter experts: They know the topics that are relevant for their area, what to expect with indexing outcomes, and which publications they need to focus on. Many of us have worked here for more than twenty years. Added up, this means we know and understand our content intimately.
A necessity for our work is that we actively manage our vocabularies as topics, terminologies, and trends shift and change. I have two taxonomy editors who report directly to me, who constantly review our indexing and ensure the editors have the terms they need. You’ll find that we are all avid radio listeners and news junkies, so we by nature stay on top of issues.
When did you become involved with taxonomy?
McCoy: My role has changed over the years. I was a supervisor for our business database for almost 20 years, and at some point, I started to manage our controlled subject vocabulary. In 2003, this role was formalized when I became Taxonomy Manager.
For years, once we automated our editorial work, we were able to search and validate terms, but the terms themselves had to be entered manually in a UNIX system. When I started managing the vocabulary, I had to add new terms to a Word document and then print copies for the editors every year or so. We were also increasing the frequency of term additions, which meant we had to send memos out as new terms were added. It became clear we needed a vocabulary management solution.
I should add that ProQuest was also growing as an organization by this time. We had acquired other companies which had their own databases, vocabularies, and editorial systems, making indexing and vocabulary management more and more complex.
Tell us about which Synaptica product your organization uses
McCoy: Since 2004 we have been using Synaptica KMS to manage our vocabularies, linking it through APIs to our editorial system. When we add, change or delete terms, the updates are immediately available to the indexing team and updated within our auto-categorization software. We now have more than 50 users of Synaptica in several locations, a small number of them doing updates but many more using it to search for terms, to suggest new terms, and to provide feedback on auto-categorization results.
Our vocabularies include the main ProQuest thesaurus plus specialty thesauri for disciplines such as linguistics, life sciences, aquatics, and sociology. We also have company/organization, geographic, and people name vocabularies. Our company file alone totals over half a million names. The taxonomy editors add new terms several times a year and are constantly updating term relationships and hierarchies.
We use Synaptica for purposes other than just controlled vocabulary management. For example, I got the idea from Jim Sweeney at Synaptica several years ago to create concept schemes as a database of sorts. Now we have a concept scheme where editors can suggest new terms, giving us the rationale for each, and another one where they can comment on the usage of a term by the auto-categorization software. This has made Synaptica a key tool for editorial operations.
How did you hear about Synaptica?
McCoy: I started with a basic Internet search on thesaurus or taxonomy management software. I found three possible options and worked with our development and internal systems teams to prepare our requirements.
I knew what I wanted and needed in a tool. My managers and I worked with the technical team reviewing proposals and demos, then we systematically evaluated the pros and cons of each one. We chose Synaptica because it could manage our volume of terms and vocabularies, plus its Oracle database fit well with our internal systems.
Did you have any concerns before starting the project?
McCoy: I didn’t have any major concerns, but I knew I would have a lot of clean-up work to do. We had to provide Synaptica with our vocabularies in a variety of formats and with different levels of accuracy. Synaptica did an excellent job of providing reports once they imported our files, which made the clean-up work pretty easy. We had to use our old vocabulary system in addition to KMS in the beginning, as we weren’t fully into our new editorial tool. It wasn’t a worry, just something we had to consider as part of the implementation.
Our other practical challenge was that, as we acquired other companies using different systems and vocabularies, we needed more and more to get everyone onto the same system. So I began using Synaptica in more creative ways, and as my needs expanded, my expectations grew. You start to see possibilities. It’s been great to talk to Jim and the team and say, “have you thought of this?” In the first few years of Synaptica, we sent many suggestions to Jim that were incorporated into KMS.
The level of service and the regular annual upgrades are beneficial. Being able to send suggestions of what you need is appreciated. This has been so good for us over the years, especially as we’ve expanded the way we use the tool.
Is there a specific feature that stands out to you?
McCoy: When we reviewed the options for ProQuest’s thesaurus management system, one of my main requirements was the ability to obtain reports quickly. With some of the systems we looked at, this option either didn’t exist or was not at the level we expected or needed. For us, it’s a priority. We have also noticed that reports have improved with each update. This allows my team, and any user, to get a report either for themselves or for other people in the company, in many different formats. We also use the Category feature, which has been immensely valuable. I can obtain a list of terms covering a certain topic in our thesaurus and send it to someone within 15 minutes. This quick outcome helps us do our work and support people throughout the company.
What advice would you share with someone starting a similar taxonomy project?
McCoy: Know what your requirements are going in. Think about what you need. Technology is so complicated, and the service offerings are complex. Taxonomy seems to be everywhere. You must make sure the solution applies to your business situation. There is a danger you could end up being pushed into using a software product that is designed for another use, and that is not an ideal situation.
Be rigorous in ensuring that potential vendors know what you do. Vendors should understand where you are coming from and what your business needs are. They must understand your enterprise and how your users are going to use the tool.
What do you think are the major challenges for the future?
McCoy: Some of the biggest challenges in our industry are in the area of search and search interfaces. Precise indexing at the document level has become more critical. Plus, just because something has been indexed doesn’t mean it can be found in a search. Knowing which pieces of metadata are searchable and making sure the end user knows how to find that metadata and use it to hone their search is vital. You can end up with thousands of results after a search—that’s too much information.
Our customers would prefer twenty quality results that are exactly what they were searching for. How do you help them find the content that is useful to them? We need to keep on top of this, to consistently push quality documents to researchers so they don’t have to spend their valuable time trying to find the right information.
Synaptica Insights is our regular series of case studies sharing stories and learnings from our customers, partners, influencers, and colleagues.