Voice of the Customer

In a previous blog, Text Analytics Use Cases – The Unknowns, I wrote about the more general use cases in which text analytics can be used to provide value to an organization. In this blog, I will dive more deeply into the use of text analytics in uncovering the voice of the customer (VoC).

The VoC is the collection and analysis of customer wants and needs, preferences, and expectations. Frequently used in market research, the VoC is highly valued by marketing departments. However, what the customer is really saying can be very difficult to assess in a crowded and noisy online space.

Know Your Goal

Before defining the tools and processes needed to conduct text analytics as part of VoC analysis, it is essential to define and state the goal. There is a common misperception that analyzing text will provide magical insights or answers to common challenges. The reality is, without clearly defined goals for analyzing customer information, text analytics efforts will likely fail.

The overarching goal for text analysis of customer information is to gain insights. A general mandate to find insights, however, allows too much room for aimless and haphazard exploration of vast quantities of data. Clearly defined questions will dictate the requirements for what tools and processes are needed in the program. For example,

  • What features in the new product release do customers like or dislike?
  • How is call center customer service experience impacting overall sales?
  • What insights can be gathered from free text input in customer surveys?

Not only do the questions and goals need to be defined, but a clear process and course of action based on the results need to be in place. For example, if analysis identifies a particular product feature that is problematic for users, a clear line of communication to the product development group should be established and include the data, a metric linking the data to the organization’s key performance indicators (KPIs), and possible suggestions to resolve the issue.

Analyzing Customer Input

What kind of input is valuable for voice of the customer analysis? The most often overlooked input is unstructured text, such as product reviews, social media posts and blogs, free text notes taken in customer call center interactions and included in surveys, and the recordings or transcriptions of in-person interviews.

Structured data such as point-of-sale records, survey questions with predetermined choices, and product ratings are important for voice of the customer analysis, and the nature of the data makes them a little easier to collate and analyze. Organizations may have problems with data consistency, access, accuracy, and velocity, but these problems can generally be solved with improved tools and processes.

Unstructured content, however, presents different challenges. Because of these challenges, unstructured content is frequently viewed as second-class or less valuable. It’s tempting to avoid difficult analysis tasks, but the possible rewards that may be realized by a well-established program can far outweigh the pain of initiating it.

Know the Data

Knowing the data is essential to accurate and meaningful insights. Knowing the source of the data, its age (or lag between data production and analysis), the accuracy, any transformations performed on the data prior to analysis, the scope of the information, and how the data is defined prior to analysis are all important for actionable outcomes. Structured data fields need to have consistent values to produce valid results.

Having clean data is one of the most important factors, as time spent cleansing data delays insights and can tie up valuable (and expensive!) resources like data scientists. Text isn’t usually what you’d call “clean” by nature, but correcting or accounting for misspellings, text length, irregular characters and emojis, as well as other inconsistencies, can help with the analysis.

Making sure that the data being used is appropriate to the end goal is essential. While it may seem reasonable to analyze as much data as possible to get an insight, too broad a set of data can result in unclear or even unsound results. In addition, there’s the very practical problem of volume, as there may be thousands of product reviews and analysis can be time-sensitive. In most marketing departments, the employees know exactly which data will provide the insight they are looking for but are hampered by the difficulties of analysis.

One way to define the data is to use an enterprise taxonomy of products, marketing campaigns, customer features, or other relevant information to organize the content. If your organization is just starting to collect and organize marketing content, it may be necessary to analyze it first to build the taxonomy. The exercise of building a taxonomy will help to identify what content is important to produce desired insights, how that content should be categorized, and how those categories can in turn feed the content text analytics process.

Product Reviews

Product reviews are a great source of input for product development and improvement. They are often a mix of structured and unstructured fields such as the name of the product, the date, the reviewer, a rating, and unstructured text. In addition, product reviews often address some aspects of an organization beyond just the product being reviewed, such as customer service satisfaction or perceived company culture.

Some organizations have easy access to product reviews because they are completed on their own site. In this case, the product reviews are owned data which the organization can easily access and analyze. In other cases, product reviews are aggregated on third-party retail sites. Access to this data is not always straightforward and may have to be purchased. Paid data needs to be considered in the overall text analytics program.

Once access to product reviews is secured, the next decision is what should be analyzed. Including the structured data in the overall information capture is essential, but to get more detailed results, the unstructured text should be considered as well. For more specific insights, select a body of reviews that can be filtered by a specific aspect, such as by product, date, or location.
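
As a minimal sketch of that selection step, the Python below models a review as a mix of structured fields and free text, then filters by product and date. The Review class, its field names, the select_reviews helper, and the sample products are all hypothetical and only illustrate the idea; real review data will have its own schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    # Hypothetical schema: structured fields plus one unstructured text field.
    product: str
    posted: date
    reviewer: str
    rating: int  # structured, e.g. 1 to 5 stars
    text: str    # unstructured free text


def select_reviews(reviews, product, since):
    """Keep reviews for one product posted on or after `since`."""
    return [r for r in reviews if r.product == product and r.posted >= since]


# Made-up sample data for illustration.
reviews = [
    Review("AcmePhone", date(2023, 1, 5), "ann", 2, "Charger broke after a week."),
    Review("AcmePhone", date(2023, 6, 1), "bob", 5, "Love the battery life."),
    Review("AcmeTab", date(2023, 6, 2), "cy", 4, "Solid tablet."),
]

recent = select_reviews(reviews, "AcmePhone", date(2023, 3, 1))
for r in recent:
    print(r.rating, r.text)
```

Narrowing the set this way keeps the structured fields attached to each piece of text, so later findings can still be grouped by rating or date.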

Start by normalizing the text of the content by correcting misspellings and removing other irregularities, such as emojis. Several text preprocessing steps also need to take place. Tokenization breaks text up into smaller units, or tokens. This is done at the sentence and word level. Various normalization processes can also be included, which convert all letters to the same case, remove punctuation, lemmatize and stem text, and remove stop words. Part-of-speech tagging may also be included. The preprocessing performed often depends on the state of the content. Most text analytics software packages include some of these processes.
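
To make a few of these steps concrete, here is a rough sketch using only the Python standard library. It covers sentence and word tokenization, case folding, punctuation and emoji removal, and stop-word removal. The function names and the tiny stop-word list are assumptions made for illustration. Lemmatization, stemming, spelling correction, and part-of-speech tagging are left out because they normally rely on a dedicated text analytics package.

```python
import re

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "is", "it", "the", "this", "to", "was"}


def split_sentences(text):
    """Naive sentence tokenization: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize_words(sentence):
    """Lowercase, then keep runs of letters, digits, and apostrophes as tokens.

    Punctuation and emojis fall outside that character set, so they are dropped.
    """
    return re.findall(r"[a-z0-9']+", sentence.lower())


def preprocess(text):
    """Return one list of stop-word-free tokens per sentence."""
    return [
        [tok for tok in tokenize_words(sentence) if tok not in STOP_WORDS]
        for sentence in split_sentences(text)
    ]


review = "The battery is GREAT!! 😀 But the charger broke. Support was slow to respond."
for sentence_tokens in preprocess(review):
    print(sentence_tokens)
```

The regular expressions here are deliberately simple and will mishandle abbreviations, non-English text, and similar cases, which is one reason a purpose-built package is usually the better choice in practice.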

Once the text is clean and machine readable, there are two options for identification of concepts. If your organization has a taxonomy, the text can be processed and categorized against these values, identifying concepts which are already known and maintained. The second option is to perform entity extraction, identifying previously unknown concepts. Ideally, processing the content using both a taxonomy and entity extraction provides more information.
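
The two options can be sketched side by side. In the Python below, the taxonomy, its categories, and the sample token lists are invented for illustration. The candidate_terms function is only a crude stand-in for entity extraction: it surfaces frequent words the taxonomy does not cover, whereas real entity extraction typically uses trained models or a text analytics package.

```python
from collections import Counter

# Hypothetical taxonomy: category -> terms that signal it.
TAXONOMY = {
    "battery": {"battery", "charge", "charger"},
    "customer service": {"support", "agent", "refund"},
}


def categorize(tokens, taxonomy=TAXONOMY):
    """Option 1: return the taxonomy categories whose terms appear in the tokens."""
    present = set(tokens)
    return sorted(cat for cat, terms in taxonomy.items() if terms & present)


def candidate_terms(token_lists, taxonomy=TAXONOMY, min_count=2):
    """Option 2 (simplified): frequent tokens the taxonomy does not know about."""
    known = set().union(*taxonomy.values())
    counts = Counter(
        tok for tokens in token_lists for tok in tokens if tok not in known
    )
    return {tok: n for tok, n in counts.items() if n >= min_count}


# Already-preprocessed reviews, one token list each (made-up data).
reviews = [
    ["battery", "drains", "fast"],
    ["support", "agent", "helpful"],
    ["hinge", "cracked", "battery", "fine"],
    ["hinge", "loose"],
]

for tokens in reviews:
    print(tokens, "->", categorize(tokens))
print("candidates:", candidate_terms(reviews))
```

In this made-up data, "hinge" recurs but is absent from the taxonomy, which is the kind of signal that could prompt a taxonomy update.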

Simply preparing the text for analysis is a lot of work. However, once the text is normalized and the process is put in place, the resulting analysis happens more quickly. So, what about that analysis? Looking at the analysis should provide numerous insights. For example,

  • The name of my product is in close proximity to a set of negative and positive words. How can I use this information to market or improve my product?
  • My internal enterprise taxonomy and the vocabulary of my product reviewers are very different. Are we out of touch with our customers?
  • The length of the product review is related to the tone. How can I get customers to write more positive information about my product?
  • There are a lot of comments about customer service in the product reviews. How do I use this information to provide better customer service and support?
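
The first observation in the list, a product name appearing close to positive and negative words, can be approximated with a simple word-window count. The sketch below is an assumption-laden illustration: the sentiment word lists, the product name "acmephone", the window size, and the sentiment_near function are all hypothetical, and production tools generally use far richer sentiment models.

```python
from collections import Counter

# Hypothetical sentiment word lists, for illustration only.
POSITIVE = {"great", "love", "reliable"}
NEGATIVE = {"broke", "slow", "disappointing"}


def sentiment_near(tokens, target, window=3):
    """Count sentiment words within `window` tokens of each occurrence of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        start = max(0, i - window)
        # Tokens just before and just after the target, excluding the target itself.
        nearby = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts["positive"] += sum(w in POSITIVE for w in nearby)
        counts["negative"] += sum(w in NEGATIVE for w in nearby)
    return counts


tokens = "i love my acmephone but the acmephone charger broke after a week".split()
result = sentiment_near(tokens, "acmephone")
print(result["positive"], result["negative"])
```

Counts like these do not answer the marketing question on their own, but they show where positive and negative language clusters around the product.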

The questions asked of the content and the insights gained are almost limitless. As a result, the text analytics processes and overall program should be in a constant state of evolution and improvement. It’s possible to use the same processes across various types of content, but analyzing product reviews is different enough from analyzing short comments on social media that the process may need to change.

Process & Integration

A sure way to hinder the success of a text analytics project is to neglect process development. Make sure to take existing systems and ways of working into consideration as well. Starting off with a text analytics proof of concept project on a limited set of content can be a great way to show value. If, however, text analytics is seen as a one-time effort, the program will fade away. As with any other new business process, be sure to integrate text analytics into existing processes, as this will be key to the program’s success.

Text analytics is fairly complicated. It’s important to have a champion who can convey the complexities in a straightforward manner and clearly demonstrate the value to the organization, especially to the C-level executives who sign off on information programs.

Voice of the customer should be an important focus for most organizations, and text analytics is an important part of the equation for generating meaningful insights.