How To Check If Your Social Listening & Analytics Is Appropriate For Customer Insights

By Michalis Michael

Before we start sharing real data and case studies from social listening, we thought it prudent to explain how this is properly done for market research purposes. The most important point to understand and accept is that, unlike DIY social media monitoring tools, social listening for market research needs a 3-4 week human-led set-up phase to reach the data accuracy that is of paramount importance for this purpose (see Fig. 1). Once the set-up for a product category, or any other subject, in a specific language is complete, it is possible to take a real-time DIY approach, as with any other social media monitoring tool.

Figure 1: Social Media Listening & Analytics Process for MR (Source: DigitalMR Presentation at LT-Accelerate Conference 2015)

In the first article of this series, we mentioned that it is important for insights experts to be able to connect the dots between listening, asking questions, and tracking behaviour. To do that, an insights expert needs to trust that the thousands of posts analysed are actually about the brands and product category of interest. This brings us to the first of three issues to pay attention to when using social listening for market research purposes.

Noise Elimination

The set of keywords used to collect posts from social media and other public websites is called a “harvest query”. This harvest query can be as simple as one word or as complex as multiple pages of Boolean logic. The problem with harvesting only the relevant posts is that we would also need to know, in advance, all of the irrelevant synonyms and homonyms of our keywords, which we never do. Thus, an iterative process is required, involving humans who improve the harvest query as they find new irrelevant words that they did not think of during the previous iteration. The most common example we use to make this clear is this: when we want to harvest posts about “Apple computers”, we know from the beginning that there will be posts about apple the fruit, so we create a harvest query that excludes posts about the fruit; but what about (the actress) Gwyneth Paltrow’s daughter named Apple, whom everybody talks about on Twitter? I’m sure you see my point…
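The iteration described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the keyword lists, the sample posts, and the helper names (`tokens`, `matches`) are all hypothetical, and a real harvest query would be far longer and written in the query language of whichever tool is used.

```python
import re

# Hypothetical harvest query terms (illustrative only).
INCLUDE = ["apple"]
# First iteration: exclusions we can think of up front (the fruit).
EXCLUDE_V1 = ["fruit", "pie", "orchard", "juice"]
# Second iteration: terms added after a human reviewed the harvested posts.
EXCLUDE_V2 = EXCLUDE_V1 + ["gwyneth", "paltrow"]

def tokens(post):
    """Lower-case the post and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", post.lower()))

def matches(post, include, exclude):
    """Boolean logic: (any include term) AND NOT (any exclude term)."""
    words = tokens(post)
    return any(t in words for t in include) and not any(t in words for t in exclude)

posts = [
    "My new Apple laptop is fast",
    "Baked an apple pie today",
    "Gwyneth Paltrow posted a photo of Apple on her birthday",
]

first_pass = [p for p in posts if matches(p, INCLUDE, EXCLUDE_V1)]
second_pass = [p for p in posts if matches(p, INCLUDE, EXCLUDE_V2)]
```

In this toy data set the first pass still lets the post about Gwyneth Paltrow's daughter through; only after a reviewer spots it and extends the exclusion list does the second pass keep just the post about the computer.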

Sentiment Accuracy

There are quite a few ways to annotate posts with sentiment, ranging from manual annotation to linguistic or statistical methods of NLP (Natural Language Processing). There are pros and cons for each method, especially when we are looking at a data set of 10,000 posts or fewer that will be used for a one-off report. However, for any continuous reporting, or even a one-off report with over 20,000 posts, using humans as opposed to machines is both expensive and slow. In the previous article of this series we talked about the proper metrics for accuracy: Precision & Recall. Most social media monitoring tools can barely achieve a sentiment precision of 60%; as a matter of fact, in all cases when we were asked to check, their accuracy ranged between 44% and 53%. Anything over 70% sentiment precision could be acceptable at the beginning of a tracking project for market research, but it should then climb over 80% within a short period of time.
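To make the two metrics concrete, here is a minimal sketch that checks machine sentiment against a human-annotated sample. It assumes the standard per-class definitions of precision and recall; the function name, the labels, and the ten-post sample are hypothetical and are not data from any real tool.

```python
def precision_recall(human, machine, label):
    """Per-class precision and recall of machine labels against human labels."""
    # True positives: posts where human and machine both assigned the label.
    tp = sum(1 for h, m in zip(human, machine) if h == label and m == label)
    predicted = sum(1 for m in machine if m == label)  # machine said `label`
    actual = sum(1 for h in human if h == label)       # human said `label`
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

# Hypothetical validation sample: the same ten posts annotated twice.
human   = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg", "pos", "neu"]
machine = ["pos", "neg", "pos", "pos", "neu", "pos", "neu", "neg", "neg", "neu"]

pos_precision, pos_recall = precision_recall(human, machine, "pos")
```

Here the machine labelled four posts as positive and three of them agree with the human annotator, so positive precision is 75%; the human found four positive posts and the machine caught three of them, so positive recall is also 75%.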

Semantic Accuracy

When we say semantic analysis, we mean analysing the topics of online conversation around the product category of interest. As with sentiment accuracy, precision & recall are the appropriate metrics to measure semantic accuracy. If a hierarchical taxonomy (a dictionary with multiple layers/hierarchies that describes a product category with the words people use in their social media posts) is used to report on topics for market research purposes, over 85% semantic precision for hierarchy 1 topics is achievable. You will have noticed that even though we mention “recall” as one of the accuracy measures, in this big data analytics space we have not used it to describe what is appropriate for market research purposes. Recall for semantic analysis is about how many of the posts on a certain topic that actually exist in a data set were identified as such. In the world of big data, where we deal with millions of posts, it is in our interest, in order to be cost efficient, to only look at keywords that are mentioned multiple times. If a keyword is mentioned only a couple of times in a data set of millions, we will be entering “diminishing returns territory” if we attempt to annotate posts with it. It is, however, possible to maximise recall, and it should be the end-client’s decision whether they want to spend their money and time this way.
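The trade-off between cost and recall can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two-topic taxonomy, the `MIN_MENTIONS` threshold, the sample posts, and the `annotate` helper are hypothetical and much simpler than a real multi-layer taxonomy.

```python
import re
from collections import Counter

# Hypothetical hierarchy 1 topics mapped to the keywords people use for them.
TAXONOMY = {
    "price": ["cheap", "expensive", "discount"],
    "taste": ["delicious", "bland", "sweet"],
}
# Hypothetical cut-off: ignore keywords mentioned fewer times than this.
MIN_MENTIONS = 2

def tokens(post):
    """Lower-case the post and split it into a set of words."""
    return set(re.findall(r"[a-z']+", post.lower()))

def annotate(posts, taxonomy, min_mentions):
    """Label each post with the topics whose frequent keywords it contains."""
    tokenised = [tokens(p) for p in posts]
    # Count in how many posts each word appears.
    counts = Counter(w for words in tokenised for w in words)
    # Keep only the keywords that clear the mention threshold.
    active = {
        topic: [k for k in keywords if counts[k] >= min_mentions]
        for topic, keywords in taxonomy.items()
    }
    return [
        sorted(t for t, keywords in active.items() if any(k in words for k in keywords))
        for words in tokenised
    ]

posts = [
    "So cheap and delicious",
    "Too expensive for me",
    "Cheap but bland",
    "Delicious, got a discount",
]

labels = annotate(posts, TAXONOMY, MIN_MENTIONS)
labels_max_recall = annotate(posts, TAXONOMY, 1)
```

With the threshold at two mentions, the post "Too expensive for me" receives no topic because "expensive" appears only once, which is the recall that is given up for efficiency. Lowering the threshold to one recovers it, at the cost of maintaining and checking many more rarely used keywords.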

Let me know if you have questions; in the meantime, stay tuned for the next article which will be about some real data from social listening.

By Michalis A. Michael, DigitalMR. Connect with Michalis via @DigitalMR, his personal @DigitalMR_CEO, or via email at mmichael@digital-mr.com
