Colin Strong
We are often so used to the tools we use that we forget they reflect and shape the way in which we understand the world.
One of our earliest tools, the map, originally simply reflected what humans could see in front of them, with no particular accuracy. Over time, with an understanding of the proportions of a space, including the space beyond what is immediately visible, maps became more realistic. As Nicholas Carr puts it in his excellent book The Shallows, “The more frequently and intensively people used maps, the more their minds came to understand reality in the maps’ terms.” He argues that they “advanced the evolution of abstract thinking” in man, as he was then able to “understand the unseen forces that shape his surroundings and his existence.”
Carr makes a similar case for the mechanical clock, which again changed the way we saw ourselves. The clock defined time in terms of units of equal duration, so we were able to start comprehending the concepts of division and measurement. We began to see, in the world around us, how the whole is composed of individual pieces, which are in turn composed of pieces, and so to understand that there are abstract patterns behind the visible appearance of the material world. This mindset effectively propelled us out of the Middle Ages, into the Renaissance and then the Enlightenment.
Frequentist vs prior probability
Nate Silver, particularly in his book The Signal and the Noise, has recently been making a very persuasive case that the dominant school of statistics we use has shaped the way in which we think about the world. He considers that “frequentist” statistics, which are almost universally used across most disciplines, including market research, imply that we can arrive at a definitive conclusion on the matter in question if only we use our tools accurately and effectively – in other words, that “uncertainty is something that is intrinsic to the experiment rather than something intrinsic to our ability to understand the world.”
In contrast to frequentist statistics, “Bayesian” statistics require us to think about the world in a connected, consequential manner, rather than in an isolated way. The Bayesian statistician is required to identify the ‘prior probability’ of a hypothesis being true. Evidence is then collected that allows us to estimate the probabilities that the hypothesis is true or false. These are applied to the prior probability to derive a ‘posterior probability.’ And, of course, the posterior probability can become the new prior probability when new information comes in. Silver works through a variety of examples of this process, from analysing breast cancer probabilities, to the likelihood of terrorist attacks, to the drivers of climate change.
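To make the mechanics concrete, here is a minimal sketch in Python of a single Bayesian update, in the spirit of the screening examples Silver works through. The figures are purely illustrative assumptions, not numbers taken from his book:

```python
def bayes_update(prior, p_evidence_given_true, p_evidence_given_false):
    """Return the posterior probability that a hypothesis is true,
    given a prior and the likelihood of the observed evidence."""
    numerator = prior * p_evidence_given_true
    denominator = numerator + (1 - prior) * p_evidence_given_false
    return numerator / denominator

# Illustrative screening example (hypothetical figures): the prior chance of a
# condition is 1.4%; a positive test is 75% likely if the condition is present
# and 10% likely if it is not.
posterior = bayes_update(0.014, 0.75, 0.10)
print(round(posterior, 3))  # ~0.096: one positive result raises the probability to roughly 10%

# The posterior becomes the new prior when further evidence comes in.
posterior_after_second_test = bayes_update(posterior, 0.75, 0.10)
print(round(posterior_after_second_test, 3))  # ~0.444 after a second positive result
```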
Identifying a prior probability is often considered to be highly subjective, yet this is typically what happens in social science – we consider the context of our findings to avoid giving them undue weight. To take a simple example, consumers often over-estimate their propensity to purchase a new product or service. A Bayesian approach forces researchers to look at the broader market for new products and services, and to generate a prior probability based on the historical success rate of comparable launches.
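A rough sketch of how a market base rate might temper a strong survey finding, again with entirely hypothetical figures chosen for illustration:

```python
# Hypothetical illustration: assume roughly 20% of comparable product launches
# succeed (the prior), and that strong stated purchase intent shows up in 70%
# of eventual successes but also in 30% of eventual failures.
prior_success = 0.20
p_intent_given_success = 0.70
p_intent_given_failure = 0.30

posterior_success = (prior_success * p_intent_given_success) / (
    prior_success * p_intent_given_success
    + (1 - prior_success) * p_intent_given_failure
)
print(round(posterior_success, 2))  # ~0.37: encouraging, but far from a sure thing
```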
Silver points out that the lack of explicit focus on the broader context can mean that we generate false positives – studies that have statistically significant findings but which are manifestly wrong, such as one purporting to show how toads can predict earthquakes. Other influential voices also point out the ways in which the intentions of the researcher and their effects on experimental design render studies based on frequentist statistics particularly vulnerable to false positives.
Big Data
To date, market research has been able to manage these challenges through the intelligent application of frequentist statistics. Of course we look at the context prior to deciding which analysis to run, of course we apply market knowledge to weed out false positives. As market research moves into the sphere of big data, however, there are serious implications.
Much of the speculation around big data has effectively dismissed the need for an understanding of context. Indeed, Chris Anderson famously championed the perspective that we no longer need to really understand consumers or hold theories of human behaviour – we can simply use large computers to uncover the important patterns and trends. Anderson suggests that some sciences have drifted into “arid, speculative theorising,” with the implication that in the meantime big data is breaking new ground.
So the heritage of commercial big data is firmly rooted in highly numeric disciplines, including statistics, computer science, applied mathematics and economics. All very useful, but not necessarily areas in which analysts will have the experience and understanding of the market context to be able to estimate prior probabilities.
This potential lack of contextual understanding is compounded by the sheer volume of big data, which points to the need for machine-based analytics systems. These further remove the analyst from bringing a contextual view to the data, making a high number of false positives likely. The sheer volume of analyses being run means that even the most intelligent frequentist statistician will struggle to take context into account. This seems to call for machine-based Bayesian approaches in which contextual information is integrated into the analysis.
Watts suggests that perhaps the best we can do when attempting to make predictions within complex systems is to model the probability of particular outcomes, something a Bayesian approach is well suited to delivering. This gives the recipient a much better understanding of how robust the prediction is, and as such gives guidance on the appropriate business response.
Rain, rain, go away!
However, reporting predictions in terms of probabilities also has its issues, as the study of weather forecasting – a complex system if ever there was one – illustrates. Many studies have found that we are poor at dealing with probabilistic guidance: we don’t want to know that there is a 40% chance of rain tomorrow, we want to know definitively whether it will rain so we can take appropriate action. The same principles apply to business decisions – a product manager will generally be underwhelmed by guidance saying there is a 50% chance of generating sales of, say, $50m. The language traditionally demanded by the business case is far more definitive than that. Yet perhaps we need to work at communicating better in this way, to avoid the inevitable overconfidence that results when we provide predictions of absolute numbers within complex systems.
Given that there is so much variance in the data, in many situations a probabilistic prediction is perhaps as much as we can offer. So whilst we can predict with great confidence how the aerodynamics of an aircraft wing will behave, predicting the sales volumes of a new product involves far more variance that is hard to measure, and as such a probabilistic approach may well be more appropriate.
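As a minimal sketch of what such a probabilistic prediction might look like in practice, the following simulation uses entirely hypothetical inputs (market size, share captured, price) to report the chance of hitting a revenue target rather than a single point forecast:

```python
import random

random.seed(1)

def simulate_first_year_sales():
    # Assumed uncertain inputs, all hypothetical: market size in units,
    # share captured and average selling price.
    market_size = random.gauss(20_000_000, 5_000_000)
    share = random.uniform(0.02, 0.10)
    price = random.gauss(30.0, 5.0)
    return max(market_size, 0) * share * price

# Simulate many plausible outcomes rather than producing one number.
outcomes = [simulate_first_year_sales() for _ in range(100_000)]

target = 50_000_000  # the $50m sales target mentioned above
p_hit_target = sum(1 for s in outcomes if s >= target) / len(outcomes)
print(f"Chance of reaching $50m: {p_hit_target:.0%}")
```

The output is a probability of reaching the target under the stated assumptions, not a definitive sales figure, which is exactly the shift in communication the article is describing.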
What are the chances?
Debates are currently raging about the relative merits of each approach, and pragmatists among us argue that there is no difference between the sensible Bayesian and the sensible frequentist. However, Bayesian approaches certainly chime with the complex, nuanced world we find ourselves in, and a tool that helps us to think in a more probabilistic manner, with a strong focus on the wider context, may well be better suited to our times.
Colin Strong is managing director at GfK NOP Business & Technology in the UK
1 comment
It is possible that people will not fully understand the Bayesian approach, because a 60% chance of rain tomorrow morning means rain on 60% of all similar mornings, given the same weather model and similar conditions in the environment … and it will take at least some time to develop a feel for that level of probability …
I would say there are two approaches. One is to take results with a high probability and make a big bet on them; the other is to make small bets on results with a lower probability. That is rather like trading on the stock exchange. However, for a manager, adopting this mindset means accepting the possibility of being wrong, and in business being wrong can mean being fired afterwards … It’s a kind of vicious circle.