Techniques

The ABCs of Bayesian Statistics

By Kevin Gray and Andrew Gelman

Marketing scientist Kevin Gray asks Columbia University Professor Andrew Gelman to spell out the ABCs of Bayesian statistics.

KG: Most marketing researchers have heard of Bayesian statistics but know little about it. Can you briefly explain in layperson’s terms what it is and how it differs from the ‘ordinary’ statistics most of us learned in college?

AG: Bayesian statistics uses the mathematical rules of probability to combine data with ‘prior information’ to give inferences which – if the model being used is correct – are more precise than would be obtained by either source of information alone.

Classical statistical methods avoid prior distributions. In classical statistics, you might include a predictor in your model (for example), or you might exclude it, or you might pool it as part of some larger set of predictors in order to get a more stable estimate. These are pretty much your only choices. In Bayesian inference you assign a prior distribution representing the set of values the coefficient can take. You can reproduce the classical methods using Bayesian inference, but in Bayesian inference you can do much more – by setting what is called an ‘informative prior’, you can partially constrain a coefficient, striking a compromise between noisy least-squares estimation and setting it entirely to zero. This turns out to be a powerful tool in many problems.
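To make that compromise concrete, here is a minimal sketch in Python (the data, the prior scale tau, and the assumption of a known noise level are all hypothetical, not from the interview): a normal prior centered at zero gives a coefficient estimate that is a precision-weighted average of the least-squares estimate and zero, with tau controlling the balance.

    # Minimal sketch, not from the interview: a normal prior on a regression
    # coefficient partially constrains it, interpolating between the noisy
    # least-squares estimate and zero. Noise sd is assumed known for simplicity.
    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma, true_beta = 20, 2.0, 0.5
    x = rng.normal(size=n)
    y = true_beta * x + rng.normal(scale=sigma, size=n)

    beta_ls = x @ y / (x @ x)  # classical least-squares estimate

    def map_estimate(tau):
        """Posterior mode under beta ~ Normal(0, tau^2): a precision-weighted average."""
        return (x @ y / sigma**2) / (x @ x / sigma**2 + 1.0 / tau**2)

    for tau in [0.01, 0.5, 100.0]:
        print(f"prior scale tau = {tau:6.2f} -> estimate = {map_estimate(tau): .3f}")
    print(f"least squares              -> estimate = {beta_ls: .3f}")
    # A very small tau pulls the coefficient to roughly 0; a very large tau
    # reproduces the least-squares answer; intermediate values compromise.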

KG: Could you give us a quick overview of its history and how it has developed over the years?

AG: The theory of Bayesian inference originates with its namesake, Thomas Bayes, an 18th-century English cleric, but it really took off in the late 18th century with the work of the French mathematician and physicist Pierre-Simon Laplace. Bayesian methods were used for a long time after that to solve specific problems in science, but it was in the mid-20th century that they were proposed as a general statistical tool.

Within statistics, Bayesian and related methods have become gradually more popular over the past several decades, often developed in different applied fields, such as animal breeding in the 1950s, educational measurement in the 1960s and 1970s, spatial statistics in the 1980s, and marketing and political science in the 1990s. Eventually a sort of critical mass developed in which Bayesian models and methods that had been developed in different applied fields became recognized as more broadly useful.

Another factor that has fostered the spread of Bayesian methods is progress in computing speed and improved computing algorithms. Except in simple problems, Bayesian inference requires difficult mathematical calculations – high-dimensional integrals – which are often most practically computed using stochastic simulation (computation using random numbers). This is the so-called Monte Carlo method, which was developed systematically by the mathematician Stanislaw Ulam and others when trying out designs for the hydrogen bomb in the 1940s and then rapidly picked up in the worlds of physics and chemistry. The potential for these methods to solve otherwise intractable statistics problems became apparent in the 1980s, and since then each decade has seen big jumps in the sophistication of algorithms, the capacity of computers to run these algorithms in real time, and the complexity of the statistical models that practitioners are now fitting to data.
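For a flavor of what stochastic simulation looks like in practice, here is a minimal sketch in Python (a toy normal-mean model, not one of the production algorithms alluded to above): a random-walk Metropolis sampler approximates a posterior by simulation rather than by evaluating the integral in closed form.

    # Minimal sketch of Monte Carlo posterior computation on a toy problem:
    # data y ~ Normal(theta, 1), prior theta ~ Normal(0, 5).
    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.normal(loc=1.5, scale=1.0, size=30)  # simulated data

    def log_post(theta):
        # log posterior up to a constant: Gaussian likelihood plus Gaussian prior
        return -0.5 * np.sum((y - theta) ** 2) - 0.5 * theta**2 / 5.0**2

    draws, theta = [], 0.0
    for _ in range(20_000):
        proposal = theta + rng.normal(scale=0.5)   # random-walk proposal
        if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
            theta = proposal                       # accept; otherwise keep current theta
        draws.append(theta)

    print("posterior mean estimate:", np.mean(draws[5_000:]))  # drop warm-up draws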

Now, don’t get me wrong – computational and algorithmic advances have become hugely important in non-Bayesian statistical and machine learning methods as well. Bayesian inference has moved, along with statistics more generally, away from simple formulas toward simulation-based algorithms.

KG: What are its key strengths in comparison with Frequentist methods? Are there things that only Bayesian statistics can provide? What are its main drawbacks?

AG: I wouldn’t say there’s anything that only Bayesian statistics can provide. When Bayesian methods work best, it’s by providing a clear set of paths connecting data, mathematical/statistical models, and the substantive theory of the variation and comparison of interest. From this perspective, the greatest benefits of the Bayesian approach come not from default implementations, valuable as they can be in practice, but in the active process of model building, checking, and improvement. In classical statistics, improvements in methods often seem distressingly indirect – you try a new test that’s supposed to capture some subtle aspect of your data, or you restrict your parameters or smooth your weights, in some attempt to balance bias and variance. Under a Bayesian approach, all the tuning parameters are supposed to be interpretable in real-world terms, which implies – or should imply – that improvements in a Bayesian model come from, or supply, improvements in understanding of the underlying problem under study. The drawback of this Bayesian approach is that it can require a bit of a commitment to construct a model that might be complicated, and you can end up putting effort into modeling aspects of data that maybe aren’t so relevant for your particular inquiry.

KG: Are there misunderstandings about Bayesian methods that you often encounter?

AG: Yes, but that’s a whole subject in itself – I have written papers on the topic! The only thing I’ll say here is that Bayesian methods are often characterized as ‘subjective’ because the user must choose a ‘prior distribution’, that is, a mathematical expression of prior information. The prior distribution requires information and user input, that’s for sure, but I don’t see this as being any more ‘subjective’ than other aspects of a statistical procedure, such as the choice of model for the data (for example, logistic regression) or the choice of which variables to include in a prediction, the choice of which coefficients should vary over time or across situations, the choice of statistical test, and so forth. Indeed, Bayesian methods can in many ways be more ‘objective’ than conventional approaches in that Bayesian inference, with its smoothing and partial pooling, is well adapted to including diverse sources of information.

KG: Do you think Bayesian methods will one day mostly replace Frequentist statistics?

AG: There’s room for lots of methods. What’s important in any case is what problems they can solve. We use the methods we already know and then learn something new when we need to go further. Bayesian methods offer a clarity that comes from the explicit specification of a so-called ‘generative model’ – a probability model of the data-collection process and a probability model of the underlying parameters. But construction of these models can take work, and it makes sense to me that for problems where you have a simpler model that does the job, you just go with that.
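To spell out what a ‘generative model’ means operationally, here is a minimal sketch in Python (a hypothetical yes/no survey, not an example from the interview): one probability statement for the underlying parameter and one for the data-collection process, written as a forward simulation.

    # Minimal sketch of a generative model (hypothetical survey example):
    # a probability model for the parameter and one for the data collection.
    import numpy as np

    rng = np.random.default_rng(2)

    def simulate_survey(n_respondents=500):
        support = rng.beta(2, 2)                           # prior: true support rate
        answers = rng.binomial(1, support, n_respondents)  # data model: yes/no responses
        return support, answers

    true_support, answers = simulate_survey()
    print(f"true support {true_support:.2f}, observed rate {answers.mean():.2f}")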

Looking at the comparison from the other direction, when it comes to big problems with streaming data, Bayesian methods are useful but the Bayesian computation can in practice only be approximate. And once you enter the zone of approximation, you can’t cleanly specify where the modeling approximation ends and the computing approximation begins. At that point, you need to evaluate any method, Bayesian or otherwise, by looking at what it does to the data, and the best available method for any particular problem might well be set up in a non-Bayesian way.

KG: Does Bayesian inference have a special role in Big Data or the Internet of Things?

AG: Yes, I think so. The essence of Bayesian statistics is the combination of information from multiple sources. We call this data and prior information, or hierarchical modeling, or dynamic updating, or partial pooling, but in any case it’s all about putting together data to understand a larger structure. Big data, or data coming from the so-called internet of things, are inherently messy – scraped data not random samples, observational data not randomized experiments, available data not constructed measurements. So statistical modeling is needed to put data from these different sources on a common footing. I see this in the analysis of internet surveys where we use multilevel Bayesian models and non-random samples to make inferences about the general population, and the same ideas occur over and over again in modern messy-data settings.
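As a small illustration of how partial pooling puts sources on a common footing, here is a minimal sketch in Python (the estimates, sample sizes, and variance components are all hypothetical): each group’s raw estimate is pulled toward the overall mean, and noisier groups are pulled harder.

    # Minimal sketch of partial pooling with assumed variance components:
    # groups with less data are shrunk more strongly toward the overall mean.
    import numpy as np

    group_means = np.array([0.62, 0.48, 0.71, 0.55])  # raw estimate from each source
    group_sizes = np.array([40, 400, 15, 120])        # observations behind each estimate
    sigma_y = 0.5                                     # assumed within-group sd
    tau = 0.05                                        # assumed between-group sd

    grand_mean = np.average(group_means, weights=group_sizes)
    sampling_var = sigma_y**2 / group_sizes           # variance of each raw group mean
    weight = tau**2 / (tau**2 + sampling_var)         # weight on the group's own data
    pooled = weight * group_means + (1 - weight) * grand_mean

    for n, raw, est in zip(group_sizes.tolist(), group_means, pooled):
        print(f"n = {n:4d}   raw = {raw:.2f}   partially pooled = {est:.2f}")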

KG: What are the developments in Bayesian statistics that might have an impact on the behavioral and social sciences in the next few years?

AG: Lots of directions here. From the modeling direction, we have problems such as polling where our samples are getting worse and worse, less and less representative, and we need to do more and more modeling to make reasonable inferences from sample to population. For decision making we need causal inference, which typically requires modeling to adjust for differences between so-called treatment and control groups in observational studies. And just about any treatment effect we care about will vary depending on scenario. The challenge here is to estimate this variation, while accepting that in practice we will have a large residue of uncertainty. We’re no longer in the situation where ‘p < .05’ can be taken as a sign of a discovery. We need to accept uncertainty and embrace variation. And that’s true no matter how ‘big’ our data are.

KG: Thank you, Andrew!

Kevin Gray is president of Cannon Gray, a marketing science and analytics consultancy. Andrew Gelman is Professor of Statistics and Political Science and Director of the Applied Statistics Center at Columbia University. Professor Gelman is also one of the principal developers of the Stan software, which is widely used for Bayesian analysis.
