Quant Essentials: Sampling

By Kevin Gray

In RW Connect’s new Quant Essentials series, we discuss critical methodological skills in simple, jargon-free language. The first article, What Is Quantitative Research?, gives some more background about the series. Our second article was about research design.

Our topic in this article is sampling.

Though it doesn’t excite most people, sampling is fundamental to nearly any kind of research. It’s probably easiest to explain sampling by using an illustration. Let’s imagine that you work in the human resources (HR) department of a large company. Your databases include information on all company employees, and represent the population of employees working at your company. If you wanted to see what percentage of your employees are female, or what their average age is, you could query the appropriate databases to obtain this information. This would be analogous to looking up data from a national census.

On the other hand, say you are a data scientist working in HR and have been asked to conduct statistical modeling – predicting employee churn, for instance. This kind of modeling is often sophisticated and may demand a substantial amount of computing power. Therefore, it might be more practical for you to draw a sample of records of current and former employees and use that sample for your modeling. Since you would be using a sample and not the population, there would be some loss of precision. Most statistical procedures have been developed for small samples, however, and originally that was all most statisticians had to work with.

In marketing research, we almost never are in the position where a census of consumers is feasible. We must use samples because of budget or time constraints. Fortunately, it is seldom necessary to interview millions of consumers. Sample quality does matter though, and one sample is not just as good as another.
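To see why millions of interviews are seldom needed, here is a minimal sketch of the textbook margin-of-error calculation for a proportion. It assumes simple random sampling and the normal approximation; the function name, the 95% multiplier of 1.96 and the sample sizes are illustrative choices, not something taken from a particular study.

```python
import math

def margin_of_error(p, n, population_size=None, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes simple random sampling and the normal approximation.
    If population_size is given, the finite population correction is applied.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        # The correction shrinks the error as the sample approaches a census.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return z * se

# Illustrative sample sizes only; p = 0.5 is the most conservative case.
for n in (100, 1_000, 10_000):
    print(n, round(margin_of_error(0.5, n), 3))
```

Under these assumptions, a sample of 1,000 gives a margin of error of roughly three percentage points, whether the population is ten thousand or ten million. This applies only to well-drawn probability samples, which is why sample quality matters so much.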

There are two basic kinds of samples: probability and non-probability. With probability sampling, each unit (e.g., consumer) has a known, non-zero chance of being selected. This is often called random sampling, since some form of random (or systematic) selection mechanism is employed. Fieldwork staff do not choose who participates in the research and who does not. With non-probability sampling, some elements of the population have no chance of selection, or the probability of their selection cannot be precisely determined. In the case of mall and street intercepts, fieldwork staff do choose who participates in the research and who does not. These are non-probability samples.

Simple random sampling (SRS) and systematic (“every nth”) sampling are the sampling procedures most of us would probably think of when we hear “random sample.” There are also stratified and cluster samples, among many other kinds. In cluster (or multi-stage) sampling, we take samples of smaller units within larger units. For example, we might sample geographic areas, and then housing units and, finally, individuals within these housing units. Stratification entails breaking down the target population into segments, such as age group or gender, before sampling and then taking independent samples within each stratum. Quota sampling is a non-probability method that resembles stratified sampling except that selection of units is at least partly judgmental.
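For readers who like to see things in code, the three most familiar procedures can be sketched in a few lines of Python. This is an illustration only: the population of 1,000 consumers, the age groups and the function names are all made up for the example, and real sampling frames are rarely this tidy.

```python
import random

def simple_random_sample(units, n, rng):
    """Every possible subset of size n is equally likely."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Pick a random start, then take every k-th unit ("every nth")."""
    k = len(units) // n           # sampling interval; assumes len(units) >= n
    start = rng.randrange(k)      # random starting point within the first interval
    return [units[start + i * k] for i in range(n)]

def stratified_sample(units, stratum_of, fraction, rng):
    """Split the population into strata, then sample independently within each."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population: 1,000 consumers, each tagged with an age group.
age_groups = ("18-34", "35-54", "55+")
population = [{"id": i, "age_group": age_groups[i % 3]} for i in range(1000)]

srs = simple_random_sample(population, 100, rng)
systematic = systematic_sample(population, 100, rng)
stratified = stratified_sample(population, lambda u: u["age_group"], 0.10, rng)
```

In each case a random mechanism, not the fieldwork staff, decides who is selected. The stratified version guarantees that every age group appears in the sample in proportion to its size, which simple random sampling does not.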

There is an important but often overlooked distinction between probability sampling and a probability sample. A research agency may utilize a probability sampling procedure for a consumer survey, for instance, but because many people invited to join the survey refuse to do so, the sample the research agency obtains is not a true probability sample. If the differences between the respondents and those refusing to participate are small, then this self-selection won’t matter much and the respondents can be treated as a probability sample with negligible risk. If the research agency had this information, however, there would be no need to conduct the survey in the first place.

Post-survey weight adjustments can be used in many situations to make the actual sample represent the target population more closely. Marketing researchers often weight survey data by age, gender, region or other variables for which national census data are available. Weighting cannot transform a non-probability sample into a probability sample, though, and we can only try to make our sample more representative of the population. Weighting can be tricky and is not a panacea.
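As a rough illustration of the simplest kind of weight adjustment, the sketch below weights each respondent by the ratio of their group's population share to its sample share. The 60/40 gender split in the sample and the 50/50 population figures are invented for the example, and the function name is hypothetical. Real weighting schemes usually involve several variables at once and are considerably more involved.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight each respondent by population share / sample share of their group."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]

# Hypothetical sample that over-represents women relative to the population.
sample_groups = ["female"] * 60 + ["male"] * 40
population_shares = {"female": 0.5, "male": 0.5}  # assumed census figures

weights = poststratification_weights(sample_groups, population_shares)

# After weighting, the sample's gender split matches the population's.
weighted_female_share = (
    sum(w for w, g in zip(weights, sample_groups) if g == "female") / sum(weights)
)
```

Here each woman receives a weight below 1 and each man a weight above 1, so the weighted sample is balanced on gender. The weights correct only for the variables used to build them; they do nothing about differences on characteristics we have not measured.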

Something else to bear in mind is that we very often use online panels in marketing research. The quality of these panels can vary enormously. Internet access is not universal in any country, and is rare in some, so there will always be some skew in any panel. The panelists may be different in important ways from the population of consumers we’re studying but, judging from sales figures and other data, they often are close enough overall for most marketing research purposes. Online Panel Research: A Data Quality Perspective (Callegaro et al.) is a good source for more information about this topic, as is the Public Opinion Quarterly (AAPOR).

Marketing research textbooks introduce the fundamental concepts of sampling but I would encourage you to learn about the total survey error framework developed by survey methodologists. If you’d like to study sampling in depth, Survey Sampling (Kish), Sampling Techniques (Cochran) and Model Assisted Survey Sampling (Särndal et al.) are classic, if technical, references. Sampling: Design and Analysis (Lohr) and Practical Tools for Designing and Weighting Survey Samples (Valliant et al.) are recent and also excellent. The Research Methods Knowledge Base is a handy online reference. Hard-to-Survey Populations (Tourangeau et al.) is a fascinating read and will be of particular interest to those of you who study specialized populations.

We hope you’ve found this brief overview of sampling interesting and helpful!

Kevin Gray is President of Cannon Gray, a marketing science and analytics consultancy. He also co-hosts the audio podcast series MR Realities. He would like to thank Stas Kolenikov, Senior Scientist, Abt Associates, for reviewing a draft of this article.

