By Kevin Gray
In RW Connect’s new Quant Essentials series, we discuss critical methodological skills in simple, jargon-free language. The first article, What Is Quantitative Research?, gives more background on the series as a whole. Our second article covered research design, our third sampling, and our most recent questionnaire design.
In this article, we’ll have a quick look at data analysis.
For those of you wishing to study this subject in greater depth, I’ve listed some textbooks and other resources at the end of the article.
Everyone has their own way of doing things. I now mostly use statistical software for exploratory data analysis and to make the tables and graphics I need for my analyses. In many cases, however, it can save time if I have cross tabulations already prepared which I can refer to as needed. What I do now after more than 30 years as a marketing science person probably would not be best for you, so let’s return to the basics (which still apply to me too).
First, use automation as an aid, not as a substitute for thinking.
Never assume the data are clean, even when they have been “cleaned”. If some numbers don’t make sense, it may be an interesting finding that leads to a useful insight…or it may be a data error. Check with the fieldwork or data processing company.
Learn how to read number tables! Many young researchers are losing this fundamental and crucial skill and relying on data visualization instead. This can be unwise since the default settings of automated or semi-automated graphics can mislead us. We need to be careful.
Indexing can give us a distorted picture as well – 90%/45% and 2%/1% both produce an index of 200, yet they describe very different findings – and indexing to the total will penalize larger respondent groups. Client presentations and reports should include number tables sparingly, but I would urge you to look at the raw percentages when you are analyzing and interpreting the data.
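To see why identical indices can hide very different realities, here is a minimal sketch with made-up percentages; nothing in it comes from the article’s own data:

```python
# Minimal sketch with made-up percentages: two subgroups indexed against a
# comparison figure. Both pairs yield an index of 200, even though a
# 45-point gap and a 1-point gap tell very different stories.
pairs = [
    ("Subgroup A vs. total", 0.90, 0.45),   # 90% vs. 45%
    ("Subgroup B vs. total", 0.02, 0.01),   # 2% vs. 1%
]

for label, subgroup_pct, total_pct in pairs:
    index = round(100 * subgroup_pct / total_pct)
    gap = 100 * (subgroup_pct - total_pct)
    print(f"{label}: index = {index}, absolute gap = {gap:.0f} points")
```

Both rows print an index of 200, which is exactly why the raw percentages deserve a look before the indices do.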
Looking at box scores of rating scales (e.g. percent Agree + percent Strongly Agree) is common practice in marketing research, but this discards information. Box scores are also less stable than means or medians and, furthermore, most scales we use in marketing research are better interpreted directionally than literally. Mean and median ratings can easily be converted to a 0-100 percentage score if this is easier for a client to understand.
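The article doesn’t spell out that conversion, so here is one common convention – a linear rescaling of the mean (or median) from the scale endpoints to 0–100. Treat it as a sketch of one approach rather than the author’s own formula:

```python
def rescale_to_100(mean_rating, scale_min=1, scale_max=5):
    """Linearly rescale a mean (or median) rating to a 0-100 score.

    On a 1-5 scale, a mean of 3.8 maps to (3.8 - 1) / (5 - 1) * 100 = 70.
    This is one common convention, not the only one.
    """
    return (mean_rating - scale_min) / (scale_max - scale_min) * 100

print(round(rescale_to_100(3.8), 1))        # 70.0 on a 1-5 scale
print(round(rescale_to_100(5.6, 1, 7), 1))  # 76.7 on a 1-7 scale
```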
Consider having two sets of cross tabs prepared. The first would consist of only the questions and banners you’ll need for the topline report and will be the foundation for your Management Summary. A second, larger set of tabulations that supplements the first set can serve as an electronic appendix for detail-oriented clients.
Use column-wise percentages and read down the page. For some questions, it may be necessary to look at row-wise percentages – for these questions, you may wish to request tabs with the stubs (rows) and banners (columns) reversed. With a spreadsheet, though, it’s easy to compute row-wise percentages from column-wise percentages, especially if you have also requested frequency tables that show the actual numbers. Avoid tables that have both the numbers and percentages on the same page – this causes eyestrain and makes it easier to misread the table.
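As a rough illustration of that spreadsheet logic, here is a short pandas sketch with a hypothetical frequency table, deriving column-wise and row-wise percentages from the same counts:

```python
import pandas as pd

# Hypothetical frequency table: rows are answer categories (stubs),
# columns are banner points. Real tabs would come from your data supplier.
freq = pd.DataFrame(
    {"Age 18-34": [120, 180, 100],
     "Age 35-54": [150, 200, 150],
     "Age 55+":   [90, 160, 250]},
    index=["Aware - Brand X", "Aware - Brand Y", "Not aware"],
)

col_pct = freq.div(freq.sum(axis=0), axis=1) * 100   # read down the page
row_pct = freq.div(freq.sum(axis=1), axis=0) * 100   # read across the page

print(col_pct.round(1))
print(row_pct.round(1))
```

Keeping the counts and the percentage tables in separate views also sidesteps the eyestrain problem of mixing both on one page.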
Unless your client absolutely insists upon it or your company policy mandates it, avoid significance tests. Significance testing has little meaning for the vast majority of consumer surveys for myriad reasons:
- It assumes probability samples, which are almost non-existent in survey research.
- Significance testing is generally conducted under the assumption the data have been collected by simple random (or systematic) sampling, though cluster sampling and other complex sampling schemes are more typical when probability sampling has been employed.
- It assumes no measurement error – again, unrealistic in survey research.
- Multiple comparison procedures offered by tabulation software do not accurately account for the massive number of significance tests we conduct on a typical set of cross tabulations.
- Even with modest base sizes, very small and substantively meaningless differences and correlations will be statistically significant.
- Significance testing is often used as a crutch. “Significant” doesn’t mean important, but many mistakenly believe they can ignore anything that doesn’t have an asterisk next to it, and that anything flagged by an asterisk has managerial importance. In fact, negative findings can be extremely important – for instance, discovering that awareness of your brand is the same across age groups when it is targeted at people in their 20s!
- If you absolutely must spot-check for statistical significance – pseudo-significance, really – you can use Excel or freely downloadable software such as StatCalc; a rough sketch of one such check follows this list.
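If you do run that kind of spot check, the sketch below shows one standard calculation, a pooled two-proportion z-test, using only Python’s standard library. The base sizes and percentages are hypothetical, and every caveat in the list above still applies:

```python
import math

def two_proportion_ztest(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a normal approximation
    return z, p_value

# Hypothetical example: a four-point gap, 50% vs. 46%, on bases of 2,000 per group
z, p = two_proportion_ztest(0.50, 2000, 0.46, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")  # roughly z = 2.53, p = 0.011
```

With bases of 2,000 per group, a four-point gap comfortably clears the conventional 0.05 threshold, which illustrates the earlier point: statistical significance says nothing about managerial importance.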
Look for meaningful patterns in the data instead of relying on significance tests to do your thinking for you. Think (cautiously) about causation and what mechanism or mechanisms might have given rise to the data. Descriptive findings are not insights and pattern recognition is not understanding. These are just early steps in your analysis. It’s very easy to capitalize on chance…only to pay a hefty price later.
When possible, use multivariate analysis (MVA) in place of masses of cross tabs or data mining. When conducted professionally, MVA will allow you to develop richer and more meaningful insights from your data and reduce the risk of fluke results. However, don’t let the method become the Master. Analytics methods and software are tools and we should always use the tools or combination of tools that are best suited to the task at hand.
Don’t try to fly before you can walk. Data analysis – MVA in particular – is much harder to do competently than many may think, and you should not allow yourself to become dazzled by sexy, user-friendly software. Please do not attend a seminar on statistics or analytics offered by a software vendor and walk away thinking you have mastered data analysis! Work closely with your statistical consultants and try to understand their point of view. Few are marketing experts, even after having worked in marketing research for many years. They can learn a lot from you and you can learn a lot from them.
Survey Research Analytics, Stuff Happens and Research Thinking provide many more tips about how to deliver better insights to your client while, at the same time, using your budget more efficiently. A vast number of books and articles have been published about statistics and how to analyze data. Introductory textbooks on marketing research and statistics are a good place to begin.
Theory Construction and Model-Building (Jaccard and Jacoby), while not written specifically for marketing researchers, is relevant to practically anyone doing research. Likewise, though not an easy read, Experimental and Quasi-Experimental Designs (Shadish et al.) is a wonderful resource that can help you both design research and analyze data. Once you’ve had a few years’ experience, and if you’ve taken a few statistics courses, some of the references listed in my company library may be of interest to you.
We hope you’ve found this brief overview of data analysis interesting and helpful!
Kevin Gray is President of Cannon Gray, a marketing science and analytics consultancy. He also co-hosts the audio podcast series MR Realities.