Smart Data & Analytics

Data Science and Analytics Demystified

Kevin Gray

Analytics is H-O-T. Unfortunately, it seems to mean different things to different people. In the business media and blogosphere the term is often attached to data science or big data, apparently leading some to believe that analytics is something new and exclusively a part of data science or big data. These misunderstandings are compounded by substantial confusion about data science and big data themselves and a lack of generally agreed-upon definitions of either term. This confluence of hype and fuzziness impacts decision-makers and budgets in many industries, not only marketing researchers going about their daily business.

Some basics
I should acknowledge at the outset that much of what is called analytics or data science has no relationship to marketing.[1] Medical and pharmaceutical research, fraud detection, credit scoring, oil and gas exploration, military and security are just a few of the other areas in which they are in use.

Being a marketing science person, I am sometimes asked questions along the lines of “what does analytics look like?” That these questions are at times posed by experienced marketing researchers brings to light just how much mystification there has been about these topics. While I tend to associate analytics with the Renaissance and the emergence of the scientific method, I recognize that its origins can be traced much further back.[2] Analytics can be defined in many ways, and one is to define it very broadly, as a research procedure for decision making:

Analytics is the discovery and communication of meaningful patterns in data. It makes use of information technology, statistics and mathematical algorithms to develop knowledge, to quantify performance or to make predictions. It uses the insights gained from this process to recommend action or to guide decision making. Analytics is best thought of as a research procedure for decision making, not simply as isolated tools or steps in a process.

This is my own definition, with no official sanction, though it has been heavily inspired by Wikipedia and other knowledgeable sources. For operational purposes, I sometimes find it helpful to break down this procedure into eight basic components:

  1. Defining Objectives
  2. Data Collection
  3. Data Preparation and Cleaning
  4. Model Building
  5. Model Evaluation
  6. Interpretation
  7. Scoring New Data or Simulations Using the Model
  8. Communication of Results and Implications to Decision Makers
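
To make the middle steps of this procedure more tangible, here is a minimal sketch in Python of steps 3 through 7, using only the standard library. Everything in it is an assumption made for illustration: the toy data (monthly store visits and monthly spend), the choice of a simple least-squares line as the model, and the helper names fit_line, predict and mean_abs_error. It is one possible instance of the workflow, not a recommended method.

```python
from statistics import mean

# Hypothetical toy data: (monthly store visits, monthly spend in dollars).
# None marks a missing value that the cleaning step has to deal with.
raw = [(2, 41.0), (4, 79.0), (None, 55.0), (6, 122.0), (8, 158.0),
       (10, 201.0), (3, None), (5, 99.0), (7, 142.0), (9, 178.0)]

# Step 3: data preparation and cleaning (here, drop incomplete records).
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Keep the last two clean records aside for evaluation.
train, test = clean[:-2], clean[-2:]


def fit_line(rows):
    """Step 4: least-squares fit of spend = intercept + slope * visits."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict(model, x):
    """Step 7: score a new case with the fitted model."""
    intercept, slope = model
    return intercept + slope * x


def mean_abs_error(model, rows):
    """Step 5: evaluate the model on records it has not seen."""
    return mean(abs(y - predict(model, x)) for x, y in rows)


model = fit_line(train)
intercept, slope = model
mae = mean_abs_error(model, test)

# Step 6: interpretation. The slope is the estimated change in monthly
# spend associated with one additional visit per month.
print(f"Each extra visit is associated with about {slope:.1f} more in spend")
print(f"Mean absolute error on held-out records: {mae:.1f}")

# Step 7: score a new (hypothetical) customer who visits 12 times a month.
new_customer_spend = predict(model, 12)
print(f"Predicted monthly spend at 12 visits: {new_customer_spend:.0f}")
```

Steps 1, 2 and 8 are deliberately absent: defining objectives, collecting data and communicating results to decision makers are not things a code fragment can do.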

Another and perhaps more common way to look at analytics is as statistical procedures, essentially steps 4-7 in the process I’ve outlined above. There are countless methods for analysing data and, for simplicity, I’ve listed some general categories of statistical methods below (with brief illustrations given in parentheses):

  • Descriptive and Exploratory Analysis (frequencies, means, bar charts)
  • Models that Predict (predicting consumption frequency of new customers)
  • Models that Explain (identifying brand choice drivers)
  • Analysis of Cross Sectional Data (data collected at one period in time)
  • Analysis of Longitudinal or Time Series Data (data collected at several periods in time)
  • Models with Quantitative Dependent Variables (monthly spend)
  • Models with Categorical Dependent Variables (product user/non user)
  • Time to Event Models (customer churn analysis)
  • Methods that Group Variables (factor analysis of attribute ratings)
  • Methods that Group Cases (cluster analysis of consumers)
  • Text Mining (analysis of social media conversations)
  • Simulations and Forecasts (sales forecasts under various marketing mix scenarios)

This list may strike you as long but, if anything, I’ve probably omitted classifications some readers feel are important. I should mention that the various types of methods I’ve cited can overlap and are sometimes combined in one project. Also, there are models that both predict and explain; for many projects, however, one or the other may be more fundamental to the objectives and, reflecting this, I’ve separated prediction and explanation above. I should note that “model” is frequently used ambiguously.[3]
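
As a small illustration of the first category in the list above, descriptive and exploratory analysis, the following Python sketch computes frequencies, group means and a crude text bar chart. The survey records and the variable names are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical survey records: (segment, product user?, monthly spend).
responses = [
    ("young", True, 62.0), ("young", False, 18.0), ("young", True, 55.0),
    ("older", True, 80.0), ("older", False, 25.0), ("older", False, 30.0),
    ("older", True, 74.0), ("young", True, 49.0),
]

# Frequencies: how many users versus non-users.
user_counts = Counter(
    "user" if is_user else "non-user" for _, is_user, _ in responses
)

# Means: average monthly spend within each segment.
spend_by_segment = defaultdict(list)
for segment, _, spend in responses:
    spend_by_segment[segment].append(spend)
mean_spend = {seg: mean(vals) for seg, vals in spend_by_segment.items()}

# A crude text bar chart of the frequencies.
for label, count in user_counts.most_common():
    print(f"{label:<9}{'#' * count} ({count})")

for segment, value in sorted(mean_spend.items()):
    print(f"Mean monthly spend, {segment}: {value:.2f}")
```

Most of the other categories in the list build on summaries like these; a predictive or explanatory model usually starts from the same kind of tabulated data.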

Then and now
Analytics and data science, by assorted definitions, have always been part of marketing research; what has changed is the amount and variety of data now available and the range of tools now at our disposal for analyzing data. These differences are not trivial nor, in my opinion, are they revolutionary. To make my views more concrete, I’ll take you on a brief journey back to the 1980s, when I was beginning my career as a marketing researcher with a financial services company in Manhattan. At the time there was no World Wide Web and no social media, and PCs had limited capacity and were mainly regarded as curiosities. However, there was plenty for a researcher to do. We conducted quite a lot of survey research by postal mail or telephone. Some of our postal surveys were entirely DIY, from design through analysis and reporting. I also spent many days behind one-way mirrors observing focus groups, which were run by professional moderators and qualitative analysts.

For legal, operational and marketing purposes my company maintained a substantial amount of customer data, which were mostly updated by batch processing overnight, not in real time. Our computers were mainframes networked together into “virtual machines” and processing was quite speedy. Some data were still stored on magnetic tape but most had been migrated to servers. In addition to older software, I used SAS for building data files and for analysis; we did not have a data warehouse and I don’t recall even having heard that term until many years later. Instead, with the support of my very helpful and capable MIS colleagues, I created data files ad hoc for specific purposes.

By today’s standards these tasks were a headache, but much of the computer code had already been written and I was able to handle most of these steps with minimal supervision. A larger challenge was finding out what data were kept in which parts of the organization and what the assorted data codes meant. Since the data ecosystem was not static, there were always new things to learn but this data management part of my role was not daunting. Though this may come as a surprise to some, even in those ancient times we were able to integrate “hard” customer data with “soft” data from surveys or with exogenous data such as economic trends and perform fairly advanced statistical analyses of merged data files.

What has changed and what hasn’t?
Absolutely none of what I’ve just described was my own idea and my company was not on the cutting edge of these developments, at least within the financial services sector. Much of my job, in its fundamentals, was very similar to what I now often hear referred to as data science. To be sure, there have been dramatic advances made since the ’80s. Big data, as many would characterize it today,[4] was scarce and Hadoop didn’t exist. Bayesian methods were very limited and techniques such as LARS, Stochastic Gradient Boosting and Random Forests had not yet been developed.

However, as someone who still does data science, at least by some definitions,[5] I cannot accept claims that it is entirely new, and I find it disturbing that it has caught many marketing researchers by surprise. That many veteran researchers have become confused about the meaning of analytics is even more worrying, since analysis is a core component of marketing research! The truth is that, for many years in the US and other countries, there have been specialist agencies and consultancies working in this space and many clients, as I’ve noted, have been doing these sorts of things internally for quite some time. To one degree or another, the multinational marketing research agencies and their ancestors have also been at these things for a long while. Terms such as database marketing, data mining and predictive analytics describe much of what is now loosely called analytics or data science, and these terms have been in use since the ’80s and ’90s. Moreover, aren’t “traditional” marketing science methods, such as driver analysis and segmentation, really just forms of analytics and data science?

So, while much has changed, much hasn’t.

Professional development
To be fair, many marketing researchers put in very long hours and work under extremely tight budget and time constraints. There is seldom much leeway for experimentation in spite of our persistent chatter about innovation. Like it or not, many of us are highly specialised and things other marketing researchers have been up to for years can easily escape our attention, and the daily struggle to meet our targets makes it hard to keep up with new developments even within our own areas of expertise.

There is a pressing need for more cross training as well as more in-depth training among both buyers and sellers of marketing research, especially for those whose experience has mostly been limited to traditional methods such as tracking studies and focus groups. Besides seminars, conferences and MOOCs, there are literally thousands of books, articles and online sources devoted to analytics and data science. I can also recommend the American Statistical Association and Royal Statistical Society to marketing scientists, and in the Notes section below I’ve listed some textbooks on analytics topics I have found very useful, if not light reading.[6]

Change is a threat to those who stick too closely to the tried and true but an opportunity for those able to blend new skills with knowledge that has stood the test of time.

Kevin Gray is president of Cannon Gray, a marketing science and analytics consultancy.

_____________________________________________________________________________________

Notes

[1] See “Where Analytics, Data Mining, Data Science is applied”:

[2] Counting may be natural for many species. See, for example, “More Animals Seem to Have Some Ability to Count”: Cats and dogs certainly appear to have a knack for pattern recognition and anomaly detection!

[3] I’ve offered some thoughts on what constitutes a model in “A Model’s Many Faces”:

[4] There doesn’t appear to be much agreement about what big data means:

[5] Data science can be defined in numerous ways: Regarding this topic, a veteran US recruiter working in this space recently commented that “sometimes things don’t change as much as all the terminology changes!” (personal communication).

[6] These recommendations are purposely skewed and there are many books on marketing, psychology, survey research, programming and other topics valuable to marketing scientists. Some titles have been abbreviated to conserve space.

– Handbook of Statistical Distributions (Krishnamoorthy)

– Practical Tools for Designing and Weighting Survey Samples (Valliant et al.)

– Design and Analysis of Experiments (Montgomery)

– Experimental and Quasi-Experimental Designs (Shadish et al.)

– Propensity Score Analysis (Guo and Fraser)

– Methods of Meta-Analysis (Schmidt and Hunter)

– Applied Multivariate Statistical Analysis (Johnson and Wichern)

– The R Book (Crawley)

– Regression Modeling Strategies (Harrell)

– Categorical Data Analysis (Agresti)

– Multilevel and Longitudinal Modeling (Rabe-Hesketh and Skrondal)

– Time Series Analysis (Wei)

– Multiple Time-Series Analysis (Lütkepohl)

– Bayesian Data Analysis (Gelman et al.)

– Applied Bayesian Hierarchical Methods (Congdon)

– Risk Assessment and Decision Analysis (Fenton and Neil)

– The Data Warehouse Toolkit (Kimball and Ross)

– Data Mining Techniques (Linoff and Berry)

– Data Mining (Witten et al.)

– Applied Predictive Modeling (Kuhn and Johnson)

– An Introduction to Statistical Learning (James et al.)

– Elements of Statistical Learning (Hastie et al.)

 
