Successful marketing campaigns all have one thing in common: they are closely tailored to their target groups. But finding and reaching these target groups can prove tricky for online marketers. Unless you study users’ behavior as part of a comprehensive web analysis, you can only guess whether your planned marketing steps are having the desired effect. Such an analysis usually starts from the complete data set, from which you can find out, for example, which devices visitors use to access the website. A different approach to web analysis is known as cohort analysis. Here, instead of evaluating all of the collected information at once, visitors are divided into groups (cohorts) that are analyzed separately. The criteria that define the cohorts can vary considerably, as discussed below.
Cohort analysis: definition
For decades, the concept of cohort analysis has played an important role in statistical surveys in social science and demography. Cohorts (from the Latin 'cohors', meaning 'crowd') are groups of people who share a common defining characteristic or event. This could be, for example, the year of birth, the year they started working, or a historical event they experienced, such as a president’s inauguration. For birth cohorts, the term 'generation' is often used. In a cohort analysis (also referred to as a 'cohort study'), the behavioral changes of the defined groups are examined over a period of time. Once you’ve collected the data, you can either:
- Obtain an accurate picture of a single cohort (intra-cohort study), in order to analyze, for example, the development of the birth rate or changes in consumer behavior (either over a long period, or on a sample basis).
- Compare that cohort with at least one other group of people (inter-cohort study), in order to gain useful insights into behavioral differences.
At the end of the 19th century, statisticians Karl Becker (1874) and Wilhelm Lexis (1875) laid the foundation for the analysis of specific population groups. Through advancements made by demographer Pascal Whelpton (1949), these approaches, known as cohort analyses, finally gained international recognition. The aim of Whelpton’s research was to analyze the increase in the US birth rate after WWII. Today the method is increasingly used for studies in medicine, politics, and the market economy.
Implementation and interpretation
Cohort studies can be carried out in two different ways: you can define the cohorts now and follow them into the future (prospective cohort study), or you can draw on data from the past in order to analyze the present (retrospective cohort study). To implement either type of cohort analysis, the following steps need to be taken:
- Define the research question and aim: to obtain relevant information, you have to ask the right questions. Only when you have concrete ideas about the content and purpose of the investigation can you give the study the structure it needs.
- Define cohort events: the second step is to define the events that place a person in a cohort (for example, a first purchase), choosing events that can lead to an answer to the research question.
- Determine relevant cohorts: now you determine which cohorts, and how many, are to be part of the study. It is also possible to split the cohorts you have formed or to narrow them down further.
- Perform the cohort study and evaluate it: once the desired cohorts have been determined, you can carry out the respective type of study (prospective/retrospective, inter-/intra-cohort study) and interpret the data obtained.
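The steps above can be sketched for a retrospective study in a few lines of Python. This is only a minimal illustration, not a prescribed method: the order log, the choice of "month of first order" as the cohort event, and the names `ORDERS`, `month_index`, and `cohort_table` are all assumptions made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order log (customer id, order date), invented for illustration.
ORDERS = [
    ("a", date(2023, 1, 5)),
    ("a", date(2023, 2, 10)),
    ("b", date(2023, 1, 20)),
    ("c", date(2023, 2, 3)),
    ("c", date(2023, 3, 15)),
    ("a", date(2023, 3, 1)),
]


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month - 1


def cohort_table(orders):
    """Count active customers per cohort and per month since the cohort event."""
    # Cohort event (assumed here): the month of a customer's first order.
    first_month = {}
    for customer, day in orders:
        m = month_index(day)
        if customer not in first_month or m < first_month[customer]:
            first_month[customer] = m

    # Collect the distinct customers active in each (cohort, month offset) cell.
    active = defaultdict(set)
    for customer, day in orders:
        offset = month_index(day) - first_month[customer]
        active[(first_month[customer], offset)].add(customer)

    # Turn the cells into a readable table keyed by "YYYY-MM" cohort label.
    table = defaultdict(dict)
    for (cohort, offset), customers in active.items():
        label = f"{cohort // 12}-{cohort % 12 + 1:02d}"
        table[label][offset] = len(customers)
    return {label: dict(sorted(row.items())) for label, row in sorted(table.items())}


TABLE = cohort_table(ORDERS)
for label, row in TABLE.items():
    print(label, row)
```

Reading one row of the table corresponds to an intra-cohort view (how one cohort behaves over time), while comparing rows at the same month offset corresponds to an inter-cohort view.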
The changes in behavior you want to identify with a cohort analysis are determined by three factors, or effects. Evaluating and weighting these effects is the main task of interpretation:
- Cohort effects
- Age effects
- Period effects
Cohort effects are the behavioral differences and changes between different cohorts. They can generally be explained by the different social and environmental influences each cohort was exposed to. Age effects, on the other hand, are the changes that can be attributed to people growing older and to the attitudes that change with age. Lastly, period effects are behavioral changes that result from changing environmental conditions, regardless of generational and socio-demographic factors.
From these three effects, you can identify clear trends in the behavior of individual groups and, on that basis, develop forecasts or solution strategies. The main task is to separate age, cohort, and period effects, which can overlap in every result. This difficulty is known as the identification problem of cohort analysis; only if you take it into account can you pin down a clear reason for the behavioral changes.
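Why the three effects are hard to separate can be seen from a simple identity: for a birth cohort, age always equals the observation year minus the birth year. The following minimal sketch uses invented observations (the names `OBSERVATIONS`, `apc_rows`, and `rows_with_age` are assumptions for this illustration) to show that holding one of the three quantities fixed forces the other two to vary together.

```python
# Hypothetical observations (birth year, survey year), invented for illustration.
OBSERVATIONS = [(1960, 2000), (1960, 2010), (1970, 2000), (1970, 2010), (1980, 2010)]


def apc_rows(observations):
    """Attach the age implied by each (cohort, period) pair."""
    return [
        {"cohort": birth, "period": survey, "age": survey - birth}
        for birth, survey in observations
    ]


def rows_with_age(rows, age):
    """Return the (cohort, period) pairs observed at one fixed age."""
    return [(r["cohort"], r["period"]) for r in rows if r["age"] == age]


ROWS = apc_rows(OBSERVATIONS)

# The identity holds for every row, so age carries no information
# beyond what cohort and period already contain.
ALWAYS_DEPENDENT = all(r["age"] == r["period"] - r["cohort"] for r in ROWS)

# Comparing people of the same age means comparing different cohorts
# at different periods, so a difference could stem from either effect.
SAME_AGE_40 = rows_with_age(ROWS, 40)

print(ALWAYS_DEPENDENT)
print(SAME_AGE_40)
```

In this toy data, the two groups observed at age 40 differ both in birth year and in survey year, which is why a behavioral difference between them cannot be attributed to a cohort effect or a period effect without further assumptions.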
The benefit of cohort analysis in marketing
Analyzing the market and the associated target groups is an important part of the strategic planning that precedes every marketing campaign. In online marketing, the focus is increasingly on the behavior of users. The millions of data points that have already been collected serve as a strong basis for further planning, but this information first needs to be evaluated thoroughly. If you want to go a step further than learning how the average user behaves, and want to group visitors according to specific criteria, cohort analysis is the right tool. For observing the behavior of new and existing customers or recognizing regional trends, this procedure has long been an indispensable tool in e-commerce.
Example: cohort analysis in e-commerce
Cohort analyses enable you to check very precisely how successful your marketing campaigns are, as the following example shows:
As an online store owner, you decide on a complete redesign with a new layout. To check how the new design is faring with customers, you look at the recorded transactions and divide your customers into existing customers (cohort 1) and new customers (cohort 2). After two months, you look at the results and notice that the total number of transactions has decreased. Without further information, you might conclude that the new layout wasn’t well received. A look at the separate figures for the two cohorts, however, could reveal one of two more nuanced scenarios:
- Cohort 1 (existing customers) completed more transactions than before the store redesign. In contrast, there were fewer purchases made by cohort 2 (new customers).
- There were more purchases made by cohort 2 (new customers) than before. Cohort 1 (existing customers) carried out fewer transactions.
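The first scenario can be made concrete with a small calculation. The transaction counts below are invented purely for illustration, and the names `COUNTS` and `change_by_cohort` are assumptions of this sketch; real figures would come from your own analytics data.

```python
# Hypothetical transaction counts for the two months before and after the
# redesign. All numbers are made up to illustrate the first scenario.
COUNTS = {
    "existing": {"before": 400, "after": 450},
    "new": {"before": 300, "after": 200},
}


def change_by_cohort(counts):
    """Return the change in transactions per cohort and for all customers."""
    change = {}
    for cohort, c in counts.items():
        change[cohort] = c["after"] - c["before"]
    # The overall figure is the sum of the per-cohort changes.
    change["total"] = sum(change.values())
    return change


CHANGE = change_by_cohort(COUNTS)
print(CHANGE)
```

With these assumed numbers, the overall figure falls by 50 transactions even though existing customers bought more than before; the decline comes entirely from new customers, a detail the aggregate number alone would hide.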
Cohorts: the more specific, the more meaningful
The example above shows the main advantage of cohort analysis: it is much more flexible and specific than a mere analysis of average user behavior. Thanks to the data collection features of current tools such as Google Analytics, you are no longer limited to differentiating between new and existing customers; the tools also help you examine the behavior of more complex cohorts. You can, for example, include the age and location of customers, or the device being used, in the categorization. This gives you the information you need to respond to the needs of individual customer groups.