What is cross-sectional data?
Common types of cross-sectional data and vizzes
Analyzing cross-sectional data
Project 2 overview and example
Cross-sectional data is collected by observing a study population at a single point in time or for a period of time and aggregating information to a single observation per subject.
We call it cross-sectional data because we are observing information for a slice, snapshot, or cross-section of a group of subjects.
This differs from time series data, which follows the same subjects across many points in time; cross-sectional data keeps only one observation per subject.
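To make "aggregating information to a single observation per subject" concrete, here is a minimal sketch in plain Python. The transaction records, the field layout, and the `to_cross_section` helper are all hypothetical and only illustrate the idea; they are not the course data.

```python
from collections import defaultdict

# Hypothetical transaction log: (customer_id, visit_date, amount_spent).
# Several rows per customer, so this is NOT yet cross-sectional.
transactions = [
    ("c1", "2024-01-03", 4.50),
    ("c1", "2024-01-10", 3.50),
    ("c2", "2024-01-05", 16.00),
    ("c2", "2024-01-20", 18.00),
    ("c2", "2024-01-28", 14.00),
    ("c3", "2024-01-15", 5.00),
]

def to_cross_section(rows):
    """Collapse repeated observations into one record per customer."""
    totals = defaultdict(float)
    visits = defaultdict(int)
    for customer_id, _date, amount in rows:
        totals[customer_id] += amount
        visits[customer_id] += 1
    return {
        cid: {"visits": visits[cid], "avg_spend": totals[cid] / visits[cid]}
        for cid in totals
    }

cross_section = to_cross_section(transactions)
for cid, summary in sorted(cross_section.items()):
    print(cid, summary)
```

After the aggregation, the dates are gone and each customer appears exactly once, which is what makes the result a cross-section rather than a time series.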
Individual-level data
Business or point of interest-level data
Country-level data
Region-level data
Spatial data
Cross-sectional analysis typically involves:
One powerful method to uncover these groups is Cluster Analysis, which we introduce next.
Cluster analysis helps businesses, policymakers, and organizations group similar observations to better understand their characteristics, behaviors, and needs.
For example:
Watch this video on K-means clustering
Usually, we repeat this whole process a number of times from different random starting points and choose the group assignment that minimizes the total within-cluster variance.
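The repeat-and-pick-the-best idea can be sketched in plain Python as follows. This is an illustrative, simplified K-means, not the implementation used in the course; the customer points (average spend, visits per month), the function names, and the `n_restarts` and `seed` parameters are assumptions made for the example. In practice a library routine would normally be used instead, for example scikit-learn's `KMeans`, whose `n_init` argument controls the number of restarts.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans_once(points, k, rng, max_iter=100):
    """One K-means run from a random start; returns (labels, centers, wcss)."""
    centers = rng.sample(points, k)  # start from k randomly chosen observations
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = mean_point(members)
    # Within-cluster sum of squares: the "overall variance" being minimized.
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss

def kmeans(points, k, n_restarts=20, seed=0):
    """Repeat K-means from several random starts and keep the best run."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[2])

# Hypothetical customers: (avg. spending per visit, visits per month).
points = [
    (4.0, 26), (4.5, 28), (5.0, 27),
    (16.0, 6), (15.5, 5), (17.0, 7),
    (4.0, 4), (4.5, 3), (5.0, 5),
]
labels, centers, wcss = kmeans(points, k=3)
print(labels, round(wcss, 2))
```

A single run can get stuck in a poor grouping when two starting centers land in the same natural group, which is why the restarts matter. The sketch also leaves the features unscaled; with real data, variables on very different scales are usually standardized first.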
After clustering, summarize attributes of the clusters to understand the groups.
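One way to produce such a summary is to group the observations by their cluster label and compute a few descriptive statistics per group. The sketch below uses only the standard library; the customer rows, the column choices, and the `summarize` helper are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical clustered customers:
# (cluster, spend_per_visit, most_purchased_item, visits_per_month)
customers = [
    (1, 4.00, "Cigars", 26),
    (1, 4.50, "Cigars", 28),
    (1, 5.00, "Coffee", 27),
    (2, 16.00, "Water", 6),
    (2, 17.00, "Water", 7),
    (3, 4.00, "Carbonated Soft Drinks", 4),
    (3, 5.00, "Carbonated Soft Drinks", 5),
]

def summarize(rows):
    """Return one summary record per cluster."""
    groups = defaultdict(list)
    for cluster, spend, item, visits in rows:
        groups[cluster].append((spend, item, visits))
    summary = {}
    for cluster, members in sorted(groups.items()):
        spends, items, visits = zip(*members)
        summary[cluster] = {
            "n": len(members),                                # cluster size
            "avg_spend": mean(spends),                        # average spend
            "top_item": Counter(items).most_common(1)[0][0],  # most frequent item
            "avg_visits": mean(visits),                       # average visits
        }
    return summary

profile = summarize(customers)
for cluster, stats in profile.items():
    print(cluster, stats)
```

Reporting the cluster size alongside the averages helps show whether a segment is large enough to be worth a dedicated promotion.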
Two common methods for selecting the right number of clusters:
Suppose you manage convenience stores and want to segment your customers based on purchasing behavior:
Example clusters could reveal insights like:
Goal: Use sales data to divide your customers into groups to better tailor promotions for each customer segment.
We live in a noisy world and crowded marketplace
People are getting better at ignoring marketing messages, and there are more niches
Increase the chances of reaching the right customers with the right message at the right time
Increase sales, revenue, and hopefully profit
Identify customer personas
Customer stage - leads, prospects, existing customers
Customer demographics - age, gender, income, location, occupation
Customer behaviors - purchase history, web browsing activity
After doing the cluster analysis, you identify three clusters:
Cluster | Avg. spending | Most frequent purchase | Avg. visits per month |
---|---|---|---|
1 | $4.37 | Cigars | 27 |
2 | $16.20 | Water | 6 |
3 | $4.25 | Carbonated Soft Drinks | 4 |
Insights:
The ESRI Tapestry Segmentation Example uses cluster analysis on census demographic data (plus other data) to define groups.
A common application of cluster analysis is to use sales data to divide customers into groups to improve target marketing.
Let’s look at a case study...
In your analysis, you’ll:
Form groups of 2 (THIS WEEK)
We will provide convenience store data on shoppers, stores, and purchases
Steps of the project include: