Project 3 Problem Set 1: Starting Project 3 and working with panel data

Problem Set Overview

This problem set is intended to get you started on your third project for the course.

The Question & The Data

Your question for project 3 should be in the form of: What is the association between \(x\) (an explanatory variable) and \(y\) (some outcome)? For this project, you will use the convenience store data (shopper_info, store_info, gtin) to select your \(y\), and choose one of the two additional datasets (census data or weather data) to choose your \(x\).

Building your panel dataset

Once you have the question in mind, you need to build the dataset. Building the dataset may look different depending on your question and the regression model you intend to estimate. For this problem set, you will practice building datasets using both the Census data and the weather data.

Part 1: Using Census data

  1. In the lab, we loaded median income from the American Community Survey (ACS) Census data. For your problem set, find the table associated population. Then, create a new spatial dataframe that does the following:
  • Only includes the data for the 48 contiguous US states.

  • Joins population to the geocoded store data.

  • Calculates the population density (population per square miles). Note: You will need to refer to the Tiger Shapefile data dictionary to determine what units aland is measured in and perform the necessary mutation. Tip: Refer back to the lab notes from Week 3.

  • Export the data to a csv file that can be read into Tableau.

  1. Generate a filled map in Tableau that plots county-level population density. Your final map should only include the data for the 48 contiguous US states. Note: We covered filled maps in Tableau in Week 10.

Part 2: Using weather data

  1. Read in the weather data and create a dataframe with the total monthly rainfall for each county. Tip: Refer back to the lab notes from Week 3.

  2. Read in the shopper info data and create a dataframe that aggregates total sales across stores at the county-level by month. Note: There is only one month of data.

  3. Join the dataframe from #1 (total precipitation) with the dataframe from #2 (total sales) and create a dataset where the unit of analysis is at the county level. Export the data into a csv file that can be read into Tableau.

  4. Generate a map in Tableau that plots county-level total rainfall and total sales. You decide how best to display the data so that both pieces of information are portrayed effectively. Only include the data for the 48 contiguous US states.

Project 3

  1. Who are your Project 3 team members?

  2. Indicate whether you are selecting Option 1 or Option 2, described above.

  3. What is the relationship you want to examine?

How to Submit

You should create a new webpage on your Google site titled Project 3 Problem Set 1. Your webpage should include the code you used to generate the data for the two maps, the two maps themselves, and responses to the questions, where pertinent. Submit the link to your Google site webpage in Canvas. You should work on building your own webpage, but only one member of the group needs to submit the link.