Project 2 – Problem Set 2: Cross-Sectional Data Analysis

Problem Set Overview

This problem set is designed to get you started analyzing the data for your project.

Part 1: R

The purpose of this step is to gain familiarity with your data and identify any issues (e.g., missing data, outliers). Complete the following steps with the convenience store transactions data.

  1. Specify your research question. Review the data dictionary accompanying the three data files and articulate the research question that you will seek to answer for Project 2.

  2. Generate a table of summary statistics. The table should include at least: variable names, mean (average), standard deviation, min, and max.

  3. Generate a ggpairs plot (i.e., a matrix of density and scatter plots) of your quantitative measures. Note that ggpairs draws different panels, such as bar charts and box plots, for qualitative variables.

  4. Perform any necessary data processing. For instance, did you drop any observations? Did you have to transform any variables? Write a narrative of your processing steps justifying your decisions.

  5. Write a narrative for your webpage describing your exploratory data analysis (EDA) and provide an interpretation of your data. (Tip: If someone were to read your narrative, would they be able to reproduce your steps?)
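
As a starting point for steps 2 through 4, here is a minimal sketch in base R. It assumes nothing about the actual files: the data frame `transactions` and its columns (`quantity`, `unit_price`, `store_type`) are hypothetical stand-ins, so replace them with the variables from your own data. The pairs plot uses `ggpairs()` from the GGally package and is skipped if that package is not installed.

```r
# Hypothetical stand-in for the transactions data (made-up values).
# Replace with your own data, e.g. transactions <- read.csv("your_file.csv")
transactions <- data.frame(
  quantity   = c(1, 2, 1, 3, NA, 2),
  unit_price = c(1.99, 3.49, 0.99, 2.50, 4.25, 150.00),
  store_type = c("urban", "rural", "urban", "urban", "rural", "rural")
)

# Step 2: keep only the quantitative columns and summarize each one
numeric_data <- transactions[sapply(transactions, is.numeric)]

summary_table <- data.frame(
  variable  = names(numeric_data),
  mean      = sapply(numeric_data, mean, na.rm = TRUE),
  sd        = sapply(numeric_data, sd,   na.rm = TRUE),
  min       = sapply(numeric_data, min,  na.rm = TRUE),
  max       = sapply(numeric_data, max,  na.rm = TRUE),
  n_missing = sapply(numeric_data, function(x) sum(is.na(x))),
  row.names = NULL
)
print(summary_table)

# Step 3: pairs plot of the quantitative measures (needs the GGally package)
if (requireNamespace("GGally", quietly = TRUE)) {
  print(GGally::ggpairs(numeric_data))
}

# Step 4: one possible processing choice is to drop rows with missing values.
# Record how many rows were dropped so the narrative can report it.
complete_rows <- complete.cases(numeric_data)
clean         <- transactions[complete_rows, ]
n_dropped     <- sum(!complete_rows)
cat("Dropped", n_dropped, "of", nrow(transactions), "observations\n")
```

Dropping incomplete rows is only one option. Whatever you choose for missing values and outliers (such as the hypothetical 150.00 price above), state the rule and the number of affected observations in your narrative.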

Part 2: Tableau

  1. In Tableau, produce an effective visualization to accompany the results from your exploratory data analysis.
  2. In Tableau, demonstrate that you can integrate summary statistics into your map tooltip.

How to Submit

Create a new webpage on your Google Site titled "Project 2 Problem Set 2". Submit the link to that webpage in Canvas.