Week 4 Lab: Mini Project

This Lab Contributes to Course Objectives: 2, 3, 4, 5, 8

Learning Objectives R

  • Read data from a URL (web address)

  • Append data to create a single dataset

  • Use data manipulation tools to clean data

Learning Objectives Tableau

  • Combine sheets into a dashboard

  • Add interactivity to your dashboard

  • Format your dashboard

Project Overview

The objective of the project is to integrate the R and Tableau skills that you have learned so far in the course, conduct an analysis, and present your findings. We have covered most of the skills you will need to conduct your analysis. This lab will cover a few tasks that will aid you in developing an entire analytics workflow. By workflow, we mean a sequence of steps beginning with an R script and ending with a set of steps in Tableau to create visuals.

R

Reading data from a file path

Download the data: You now have 3 years worth of data (2017 - 2019). The data are hosted on the course materials webpage here: https://csu-arec-330.github.io/course-materials/.

Import the data: The structure of the data is the same in each year, but you need to combine them into a single dataframe.

Inspect the data: There are some slight problems that need fixing. Then you need to export the data to be used in Tableau.

Start a new R script and get set up:

  • write a brief description of the script

  • set working directory

  • load required packages

Reading data from a URL

Up until now, we have downloaded data and read it into R or Tableau. We want you to understand the process of reading data from your file system and understand the structure of your file system. However, R is capable of downloading and reading files directly from web addresses. Using the read_csv() function from the readr package, you will read in the data and assign the output to a new dataframe.

super_2017 <- read_csv("your_url_here.csv")

Do the same for years 2018 and 2019.

On occasion datasets are hosted on a website and updated regularly. Writing an R script to pull data from another website and process it can save you time in the future.

Data processing

Now you have three dataframes with the same columns. You need to append 2018 and 2019 onto 2017 (i.e., bind the dataframes together).

Once you have a single dataframe with all of the years, inspect the dataframe for missing data. Use the appropriate dplyr functions to fill in the missing data.

Note: Gross margin is calculated as the proportion of revenue that exceeds the cost of goods sold (COGS).

Exporting dataframes

So far, you have read in data and processed it all within R. For this project, you will need to export the cleaned data from R in a format that can be read into Tableau. The readr package contains write_*() functions that are analogous to the read_*() functions.

write_csv(your_dataframe,"filename.csv")

Read through the documentation. Since write_csv() is writing data, you need to tell R what dataframe (e.g., super_2017) you want to write and the name of the file you intend to create filename.csv. Note that you can also write to another directory just as you read from another directory.

Tableau

So far, we have focused on making single graphics. One strength of Tableau is the ability to create interactive dashboards with interconnected elements.

Part 1: Create your visualizations

Follow the steps from previous labs to connect to the dataset and create two graphics:

  1. A line graph plotting total sales by date

  2. A bar chart plotting the count of transactions by payment type

Part 2: Build your dashboard

Now create a dashboard with these two visualizations by following these steps:

  1. Click the ‘New Dashboard’ icon at the bottom of Tableau Desktop.

  2. Drag each of your visualizations (one at a time) to the dashboard window.

  3. Experiment with moving these visualizations around, resizing them, and editing their titles.

  4. Add interactivity 1: Create a filter based on payment type:

  • Select the bar chart

  • Click ‘More Options’ (the little down arrow on the toolbar next to the graph)

  • Select ‘Use as filter’

What did this do?

  1. Add interactivity 2: Create a filter based on city that applied to both visualizations:
  • Go back to the worksheet with your line graph

  • Drag the field ‘City’ to the filters card (and select all)

  • Return to your dashboard

  • Select the line chart

  • Click ‘More Options’ -> ‘Filter’ -> ‘City’

  • Change the city filters

What happens?

  • Select the new pane with your city filter

  • Click ‘More Options’ -> ‘Apply to Worksheets’ -> ‘Selected Worksheets’ (this dropdown appears next to the city filter)

What did this do?

  1. Play around with formatting:
  • Try adding text. Can you change the text box so that you can place it on top of your figures?

  • Try adding an image from the web.

  • Try adding an extension (just pick one that sounds cool)