Problem Set 3: Data Processing with R

Link to the description of the supermarket sales data available in the Supermarket Sales Data Overview.

Write an R script to complete the following tasks:

  1. Read in the dataset supermarket_sales.csv using read_csv(). Note: What packages do you need to have installed before loading in the data using this command?

  2. Calculate the total value of the sale using the unit_price and quantity columns. Name the new column (variable) subtotal. Then verify that the value labeled tax_5_percent is indeed 5% of the subtotal by creating a new variable called tax_verify. Assign the object to a new dataframe.

  3. Create a dataframe containing only the subset of sales from the product line Food and beverages.

  4. Create a dataframe containing only the columns city, product_line, unit_price, quantity, total, rating where the product line is Food and beverages.

  5. Sort the dataframe by quantity in descending order.

  6. Load the dplyr package and use the appropriate commands to calculate the median sales by payment type.

  7. Suppose you are asked to develop a new performance indicator for the company. You wonder if the transaction rating per unit price might provide insights into consumer preferences for different product lines. Calculate the rating per unit price for each transaction and call this new variable rup. Explain what this performance indicator might tell decision-makers at the company.

  8. Calculate the mean rup and unit price by product line.

  9. Print the contents of this dataframe into the console using the print() function. What conclusions do you draw from your analysis? Explain.

  10. Include the following lines of code to ensure that you have loaded all the necessary packages for this assignment:

version
print(.packages())
  1. Generate a log file from your script to show that it ran successfully in R. Use the sink() command to capture both the code and its output. Here is a hint for the structure of your script:
sink("problem_set_3.log")  # Start capturing output
source("problem_set_3.R", echo = TRUE)  # Run your script and show commands + output
sink()  # Stop capturing output

How to Submit

  1. Create a Webpage on Your Google Site:
  • Title the new webpage Problem Set 3.
  • This page should contain an R script log file that responds to all questions of the problem set.
  1. Submit the Link to Your Google Site Page:
  • Copy the URL of your Google Site webpage for Problem Set 3.
  • Submit this link via Canvas.