Motivation for analyzing time series data
Time series analysis
Forecasting and prediction
Project 1 overview
Time series data consists of observations collected at successive points in time, typically at regular intervals (e.g., daily, monthly, annually). This type of data is useful for identifying trends, seasonality, and making forecasts.
Example: Monthly sales revenue of a grocery store over five years.
Cross-sectional data captures information at a single point in time across multiple entities. Unlike time series, this data does not track changes over time but rather compares differences between subjects at a given moment.
Example: Household income levels surveyed across different cities in a single year.
Panel data, or longitudinal data, combines elements of both time series and cross-sectional data. It follows the same entities (e.g., individuals, firms, regions) over multiple time periods, allowing for analysis of both individual differences and time-based changes.
Example: Annual employment records of the same group of workers over a 10-year period.
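The three data types differ mainly in which dimensions vary. A minimal sketch in plain Python, where every number, city label, and worker ID is made up for illustration and the helper names (`series_for`, `cross_section`) are hypothetical:

```python
# Hypothetical values, invented only to show the shape of each data type.

# Time series: one entity observed over many time periods.
monthly_sales = {"2023-01": 120.0, "2023-02": 115.5, "2023-03": 130.2}

# Cross-sectional: many entities observed in one time period.
income_2023 = {"City A": 58_000, "City B": 64_500, "City C": 51_200}

# Panel: many entities, each observed over many periods,
# stored here with (entity, period) keys.
employment = {
    ("worker_1", 2022): "employed",
    ("worker_1", 2023): "employed",
    ("worker_2", 2022): "unemployed",
    ("worker_2", 2023): "employed",
}


def series_for(panel, entity):
    """Slice a panel down to one entity's time series."""
    return {period: value for (e, period), value in panel.items() if e == entity}


def cross_section(panel, period):
    """Slice a panel down to one period's cross-section."""
    return {entity: value for (entity, p), value in panel.items() if p == period}


worker_2_history = series_for(employment, "worker_2")
snapshot_2023 = cross_section(employment, 2023)
```

Fixing the entity in a panel leaves a time series, and fixing the period leaves a cross-section, which is why panel data is described as combining the two.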
The problem: We don’t know what will happen in the future.
The solution: We can use past data to make a guess about the future.
Time series data is a sequence of data points collected over time.
It typically consists of observations taken at regular intervals (e.g. every hour, every day, every week, etc.) and can be used to study trends and patterns over time.
Example: Store sales volume over time.
Time Series Analysis
learn about trends
compare current period to past periods
Forecasting
make an educated guess about future values
develop business strategy around forecasts
A statistical method often used with time series data is time series decomposition.
This method breaks down time series data into several components, each representing an underlying pattern within the data.
Isolating these components and analyzing them separately is important for the following reasons:
Decomposition statistically deconstructs a time series into several components:
seasonality
trend
cyclical
residual or “noise”
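One common way to carry out such a decomposition is the classical additive approach: estimate the trend with a centered moving average, average the detrended values by season, and treat what is left as the residual. The sketch below assumes an additive model and a known seasonal period, and it does not separate the cyclical component (it stays inside the trend estimate). The function names and the synthetic series are illustrative, not part of the course material.

```python
def centered_moving_average(values, period):
    """Trend estimate; None where the window does not fit at the ends."""
    n = len(values)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half : t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: give the two end points half weight each.
            trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def decompose_additive(values, period):
    """Split values into trend, seasonal, and residual parts.

    Assumes at least two full periods of data so that every
    seasonal position has a detrended observation.
    """
    trend = centered_moving_average(values, period)
    detrended = [v - tr if tr is not None else None
                 for v, tr in zip(values, trend)]

    # Average the detrended values at each position within the period.
    means = []
    for pos in range(period):
        obs = [d for i, d in enumerate(detrended)
               if d is not None and i % period == pos]
        means.append(sum(obs) / len(obs))

    # Shift the seasonal effects so they sum to zero over one period.
    offset = sum(means) / period
    means = [m - offset for m in means]

    seasonal = [means[i % period] for i in range(len(values))]
    residual = [v - tr - s if tr is not None else None
                for v, tr, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic quarterly series: a linear trend plus a repeating pattern.
pattern = [2.0, -1.0, 0.0, -1.0]
values = [float(t) + pattern[t % 4] for t in range(12)]
trend, seasonal, residual = decompose_additive(values, period=4)
```

Because the synthetic series contains no noise, the recovered seasonal effects match `pattern` and the residuals are essentially zero. Real data would leave a nonzero residual.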
Sometimes, data doesn’t follow a smooth pattern—something big happens, and the trend changes suddenly.
These major shifts in the data are called structural breaks.
A structural break occurs when there is a sudden and lasting change in how the data behaves. This means the underlying factors driving the data (also called the data-generating process) have changed.
Identifying these changes helps analysts understand whether a trend is continuing as expected or if something important has altered the pattern.
Common Causes of Structural Breaks:
Example: Imagine you are analyzing restaurant sales over time. For years, sales followed a seasonal pattern, with higher sales in summer and lower in winter. Suddenly, in 2020, there was a huge, lasting drop in sales. This was a structural break—caused by the COVID-19 pandemic, which led to lockdowns and changes in consumer habits.
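A simple way to look for a single break in the level of a series is to try every possible split point and keep the one where two separate averages fit the data best. The sketch below is only a rough illustration: it assumes one break, ignores seasonality, and always returns a split even when no real break exists, so it is not a formal statistical test. The function name and the sales figures are hypothetical.

```python
def find_mean_shift(values, min_segment=2):
    """Return (break_index, sse) for the best two-level fit.

    break_index is the first index of the second segment; sse is the
    sum of squared errors around the two segment means.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_sse = None, float("inf")
    for k in range(min_segment, len(values) - min_segment + 1):
        total = sse(values[:k]) + sse(values[k:])
        if total < best_sse:
            best_index, best_sse = k, total
    return best_index, best_sse


# Made-up sales figures with a sudden, lasting drop after the fifth value.
sales = [100, 102, 98, 101, 99, 60, 62, 58, 61, 59]
break_index, fit_error = find_mean_shift(sales)
```

Here `break_index` is 5, the position where the lower level begins. An analyst would still need outside knowledge to explain why the break happened.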
On your own: Describe one example of a structural break in a time series.
We might want to forecast...
power demand, to decide whether to build a new power plant
call volumes, to schedule staff in a call center
inventory requirements, to meet demand
Use historical data to develop a model, then use the model to predict the future
Model quality depends on past data and assumptions
Forecasts are uncertain; we can quantify some of that uncertainty
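As one illustration of quantifying uncertainty, the sketch below uses the naive method (the forecast equals the last observed value) and attaches an approximate 95% prediction interval that widens with the forecast horizon. It assumes the one-step errors are uncorrelated and roughly normal. The function name and the data are hypothetical.

```python
import math


def naive_forecast(history, horizon, z=1.96):
    """Return a list of (point, lower, upper) tuples, one per step ahead."""
    last = history[-1]

    # One-step naive errors are the period-to-period changes.
    changes = [b - a for a, b in zip(history, history[1:])]
    sigma = math.sqrt(sum(c * c for c in changes) / len(changes))

    forecasts = []
    for h in range(1, horizon + 1):
        # For the naive method the forecast standard deviation grows with sqrt(h).
        margin = z * sigma * math.sqrt(h)
        forecasts.append((last, last - margin, last + margin))
    return forecasts


# Made-up history; forecast three periods ahead.
history = [10, 12, 11, 13, 12]
forecasts = naive_forecast(history, horizon=3)
```

The point forecast stays at the last value, while the interval gets wider for each step further into the future, reflecting growing uncertainty.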
Problem definition
Gathering data and institutional knowledge
Preliminary (exploratory) analysis
Choosing and fitting models
Using and evaluating a forecast model
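The last two steps can be sketched by holding back the most recent observations, fitting simple candidate methods to the rest, and comparing their errors on the held-back data. The two candidates here (naive and historical mean) and the mean absolute error measure are choices made for illustration, and the series is made up.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def evaluate(series, test_size):
    """Score two simple forecast methods on the last test_size points."""
    train, test = series[:-test_size], series[-test_size:]
    candidates = {
        # Naive: repeat the last training value.
        "naive": [train[-1]] * test_size,
        # Mean: repeat the average of the training values.
        "mean": [sum(train) / len(train)] * test_size,
    }
    return {name: mean_absolute_error(test, predicted)
            for name, predicted in candidates.items()}


# Made-up series with an upward drift; hold out the last two points.
series = [20, 22, 21, 23, 24, 26, 25, 27]
scores = evaluate(series, test_size=2)
best_method = min(scores, key=scores.get)
```

With this upward-drifting series the naive method scores a lower error than the historical mean, so it would be the better choice of the two. The ranking can differ for other data.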
Example: A group of farmers in a drought-prone region wants to determine whether they will have enough water for irrigation in the upcoming growing season. They would analyze historical rainfall and reservoir level data from the past 20 years, and use forecasting models to predict water availability for the next six months to help make better decisions about crop selection, irrigation scheduling, and water conservation strategies.
On your own: How could you use forecasting to improve decision-making in ag business or environmental and natural resource management?
Time series data is a common structure of data
Certain techniques are designed to analyze time series data
Forecasting is a common need
This course introduces each of these (see Hyndman & Athanasopoulos, 2021)
Form groups of 2.
Context: Equipment dealers tend to sell more machinery when agricultural prices rise. Most farmers near Dealer X grow corn.
Decision: Dealer X needs to decide whether to increase inventory of new tractors and combines for next year.
Question: Will corn prices rise or fall in the next year, and by how much?
Hypotheses:
Approach:
Presentation:
Story: Corn prices have historically risen in stepwise jumps rather than gradual increases, with structural breaks around 1940, 1970, and 2006. Each break led to a new, higher average price level.
Result: If corn prices continue following this pattern, Dealer X should consider increasing equipment inventory to meet anticipated higher demand from farmers who will have greater revenues to invest in new machinery.
Your project must include:
Integration with the D3M Framework
Hyndman, R.J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice, 3rd edition. OTexts: Melbourne, Australia. https://otexts.com/fpp3/. Accessed on 02-14-2023.