This repository provides a structured approach to analyzing and forecasting Instagram reach from historical data. The goal is to identify patterns, trends, seasonality, and anomalies, and to build a model that forecasts future reach. The analysis and forecasting are implemented in Python with pandas, NumPy, Matplotlib, seaborn, and statsmodels.
The dataset, Instagram-Reach.csv, includes:
- Date: The date of the Instagram post.
- Instagram Reach: The number of people reached by the post on the corresponding date.
- data/: Contains the dataset.
- notebooks/: Jupyter notebooks with analysis and forecasting steps.
- scripts/: Python scripts for data analysis and modeling.
- results/: Output plots and forecast results.
- Import the dataset.
- Check for null values, column info, and descriptive statistics.
- Convert `Date` to datetime datatype.
- Set `Date` as the index.
- Plot a line chart to visualize Instagram reach trends over time.
- Plot a bar chart for daily Instagram reach.
- Create a box plot to visualize Instagram reach distribution.
- Create a `Day` column from the `Date` column.
- Group by day and calculate mean, median, and standard deviation of Instagram reach.
- Plot a bar chart for mean Instagram reach by day of the week.
- Decompose the time series to analyze trends and seasonal patterns.
- Use autocorrelation (ACF) and partial autocorrelation (PACF) plots to help choose the non-seasonal SARIMA orders (p, d, q).
- Train a SARIMA model and make predictions.
- Plot historical data and forecasted values.
Install required libraries using:

    pip install pandas numpy matplotlib seaborn statsmodels