Sales forecasting using a time series ML model called SARIMA
This code loads sales data from a CSV file, sets the date column as the index, and aggregates sales by day. It then holds out the last 30 days as a test set, fits a SARIMA model to the remaining training data, and forecasts those 30 days so the forecast can be compared with the actual values. Finally, it plots the actual and forecasted sales along with the forecast's confidence interval. Note that you may need to adjust the model parameters depending on your data: order is the non-seasonal (p, d, q) and seasonal_order is the seasonal (P, D, Q, s), where s=7 here assumes a weekly pattern in daily data.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX
# Load sales data
sales_data = pd.read_csv('sales_data.csv', parse_dates=['date'])
# Set date as index and aggregate sales by day
sales_data.set_index('date', inplace=True)
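# The value column is assumed to be named 'sales'; selecting it keeps the series univariate
sales_data = sales_data['sales'].resample('D').sum()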
# Split data into training and testing sets
train_data = sales_data.iloc[:-30]
test_data = sales_data.iloc[-30:]
# Fit SARIMA model to training data
model = SARIMAX(train_data, order=(1, 1, 1), seasonal_order=(0, 1, 1, 7))
results = model.fit()
# Forecast the 30 days covered by the test set
forecast = results.get_forecast(steps=30)
# Get the forecast mean and its 95% confidence interval
forecasted_sales = forecast.predicted_mean
conf_int = forecast.conf_int()
# Select the bounds by position so this does not depend on the column name
lower_ci = conf_int.iloc[:, 0]
upper_ci = conf_int.iloc[:, 1]
# Plot actual and forecasted sales
plt.plot(train_data.index, train_data, label='Training data')
plt.plot(test_data.index, test_data, label='Testing data')
plt.plot(forecasted_sales.index, forecasted_sales, label='Forecasted sales')
plt.fill_between(forecasted_sales.index, lower_ci, upper_ci, alpha=0.2, label='95% confidence interval')
plt.legend()
plt.show()
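The script holds out the last 30 days but only compares them with the forecast visually. A minimal sketch of putting a number on that comparison follows. The helper name forecast_errors and the choice of MAE and RMSE as metrics are assumptions made for illustration, not part of the original code or of statsmodels.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAE and RMSE between held-out values and forecasts.

    forecast_errors is a hypothetical helper, not a statsmodels function.
    """
    # ravel() lets this accept a Series, a one-column DataFrame, or a list
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same length")
    errors = actual - predicted
    return {
        "mae": float(np.mean(np.abs(errors))),
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
    }


# Toy numbers for illustration only; with the script above you would pass
# test_data and forecasted_sales instead.
example = forecast_errors([100, 120, 90], [110, 115, 95])
print(example)
```

Both metrics are in the same units as the sales figures. One possible use is to refit with a different order or seasonal_order and keep the setting that gives the lower error on the held-out days.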