Sales forecasting with SARIMA, a seasonal time series model

This code loads the sales data from a CSV file, sets the date as the index, and aggregates sales by day. It then holds out the last 30 days as a test set, fits a SARIMA model to the remaining training data, and forecasts those 30 days so the forecast can be compared with the actual values. Finally, it plots the actual and forecasted sales along with the forecast's confidence interval. The model parameters (order and seasonal_order) are only starting values, so you will likely need to adjust them for your data. The seasonal period of 7 in seasonal_order assumes a weekly pattern in the daily sales.


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX


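# Load sales data (expects a 'date' column and a 'sales' column;
# the 'sales' name is inferred from the confidence interval labels)
sales_data = pd.read_csv('sales_data.csv', parse_dates=['date'])

# Set date as index and aggregate sales by day.
# Only the 'sales' column is kept so the model receives a single series;
# days with no rows get a total of 0.
sales_data.set_index('date', inplace=True)
sales_data = sales_data[['sales']].resample('D').sum()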


# Split data into training and testing sets (last 30 days held out)
train_data = sales_data.iloc[:-30]
test_data = sales_data.iloc[-30:]

# Fit SARIMA model to training data (seasonal period 7 = weekly pattern)
model = SARIMAX(train_data, order=(1, 1, 1), seasonal_order=(0, 1, 1, 7))
results = model.fit()


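# Forecast the 30 held-out days
forecast = results.get_forecast(steps=30)

# Extract the point forecasts and the 95% confidence interval (the default).
# conf_int() returns the lower bound in the first column and the upper bound
# in the second, so select by position rather than by column name.
forecasted_sales = forecast.predicted_mean
conf_int = forecast.conf_int()
lower_ci = conf_int.iloc[:, 0]
upper_ci = conf_int.iloc[:, 1]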


# Plot actual and forecasted sales with the confidence interval
plt.plot(train_data.index, train_data, label='Training data')
plt.plot(test_data.index, test_data, label='Testing data')
plt.plot(forecasted_sales.index, forecasted_sales, label='Forecasted sales')
plt.fill_between(forecasted_sales.index, lower_ci, upper_ci, alpha=0.2, label='Confidence interval')
plt.legend()
plt.show()
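
The script holds out 30 days of test data but only compares them with the forecast visually. One way to put a number on the comparison is to compute the mean absolute error (MAE) and root mean squared error (RMSE) between the actual and forecasted values. The sketch below is an illustration, not part of the original script: the helper name forecast_errors is made up for this example, it uses only the standard library, and the numbers are toy values rather than real sales figures.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE and RMSE for two equal-length sequences of numbers.

    Hypothetical helper for illustration; it is not part of statsmodels.
    """
    actual = [float(a) for a in actual]
    predicted = [float(p) for p in predicted]
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")

    # Per-day forecast error: actual minus predicted
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Toy numbers for illustration only, not real sales figures
example = forecast_errors([100, 120, 130], [110, 115, 130])
print(example)
```

With the script above, and assuming the sales column is named 'sales', the same helper could be called as forecast_errors(test_data['sales'], forecasted_sales). Both metrics are in the same units as the sales figures. RMSE penalizes large misses more heavily than MAE, so a wide gap between the two suggests a few days with large errors.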

