Sales forecasting for a supermarket chain in Natal, Brazil: an empirical assessment
Time series forecasting; machine learning; retail.
Time series forecasting is a consolidated, widely used approach in several fields, including finance and industry. Retail can also benefit from forecasting in many areas, such as stock demand and price and sales optimization. This study addresses retail sales forecasting at Nordestão, a large Brazilian supermarket chain. Though located in a state with a low gross domestic product (GDP), Nordestão ranks 3rd in regional sales and 27th in national sales. The data considered here span five years of daily transactions from eight different stores. We adopt two machine learning techniques known to be effective for forecasting, namely random forests and XGBoost, and further improve their performance with feature engineering to address seasonal effects. The best algorithm varies by store, but at least one of the methods proves effective for most stores. Although we model transactions at a daily granularity, the best models achieve R² scores above 90% for 7-day forecasts. Beyond the traditional relevance of sales forecasting, our work gives Nordestão a means to evaluate the impact of the COVID-19 pandemic on sales.
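To make two ideas from the abstract concrete, the following minimal sketch shows one common way to derive calendar features that expose seasonal effects to a tree-based model, and how an R² score can be computed over a 7-day horizon. It is an illustration under stated assumptions, not the pipeline used in this study: the helper names `calendar_features` and `r2_score`, the chosen features, and all sales figures are hypothetical.

```python
import math
from datetime import date, timedelta


def calendar_features(day: date) -> dict:
    """Encode the seasonal position of a date (hypothetical feature set)."""
    dow = day.weekday()            # 0 = Monday ... 6 = Sunday
    doy = day.timetuple().tm_yday  # 1 ... 366
    return {
        "day_of_week": dow,
        "day_of_month": day.day,
        "month": day.month,
        "is_weekend": int(dow >= 5),
        # Cyclic encodings keep Sunday close to Monday and
        # late December close to early January.
        "dow_sin": math.sin(2 * math.pi * dow / 7),
        "dow_cos": math.cos(2 * math.pi * dow / 7),
        "doy_sin": math.sin(2 * math.pi * doy / 365.25),
        "doy_cos": math.cos(2 * math.pi * doy / 365.25),
    }


def r2_score(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumes `actual` is not constant, otherwise SS_tot is zero.
    """
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical 7-day horizon; the numbers below are invented for illustration.
start = date(2020, 3, 2)  # a Monday
horizon = [start + timedelta(days=i) for i in range(7)]
features = [calendar_features(d) for d in horizon]

actual = [100.0, 110.0, 105.0, 120.0, 150.0, 180.0, 90.0]
predicted = [102.0, 108.0, 107.0, 118.0, 149.0, 176.0, 95.0]
score = r2_score(actual, predicted)
```

In practice, feature rows like these would be joined with lagged sales and passed to a random forest or XGBoost regressor; the cyclic sine and cosine terms are one possible design choice, and plain integer calendar fields often suffice for tree-based models.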