Expanded Latent Space Autoencoder for COVID-19 Time Series Forecasting
Machine Learning, Artificial Neural Network, Auto-encoder, Pandemic, COVID-19
The global SARS-CoV-2 pandemic compelled governments, institutions, and researchers to assess its impact and to develop strategies based on general indicators, seeking the most accurate predictions possible to help managers mitigate its effects. Established epidemiological models were naturally used, but they often produced uncertain forecasts because of insufficient or missing data. Various machine-learning models, such as random forests, support vector regression, LSTM, and autoencoders, as well as traditional time-series models like Prophet and ARIMA, were also employed, yielding impressive yet somewhat limited results. Some of these methods struggle with precision when handling multi-variable inputs, which are crucial for problems such as pandemic time-series prediction that require both short- and long-term forecasting. In response to this challenge, we propose a novel approach to time-series prediction that uses a stacked autoencoder structure. Our model uses $n$ internal autoencoders to process the input, each generating a different latent space for that same input. These latent spaces are then concatenated to obtain the expanded latent space. We conducted an experiment using previously published data series on COVID-19 cases, deaths, temperature, humidity, and the air quality index (AQI) in São Paulo City, Brazil. This experiment assessed the suitability of our model for short-, medium-, and long-term forecasting. Furthermore, we directly compared our proposed model with two existing works from the literature that have already undergone expert scrutiny. The first comparison places our model among those that use one network for feature extraction and another for predicting pandemic trends. The second comparison highlights our model's effectiveness in multi-series forecasting of pandemic indicators.
The results suggest that our proposed model possesses strong capabilities in both feature extraction and multi-series forecasting, offering improvements over the two comparison works. Finally, the model demonstrates promising forecasting accuracy and versatility across datasets of varying lengths, making it a competitive option for time-series forecasting tasks.
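
To make the idea of an expanded latent space concrete, the following is a minimal sketch of the concatenation step only, not the model evaluated in this work. It assumes each internal encoder is a single dense layer with a tanh activation and random, untrained weights; the names `make_encoder` and `expanded_latent`, the sizes, and the input values are hypothetical and chosen purely for illustration. Decoders, training, and the forecasting stage are omitted.

```python
import math
import random


def make_encoder(input_dim, latent_dim, rng):
    """Return a toy encoder: one dense layer with tanh (assumed, untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
               for _ in range(latent_dim)]

    def encode(x):
        # Each latent unit is tanh of a weighted sum of the input features.
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return encode


def expanded_latent(x, encoders):
    """Concatenate the latent vectors that every encoder produces for x."""
    z = []
    for encode in encoders:
        z.extend(encode(x))
    return z


rng = random.Random(0)
n, input_dim, latent_dim = 3, 5, 2  # hypothetical sizes
encoders = [make_encoder(input_dim, latent_dim, rng) for _ in range(n)]

x = [0.1, 0.4, 0.3, 0.7, 0.2]  # placeholder values, not real data
z = expanded_latent(x, encoders)
print(len(z))  # n * latent_dim = 6
```

Under these assumptions, the same input yields $n$ latent vectors whose concatenation has dimension $n$ times the individual latent size, which is the representation a downstream forecaster would consume.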