PandemAI: A Machine Learning-based Framework for Pandemic Viral Disease Symptom Dynamics Analysis
COVID-19, symptoms, machine learning, framework, t-SNE, k-means, Apriori, XGBoost, XAI
Historically, pandemics have taken various forms, each affecting human societies around the world in different ways. Some are particularly remembered for their high mortality rates, wide geographic reach, or prolonged duration. Pandemics caused by viruses pose an additional difficulty: viruses can mutate rapidly, generating different lineages and variants. When the same disease can manifest in multiple forms, correctly identifying it becomes a complex and dynamic challenge.
This challenge is particularly critical because symptoms vary between strains over time, which affects disease characterization, and because symptomatology often remains the primary basis for diagnosis. In this context, Machine Learning (ML) algorithms have emerged over the past few decades as powerful analytical tools. By identifying patterns in complex datasets, ML techniques can help recognize a disease across its diverse manifestations, thereby contributing to more accurate diagnoses.
Thus, this study proposes PandemAI, a data-driven framework designed to analyze how symptom variations driven by the evolution of viral variants affect disease recognition over time. The framework comprises three phases: exploration of symptom patterns, symptom rule mining, and symptom-based diagnostic prediction. To validate the proposed approach, we employ data from the Brazilian Severe Acute Respiratory Syndrome (SARS) surveillance system.
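To make the second phase (symptom rule mining) more concrete, the following is a minimal sketch of support/confidence rule mining restricted to symptom pairs. It is not the framework's implementation: it is a brute-force pairwise simplification of Apriori-style mining, and the function name `mine_pair_rules`, the thresholds, and the toy records are all assumptions invented for illustration rather than surveillance data.

```python
from collections import Counter
from itertools import combinations

# Toy symptom records, invented purely for illustration (NOT real surveillance data).
records = [
    {"fever", "cough", "dyspnea"},
    {"fever", "cough"},
    {"cough", "sore_throat"},
    {"fever", "cough", "loss_of_smell"},
    {"fever", "dyspnea"},
]


def mine_pair_rules(records, min_support=0.4, min_confidence=0.7):
    """Return rules (antecedent, consequent, support, confidence) over symptom pairs.

    Hypothetical helper: counts every co-occurring pair directly instead of
    performing the level-wise candidate pruning of the full Apriori algorithm.
    """
    n = len(records)
    # How many records contain each single symptom.
    item_counts = Counter(symptom for record in records for symptom in record)
    # How many records contain each unordered symptom pair.
    pair_counts = Counter(
        pair for record in records for pair in combinations(sorted(record), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


rules = mine_pair_rules(records)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Running the same function on records grouped by time window or by dominant variant would be one way to observe how symptom rules shift over time, which is the kind of comparison the framework targets.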