A Framework for Large Data Analysis by REST Service
Big Data, Spark, Framework, Data Analysis, Time Series, Public Health
Given the increasing amount of electronic data related to public health that is recorded daily, there is a growing trend toward using data analysis to drive changes in health systems that reduce costs and improve service delivery to the population. The term Big Data in Health has been widely used in studies that seek to identify and evaluate the possible benefits of Big Data, among them the reduction of waste and inefficiency in public health by analyzing disease patterns and tracking disease outbreaks to improve surveillance and speed response, and the development of more targeted vaccines. However, most studies involving Big Data and Public Health are based on theoretical and qualitative analyses, and they report limitations and challenges such as validating the proposed suggestions and training people to use Big Data technologies, which differ from traditional technologies. In this context, the present work aims to implement a framework for the analysis of large volumes of data that allows the use of traditional methods available in the R language, which is widely used by statisticians to analyze data in Epidemiology. To enable interoperability between systems, a web Application Programming Interface (API) will be developed to expose the framework's functionality through Representational State Transfer (REST). As a case study, the framework's functionality will be applied to the manipulation of time series.