Proposal of an architecture for multilevel (pseudo)anonymization of healthcare data
anonymization, privacy, pseudonymization, LGPD
In recent decades, technological advances have enabled the intensive collection, processing, and storage of personal data. There is ample evidence, mainly from public revelations, of the data practices and data breaches of large companies whose most significant asset is data, such as Facebook, Google, Amazon, and Uber. As a result, there is growing concern about the use of these data, evidenced by the profusion of laws worldwide that aim to protect individuals' privacy. These laws point to the need to implement processes and techniques that guarantee data privacy, among them the (pseudo)anonymization of data. In this context, and seeking to contribute to the protection of privacy, this work proposes an architecture for the multilevel (pseudo)anonymization of health data. The architecture is multilevel because data are pseudonymized at two levels, one local and one global, so that data from multiple providers can be linked while remaining (pseudo)anonymized. The health area was chosen as the focus because it is a challenging application, given the sensitivity of the data.
The proposed architecture was implemented as a proof of concept and evaluated through a set of tests. The test results suggest that the architecture enables correct anonymization at the source and secure linking of (pseudo)anonymized data across multiple sources, and that it even allows reidentification in cases involving the security of the individuals concerned.
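To make the two-level idea concrete, the following is a minimal sketch and not the architecture evaluated in this work. It assumes keyed hashing (HMAC-SHA256) as the pseudonymization primitive, a linkage key shared by all providers, and a trusted global registry. The names Provider, GlobalRegistry, keyed_hash, and all keys and identifiers are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, value: str) -> str:
    """Return the hex HMAC-SHA256 of value under key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class Provider:
    """Hypothetical data source that pseudonymizes identifiers locally."""

    def __init__(self, name: str, local_key: bytes, link_key: bytes) -> None:
        self.name = name
        self._local_key = local_key  # known only to this provider
        self._link_key = link_key    # assumption: shared by all providers
        self._reverse = {}           # local pseudonym -> real identifier

    def pseudonymize(self, patient_id: str) -> tuple:
        """Return (local pseudonym, linkage token) for a patient."""
        local = keyed_hash(self._local_key, patient_id)
        link = keyed_hash(self._link_key, patient_id)
        self._reverse[local] = patient_id
        return local, link

    def reidentify(self, local: str) -> str:
        """Recover the real identifier; only the source can do this."""
        return self._reverse[local]


class GlobalRegistry:
    """Hypothetical global level: links records without seeing identifiers."""

    def __init__(self, global_key: bytes) -> None:
        self._global_key = global_key
        self._origins = {}  # global pseudonym -> [(provider name, local pseudonym)]

    def register(self, provider_name: str, local: str, link: str) -> str:
        """Derive the global pseudonym and remember where it came from."""
        global_id = keyed_hash(self._global_key, link)
        self._origins.setdefault(global_id, []).append((provider_name, local))
        return global_id

    def origins(self, global_id: str) -> list:
        """List the (provider, local pseudonym) pairs behind a global pseudonym."""
        return list(self._origins[global_id])


# Usage: the same patient seen by two providers.
shared_link_key = b"example-link-key"
hospital = Provider("hospital", b"example-key-1", shared_link_key)
clinic = Provider("clinic", b"example-key-2", shared_link_key)
registry = GlobalRegistry(b"example-global-key")

h_local, h_link = hospital.pseudonymize("patient-001")
c_local, c_link = clinic.pseudonymize("patient-001")
h_global = registry.register(hospital.name, h_local, h_link)
c_global = registry.register(clinic.name, c_local, c_link)
```

In this sketch the local pseudonyms differ between providers, the global pseudonyms coincide so that records can be linked, and reidentification requires going back through the registry to the originating provider. A real deployment would also need key management and access control around reidentification, which the sketch does not attempt.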