Volumetric Behavior of Aqueous Electrolyte Solutions: Machine Learning Model Development Based on Molecular Dynamics Simulation
Extensive density database; Molecular dynamics in the NPT ensemble; Group contribution methods; Bayesian hyperparameter optimization.
Accurate prediction of the density of aqueous electrolyte solutions is crucial to understanding their behavior in industrial processes and environmental contexts, where these solutions play pivotal roles in applications ranging from chemical synthesis to biological systems. To address the challenges associated with density prediction, sophisticated models and theories have been developed. Molecular dynamics (MD) simulations, particularly in the isothermal-isobaric (NPT) ensemble, offer a powerful tool for studying these systems at the molecular level. Using NPT MD simulations, this study describes the density behavior of 32 binary aqueous electrolyte solutions over wide temperature (T = 278.15 to 368.15 K) and pressure (P = 0.1 to 100 MPa) ranges, for concentrations from infinite dilution up to concentrated solutions, generating a dataset of XXXX data points. Uncertainties were estimated for all data points using the bootstrap technique and were consistent with experimental uncertainties. The MD results were systematically validated against experimental data and semi-empirical models (AAD < 19 kg·m⁻³ and Δρ < 2.0% for all electrolyte species), demonstrating the accuracy of the simulations. The resulting density database was used to train machine learning (ML) models. Classical algorithms (e.g., decision tree and k-nearest neighbors, KNN) as well as ensemble techniques (e.g., gradient boosting and extra trees) were evaluated for density prediction. For all models, hyperparameter sweeps were conducted following MLOps best practices on the Weights & Biases platform. On a test set consisting of 10% of the original database, the models predicted the density accurately (AAD < 19 kg·m⁻³ and Δρ < 2.0% for all electrolyte species). Furthermore, the predictions were compared with an experimental database from the literature (XXX experimental points).
The resulting deviations (AAD < 19 kg·m⁻³ and Δρ < 2.0% for all electrolyte species) corroborate the accuracy of the developed models. Finally, the ML models for aqueous electrolyte solutions presented in this work contribute to the design and optimization of chemical engineering processes.
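The deviation metrics and the bootstrap uncertainty mentioned above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: it assumes that AAD is the mean absolute deviation in density units, that Δρ is the mean absolute relative deviation in percent, and that the bootstrap is applied to roughly uncorrelated density samples (for example, block averages of an NPT trajectory). The function names and all numerical values are hypothetical.

```python
import numpy as np


def aad(rho_pred, rho_ref):
    """Average absolute deviation, in the units of the inputs (assumed definition)."""
    rho_pred = np.asarray(rho_pred, dtype=float)
    rho_ref = np.asarray(rho_ref, dtype=float)
    return float(np.mean(np.abs(rho_pred - rho_ref)))


def relative_deviation_percent(rho_pred, rho_ref):
    """Mean absolute relative deviation in percent (assumed definition of Δρ)."""
    rho_pred = np.asarray(rho_pred, dtype=float)
    rho_ref = np.asarray(rho_ref, dtype=float)
    return float(100.0 * np.mean(np.abs(rho_pred - rho_ref) / rho_ref))


def bootstrap_mean_uncertainty(samples, n_resamples=1000, seed=0):
    """Return the sample mean and a bootstrap estimate of its standard error.

    Assumes the samples are approximately uncorrelated, e.g. block averages.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    # Draw n_resamples index sets with replacement, each the size of the data.
    idx = rng.integers(0, samples.size, size=(n_resamples, samples.size))
    resampled_means = samples[idx].mean(axis=1)
    return float(samples.mean()), float(resampled_means.std(ddof=1))


# Illustrative densities in kg·m^-3 (made-up values, not data from this study).
rho_ref = np.array([1000.0, 1050.0, 1100.0])
rho_md = np.array([1005.0, 1040.0, 1112.0])

print("AAD:", aad(rho_md, rho_ref))
print("Relative deviation (%):", relative_deviation_percent(rho_md, rho_ref))

# Hypothetical block-averaged densities from a single state point.
blocks = np.array([1000.0, 1001.0, 999.0, 1002.0, 998.0])
mean_rho, err_rho = bootstrap_mean_uncertainty(blocks)
print("Mean density and bootstrap standard error:", mean_rho, err_rho)
```

If consecutive MD frames are used directly instead of block averages, their time correlation would make this simple resampling underestimate the uncertainty.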