3D-QSARpy: Combining Variable Selection Strategies and Various Machine Learning
Techniques to Build QSAR Models
QSAR Models, QSAR 3D, Chemoinformatics, Structural Bioinformatics, Biological Activity, Prediction, Regression, Feature Selection.
Quantitative Structure-Activity Relationship (QSAR) modeling is a technique in medicinal chemistry that seeks to clarify the relationship between molecular structures and their biological activities. QSAR models are built from structural data (2D, 3D or 4D) for a series of molecules already tested for a given activity. Predictions made by these models are used to identify which modifications to a molecule can influence the biological response, whether by strengthening or weakening it. This accelerates the development of new compounds and reduces the cost of drug design. In this context, the present work proposes a methodology and tests it on several data sets through the development of a 3D-QSAR tool named 3D-QSARpy. The methodology was validated by applying the tool to two data sets, and in both cases the results outperformed those previously published. The first set, related to diabetes treatment, reached $r^2_{pred} = 0.91$. The second set, related to cancer treatment, reached $r^2_{pred} = 0.98$. Finally, the tool was used in two applications that contributed to the identification of new bioactive molecular structures using different approaches. The first application, aimed at the treatment of Chagas disease, involved building hybrid QSAR models for three series, which obtained $r^2_{pred}$ = 0.8, 0.68 and 0.85. The second application was the construction of a 4D-QSAR model for tuberculosis treatment, with $r^2_{pred} = 0.72$. Both the validation experiments and the applications demonstrated not only the efficiency of the proposed methodology and the developed tool, but also the versatility of the methodology, which can be applied either by following its general pipeline or partially, in combination with other existing tools.
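The predictive r² values reported above are not defined in this section. As a minimal sketch, assuming the definition commonly used for external validation in QSAR (one minus the ratio of the squared prediction errors on the test set to the squared deviations of the test-set activities from the training-set mean), the metric could be computed as follows. The function name `r2_pred` and the example values are hypothetical and are not taken from 3D-QSARpy.

```python
from statistics import fmean


def r2_pred(y_train, y_test, y_test_pred):
    """Predictive r^2 for an external test set (assumed definition).

    r2_pred = 1 - PRESS / SD, where
      PRESS = sum of squared errors between observed and predicted test activities
      SD    = sum of squared deviations of observed test activities
              from the mean activity of the TRAINING set
    """
    if len(y_test) != len(y_test_pred):
        raise ValueError("y_test and y_test_pred must have the same length")
    if not y_train or not y_test:
        raise ValueError("training and test sets must be non-empty")

    train_mean = fmean(y_train)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_test_pred))
    sd = sum((obs - train_mean) ** 2 for obs in y_test)
    if sd == 0:
        raise ValueError("test activities all equal the training mean; r2_pred is undefined")
    return 1.0 - press / sd


# Illustrative, made-up activity values (not data from the paper).
y_train = [1.0, 2.0, 3.0]
y_test = [1.0, 3.0]
y_test_pred = [1.5, 2.5]
print(r2_pred(y_train, y_test, y_test_pred))  # 0.75
```

Under this definition, a value close to 1 means the model predicts the held-out molecules much better than simply guessing the training-set mean activity, while a value near or below 0 means it does no better than that baseline.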