AI-Based Whole Slide Image Analysis for Automated Breast Cancer Classification
Keywords: breast cancer, machine learning, computer vision, digital pathology, histopathology
Breast cancer is a major public health challenge worldwide and remains among the most prevalent and lethal cancers. Histopathological analysis is essential for diagnosis and staging, but high demand for examinations combined with a shortage of pathologists overloads healthcare services and slows diagnosis. In this context, digital pathology and the computational analysis of Whole Slide Images (WSIs) have emerged as promising aids to clinical decision making, although they pose challenges related to large data volumes and high computational requirements. This study proposes a framework for classifying breast tissue WSIs. In the proposed method, each high-resolution image is divided into tiles, and discriminative features are extracted from each tile using transfer learning. These tile-level features are then combined through a novel tile-to-slide aggregation strategy, introduced in this work, which yields a single representative feature vector per WSI. The dimensionality of this vector can be varied, which makes it possible to assess how the degree of compression affects its capacity to preserve the discriminative information needed to distinguish tumor from non-tumor tissue. Feature vectors obtained at different compression levels were used to train classical machine learning algorithms, such as Support Vector Machines and Logistic Regression, to evaluate whether reduced representations maintain adequate performance. The results show that the extracted and compressed features retain high discriminative power, enabling accurate classification of WSIs containing tumor tissue and supporting the potential of the proposed approach as a triage support tool in digital pathology.
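
The overall shape of such a pipeline (tile features, slide-level aggregation, compression to several dimensionalities, classical classifier) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not the method proposed in this work: the tile features are synthetic random vectors rather than transfer-learning embeddings, mean pooling stands in for the tile-to-slide aggregation strategy, PCA stands in for the compression step, and a small gradient-descent logistic regression stands in for the classifiers. All function names are hypothetical.

```python
import numpy as np

def aggregate_tiles(tile_features):
    """Mean-pool an (n_tiles, d) array into one d-dimensional slide vector.
    Assumption: a simple stand-in for the paper's aggregation strategy."""
    return tile_features.mean(axis=0)

def fit_pca(X, k):
    """Return the training mean and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def compress(X, mean, components):
    """Project slide vectors onto the k retained principal directions."""
    return (X - mean) @ components.T

def fit_logreg(X, y, lr=0.1, epochs=500):
    """Binary logistic regression trained with batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        err = p - y
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def make_synthetic_slides(n_slides=40, d=16, seed=0):
    """Synthetic data: each slide has 20 to 40 tiles of d random features;
    tiles of label-1 ("tumor") slides are shifted by a constant offset."""
    rng = np.random.default_rng(seed)
    labels = np.arange(n_slides) % 2
    slides = []
    for label in labels:
        n_tiles = rng.integers(20, 41)
        tiles = rng.normal(size=(n_tiles, d)) + 1.5 * label
        slides.append(aggregate_tiles(tiles))
    return np.stack(slides), labels

def evaluate_compression_levels(levels=(2, 4, 8), n_train=28):
    """Train and score the classifier at each compressed dimensionality."""
    X, y = make_synthetic_slides()
    X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
    results = {}
    for k in levels:
        mean, comps = fit_pca(X_tr, k)  # fit compression on training slides only
        w, b = fit_logreg(compress(X_tr, mean, comps), y_tr)
        pred = (compress(X_te, mean, comps) @ w + b > 0).astype(int)
        results[k] = float((pred == y_te).mean())
    return results

results = evaluate_compression_levels()
for k, acc in results.items():
    print(f"k={k}: held-out accuracy {acc:.2f}")
```

Fitting the compression on the training slides only, and then applying it unchanged to the held-out slides, keeps the comparison across dimensionalities free of information leakage. The accuracies printed by this sketch describe the synthetic data only and say nothing about the results reported in this study.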