This study proposes a novel method for extracting breast cancer tumor heterogeneity descriptors to non-invasively predict whether pathological complete response (pCR) will be achieved after neoadjuvant chemotherapy (NAC). The descriptors are localized: each one derives a heterogeneity feature from a specific radiomic feature, and together they capture tumor characteristics at multiple localization levels. They also describe heterogeneity both within an individual tumor and across the whole dataset, giving decision-making models features that are both more effective and more interpretable. We validated the proposed features with a Kolmogorov-Arnold network (KAN) across multiple centers, obtaining an AUC of 0.92 when combined with pathological features and good performance on external datasets (AUCs of 0.84 and 0.81). Additionally, we transformed the best model into a symbolic formula that makes the model's prediction process explicit, showing how factors such as age, HER2, Ki-67, and heterogeneity influence the prediction. The symbolized model is consistent with the experience of clinical experts, which can strengthen users' confidence in deep models. The experimental results show that the proposed features and method outperform classical heterogeneity features and end-to-end neural networks at a small additional computational cost.
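
To make the idea of a localized heterogeneity descriptor concrete, here is a minimal sketch. It is not the authors' implementation, which has not been released. It assumes a 2D image and a binary tumor mask stored as nested lists, uses mean intensity as a stand-in for a radiomic feature, and measures heterogeneity as the spread of that feature across non-overlapping windows. The names `local_heterogeneity` and `multiscale_descriptor`, and the window sizes, are hypothetical.

```python
from statistics import mean, pstdev


def local_heterogeneity(image, mask, window):
    """Spread of a per-window feature (assumed here: mean intensity) inside the mask."""
    rows, cols = len(image), len(image[0])
    window_means = []
    # Tile the image with non-overlapping window x window patches.
    for r0 in range(0, rows, window):
        for c0 in range(0, cols, window):
            values = [
                image[r][c]
                for r in range(r0, min(r0 + window, rows))
                for c in range(c0, min(c0 + window, cols))
                if mask[r][c]  # keep only pixels inside the tumor mask
            ]
            if values:
                window_means.append(mean(values))
    # A single window carries no information about spread.
    return pstdev(window_means) if len(window_means) > 1 else 0.0


def multiscale_descriptor(image, mask, windows=(1, 2, 4)):
    """One heterogeneity value per localization level (window size); sizes are assumed."""
    return {w: local_heterogeneity(image, mask, w) for w in windows}


if __name__ == "__main__":
    # Toy "tumor": left half dark, right half bright, fully inside the mask.
    image = [[1, 1, 5, 5] for _ in range(4)]
    mask = [[1, 1, 1, 1] for _ in range(4)]
    print(multiscale_descriptor(image, mask))
```

In this toy case the smaller windows register the contrast between the two halves, while the 4x4 window covers the whole region and reports no spread. This illustrates why a descriptor computed at several localization levels can carry more information than a single global statistic.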

Paper

Code (Coming Soon)