Objective To analyze radiomic and clinical features extracted from 2D ultrasound images of thyroid tumors in patients with Hashimoto's thyroiditis (HT) combined with papillary thyroid carcinoma (PTC) using machine learning (ML) models, and to evaluate the diagnostic performance of this approach for the preoperative, noninvasive identification of cervical lymph node metastasis (LNM).
Methods A total of 528 patients with HT combined with PTC were enrolled and divided into two groups, the With LNM Group and the Without LNM Group, according to whether pathology confirmed LNM. Three ultrasound physicians independently delineated the regions of interest and extracted radiomic features. Two feature modes, radiomic features alone and combined radiomics-clinical features, were used to construct random forest (RF), support vector machine (SVM), LightGBM, K-nearest neighbor (KNN), and XGBoost models. The performance of the five ML models in each mode was evaluated using receiver operating characteristic (ROC) curves on the test dataset, and SHapley Additive exPlanations (SHAP) was used for model visualization.
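As an illustration of the evaluation step, the area under the ROC curve can be computed from predicted scores with the rank-sum (Mann-Whitney) identity. The sketch below is a minimal, dependency-free example and not the study's code; the function name `roc_auc` and the toy labels and scores for the two modes are hypothetical and do not reproduce any reported result.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical test-set labels (1 = with LNM, 0 = without LNM) and
# hypothetical predicted probabilities from one model in each feature mode.
toy_labels = [0, 0, 1, 1]
toy_scores_by_mode = {
    "radiomics": [0.1, 0.4, 0.35, 0.8],
    "radiomics-clinical": [0.1, 0.2, 0.8, 0.9],
}

for mode, scores in toy_scores_by_mode.items():
    print(mode, roc_auc(toy_labels, scores))
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used; the hand-written version only makes the underlying calculation explicit.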
Results All five ML models showed good performance, with area under the ROC curve (AUC) ranging from 0.798 to 0.921. LightGBM and XGBoost performed best, significantly outperforming the other three models (P<0.05). The ML models constructed with radiomics-clinical features performed better than those constructed using only radiomic features (P<0.05). SHAP visualization of the two best-performing models indicated that, for the LightGBM model, the most influential features were the anteroposterior diameter, superoinferior diameter, original_shape_VoxelVolume, age, wavelet-LHL_firstorder_10Percentile, and left-to-right diameter. For the XGBoost model, the most influential features were the superoinferior diameter, anteroposterior diameter, left-to-right diameter, original_shape_VoxelVolume, original_firstorder_InterquartileRange, and age.
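To make the idea behind a SHAP feature ranking concrete, the sketch below computes exact Shapley values for a toy model by enumerating feature subsets and then ranks features by mean absolute attribution. It is an assumption-laden illustration, not the study's pipeline: the SHAP library relies on much faster model-specific algorithms, and the model `toy_predict`, its weights, the all-zero baseline, and the feature names are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def rank_features(predict, samples, baseline, names):
    """Rank features by mean absolute Shapley value across samples, largest first."""
    totals = [0.0] * len(names)
    for x in samples:
        for i, value in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(value)
    mean_abs = [t / len(samples) for t in totals]
    return sorted(zip(names, mean_abs), key=lambda pair: pair[1], reverse=True)


def toy_predict(x):
    # Hypothetical linear score with made-up weights (not fitted to any data).
    return 0.5 * x[0] + 0.2 * x[1] - 0.1 * x[2]


toy_names = ["feature_a", "feature_b", "feature_c"]  # placeholder names
toy_samples = [[2.0, 1.0, 3.0], [1.0, 4.0, 1.0]]
toy_baseline = [0.0, 0.0, 0.0]

for name, importance in rank_features(toy_predict, toy_samples, toy_baseline, toy_names):
    print(name, round(importance, 3))
```

The enumeration grows exponentially with the number of features, so it is only practical for a handful of inputs; it is shown here solely to clarify what the ranking in a SHAP summary plot measures.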
Conclusion ML models based on radiomic and clinical features can accurately evaluate cervical lymph node status in patients with HT combined with PTC. Among the five ML models, LightGBM and XGBoost demonstrate the best evaluation performance.