Abstract:
Objective Fully automatic segmentation of glioma and its subregions is fundamental to computer-aided clinical diagnosis of tumors. In the segmentation of brain magnetic resonance imaging (MRI), convolutional neural networks with small convolutional kernels capture only local features and integrate global features poorly; the resulting narrow receptive field leads to insufficient segmentation accuracy. This study aims to use dilated convolution to address the problem of inadequate global feature extraction in 3D-UNet.
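To illustrate why dilated convolution widens the receptive field, the following minimal sketch computes the per-axis receptive field of a stack of stride-1 convolutions. The function name `receptive_field` and the dilation rates (1, 2, 4) are assumptions chosen for illustration; the abstract does not state the rates used in the GE module.

```python
def receptive_field(kernel_size, dilations):
    """Per-axis receptive field of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the receptive
    field by (k - 1) * d voxels along each axis.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Three ordinary 3x3x3 convolutions (dilation 1 everywhere).
plain = receptive_field(3, [1, 1, 1])

# Same depth and parameter count, but with hypothetical dilations 1, 2, 4.
dilated = receptive_field(3, [1, 2, 4])

print(plain, dilated)
```

With the same number of layers and weights, the dilated stack covers a wider span per axis than the plain stack, which is the general motivation for using dilation to gather more global context.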
Methods 1) Algorithm construction: A 3D-UNet model with three pathways for more global contextual feature extraction, named 3DGE-UNet, was proposed. Using the publicly available dataset of the 2019 Brain Tumor Segmentation Challenge (BraTS 2019; 335 patient cases), a global contextual feature extraction (GE) module was designed and integrated at the first, second, and third skip connections of the 3D-UNet network to fully extract global features at different scales from the images. The extracted global features were then overlaid with the upsampled feature maps to expand the model's receptive field and achieve deep fusion of features at different scales, thereby enabling end-to-end automatic segmentation of brain tumors. 2) Algorithm validation: The image data were sourced from the BraTS 2019 dataset, which included the preoperative MRI images of 335 patients in four modalities (T1, T1ce, T2, and FLAIR) and a tumor image annotated by physicians. The dataset was divided into training, validation, and testing sets at an 8∶1∶1 ratio. The physician-labelled tumor images were used as the gold standard. The algorithm's segmentation performance on the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) was then evaluated on the test set using the Dice coefficient (overall effectiveness), sensitivity (detection rate of lesion areas), and the 95% Hausdorff distance (segmentation accuracy at tumor boundaries). For internal validation of the GE module, performance was compared between the 3D-UNet model without the GE module and the 3DGE-UNet model with it. For external validation of the 3DGE-UNet model, the same performance indicators were evaluated for 3DGE-UNet, ResUNet, UNet++, nnUNet, and UNETR, and the convergence of these five models was compared.
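As a rough illustration of two of the evaluation metrics named above, the sketch below computes the Dice coefficient and sensitivity for one binary region (for example WT) from flattened voxel labels. The function name `dice_and_sensitivity`, the flat 0/1 input format, and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details taken from the study.

```python
def dice_and_sensitivity(pred, truth):
    """Dice coefficient and sensitivity for two binary masks.

    pred, truth: equal-length flat sequences of 0/1 voxel labels
    (a 3D volume would be flattened first).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)  # false negatives

    # Dice = 2*TP / (2*TP + FP + FN); treat two empty masks as a perfect match.
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0

    # Sensitivity = TP / (TP + FN): the share of true lesion voxels detected.
    positives = tp + fn
    sensitivity = tp / positives if positives else 1.0

    return dice, sensitivity


# Toy example: 4 lesion voxels, 3 detected, plus 1 false alarm.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(dice_and_sensitivity(pred, truth))
```

The 95% Hausdorff distance is not covered here because it requires boundary extraction and voxel spacing, which this toy format does not carry.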
Results 1) In the internal validation, the 3DGE-UNet model achieved mean Dice values of 91.47%, 87.14%, and 83.35% for segmenting the WT, TC, and ET regions in the test set, respectively, the highest values in the overall effectiveness evaluation. These scores were superior to the corresponding scores of the traditional 3D-UNet model (89.79%, 85.13%, and 80.90%), indicating a significant improvement in segmentation accuracy across all three regions (P<0.05). Compared with the 3D-UNet model, the 3DGE-UNet model showed higher sensitivity for ET (86.46% vs. 80.77%, P<0.05), indicating better detection of lesion areas. The 3DGE-UNet model tended to identify and capture positive areas more comprehensively, thereby reducing the likelihood of missed diagnoses. The 3DGE-UNet model also performed well in segmenting the edges of WT, with a mean 95% Hausdorff distance superior to that of the 3D-UNet model (8.17 mm vs. 13.61 mm, P<0.05). However, its performance for TC (8.73 mm vs. 7.47 mm) and ET (6.21 mm vs. 5.45 mm) was similar to that of the 3D-UNet model. 2) In the external validation, the other four algorithms outperformed the 3DGE-UNet model only in the mean Dice for TC (87.25%), the mean sensitivity for WT (94.59%), the mean sensitivity for TC (86.98%), and the mean 95% Hausdorff distance for ET (5.37 mm), and these differences were not statistically significant (P>0.05). The 3DGE-UNet model converged rapidly during the training phase, faster than the other four models.
Conclusion The 3DGE-UNet model can effectively extract and fuse feature information at different scales, improving the accuracy of brain tumor segmentation.