Abstract:
Objective Gallstone disease (GSD) is a common digestive tract disease with a high worldwide prevalence. Its effects on patients are not limited to the nausea, vomiting, and biliary colic that it causes directly: mounting evidence from cohort studies connects GSD to other conditions, such as cardiovascular diseases, biliary tract cancer, and colorectal cancer. Early identification of patients at high risk of GSD may help improve the prevention and control of the disease. A series of studies have attempted to establish prediction models for GSD, but these models could not be fully applied in the general population because of incomplete prediction factors, small sample sizes, and limited external validation. It is therefore crucial to design a GSD risk prediction model that is applicable to the general population and to take individualized intervention measures to prevent the occurrence of GSD. This study aims to conduct a multicenter investigation involving more than 90000 people to construct and validate both a complete and a simplified GSD risk prediction model.
Methods A total of 123634 participants were included in the study between January 2015 and December 2020, of whom 43929 were from the First Affiliated Hospital of Chongqing Medical University (Chongqing, China), 11907 were from the First People’s Hospital of Jining City (Shandong, China), 1538 were from the Tianjin Medical University Cancer Institute and Hospital (Tianjin, China), and 66260 were from the People’s Hospital of Kaizhou District (Chongqing, China). After excluding patients with incomplete clinical medical data, 35976 patients from the First Affiliated Hospital of Chongqing Medical University were divided into a training data set (n=28781, 80%) and a validation data set (n=7195, 20%). Logistic regression analyses were performed to investigate the relevant risk factors of GSD, and a complete risk prediction model was constructed. To simplify the model, the factors with high scores, judged mainly from the nomograms of the complete model, were retained. In the validation data set, the diagnostic accuracy and clinical performance of both models were assessed using the calibration curve, the area under the curve (AUC) of the receiver operating characteristic curve, and decision curve analysis (DCA). Moreover, the diagnostic accuracy of the two models was externally validated in the three other hospitals. Finally, we made the prediction models available online (the complete model is accessible at https://wenqianyu.shinyapps.io/Completemodel/, while the simplified model is accessible at https://wenqianyu.shinyapps.io/Simplified/).
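As an illustration of the 80%/20% split and the AUC used for validation, the following is a minimal Python sketch and not the authors' code. The function names, the random shuffle, the fixed seed, and the rounding rule for the split point are all assumptions made for this example; the abstract does not state how the split was performed. The AUC is computed here from its rank interpretation (the probability that a randomly chosen GSD case receives a higher predicted risk than a randomly chosen non-case), which is suitable only for small illustrative inputs.

```python
import random


def split_train_validation(records, train_fraction=0.8, seed=0):
    """Shuffle records and split them into training and validation lists.

    Assumption: a seeded random shuffle with the split point rounded to the
    nearest integer; the study's actual procedure is not described.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise version is quadratic
    in the number of records and is meant only as a readable sketch.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    correct = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return correct / (len(cases) * len(non_cases))


# Hypothetical usage: 35976 record identifiers stand in for the patients.
train, validation = split_train_validation(range(35976))
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With 35976 records and this rounding rule, the sketch yields subsets of the same sizes as those reported for the training and validation data sets.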
Results After excluding patients with incomplete clinical medical data, a total of 96426 participants were finally included in this study (35976 from the First Affiliated Hospital of Chongqing Medical University, 9289 from the First People’s Hospital of Jining City, 1522 from the Tianjin Medical University Cancer Institute and Hospital, and 49639 from the People’s Hospital of Kaizhou District). Female sex, advanced age, higher body mass index, fasting plasma glucose, uric acid, total bilirubin, gamma-glutamyl transpeptidase, and fatty liver disease were positively associated with the risk of GSD. In contrast, gallbladder polyps, total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and aspartate aminotransferase were negatively associated with the risk of GSD. According to the nomograms of the complete model, a simplified model including sex, age, body mass index, gallbladder polyps, and fatty liver disease was constructed. All the calibration curves exhibited good consistency between the predicted and observed probabilities. In addition, DCA indicated that both the complete model and the simplified model showed better net benefits than the treat-all and treat-none strategies. Based on the calibration plots, DCA, and AUCs of the complete model (internal validation data set: AUC=74.1%, 95% CI: 72.9%-75.3%; Shandong: AUC=71.7%, 95% CI: 70.6%-72.8%; Tianjin: AUC=75.3%, 95% CI: 72.7%-77.9%; Kaizhou: AUC=72.9%, 95% CI: 72.5%-73.3%) and the simplified model (internal validation data set: AUC=73.7%, 95% CI: 72.5%-75.0%; Shandong: AUC=71.5%, 95% CI: 70.4%-72.5%; Tianjin: AUC=75.4%, 95% CI: 72.9%-78.0%; Kaizhou: AUC=72.4%, 95% CI: 72.0%-72.8%), we concluded that the complete and simplified risk prediction models for GSD exhibited excellent performance. Moreover, we detected no significant differences between the performance of the two models (P>0.05). We also established two websites based on the results of this study for GSD risk prediction.
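The comparison against treat-all and treat-none in DCA rests on the net benefit at a chosen threshold probability. The Python sketch below shows the standard net benefit formula on made-up data; the function names, the toy labels and probabilities, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of intervening on everyone whose predicted risk >= threshold.

    Net benefit = TP/n - (FP/n) * threshold / (1 - threshold), where the
    threshold must lie strictly between 0 and 1.
    """
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening on every participant regardless of risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Treat-none has a net benefit of 0 at every threshold by definition.
NET_BENEFIT_TREAT_NONE = 0.0

# Toy example (fabricated for illustration only).
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2, 0.1]
model_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
treat_all_nb = net_benefit_treat_all(toy_labels, threshold=0.5)
```

A model is favored at a given threshold when its net benefit exceeds both the treat-all value and zero; a decision curve repeats this calculation over a range of thresholds.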
Conclusions This study used data from 96426 patients from four hospitals to establish a GSD risk prediction model and to perform risk prediction analyses of internal and external validation data sets in four cohorts. A simplified model of GSD risk prediction, which included the variables of sex, age, body mass index, gallbladder polyps, and fatty liver disease, also exhibited good discrimination and clinical performance. Nonetheless, further studies are needed to explore the role of low-density lipoprotein cholesterol and aspartate aminotransferase in gallstone formation. Although the validation results of the complete model were somewhat better than those of the simplified model, the difference was not significant even in large samples. Compared with the complete model, the simplified model uses fewer variables and yields similar prediction and clinical impact. Hence, we recommend applying the simplified model to improve the efficiency of screening high-risk groups in practice. Its use may help raise awareness of GSD prevention and control in the general population and support early intervention.