Welcome to the Journal of Sichuan University (Medical Sciences)

Construction and Multicenter Validation of a Risk Prediction Model for Gallstone Disease

Establishment and Validation of a Predictive Model for Gallstone Disease in the General Population: A Multicenter Study

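  • Abstract (Chinese version):
    Objective To construct and validate a risk prediction model for gallstone disease using multicenter health check-up data from China, with the aim of identifying people at high risk of gallstones early and raising awareness of prevention and control of the disease.
    Methods A total of 96426 participants were finally included. Of these, 35976 participants from the First Affiliated Hospital of Chongqing Medical University were divided into a training set (80%, n=28781) and an internal validation set (20%, n=7195). Participants from the First People’s Hospital of Jining City, the Tianjin Medical University Cancer Institute and Hospital, and the People’s Hospital of Kaizhou District (Chongqing) served as external validation sets. Logistic regression analysis was used to identify risk factors associated with gallstone disease (GSD), and nomograms were used to construct a complete and a simplified risk prediction model. Calibration curves, the area under the receiver operating characteristic curve (AUC), and decision curve analysis were used to assess the accuracy and clinical utility of the models. In addition, a website was set up based on the results to make the prediction models easy to use (complete model: https://wenqianyu.shinyapps.io/Completemodel/, simplified model: https://wenqianyu.shinyapps.io/Simplified/).
    Results Female sex, older age, higher body mass index, fasting plasma glucose, uric acid, total bilirubin, gamma-glutamyl transpeptidase, and fatty liver were positively associated with the risk of GSD. Gallbladder polyps, total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and aspartate aminotransferase were negatively associated with the risk of GSD. The AUC in internal validation was 74.1% (95% confidence interval: 72.9%-75.3%) for the complete model and 73.7% (95% confidence interval: 72.5%-75.0%) for the simplified model. Decision curve analysis and calibration curves for both models indicated that the complete and simplified GSD risk prediction models showed excellent predictive performance. In addition, the difference in predictive performance between the complete and simplified models was not statistically significant (P>0.05).
    Conclusion The gallstone risk prediction model established in this study, together with the online GSD risk assessment tool, can help patients and clinicians predict the risk of gallstone disease. We recommend using the simplified model in practice to improve the efficiency of screening high-risk populations. Use of the simplified model can help strengthen awareness of self-directed prevention and control in the general population and support early intervention for GSD.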
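The cohort sizes reported above are internally consistent, which a few lines of arithmetic can confirm. The sketch below is illustrative only: the abstract reports the 80%/20% proportions and the resulting counts, but not how participants were assigned, so the seeded shuffle-then-cut procedure and the helper name `split_indices` are assumptions, not the authors' method.

```python
import random

# Cohort sizes as reported in the abstract.
TOTAL_INCLUDED = 96426
CHONGQING_FIRST_AFFILIATED = 35976
REPORTED_TRAIN, REPORTED_INTERNAL = 28781, 7195


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle 0..n-1 reproducibly and cut into training and validation index lists.

    Assumption: the seed and the shuffle-then-cut procedure are hypothetical;
    the abstract only states the 80%/20% proportions.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, internal_idx = split_indices(CHONGQING_FIRST_AFFILIATED)

# Participants left for external validation across the other three hospitals.
external_total = TOTAL_INCLUDED - CHONGQING_FIRST_AFFILIATED

print(len(train_idx), len(internal_idx), external_total)
```

Rounding 80% of 35976 reproduces the reported training and internal validation sizes, and the remainder is the pooled size of the three external validation cohorts.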

     

    Abstract:
    Objective Gallstone disease (GSD) is a common digestive tract disease with a high prevalence worldwide. Its effects on patients include, but are not limited to, nausea, vomiting, and biliary colic directly caused by GSD. In addition, there is mounting evidence from cohort studies linking GSD to other conditions, such as cardiovascular diseases, biliary tract cancer, and colorectal cancer. Early identification of patients at high risk of GSD may help improve the prevention and control of the disease. A series of studies have attempted to establish prediction models for GSD, but these models could not be fully applied to the general population because of incomplete predictive factors, small sample sizes, and limited external validation. It is crucial to design a universally applicable GSD risk prediction model for the general population and to take individualized intervention measures to prevent GSD. This study aimed to conduct a multicenter investigation involving more than 90000 people to construct and validate complete and simplified GSD risk prediction models.
    Methods A total of 123634 participants were enrolled between January 2015 and December 2020, of whom 43929 were from the First Affiliated Hospital of Chongqing Medical University (Chongqing, China), 11907 were from the First People’s Hospital of Jining City (Shandong, China), 1538 were from the Tianjin Medical University Cancer Institute and Hospital (Tianjin, China), and 66260 were from the People’s Hospital of Kaizhou District (Chongqing, China). After excluding patients with incomplete clinical data, 35976 patients from the First Affiliated Hospital of Chongqing Medical University were divided into a training data set (n=28781, 80%) and a validation data set (n=7195, 20%). Logistic regression analyses were performed to investigate the relevant risk factors for GSD, and a complete risk prediction model was constructed. To simplify the model, the factors with high scores, judged mainly from the nomogram of the complete model, were retained. In the validation data set, the diagnostic accuracy and clinical performance of these models were validated using the calibration curve, the area under the receiver operating characteristic curve (AUC), and decision curve analysis (DCA). Moreover, the diagnostic accuracy of the two models was validated in the three other hospitals. Finally, we set up a website for using the prediction models (the complete model is accessible at https://wenqianyu.shinyapps.io/Completemodel/, and the simplified model is accessible at https://wenqianyu.shinyapps.io/Simplified/).
    Results After excluding patients with incomplete clinical data, a total of 96426 participants were finally included in this study (35976 from the First Affiliated Hospital of Chongqing Medical University, 9289 from the First People’s Hospital of Jining City, 1522 from the Tianjin Medical University Cancer Institute and Hospital, and 49639 from the People’s Hospital of Kaizhou District). Female sex, advanced age, higher body mass index, fasting plasma glucose, uric acid, total bilirubin, gamma-glutamyl transpeptidase, and fatty liver disease were positively associated with the risk of GSD. Furthermore, gallbladder polyps, total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and aspartate aminotransferase were negatively associated with the risk of GSD. According to the nomograms of the complete model, a simplified model including sex, age, body mass index, gallbladder polyps, and fatty liver disease was constructed. All the calibration curves showed good agreement between the predicted and observed probabilities. In addition, DCA indicated that both the complete model and the simplified model yielded higher net benefits than the treat-all and treat-none strategies. Based on the calibration plots, DCA, and AUCs of the complete model (AUC in the internal validation data set=74.1%, 95% CI: 72.9%-75.3%; AUC in Shandong=71.7%, 95% CI: 70.6%-72.8%; AUC in Tianjin=75.3%, 95% CI: 72.7%-77.9%; and AUC in Kaizhou=72.9%, 95% CI: 72.5%-73.3%) and the simplified model (AUC in the internal validation data set=73.7%, 95% CI: 72.5%-75.0%; AUC in Shandong=71.5%, 95% CI: 70.4%-72.5%; AUC in Tianjin=75.4%, 95% CI: 72.9%-78.0%; and AUC in Kaizhou=72.4%, 95% CI: 72.0%-72.8%), we concluded that the complete and simplified risk prediction models for GSD exhibited excellent performance. Moreover, we detected no significant differences between the performance of the two models (P>0.05). We also set up two websites based on the results of this study for GSD risk prediction.
    Conclusions This study used data from 96426 participants from four hospitals to establish a GSD risk prediction model and to perform risk prediction analyses of internal and external validation data sets in four cohorts. A simplified model of GSD risk prediction, which included the variables of sex, age, body mass index, gallbladder polyps, and fatty liver disease, also exhibited good discrimination and clinical performance. Nonetheless, further studies are needed to explore the role of low-density lipoprotein cholesterol and aspartate aminotransferase in gallstone formation. Although the validation results of the complete model were somewhat better than those of the simplified model, the difference was not statistically significant even in large samples. Compared with the complete model, the simplified model uses fewer variables and yields similar predictive performance and clinical impact. Hence, we recommend applying the simplified model in practice to improve the efficiency of screening high-risk groups. Use of the simplified model can help strengthen awareness of self-directed prevention and control in the general population and support early intervention for GSD.
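The two validation metrics named in this abstract, the AUC and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the labels and predicted risks below are toy values invented for illustration, the 0.25 threshold is arbitrary, and the net-benefit expression is the standard decision-curve definition (true positives minus false positives weighted by the threshold odds, per person), which the abstract does not spell out.

```python
def auc(labels, scores):
    """Probability that a random case scores higher than a random non-case (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit of flagging everyone whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of flagging everyone; the treat-none strategy is 0 by definition."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, invented for illustration only (1 = GSD present).
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
risks = [0.60, 0.35, 0.08, 0.40, 0.20, 0.10, 0.08, 0.05, 0.05, 0.02]
threshold = 0.25  # arbitrary risk threshold chosen for the example

model_auc = auc(labels, risks)
nb_model = net_benefit(labels, risks, threshold)
nb_all = net_benefit_treat_all(labels, threshold)

print(f"AUC = {model_auc:.3f}")
print(f"net benefit: model = {nb_model:.3f}, treat-all = {nb_all:.3f}, treat-none = 0.000")
```

A decision curve repeats the net-benefit calculation over a range of thresholds. A model is considered clinically useful over the thresholds where its curve lies above both the treat-all and treat-none lines.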

     
