Journal of Sichuan University (Medical Sciences)

Establishment of a Noninvasive Diagnostic Model for Wilson Disease Using Metallomics and Machine Learning

    Abstract:
    Objective To analyze the differences in urine metal profiles between patients with Wilson disease (WD) and healthy controls, to identify early diagnostic biomarkers, and to develop a non-invasive diagnostic model using machine learning.
    Methods A total of 63 WD patients and 63 sex- and age-matched healthy controls were enrolled. Urine samples and clinical data were collected from all participants. The concentrations of 51 metal elements in urine were determined using inductively coupled plasma mass spectrometry (ICP-MS). Differences between the two groups were compared using the Wilcoxon signed-rank test. Differential metal features were selected based on a detection rate > 50%, P < 0.05, and |log2FC| > 1, and feature selection was then performed using elastic net regression. Non-metric multidimensional scaling was used to analyze the distribution of metals between the two groups. Spearman correlation analysis and quantile g-computation models were used to analyze the associations between metals and clinical indicators. A diagnostic model was developed using the random forest algorithm, and its performance was evaluated using receiver operating characteristic curves and confusion matrices.
    Results Urine metallomics analysis revealed statistically significant differences in the levels of Cu, Zn, Ca, Co, Sr, Ti, Y, Cs, Rb, Cd, and Sn between the case and control groups. The Cu/Zn, Cu/Se, and Zn/Se ratios were significantly higher in the case group than in the control group. Elastic net regression identified 14 key features, of which Cu had the largest absolute standardized regression coefficient (β = −14.628). Non-metric multidimensional scaling confirmed that the metal profiles of the two groups were separated. Correlation analysis showed that features such as Cu and Cu/Zn were correlated with liver function indicators, with Spearman correlation coefficients ranging from −0.58 to 0.67. The quantile g-computation model suggested that the metal mixture had a significant negative effect on the albumin/globulin (A/G) ratio. The random forest model exhibited excellent diagnostic performance, with an area under the curve (AUC) of 0.99 (95% confidence interval [CI]: 0.97-1.00) in the training set and 0.97 (95% CI: 0.94-1.00) in the testing set, outperforming single indicators.
    Conclusion Urine metallomics analysis indicated that the urinary Cu/Zn ratio had higher diagnostic performance than the conventional urinary copper test. In addition, the diagnostic model built on multiple metals demonstrated high accuracy, providing a new method for the early non-invasive diagnosis of WD.
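
The screening rule stated in the Methods (detection rate > 50%, P < 0.05, |log2FC| > 1) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `screen_metal`, the `lod` (limit of detection) parameter, and the definition of fold change as the ratio of group medians are all assumptions made for illustration; the abstract does not say how the fold change was computed. The P value is taken as an input and is assumed to come from a paired test run elsewhere (for example `scipy.stats.wilcoxon`).

```python
import math
from statistics import median


def screen_metal(case, control, p_value, lod,
                 min_detect=0.5, alpha=0.05, min_abs_log2fc=1.0):
    """Apply the three screening criteria to one metal.

    case, control : urinary concentrations in each group (hypothetical inputs)
    p_value       : P value from a group comparison computed elsewhere
    lod           : limit of detection; values above it count as detected
    Returns (keep, log2fc). The fold change is the ratio of group medians,
    which is an assumption, not something the abstract specifies.
    """
    values = list(case) + list(control)
    detect_rate = sum(v > lod for v in values) / len(values)

    med_case, med_control = median(case), median(control)
    if med_case <= 0 or med_control <= 0:
        # log2 of a non-positive ratio is undefined; treat as not selected
        return False, float("nan")

    log2fc = math.log2(med_case / med_control)
    keep = (detect_rate > min_detect
            and p_value < alpha
            and abs(log2fc) > min_abs_log2fc)
    return keep, log2fc


# Made-up numbers, for illustration only
keep, log2fc = screen_metal([8, 10, 12, 9], [2, 2.5, 3, 2],
                            p_value=0.001, lod=0.1)
print(keep, round(log2fc, 2))
```

In this sketch a metal is retained only when all three criteria hold at once, so a large fold change with a non-significant P value, or a significant difference in a rarely detected metal, is dropped before the elastic net step.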

     
