Objective To screen for long non-coding RNA (lncRNA) molecular markers characteristic of osteoarthritis (OA) by utilizing the Gene Expression Omnibus (GEO) database combined with machine learning.
Methods Samples from 185 OA patients and 76 healthy individuals (normal controls) were included in the study. GEO datasets were screened for differentially expressed lncRNAs. Three algorithms, the least absolute shrinkage and selection operator (LASSO), support vector machine recursive feature elimination (SVM-RFE), and random forest (RF), were used to screen for candidate lncRNA models, and receiver operating characteristic (ROC) curves were plotted to evaluate the models. Peripheral blood samples were collected from 30 clinical OA patients and 15 healthy controls, and immunoinflammatory indicators were measured. RT-PCR was performed to quantify the expression of the lncRNA molecular markers in peripheral blood mononuclear cells (PBMCs). Pearson correlation analysis was performed to examine the correlation between lncRNA expression and the immunoinflammatory indicators.
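The overlap step and the ROC evaluation described in the Methods can be illustrated with a minimal sketch. It assumes each algorithm has already returned a set of selected lncRNA names; the function names, the placeholder identifiers (`lnc_a`, `lnc_b`, ...), and all numbers are hypothetical and are not the study's data. The AUC is computed as the probability that a randomly chosen patient scores higher than a randomly chosen control, which is one standard way to obtain the area under the ROC curve.

```python
from typing import Dict, Sequence, Set


def overlapping_features(selections: Dict[str, Set[str]]) -> Set[str]:
    """Return the features chosen by every selection method (the Venn core)."""
    sets = list(selections.values())
    return set.intersection(*sets) if sets else set()


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as one half (equivalent to the Mann-Whitney U statistic).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up inputs for illustration only (not the study's data).
toy_selections = {
    "lasso": {"lnc_a", "lnc_b", "lnc_c"},
    "svm_rfe": {"lnc_a", "lnc_c"},
    "rf": {"lnc_a", "lnc_c", "lnc_d"},
}
shared = overlapping_features(toy_selections)

toy_expression = [2.1, 1.9, 2.5, 1.0, 1.2, 2.0]  # hypothetical expression values
toy_labels = [1, 1, 1, 0, 0, 0]                  # 1 = OA, 0 = control
auc = roc_auc(toy_expression, toy_labels)
```

For a marker that is lower in patients than in controls, the raw expression gives an AUC below 0.5; negating the score before calling `roc_auc` yields the corresponding value above 0.5.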
Results A total of 14 key markers were identified with LASSO, 6 genes were identified with SVM-RFE, and 24 genes were identified with RF. A Venn diagram of the genes identified with the three algorithms showed HOTAIR, H19, MIR155HG, and NKILA to be the overlapping genes. The ROC curves showed that each of these four lncRNAs had an area under the curve (AUC) greater than 0.7. RT-PCR revealed elevated expression of HOTAIR, H19, and MIR155HG and decreased expression of NKILA in the PBMCs of OA patients compared with the normal group (P<0.01), consistent with the bioinformatics predictions. Pearson correlation analysis showed that the candidate lncRNAs were correlated with clinical indicators of inflammation.
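As a reminder of what the Pearson analysis measures, the coefficient can be computed directly from paired observations. The sketch below is a plain-Python illustration under the assumption that each sample contributes one lncRNA expression value and one indicator value; the function name and the toy numbers are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy, made-up values for illustration only (not the study's data).
toy_lncrna = [1.0, 2.0, 3.0, 4.0]     # hypothetical relative expression
toy_indicator = [2.0, 4.0, 6.0, 8.0]  # hypothetical inflammation indicator
r = pearson_r(toy_lncrna, toy_indicator)
```

A value of `r` near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship.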
Conclusion HOTAIR, H19, MIR155HG, and NKILA can be used as molecular markers for the clinical diagnosis of OA and are correlated with clinical indicators of inflammation of the immune system.