To address the problem that small clinical sample sizes for critical diseases can yield prognostic models that overfit, have large prediction errors, and are unstable, a long short-term memory transfer algorithm (transLSTM) was proposed. Based on the idea of transfer learning, the algorithm exploits the correlation between diseases to transfer information between the prognostic models of different diseases: it constructs an effective model for a target disease with few samples with the aid of large datasets from related diseases, thereby improving prediction performance and reducing the number of target training samples required. The transLSTM algorithm first uses samples from the related diseases to pretrain part of the model parameters, and then fine-tunes the whole network with the target training samples. Tests on the MIMIC-Ⅲ database showed that, compared with the traditional LSTM classification algorithm, the transLSTM algorithm achieved an AUROC 0.02-0.07 higher and an AUPRC 0.05-0.14 higher, while requiring only 39%-64% of the training iterations of the traditional algorithm. Application to sepsis showed that a transLSTM model trained on only 100 samples achieved mortality prediction performance comparable to that of a traditional model trained on 250 samples. In small-sample settings, the transLSTM algorithm has significant advantages, with higher prediction accuracy and faster training. It demonstrates the application of transfer learning to prognostic models of critical diseases with small samples.
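The two-stage procedure described above (pretraining on related-disease samples, then adjusting the whole network on the target samples) can be illustrated with a minimal sketch. This is not the transLSTM implementation: a two-feature logistic regression stands in for the LSTM, the data are synthetic, and every name here (`train`, `make_data`, the cohort sizes, learning rate, and step counts) is an assumption made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(w, b, X, y):
    # Mean binary cross-entropy of the logistic model (w, b) on (X, y).
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def train(w, b, X, y, lr=0.1, steps=200):
    # Full-batch gradient descent; returns updated copies of (w, b).
    w = list(w)
    n = len(X)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def make_data(n, true_w, true_b, rng):
    # Synthetic cohort (hypothetical): features in [-1, 1], noisy linear label rule.
    X = [[rng.uniform(-1, 1) for _ in true_w] for _ in range(n)]
    y = [
        1 if sum(a * c for a, c in zip(true_w, xi)) + true_b + rng.gauss(0, 0.3) > 0 else 0
        for xi in X
    ]
    return X, y


rng = random.Random(0)
X_src, y_src = make_data(500, [2.0, -1.0], 0.0, rng)  # large related-disease set
X_tgt, y_tgt = make_data(30, [1.8, -1.2], 0.3, rng)   # small target-disease set

# Stage 1: pretrain on the related-disease samples; keep only the feature weights.
w_pre, _ = train([0.0, 0.0], 0.0, X_src, y_src)

# Stage 2: start from the pretrained weights with a fresh bias, then update
# all parameters on the small target set.
loss_before = log_loss(w_pre, 0.0, X_tgt, y_tgt)
w_ft, b_ft = train(w_pre, 0.0, X_tgt, y_tgt, steps=50)
loss_after = log_loss(w_ft, b_ft, X_tgt, y_tgt)
```

The design point is only the warm start: the second call to `train` begins from `w_pre` rather than from zeros, which is the mechanism by which information from the larger cohort reaches the small-sample model.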
Objective To study the differential expression of the minichromosome maintenance protein (MCM) gene family in hepatocellular carcinoma (HCC) and to explore its value in predicting survival. Methods Transcriptome data, clinical data, and survival information of patients with HCC were extracted from The Cancer Genome Atlas (TCGA), and the differential expression of MCM genes was analyzed. The prognostic value of the differentially expressed MCM genes was studied with a Cox proportional hazards regression model, and a prognostic model and risk score system were constructed. On the basis of the risk score, several indicators were included to construct a nomogram for predicting the 3- and 5-year survival probability of HCC patients, and its predictive ability and accuracy were verified and evaluated. Results The expression levels of MCM2, MCM3, MCM4, MCM5, MCM6, MCM7, MCM8, and MCM10 in HCC tissues were higher than those in normal liver tissues (P<0.05), and univariate analysis showed that all of them were related to prognosis (P<0.05). Multivariate analysis showed that MCM6 and MCM10 were independent factors affecting the survival of HCC patients (P<0.05). Through multivariate analysis, a prognostic model consisting of MCM6, MCM8, and MCM10 was constructed, and a risk scoring system was established. The risk score was verified to be an independent risk factor affecting the prognosis of patients with HCC, and the prognosis of patients with high scores was worse than that of patients with low scores (P<0.001). TNM stage, T stage, and risk score were used to construct a nomogram with a concordance index (C-index) of 0.723, and a time-dependent receiver operating characteristic curve was drawn; the areas under the curve at 3 and 5 years were 0.731 and 0.704, respectively. Conclusions MCM6, MCM8, and MCM10 in the MCM gene family have important prognostic value in HCC. The nomogram constructed in this study can better predict the survival probability of HCC patients.
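A risk score derived from a Cox model is commonly computed as the linear predictor, that is, the sum of each gene's regression coefficient multiplied by its expression level. The sketch below assumes that form and assumes a median cutoff for the high/low split; the abstract reports neither the coefficients nor the cutoff, so the coefficient values, patient IDs, and expression values are placeholders, and the function names are hypothetical.

```python
import statistics


def risk_score(expression, coefficients):
    # Linear predictor of a Cox model: sum of coefficient * expression per gene.
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    # Label each patient "high" or "low" relative to the median score (assumed cutoff).
    cutoff = statistics.median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Placeholder coefficients, NOT the values estimated in the study.
coefficients = {"MCM6": 0.5, "MCM8": -0.2, "MCM10": 0.3}

# Placeholder expression values for four hypothetical patients.
patients = {
    "P1": {"MCM6": 2.0, "MCM8": 1.0, "MCM10": 1.0},
    "P2": {"MCM6": 1.0, "MCM8": 2.0, "MCM10": 0.0},
    "P3": {"MCM6": 3.0, "MCM8": 0.0, "MCM10": 2.0},
    "P4": {"MCM6": 0.0, "MCM8": 1.0, "MCM10": 1.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

With these placeholder inputs, P1 and P3 fall in the high-risk group and P2 and P4 in the low-risk group; survival in the two groups would then be compared with a log-rank test.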
Objective To construct a prognostic model for gastric cancer based on disulfidoptosis-related genes. Methods First, transcriptome data and clinical data were obtained from the TCGA and GEO databases to explore the expression of disulfidoptosis-related genes in gastric cancer tissues and normal tissues, as well as their impact on the overall survival (OS) of gastric cancer patients. Subsequently, two clusters of disulfidoptosis-related genes were determined by consensus clustering, key genes were further selected using LASSO regression, and a multivariate Cox proportional hazards regression model was constructed to predict OS. Results Among the 24 disulfidoptosis-related genes, 16 showed statistically significant differences in expression between gastric cancer tissues and normal tissues (P<0.05), and univariate Cox proportional hazards regression showed that 9 disulfidoptosis-related genes were associated with OS (P<0.05). The 24 disulfidoptosis-related genes were grouped into 2 clusters using the consensus clustering algorithm, with 299 differentially expressed genes between the two clusters. In the training set, 14 genes were selected by LASSO regression to construct the OS prediction model, and risk scores were calculated. The OS of the high-risk group was significantly worse than that of the low-risk group (P<0.05), and the prediction model also had a high area under the curve in the validation set. Conclusions The OS prediction model based on disulfidoptosis-related genes can predict the prognosis of gastric cancer patients.
Non-small cell lung cancer is one of the cancers with the highest incidence and mortality rates in the world, and accurate prognostic models can guide clinical treatment planning. With the continuous advancement of computer technology, deep learning, a breakthrough technology in artificial intelligence, has shown good performance and great potential in prognostic models for non-small cell lung cancer. Research on applying deep learning to the prediction of survival and recurrence, treatment efficacy, distant metastasis, and complications in non-small cell lung cancer has made some progress and shows a trend toward combining multi-omics and multi-modal data. However, shortcomings remain, and future work should strengthen model validation and address practical problems in clinical practice.
Objective To investigate independent prognostic factors influencing the prognosis of connective tissue disease-associated interstitial lung disease with pulmonary hypertension (CTD-ILD-PH), and construct a nomogram model using machine learning to predict 1-, 3- and 5-year mortality risks, providing evidence for clinical diagnosis and treatment. Methods Patients diagnosed with CTD-ILD-PH and treated at the First Affiliated Hospital of Zhengzhou University from February 2011 to June 2021 were screened. The least absolute shrinkage and selection operator (Lasso), univariate Cox regression, and multivariate Cox regression analyses were combined to identify independent prognostic factors for CTD-ILD-PH patients. A novel nomogram prognostic model was constructed and internally validated using 1000 bootstrap resamples. The receiver operating characteristic (ROC) curve and Harrell's C-index assessed the predictive performance of the model. Calibration curves evaluated the model fit. Decision curve analysis (DCA) assessed the clinical utility of the model, and external validation was conducted using a separate test set. Results The study included 313 patients, with 108 deaths observed during the follow-up. Using the Lasso-Cox method, albumin, alanine aminotransferase (ALT), red cell volume distribution width (RDW), age, smoking history, rural residence, and pulmonary artery systolic pressure were identified as independent prognostic factors. The Harrell's C-index in the training set was 0.802, and the area under ROC curve was 0.880 (95%CI 0.833 - 0.928). Internal validation showed an average Harrell's C-index of 0.791. Calibration curves indicated high consistency between predicted and observed results. DCA confirmed the model's good clinical utility. External validation results demonstrated the model's favorable predictive performance and clinical utility. 
Conclusions Our research suggests that lower albumin, elevated ALT, elevated RDW, advanced age, smoking history, rural residence, and higher pulmonary artery systolic pressure are independent prognostic risk factors for patients with CTD-ILD-PH. In this study, a prognostic model was developed for the first time to predict 1-, 3- and 5-year mortality in CTD-ILD-PH patients, which provides a reference for future mortality risk assessment in CTD-ILD-PH.
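Harrell's C-index, used above to assess the nomogram, is the fraction of comparable patient pairs in which the patient with the earlier observed death also received the higher predicted risk. The sketch below is a minimal pure-Python version under simple assumptions: a pair is comparable only when the earlier time is an observed event, tied risk scores count as half-concordant, and tied times are not treated specially. The function name and the toy data are illustrative and are not taken from the study.

```python
def harrell_c_index(times, events, risk):
    # times: follow-up time per patient
    # events: 1 if death was observed, 0 if censored
    # risk: predicted risk score (higher means worse expected survival)
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for four hypothetical patients; the third is censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]

c_index = harrell_c_index(times, events, risk)  # 4 of 5 comparable pairs agree: 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. In a bootstrap internal validation, the same function would be evaluated on each resample and the results averaged.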