  • 1. Research Center of Clinical Medicine, Affiliated Hospital of Nantong University, Nantong, 226001, Jiangsu, P. R. China;
  • 2. Department of Thoracic Surgery, Peking University Third Hospital, Beijing, 100191, P. R. China;
  • 3. Department of Pathology, Affiliated Hospital of Nantong University, Nantong, 226001, Jiangsu, P. R. China;
  • 4. Department of Cardiothoracic Surgery, Affiliated Hospital of Nantong University, Nantong, 226001, Jiangsu, P. R. China.
LIU Yifei, Email: ntdxliuyifei@sina.com; SHI Jiahai, Email: sjh@ntu.edu.cn

Objective: To evaluate the accuracy of machine learning algorithms based on SHOX2 and RASSF1A methylation levels in predicting the pathological type of early-stage lung adenocarcinoma.

Methods: Formalin-fixed paraffin-embedded (FFPE) specimens from patients who underwent lung tumor resection at the Affiliated Hospital of Nantong University between January 2021 and January 2023 were analyzed retrospectively. Patients were divided into three groups according to the pathological classification of their tumors: a benign tumor/adenocarcinoma in situ (BT/AIS) group, a minimally invasive adenocarcinoma (MIA) group, and an invasive adenocarcinoma (IA) group. Methylation levels of SHOX2 and RASSF1A in the FFPE specimens were measured with the LungMe kit using methylation-specific PCR (MS-PCR). With these methylation levels as predictors, several machine learning algorithms (logistic regression, XGBoost, random forest, and naive Bayes) were used to predict the pathological type of lung adenocarcinoma.

Results: A total of 272 patients were included. The mean ages of patients in the BT/AIS, MIA, and IA groups were 57.97, 61.31, and 63.84 years, respectively, and the proportions of female patients were 55.38%, 61.11%, and 61.36%, respectively. Among the prediction models built on SHOX2 and RASSF1A methylation levels, the random forest and XGBoost models performed well for each pathological type. The C-statistics of the random forest model for the BT/AIS, MIA, and IA groups were 0.71, 0.72, and 0.78, respectively; those of the XGBoost model were 0.70, 0.75, and 0.77, respectively. The naive Bayes model showed predictive ability only in the IA group, with a C-statistic of 0.73. The logistic regression model performed worst, showing no predictive ability for any group. In decision curve analysis, the random forest model showed a higher net benefit in predicting the BT/AIS and MIA pathological types, suggesting potential value in clinical application.

Conclusion: Machine learning algorithms based on SHOX2 and RASSF1A methylation levels have high accuracy in predicting the pathological type of early-stage lung adenocarcinoma.
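The two evaluation measures named in the abstract, the per-group C-statistic and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy group labels, and the toy predicted probabilities below are hypothetical, and the sketch assumes each pathological type is scored one-vs-rest and that net benefit follows the standard definition TP/n − FP/n × pt/(1 − pt) at threshold probability pt.

```python
from typing import List, Sequence


def one_vs_rest(groups: Sequence[str], target: str) -> List[int]:
    """Binarize group labels: 1 for the target group, 0 for all others."""
    return [1 if g == target else 0 for g in groups]


def c_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """C-statistic (AUC): probability that a random positive case scores
    higher than a random negative case, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit at threshold probability pt:
    TP/n - FP/n * pt / (1 - pt), treating score >= pt as a positive call."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data: true groups and a model's predicted P(IA).
toy_groups = ["BT/AIS", "MIA", "IA", "IA", "MIA", "BT/AIS"]
toy_p_ia = [0.1, 0.4, 0.8, 0.6, 0.7, 0.2]

ia_labels = one_vs_rest(toy_groups, "IA")
print("C-statistic (IA vs rest):", c_statistic(ia_labels, toy_p_ia))
print("Net benefit at pt=0.5:", net_benefit(ia_labels, toy_p_ia, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies; a model with higher net benefit over the clinically relevant threshold range is the more useful one.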

Citation: HUANG Runqi, QIANG Guangliang, LIU Yifei, SHI Jiahai. Prediction of pathological type of early lung adenocarcinoma using machine learning based on SHOX2 and RASSF1A methylation levels. Chinese Journal of Clinical Thoracic and Cardiovascular Surgery, 2025, 32(1): 67-72. doi: 10.7507/1007-4848.202408048

Copyright © the editorial department of Chinese Journal of Clinical Thoracic and Cardiovascular Surgery of West China Medical Publisher. All rights reserved
