Objective To systematically review machine learning-based prediction models for small for gestational age (SGA) and to provide a reference for constructing and optimizing such models. Methods The PubMed, EMbase, Web of Science, CBM, WanFang Data, VIP and CNKI databases were electronically searched to collect studies on SGA prediction models from database inception to August 10, 2022. Two researchers independently screened the literature, extracted data, evaluated the risk of bias of the included studies, and conducted a systematic review. Results A total of 14 studies, comprising 40 prediction models constructed using 19 methods, such as logistic regression and random forest, were included. Thirteen studies were rated as having a high risk of bias; the area under the curve of the prediction models ranged from 0.561 to 0.953. Conclusion The overall risk of bias in the prediction models for SGA was high, and the predictive performance was moderate. Models built using extreme gradient boosting (XGBoost) demonstrated the best predictive performance across different studies. The stacking method can improve predictive performance by integrating different models. Finally, maternal blood pressure, fetal abdominal circumference, head circumference, and estimated fetal weight were important predictors of SGA.
Objective To explore the risk factors for comorbid depression in community patients with type Ⅱ diabetes and to construct a risk prediction model. Methods A total of 269 patients with type Ⅱ diabetes and comorbid depression and 217 patients with type Ⅱ diabetes alone were recruited from three community health service centers in two streets of Pingshan District, Shenzhen, from October 2021 to April 2022. Risk factors were analyzed and screened, and a logistic regression risk prediction model was constructed. The goodness of fit and predictive ability of the model were tested with the Hosmer-Lemeshow test and the receiver operating characteristic (ROC) curve. Finally, the model was verified. Results Logistic regression analysis showed that smoking, diabetes complications, physical function, psychological dimension, medical coping by confrontation, and medical coping by avoidance were independent risk factors for depressive disorder in patients with type Ⅱ diabetes. In the modeling group, the Hosmer-Lemeshow test gave P=0.345, the area under the ROC curve was 0.987, sensitivity was 95.2% and specificity was 98.6%. In the verification group, the area under the ROC curve was 0.945, sensitivity was 89.8%, specificity was 84.8%, and accuracy was 86.8%, indicating that the model has predictive value. Conclusion The risk prediction model for depressive disorder in patients with type Ⅱ diabetes constructed in this study has good predictive and discriminating ability.
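The goodness-of-fit check named in the Methods can be sketched as follows. This is a minimal illustration of the Hosmer-Lemeshow statistic, not the authors' code: the function name, the grouping into equal-sized risk groups, and the toy data are all assumptions. A P value would normally be read from a chi-square distribution with (groups - 2) degrees of freedom, for example with `scipy.stats.chi2.sf`.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized risk groups.

    y_true: 0/1 outcomes; y_prob: predicted probabilities.
    Groups whose expected count is 0 or equal to the group size are skipped.
    """
    pairs = sorted(zip(y_prob, y_true))  # order subjects by predicted risk
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        denom = expected * (1 - expected / size)
        if denom > 0:
            chi2 += (observed - expected) ** 2 / denom
    return chi2


# Toy data (hypothetical): five subjects at 20% risk, five at 80% risk.
probs = [0.2] * 5 + [0.8] * 5
well_calibrated = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]   # 1 and 4 events observed
poorly_calibrated = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # 1 and 0 events observed

hl_good = hosmer_lemeshow_statistic(well_calibrated, probs, groups=2)
hl_bad = hosmer_lemeshow_statistic(poorly_calibrated, probs, groups=2)
print(hl_good, hl_bad)
```

A small statistic (and hence a large P value, such as the reported P=0.345) means the observed and expected event counts agree across risk groups.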
Objective To construct and verify a nomogram prediction model for fear of childbirth in pregnant women. Methods Convenience sampling was used to select 675 pregnant women at a tertiary hospital in Tangshan City, Hebei Province, from July to September 2022 as the modeling group, and 290 pregnant women at a secondary hospital in Tangshan City from October to December 2022 as the verification group. Risk factors were determined by logistic regression analysis, and the nomogram was drawn with R 4.1.2 software. Results Six predictors entered the model: prenatal education, education level, depression, pregnancy complications, anxiety and preference for delivery mode. The areas under the ROC curves of the modeling group and the verification group were 0.834 and 0.806, respectively. The optimal cut-off values were 0.113 and 0.200, the sensitivities were 67.2% and 77.1%, the specificities were 87.3% and 74.0%, and the Youden indices were 0.545 and 0.511, respectively. The calibration plots of the modeling group and the verification group showed close agreement between the actual curve and the ideal curve. The Hosmer-Lemeshow goodness-of-fit test gave χ2=6.541 (P=0.685) and χ2=5.797 (P=0.760), and the Brier scores were 0.096 and 0.117, respectively. Decision curve analysis (DCA) in the modeling group and the verification group showed that the model had clinical practical value when the threshold probability of fear of childbirth was between 0.00 and 0.70 in both groups. Conclusion The nomogram model has good discrimination, calibration and clinical applicability; it can effectively predict the risk of fear of childbirth in pregnant women and provides a reference for early clinical identification of high-risk pregnant women and targeted intervention.
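Two of the summary measures reported above have simple closed forms, sketched here under the usual definitions (Youden index J = sensitivity + specificity - 1; Brier score = mean squared difference between predicted probability and outcome). The function names and the two-subject Brier example are illustrative assumptions; only the sensitivities and specificities come from the abstract.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1 (proportions, not percent)."""
    return sensitivity + specificity - 1.0


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)


# Sensitivity and specificity as reported for the two groups.
j_modeling = youden_index(0.672, 0.873)
j_verification = youden_index(0.771, 0.740)
print(round(j_modeling, 3), round(j_verification, 3))

# Hypothetical toy predictions, only to show the Brier score calculation.
toy_brier = brier_score([1, 0], [0.9, 0.1])
print(toy_brier)
```

The recomputed indices agree with the reported values of 0.545 and 0.511, which is a quick way to confirm that the reported sensitivity, specificity, and Youden index belong to the same cut-off.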
Objective To construct a prediction model for diabetic distal symmetric polyneuropathy (DSPN) based on a neural network algorithm and characteristic data from traditional Chinese medicine and Western medicine. Methods From the inpatients with diabetes at the First Affiliated Hospital of Anhui University of Chinese Medicine from 2017 to 2022, 4 071 cases with complete data were selected. An early warning model of DSPN was established using a neural network; 49 indicators, including general epidemiological data, laboratory examinations, and signs and symptoms of traditional Chinese medicine, were included to analyze the potential risk factors for DSPN, and the variable features were ranked by weight. Validation was performed using ten-fold cross-validation, and the model was evaluated by accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC. Results The mean duration of diabetes in the DSPN group was about 4 years longer than that in the non-DSPN group (P<0.001). Compared with non-DSPN patients, DSPN patients had a significantly higher proportion of Chinese medicine symptoms and signs such as numbness of limbs, limb pain, dizziness and palpitations, fatigue, thirst with desire to drink, dry mouth and throat, blurred vision, frequent urination, slow reaction, dull complexion, purple tongue, thready pulse and hesitant pulse (P<0.001). In this study, the DSPN neural network prediction model was established by integrating traditional Chinese and Western medicine feature data. The AUC of the model was 0.945 3, the accuracy was 87.68%, the sensitivity was 73.9%, the specificity was 92.7%, the positive predictive value was 78.7%, and the negative predictive value was 90.72%. Conclusion The fusion of Chinese and Western medicine characteristic data has great clinical value for early diagnosis, and the established model has high accuracy and diagnostic efficacy, which can provide a practical tool for DSPN screening and diagnosis in the diabetic population.
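The reported performance figures are linked by simple identities, so they can be cross-checked. The sketch below is an illustration rather than part of the study: it back-solves the DSPN prevalence implied by the reported accuracy, sensitivity and specificity (the abstract does not state this prevalence, so it is a derived assumption), then recomputes the positive and negative predictive values. Function names are hypothetical.

```python
def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = prev * sens + (1 - prev) * spec for prev."""
    return (accuracy - specificity) / (sensitivity - specificity)


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Accuracy, sensitivity and specificity as reported in the abstract.
prev = implied_prevalence(0.8768, 0.739, 0.927)
ppv, npv = predictive_values(0.739, 0.927, prev)
print(round(prev, 3), round(ppv, 3), round(npv, 3))
```

Under this assumption the recomputed predictive values land close to the reported 78.7% and 90.72%, which suggests the five reported metrics are mutually consistent.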
Objective To review individual treatment effect (ITE) models developed from randomized controlled trials, with the aim of systematically summarizing the current state of model development and assessing the risk of bias. Methods The PubMed and Embase databases were searched for studies published between 1990 and 14 June 2024. Data were extracted using the CHARMS checklist, and the PROBAST risk of bias tool was used to assess model quality. Results A total of 11 publications were included, containing 19 ITE models. The ITE modeling methods were regression models with interaction terms (n=8, 42.1%), dual-range models (n=5, 26.3%) and machine learning (n=6, 31.6%). The reporting rates for discrimination, calibration and clinical validity were 78.9%, 73.2% and 10.5%, respectively. Fourteen models (73.7%) were assessed as having a high risk of bias, particularly in the statistical analysis domain, owing to problems such as inappropriate handling of missing data (n=15, 78.9%) and inadequate consideration of model fit (n=5, 26.3%). Conclusion Common approaches to ITE model development include constructing interaction terms, dual procedure theory, and machine learning, but few models have been developed, the modeling methods are relatively complex, and reporting is not standardized. In the future, emphasis should be placed on further exploration of ITE models, with more diverse modeling methods and standardized reporting, to improve the clinical uptake and practical value of the models.
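The most common approach above, a regression model with a treatment-by-covariate interaction term, can be sketched as follows. This is a minimal illustration, assuming a logistic model with a single covariate; the coefficients are hypothetical and are not taken from any of the reviewed models. The ITE is taken here as the difference in predicted risk for the same patient with and without treatment.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_risk(x, treated, coef):
    """Predicted outcome risk from a logistic model with a treatment x covariate interaction."""
    z = (coef["intercept"] + coef["x"] * x
         + coef["t"] * treated + coef["t_x"] * treated * x)
    return sigmoid(z)


def individual_treatment_effect(x, coef):
    """ITE as the risk difference: risk if treated minus risk if untreated."""
    return predicted_risk(x, 1, coef) - predicted_risk(x, 0, coef)


# Hypothetical coefficients, for illustration only.
HYPOTHETICAL_COEF = {"intercept": -1.0, "x": 0.8, "t": -0.2, "t_x": -0.5}

ite_low = individual_treatment_effect(0.0, HYPOTHETICAL_COEF)
ite_high = individual_treatment_effect(2.0, HYPOTHETICAL_COEF)
print(round(ite_low, 3), round(ite_high, 3))
```

With a negative interaction coefficient, the predicted risk reduction grows with the covariate value, which is the kind of effect heterogeneity these models are built to capture.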
Objective To systematically evaluate risk prediction models for postpartum depression (PPD) in order to provide a reference for the construction, application and optimization of such models. Methods The CNKI, VIP, WanFang Data, PubMed, Web of Science and EMbase databases were electronically searched to collect studies on risk prediction models for postpartum depression from January 2013 to April 2023. Two reviewers independently screened the literature, extracted data, and assessed the quality of the included studies using the PROBAST tool. Results A total of 10 studies were included, each contributing one optimal model. Common predictors included prenatal depression, age, smoking history, thyroid hormones and other factors. The area under the curve of every model was greater than 0.7. The overall risk of bias was high and the applicability was moderate, mainly owing to an insufficient number of events for the outcome variable in the analysis domain, improper handling of missing data, selection of predictors based on univariate analysis, lack of model performance assessment, and failure to consider model overfitting. Conclusion PPD risk prediction models are still in the development stage. The included models show good predictive performance and can help identify people at high risk of postpartum depression early. However, their overall applicability needs to be strengthened; large-sample, multi-center prospective clinical studies should be carried out to construct an optimal PPD risk prediction model so that PPD can be identified and prevented as early as possible.
With the rapid development of artificial intelligence (AI) and machine learning technologies, the development of AI-based prediction models has become increasingly prevalent in the medical field. However, the PROBAST tool, which is used to evaluate prediction models, has shown growing limitations when assessing models built on AI technologies. Therefore, Moons and colleagues updated and expanded PROBAST to develop the PROBAST+AI tool. This tool is suitable for evaluating prediction model studies based on both artificial intelligence methods and regression methods. It covers four domains: participants and data sources, predictors, outcomes, and analysis, allowing for systematic assessment of quality in model development, risk of bias in model evaluation, and applicability. This article interprets the content and evaluation process of the PROBAST+AI tool, aiming to provide references and guidance for domestic researchers using this tool.
Objectives To explore a method for constructing a prediction model of the absolute risk of breast cancer and to provide personalized breast cancer management strategies based on the results. Methods A case-control study included 2 747 individuals with pathologically diagnosed primary breast cancer at West China Hospital of Sichuan University from 2000 to 2017 and 6 307 healthy controls from the Breast Cancer Screening Cohort in Sichuan Women and Children Center and Chengdu Shuangliu District Maternal and Child Health Hospital. Standardized questionnaires and hospital information management systems were used to collect information. Decision trees, logistic regression, the formula in the Gail model and registration data in China were used to estimate the 5-year risk of breast cancer. Finally, a receiver operating characteristic (ROC) curve was drawn to identify the optimal cut-off value, and the performance was evaluated. Results The decision tree selected four variables: urban or rural residence, number of live births, age, and age at menarche. The median 5-year risk and interquartile range were 0.027% and 0.137% in the controls and 0.219% and 0.256% in the cases. The ROC curve showed that the cut-off value was 0.100%. On verification, the sensitivity was 0.79, the specificity was 0.73, the accuracy was 0.75, and the area under the curve (AUC) was 0.79. Conclusions The methods used in our study, based on 9 054 women in Sichuan province, could be used to predict the 5-year risk of breast cancer. Predictor variables include urban or rural residence, number of live births, age, and age at menarche. A person whose 5-year risk exceeds 0.100% is classified as a high-risk individual.
Objective To establish a hypertension prediction model for middle-aged and elderly people in China and to validate its performance using the basic public health service database. Methods Literature related to hypertension was retrieved online. Meta-analysis was used to pool the effect sizes of influencing factors. Statistically significant factors that were also available in the database were extracted as the predictors of the models. The predictors' effect sizes were log-transformed to serve as the parameters of the Logit function model and the risk score model. Participants who had never been diagnosed with hypertension at the physical examination of the health service project of Hongguang Town Health Center in Pidu District of Chengdu from January 1, 2017, to January 1, 2022, formed the external validation group. Results A total of 15 original studies were included in the meta-analysis, and 11 statistically significant influencing factors for hypertension were identified: age, female sex, systolic blood pressure, diastolic blood pressure, BMI, central obesity, triglyceride, smoking, drinking, history of diabetes and family history of hypertension. Of 4 997 eligible participants, 684 developed hypertension during the five-year follow-up. External validation indicated an AUC of 0.571 for the Logit function model and an AUC of 0.657 for the risk score model. Conclusion In this study, we developed two different prediction models based on the results of a meta-analysis and validated them using the national basic public health service database. The risk score model has better predictive performance, which may help quickly stratify the risk of the community population and strengthen the primary-level assistance system.
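The step of turning pooled effect sizes into model parameters can be sketched as below, assuming the effect sizes are odds ratios for binary predictors: each coefficient is the natural logarithm of the pooled odds ratio, and the linear predictor is passed through the logistic function. The odds ratios, the intercept, and the predictor subset are hypothetical placeholders, not the values pooled in the study, and the abstract does not describe how the intercept or the risk score model was derived.

```python
import math

# Hypothetical pooled odds ratios for three binary predictors (illustration only).
HYPOTHETICAL_ODDS_RATIOS = {"smoking": 1.5, "central_obesity": 2.0, "family_history": 1.8}
# Hypothetical intercept (baseline log-odds); not reported in the abstract.
HYPOTHETICAL_INTERCEPT = -2.0


def logit_risk(profile, odds_ratios, intercept):
    """Risk from a Logit model whose coefficients are ln(pooled odds ratio)."""
    z = intercept + sum(math.log(odds_ratios[name]) * profile.get(name, 0)
                        for name in odds_ratios)
    return 1.0 / (1.0 + math.exp(-z))


baseline = logit_risk({}, HYPOTHETICAL_ODDS_RATIOS, HYPOTHETICAL_INTERCEPT)
exposed = logit_risk({"smoking": 1, "central_obesity": 1, "family_history": 1},
                     HYPOTHETICAL_ODDS_RATIOS, HYPOTHETICAL_INTERCEPT)
print(round(baseline, 3), round(exposed, 3))
```

Because the coefficients are log odds ratios, the odds of a person with all three factors are the baseline odds multiplied by the product of the odds ratios, here 1.5 × 2.0 × 1.8 = 5.4.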
Objective To construct a prediction model for seizures after stroke and to explore the risk factors that lead to seizures after stroke. Methods A retrospective analysis was conducted on 1 741 patients with stroke admitted to People's Hospital of Zhongjiang from July 2020 to September 2022 who met the inclusion and exclusion criteria. These patients were followed up for one year after the stroke to observe whether they experienced seizures. Patient data such as gender, age, diagnosis, National Institutes of Health Stroke Scale (NIHSS) score, activities of daily living (ADL) score, laboratory tests, and imaging examination data were recorded and analyzed with the occurrence of seizures as the outcome. Least absolute shrinkage and selection operator (LASSO) regression was used to screen predictive variables, and multivariate logistic regression analysis was performed. Subsequently, the data were randomly divided into a training set and a validation set in a 7:3 ratio. A prediction model was constructed, the C-index was calculated, and a nomogram, calibration plot, receiver operating characteristic (ROC) curve, and decision curve analysis (DCA) were produced to evaluate the model's performance and clinical application value. Results LASSO regression identified nine predictive variables with non-zero coefficients: NIHSS score, homocysteine (Hcy), aspartate aminotransferase (AST), platelet count, hyperuricemia, hyponatremia, frontal lobe lesions, temporal lobe lesions, and pons lesions. Multivariate logistic regression analysis revealed that NIHSS score, Hcy, hyperuricemia, hyponatremia, and pons lesions were positively correlated with seizures after stroke, while AST and platelet count were negatively correlated with seizures after stroke. A nomogram for predicting seizures after stroke was established. The C-indices of the training set and validation set were 0.854 [95%CI (0.841, 0.947)] and 0.838 [95%CI (0.800, 0.988)], respectively. The areas under the ROC curves were 0.842 [95%CI (0.777, 0.899)] and 0.829 [95%CI (0.694, 0.936)], respectively. Conclusion These nine variables can be used to predict seizures after stroke, and they provide new insights into its risk factors.
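For a binary outcome such as seizure versus no seizure, the C-index is the proportion of event/non-event pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The sketch below is a plain pairwise implementation for illustration; the function name and the toy data are assumptions and do not reproduce the study's data or software.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (ties in score count as 0.5)."""
    concordant = 0.0
    pairs = 0
    for y_event, s_event in zip(y_true, y_score):
        if y_event != 1:
            continue
        for y_other, s_other in zip(y_true, y_score):
            if y_other != 0:
                continue
            pairs += 1  # one comparable event/non-event pair
            if s_event > s_other:
                concordant += 1.0
            elif s_event == s_other:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Hypothetical toy data: two patients with seizures, three without.
toy_outcomes = [1, 1, 0, 0, 0]
toy_risks = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_c = c_index(toy_outcomes, toy_risks)
print(round(toy_c, 3))
```

In the toy data, five of the six event/non-event pairs are ranked correctly. The pairwise loop is quadratic in sample size, which is acceptable for a sketch; library routines are preferable for data sets the size of the one described above.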