Objective: To establish a predictive model of surgical site infection (SSI) following colorectal surgery using machine learning.

Methods: Machine learning algorithms were used to analyze and model the colorectal data set from the Duke Infection Control Outreach Network Surveillance Network. The whole data set was first divided into two parts, with 80% as the training data set and 20% as the testing data set. To improve the training effect, the whole data set was then divided again, with 90% as the training data set and 10% as the testing data set. The predictions of the model were compared with the actual infected cases, and the sensitivity, specificity, positive predictive value, and negative predictive value of the model were calculated. The area under the receiver operating characteristic (ROC) curve was used to evaluate the predictive capacity of the model, and the odds ratio (OR) was calculated to test the validity of the evaluation at a significance level of 0.05.

Results: There were 7,285 patients in the whole data set, registered from January 15, 2015 to June 16, 2016, among whom 234 were SSI cases, giving an SSI incidence of 3.21%. The predictive model was established with the random forest algorithm, trained on 90% of the whole data set and tested on the remaining 10%. The sensitivity, specificity, positive predictive value, and negative predictive value of the model were 76.9%, 59.2%, 3.3%, and 99.3%, respectively, and the area under the ROC curve was 0.767 [OR=4.84, 95% confidence interval (1.32, 17.74), P=0.02].

Conclusion: The predictive model of SSI following colorectal surgery established with the random forest algorithm has the potential to enable semi-automatic monitoring of SSIs, but more training data are needed to improve the predictive capacity of the model before clinical application.
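The evaluation measures named in the Methods can be illustrated with a minimal Python sketch. The abstract does not report the test-set confusion matrix, so the counts used here (10 true positives, 3 false negatives, 292 false positives, 424 true negatives) are an assumption, back-calculated to be consistent with the reported percentages and with a test set of roughly 10% of 7,285 patients. The function name `diagnostic_metrics` is hypothetical, and the Wald interval on the log odds ratio is likewise an assumed method, since the abstract does not state how the confidence interval was obtained.

```python
import math


def diagnostic_metrics(tp, fn, fp, tn, z=1.96):
    """Compute screening metrics and an odds ratio from a 2x2 table.

    tp/fn: infected cases predicted positive/negative.
    fp/tn: non-infected cases predicted positive/negative.
    z: normal quantile for the confidence interval (1.96 for 95%).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value

    # Odds ratio with a Wald interval on the log scale (assumed method).
    odds_ratio = (tp * tn) / (fp * fn)
    se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "odds_ratio": odds_ratio,
        "ci_low": ci_low,
        "ci_high": ci_high,
    }


# Assumed counts, reconstructed from the reported percentages (not given in the abstract).
metrics = diagnostic_metrics(tp=10, fn=3, fp=292, tn=424)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the sketch gives values that agree with the reported figures after rounding. It also shows why the positive predictive value is so low: at an incidence of about 3%, false positives (292) far outnumber true positives (10) even at moderate specificity.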