Randomized controlled trials provide a high level of evidence, but they cannot be widely applied because of high cost, limited external validity, ethical constraints, and other reasons. Traditional observational studies have reduced internal validity because of various confounding factors, and their level of evidence is low. Regression discontinuity design (RDD) observes and compares the outcomes of subjects around a threshold under real clinical conditions. Its ability to adjust for confounding factors is second only to that of randomized controlled trials. It can be used when the intervention (or exposure) is directly related to the value of a continuous variable. For instance, whether an HIV patient needs antiretroviral treatment mainly depends on whether the CD4 cell count is lower than 200/μL. Because the measurement of continuous variables has random error, whether a patient near the threshold receives the intervention is close to random, so the baseline characteristics of the intervention and non-intervention groups near the threshold should be balanced and comparable. Under this assumption, the causal effect of the intervention (or exposure) on the outcome can be estimated by comparing the outcomes of populations near the threshold. In medicine, RDD is mainly applied to studies of categorical outcomes, for which two-stage least squares, likelihood ratio-based estimation, and Bayesian methods are the more commonly used model estimation methods. However, the application conditions of RDD and its sample size requirements limit its wider use in medicine. With improving data accessibility and the development of real-world research, RDD is expected to be used more widely in clinical research.
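The comparison of outcomes near the threshold can be illustrated with a minimal sketch. It is not one of the estimation methods named above: it assumes a sharp design, a continuous outcome, and a local linear regression with a uniform kernel. The simulated CD4 values, the effect size of 2.0, the bandwidth of 50, and the function names `sharp_rdd_estimate` and `simulate` are all hypothetical choices made for illustration.

```python
import numpy as np


def sharp_rdd_estimate(running, outcome, cutoff, bandwidth):
    """Local linear estimate of the outcome jump at the cutoff (sharp RDD).

    Assumption: units with running < cutoff are treated, mirroring the
    CD4 < 200 example. Only observations within `bandwidth` of the cutoff
    are used, each with equal weight (uniform kernel).
    """
    running = np.asarray(running, dtype=float)
    outcome = np.asarray(outcome, dtype=float)

    keep = np.abs(running - cutoff) <= bandwidth
    x = running[keep] - cutoff          # centre the running variable at the cutoff
    y = outcome[keep]
    treated = (x < 0).astype(float)

    # Columns: intercept, treatment indicator, slope, extra slope on the treated side.
    design = np.column_stack([np.ones_like(x), treated, x, treated * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]                      # coefficient on the treatment indicator


def simulate(n=5000, effect=2.0, seed=0):
    """Hypothetical data: outcome rises linearly with CD4, plus a treatment jump."""
    rng = np.random.default_rng(seed)
    cd4 = rng.uniform(50, 350, size=n)
    treated = cd4 < 200
    outcome = 1.0 + 0.01 * cd4 + effect * treated + rng.normal(0, 1, size=n)
    return cd4, outcome


cd4, outcome = simulate()
estimate = sharp_rdd_estimate(cd4, outcome, cutoff=200.0, bandwidth=50.0)
print(f"Estimated jump at the cutoff: {estimate:.2f}")
```

Fitting separate slopes on each side of the cutoff keeps a smooth trend in the running variable from being mistaken for a treatment effect. In practice the bandwidth would be chosen by a data-driven rule and the result checked for sensitivity to that choice.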
Research on generating real-world evidence from real-world data has attracted considerable attention globally. Outcome research on treatments based on existing health and medical data or registries has become one of the most important topics. However, there remains some confusion in this line of research about how to design and implement appropriate statistical analysis. Therefore, in the fourth chapter of the series of technical guidance for developing real-world evidence by the China REal world data and studies Alliance (ChinaREAL), we aim to provide guidance on statistical analysis in studies that assess therapeutic outcomes based on existing health and medical data or registries. In this chapter, we first emphasize the importance of a pre-specified statistical analysis plan and recommend its key components. We then summarize the issue of sample size calculation in this context and clarify the interpretation of the statistical p-value. Next, we recommend procedures for addressing selection bias, information bias and, most importantly, confounding bias. We discuss multivariable regression analysis as well as popular causal inference models. We also suggest that careful consideration be given to handling missing data in real-world databases. Finally, we list the core content of the statistical report.
Causal inference is one of the main goals of medical research. However, because they lack an in-depth understanding of the theory of causal inference, researchers tend to apply multiple statistical methods indiscriminately to the same question in order to enhance the credibility of the results, which leads to problems in interpreting them. Based on the three basic concepts of the counterfactual framework of causal inference (potential outcomes, causal effects, and assignment mechanisms), this paper introduces six main target effects in causal inference and discusses their comparability, to help researchers understand the principles of causal inference and correctly interpret and compare research results so as to avoid misleading conclusions.
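For orientation, the basic concepts can be written in standard potential-outcomes notation. This is a sketch under stated assumptions: the symbols $Y_i(1)$, $Y_i(0)$, and $A_i$ are not taken from the paper, and the two effects shown are common textbook examples of target effects, not a reproduction of the six effects the paper discusses.

```latex
% Potential outcomes for subject i under treatment (1) and control (0);
% A_i is the treatment actually received (assumed notation).
\begin{align*}
  \text{Individual causal effect:} \quad & \tau_i = Y_i(1) - Y_i(0) \\
  \text{Observed outcome:} \quad & Y_i = A_i\,Y_i(1) + (1 - A_i)\,Y_i(0) \\
  \text{Average treatment effect (ATE):} \quad & \mathbb{E}\left[\,Y(1) - Y(0)\,\right] \\
  \text{Average treatment effect in the treated (ATT):} \quad & \mathbb{E}\left[\,Y(1) - Y(0) \mid A = 1\,\right]
\end{align*}
```

Only one of the two potential outcomes is observed for each subject, so the ATE and ATT refer to different populations and need not coincide. This is one reason why estimates obtained from different methods may not be directly comparable.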
Randomized controlled trials (RCTs) are often limited by ethical or operational constraints. Quasi-experimental studies can be an alternative to RCTs, allowing causal inferences without randomization by controlling for confounding. This paper introduces the general statistical analysis methods for quasi-experimental designs, including difference-in-differences models, instrumental variables, regression discontinuity design, and interrupted time series, covering their basic ideas, characteristics, limitations, and applications in medicine, to provide a reference for future research.
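As an illustration of one of the listed methods, the following is a minimal sketch of the canonical two-group, two-period difference-in-differences estimator. It assumes parallel trends between the groups in the absence of the intervention. The simulated data, the effect size of 1.5, and the function name `did_estimate` are hypothetical and are not taken from the paper.

```python
import numpy as np


def did_estimate(outcome, treated_group, post_period):
    """Two-group, two-period difference-in-differences from cell means.

    Returns (change over time in the treated group) minus
    (change over time in the control group).
    """
    outcome = np.asarray(outcome, dtype=float)
    g = np.asarray(treated_group, dtype=bool)
    t = np.asarray(post_period, dtype=bool)

    change_treated = outcome[g & t].mean() - outcome[g & ~t].mean()
    change_control = outcome[~g & t].mean() - outcome[~g & ~t].mean()
    return change_treated - change_control


# Hypothetical data: a baseline group difference (1.0), a common time trend (0.5),
# and an intervention effect (1.5) that applies only to the treated group afterwards.
rng = np.random.default_rng(1)
n = 4000
group = rng.integers(0, 2, size=n).astype(bool)
post = rng.integers(0, 2, size=n).astype(bool)
true_effect = 1.5
y = (2.0 + 1.0 * group + 0.5 * post
     + true_effect * (group & post) + rng.normal(0, 1, size=n))

did = did_estimate(y, group, post)
print(f"Difference-in-differences estimate: {did:.2f}")
```

Subtracting the control group's change removes the common time trend, and differencing within each group removes the fixed baseline gap between the groups. What remains is attributable to the intervention only if the parallel-trends assumption holds.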