Objective To evaluate, using the GRADE system, the quality of evidence for outcomes reported in systematic reviews and meta-analyses in the nursing field in China, in order to describe the current state of evidence quality and to promote the assessment of evidence quality in systematic reviews. Methods The evidence for each included outcome was entered, extracted, and qualitatively graded using GRADEpro 3.6 software. We then analyzed and described the downgrading and upgrading factors that affected the quality of evidence during the evaluation. Results A total of 53 systematic reviews or meta-analyses involving 188 outcomes were identified and evaluated. High, moderate, low, and very low quality evidence accounted for 2.7%, 27.1%, 51.1%, and 19.1% of outcomes, respectively; low-quality evidence was the most common. Conclusion The quality of evidence produced by systematic reviews and meta-analyses in the nursing field in China is poor and urgently needs improvement. Reviewers should follow methodological standards when conducting systematic reviews or meta-analyses, and the quality of evidence for each outcome should be evaluated and fully reported.
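The GRADE (Grades of Recommendation, Assessment, Development, and Evaluation) approach provides guidance for rating the quality of evidence and grading the strength of recommendations in health care. It has important implications for those who summarize evidence for systematic reviews, health technology assessments, and clinical practice guidelines. GRADE provides a systematic and transparent framework for clarifying questions, determining the outcomes of interest, summarizing the evidence that addresses a question, and moving from evidence to a recommendation or decision. The wide dissemination and use of the GRADE approach, endorsed by more than 50 organizations worldwide, many of them highly influential (http://www.gradeworkinggroup.org/), attests to the importance of this work. This article introduces a 20-part series, to be published in the Journal of Clinical Epidemiology, that provides guidance on using the GRADE approach.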
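In the GRADE approach, randomized trials start as high-quality evidence and observational studies start as low-quality evidence, but both can be rated down if the body of evidence carries a high risk of publication bias. Even when the individual studies included in a best-evidence summary have a low risk of bias, publication bias can lead to substantial overestimates of effect. Authors should suspect publication bias when the available evidence comes from small studies, most of which are industry funded. Several approaches based on examining the pattern of the data can help assess publication bias; the funnel plot is the most widely used, but all of them have substantial limitations. Publication bias is probably common, so early results warrant particular caution, especially those from early trials with small sample sizes and few events.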
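As a rough illustration of a data-pattern approach to funnel-plot asymmetry, the sketch below fits an Egger-style regression of the standardized effect (effect divided by its standard error) on precision (1 divided by the standard error). This technique is not named in the abstract above; it is an assumption chosen for illustration. The function name, the use of log relative risks as effects, and the example data are all hypothetical. An intercept far from zero hints at small-study effects, but like the funnel plot itself it has substantial limitations and does not prove publication bias.

```python
def egger_regression(effects, std_errors):
    """Egger-style regression of effect/SE on 1/SE.

    effects    : study effect estimates (assumed here to be log relative risks)
    std_errors : their standard errors
    Returns (intercept, slope). The intercept measures funnel-plot asymmetry;
    the slope estimates the underlying effect.
    """
    if len(effects) != len(std_errors) or len(effects) < 3:
        raise ValueError("need at least three studies with matching inputs")

    precision = [1.0 / se for se in std_errors]
    standardized = [e / se for e, se in zip(effects, std_errors)]

    n = len(effects)
    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(precision, standardized))

    slope = sxy / sxx                      # ordinary least squares slope
    intercept = mean_y - slope * mean_x    # asymmetry term
    return intercept, slope


# Hypothetical data: the same four standard errors, two effect patterns.
ses = [0.05, 0.1, 0.2, 0.4]
symmetric = [-0.3 for _ in ses]             # effect unrelated to study size
asymmetric = [-0.1 - 2 * se for se in ses]  # smaller studies show larger effects

print(egger_regression(symmetric, ses))
print(egger_regression(asymmetric, ses))
```

A formal test would also need a standard error for the intercept and a reasonable number of studies; this sketch reports only the point estimates.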
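GRADE suggests that examining the 95% confidence interval (CI) is the best primary approach to decisions about imprecision. For practice guidelines, the quality of evidence (that is, confidence in the estimate of effect) must be rated down if clinical action would differ depending on whether the upper or the lower boundary of the CI represented the truth. An exception arises when the effect is large and the CI alone suggests a robust effect, but the total sample size is not large and the number of events is small; in that situation, rating down for imprecision should still be considered. To inform this decision, one can calculate the number of patients required for an adequately powered individual trial (termed the "optimal information size", OIS). For continuous variables, we suggest a similar process: first consider the upper and lower limits of the CI, then calculate the OIS. Systematic reviews (SRs) require a somewhat different approach. If the 95% CI excludes a relative risk (RR) of 1 and the total number of events or patients exceeds the OIS criterion, precision is adequate. If the 95% CI includes appreciable benefit or harm (we suggest an RR of <0.75 or >1.25 as a rough guide), rating down for imprecision may be appropriate even if the OIS criterion is met.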
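The OIS idea can be sketched with the conventional sample-size formula for comparing two proportions. The abstract does not give a formula, so the formula choice, the function names, the default alpha and power, and the example control risk and relative risk reduction are all assumptions made for illustration. The second function encodes one possible reading of the systematic-review rule above: it applies the 0.75 and 1.25 thresholds only when the CI also includes an RR of 1.

```python
import math
from statistics import NormalDist


def optimal_information_size(control_risk, relative_risk_reduction,
                             alpha=0.05, power=0.80):
    """Total patients (both arms) for an adequately powered two-arm trial.

    Uses the standard normal-approximation formula for two proportions;
    alpha is two-sided. All defaults are illustrative assumptions.
    """
    p1 = control_risk
    p2 = control_risk * (1 - relative_risk_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_per_group = ((z_alpha + z_beta) ** 2
                   * (p1 * (1 - p1) + p2 * (1 - p2))
                   / (p1 - p2) ** 2)
    return 2 * math.ceil(n_per_group)


def consider_rating_down(ci_lower, ci_upper, total_patients, ois,
                         benefit_threshold=0.75, harm_threshold=1.25):
    """One reading of the systematic-review rule for a relative risk.

    Returns True when rating down for imprecision should be considered.
    """
    if total_patients < ois:
        return True                      # OIS criterion not met
    if ci_lower > 1.0 or ci_upper < 1.0:
        return False                     # CI excludes RR = 1 and OIS is met
    # CI includes no effect: check for appreciable benefit or harm
    return ci_lower < benefit_threshold or ci_upper > harm_threshold


# Hypothetical scenario: 20% control risk, 25% relative risk reduction.
ois = optimal_information_size(0.20, 0.25)
print(ois)
print(consider_rating_down(0.70, 0.95, total_patients=3000, ois=ois))
print(consider_rating_down(0.70, 1.30, total_patients=3000, ois=ois))
```

The thresholds are exposed as parameters because the abstract describes 0.75 and 1.25 only as a rough guide.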
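Direct evidence comes from research that directly compares the interventions of interest, applied to the patient populations of interest, and that measures outcomes important to patients. Evidence can be indirect in one of four ways. First, the patients may differ from those of interest (the term applicability is often used for this form of indirectness). Second, the intervention tested may differ from the intervention of interest. Decisions about indirectness of patients and interventions depend on judging whether biological or social factors differ enough that one would expect substantial differences in the magnitude of effect. Third, the outcomes may differ from those of primary interest; for example, surrogate outcomes that are not important in themselves but are measured on the assumption that changes in the surrogate reflect changes in an outcome important to patients. A fourth type of indirectness, conceptually different from the first three, occurs when clinicians must choose between two interventions that have not been compared head to head. Comparing treatments under these circumstances requires specific statistical methods, and the evidence is rated down by one or two levels depending on the extent of differences in patient populations, co-interventions, outcome measurements, and the methods of the trials of the candidate interventions.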
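GRADE requires a clear specification of the relevant setting, population, intervention, and comparator. It also requires specification of all important outcomes, whether or not evidence from research studies is available for them. For a particular management question, the population, intervention, and outcome should be sufficiently similar across studies that a similar magnitude of effect is plausible. Guideline developers should specify the relative importance of the outcomes before gathering the evidence, and again when the evidence summaries are complete. When considering the importance of a surrogate outcome, authors should rate the importance of the patient-important outcome for which the surrogate is a substitute, and then rate down the quality of evidence for indirectness of outcome.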
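This article introduces the GRADE (Grading of Recommendations Assessment, Development, and Evaluation) series. The series provides guidance on using the GRADE system to rate the quality of evidence and grade the strength of recommendations for alternative management options in systematic reviews, health technology assessments (HTAs), and clinical practice guidelines. The GRADE process begins with asking an explicit question, including specification of all important outcomes. After the evidence is collected and summarized, GRADE provides explicit criteria for rating its quality, including study design, risk of bias, imprecision, inconsistency, indirectness, and magnitude of effect.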
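In the GRADE approach, randomized trials start as high-quality evidence and observational studies start as low-quality evidence, but both can be rated down if most of the relevant evidence comes from studies with a high risk of bias. Well-established limitations of randomized trials include failure to conceal allocation, failure to blind, failure to report loss to follow-up, and failure to appropriately consider the intention-to-treat principle. More recently recognized limitations include stopping early for apparent benefit and selective reporting of outcomes according to the results. Key limitations of observational studies include use of inappropriate controls and failure to adjust adequately for prognostic imbalance. Risk of bias may vary across outcomes (for example, loss to follow-up is far lower for all-cause mortality than for quality of life), a point that many systematic reviews overlook. In deciding whether to rate down for risk of bias, whether for randomized trials or observational studies, authors should not take an approach that averages across studies. Rather, for any individual outcome, when some studies have a high risk of bias and others a low risk, authors should consider including only the studies with a lower risk of bias.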
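The rating logic described in these abstracts (a starting level set by study design, then rating down or up) can be sketched as a small function. This is a simplified illustration, not the GRADEpro implementation: the function name, the domain keys, the cap of two levels per domain, and the integer encoding of levels are assumptions. In practice each rating is a structured judgment rather than arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as index into LEVELS.
START = {"randomized": 3, "observational": 1}

# Rating-down domains named across the abstracts above.
DOWNGRADE_DOMAINS = {
    "risk_of_bias", "inconsistency", "indirectness",
    "imprecision", "publication_bias",
}


def grade_quality(design, downgrades=None, upgrades=0):
    """Return the quality level for one outcome.

    design     : "randomized" or "observational"
    downgrades : dict mapping a domain to levels rated down (0, 1, or 2)
    upgrades   : levels rated up, for example for a large magnitude of effect
    """
    if design not in START:
        raise ValueError(f"unknown design: {design!r}")
    downgrades = downgrades or {}
    for domain, levels in downgrades.items():
        if domain not in DOWNGRADE_DOMAINS:
            raise ValueError(f"unknown domain: {domain!r}")
        if levels not in (0, 1, 2):
            raise ValueError("each domain is rated down by 0, 1, or 2 levels")
    if upgrades < 0:
        raise ValueError("upgrades must be non-negative")

    score = START[design] - sum(downgrades.values()) + upgrades
    # Clamp to the four-level scale.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


print(grade_quality("randomized"))
print(grade_quality("randomized", {"risk_of_bias": 1, "imprecision": 1}))
print(grade_quality("observational", upgrades=1))
```

Ratings are made per outcome, so the function would be called once for each outcome in a review, consistent with the point above that risk of bias can differ across outcomes.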
The methodology of systematic reviews of prognostic studies has received a great deal of interest in recent years. When GRADE is applied to a systematic review of prognostic studies, five aspects should be considered: risk of bias, indirectness, inconsistency, imprecision, and publication bias. The methods for using the GRADE system in systematic reviews of prognostic studies are similar to those for systematic reviews of interventional studies, but there are differences. Both the distinctive features of prognostic studies and the risk of downgrading repeatedly for the same problem should be taken into account in the GRADE process. The application of GRADE to systematic reviews of prognostic studies is likely to become widely accepted as the methodology and quality of such reviews improve.
Objective To investigate the recommendations on imaging diagnosis in Chinese clinical practice guidelines (CPGs). Methods We electronically searched the WanFang Data, VIP, CNKI, and CBM databases from inception to December 31, 2014. Two reviewers independently screened the literature and extracted data. Bibliometric methods were used to analyze the data (including basic characteristics, strength of recommendation, and quality of evidence). Results A total of 341 CPGs that formulated recommendations on diagnosis were included. Of these, 48.7% (166/341) made recommendations on imaging diagnosis (534 recommendations in total). Among the recommendations, 25.7% (137/534) were marked with symbols for quality of evidence and strength of recommendation, and 18.9% (101/534) used specific words such as "recommend" or "suggest". The strength of recommendation was reported for 22.3% (119/534) of recommendations, of which 38.7% (46/119) were strong and 16.0% (19/119) were weak. However, 23.9% (11/46) of the strong recommendations were based on low-quality evidence, and 42.1% (8/19) of the weak recommendations were based on high-quality evidence. Conclusion Among Chinese CPGs that formulate recommendations on diagnosis, about 50% include recommendations on imaging, and the number has increased over the years. The proportion of imaging recommendations that report the strength of recommendation and/or the quality of evidence is low. Meanwhile, the rating systems are uniform. In addition, guideline developers do not explain strong recommendations based on low-quality evidence or weak recommendations based on high-quality evidence.