Mobilising Evidence for Good Governance
Annex B. Mapping of Existing Standards of Evidence across a Range of Jurisdictions
Table B.1. Mapping of Standards of Evidence
| # | Country | Organisation | Framework | URL | URL for evidence standard |
|---|---|---|---|---|---|
| 1 | Australia | Be You | The Be You Programs Directory | https://beyou.edu.au/resources/tools-and-guides/about-programs-directory | |
| 2 | Australia | ARACY (Australian Research Alliance for Children and Youth) | What Works for Kids (WW4K) | | |
| 3 | Canada | Public Health Agency of Canada | Canadian Best Practices Portal | http://cbpp-pcpe.phac-aspc.gc.ca/resources/evidence-informed-decision-making/ | |
| 4 | Canada | McMaster University | Health Evidence | https://www.healthevidence.org/search.aspx | https://www.healthevidence.org/documents/our-appraisal-tools/quality-assessment-tool-dictionary-en.pdf |
| 5 | EU | European Commission | EU-Compass for Action on Mental Health and Well-being | https://ec.europa.eu/health/non_communicable_diseases/mental_health/eu_compass_en | |
| 6 | EU | European Platform for Investing in Children | Evidence Based Practices | https://ec.europa.eu/social/main.jsp?catId=1246&intPageId=4286&langId=en | |
| 7 | EU | EMCDDA (European Monitoring Centre for Drugs and Drug Addiction) | European drug prevention quality standards | http://www.emcdda.europa.eu/system/files/publications/646/TD3111250ENC_318193.pdf | |
| 8 | EU | EMCDDA (European Monitoring Centre for Drugs and Drug Addiction) | Best practice portal | http://www.emcdda.europa.eu/best-practice_en | http://www.emcdda.europa.eu/best-practice/evidence/about |
| 9 | Germany | Crime Prevention Council of Lower Saxony | Green List Prevention | | |
| 10 | New Zealand | SUPERU | An Evidence Rating Scale for New Zealand | https://www.superu.govt.nz/sites/default/files/Publications/Evidence%20Rating%20Scale.pdf | |
| 11 | New Zealand | Education Counts | Best Evidence Synthesis Iteration | | |
| 12 | Spain | Prevención basada en la evidencia | Criterios de selección de programas (programme selection criteria) | http://www.prevencionbasadaenlaevidencia.net/index.php?page=Criterios | |
| 13 | UK | Darlington Service Design Lab | Standards of evidence | | |
| 14 | UK | Project Oracle | Standards of evidence | https://project-oracle.com/uploads/files/Validation_Guidebook.pdf | |
| 15 | UK | Early Intervention Foundation | The Guidebook | | |
| 16 | UK | Nesta | Standards of evidence | http://www.alliance4usefulevidence.org/assets/What-Counts-as-Good-Evidence-WEB.pdf | |
| 17 | UK | Bond | Evidence Principles | https://www.bond.org.uk/ngo-support/evidence-principles-download | |
| 18 | UK | Centre for Analysis of Youth Transitions (CAYT) | Standards of evidence (CAYT) | http://cayt.mentor-adepis.org/wp-content/uploads/2017/06/CAYT-Scoring-Application-Form-2017-FINAL.pdf | |
| 19 | UK | What Works Centre for Local Economic Growth | The Maryland Scientific Methods Scale (SMS) | https://whatworksgrowth.org/public/files/Methodology/16-06-28_Scoring_Guide.pdf | |
| 20 | UK | Big Lottery Fund's Realising Ambition Programme | The confidence review | | |
| 21 | UK | Education Endowment Foundation | Teaching and Learning Toolkit | https://educationendowmentfoundation.org.uk/public/files/Toolkit/Toolkit_Manual_2018.pdf | |
| 22 | UK | HACT (Ideas and Innovation in Housing) | Standards for producing evidence | https://www.hact.org.uk/sites/default/files/StEv2-1-2016%20Effectiveness-Specification.pdf | |
| 23 | UK | Conservation Evidence | What Works in Conservation | | |
| 24 | UK | What Works Centre for Children’s Social Care | Evidence Standards | https://wwc-evidence.herokuapp.com/pages/our-ratings-explained | |
| 25 | UK | What Works Centre for Wellbeing | GRADE (Grading of Recommendations Assessment, Development and Evaluation) | https://whatworkswellbeing.org/ | https://whatworkswellbeing.org/product/a-guide-to-our-evidence-review-methods/ |
| 26 | UK | What Works Centre for Crime Reduction | EMMIE Framework | https://whatworks.college.police.uk/toolkit/Pages/About_the_CRT.aspx | https://whatworks.college.police.uk/toolkit/Pages/Quality-Scale.aspx |
| 27 | International | Campbell Collaboration | Campbell Collaboration Systematic Reviews: Policies and Guidelines | | |
| 28 | USA | Community Preventive Services Task Force (CPSTF) | The Community Guide | | |
| 29 | USA | U.S. Department of Health & Human Services | Home Visiting Evidence of Effectiveness | | |
| 30 | USA | What Works Clearinghouse | Find What Works from Systematic Reviews | | |
| 31 | USA | Center for the Study and Prevention of Violence | Blueprints | https://www.blueprintsprograms.org/resources/Blueprints_Standards_full.pdf | |
| 32 | USA | California Department of Social Services | California Evidence-Based Clearinghouse for Child Welfare | | |
| 33 | USA | Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education | Best Evidence Encyclopedia | | |
| 34 | USA | Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education | Evidence for ESSA (Every Student Succeeds Act) | https://content.evidenceforessa.org/sites/default/files/On%20clean%20Word%20doc.pdf | |
| 35 | USA | Society for Prevention Research | Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research in Prevention Science | http://www.preventionresearch.org/wp-content/uploads/2011/12/Standards-of-Evidence_2015.pdf | |
| 36 | USA | U.S. Department of Health and Human Services | Evidence Based Teen Pregnancy Programs | https://tppevidencereview.aspe.hhs.gov/pdfs/TPPER_Review%20Protocol_v5.pdf | |
| 37 | USA | National Institute of Justice | CrimeSolutions | | |
| 38 | USA | Arnold Ventures | Social Programs That Work | | |
| 39 | USA | Child Trends | What Works for Child and Youth Development | | |
| 40 | USA | Washington State Institute for Public Policy (WSIPP) | WSIPP Benefit-Cost Results | http://www.wsipp.wa.gov/TechnicalDocumentation/WsippBenefitCostTechnicalDocumentation.pdf | |
| 41 | USA | U.S. Department of Justice | Office of Juvenile Justice and Delinquency Prevention (OJJDP) Model Programs Guide | | |
| 42 | USA | University of Wisconsin Population Health Institute | What Works for Health | | |
| 43 | USA | U.S. Department of Health and Human Services | Agency for Healthcare Research and Quality (AHRQ) | | |
| 44 | USA | U.S. Department of Labor | Clearinghouse for Labor Evaluation and Research (CLEAR) | | |
| 45 | USA | Clearinghouse for Military Family Readiness | Continuum of evidence | https://lion.militaryfamilies.psu.edu/programs/find-programs | https://militaryfamilies.psu.edu/wp-content/uploads/2017/08/continuum.pdf |
| 46 | USA | Suicide Prevention Resource Center | Evidence-Based Practices Project | http://www.sprc.org/sites/default/files/ebpp_proj_descrip%20revised.pdf | |
| 47 | USA | National Implementation Research Network | The Hexagon: An Exploration Tool | | |
| 48 | USA | National Dropout Prevention Center | Model Programs Database | | |
| 49 | USA | National Cancer Institute | Research-Tested Intervention Programs (RTIPs) | | |
| 50 | USA | Strengthening Families Evidence Review | Standards of evidence | | |
Table B.2. Type of Evidence Assessed and Key Standards of Evidence by Approach
Note: The "Type of evidence assessed" column marks with an X how many of the four types of evidence a framework assesses (impact evaluation, systematic review, quantitative methods, qualitative methods). "Cost information" and "Cost-benefit evaluation" are the Cost sub-dimensions; "Requirements", "Intervention readiness", "System readiness" and "Experiences" are the Implementation sub-dimensions. Blank cells were empty in the source.

| # | Country | Organisation | Framework | Type of evidence assessed | Theory of change / Logic model | Design and development | Efficacy | Effectiveness | Cost information | Cost-benefit evaluation | Requirements | Intervention readiness | System readiness | Experiences |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Australia | Be You | The Be You Programs Directory | X X X | YES | YES | YES | NO | YES | NO | YES | NO | YES | NO |
| 2 | Australia | ARACY (Australian Research Alliance for Children and Youth) | What Works for Kids (WW4K) | X X | NO | YES | YES | YES | YES | YES | YES | NO | NO | NO |
| 3 | Canada | Public Health Agency of Canada | Canadian Best Practices Portal | X | NO | YES | YES | YES | NO | NO | YES | NO | YES | NO |
| 4 | Canada | McMaster University | Health Evidence | X X X X | NO | NO | YES | YES | YES | YES | NO | NO | NO | NO |
| 5 | EU | European Commission | EU-Compass for Action on Mental Health and Well-being | X | YES | NO | YES | YES | YES | YES | YES | YES | YES | NO |
| 6 | EU | European Platform for Investing in Children | Evidence Based Practices | X X | NO | YES | YES | YES | YES | YES | YES | NO | NO | NO |
| 7 | EU | EMCDDA (European Monitoring Centre for Drugs and Drug Addiction) | European drug prevention quality standards | X X X | YES | YES | YES | YES | YES | YES | YES | YES | NO | NO |
| 8 | EU | EMCDDA (European Monitoring Centre for Drugs and Drug Addiction) | Best practice portal | X X X X | NO | NO | YES | YES | NO | NO | NO | NO | NO | YES |
| 9 | Germany | Crime Prevention Council of Lower Saxony | Green List Prevention | X | YES | YES | YES | YES | YES | NO | YES | NO | NO | NO |
| 10 | New Zealand | SUPERU | An Evidence Rating Scale for New Zealand | X X | YES | YES | YES | YES | YES | YES | YES | NO | NO | YES |
| 11 | New Zealand | Education Counts | Best Evidence Synthesis Iteration | X X X X | YES | YES | YES | YES | NO | YES | YES | NO | NO | YES |
| 12 | Spain | Prevención basada en la evidencia | Criterios de selección de programas | X X | YES | NO | YES | YES | YES | NO | NO | YES | | |
| 13 | UK | Darlington Service Design Lab | Standards of evidence | X X | YES | YES | YES | YES | YES | NO | YES | YES | YES | NO |
| 14 | UK | Project Oracle | Standards of evidence | X X X | YES | YES | YES | YES | NO | YES | NO | YES | NO | NO |
| 15 | UK | Early Intervention Foundation | The Guidebook | X | YES | YES | YES | YES | YES | NO | YES | NO | NO | NO |
| 16 | UK | Nesta | Standards of evidence | X X X | YES | YES | YES | YES | YES | NO | YES | YES | NO | NO |
| 17 | UK | Bond | Evidence Principles | X X X | YES | NO | YES | NO | NO | NO | NO | NO | YES | YES |
| 18 | UK | Centre for Analysis of Youth Transitions (CAYT) | Standards of evidence (CAYT) | X X X | NO | YES | YES | YES | YES | NO | NO | NO | NO | NO |
| 19 | UK | What Works Centre for Local Economic Growth | The Maryland Scientific Methods Scale (SMS) | X X | NO | YES | YES | NO | NO | NO | NO | NO | NO | NO |
| 20 | UK | Big Lottery Fund's Realising Ambition Programme | The confidence review | X X X | YES | YES | NO | NO | YES | YES | YES | YES | YES | NO |
| 21 | UK | Education Endowment Foundation | Teaching and Learning Toolkit | X X X | NO | YES | YES | NO | YES | YES | NO | NO | NO | NO |
| 22 | UK | HACT (Ideas and Innovation in Housing) | Standards for producing evidence | X X | YES | YES | YES | YES | NO | YES | YES | YES | YES | YES |
| 23 | UK | Conservation Evidence | What Works in Conservation | X X | NO | NO | YES | NO | NO | NO | NO | NO | NO | NO |
| 24 | UK | What Works Centre for Children’s Social Care | Evidence Standards | X X | NO | NO | YES | YES | YES | YES | YES | YES | NO | YES |
| 25 | UK | What Works Centre for Wellbeing | GRADE (Grading of Recommendations Assessment, Development and Evaluation) | X X X | YES | YES | YES | YES | YES | YES | NO | NO | NO | YES |
| 26 | UK | What Works Centre for Crime Reduction | EMMIE Framework | X | YES | NO | YES | YES | YES | YES | YES | NO | NO | YES |
| 27 | International | Campbell Collaboration | Campbell Collaboration Systematic Reviews: Policies and Guidelines | X X | YES | NO | YES | YES | YES | YES | YES | NO | NO | YES |
| 28 | USA | Community Preventive Services Task Force (CPSTF) | The Community Guide | X X | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
| 29 | USA | U.S. Department of Health & Human Services | Home Visiting Evidence of Effectiveness | X X | NO | NO | YES | YES | YES | NO | YES | NO | NO | YES |
| 30 | USA | What Works Clearinghouse | Find What Works from Systematic Reviews | X X | NO | YES | YES | NO | NO | NO | NO | NO | NO | NO |
| 31 | USA | Center for the Study and Prevention of Violence | Blueprints | X | YES | NO | YES | YES | NO | NO | YES | YES | NO | YES |
| 32 | USA | California Department of Social Services | California Evidence-Based Clearinghouse for Child Welfare | X X | NO | NO | YES | YES | NO | NO | YES | YES | YES | YES |
| 33 | USA | Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education | Best Evidence Encyclopedia | X | NO | NO | YES | NO | NO | NO | YES | NO | NO | NO |
| 34 | USA | Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education | Evidence for ESSA (Every Student Succeeds Act) | X X | NO | YES | YES | NO | YES | NO | YES | NO | NO | NO |
| 35 | USA | Society for Prevention Research | Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research in Prevention Science | X | YES | NO | YES | YES | YES | YES | YES | YES | YES | YES |
| 36 | USA | U.S. Department of Health and Human Services | Evidence Based Teen Pregnancy Programs | X | NO | NO | YES | NO | NO | NO | YES | YES | NO | NO |
| 37 | USA | National Institute of Justice | CrimeSolutions | X X | YES | YES | YES | YES | NO | NO | YES | NO | NO | NO |
| 38 | USA | Arnold Ventures | Social Programs That Work | X X | NO | YES | YES | YES | NO | NO | NO | NO | NO | NO |
| 39 | USA | Child Trends | What Works for Child and Youth Development | X | NO | NO | YES | YES | NO | NO | NO | NO | NO | NO |
| 40 | USA | Washington State Institute for Public Policy (WSIPP) | WSIPP Benefit-Cost Results | X | NO | NO | YES | YES | YES | YES | NO | NO | NO | NO |
| 41 | USA | U.S. Department of Justice | Office of Juvenile Justice and Delinquency Prevention (OJJDP) Model Programs Guide | X X | YES | NO | YES | YES | YES | NO | YES | NO | YES | NO |
| 42 | USA | University of Wisconsin Population Health Institute | What Works for Health | X X | NO | YES | YES | NO | NO | NO | NO | NO | NO | NO |
| 43 | USA | U.S. Department of Health and Human Services | Agency for Healthcare Research and Quality (AHRQ) | X X X | NO | YES | YES | NO | YES | NO | YES | NO | NO | YES |
| 44 | USA | U.S. Department of Labor | Clearinghouse for Labor Evaluation and Research (CLEAR) | X X | NO | YES | YES | NO | YES | NO | YES | NO | NO | YES |
| 45 | USA | Clearinghouse for Military Family Readiness | Continuum of evidence | X X | NO | YES | YES | YES | YES | NO | YES | NO | NO | NO |
| 46 | USA | Suicide Prevention Resource Center | Evidence-Based Practices Project | X X | YES | NO | YES | NO | YES | NO | YES | YES | NO | NO |
| 47 | USA | National Implementation Research Network | The Hexagon: An Exploration Tool | X X | YES | YES | YES | YES | YES | NO | YES | YES | YES | NO |
| 48 | USA | National Dropout Prevention Center | Model Programs Database | X X | NO | YES | YES | YES | YES | NO | YES | NO | NO | NO |
| 49 | USA | National Cancer Institute | Research-Tested Intervention Programs (RTIPs) | X X | YES | YES | YES | NO | YES | YES | YES | YES | NO | NO |
| 50 | USA | Strengthening Families Evidence Review | Standards of evidence | X X | NO | NO | YES | NO | NO | NO | NO | NO | NO | YES |
Table B.3. Rating and Ranking of Quality of Evidence by Approach
# |
Organisation |
Framework |
Use of an assessment approach or scale |
---|---|---|---|
1 |
Be you |
The Be You Programs Directory |
No. Programs must: align with one or more of the five professional learning domains (Mentally Healthy Communities, Family Partnerships, Learning Resilience, Early Support, Responding Together) align with the Australian Curriculum or National Quality Framework be supported by a training/delivery/implementation manual or guide be offered as more than a one-off session (i.e., offer multiple, sequential sessions which, either as a set series of sessions or on an as-needs basis) be targeted at one of the following audiences as the intended beneficiary, for example: children, young people, parents, carers or families; early childhood educators, Out of Hours School Care have at least one research or evaluation study which demonstrates: a positive impact on mental health outcomes for children or young people a minimum of 20 participants in the study who received the program at least pre and post testing conducted on the group that received the program. |
2 |
ARACY Australian Research Alliance for Children and Youth |
Nest What Works for Kids (WW4K) |
Yes. Well supported
Supported
Promising
Emerging
◦the results of rigorous studies are not yet available. |
3 |
Public Health Agency of Canada |
Canadian Best Practices Portal |
No. Promising Practices: A Promising Practice is defined as an intervention, program, service, or strategy that shows potential (or “promise”) for developing into a best practice. Promising practices are often in the earlier stages of implementation, and as such, do not show the high level of impact, adaptability, and quality of evidence as best practices. However, their potential is based on a strong theoretical underpinning to the intervention. Aboriginal Ways Tried and True: Aboriginal ‘Ways Tried and True’ (WTT) refers to successful practices implemented in First Nations, Inuit, and Métis contexts to address local challenges. Success is measured not only by effectiveness, but also by how the intervention was designed and carried out. Interventions are intended to inspire and support public health practitioners, program developers, evaluators, and others by sharing information on programs and processes that have worked in Aboriginal contexts. Best Practices: A Best Practice is defined as an intervention, program, or initiative that has, through multiple implementations, demonstrated: high impact (positive changes related to the desired goals), high adaptability (successful adaptation and transferability to different settings), and high quality of evidence (excellent quality of research/evaluation methodology, confirming the intervention’s high impact and adaptability evidence). |
4 |
McMaster University |
Health Evidence |
Yes. Strong: Reviews with a score of 8 or higher in the Yes column Moderate: Reviews with a score between 5-7 in the Yes column Weak: Reviews with a score of 4 or less in the Yes column |
5 |
European Commission |
EU-Compass for Action on Mental Health and Well-being |
Yes, although not numbered. Detailed criteria around three issues: Exclusion Criteria assess the following aspects: Relevance Intervention Characteristics Evidence and Theory base Ethical aspects Core criteria assess the following aspects: Effectiveness Efficiency Equity Qualifier criteria assess the following aspects: Transferability Sustainability Participation Intersectoral collaboration |
6 |
European Platform for Investing in Children |
Evidence Based Practices |
Yes. Criteria to determine the evidence level are organised according to three categories:
Comparison group + Evaluation utilises at the minimum pre/post design with appropriate statistical adjustments employed in order to control for selection + +: Study design uses a convincing comparison group to identify practice impacts, including randomised-control trial (experimental design) or some quasi-experimental designs Statistical significance + Significant (p<0.1), positive results are shown on at least one relevant outcome + + Significant (p<0.05), positive results are shown on at least one relevant outcome Effect size + No requirement + + Effect size of at least 10% of a standard deviation. Sample size + Sample size of at least 20 in each group + + Sample size of at least 50 in each group Outcomes + Outcomes are directly or indirectly related to outcomes identified in topic definitions + No significant negative outcomes reported (excluding those negative outcomes that might be due to chance) + + No significant negative outcomes reported (excluding those negative outcomes that might be due to chance) + + Outcomes are directly related to outcomes identified in topic definitions + + Outcome assessments have been validated, where applicable + + Outcome assessments conducted at baseline and follow-up, where applicable Attrition + No requirement + + Attrition is less than 25% or has been accounted for using an acceptable procedure, where applicable Location: At least one evaluation that meets the above criteria must have been conducted within EU member state(s)
Replication + Practice has been evaluated in at least one additional population beyond the original study population** (broadly defined) in such a way that at least meets the basic criteria for internal validity as specified in the evidence of effectiveness criteria (e.g. significant positive results for at least one outcome are found, uses a comparison group, etc.) + +Same requirements as for + but in addition the practice has been found to be cost-effective/cost-beneficial (i.e. the practice can deliver positive impact at a reasonable cost) Practice materials: Practice materials (curriculum, etc.) are available, or documentation is sufficient, such that program can be replicated
Follow-up conducted An evaluation of the practice which meets the basic criteria for inclusion has conducted a follow-up of at least 2 years, and continues to find positive (p<0.1) and direct impact on at least one outcome Evidence-based practices on this site are assigned one of three evidence levels: •Emergent Practice: An “emergent practice” has achieved at least a + in “evidence of effectiveness.” •Promising Practice: A “promising practice” has achieved at least a + in “evidence of effectiveness” and a + in at least one of the other two categories, “transferability” and “enduring impact.” •Best Practice: A “best practice” has achieved at least a + in each of the three evidence categories, including “evidence of effectiveness”, “transferability” and “enduring impact.” |
7 |
EMCDDA European Monitoring Centre for Drugs and Drug Addiction |
European drug prevention quality standards |
No, it is an eight-stage project cycle with cross-cutting considerations. Organised in an eight-stage project cycle, the Standards cover the following areas: Stage 1: Needs assessment Stage 2: Resource assessment Stage 3: Programme formulation Stage 4: Intervention design Stage 5: Management and mobilisation of resources Stage 6: Delivery and monitoring Stage 7: Final evaluations Stage 8: Dissemination and improvement Cross-cutting considerations are relevant for each project stage and are therefore placed in the centre of the project cycle. These Standards relate to: (A) sustainability and funding, (B) communication and stakeholder involvement, (C) staff development, (D) the ethics of drug prevention. |
8 |
EMCDDA European Monitoring Centre for Drugs and Drug Addiction |
Best practice portal |
Yes, Evidence ratings The available information on the effects of specific interventions are examined and then ranked them as described below. Beneficial: Interventions for which precise measures of the effects in favour of the intervention were found in the systematic reviews of randomised controlled trials (RCTs), and that were recommended in guidelines with reliable methods for assessing evidence (such as GRADE*). An intervention ranked as ‘beneficial’ is suitable for most contexts. Likely to be beneficial: Interventions that were shown to have limited measures of effect, that are likely to be effective but for which evidence is limited, and/or those that are recommended with some caution in guidelines with reliable methods for assessing evidence (such as GRADE). An intervention ranked as ‘likely to be beneficial’ is suitable for most contexts, with some discretion. Trade-off between benefits and harms: Interventions that obtained measures of effects in favour of harm reduction and/or are recommended in guidelines with reliable methods for assessing evidence (such as GRADE), but that showed some limitations or unintended effects that need to be assessed before providing them. Unknown effectiveness: Interventions for which there are not enough studies or where available studies are of low quality (with few patients or with uncertain methodological rigour), making it difficult to assess if they are effective or not. Interventions for which more research should be undertaken are also grouped in this category. Evidence of ineffectiveness: Interventions that gave negative results if compared with a standard intervention, for example. Quality of evidence: High quality evidence— one or more up-to-date systematic reviews that include high-quality primary studies with consistent results. The evidence supports the use of the intervention within the context in which it was evaluated. Moderate quality evidence— one or more up-to-date reviews that include a number of primary studies of at least moderate quality with generally consistent results. The evidence suggests these interventions are likely to be useful in the context in which they have been evaluated but further evaluations are recommended. Low quality evidence— where there are some high or moderate quality primary studies but no reviews available OR there are reviews giving inconsistent results. The evidence is currently limited, but what there is shows promise. This suggests these interventions may be worth considering, particularly in the context of extending services to address new or unmet needs, but should be evaluated. |
9 |
Crime Prevention Council of Lower Saxony |
Green List Prevention |
Yes. Ratings of both programmes and evaluations. Programme ratings Level 1: Theoretically well grounded. Detailed criteria on the Conceptual Quality, Implementation Quality and Evaluation Quality Level 2: Probable Effectiveness Level1 and at least one evaluation study 1 to 3 stars with (predominantly) positive results. Level 3: Proven Effectiveness Level 1 and at least one evaluation study 4 or 5 stars with (predominantly) positive results and at least sufficient conclusiveness. Ratings of evaluations ***** Five Stars Randomized Controlled Trial (RCT) with follow-up (not less than 6 month, also below) **** Four Stars Quasi-Experimental Design (QED) with follow-up *** Three Stars RCT without follow-up, QED without follow-up. ** Two Stars “Clinical” RCT or QED with or without follow-up (not in routine context). Pre-post assessment with control-group(s) in routine context * One Star Benchmark / Norm-reference-study, Theory of Change – study No stars Participant-satisfaction assessment, Pre-post assessment without control-group, Goal-attainment study, Quality-assurance-study. |
10 |
SUPERU |
A Rating Scale for New Zealand |
Yes, there are two scales. The strength of evidence scale: Level 0 - a pilot of a new initiative. Level 1 - Intervention is in its early stages of implementation, or planned but not yet implemented. This intervention’s evidence base will be built over time. Level 2 - Typically, this intervention has been in operation for around one to three years. It has met all level 1 criteria and has been evaluated at least once. The evaluation indicates some effect, but it may not yet be possible to directly attribute outcomes to it. This intervention’s evidence base will continue to be built over time. Level 3 - Typically, this intervention has been in operation for around three to 10 years. It has an established design which is consistently implemented, and quality assurance procedures are in place. It has met all the level 2 criteria, plus it has at least one evaluation that provides evidence about impact. It also has some information available that will help with implementation in new contexts. Level 4 - Typically, this intervention has been in operation for around eight years or longer and is large scale or high risk, justifying extra evaluation effort. It has met all the level 3 criteria, plus it has been replicated at least once. It has been evaluated at least twice and the evaluations provide strong evidence about effectiveness and impact, insights into how the intervention causes change, what works well or less well for different participants, and cost-benefit. There is support for implementation in new contexts. The effectiveness scale: Beneficial Mixed effects No effect Harmful Not applicable |
11 |
Education counts |
Best Evidence Synthesis Iteration |
Yes, although not numbered. To evaluate the evidence, they consider:
|
12 |
Prevención basada en la evidencia |
Criterios de selección de programas |
Yes. (Original in Spanish)) ****Strong: Well-evaluated programs whose effect have been demonstrated through different studies. ***Moderate: Programs that having proven to be effective require more research to show that their effects maintain at long term. **Low: Programs whose effectiveness is not sufficiently demonstrated and it is necessary to investigate more about it to know the usefulness of the program. *Very Low/Not evidence: There is no evidence, or it is indirect, insufficient or contradictory according to the studies carried out with the program. |
13 |
Darlington Service Design Lab |
Standards of evidence |
Yes, although the scale is not numbered. Within each of the four dimensions there are sub-categories which rank the intervention's evidence as "good enough" or "best". 1) Evaluation Quality;• Have been subjected to an evaluation that compares outcomes for children receiving the intervention with children with the same needs who do not receive the intervention;• Ideally, have been independently evaluated using a well–executed randomised controlled trial. 2) Impact;
3) Intervention Specificity;
4) System Readiness
|
14 |
Project Oracle |
Standards of evidence |
Yes. Standard 1: We know what we want to achieve- Theory of Change and Evaluation Plan Standard 2: We have seen there is a change - Indication of impact Standard 3: We believe the change is caused by us - Evidence of impact Standard 4: We know why and how the change happened, this works elsewhere - Model ready Standard 5: We know why and how the change happened, this works everywhere - System ready |
15 |
Early Intervention Foundation |
The Guidebook |
Yes.
|
16 |
Nesta |
Standards of evidence |
Yes. A 1 to 5 scale. 1. You can give an account of impact. 2. You are gathering data that shows some change amongst those using or receiving your intervention. 3. You can demonstrate that your intervention is causing the impact, by showing less impact amongst those who don’t receive the product/service. 4. You can explain why and how your intervention is having the impact that you have observed and evidenced so far. An independent evaluation validates the impact. In addition, the intervention can deliver impact at a reasonable cost, suggesting that it could be replicated and purchased in multiple locations. 5. You can show that your intervention could be operated by someone else, somewhere else, whilst continuing to have positive and direct impact on the outcome, and whilst remaining a financially viable proposition. |
17 |
Bond |
Evidence Principles |
Yes. Principles 1. Voice and inclusion: the perspectives of people living in poverty, including the most marginalised, are included in the evidence, and a clear picture is provided of who is affected and how: 2. Appropriateness: the evidence is generated through methods that are justifiable given the nature of the purpose of the enquiry 3. Triangulation: the evidence has been generated using a mix of methods, data sources, and perspectives. 4. Contribution: the evidence explores how change happens, the contribution of the intervention and factors outside the intervention in explaining change: 5. and Transparency: the evidence discloses the details of the data sources and methods used, the results achieved, and any limitations in the data or conclusions. Each of the five principles has four questions and each question can be answered on a scale of 1-4. Scores for each of the questions are then added up and an overall score for the principles out of 16 is provided. Depending on the score, the principle is then assigned to a scale: 1) weak, 2) minimum standard, 3) good standard 4) gold standard. |
18 |
Centre for Analysis of Youth Transitions (CAYT) |
Standards of evidence |
Yes. Assessing impact grades (Score 0-4), they consider: a) Reach: the extent to which the programme attracts its intended audience and b) Significance: the effect that the programme is having on young people to influence health and wellbeing. Level of evidence grades (Score 0-7) 0. Basic 1. Descriptive, anecdotal, expert opinion 2. Study where a statistical relationship (correlation) between the outcome and receiving services is established 3. Study which accounts for when the services were delivered by surveying before and after 4. Study where there is both a before and after evaluation strategy and a clear comparison between groups who do and do not receive the youth services 5. As above but in addition includes statistical modelling to produce better comparison groups and of outcomes to allow for other differences across groups 6. Study where intervention is provided on the basis of individuals being randomly assigned to either 7. the treatment or the control group. 8. Various studies that evaluate an intervention which has been provided through random allocation at the individual level. Overall Programme Performance (Score 0-4) |
19 |
What Works Centre for Local Economic Growth |
The Maryland Scientific Methods Scale (SMS) |
Yes. Level 1: Either (a) a cross-sectional comparison of treated groups with untreated groups, or (b) a before-and-after comparison of treated group, without an untreated comparison group. No use of control variables in statistical analysis to adjust for differences between treated and untreated groups or periods. Level 2: Use of adequate control variables and either (a) a cross-sectional comparison of treated groups with untreated groups, or (b) a before-and-after comparison of treated group, without an untreated comparison group. In (a), control variables or matching techniques used to account for cross-sectional differences between treated and control groups. In (b), control variables are used to account for before-and-after changes in macro-level factors. Level 3: Comparison of outcomes in treated group after an intervention, with outcomes in the treated group before the intervention, and a comparison group used to provide a counterfactual (e.g. difference in difference). Justification given to choice of comparator group that is argued to be similar to the treatment group. Evidence presented on comparability of treatment and control groups. Techniques such as regression and (propensity score matching may be used to adjust for difference between treated and untreated groups, but there are likely to be important unobserved differences remaining. Level 4: Quasi-randomness in treatment is exploited, so that it can be credibly held that treatment and control groups differ only in their exposure to the random allocation of treatment. This often entails the use of an instrument or discontinuity in treatment, the suitability of which should be adequately demonstrated and defended. Level 5: Reserved for research designs that involve explicit randomisation into treatment and control groups, with Randomised Control Trials (RCTs) providing the definitive example. Extensive evidence provided on comparability of treatment and control groups, showing no significant differences in terms of levels or trends. Control variables may be used to adjust for treatment and control group differences, but this adjustment should not have a large impact on the main results. Attention paid to problems of selective attrition from randomly assigned groups, which is shown to be of negligible importance. There should be limited or, ideally, no occurrence of ‘contamination’ of the control group with the treatment. |
20 |
Big Lottery Fund's Realising Ambition Programme |
The confidence review |
No. "The Confidence Framework addresses the five dimensions that Realising Ambition assessed as being essential for effective replication – service design, service delivery, ability to monitor impact, ability to determine benefit and the prospects for sustainability"' |
21 |
Education Endowment Foundation |
Teaching and Learning Toolkit |
Yes. Security of evidence criteria: In term of : Quantity and type of study; Outcomes,; Causal inference; Consistency requirements; Effect Size requirements (from Four padlocks) Ranking 1. Very limited: One padlock: Single studies with quantitative evidence of impact with effect size data reported or calculable (such as from randomised controlled trials, well-matched experimental designs, regression discontinuity designs, natural experiments with appropriate analysis); and/or observational studies with correlational estimates of effect related to the intervention or approach; but no publically available meta-analyses. 2. Limited: Two padlocks: At least one publically available meta-analysis 3. Moderate: Three padlocks: Two or more publically available meta-analyses which meet the following criteria: they have explicit inclusion and search criteria, risk of bias discussed, and tests for heterogeneity reported. They include some exploration of methodological features such as research design effects or sample size. 4. Extensive: Four padlocks: Three or more meta-analyses which meet the following criteria: they have explicit inclusion and search criteria, risk of bias discussed, and tests for heterogeneity reported. They include some exploration of the influence of methodological features such as research design effects or sample size on effect size. The majority of included studies should be from school or other usual settings. 5. Very Extensive: Five padlocks: Three or more meta-analyses which meet the following criteria: They have explicit inclusion and search criteria, risk of bias discussed, and tests for heterogeneity reported. They include some exploration of the influence of methodological features such as research design effects or sample size on effect size. The majority of included studies should be from school or other usual settings. |
22 |
HACT-Ideas and Innovation in Housing |
Standards of evidence |
No, evidence should be assessed in a seven-step process. 1) Describe; 2) Design; 3) Proceed; 4) Plan; 5) Protocol; 6) Study; 7) Findings They establish the Purpose, limitations and intended usage evidence at different levels (Standard for Producing Evidence – Effectiveness of Interventions –Part 1: Specification, page 18) Level 1:Exploration and Development Level 2: Effectiveness Level 3: Scaling-up |
23 |
Conservation evidence |
What Works in Conservation |
Yes, although not numbered 1. Experts are asked to read the summarized evidence in the synopsis and then score to indicate their assessment of the following: 2. Effectiveness: 0 = no effect, 100% = always effective. 3. Certainty of the evidence: 0 = no evidence, 100% = high quality evidence; complete certainty. This is certainty of effectiveness of intervention, not of harms. 4. Harms: 0 = none, 100% = major negative side-effects to the group of species/habitat of concern. 5. The median score from all the experts’ assessments is calculated for the effectiveness, certainty and harms for each intervention. 6. Effectiveness categorization is based on these median values (i.e. on a combination of the size of the benefit and harm and the strength of the evidence), as and listed as follow: a) Beneficial b) Likely to be beneficial c) Trade-offs between benefits & harms d) Unknown effectiveness e) Unlikely to be beneficial f) f. Likely to be ineffective or harmful |
24 |
What Works Centre for Children’s Social Care |
Evidence Standards |
Yes. Overall effectiveness: looking at the consistency of effect across different research studies Negative effect: The balance of evidence suggests that the intervention has a negative effect (meta-analysis OR most of the studies) Mixed or no effect: The balance of evidence (including the pooled effect size from meta-analysis where available) suggests that the intervention has no effect overall, or studies show a mixture of effects. Tends to positive effect: The balance of evidence suggests that the intervention has a positive effect. There are one or more studies showing a negative effect, but either there was a meta-analysis OR most of the studies that showed a positive effect. Consistently positive effect: Most published studies have positive effects and none have negative effects for this outcome. Some individual studies may show no effect. However, either the pooled effect (in a meta-analysis) or most studies AND the studies involving most of the participants have a positive effect. Strength of evidence: looking at how confident we can be about a finding, based on how the research was designed and carried out. They overall framework is provided by the EMMIE system. They have adapted the EMMIE-Q to provide a four-point rating for strength of evidence. 0. Very low strength evidence: No acceptable quality studies 1. Low strength evidence: One or two acceptable quality studies 2. Moderate strength evidence: Three or more acceptable quality studies. High quality review therefore possible. Between 0-3 EMMIE-Q requirements are met. 3. High strength evidence: Three or more acceptable quality studies. High quality review therefore possible. Between 4-6 EMMIE-Q requirements are met including all themes marked* (see below). An acceptable quality study must have the following characteristics (definition used by the Early Intervention Foundation-EIF): 1. The sample is sufficiently large to test for the desired impact (e.g. a minimum of 20 participants in the treatment group AND comparison group). 2. The study must use valid measures. These measures should reliable, standardised and validated independently of the study. 3. Comparability of groups is addressed in selection and/ or analysis. 4. An ‘intent-to-treat’ design is used. 5. The study should report on overall and differential attrition. EMMIE-Q requirements 1. A transparent and well-designed search strategy* 2. High statistical conclusion validity (at least four of the following are necessary for a study to be considered sufficient)* (a) Calculation of appropriate effect sizes (b) The analysis of heterogeneity (c) Use of a random effects model where appropriate (d) Attention to the issue of dependency (e) Appropriate weighting of individual effect sizes in the calculation of mean effect sizes 3. Sufficient assessment of the risk of bias 4. Attention to the validity of the constructs, with only comparable outcomes combined and/or exploration of the implications of combining outcome constructs* 5. Assessment of the influence of study design (e.g. separate overall effect sizes for experimental and quasi-experimental design) 6. Assessment of the influence of unanticipated outcomes or spin-offs on the size of the effect (e.g. quantification of displacement or diffusion of benefit) Requirements 1-4 (highlighted by *) are considered particularly important, and are required for any review to achieve a rating of 3, which is the highest rating in the scale. |
25 |
What Works Centre for Wellbeing |
GRADE (Grading of Recommendations Assessment, Development and Evaluation) |
Yes. High quality: Further research is very unlikely to change our confidence in the estimate of effect Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate Very low quality: Any estimate of effect is very uncertain |
26 |
What Works Centre for Crime Reduction |
EMMIE Framework |
Yes. EMMIE rates systematic review evidence against five dimensions: Effect: focuses on whether the evidence suggests the intervention led to an increase, decrease or had no impact on crime.; Mechanisms: focuses on what it is about the intervention that could explain its effect; Moderators: focuses on the circumstances and contexts in which the intervention is likely (or unlikely) to work ; Implementation: focuses on the conditions that should be considered when implementing the intervention ; Economic cost: focuses on the costs associated with the intervention, both direct and indirect, and whether there is any evidence of cost-benefit. |
27 |
Campbell Collaboration |
Campbell Collaboration Systematic Reviews: Policies and Guidelines |
No |
28 |
Community Preventive Services Task Force (CPSTF) |
The Community Guide |
Yes, although not numbered. The CPSTF uses the terms below to describe its findings. Recommended The systematic review of available studies provides strong or sufficient evidence that the intervention is effective. The categories of "strong" and "sufficient" evidence reflect the degree of confidence the CPSTF has that an intervention has beneficial effects. They do not directly relate to the expected magnitude of benefits. The categorization is based on several factors, such as study design, number of studies, and consistency of the effect across studies. Recommended Against The systematic review of available studies provides strong or sufficient evidence that the intervention is harmful or not effective. Insufficient Evidence The available studies do not provide sufficient evidence to determine if the intervention is, or is not, effective. This does NOT mean that the intervention does not work. It means that additional research is needed to determine whether or not the intervention is effective. |
29 |
U.S. Department of Health & Human Services |
Home Visiting Evidence of Effectiveness |
Yes. HomVEE assigns a rating of high, moderate, or low to each effectiveness study according to the quality of causal evidence it provides.
Assessing evidence of effectiveness To meet HHS’ criteria for an “evidence-based early childhood home visiting service delivery model,” models must meet at least one of the following criteria:
In both cases, the impacts considered must either (1) be found for the full sample or (2) if found for subgroups but not for the full sample, be replicated in the same domain in two or more studies using non-overlapping analytic study samples. For results from single-case designs to be considered toward the HHS criteria, three additional requirements must be met:
The HomVEE team examined and reported other aspects of the evidence for each model based on all high- and moderate-quality studies available, including the following:
|
30 |
What Works Clearinghouse |
Find What Works from Systematic Reviews |
Yes. a) The results are sorted by evidence of effectiveness: Effectiveness Rating Key: it is based on the quality of research, the statistical significance of findings, the magnitude of findings, and the consistency of findings across studies Positive: strong evidence that intervention had a positive effect on outcomes. Potentially Positive: evidence that intervention had a positive effect on outcomes with no overriding contrary evidence. Mixed: evidence that intervention’s effect on outcomes is inconsistent. No Discernible: no evidence that intervention had an effect on outcomes. Negative: strong evidence that intervention had a negative effect on outcomes b) The program also lets you compare interventions (Max. 5 interventions). It will allow you to see basic information for each intervention, such as grades examined, program type, delivery method, and the effectiveness rating. c) For single-case design research, the WWC rates the effectiveness of an intervention in each domain based on the quality of the research design and the consistency of demonstrated effects. |
31 |
Center for the Study and Prevention of Violence |
Blueprints |
Yes. Blueprints considers four criteria:
Program Criteria: Promising Programs meet the following standards:
Model Programs meet these additional standards:
Model Plus Programs meet one additional standard:
|
32 |
California Department of Social Services (CDSS) |
California Evidence-Based Clearinghouse for Child Welfare |
Yes. Scientific Rating Scale 1. Well-Supported by Research Evidence Multiple Site Replication and Follow-up 2. Supported by Research Evidence Randomized Controlled Trial and Follow-up: 3. Promising Research Evidence At least one study utilizing some form of control (e.g., untreated group, placebo group, matched wait list study) has been established 4. Evidence Fails to Demonstrate Effect Two or more randomized controlled trials (RCTs) have found the practice has not resulted in improved outcomes, when compared to usual care. The studies have been reported in published, peer-reviewed literature. 5. Concerning Practice If multiple outcome studies have been conducted, the overall weight of evidence suggests the intervention has a negative effect upon clients served; and/or NR. Not able to be Rated on the CEBC Scientific Rating Scale Measurement Tools Rating Scale: based on the level of psychometrics (e.g., sensitivity and specificity, reliability and validity) found in published, peer-reviewed journals A - Psychometrics Well-Demonstrated: 2 or more published, peer-reviewed studies have established the measure’s psychometrics. B - Psychometrics Demonstrated: 1 published, peer-reviewed study has established the measure’s psychometrics. C - Does Not Reach Acceptable Levels of Psychometrics: A preponderance of published, peer-reviewed studies have shown that the measure does not reach acceptable levels of psychometrics. NR - Not Able to Be Rated: Published peer-reviewed studies demonstrating the measure’s psychometrics are not available. *Sensitivity: a measure of how well a test identifies people with a specific disease or problem *Specificity: a measure of how well a test excludes people without a specific disease or problem *Reliability: the extent to which the same result will be achieved when repeating the same measure or study again *Validity: the degree to which a result is likely to be true and free of bias. |
33 |
Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education |
Best Evidence Encyclopedia |
Yes. Strong Evidence of Effectiveness: At least one large randomized or randomized quasi-experimental study and one additional large qualifying study, or multiple smaller studies, with a combined sample size of 500 and an overall weighted mean effect size of at least +0.20. Moderate Evidence of Effectiveness: Two large matched studies, or multiple smaller studies with a collective sample size of 500 students, with a weighted mean effect size of at least +0.20. Limited Evidence of Effectiveness: Strong Evidence of Modest Effects: Studies meet the criteria for “Moderate Evidence of Effectiveness” except that the weighted mean effect size is +0.10 to +0.19. Limited Evidence of Effectiveness: Weak Evidence with Notable Effect: A weighted mean effect size of at least +0.20 based on one or more qualifying studies insufficient in number or sample size to meet the criteria for “Moderate Evidence of Effectiveness”. |
34 |
Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education |
Evidence for ESSA (Every Student Succeeds Act ) |
Yes. The organization recognizes four levels of evidence. The top three levels require findings of a statistically significant effect on improving student outcomes or other relevant outcomes. Strong evidence: At least one well-designed and well-implemented experimental (i.e., randomized) study. Moderate evidence: At least one well-designed and well-implemented quasi-experimental (i.e., matched) study. Promising evidence: At least one well-designed and well-implemented correlational study with statistical controls for selection bias. The fourth level is a program or practice that does not yet have evidence qualifying for the top 3 levels, and can be considered evidence-building and under evaluation. |
35 |
Society for Prevention Research |
Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research in Prevention Science. |
Yes, although not numbered. Standards for Efficacy Standards for Effectiveness Standards for Scaling Up of Evidence-Based Interventions |
36 |
U.S. Department of Health and Human Services |
Evidence Based Teen Pregnancy Programs |
Yes. Study quality rating: In terms of study design, Attrition, Baseline equivalence, Reassignment, Confounding factors. 1. High 2. Moderate 3. Low All impact studies meeting the criteria for a high or moderate study quality rating are considered eligible for providing credible evidence of program impacts. The program’s evidence of effectiveness (by domain) is classified as 1. Positive impacts: Evidence of uniformly favourable impacts across one or more outcome measures, analytic samples (full sample or subgroups), and/or studies. 2. Mixed impacts: Evidence of a mix of favourable, null, and/or adverse impacts across one or more outcome measures, analytic samples (full sample or subgroups), and/or studies. 3. Indeterminate impacts: Evidence of uniformly null impacts across one or more outcome measures, analytic samples (full sample or subgroups), and/or studies. 4. Negative impacts: Evidence of uniformly adverse impacts across one or more outcome measures, analytic samples (full sample or subgroups), and/or studies. |
37 |
National Institute of Justice |
Crime solutions |
Yes. a) Programs undergo an eight-step review and evidence-rating process. b) Practices undergo a seven-step review and evidence-rating process c) Then they address program and practice evaluations in an evidence continuum with two axes: (1) Effectiveness and (2) Strength of Evidence. Effectiveness is determined by the outcomes of an evaluation in relation to the goals of the program or practice. Strength of evidence for programs is determined by the rigor and design of the outcome evaluation, and by the number of evaluations. Rated as Effective: Programs and practices have strong evidence to indicate they achieve criminal justice, juvenile justice, and victim services outcomes when implemented with fidelity. Rated as Promising: Programs and practices have some evidence to indicate they achieve criminal justice, juvenile justice, and victim services outcomes. Included within the promising category are new, or emerging, programs for which there is some evidence of effectiveness. Inconclusive Evidence: Programs and practices that made it past the initial review but, during the full review process, were determined to have inconclusive evidence for a rating to be assigned. Rated as No Effects: Programs have strong evidence indicating that they had no effects or had harmful effects when implemented with fidelity |
38 |
Arnold Ventures |
Social Programmes that Work |
Yes. Suggestive tier: Programs that have been evaluated in one or more well-conducted RCTs (or studies that closely approximate random assignment) and found to produce sizable positive effects, but whose evidence is limited by only short-term follow-up, effects that fall short of statistical significance, or other factors. Such evidence suggests the program may be an especially strong candidate for further research, but does not yet provide confidence that the program would produce important effects if implemented in new settings. Near top tier: Programs shown to meet almost all elements of the Top Tier standard, and which only need one additional step to qualify. This category primarily includes programs that meet all elements of the Top Tier standard in a single study site, but need a replication RCT to confirm the initial findings and establish that they generalize to other sites. This is best viewed as tentative evidence that the program would produce important effects if implemented faithfully in settings and populations similar to those in the original study. Top tier: Programs shown in well-conducted RCTs, carried out in typical community settings, to produce sizable, sustained effects on important outcomes. Top Tier evidence includes a requirement for replication – specifically, the demonstration of such effects in two or more RCTs conducted in different implementation sites, or, alternatively, in one large multi-site RCT (Is this equivalent to effectiveness?). Such evidence provides confidence that the program would produce important effects if implemented faithfully in settings and populations similar to those in the original studies. |
39 |
Childs Trends |
What Works for Child and Youth Development |
No, only Eligibility Criteria for Analysis |
40 |
Washington State Institute for Public Policy (WSIPP) |
Washington State Institute for Public Policy Benefit-Cost Results |
No |
41 |
U.S. Department of Justice |
Office of Juvenile Justice and Delinquency Prevention Model Programs Guide |
Yes. Based on the reviewers' assessment of the evidence, programs included in the Model Programs Guide and CrimeSolutions.gov receive one of the following evidence ratings:
Effective: programs have strong evidence indicating they achieve their intended outcomes when implemented with fidelity.
Promising: programs have some evidence indicating they achieve their intended outcomes. Additional research is recommended.
No Effects: programs have strong evidence indicating that they did not achieve their intended outcomes when implemented with fidelity.
A rating may be based on a single study or on more than one study: a single-study icon identifies programs that have been evaluated with only one study, while a multiple-studies icon represents a greater extent of evidence supporting the evidence rating. |
42 |
University of Wisconsin Population Health Institute |
What Works for Health |
Yes. Evidence ratings:
Scientifically Supported
Some Evidence
Expert Opinion
Insufficient Evidence
Mixed Evidence
Evidence of Ineffectiveness |
43 |
U.S. Department of Health and Human Services |
The Agency for Healthcare Research and Quality (AHRQ) |
Yes.
Strong: the evidence is based on one or more evaluations using experimental designs based on random allocation of individuals or groups of individuals. The results of the evaluation(s) show consistent direct evidence of effectiveness.
Moderate: while there are no randomized, controlled experiments, the evidence includes at least one systematic evaluation of the impact of the innovation using a quasi-experimental design, which could include the non-random assignment of individuals to comparison groups, before-and-after comparisons in one group, and/or comparisons with a historical baseline or control. The results of the evaluation(s) show consistent direct or indirect evidence of effectiveness. However, the strength of the evidence is limited by the size, quality, or generalizability of the evaluations, and thus alternative explanations cannot be ruled out.
Suggestive: while there are no systematic experimental or quasi-experimental evaluations, the evidence includes non-experimental or qualitative support for an association between the innovation and targeted health care outcomes or processes (or structures, in the case of health care policy innovations). This evidence may include non-comparative case studies, correlation analysis, or anecdotal reports. As with the category above, alternative explanations for the results achieved cannot be ruled out. |
44 |
U.S. Department of Labor |
Clearinghouse for Labor Evaluation and Research (CLEAR) |
Yes, although only for causal studies.
High Causal Evidence: there is strong evidence that the effects estimated in the study are solely attributable to the intervention being examined. This does not necessarily mean that the study found positive impacts, only that the analysis meets high methodological standards and the causal impacts estimated, whether positive, negative, or null, are credible. Currently, only well-implemented randomized controlled trials can receive this rating.
Moderate Causal Evidence: there is evidence that the effects estimated in the study are attributable at least in part to the intervention being examined. However, there may be other factors that were not accounted for in the study that might also have contributed. Causal studies that meet CLEAR evidence guidelines for non-experimental designs (including randomized controlled trials with high attrition) can receive this rating.
Low Causal Evidence: there is little evidence that the effects estimated in the study are attributable to the intervention being examined, and other factors are likely to have contributed to the results. This does not imply that the study's results are not useful for some purposes, but they should be interpreted with caution. Causal studies that do not meet the criteria for a high or moderate evidence rating receive this rating.
CLEAR presents separate guidelines for reviewing quantitative descriptive studies and implementation studies. A minimal decision sketch follows this entry. |
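The three causal ratings reduce to a short decision rule over study design. A minimal Python sketch, assuming two illustrative boolean attributes; CLEAR's actual guidelines involve detailed, design-specific criteria:

```python
# Illustrative sketch only: assigns a CLEAR-style causal evidence rating from
# two study attributes. Attribute names are assumptions made for illustration.

def causal_evidence_rating(is_well_implemented_rct: bool,
                           meets_nonexperimental_guidelines: bool) -> str:
    if is_well_implemented_rct:
        return "High Causal Evidence"
    # Includes RCTs with high attrition that still meet the guidelines
    # CLEAR applies to non-experimental designs.
    if meets_nonexperimental_guidelines:
        return "Moderate Causal Evidence"
    return "Low Causal Evidence"
```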
45 |
Clearinghouse for Military Family Readiness |
Continuum of evidence |
Yes. Criteria used to evaluate evidence: Significant Effect, Sustained Effect, Successful External Replication, Study Design, and Additional Criteria Regarding Study Execution.
Continuum of Evidence: 1. Effective 2. Promising 3. Unclear 4. Ineffective |
46 |
Suicide Prevention Resource Center |
Evidence-Based Practices Project |
Yes. Scoring criteria: reviewers rated the quality of program evaluations using 10 items, each scored on a scale of 1-5 or 0-5:
1. Theory 2. Intervention fidelity 3. Design 4. Attrition 5. Psychometric properties of measures 6. Analysis 7. Threats to validity 8. Safety 9. Integrity 10. Utility
Classification criteria: classification of programs as insufficient current support, promising, or effective is based solely on the average scores for two items, integrity and utility. After averaging the reviewers' scores for each item, the lower of the two averages determines the classification level (a minimal sketch of this rule follows this entry):
Insufficient current support: < 3.5
Promising: 3.5 - 3.9
Effective: 4.0 - 5.0 |
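The classification rule is fully arithmetic: average the reviewers' scores for integrity and utility, take the lower of the two averages, and compare it against the published cut-offs. A minimal Python sketch; the function and parameter names are illustrative assumptions, not part of the published standard:

```python
# Illustrative sketch only: applies the classification rule above, given
# lists of reviewer scores for the integrity and utility items.

def classify_program(integrity_scores: list[float],
                     utility_scores: list[float]) -> str:
    integrity_avg = sum(integrity_scores) / len(integrity_scores)
    utility_avg = sum(utility_scores) / len(utility_scores)
    # The lower of the two averaged items determines the classification.
    score = min(integrity_avg, utility_avg)
    if score >= 4.0:
        return "Effective"
    if score >= 3.5:
        return "Promising"
    return "Insufficient current support"

# Example: integrity averages 4.2, utility 3.6 -> min is 3.6 -> "Promising"
print(classify_program([4.0, 4.4], [3.4, 3.8]))
```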
47 |
National Implementation Research Network |
The Hexagon: An Exploration Tool |
Yes. The rating criteria are applied to each of the following indicators on a scale from 1 to 5, with 5 being the best.
Implementing site indicators:
Fit with current initiatives: 1. Alignment with community, regional, and state priorities. 2. Fit with family and community values, culture, and history. 3. Impact on other interventions and initiatives. 4. Alignment with organizational structure.
Need: 1. Target population identified. 2. Disaggregated data indicating population needs. 3. Parent and community perceptions of need. 4. Addresses service or system gaps.
Capacity to implement: 1. Staff meet minimum qualifications. 2. Able to sustain staffing, coaching, training, data systems, performance assessment, and administration (financial capacity, structural capacity, and cultural responsivity capacity). 3. Buy-in process operationalized (practitioners and families).
Program indicators:
Evidence: 1. Strength of evidence, for whom and in what conditions (number of studies, population similarities, diverse cultural groups, and efficacy or effectiveness). 2. Outcomes: is it worth it? 3. Fidelity data. 4. Cost-effectiveness data.
Usability: 1. Well-defined program. 2. Mature sites to observe. 3. Several replications. 4. Adaptations for context.
Supports: 1. Expert assistance. 2. Staffing. 3. Training. 4. Coaching and supervision. 5. Racial equity impact assessment. 6. Data systems technology supports (IT). 7. Administration and system.
A minimal data-structure sketch follows this entry. |
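One possible way to hold Hexagon ratings in code is a mapping from each indicator to its 1-5 score, grouped by the two indicator families above. The layout and function name below are illustrative assumptions, not part of the NIRN tool itself:

```python
# Illustrative sketch only: records Hexagon indicator ratings (1-5, with 5
# best) and summarises them by indicator family.

HEXAGON_INDICATORS = {
    "implementing site": ["fit with current initiatives", "need",
                          "capacity to implement"],
    "program": ["evidence", "usability", "supports"],
}

def summarize_ratings(ratings: dict[str, int]) -> None:
    """ratings: indicator name -> score on the 1-5 scale."""
    for group, indicators in HEXAGON_INDICATORS.items():
        print(f"{group} indicators:")
        for name in indicators:
            score = ratings.get(name)
            if score is not None and not 1 <= score <= 5:
                raise ValueError(f"{name}: scores must be between 1 and 5")
            print(f"  {name}: {score if score is not None else 'unrated'}")

summarize_ratings({"evidence": 4, "need": 5, "usability": 3})
```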
48 |
National Dropout Prevention Center |
Model Programs Database |
Yes.
Strong Evidence of Effectiveness: these programs have been in existence for three years or more. They were evaluated using an experimental or strong quasi-experimental design conducted by an external evaluation team, and have strong empirical evidence demonstrating program effectiveness in reducing dropout and/or increasing graduation rates and/or having a significant impact on dropout-related risk factors.
Moderate Evidence of Effectiveness: these programs have been in existence for three years or more. They were evaluated using a quasi-experimental design conducted by an external or internal evaluation team, and have adequate empirical evidence demonstrating program effectiveness in reducing dropout and/or increasing graduation rates and/or having a significant impact on dropout-related risk factors.
Limited Evidence of Effectiveness: these programs may be relatively new. They were evaluated using a limited evaluation design (single-group pre- and post-test) conducted by an external or internal evaluation team. They have promising empirical evidence demonstrating program effectiveness in reducing dropout and/or increasing graduation rates and/or having a significant impact on dropout-related risk factors, which requires confirmation using more appropriate experimental techniques.
Insufficient Evidence of Effectiveness: these programs require additional information before a rating category can be determined.
A minimal decision sketch follows this entry. |
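The four tiers read as a decision rule over program age, evaluation design, and who conducted the evaluation. A minimal Python sketch under those assumptions; the parameter names and design categories are illustrative, and the Center's actual review weighs the evidence qualitatively:

```python
# Illustrative sketch only: a decision rule matching the rating descriptions
# above. Parameter names and design categories are assumptions.

def dropout_program_rating(years_in_existence: int,
                           design: str,
                           external_evaluators: bool) -> str:
    """design: "experimental", "strong quasi-experimental",
    "quasi-experimental", or "limited"."""
    mature = years_in_existence >= 3
    if (mature and external_evaluators
            and design in ("experimental", "strong quasi-experimental")):
        return "Strong Evidence of Effectiveness"
    if mature and design in ("quasi-experimental",
                             "strong quasi-experimental"):
        return "Moderate Evidence of Effectiveness"
    if design == "limited":
        return "Limited Evidence of Effectiveness"
    return "Insufficient Evidence of Effectiveness"
```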
49 |
National Cancer Institute |
Research-Tested Intervention Programs (RTIPs) |
No. Intervention evaluation and program materials are evaluated in four areas for the RTIPs review:
Research Integrity: reflects the overall confidence reviewers can place in the findings of a program's evaluation based on its scientific rigor. The Research Integrity rating system comprises 16 criteria scored by independent experts. Scores on each criterion are given on a 5-point scale ranging from low quality to high quality. The overall integrity score is the average of the 16 criteria, reflecting the merits of the science that went into the program evaluation.
Intervention Impact: describes whether, and to what degree, a program is usable and appropriate for widespread application and dissemination. This rating is determined by the Review Coordinators. Population reach and effect sizes are rated separately on a 5-point scale; these ratings are then combined using the RTIPs Intervention Impact rating table to determine the impact score.
Dissemination Capability: refers to the readiness of program materials for use by others, as well as a program's capability to offer services and resources to facilitate dissemination. The rating is given on a 5-point scale ranging from low quality (1.0) to high quality (5.0). Dissemination capability is measured through the assessment of three areas.
RE-AIM: a five-step framework designed to enhance the quality, speed, and public health impact of efforts to translate research into practice: Reach, Effectiveness, Adoption, Implementation, and Maintenance. |
50 |
Strengthening Families Evidence Review |
Standards of evidence |
Yes. High Rating Randomized controlled trials received a high rating if:
Moderate Rating Randomized controlled trials received a moderate rating if:
OR
Quasi-experimental designs received a moderate rating if:
Pre/post or other designs received a moderate rating if:
Low Rating
Unrated |