By Priya Chattier
Have Randomised Controlled Trials (RCTs) contributed to reducing global poverty by generating rigorous scientific evidence that is turned into effective public policy? While the ‘randomistas’ proffer RCTs as the most rigorous approach to impact evaluation, critics have pushed back against the gold-standard claim. This debate was at the heart of one of the keynote panels at the 2020 Australian Aid Conference held at the Australian National University. An invigorating myth-busting panel featured a radical randomista, Dr Andrew Leigh (a politician and an economist), and a staunch RCT critic, Dr Lant Pritchett. It was a fiercely contested and yet highly entertaining panel on the usefulness (or uselessness?) of RCTs in the real world of development impact. A third panellist, Dr Jyotsna Puri (Head of the Independent Evaluation Unit at the Green Climate Fund), provided the necessary buffer between the two debaters, trying to reconcile their extreme viewpoints.
One thing that became quite clear to me in this session was that while the credibility revolution of RCTs has permeated popular discourse on development impact evaluations, the ‘gold standard’ claim to rigorous evidence has limited usefulness when it comes to understanding more complex, policy-based development interventions. And while donors and politicians increasingly want to see financial accountability for their aid budgets, RCT evidence rarely features in how the results of aid projects are reported to domestic constituents.
So, what’s all the fighting about? In this blog I do not intend to revisit these polarised positions, which have been documented elsewhere, but to focus on the particular strengths and weaknesses of RCTs and why they matter for development impact evaluations.
Why are RCTs so popular?
The emergence of RCTs in development economics almost two decades ago saw policy makers change the way big development challenges, such as alleviating global poverty and improving public health outcomes, were evaluated for impact. As Andrew Leigh put it, the “randomisation revolution” saw the increased application of randomised controlled trials to identify the causal link between interventions and outcomes, and the extensive use of RCTs as the gold standard for evidence informing development policies on health care, vaccines, schooling, cash transfers and microfinance.
Even the strongest proponents of the randomisation movement would concede that RCTs are not the only legitimate source of evidence on what works and what doesn’t. What we do know is that RCTs require a specific intervention and a randomised population to show the net impact of the relationship between an intervention and an outcome – just as in clinical trials (which largely overlook questions of applicability in diverse real-world settings). The strength of RCTs lies in their rigorous methodology:
- The controlled element: a control group is used to explore the cause-and-effect relationship between an intervention and its outcome, so that causal inferences can be made about treatment effects while holding constant factors outside the intervention. In other words, the control group provides a benchmark against which any observed changes in outcomes can be accurately measured.
- Randomisation: randomisation not only minimises bias in the selection of participants for the control and treatment groups but also prevents results from being skewed or manipulated. Random assignment of participants to the two groups allows for a direct comparison between them in an experimental trial.
- Statistical reliability: for RCTs to be statistically reliable, the treatment and control populations are usually large, so that samples big enough to detect real effects can be drawn. Random sampling ensures that every member of the population has an equal chance of being selected, and random assignment gives each participant an equal chance of ending up in either group – together, these make RCTs statistically robust.
No wonder policy makers love randomised trials. The beauty of RCTs lies in three things: (1) they generate good data; (2) they randomise across a reasonably large population; and (3) they statistically estimate net results (often by testing a hypothesis) without having to worry about confounding factors.
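For readers who like to see the mechanics, the logic above can be sketched in a few lines of code. This is a hypothetical illustration, not any real trial: it simulates an intervention (say, a cash transfer) that raises a welfare outcome by 2 units on average, randomly assigns people to treatment and control, and estimates the effect as the simple difference in group means – the reason this works is that randomisation balances confounders across the two groups.

```python
import random
import statistics

random.seed(42)

# Hypothetical population of 1,000 people, randomly split into
# treatment and control groups of 500 each (the randomisation step).
population = list(range(1000))
random.shuffle(population)
treatment, control = population[:500], population[500:]

def outcome(treated):
    # Baseline welfare of ~10 units plus individual noise; the
    # (assumed) true treatment effect is +2 units on average.
    baseline = 10 + random.gauss(0, 3)
    return baseline + (2 if treated else 0)

treated_outcomes = [outcome(True) for _ in treatment]
control_outcomes = [outcome(False) for _ in control]

# The difference in group means estimates the average treatment
# effect, because random assignment balances everything else.
ate = statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)
print(f"Estimated treatment effect: {ate:.2f}")
```

With samples this size, the estimate lands close to the true effect of 2 – which is exactly the appeal of the method, and also its limit: the sketch says nothing about whether the same effect would hold in a different context.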
So, what’s all the fighting about RCTs?
Lant Pritchett presented an intellectual, entertaining and convincing argument – that the randomisation of development is madness, because the fad for evidence is “so over”! Pritchett argues that many of the poverty alleviation programmes studied by the RCT movement are unlikely to really combat poverty because of their narrow focus on short-term outcomes rather than on the root causes of poverty and inequality. Indeed, the randomista movement has been criticised for having little value in predicting the impact of complex projects beyond the original experimental trial. Key criticisms of RCTs are:
- It’s not enough to have one tool in an evaluation toolbox: while RCTs may seem like the only ‘scientific and rigorous’ tool to effectively measure the impact of a specific intervention, they may not always be feasible due to practical constraints. In such instances, observational insights, participatory research methods, social mapping and other sources of qualitative evidence might offer more granularity in tracing the contribution of an intervention than randomisation does.
- Context is all that matters in knowing what works: for many of us working in international development, finding out ‘what works’ is a common rhetorical slogan among donors. But we also know that nothing works without a context. As critics of RCTs like Pritchett argue, evidence from a randomised evaluation in one development setting may not be generalisable to another. RCTs do tell us “what has worked somewhere”, but not that “it will work the same elsewhere”. International development is not a scientific laboratory where one can control development assistance; the net impact of an intervention often depends on complex and challenging real-world contexts.
- Data-driven discourse that focuses on “if x then y” may be crowding out investigation of more complex development challenges: well-designed boutique experiments do raise the bar for generating robust evidence for most (if not all) policy decisions on where to spend public funds. But the reality remains – RCTs may be steering donor investments towards the wrong kinds of development outcomes. This discourse stands in complete contrast to complexity theory, which draws on thinking and working adaptively and on multi-sectoral approaches to development assistance. Hence there are fewer incentives for investments in evaluation that ask harder questions about the “complexity of context”. RCTs will continue to crowd out other types of inquiry that not only ignore the complexity of the development context but also serve as an excuse to avoid hard questions that reveal the granularity of a problem.
Development practice needs much more than the superiority claim of randomisation if it is to significantly improve outcomes in the real world. No matter how many RCTs are conducted to generate tangible evidence that meets the accountability needs of donors and governments, in reality RCTs rarely provide useful information for solving complex development challenges.
So, what can we learn from all this fighting?
In reconciling the in-fighting over the power(lessness) of RCTs, it is worth noting that the problem lies less in the RCT approach itself than in the claim that it constitutes the “gold standard”. RCTs have certainly achieved some incredible things, bringing much more scientific rigour to the way we conduct program and project evaluations. Both sides of the fight have contributed to generating knowledge on logic and evidence to inform public policy. But at the centre of a rigorous evaluation lies our ability as impact evaluators to strike the right balance between rigour and complexity when generating and using evidence to inform policy decisions.