To RCT, or not to RCT, that is the question

Michele Binci and Paul Jasper discuss the 2019 Nobel Prize in Economic Sciences.

Authors

Paul Jasper Monitoring, Evaluation, Research, and Learning (MERL), International
Michele Binci Principal Consultant

Date

January 2020
Area of expertise
Research and Evidence (R&E)
Keywords
Randomised Controlled Trials (RCTs), Quasi-experimental
Office
OPM United Kingdom

In 2019, the Nobel Prize in Economic Sciences was awarded to Esther Duflo, Abhijit Banerjee and Michael Kremer for the crucial role they played in transforming the field of development economics through the systematic use of experiments. The Prize committee recognised the importance of the Nobel laureates’ focus on “identifying workable policies, for which one can make causal claims of impact” whilst ensuring that the evidence they generate feeds into a robust theoretical framework. Through their Randomised Controlled Trials (RCTs) as well as their non-experimental work, Duflo, Banerjee and Kremer have contributed enormously to the way in which pressing issues like poverty and vulnerability can be tackled.

However, not everyone has been equally impressed by the choice made by the Nobel Prize committee. In fact, a particularly feisty debate has erupted on the merits and demerits of what has been seen by many as “a Nobel Prize for Development RCTs”. Many critical voices have expressed their disapproval in harsh terms, with the reliance of development economics on experiments described as “impoverished economics” that shows “how poor the modern economic discipline is in terms of […] research methods”. Critics have gone as far as saying that what the rise of RCTs reveals is a “retreat from the biggest questions”.

More nuance is needed in this debate. Whilst there is some truth in the critiques, this does not invalidate Duflo, Banerjee, and Kremer’s work or RCTs more broadly. Rather, it points to limitations in their approach that researchers and practitioners should be aware of, as with any other research approach. Their contribution has nonetheless critically improved the way in which development work is implemented in practice, and this should be celebrated. However, RCTs are not always appropriate: they should be used within a theory-based framework and complemented by other research methods whenever possible.

Criticisms missing the mark

An argument brought repeatedly against the usefulness of RCTs is that they lack ‘external validity’. While this is true, it is also a limitation of all research designs in the social sciences. No approach can by default claim to be applicable to any context or population. The problem of deducing geographically and temporally generalised rules and insights from specific empirical observations affects any empirical work outside the context of perfectly controlled experiments (e.g. in the natural sciences). Findings that are relevant to a specific context (e.g. improvements in health system service delivery in Ethiopia) can only very carefully be applied to other contexts (e.g. health system service delivery in a different country). To address this limitation, we either use theory to explain why and how specific observations can be applied to other contexts, or we repeat similar studies across different contexts so as to ‘build up’ an evidence base on the validity of the same findings regardless of the context. For this reason, RCTs studying a specific issue have been repeated in different countries and for different populations.

Another point made regularly is that there are ethical concerns with running experiments on people. Ethical questions surrounding the use of RCTs, especially on vulnerable populations, need to be taken seriously. There are several concerns that critics raise. First, whether random allocation of treatments is appropriate: How is it fair that the decision on who is eligible to receive a cash transfer, a food subsidy or a health worker visit is made through a lottery? Second, that unintended consequences are not well taken care of: How can researchers be sure that their experiment is not leading to negative effects that they might be missing due to the narrow focus of their quantitative data collection? Third, the problem of rich world bias: Most RCTs are designed, commissioned and implemented by researchers and organisations from wealthy countries – hence experimenting with the world’s poor. All of these concerns are valid. RCTs need to follow strict ethical procedures. These should include an assessment of whether random allocation of treatment is ethically justifiable (it often is – as it can arguably be a fair allocation of scarce resources). They should also include feedback procedures for affected communities and – ideally – should be co-owned and co-designed by organisations that legitimately represent the interests of populations involved in the trials. Results should also not just disappear into academic publishing, but should be fed back to, and discussed with, local communities, which happens all too rarely. However, it is very important to note that all of these points apply to any research and data collection conducted in poor or vulnerable communities across the world, irrespective of whether one is doing an RCT or any other quantitative or qualitative research.

Finally, an important criticism is that RCTs only answer small questions and do not provide solutions to the big, important economic problems of our times. There are two sides to this argument. Firstly, it highlights that RCTs cannot say much on the mechanics of how effects materialise. This seems to be a fair criticism of exclusively relying on quantitative experimental designs like RCTs and calls for the systematic adoption of mixed-methods frameworks, in which qualitative research enriches quantitative experimental designs and fills in at least some of the information gaps on ‘why’ and ‘how’ questions. Secondly, the criticism suggests that the very fact that RCTs only focus on specific questions is a problem in itself, since economic research should look at bigger and more fundamental questions. Again, this is partially true, but RCTs’ focus on ‘smaller’ questions is valuable. By investigating small, targeted questions, RCTs can gather robust evidence on the effectiveness of particular programmes or policy initiatives, which in turn can guide donors and policymakers on where to channel their limited resources. And whilst trying to address big questions seems appealing and fulfilling, the track record of economic thinking around those questions is – at best – mixed.

The case for RCTs

RCTs have changed evaluations and the analysis of causality. The way in which research is produced in development economics is now fundamentally different from how it was before RCTs were extensively used. Impact evaluations are now carefully designed taking into account technical considerations (e.g. sample size calculations and survey implementation) as well as local circumstances (e.g. interacting with governments, local communities and other development actors like NGOs). The difference between simple correlations, causal contribution claims (i.e. a range of factors might have contributed to affecting measured outputs and outcomes), and attribution claims (i.e. the detected impact on indicators of interest can be directly attributed to an intervention) is more clearly spelt out. Our own experience at Oxford Policy Management tells us that an increasingly large range of stakeholders – including government Ministers and civil servants – are acutely aware of, and interested in, this key distinction. Funders and governments now want robust evidence on impact, to inform and guide their decisions.
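To make the ‘sample size calculations’ mentioned above a little more concrete, the following is a minimal sketch of one common textbook approximation for a two-arm trial comparing means. It is an illustration only, not a description of how any particular evaluation was designed: the function name, the default significance level and power, and the effect size used in the example are all assumptions, and the formula ignores clustering and attrition, both of which matter in most field trials.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate number of units needed in EACH arm of a two-arm trial.

    effect_size: standardised difference to detect
                 (difference in means divided by the standard deviation).
    alpha:       two-sided significance level (assumed default).
    power:       desired probability of detecting the effect (assumed default).

    Uses the normal approximation n = 2 * (z_alpha + z_power)^2 / effect_size^2.
    It does not adjust for clustered designs or attrition.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical example: detecting a standardised effect of 0.5
n = sample_size_per_arm(0.5)
print(n)
```

Halving the effect size to be detected roughly quadruples the required sample, which is one reason why such calculations weigh heavily on whether a trial is feasible at all.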

From the success of experiments, a variety of quasi-experimental designs have emerged. The use of RCTs has highlighted the importance of designing robust impact evaluations to answer causality questions. However, as mentioned, RCTs are not always feasible, and valid alternatives have been developed. So-called quasi-experimental designs, which aim to answer the same questions as experiments, have been popularised and are regarded as good approximations of an experimental design if certain assumptions hold. Matching, difference-in-differences, and regression discontinuity are examples of these quasi-experimental approaches. They exist, are understood by practitioners, and are employed because of the rise of RCTs and an appreciation of their strengths and weaknesses.
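As an illustration of one of these designs, the sketch below computes a basic difference-in-differences estimate: the change over time in a group that received an intervention, minus the change over the same period in a comparison group. The function name and all the numbers are made up for the example. The estimate can only be read as an impact if the key assumption holds, namely that both groups would have followed parallel trends in the absence of the intervention.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical outcome values (e.g. an index of household consumption)
treated_before = [10.0, 12.0, 11.0, 13.0]
treated_after = [15.0, 17.0, 16.0, 18.0]
control_before = [9.0, 11.0, 10.0, 12.0]
control_after = [11.0, 13.0, 12.0, 14.0]

estimate = difference_in_differences(
    treated_before, treated_after, control_before, control_after
)
print(estimate)  # 3.0: treated group rose by 5.0, comparison group by 2.0
```

In practice, such estimates usually come from a regression with standard errors and additional controls. The simple comparison of means is shown here only to make the logic visible.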

The way forward

On the whole, the 2019 Nobel Prize in Economics was justified. The Nobel laureates’ work has improved the way in which causality is investigated in development economics, increased the demand for and use of experimental and quasi-experimental approaches to assess the impact of programmes and policies, and helped build a body of robust evidence and knowledge around interventions that work.

RCTs clearly have limitations. At Oxford Policy Management, we sometimes take the decision to replace RCTs with quasi-experimental designs still capable of generating robust evidence, but less challenging to implement for stakeholders and beneficiaries. Whenever possible, we strive to enrich our impact narratives by complementing our RCTs (or quasi-experiments) with other research approaches, which are more qualitative in nature. In fact, we would argue that ‘mixed-methods’ approaches – where qualitative and quantitative research methods offset each other’s limitations and complement each other’s insights – should become the new ‘gold standard’ in impact evaluations.
