Bad data can be worse than no data.
Poor-quality data can be more dangerous than no data at all. With recent demands for more data, whether to report on Sustainable Development Goals indicators, monitor large programmes, or provide a baseline for planning and decision making, this message has often been lost. Data quality can be undermined by poor design, mistakes in data collection, or mistakes (intentional or not) in the creation and aggregation of indicators.
Prevention is better than cure
A recent blog post hosted by the World Bank explains how much more attention is given to correcting data after they have been collected than to quality assurance in the field. Unfortunately, systematic errors in data collection can have a huge impact on point estimates and cannot always be corrected, particularly if there is insufficient understanding of what happened during data collection. For this reason, we have recently conducted data quality assessments, or audits, to check the quality of data collected. Based on these, we recommend concrete adjustments to data collection.
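To illustrate why systematic errors are harder to live with than random ones, here is a minimal sketch using made-up numbers. The household sizes, the error values, and the size of the systematic over-count are all assumptions chosen for illustration; they do not come from any of our assessments.

```python
from statistics import mean

# Hypothetical true household sizes (illustrative values, not real data).
true_values = [3, 4, 5, 6, 7]

# Random recording errors: some too high, some too low.
# In this toy example they happen to cancel out exactly.
random_errors = [1, -1, 2, -2, 0]

# Systematic error: every record is over-counted by the same amount,
# for example because a question was consistently misunderstood.
systematic_error = 2

with_random = [v + e for v, e in zip(true_values, random_errors)]
with_systematic = [v + systematic_error for v in true_values]

true_mean = mean(true_values)
random_mean = mean(with_random)
systematic_mean = mean(with_systematic)

print("true mean:           ", true_mean)
print("with random errors:  ", random_mean)
print("with systematic bias:", systematic_mean)
```

In this sketch the random errors leave the estimated mean unchanged, while the systematic error shifts it by the full size of the bias. Collecting more records would not remove that shift, and correcting it afterwards requires knowing how large the bias was, which is exactly the information that is often missing once fieldwork is over.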
Key to undertaking a data quality audit is identifying the main stakeholders and their interactions in relation to data production and use. This is done by mapping the users and producers of data and the flow of data between them. Understanding the interaction between the different stakeholders is important for identifying the incentives related to recording, maintaining, and reporting quality data.
International best practices for the production of quality statistics are outlined by sources such as the Handbook on Data Quality Assessment Methods and Tools (Eurostat) and include the standard data quality indicators of relevance, accuracy, timeliness and punctuality, comparability, coherence, accessibility, and clarity. These standardised indicators, which can be applied to survey or administrative data, should be tailored to the specific project or programme requirements. Useful tools to assist in the assessment include:
- observational checklists, which are completed following a ‘day in the life of a data collector’ approach; and
- data quality checklists with a traffic light approach for evaluating data quality dimensions and providing a visual representation of results.
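As a rough illustration of the traffic light idea, the sketch below maps a score for each quality dimension to a colour. The 0 to 1 scoring scale, the thresholds of 0.5 and 0.8, the function names, and the example scores are all assumptions made for this example; in practice they would be agreed with the project or programme concerned.

```python
# The quality dimensions listed above.
DIMENSIONS = [
    "relevance",
    "accuracy",
    "timeliness and punctuality",
    "comparability",
    "coherence",
    "accessibility",
    "clarity",
]


def traffic_light(score, amber_from=0.5, green_from=0.8):
    """Map a score in [0, 1] to a colour. The thresholds are assumptions."""
    if not 0 <= score <= 1:
        raise ValueError("score must be between 0 and 1")
    if score >= green_from:
        return "green"
    if score >= amber_from:
        return "amber"
    return "red"


def rate_checklist(scores):
    """Return a colour per dimension; dimensions without a score are flagged."""
    return {
        dim: traffic_light(scores[dim]) if dim in scores else "not assessed"
        for dim in DIMENSIONS
    }


# Hypothetical scores, e.g. the share of checklist items met per dimension.
example_scores = {
    "relevance": 0.9,
    "accuracy": 0.45,
    "timeliness and punctuality": 0.6,
}

ratings = rate_checklist(example_scores)
for dim, colour in ratings.items():
    print(f"{dim:28s} {colour}")
```

Flagging unscored dimensions as "not assessed" rather than leaving them out keeps gaps in the assessment visible in the summary.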
When conducting our data quality assessments, we aim to provide the relevant data collection entity with clear and concrete recommendations that it can take forward to address any challenges that could affect the quality of data. These could include automating data collection to reduce data entry errors, combining data collection tools to avoid duplication, or providing further training to data collectors.
We have undertaken data quality assessments in a number of countries and across various sectors, including for the Millennium Challenge Account in Namibia (education) and Indonesia (green prosperity fund), and we are currently conducting a data quality audit for the WASH sector in Zambia.