A study on the quality of novel coronavirus (COVID-19) official datasets

Research output: Contribution to journal › Article › peer-review



Policy makers depend on complex epidemiological models that must be robust, realistic, defensible and consistent with all relevant data disclosed by official authorities, data deemed to meet the highest quality standards. This paper analyses and compares the quality of the official datasets available for COVID-19. We used comparative statistical analysis to evaluate the accuracy of data collection by one national organisation (Chinese Center for Disease Control and Prevention) and two international organisations (World Health Organization; European Centre for Disease Prevention and Control), based on the value of systematic measurement errors. We combined Excel files, text-mining techniques and manual data entry to extract the COVID-19 data from official reports and to generate an accurate profile for comparison. The findings show noticeable and increasing measurement errors in the three datasets as the outbreak expanded and more countries contributed data to the official repositories, raising concerns about data comparability and pointing to the need for better coordination and harmonized statistical methods. The study offers a combined COVID-19 dataset and dashboard with minimal systematic measurement errors, along with valuable insights into the potential problems of using databanks without carefully examining the metadata and additional documentation that describe the overall context of the data.
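
The abstract does not state which statistic was used to quantify systematic measurement error. As a rough illustration only, the sketch below assumes the error is taken as the mean signed difference (bias) between a source's reported counts and a reference series. The function name `systematic_error`, the source labels and all numbers are hypothetical placeholders, not values from the study.

```python
from statistics import mean


def systematic_error(reported, reference):
    """Mean signed difference (bias) between a source and a reference series.

    Assumption: both series cover the same dates in the same order.
    """
    if len(reported) != len(reference):
        raise ValueError("series must cover the same dates")
    return mean(r - ref for r, ref in zip(reported, reference))


# Placeholder cumulative case counts, invented purely for illustration.
reference = [100, 150, 210, 300]
sources = {
    "source_a": [100, 148, 205, 290],  # tends to under-report
    "source_b": [100, 150, 212, 306],  # tends to over-report
}

# A negative bias means the source reports fewer cases than the reference.
bias = {name: systematic_error(series, reference)
        for name, series in sources.items()}

for name, value in bias.items():
    print(f"{name}: mean signed error = {value}")
```

A signed mean is used here rather than an absolute one because a systematic error has a consistent direction; random discrepancies would tend to cancel out, whereas a persistent offset would not.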

Original language: English
Pages (from-to): 291-301
Number of pages: 11
Journal: Statistical Journal of the IAOS
Issue number: 2
Publication status: Published - 1 Jun 2020


  • coronavirus disease (COVID-19)
  • data quality
  • epidemiology
  • health emergency
  • measurement error
  • official statistics
  • public health
  • SARS-CoV-2

