    author

    Fraser Nelson

    Covid statistics and the era of hyper-scrutiny


    Amanda Pritchard, the new NHS England chief executive, has had quite a week. She wrote an article for the Health Service Journal about the pressures on the NHS and followed up with a Sky News interview where she had this to say:

    ‘There is no doubt that the NHS is running hot and there are some very real pressures on health and social care. We have had 14 times the number of people in hospital with Covid-19 than we saw this time last year. We also had a record number of A&E attendance and a record number of 999 calls.’

    Where did she get that 14 times figure from? By using statistics in a strange way, highlighted by Kate Andrews fairly shortly afterwards. By ‘have had’ she was technically correct insofar as this was the peak ratio. But comparing a wave to a non-wave, and presenting a peak value as somehow representing the current situation is fundamentally misleading. The actual picture for Covid hospitalisations is here. I won’t republish the graphs as this blog is about a broader point.
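
To see how a peak ratio can be technically true yet say little about the present, here is a minimal sketch in Python. The weekly patient counts are invented purely for illustration (they are not NHS figures), and the helper name `year_on_year_ratios` is hypothetical.

```python
# Hypothetical weekly counts of Covid patients in hospital.
# These numbers are made up for illustration; they are NOT real NHS data.
this_year = [1000, 5600, 6000, 6500, 6800]
last_year = [900, 400, 600, 1100, 2000]


def year_on_year_ratios(current, previous):
    """Return this year's count divided by last year's, week by week."""
    return [c / p for c, p in zip(current, previous)]


ratios = year_on_year_ratios(this_year, last_year)

peak_ratio = max(ratios)    # the most dramatic comparison in the series
latest_ratio = ratios[-1]   # the comparison for the most recent week

print(f"Peak ratio:   {peak_ratio:.1f}x")
print(f"Latest ratio: {latest_ratio:.1f}x")
```

In this made-up series the peak comparison is 14 times, while the most recent week is closer to 3.4 times. Both numbers are 'true', but only the second describes the current situation.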

    Brief history of the ‘porkie’

    What Pritchard said was technically true but fundamentally misleading. Until fairly recently, this combo was a standard and successful tactic in public debate: cook up a dramatic sounding figure, release it on broadcast and it’s repeated. Speechwriters would be trained in how to select punchy figures: a standard part of the dark arts. But now, such tactics are called out – quickly and forcefully – as Ms Pritchard found out this week. In her old job she could have used this figure without pushback: in her current job, she faces hyper-scrutiny. Kate Andrews covered it here, and many other publications followed as did Radio 4’s Today programme. Ms Pritchard was admonished by the Health Secretary. It was a PR disaster.

    When it comes to reporting statistics, we’re all in a new era with new rules. Once, politicians could cook figures with impunity. I’d call them ‘porkies’: statistics that were not lies, in the strict sense of the word, but did nonetheless mislead. Jack Straw would talk about ‘5,000 new policemen’ when the police headcount had not increased, but there were 5,000 people who were not policemen before. George Osborne would talk about ‘falling debt’ when debt was rising, but at a slower rate than the economy was growing so the debt/GDP ratio was falling. In this way, a rise can be presented as a fall. Nerds might quibble, but broadcasters would never get into so much detail. This changed during the 2010 election: bloggers would mount effective challenges to party political lies.
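
The ‘falling debt’ arithmetic can be checked in a few lines of Python. The figures below are hypothetical round numbers chosen only to show the mechanism; they are not Treasury data.

```python
# Hypothetical figures in £bn, for illustration only (not real Treasury data).
debt_before, gdp_before = 1000.0, 1250.0
debt_after, gdp_after = 1030.0, 1325.0   # debt up 3%, economy up 6%

ratio_before = debt_before / gdp_before  # debt as a share of GDP
ratio_after = debt_after / gdp_after

print(f"Debt:     {debt_before:.0f} -> {debt_after:.0f} (rising)")
print(f"Debt/GDP: {ratio_before:.1%} -> {ratio_after:.1%} (falling)")
```

Debt goes up in cash terms, yet the ratio drops because the denominator grew faster. Anyone quoting only the ratio can describe this as ‘falling debt’.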

    Covid and the data glasnost

    The same scrutiny is now being applied to public health thanks to a hugely significant side-effect: the mass availability of data. The UK coronavirus dashboard team have pulled off a revolution – or certainly a glasnost – in public data. They don’t just release figures but release them via an API. That is to say, they make it easy for anyone to query their database. You can now find any serious Covid information and be informed pretty much as soon as the data is released.
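
For readers wondering what ‘querying their database’ looks like in practice, the sketch below builds a request URL in Python. It rests on assumptions: the endpoint address, the `filters` and `structure` parameters and the `hospitalCases` metric name follow the dashboard's developer guide as I understand it, and the service may since have changed or been retired. The function `build_query_url` is a hypothetical helper, not part of any official library. The code only constructs the URL and does not contact the server.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the UK coronavirus dashboard API; it may have changed.
BASE_URL = "https://api.coronavirus.data.gov.uk/v1/data"


def build_query_url(area_type, area_name, metrics):
    """Build a query URL asking for the given metrics in one area.

    Hypothetical helper: the parameter names are assumptions based on
    the dashboard's developer guide, not a guaranteed interface.
    """
    # Filters are assumed to be 'key=value' pairs joined by semicolons.
    filters = f"areaType={area_type};areaName={area_name}"
    # The structure is assumed to be a JSON object listing the fields wanted.
    structure = {"date": "date", **{m: m for m in metrics}}
    params = {
        "filters": filters,
        "structure": json.dumps(structure, separators=(",", ":")),
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url("nation", "england", ["hospitalCases"])
print(url)
```

Paste a URL like this into a browser or a script and, if the service is live, the figures come back as structured data rather than a PDF or a press release. That is the whole difference between publishing numbers and opening them up.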

    Numerate journalists like Sky’s Ed Conway (who keeps his own database) can now check claims against facts in seconds – and blow the whistle very quickly when officials start to stretch the truth. These APIs now drive The Spectator’s data hub: updated automatically, every day, due to developments in tech. They drive Our World In Data, Max Roser’s inspirational project that vastly broadens and democratises access to all kinds of figures. This easy-to-access database doesn’t just make the truth easier to discover but makes lying far riskier: the odds on being busted have just surged.

    So it’s hard to claim (as Ms Pritchard’s officials tried to) that the August NHS figures are the latest available when API tech means Covid hospital figures are now updated daily – not just on the government’s superb Covid-19 dashboard but the many services plugged in to that dashboard. The Spectator data hub is one of them. We also have the UK Statistics Authority which is willing to confront the government.

    The team behind the UK coronavirus dashboard has set a gold standard that can be copied throughout government: user-friendly APIs with Java and Python wrappers and great documentation. Organisations have also sprung up making the most important data available through APIs. Johns Hopkins University did this with worldwide Covid statistics (allowing all of those cases/deaths comparison charts). Our World In Data has also made its hugely impressive collection open to others via GitHub feeds: its staff take the time to collate figures manually, and these are then released automatically. A whole bunch of publications owe a debt to Max Roser and his team.

    You don't have to be rich or a genius. With today's tech, a self-taught teenager can create a homemade data hub which allows anyone to fact-check ‘expert’ claims in real time. At The Spectator we use Datawrapper, a revolutionary (and free) tool that combines resizable graphs with the functionality to scrape data from live sources. In The Spectator Data Hub we have created something that I’ve wanted to do all of my journalistic career: open up the databases previously available only to journalists (my own nerdy laptop collection has thousands of data series) and give anyone interested the same access to the latest data – just as journalists and policymakers have. This also means giving anyone the tools to prove us wrong.

    The Spectator is a small magazine (we have far fewer editorial staff than the New Statesman, let alone newspapers) but a small-budget publication like ours has been able to create a data hub thanks to the brilliant work of the people behind the coronavirus dashboard and the spirit of open data. Datawrapper is free to use. The kind of info that used to cost tens of thousands is now free.

    The Covid response has shown the democratising power of open data. It empowers reform-minded politicians who are no longer confined by the metrics their officials wish to provide for them. If a minister orders metrics to be made openly available as an API then someone like Max Roser will come along and create a platform that vastly improves the quality of debate for everyone. Government officials use Our World In Data because they struggle to get the same information from their own officials. The app in your pocket tells you when your bus is due not because TfL has designed the software, but because they were told to open up bus geotagging data to anyone who wanted to use it. The apps followed later.

    As Cabinet Secretary, Steve Barclay has huge power to perform a glasnost on public data by ordering every government department that releases data series (and, indeed, every public body) to do so as an API using the public Covid data as a model. That would, in a stroke, open up everything to anyone.

    I’d love to add education statistics to our data hub, to look at social justice issues like ethnicity and university admission. But these figures are jealously guarded by Ucas (and quotes for data queries are prohibitively expensive). If Nadhim Zahawi wants to use data better to improve the system, he can ask Barclay to order officials to open up data as Public Health England has – and then let a thousand flowers bloom.

    PS Covid has also brought an unprecedented collaboration between journalists and academics. We've been delighted to publish many scholars throughout the pandemic. Academics also disagree: a point not always reflected in public debate. Take for example Ferguson’s extraordinary claim that 20,000 to 30,000 lives would have been saved had his advice on lockdown been followed a week earlier. If it turns out that he cooked up this figure by stretching assumptions – something no journalist would be able to find out – it can be exposed by other academics who go through his coding and identify his (to put it politely) unusual assumptions. That critique can now be made not just in academic papers but mainstream publications. Similarly, when Imperial College falsely denies that it ever modelled 85,000 deaths for Sweden, it’s now possible to point to the dataset and cell (H521) where they did precisely this.

    PPS The other factor (for us) is the new breed of data reporters. The Spectator data hub is edited by Simon Cook, who recently completed a master’s in data analysis at the Georgia Institute of Technology. He has done a bunch of things with his life but likes the idea now of using his skills to make data more accessible. Michael Simmons, formerly a statistician at the National Records of Scotland, retrained as a data journalist and now works with us in 22 Old Queen St. You don’t see their bylines on their work in the same way that you do for reporters, but if you see an illuminating and well-sourced graph on The Spectator’s website you’ll probably have one of these two to thank.