Estimating Daily People Tested (part 5)
Estimation of the number of people undergoing virus tests in England prior to de-duplication of data records (rev 1.0)
In my previous newsletter in this series, I opted for a spot of linear regression to tease out the relationship between the number of unique people tested, the number of tests undertaken and the number of cases detected on a daily basis, the idea being to predict the number of people tested from tests and their outcome. I ended by revealing a rather tasty model that accurately predicted the number of people tested for the period 30 Jan 2020 - 27 Dec 2020 (n=333). Today I shall reveal modelling results for the subsequent periods of 28 Dec 2020 - 20 Feb 2022 (n=420) and 21 Feb 2022 - 7 Jun 2022 (n=107).
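For those who like to peek under the bonnet, here is a minimal sketch of the kind of regression described above. It is an illustration only: the data are synthetic, the coefficients and function names are invented for the example, and plain ordinary least squares via NumPy stands in for whatever statistical package was actually used.

```python
import numpy as np

def fit_people_tested(tests, cases, people):
    """Ordinary least squares fit of: people ~ intercept + tests + cases."""
    X = np.column_stack([np.ones(len(tests)), tests, cases])
    coef, *_ = np.linalg.lstsq(X, people, rcond=None)
    return coef  # [intercept, slope for tests, slope for cases]

def predict_people_tested(coef, tests, cases):
    """Apply fitted coefficients to daily test and case counts."""
    X = np.column_stack([np.ones(len(tests)), tests, cases])
    return X @ coef

# Synthetic daily series, NOT real Test & Trace data.
rng = np.random.default_rng(0)
n_days = 107
tests = rng.uniform(200_000, 800_000, n_days)
cases = rng.uniform(5_000, 60_000, n_days)
# Invented relationship plus noise, purely so there is something to recover.
people = 10_000 + 0.6 * tests + 1.5 * cases + rng.normal(0, 5_000, n_days)

coef = fit_people_tested(tests, cases, people)
predicted = predict_people_tested(coef, tests, cases)
```

The fitted slopes should land close to the invented values of 0.6 and 1.5, which is all the sketch is meant to show: given tests and cases, the people-tested series can be recovered by a straightforward linear fit.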
Just To Recap
These are the three periods that mark the three different ways in which the UK GOV coronavirus team went about book-keeping to obtain the daily number of unique people tested, as opposed to the number of tests those people actually took. There is a difference and that difference is important: the former is a count limited to one test result per person per week regardless of the number of tests they actually undertake, whereas the latter counts every test each person undertakes each week. Since the de-duplicated count is the one that goes into deriving estimates of case positivity, and since it can never exceed the number of tests taken, the rates churned out by the authorities will necessarily be inflated relative to a rate based on all tests.
This is a shining example of how the public can be fleeced into thinking things are worse than they really are, with the authorities putting their hand on their heart and claiming, “but these are the facts!”
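To put some numbers on the difference between the two counts, here is a back-of-envelope calculation. The figures are invented purely for illustration, and the sketch assumes positivity is simply cases divided by whichever denominator is chosen.

```python
def positivity(cases: int, denominator: int) -> float:
    """Positivity as a percentage of the chosen denominator."""
    return 100.0 * cases / denominator

# Hypothetical week (made-up numbers): 400 unique people take
# 1,000 tests between them, and 40 cases are detected.
tests_taken = 1_000
unique_people = 400
cases = 40

per_test = positivity(cases, tests_taken)      # 4.0 (% of tests)
per_person = positivity(cases, unique_people)  # 10.0 (% of unique people)
```

Same cases, same week, yet the rate more than doubles simply by swapping the denominator from tests taken to de-duplicated people.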
Now to business…
Model Performance for Regime 2 (28 Dec 2020 - 20 Feb 2022)
Once again I’m not going to present pages of statistical output and tests of model adequacy, and shall get straight to the point with a slide of observed Test & Trace daily counts of unique people tested over time and values predicted by generalised linear modelling (GLM), though I will mention that the adjusted R-square ended up at a rather robust R2(adj) = 0.982. Here’s the proof of the pudding for the second time period:
Model Performance for Regime 3 (21 Feb 2022 - 7 Jun 2022)
Proof of the pudding for the third time period, R2(adj) = 0.986:
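For anyone wondering what the R2(adj) figure quoted for each regime actually measures, the sketch below computes R-square from observed and predicted values and then applies the standard adjustment for sample size and number of predictors. The toy numbers and the choice of two predictors are assumptions for illustration; they are not the model or data behind the slides.

```python
import numpy as np

def r_squared(observed, predicted):
    """Share of variance in the observed series explained by the predictions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)        # unexplained variation
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalise R-square for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Toy series (made up), with two predictors assumed.
observed = [10, 12, 15, 19, 24]
predicted = [11, 12, 14, 19, 24]

r2 = r_squared(observed, predicted)
r2_adj = adjusted_r_squared(r2, n_obs=len(observed), n_predictors=2)
```

With several hundred daily observations and only a handful of predictors the adjustment is tiny, so an adjusted value in the high 0.9s means the predicted line is hugging the observed one very closely indeed.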
Cogitation & Coffee
I think we can safely say that I’ve demonstrated it is eminently possible to accurately predict the number of unique people tested per day using counts of the different types of test and their outcome. This should not come as a surprise. What is worth cogitating on is that if the number of people tested, the number of tests and the number of cases are all part of a rather intimate and cosy numerical relationship then we may deduce that the number of cases detected will be a function of people tested and tests and not just something the virus is doing!
Indeed so, for I undertook some cheeky modelling along these lines back in July 2021, and under the heading ‘Is Viral Testing Churning Out Random Numbers?’ I revealed that 92.5% of the variation we see in case counts can be attributed to varying levels of viral testing alone. That statement caused quite a stir on Facebook and so I may well repeat the exercise with freshly picked data.
In part 6 I shall bolt all three modelled series together and have another cogitate & coffee session.
Kettle On!



