Excess Death Figures: Further Considerations (part 1)
Weekly excess death has become a popular yardstick for assessing the impact of COVID, government policies and COVID therapies but this method is brimming with issues.
My recent five-part series entitled Excess Deaths by Cause, England 2020/w1 – 2022/w46 packed quite a punch. Back in the very first article I rambled on about the inadequacies of this widely studied data under the sub-heading A Quick Word About Excess Deaths, and I’d like to draw this to people’s attention once more by quoting the most pertinent parts:
Subscribers will be familiar with my usual ranting about this commonly used but rather inadequate method of calculating excess death. I’m not going to bang on about this again other than to say disease doesn’t carry a diary, and neither does the weather or any pathogen that I know of. There is seasonality with certain disease groups, especially respiratory, but that seasonality is not rigid. Thus, something nasty arriving a week or two earlier or later than ‘normal’ is going to throw such a basic calculation and give rise to a spurious spike. Aside from the sliding of seasonal effects we’ve got the issue of longer-term trends that will rest on a vast raft of factors. The presence of any long-term trend makes a mockery of any baseline based on a 5-year mean.
Then there’s the issue of the population changing over time both in size and age profile. Normally we have to adjust for this to avoid bias but the ONS don’t bother. I would bother (and have done so for previous analyses) but my time pressures are immense right now and I want to present excess death using the same method as the ONS. If I can, I’ll scrape some time together to produce a set of revised slides that take into account the changing age profile of the nation. One thing I will say in defence of ONS’ inadequate method is most dying is done by the oldest age groups and these sub-populations haven’t changed much over the last 10 years. Comparisons I have made between age-adjusted and non age-adjusted excess (not published) reveal minor differences that are somewhat academic, but I may well pen an article that reveals this!
…so I’ve gone and scraped some time together to produce a set of revised slides that take into account the changing age profile of the nation, and what I shall do below is set out a selection of dual time series plots so we may see with our very own eyeballs what differences arise when we tackle the changing national age profile. This, I feel, is most timely given the impact that these types of analyses (including mine) are having, and especially so given Professor Carl Heneghan and his team drew our attention to these tricky issues in this Substack article.
You know the drill… get the kettle on, fire up that toaster and let’s get munching!
Estimating Sub-populations
Before we eyeball yet more slides I’d better point out that all of my age-adjusted calculations, including mortality rates, are derived using population estimates for each age band built from ONS per-year-of-age data. Thus a mortality rate for the 18 – 29y band will be the number of deaths for that age group divided by the population estimate for that precise age group (for the nation of England).
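For anybody who likes to see the arithmetic spelled out, here is a minimal Python sketch of that band-level rate. The function names, the per-100,000 scaling and every figure in it are made up for illustration; they are not ONS numbers and this is not lifted from my stats package.

```python
def band_population(pop_by_age, lo, hi):
    """Sum single-year-of-age population estimates over the band lo..hi inclusive."""
    return sum(pop_by_age[age] for age in range(lo, hi + 1))


def band_mortality_rate(deaths, pop_by_age, lo, hi, per=100_000):
    """Deaths in the band divided by the band population, scaled to 'per' people."""
    return deaths * per / band_population(pop_by_age, lo, hi)


# Hypothetical per-year-of-age estimates: 1,000 people at each age from 18 to 29.
pop_by_age = {age: 1_000 for age in range(18, 30)}

band_total = band_population(pop_by_age, 18, 29)       # 12 single years of age
rate = band_mortality_rate(6, pop_by_age, 18, 29)      # 6 hypothetical deaths
print(band_total, rate)
```

The only point being made is that the denominator is built from the same ages as the numerator, so the 18 – 29y deaths are divided by the 18 – 29y population and nothing else.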
Anybody wanting to derive appropriately-banded sub-populations for themselves can find a rather large (77MB) but tremendously useful Excel spreadsheet called analysistool2020uk.xlsx right here on the ONS website. This table provides annual trends based on mid-year estimates only, so it is necessary to infill data for higher resolution monthly or weekly data-files. There are a number of ways of achieving this but we always need to bear in mind that sophistication is lost on a dataset that starts out being a pile of ONS estimates churned out by models resting on assumptions. We’re not very good at counting heads and there’s no way round this!
My preference is to utilise a combination of time series modelling, missing value estimation and linear interpolation modules within my stats package to arrive at sensible-looking growth curves. My eye is thus the final judge.
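For those without a stats package to hand, the simplest ingredient in that mix is plain linear interpolation between successive mid-year estimates. The sketch below does only that (no time series modelling and no missing value estimation), and it assumes each mid-year estimate refers to 30 June. The function name and the population figures are hypothetical.

```python
from datetime import date, timedelta


def interpolate_population(midyear, when):
    """Linearly interpolate between mid-year estimates.

    midyear: dict mapping year -> estimate, assumed to refer to 30 June of that year.
    when:    the date for which a population estimate is wanted.
    """
    years = sorted(midyear)
    for y0, y1 in zip(years, years[1:]):
        d0, d1 = date(y0, 6, 30), date(y1, 6, 30)
        if d0 <= when <= d1:
            frac = (when - d0).days / (d1 - d0).days
            return midyear[y0] + frac * (midyear[y1] - midyear[y0])
    raise ValueError("date falls outside the range of mid-year estimates")


# Hypothetical mid-year estimates for one age band.
midyear = {2019: 1000.0, 2020: 1366.0}

start = date(2019, 6, 30)
halfway = interpolate_population(midyear, start + timedelta(days=183))

# Weekly infill: one estimate every 7 days from the first mid-year point.
weekly = [interpolate_population(midyear, start + timedelta(weeks=w)) for w in range(52)]
print(halfway, len(weekly))
```

Straight lines between annual points will never capture a genuine wobble within the year, which is why the eye still needs to be the final judge of the resulting curves.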
Using Sub-populations
Using the sub-population counts by age group is another matter entirely and again there are a number of ways of going about this. One method is to adopt a reference year, thence to derive a series of multiplication factors to apply to counts of deaths within each age band for other years. By way of example, if there were 10% fewer 70 – 79 year-olds back in 2015 compared to 2019 then the multiplication factor applied to 70 – 79 year-old deaths in 2015 would be 1/0.9, or roughly x1.11. This is a simple calculation to make and is intuitively easy to understand, so I shall proceed by selecting a few choice time series so we may compare weekly excess death curves for unadjusted and adjusted (standardised) sub-population counts.
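The factor is simply the reference-year population divided by the population of the year being adjusted. A minimal Python sketch, with a hypothetical function name and made-up head counts and deaths:

```python
def adjustment_factor(pop_reference, pop_year):
    """Factor that scales a year's deaths to the reference-year population."""
    return pop_reference / pop_year


# Hypothetical head counts: the 2015 band is 10% smaller than the 2019 reference.
pop_2019 = 1_000_000
pop_2015 = 900_000

factor = adjustment_factor(pop_2019, pop_2015)   # 1 / 0.9, roughly 1.11
adjusted_deaths = 450 * factor                   # 450 hypothetical deaths in 2015
print(round(factor, 2), round(adjusted_deaths))
```

A smaller population in 2015 means its deaths get scaled up, not down, so that they sit on the same footing as the reference year.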
If sub-populations are changing little over time then we are not going to see much difference, if any, between the unadjusted and adjusted time series. However, if sub-populations have been somewhat dynamic we’ll start to see the two time series diverge. If the sub-population has been growing in size over time then the curve for standardised excess will be tucked beneath the unadjusted series. With this in mind let us peruse a modest selection of slides to get a taste.
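A toy example in Python shows why a growing sub-population pushes the standardised curve beneath the unadjusted one. It assumes a 5-year mean baseline and takes the year in question as the reference year. Every figure is invented, the weekly death rate is held constant on purpose, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def crude_excess(deaths_now, baseline_deaths):
    """Unadjusted excess: deaths minus the plain mean of the baseline years."""
    return deaths_now - mean(baseline_deaths)


def standardised_excess(deaths_now, pop_now, baseline_deaths, baseline_pops):
    """Excess after scaling each baseline year's deaths to the current population."""
    scaled = [d * pop_now / p for d, p in zip(baseline_deaths, baseline_pops)]
    return deaths_now - mean(scaled)


# Invented data for one week of the year: a sub-population growing steadily
# through 2015 - 2019 with the same death rate (1 per 1,000) in every year.
baseline_pops = [100_000, 102_000, 104_000, 106_000, 108_000]
baseline_deaths = [100, 102, 104, 106, 108]
pop_2020, deaths_2020 = 110_000, 110

crude = crude_excess(deaths_2020, baseline_deaths)
standardised = standardised_excess(deaths_2020, pop_2020, baseline_deaths, baseline_pops)
print(crude, standardised)
```

Nothing has changed in the underlying death rate here, yet the unadjusted method reports a positive excess purely because there are more people around to die. Standardisation removes it.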
A Modest Selection Of Slides
Round One – Neoplasms (80 – 89y)
Here we have the comparison for neoplasm in the 80 – 89y age group. There’s a difference between the two series indicating this sub-population has been increasing since 2015. To get a feel of the overall dynamic, herewith a table of summary statistics for these two series:
The shape of the distribution has remained the same, and we can still observe the CHEC (Catastrophic Health Collapse) death spike of spring 2020 but the overall mean excess is now bobbing well below the zero axis at -38.96 deaths per week instead of +9.96 deaths per week. Standardisation using sub-population head counts has thus served as a useful means of calibration, and we may conclude that, apart from the CHEC death spike, excess neoplasm deaths in this age group have been running below zero. This makes sense in view of survival following a pandemic of the elderly, and it does highlight the inadequacies of the crude method used by ONS which ignores changes in sub-populations over time.
Round Two – Diseases of the Circulatory System (70 – 79y)
There’s quite a difference here and that is because this age group has been increasing the most since 2015. The patterns are near-identical and this arises from the near-linear increase (though we must note this is a rather large assumption). Herewith that summary table:
We find that we’ve jumped from a mean positive excess of 32.83 deaths per week for this period to a mean negative excess of -31.76 deaths per week. Now that is what I call a shift, and it has come about by a rather dynamic age group!
Round Three – Diseases of the Respiratory System (60 – 69y)
We jump from one extreme to the other, for there has been little overall change in the estimated size of this subgroup since 2015 and so the two curves are near-identical. I say ‘overall’ because the dynamic has been one of rise, then fall, then rise which serves to sum to little net change.
Round Four – Mental and behavioural disorders (40 – 49y)
Another example of a population subgroup whose subtle dynamic over recent years hasn’t made much of an impact on the analysis.
Round Five – Diseases of the nervous system (50 – 59y)
Some subtle differences here brought about by changes to the underlying sub-population over the period 2015 – 2022.
Coffee & Cogitation
I am hoping that readers have gained further insight into the tricky business that is deriving excess death. We’ve seen some significant impacts of standardising deaths according to size of the sub-population and yet we’ve been pushed to detect any difference in excess for certain age groups where the underlying population has remained fairly stable over time.
We’ve learned that the pattern of excess death for the period 2020 – 2022 doesn’t change that much following standardisation and this is because estimates of the national population do not bounce around in value from month to month, or even year to year. The CHEC death spike of spring 2020 still stands as a mysterious anomaly despite standardisation: somebody somewhere had better start explaining this with authority and truthful rigour, for the current tranche of lame excuses put out by the government and its advisors is utterly inadequate.
One thing that standardisation has done is serve to calibrate what were some excessive excesses for certain age groups, and this result has been rather welcome. If we cogitate on this a little we’ll come to the realisation that these few discrepancies are going to blend with estimates that haven’t budged and so the overall picture will be one of reasonably modest error. I shall prove this right now by whipping out the slide for all age groups for heart disease:
Now that helps me breathe a little easier! The lesson here is that the more we dis-aggregate the data the pickier we need to be when we come to handle it. I think that about wraps up what I wanted to cover for the time being so I reckon it’s high time for a tea break!
Kettle On!
I'm confused:
"By way of example, if there were 10% less 70 – 79 year-olds back in 2015 compared to 2019 then the multiplication factor applied to 70 – 79 year-old deaths in 2015 would be x0.9. "
Isn't this backwards? Shouldn't the multiplier be (1/0.9)=1.11?
Example: In 2015 there were 10 deaths in a population size of 10. Everyone died. In 2019 there were 10 deaths in a population size of 1 million. 2015 was a very deadly year, but would have a multiple of basically zero. That can't be right. 2015 should be weighted more, not less.
Would the following be an equivalent way of phrasing the population size standardization?:
1) For each year in the baseline, calculate the event rate (occurrences per person).
2) A = the average of the event rates of the baseline years
3) B= the event rate in the year in question.
4) C = (B-A) = "excess event rate" in the year in question
5) D = the population size for the year in question
6) E = (C X D) = standardized excess count of events in the year in question.
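The six steps above can be sketched in Python as follows. This only illustrates the phrasing in this comment and is not necessarily the calculation behind the slides. The function name and all figures are made up.

```python
def standardised_excess_count(baseline_events, baseline_pops, events, population):
    """Follow steps 1 to 6: work in rates, then convert back to a count."""
    # 1) event rate (occurrences per person) for each baseline year
    rates = [e / p for e, p in zip(baseline_events, baseline_pops)]
    # 2) A = average of the baseline event rates
    a = sum(rates) / len(rates)
    # 3) B = event rate in the year in question
    b = events / population
    # 4) C = excess event rate
    c = b - a
    # 5) D = population size for the year in question
    d = population
    # 6) E = standardised excess count
    return c * d


# Invented figures: five baseline years at a steady 1 per 1,000, then a year
# with 121 events in a population of 110,000.
excess_count = standardised_excess_count(
    baseline_events=[100, 102, 104, 106, 108],
    baseline_pops=[100_000, 102_000, 104_000, 106_000, 108_000],
    events=121,
    population=110_000,
)
print(round(excess_count, 6))
```

Algebraically E = D x (B - A) is the same as scaling each baseline year's count to the population of the year in question, averaging, and subtracting that average from the observed count. So the two phrasings agree whenever the year in question is taken as the reference year.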
Hi John,
When are you going to look at the method used by Our World in Data to calculate excess deaths? It would be helpful, as my government, NZ, is using their method to pat themselves on the back, since at the close of 2022, they still have negative excess deaths. Admittedly, their December 25 weekly figure is total crap, as it was based on an initial estimate of 602 deaths, but that has grown to 674 deaths as more death registrations have been processed. My colleague in NZ, who has developed his own regression analysis, says the OWID method lacks transparency.
Cheers
Terry