Age-standardised annualised all cause monthly mortality in England & Wales 2001 – 2021
A comparison of methods (rev 1.4)
That title is a bit of a mouthful so let me dice it into… er… slices, then grill them lightly with a knob of butter, grated cheese and a blob of mustard:
Age-standardised
In our society older people are at greater risk of dying each year compared to younger people, so if the age profile of the population changes over time then the underlying national risk of death changes. In the UK we have a post-war baby boom that is maturing nicely, so as these souls reach those golden years of pharmaceutical bliss more of them are going to die than did matured souls three decades ago. These changes need to be accounted for and we do this by calculating the risk of death by age group then adopting a reference year to nail a population profile standard. In my calculations I have adopted quinary (5-year) age groups for risk assessment for males and females separately and used the age profile of the English & Welsh population as it was in 2012 as my standard. The ONS have adopted a number of age groups that have been standardised to the 2013 European Standard Population.[1]
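The recipe above can be sketched in a few lines of Python. This is an illustration only, not the workings actually used for this article: the function name `standardised_rate` and every count below are invented, with two age bands per sex standing in for the full set of quinary bands.

```python
def standardised_rate(deaths, population, reference, per=100_000):
    """Directly standardised death rate per `per` head of population.

    All three arguments are dicts keyed by stratum, e.g. (sex, age_band).
    """
    total_reference = sum(reference.values())
    weighted_risk = 0.0
    for stratum, ref_count in reference.items():
        # Risk of death within this stratum for the period being studied
        risk = deaths[stratum] / population[stratum]
        # Weight that risk by the stratum's share of the reference population
        weighted_risk += risk * ref_count / total_reference
    return weighted_risk * per


# Toy numbers, invented purely for illustration
deaths = {("F", "0-64"): 50, ("F", "65+"): 400,
          ("M", "0-64"): 80, ("M", "65+"): 450}
population = {("F", "0-64"): 100_000, ("F", "65+"): 20_000,
              ("M", "0-64"): 100_000, ("M", "65+"): 15_000}
reference = {("F", "0-64"): 90_000, ("F", "65+"): 15_000,
             ("M", "0-64"): 90_000, ("M", "65+"): 12_000}

crude = sum(deaths.values()) / sum(population.values()) * 100_000
print("crude:", round(crude, 1))
print("standardised:", round(standardised_rate(deaths, population, reference), 1))
```

If the reference population has the same profile as the population being studied, the standardised rate collapses back to the crude rate, which makes for a handy sanity check.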
Annualised
With around 50,000 deaths each month across England & Wales in a population of roughly 50 million we are going to observe crude monthly mortality rates of around 100 deaths per 100,000 population (50,000 ÷ 50 million × 100,000). ‘Crude’ in this sense does not mean sniggering and behaving like a complete boor; it is the term we use to indicate that we haven’t bothered to go to the trouble of standardisation. A month is one twelfth of a year last time I looked, so if we want to compare what has been happening in a month with what is happening across an entire year then we have to multiply our monthly rates by a factor of 12 or thereabouts. I say ‘thereabouts’ because not all months were made equal and so we have to fiddle with month length if we are being really picky about this. We are familiar with the phrase a month of Sundays, so think of this as a year of Aprils.
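For the picky, here is a minimal sketch of the 'thereabouts' adjustment in Python. The exact day-count correction used for this article (and by the ONS) is not spelled out above, so scaling by days in the year over days in the month is an assumption, and `annualised_rate` is a made-up name.

```python
import calendar


def annualised_rate(deaths, population, year, month, per=100_000):
    """Scale a monthly crude rate up to a yearly rate, allowing for month length."""
    days_in_month = calendar.monthrange(year, month)[1]
    days_in_year = 366 if calendar.isleap(year) else 365
    monthly_rate = deaths / population * per
    # A 'year of Aprils': pretend every day of the year behaved like this month
    return monthly_rate * days_in_year / days_in_month


# Round numbers from the text: 50,000 deaths in a population of 50 million
print(annualised_rate(50_000, 50_000_000, 2019, 4))  # a 30-day month
print(annualised_rate(50_000, 50_000_000, 2019, 2))  # a 28-day month
```

The same monthly death count gives a higher annualised rate in a short month than in a long one, which is exactly why a flat multiplier of 12 is only approximately right.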
All-cause
Everybody who dies no matter how and when. The where bit does indeed matter for these are deaths that are registered in England & Wales, this being a statutory requirement. Deaths in Bolivia do not count. It is worth noting that once upon a time people died from all manner of things like gunshot wounds, food poisoning, heart disease, cancer and stroke but these days the government and their hired minions prefer you die from a positive test result and call it COVID.
Mortality
Deaths per 100,000 head of population is the rate most commonly used, which is why I also use it thus ensuring circular reasoning. Whilst counting deaths is pretty straightforward, counting the population is not. Now and then we have a census but that is by no means a gold standard, and neither are registers of births, marriages and deaths or any expensive large-scale survey we care to make. Been there, done that, got the carrier bag. It is a sobering thought that when we divide a fairly precise number (like deaths) by a seriously dodgy number (like population) then we end up with a quotient that is also seriously dodgy. Please bear this in mind when listening to smiling academics and government officers with gleaming figures.
Plat Du Jour
The dishes of the day are two lightly poached scatterplots followed by a generalised linear model for dessert. The idea here is to compare ONS’ age-standardised all cause mortality estimates for England & Wales for the period 2001 – 2021 with crude mortality estimates I derived myself (n.b. start with this post for details of my method) to get a feel for the difference. Once digested we will scoff a second scatterplot that compares the same ONS’ age-standardised all cause mortality estimates with my age & gender-standardised mortality estimates simply to see what drops out of the bottom. If anything of interest does indeed drop out of the bottom we will wipe it clean using a generalised linear model to formally quantify differences we may eyeball. After this we have some cogitation with a coffee and mint. The waiter will now take your order…
ONS Age-Standardised vs. Crude Mortality Rate
I am hoping folk will realise the green line is the function X = Y and not one of those OLS regression trend jobbies. With that in mind we can see a lovely cloud of points that sits above the green line which means that the process of age standardisation has served to inflate estimates of the crude mortality rate. This is exactly how it should be because our population is mature in years and we need to account for the increased risk. The entire pandemic is represented by just two outliers that I’ve labelled.
ONS Age-Standardised vs. Dee’s Age & Gender-Standardised Mortality Rate
This is where it gets mighty interesting! In the blue corner we have the ONS and their methodology and in the red corner we have John ‘red under the bed’ Dee and his methodology. The correlation between these two datasets is stonking with r = 0.942 (p<0.001, n=252). Note, however, that the data cloud is now floating above the green line which means the ONS are producing age-standardised mortality estimates that may well be over-inflated because of their reliance on the 2013 European Standard Population and not the 2012 England & Wales population. Their figures may also be over-inflated because they haven’t standardised for gender differences across age groups, and possibly because they didn’t use as many age bands[2] - but I’ll need to verify this. The grey dashed line brings us to the coffee and mint…
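The two checks made by eye on this slide can be sketched in Python: the Pearson correlation between the two series, and the share of points sitting above the X = Y line. The function names and the six toy pairs below are invented for illustration; they are not the 252 monthly estimates behind the figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def share_above_identity(xs, ys):
    """Fraction of points lying above the X = Y line (y greater than x)."""
    return sum(y > x for x, y in zip(xs, ys)) / len(xs)


# Toy rates per 100k, invented purely for illustration
dee = [820, 860, 900, 950, 1010, 1100]
ons = [930, 955, 1010, 1050, 1125, 1210]

print("r =", round(pearson_r(dee, ons), 3))
print("share above X = Y:", share_above_identity(dee, ons))
```

A high r only says the two series move together; it is the position of the cloud relative to X = Y that reveals a systematic offset between the methods.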
Coffee & Mint
Anybody who cares to study the distribution of mortality rates will realise that it is not Normal. When I say not Normal I don’t mean that it has grown a third appendage; I mean that the distribution is Normal-like but not truly and fabulously Normal. I can prove this in one simple slide…
Here we have a histogram of the frequency distribution of all 252 monthly estimates of age-adjusted all cause mortality as offered by the ONS. On the far right you’ll see Apr 2020 and in the middle you’ll see a load of blue bars with a positive skew, that is to say they are sitting slightly to the left of a theoretical Normal distribution (black curve). This skew means we should ideally resort to generalised linear modelling (GLM) rather than pressing a button and fitting an off-the-shelf OLSR trendline as many bods do.
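Skew can also be put into a single number rather than judged from a histogram. A minimal sketch in Python, using the simple moment-based coefficient (one of several definitions in circulation) and a handful of invented rates with one large value standing in for a pandemic month:

```python
def sample_skewness(values):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy rates per 100k, invented purely for illustration
rates = [900, 920, 940, 950, 960, 980, 1000, 1050, 1150, 1500]
print("skewness:", round(sample_skewness(rates), 2))
```

A value near zero suggests a symmetric distribution; a clearly positive value indicates a long right-hand tail of the kind described above.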
A positively skewed bunch of numbers like this can be successfully modelled using the assumption of an underlying Gamma distribution, so these were the buttons I pressed to obtain an intercept of 62.2 deaths/100k (p=0.010) and slope of 1.045 deaths/100k (p<0.001) in the generalised linear modelling of the relationship seen in the second slide. This has been plotted as the grey dashed line, which I hope subscribers will agree is a darn decent fit!
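For the curious, here is a rough sketch of what such button-pressing does under the bonnet. The software and link function used for this article are not stated; an identity link is assumed here because the intercept is quoted in deaths/100k. The function name and toy data are invented, and a proper statistics package would also report standard errors and p-values, which this sketch does not.

```python
def fit_gamma_identity(xs, ys, iterations=25):
    """Fit y ~ a + b*x as a Gamma GLM with an identity link (assumed) via IRLS."""
    mu = list(ys)  # conventional start: fitted means equal the observations
    a = b = 0.0
    for _ in range(iterations):
        # Gamma variance is proportional to mu**2, so with an identity link
        # each IRLS step is a least-squares fit weighted by 1 / mu**2.
        w = [1.0 / m ** 2 for m in mu]
        sw = sum(w)
        mean_x = sum(wi * x for wi, x in zip(w, xs)) / sw
        mean_y = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
        sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
        b = sxy / sxx
        a = mean_y - b * mean_x
        mu = [a + b * x for x in xs]  # updated fitted means (must stay positive)
    return a, b


# Toy rates per 100k, invented purely for illustration
dee = [820, 860, 900, 950, 1010, 1100]
ons = [921, 959, 1004, 1053, 1120, 1210]
intercept, slope = fit_gamma_identity(dee, ons)
print("intercept:", round(intercept, 1), "slope:", round(slope, 3))
```

The practical difference from an OLS trendline is the weighting: points with larger fitted rates are allowed more scatter, which suits data whose spread grows with its level.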
In Plain English
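What these numbers mean in plain English is that ONS’ age-standardised estimates start out +62.2 deaths per 100k population above my age & gender-standardised estimates and then climb by 1.045 deaths per 100k for every 1 death per 100k on my scale, so the discrepancy widens by a further +0.045 deaths per 100k for each step as we move from lower to higher rates. This reads the fitted line on the original scale, i.e. ONS ≈ 62.2 + 1.045 × Dee. By way of illustration, 900 deaths per 100k on my scale would map to 62.2 + 1.045 × 900 = 1,002.7 on theirs, a gap of 102.7.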
I have no idea which set of estimates is a closer approximation to the truth for England & Wales for the period 2001 - 2021 but I’d like to think mine are très scorchio. When all is said and done and we have punched ourselves silly it is worth bearing in mind just how fragile these numbers can be that spout from the mouths of officials and experts. I know for I was that soldier!
Update 9th June 2022
I have just come across this fabulous webpage that summarises everything we need to know about using the 2013 European Standard Population. I’ll get a brew on and start to digest the info.
From the ONS Excel standardisation template…
Age-standardised rates are standardised to the European Standard Population (ESP). This is a hypothetical population and assumes that the age structure is the same in both sexes, therefore allowing comparisons to be made between the sexes as well as between geographical areas. The ESP was first introduced in 1976 but has recently been revised by the statistical office of the European Union (Eurostat). This revision was carried out in light of the changes in the age-structure of the population that occurred in the European Union Member States since the mid-seventies.
My method uses a real population (England & Wales 2012) and assumes the age structure is not the same in both sexes. If comparison is paramount then the ESP approach makes sense; it does not make any sense if you are trying to estimate the real-world age-standardised mortality in a given region. Some good news is that the ONS use quinary age bands from 5 - 9y onward.
[1] I’ll try and dig out details if I can find them - but see update.
[2] I used 36 in total (18 age bands by gender). It’s a good job I can tell the difference between males and females.





It is jolly instructive to see how the mind of an analytical statistician works and translates into visuals. Tufte would be proud.
btw - any idea WHY the ONS (in providing stats for England and Wales) standardise to a European age profile?