Age & sex standardised monthly all cause mortality 1970 – 2021
A quick look at the monthly series for England & Wales (rev 2.0)
Yesterday I took some stats tools for a spin on the annual series of 52 data points. This morning I am going to repeat the exercise for the 624 data point monthly series for age and sex standardised all cause mortality. We shall start with a time series plot of those 624 juicy points that reveals a few interesting things. The overall decline over time isn’t exactly new news, neither is the repetitive seasonality that turns this series into an alligator jaw. What is relatively new news is that our alligator has only two sharp front teeth, these being Apr 2020 and Jan 2021. The alligator also possesses some very sharp teeth way back in its mouth in Dec 1989, Feb 1976 and Jan 1970. Neither would I want to get snaffled by teeth at Jan 1997, Dec 1998 and Dec 1999!
Subscribers should know the drill by now, so it’s time to get the kettle on, open the biscuit barrel and have a cogitate on the same data series converted to its first-order difference; that is to say, a series of the month-to-month changes in mortality. As before I’ve gone all control systems theory and scribbled on a pair of 3 sigma boundary conditions that traditionally sound the alarm in places like biscuit factories.
Yep, Apr 2020 and Jan 2021 sure sounded the broken biscuit alarm but no other month did! This is all very strange considering we are supposed to have had a deadly novel virus ripping through the entire population of England & Wales. Perhaps it was only contracted to work during two months of the year. In doing so it isn’t alone for we see five other instances when the alarm was sounded, with Dec 1990 really hitting the big red button for an upswing in mortality. In my way of thinking Apr 2020 was our sixth troublesome moment over the past 52 years yet the government and their intimately attached experts decided to panic as if history didn’t exist.
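For anyone wanting to try the biscuit factory alarm at home, here is a minimal sketch of the idea in Python. It assumes the boundaries are simply the mean of the month-to-month changes plus or minus three sample standard deviations, which may not be exactly how the limits on the chart were drawn. The function names and the toy numbers are made up for illustration and are not the ONS series.

```python
from statistics import mean, stdev


def first_differences(series):
    """Month-to-month changes: x[t] - x[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def three_sigma_alarms(series):
    """Return (index, change) pairs whose change falls outside mean +/- 3 sd.

    The index refers to the position in the original series at which
    the change arrives.
    """
    diffs = first_differences(series)
    centre = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation (an assumption)
    upper = centre + 3 * spread
    lower = centre - 3 * spread
    # diffs[i] is the change from series[i] to series[i + 1]
    return [(i + 1, d) for i, d in enumerate(diffs) if d > upper or d < lower]


# Toy data only: a quiet repeating pattern with one planted spike
toy = [100, 102, 99, 101, 100, 98, 101, 99] * 5
toy[30] = 160

for position, change in three_sigma_alarms(toy):
    print(f"alarm at position {position}: change of {change}")
```

Note that a single isolated spike trips the alarm twice in the differenced series: once on the way up and once on the way back down.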
Intervention Analysis
After some eyeballing we now need to formalise what we’re seeing and the ideal way to do this for a time series such as this is to use… er… time series analysis (does what it says on the tin). So yes, it’s time for Box-Jenkins Autoregressive Integrated Moving Average (ARIMA) time series modelling again. The base period for this was Jan 1970 to Feb 2020 and after a fair bit of handle turning I settled on an ARIMA(2,1,1)(0,1,1) model structure as the best fit to the data, using common sense as well as formal methods for determining the best fitting model. This looks pretty cryptic, I admit, but all it boils down to is that there is both a linear (local) trend and a seasonal trend, as well as seasonal and non-seasonal ‘shocks’. The more interesting component is arguably a non-seasonal memory effect, whereby what the mortality rate is doing this month depends on what it was doing last month and the month before that. Death has a memory span that is two months long!
The Pudding
The proof of the pudding is in the eating and so I place before you a tasty treat, being observed and predicted monthly age and sex standardised all cause mortality. I have left off the 95% confidence intervals because they were so tightly bound that the chart looked a right mess! With a well-fitting model such as this (r = 0.973, p<0.001, n=589 for the period Jan 1970 – Feb 2020) we reliably get to see what may well have happened if the pandemic had not arisen. If we take a magnifying glass we note the unseasonal first COVID spike that was Apr 2020 and the seasonal second spike that was Jan 2021.
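A side note on that n=589. One plausible reading, sketched below, is that it is what remains of the base period once the two differencing steps in ARIMA(2,1,1)(0,1,1) have eaten their share: one month for the ordinary difference and twelve for the seasonal difference. The seasonal period of 12 is an assumption (it is the natural choice for monthly data but is not stated in the model label), and the helper names are hypothetical.

```python
def difference(series, lag):
    """Apply (1 - B^lag): x[t] - x[t - lag], losing the first `lag` points."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def months_between(start_year, start_month, end_year, end_month):
    """Inclusive count of calendar months between two dates."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


# Base period quoted in the text: Jan 1970 to Feb 2020
n_base = months_between(1970, 1, 2020, 2)

# Stand-in values only; the length is what matters here, not the data
placeholder = list(range(n_base))

after_d = difference(placeholder, 1)   # non-seasonal difference, d = 1
after_D = difference(after_d, 12)      # seasonal difference, D = 1, period 12 (assumed)

print(n_base, len(after_d), len(after_D))  # 602 601 589
```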
The trick now is to set up a series of indicator variables that mark each month during 2020 and run the baseline model structure all the way from Jan 1970 to Dec 2021. When this is done we end up with this fascinating table…
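For the curious, an indicator variable of this sort is nothing more exotic than a column of zeros with a single 1 in the month of interest. The sketch below builds a couple of them over the full Jan 1970 to Dec 2021 span. The function names are hypothetical, and the two flagged months are just an illustrative subset rather than the full set used in the model.

```python
def month_index(start_year, start_month, end_year, end_month):
    """List of (year, month) tuples covering the span, inclusive."""
    months = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def pulse_indicators(months, flagged):
    """One 0/1 column per flagged month: 1 in that month, 0 everywhere else."""
    return {f: [1 if ym == f else 0 for ym in months] for f in flagged}


months = month_index(1970, 1, 2021, 12)

# Illustrative subset only; add one entry per month you want to test
flagged = [(2020, 4), (2021, 1)]
dummies = pulse_indicators(months, flagged)

print(len(months))                 # 624 monthly data points
print(sum(dummies[(2020, 4)]))     # each indicator switches on exactly once
```

Each column then goes into the model as an extra regressor, and its coefficient measures how far that one month sits from the baseline.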
What we need to focus on in this table are those regression coefficients. The column headed ‘Estimates’ gives the factor increase in mortality over the modelled baseline in natural log form. The column headed ‘Approx Sig.’ tells us if each estimate is statistically significant. Thus we see that Apr 2020, May 2020 and Jan 2021 all reach p<0.001 whereas Nov 2020 and Dec 2020 reach p=0.002, with Feb 2021 fetching up at p=0.005. We can ignore all other months since these fail to rise above the null hypothesis to become significant predictors. The pandemic therefore only reared its head for 6 months out of a total of 12, this being a half-hearted effort by SARS-CoV-2 (literally). The gold goes to Apr 2020 for securing a 78.2% rise in all cause mortality over that anticipated by historical trends (e^0.578 = 1.782). Silver goes to Jan 2021 for securing a 47.4% rise (e^0.388 = 1.474), with May 2020 achieving bronze at 22.1% (e^0.200 = 1.221).
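The medal table arithmetic is easy to check. Because the estimates are in natural log form, exponentiating gives the multiplicative factor and subtracting one gives the percentage rise. A quick sketch, with a hypothetical helper name and the three coefficients quoted above:

```python
import math


def pct_change_from_log(coef):
    """Convert a natural-log coefficient into a percentage change over baseline."""
    return (math.exp(coef) - 1) * 100


# Coefficients as quoted in the text
medals = [("Apr 2020", 0.578), ("Jan 2021", 0.388), ("May 2020", 0.200)]

for label, coef in medals:
    print(f"{label}: factor {math.exp(coef):.3f}, rise {pct_change_from_log(coef):.1f}%")
```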
Identification Of Outliers Method
There’s another approach we may take for intervention analysis and that is to sweep away all the monthly indicator variables and get the ARIMA(2,1,1)(0,1,1) model process to tell us where it thinks the outliers are. There are several flavours of outlier within time series analysis that I shall not discuss here other than to reproduce the tabular output from the program…
Above is a list of all months that the ARIMA(2,1,1)(0,1,1) algorithm reckons stick out in some way. Interestingly, all but one are of the additive variety. An additive outlier is an outlier that affects a single observation, so we are looking at a collection of isolated outbreaks of something. Apart from Apr and May 2020 all are outliers that land during the winter flu season, and flu is likely the cause of the additional death. Apr and May 2020 are thus very queer fish indeed in terms of how respiratory illness typically sweeps through the population. We may note the Apr 2020 estimate of 0.541, which indicates a 71.8% increase in mortality over that expected (e^0.541 = 1.718), this being in the same ballpark as my earlier estimate of 78.2% using a different method. Aside from Apr and May no other months during 2020 are designated as outliers, which is intriguing for a pandemic that was purported to have swamped us with second and third waves according to the experts.
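To make the ‘single observation’ point concrete, here is a small sketch of what an additive outlier does on the log scale: it bumps one month by a fixed amount and leaves the neighbours alone. The flat baseline of 100 is invented purely for illustration, and the function name is hypothetical; only the 0.541 estimate comes from the table.

```python
import math


def apply_additive_outlier(log_series, position, omega):
    """Add omega to one point of a log-scale series, leaving the rest untouched."""
    shifted = list(log_series)
    shifted[position] += omega
    return shifted


# Invented flat baseline of 100 deaths per unit population, on the log scale
baseline_log = [math.log(100.0)] * 7
position = 3
omega = 0.541  # Apr 2020 estimate quoted in the text

with_outlier = apply_additive_outlier(baseline_log, position, omega)

# Ratio of affected series to baseline, back on the original scale
ratios = [math.exp(a - b) for a, b in zip(with_outlier, baseline_log)]
print([round(r, 3) for r in ratios])
```

Only the affected month shows a ratio of about 1.718; every other month stays at 1, which is what separates an isolated outbreak from a lasting shift in level.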
The negative value for Dec 2021 is almost certainly due to delays in processing death certificates and needs to be ignored, though I am sure that some will unscrupulously claim this is due to vaccine benefit. In fact, we may take that natural log figure of -0.307 and convert it back to 0.736 to suggest that the ONS are some 26.4% down on processing death certificates for Dec 2021. That’s quite a lot of missing deaths! Is there something they should be telling the nation?