Toward age-adjusted mortality (part 1)
A rummage in the Office for National Statistics' pantry...
We all know that age-adjusted mortality is where it's at, but do you think you can find a source file sitting on the ONS server somewhere that usefully provides all the info you need in one neat spreadsheet?
No indeed!
Nor am I interested in any old age adjustment: I want population estimates by quinary (5-year) age band from zero upward, by gender, by year, going all the way back to 1970. I want to have my cake and eat it!
Whilst such a file does not exist, the ONS do offer some raw ingredients among the datasets listed on this page. A careful rummage will yield three files of importance; namely:
analysistool2020uk.xlsx
populations20012016.xls
historicmaritalstatusestimatesmid1971tomid2001.xls
These offer exactly what I want for the periods 1971 - 1981, 1986 - 2001, 2001 - 2016 and 2011 - 2020 providing I stick to a common 18-level coding frame of 0-4y, 5-9y, 10-14y, 15-19y, 20-24y, 25-29y, 30-34y, 35-39y, 40-44y, 45-49y, 50-54y, 55-59y, 60-64y, 65-69y, 70-74y, 75-79y, 80-84y, 85+y.
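To make the coding frame concrete, here is a minimal Python sketch of how single-year ages could be collapsed into those 18 bands. It assumes a source that reports counts by single year of age, which may not match how each ONS file is laid out; the names `BANDS`, `age_band` and `collapse` are mine, purely for illustration.

```python
# The 18-level quinary coding frame: 0-4y, 5-9y, ..., 80-84y, then an open-ended 85+y.
BANDS = [f"{lo}-{lo + 4}y" for lo in range(0, 85, 5)] + ["85+y"]


def age_band(age: int) -> str:
    """Return the quinary band label for a single year of age (hypothetical helper)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # Integer division by 5 gives the band index; anything from 85 up lands in the last band.
    return BANDS[min(age // 5, len(BANDS) - 1)]


def collapse(single_year_counts: dict) -> dict:
    """Sum counts keyed by single year of age into the 18 bands."""
    totals = {band: 0.0 for band in BANDS}
    for age, count in single_year_counts.items():
        totals[age_band(age)] += count
    return totals
```

Every band appears in the output of `collapse`, even when its total is zero, so files with different age ranges end up with the same 18 columns.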
There’s A Hole In My Bucket…
Bolting these three files together to provide an annual span from 1971 to 2020 wasn’t difficult, but the process did leave holes for 1982 - 1985 and 1987 - 1990, and didn’t provide estimates for 1970 or 2021. To overcome this I resorted to linear interpolation for the holes and ARIMA time series modelling for the two end years. By way of example I offer both time series (male and female) for the 0 - 4y age group for the period 1970 - 2021 so you can see just how well ARIMA performed in providing estimates for 1970 and 2021, along with the efforts produced by linear interpolation (missing values are sitting between pairs of dashed lines):
I’m quite pleased with this. Straight away we can see that more males than females are born each year, and that this ratio hasn’t changed much over a span of 52 years. Over on the far left we observe the residual of the baby boom of the 1960s coming to an end by 1979, only to kick off again to produce the millennials. I hadn’t appreciated a third birth peak (Generation Z / Generation Alpha) but there you go, here it is in orange and sea green!
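For anyone wanting to try the hole-filling step at home, here is a rough Python sketch of linear interpolation across missing years. It is not the exact procedure used for the chart: the function name `fill_gaps` and the numbers in the example are made up, and the end years (1970 and 2021) are deliberately left alone because they call for extrapolation, which is where a time series model such as ARIMA comes in.

```python
from typing import Dict, Optional


def fill_gaps(series: Dict[int, Optional[float]]) -> Dict[int, float]:
    """Linearly interpolate missing years (None) that sit between two known years.

    Missing years before the first or after the last known value are skipped,
    since filling them would be extrapolation rather than interpolation.
    """
    known = sorted(year for year, value in series.items() if value is not None)
    filled: Dict[int, float] = {}
    for year in sorted(series):
        value = series[year]
        if value is not None:
            filled[year] = float(value)
            continue
        earlier = [y for y in known if y < year]
        later = [y for y in known if y > year]
        if not earlier or not later:
            continue  # leading or trailing hole: not handled here
        y0, y1 = earlier[-1], later[0]
        weight = (year - y0) / (y1 - y0)  # how far along the gap this year sits
        filled[year] = series[y0] + weight * (series[y1] - series[y0])
    return filled


# Illustrative values only (not ONS figures): a hole for 1982 - 1985.
example = {1981: 100.0, 1982: None, 1983: None, 1984: None, 1985: None, 1986: 110.0}
estimates = fill_gaps(example)
```

Each interpolated year lies on the straight line joining the known values either side of the hole, so the four missing years in the example step evenly from 100 up to 110.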
In my next post I shall reveal a few curious features of the ONS population estimates, so you'd better get some toast under the grill and get the kettle on.
That third Gen Z boom is powerfully influenced by recent immigration: much of it from Eastern European countries after the fall of the old USSR, much also from African and other locations.
One of the most common birth names in the last ten or more years is Mohammad (at #5), and at least four spelling variants of this name come into the top 200 boys' names.
Do we have any idea why those years are missing from the ONS data?