COVID Uncovered (part 4)
Lessons from an undisclosed NHS Trust
Yesterday I stumbled upon a most peculiar result: COVID patients who died in hospital within the NHS Trust under study over the period 1st Feb – 7th Dec 2020 were healthier than non-COVID cases. A staged multivariate logistic regression model revealed the incidence of chronic respiratory disease, pulmonary disease, acute myocardial infarction, cardiac arrhythmia, heart failure, other cardiac conditions, hypertension, organ failure, clotting, haemorrhage, sepsis and general inflammation to be less frequent in patients testing positive for COVID prior to death, when we’d expect some of these to be more frequent.
By way of example, acute myocardial infarction – a disease strongly associated with the elderly – was recorded in 2,020/9,469 (21.2%) of in-hospital deaths for non-COVID admissions and in 276/1,687 (16.4%) of in-hospital deaths for COVID admissions (p<0.001). This is most odd! In contrast, acute respiratory conditions, diseases of the bronchus, pericarditis/myocarditis and diabetes were more frequently found in the electronic patient record of COVID cases, as expected.
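For readers who want to check the arithmetic behind a comparison like this, here is a minimal sketch of a two-sided two-proportion z-test using the counts quoted above. This is an assumption on my part about a suitable check, not necessarily the test that produced the reported p-value, and the function name is purely illustrative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two independent proportions.

    x1/n1 and x2/n2 are the event counts and group sizes.
    Illustrative helper only; not taken from the original analysis.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis of no difference
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Acute myocardial infarction counts quoted in the text
p_non_covid, p_covid, z, p_value = two_proportion_z_test(2020, 9469, 276, 1687)

print(f"non-COVID deaths with AMI: {p_non_covid:.1%}")
print(f"COVID deaths with AMI:     {p_covid:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

A simple unadjusted test like this ignores age and every other covariate, so it can only confirm that the raw difference is unlikely to be chance; it says nothing about why the difference exists.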
I decided to probe the relationship between COVID status and all the other indicator variables I had gathered using Factor Analysis, this being a rather magical statistical technique for boiling large numbers of raw variables down into fewer family groupings (I attach a link to a Wiki entry for those interested in the methodological background). The colourful result is attached. Please note that I have only plotted scores whose magnitude exceeds 0.200, positive or negative, to avoid clutter – what you see are the scores that matter!
Down the left we have the names of the raw diagnostic variables that sit in my deaths database. Along the top we have 14 family groups (a.k.a. factors, a.k.a. components, a.k.a. eigenvectors) to which these raw variables are deemed to belong in n-dimensional variance space. The matrix is arranged and coloured to reveal the relationships that have been unearthed.
Top left in pale grey are three raw variables that have been grouped together in a family under eigenvector 1 with scores of 0.968 (Falls), 0.762 (Fractures) and 0.648 (Serious wounding & injury). This grouping makes a great deal of sense and we might label it ‘heavy trauma’ or something of that ilk. The scores indicate how strongly each raw variable correlates with the family grouping, the maximum score being 1.000.
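To give a feel for where scores like these come from, here is a rough sketch that extracts the first component from a small correlation matrix by power iteration and then hides any score below the 0.200 cut-off. The correlation values are made up for illustration (they are not from my deaths database), and a principal-component style extraction is only one of several ways factor analysis software can arrive at its scores, so treat this as a toy rather than a reproduction of the chart.

```python
import math

# HYPOTHETICAL correlations between four indicator variables,
# invented for illustration only (not taken from the real data).
names = ["Falls", "Fractures", "Serious wounding & injury", "Diabetes"]
R = [
    [1.0, 0.7, 0.6, 0.0],
    [0.7, 1.0, 0.5, 0.0],
    [0.6, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def first_component_loadings(corr, iterations=200):
    """Scores (loadings) of each variable on the dominant component.

    Power iteration finds the leading eigenvector of the correlation
    matrix; scaling it by sqrt(eigenvalue) gives the correlation of
    each variable with that component, so magnitudes cannot exceed 1.
    """
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return [x * math.sqrt(eigenvalue) for x in v]


loadings = first_component_loadings(R)

THRESHOLD = 0.200  # same cut-off used for the chart, applied to magnitude
for name, loading in zip(names, loadings):
    shown = f"{loading:.3f}" if abs(loading) >= THRESHOLD else ""
    print(f"{name:<28}{shown}")
```

In this toy the three trauma variables all score highly on the one component while diabetes drops out of view, which is the same kind of pattern the pale grey block displays.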
Below this family grouping sits a pale purple group of just diabetes (0.947) and immunocompromised (0.930). Diabetes does not necessarily mean that a patient is immunocompromised per se, but many patients with diabetes have impaired immune systems owing to the effects of both short-term and long-term hyperglycaemia. This grouping provides evidence that this is indeed the case.
So let us now look at where COVID-19 sits. We find it under eigenvector 4 along with acute respiratory conditions (0.754), total diagnoses made (0.302), cancer (-0.407) and other respiratory/pulmonary disease (-0.210). What is this telling us? It is telling us that COVID and acute respiratory conditions are synonymous. It is also telling us that COVID deaths are inversely related to other respiratory/pulmonary conditions and cancer. That is to say, factor analysis is attempting to distinguish between respiratory viral deaths and respiratory deaths arising from cancers of the lung and bronchus.
Aside from this point of distinction, COVID is not connected to any other disease or condition flagged within my sample. This is a little odd given the raft of symptoms declared in various case studies by various teams since the onset of the pandemic. For example, I was fully expecting a link to sepsis, organ failure and cardiac conditions, but this does not appear to be the case for these 11,156 deaths within this particular NHS Trust.


