Date: 2023-08-21

In part 1 of this mini-series we took a quick look at how the COVID diagnosis as registered in the EPR of 19,457 in-hospital deaths stacked up against the predicted COVID diagnosis I have derived using machine learning applied to a matrix of symptoms, age, sex and disease prevalence. We observed a decent overlap of 86% of cases with just 14% falling into the ‘we-better-query-these’ category. This morning I’m going to follow through on the ‘we-better-query-these’ category by… er… running some queries, and I shall start by creating an indicator variable (COVagree) that identifies the 2,767 cases that were not in diagnostic agreement.
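For anyone wanting to try this at home, an agreement indicator of this kind might be sketched as below. This is only an illustration on four made-up records: the field names `epr_covid` and `predicted_covid` are hypothetical, and coding COVagree as 1 for agreement and 0 for disagreement is my assumption for the sketch rather than a statement of how the real variable is coded.

```python
# Made-up records for illustration only; the field names epr_covid and
# predicted_covid are hypothetical, not the real variable names.
records = [
    {"id": 1, "epr_covid": 1, "predicted_covid": 1},
    {"id": 2, "epr_covid": 1, "predicted_covid": 0},
    {"id": 3, "epr_covid": 0, "predicted_covid": 0},
    {"id": 4, "epr_covid": 0, "predicted_covid": 1},
]


def add_agreement_flag(rows):
    """Return copies of rows with COVagree set to 1 when the registered and
    predicted diagnoses match, and 0 when they do not (assumed coding)."""
    return [
        {**row, "COVagree": int(row["epr_covid"] == row["predicted_covid"])}
        for row in rows
    ]


flagged = add_agreement_flag(records)

# The 'we-better-query-these' cases are simply those flagged 0.
disagreements = [row for row in flagged if row["COVagree"] == 0]
agreement_rate = sum(row["COVagree"] for row in flagged) / len(flagged)

print(f"{len(disagreements)} of {len(flagged)} cases disagree")
print(f"agreement rate: {agreement_rate:.0%}")
```

Applied to the real data, the same comparison would be expected to flag the 2,767 disagreeing cases out of 19,457.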

At this point I could sit with umpteen tables and a bucket of slides and work my way through trying to ascertain which factors, if any, stand out in distinguishing diagnostic agreement from disagreement, or I could rely on machine learning again and use a multilayer perceptron to tell me in 6 seconds flat what it thinks is important. Herewith its final decision:

Case Complexity

