An Easy Way To Understand The Cross Correlation Function
Most folk have an intuitive grasp of correlation between two variables and of the ubiquitous measure we use to reveal how strong that correlation is: Pearson's correlation coefficient, a.k.a. Pearson's product-moment correlation coefficient, a.k.a. Pearson bivariate correlation. It is denoted by the letter 'r' when calculated from a sample, or by the Greek letter rho (ρ) when referring to the whole population.
When r = 1 we have perfect positive correlation, which is rarely observed in nature. Positive correlation occurs when two measures 'go up together' or 'go down together', such as hours of sunlight and temperature. When r = -1 we have perfect negative correlation, which is again rarely observed in nature. Negative correlation occurs when one measure goes up as the other goes down, indicating an inverse relationship (a good example is a see-saw).
Between these two extremes sits r = 0, which indicates zero correlation. As a statistician I'd use an example along the lines of two sets of random numbers but we can get the same effect by considering government policies.
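For anyone who wants the arithmetic behind r, the standard textbook definition for n paired observations is given below. The symbols x_i and y_i (the paired values) and x̄ and ȳ (their means) are my notation for illustration, not something taken from any particular dataset.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The top line is positive when the two measures tend to sit above or below their means together and negative when they move in opposite directions; the bottom line rescales the result so that it always falls between -1 and 1.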
In the case of COVID we may correlate daily counts of cases against daily admissions to hospital. In doing so we must realise that both counts are for the same day, such that we're correlating Monday's case count against Monday's admission count, Tuesday's case count against Tuesday's admission count and so on.
Is this reasonable? Not in the slightest, because Monday's admission count might have been generated by cases arising two weeks ago! Exposed for what it is, the approach is plainly nonsense, yet this is the very way in which expert after expert is going about analysing the data. You can fudge matters by assuming a standard two-week delay, this being the preferred method for the ONS/PHE and others. This will work well if the delay to treatment is exactly two weeks and only ever two weeks. What if it is not two weeks? What if the delay to treatment varies?
What we need to do is compare Monday's case count not just with Monday's admission count but with Tuesday's admission count, Wednesday's admission count and so on. Neither do we need to stop there, for we can compare Monday's case count with the previous Sunday's admission count, the previous Saturday's admission count and so on. This way we get to see the full spectrum of all possible correlations between case count and admission count over time. No fudge needed!
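The day-shifting idea above can be sketched in a few lines of Python. This is a minimal illustration rather than a production estimator: the function name cross_correlation is my own, the case and admission series are synthetic numbers invented purely for the demo (with a 14-day delay deliberately built in), and each lag is scored by a plain Pearson r on the overlapping days, which differs slightly from the estimator some statistics packages use.

```python
import numpy as np


def cross_correlation(x, y, max_lag):
    """Pearson r between x[t] and y[t + lag] for lags -max_lag..max_lag.

    A positive lag pairs each x value with a LATER y value;
    a negative lag pairs it with an EARLIER y value.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]      # y shifted forward in time
        else:
            a, b = x[-lag:], y[:n + lag]     # y shifted backward in time
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result


# --- Synthetic demo data (NOT real COVID figures) ---
rng = np.random.default_rng(0)
days, delay = 200, 14                        # assumed 14-day delay, for illustration only
cases = rng.poisson(100, size=days).astype(float)

admissions = np.full(days, 0.1 * cases.mean())
admissions[delay:] = 0.1 * cases[:-delay]    # admissions echo cases from 14 days earlier
admissions += rng.normal(0.0, 0.5, size=days)

ccf = cross_correlation(cases, admissions, max_lag=21)
best_lag = max(ccf, key=ccf.get)
print("same-day r:", round(ccf[0], 2))
print("strongest correlation at lag:", best_lag)
```

With this made-up data the same-day correlation is weak while the correlation at the built-in delay is strong, which is the pattern a fixed same-day comparison would miss. With real data there is no guarantee of a single clean peak, and that is rather the point of looking at every lag.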
This is precisely what the cross correlation function (CCF) is doing, and in applying CCF we get to spot anything hidden that doesn’t make sense. The stuff that doesn’t make sense in a big way leads to awkward questions about the quality of the data, how it is collected, how it is defined and what it actually means. As it stands it means we’ve been duped into thinking there is a link between the rise in cases and the rise in hospital admissions due to COVID.
https://en.wikipedia.org/wiki/Pearson_correlation_coefficient

