SIR models and Covid-19 spread

This is being written in early April 2020 from the United Kingdom (UK). A few weeks ago a friend sent me theoretical material on a model often used to understand the spread of disease. I had already done some preliminary spreadsheet presentations on UK data and extracted a rough idea of the rate of spread, using an exponential model. At that time, mid-March 2020, I had found that an exponential model would have the entire UK population infected by the virus within 40 days. That's less than 20 days from now.

In real life, exponential growth does not continue forever. More sophisticated modeling uses a number of different approaches. For example, stochastic models set up scenarios with people-like objects subject to random influences that researchers imagine are relevant to the spread of the disease. The model is run many times and in many variations, speculative conclusions are drawn, and plans are made. The plans depend on the relative and overlapping values placed by different societies on life, “the economy”, future prospects, class, the elderly, ethnicity, etc.

Here, we look at a discretized form of a compartmentalized deterministic model that splits the population into three parts: susceptible (S), infected (I), and recovered (R). Collecting the three parts as “SIR” gives the name to the model. On any given day, some susceptible people become infected and some infected people recover (hopefully). For simplicity, we also include death as a form of “recovery”, although to the victim or their family it is not. [Wondering here if a fourth compartment “(L)oss” could be set up, which would account for overwhelmed health services, or “businesses” in some countries.]

Mathematical details

So how does the SIR model work? The proportion of susceptible people who become infected on a given day depends on the number of infected people they are likely to contact. This reduces the pool of susceptibles. Using the common symbol “Δ” to represent the change from one day to the next, we have:

\[\Delta S = - \alpha SI\]

The symbol S represents the susceptible pool, I the infected pool, and α some constant. Putting these symbols together indicates that they should be multiplied.

Those newly infected people increase the infected pool, but we also have to account for the fact that a proportion of infected people recover each day. The changes here are the increase from newly infected people and the decrease from recovery:

\[\Delta I = + \alpha SI - \beta I\]

The new feature here is the constant β. Recovery is an individual process: it does not depend on contact with anyone else, so the recovery term is proportional to I alone. The sequence is completed with the change in R:

\[\Delta R = + \beta I\]

Notice that if we add up all the changes:

\[\Delta S + \Delta I + \Delta R = - \alpha SI + \alpha SI - \beta I + \beta I = 0\]

The total number of people (S + I + R) doesn't change, just their assignments to the compartments.
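The three update rules can be stepped forward one day at a time. The following is a minimal sketch, assuming S, I and R are head counts and that α is scaled by the population size; the function name `simulate_sir` and all numerical values are illustrative assumptions, not figures fitted to any data.

```python
def simulate_sir(s0, i0, r0, alpha, beta, days):
    """Apply the daily SIR update rules and return a list of (S, I, R) tuples."""
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = alpha * s * i  # equals -ΔS
        new_recoveries = beta * i       # equals +ΔR
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative values only (assumptions, not estimates from real data)
population = 1_000_000
history = simulate_sir(
    s0=population - 10, i0=10, r0=0,
    alpha=0.3 / population, beta=0.1, days=200,
)

# The total S + I + R should stay constant, apart from rounding error
totals = [s + i + r for s, i, r in history]
print(max(totals) - min(totals))
```

Printing the spread of the daily totals is a quick check of the conservation property: any difference should be at the level of floating-point rounding.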

There are a number of valid criticisms of the model. The most obvious to me is that the disease doesn't act on a day-to-day timetable; its action is spread over a period of around two weeks. First comes a latent period where the victim doesn't have enough viral load to infect others (significantly?), then the most dangerous infectious period where they don't appear sick, then the sickness itself, which also appears to have stages, before, hopefully, recovery. During the sick period, those at risk are family carers or, if necessary, hospital staff, so the infection dynamics are different. All these factors mean that the daily data put out by governments carry all sorts of delays relative to the actual spread dynamics.

For the purpose of this article, my main concern is to show how one might get a handle on the gross α and β parameters from the data presented by governments.
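One possible way to get such a handle, sketched here under strong assumptions, is to read the update rules as straight lines through the origin: ΔS plotted against SI has slope −α, and ΔR plotted against I has slope β. The sketch below fits both slopes by least squares. It assumes that daily series for S, I and R are available, which published figures do not supply directly, and it uses synthetic data generated from the model itself. The function name `estimate_alpha_beta` and all numbers are hypothetical, and this is not necessarily the approach taken in the rest of the article.

```python
def estimate_alpha_beta(S, I, R):
    """Least-squares slopes through the origin for ΔS = -α·S·I and ΔR = β·I."""
    num_a = den_a = num_b = den_b = 0.0
    for t in range(len(S) - 1):
        dS = S[t + 1] - S[t]
        dR = R[t + 1] - R[t]
        x = S[t] * I[t]
        num_a += -dS * x
        den_a += x * x
        num_b += dR * I[t]
        den_b += I[t] * I[t]
    return num_a / den_a, num_b / den_b


# Synthetic daily series built from the update rules with known constants
# (illustrative values only, not real data)
population = 1_000_000
alpha_true, beta_true = 0.3 / population, 0.1
S, I, R = [population - 10.0], [10.0], [0.0]
for _ in range(60):
    new_infections = alpha_true * S[-1] * I[-1]
    new_recoveries = beta_true * I[-1]
    S.append(S[-1] - new_infections)
    I.append(I[-1] + new_infections - new_recoveries)
    R.append(R[-1] + new_recoveries)

alpha_hat, beta_hat = estimate_alpha_beta(S, I, R)
print(alpha_hat * population, beta_hat)
```

Because the synthetic series follow the model exactly, the fit recovers the constants that generated them. Real data are noisy and delayed, so estimates from them would be much rougher.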