In this figure we compare the number of COVID-19 fatalities reported to date, shown as colored dots, with multiple projections drawn from the posterior predictive distribution. These projections are shown as faint solid lines (the more lines that overlap along a path, the more likely that path is). We define day zero for each country as the day it announced its first fatality. The vertical grid lines mark important events that may have affected the growth rate, such as the separate lockdowns (LD) applied by China, Italy, and Spain. All curves are drawn from a logistic model and predict the number of fatalities (N) for each country analyzed here.
A population will grow its size according to a growth rate. In the case of exponential growth, this rate stays the same regardless of the population size, inducing the population to grow faster and faster as it gets larger, without an end.
In theory, any kind of organism could take over the Earth just by reproducing. For instance, imagine that we started with a single pair of male and female rabbits. If these rabbits and their descendants reproduced at top speed ("like bunnies") for 7 years, without any deaths, we would have enough rabbits to cover the entire state of Rhode Island. And that's not even so impressive: if we used E. coli bacteria instead, we could start with just one bacterium and have enough bacteria to cover the Earth with a 1-foot layer in just 36 hours!
As you've probably noticed, there isn't a 1-foot layer of bacteria covering the entire Earth (at least, not at my house), nor have bunnies taken possession of Rhode Island. Why, then, don't we see these populations getting as big as they theoretically could? E. coli, rabbits, and all living organisms need specific resources, such as nutrients and suitable environments, in order to survive and reproduce. These resources aren't unlimited, and a population can only reach a size that matches the availability of resources in its local environment.
Population ecologists use a variety of mathematical methods to model population dynamics (how populations change in size and composition over time). Some of these models represent growth without environmental constraints, while others include "ceilings" determined by limited resources. Mathematical models of populations can be used to accurately describe changes occurring in a population and, importantly, to predict future changes.
*End of the Khan Academy citation*
The figure above shows how the logistic and exponential models are constructed; to understand them better you can watch the video "Exponential growth and epidemics" below.
After reading this text it should be clear that the growth of the virus cannot remain exponential indefinitely; it has to flatten at some point. One function with this behaviour is the logistic model, used here to predict the number of deaths.
If we solve the equation on the right of the previous figure, we obtain the logistic function. A logistic function or logistic curve is S-shaped. This type of curve is known as a sigmoid and its equation is as follows:
$$N(t) = \frac{K}{1 + e^{-r(t-t_0)}}.$$

To fit a logistic function to infection or death counts, one needs to calibrate the parameters $K$ (the final plateau), $r$ (the growth rate) and $t_0$ (the time of the inflection point) as the epidemic evolves. Tracking them carefully can be very useful for building a zeroth-order intuition about how effective the measures taken to contain the disease are. Interestingly, these quantities may be estimated by knowing that the following relations are satisfied,
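A sketch of such relations, obtained here by evaluating and differentiating the logistic function above (the exact set intended for this text may differ):

$$N(t_0) = \frac{K}{2}, \qquad \frac{dN}{dt} = r\,N(t)\left(1 - \frac{N(t)}{K}\right), \qquad \left.\frac{dN}{dt}\right|_{t=t_0} = \frac{rK}{4}.$$

The rate $dN/dt$ is largest at $t = t_0$, so the peak of the daily counts marks the inflection point of the cumulative curve.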
Some of the quantities above can be estimated by tracking the data. In that sense, knowing (or estimating) the number of affected people $N(t_0)$ and the maximum rate, more commonly known as the peak of the distribution, one can invert the equations to predict $K$.
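As a rough illustration of this inversion, the sketch below takes a cumulative daily series, locates the day with the largest daily increase, and uses the logistic identities $N(t_0) = K/2$ and $N'(t_0) = rK/4$ to back out $K$ and $r$. The function names and the synthetic series are hypothetical, and the estimate is only meaningful if the peak has already been observed in the data.

```python
import math

def logistic(t, K, r, t0):
    """Logistic curve N(t) = K / (1 + exp(-r (t - t0)))."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def estimate_from_peak(cumulative):
    """Rough (K, r, t0) from a cumulative series sampled once per day.

    Daily rates are central differences; the day with the largest rate is
    taken as t0, then N(t0) = K/2 and N'(t0) = r*K/4 are inverted.
    """
    rates = [(cumulative[i + 1] - cumulative[i - 1]) / 2.0
             for i in range(1, len(cumulative) - 1)]
    # rates[j] belongs to day j + 1 of the cumulative series
    peak_day = max(range(len(rates)), key=rates.__getitem__) + 1
    K = 2.0 * cumulative[peak_day]
    r = 4.0 * rates[peak_day - 1] / K
    return K, r, peak_day

# Synthetic check with made-up parameters (not fitted to any country)
series = [logistic(day, 10000.0, 0.2, 30.0) for day in range(61)]
K_est, r_est, t0_est = estimate_from_peak(series)
```

With noisy real data the daily increments would normally be smoothed first, since a single reporting spike can be mistaken for the peak.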
A logistic function represents a simplified form of the more complete SI (Susceptible, Infected) models (wiki). An SI model is a dynamical system that simulates the interaction and evolution rates of a population with $N = S + I$ elements, where the dynamical variables are the number of susceptible individuals, $S$, and the number of infected individuals, $I$.
The logistic function arises as the solution of this dynamical system, which describes the flow of people from $S$ to $I$. Here $I$ is taken as the total cumulative number of infected cases, that is, people who have been infected regardless of whether they recover or not. Notice that with this definition $I$ grows monotonically until it reaches a maximum, thus mimicking sigmoid-type behaviour. This can be naturally translated to the number of deaths, with plateau $K$, if there exists a known empirical relation between the two quantities. This is the case in our particular study.
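To make the link concrete, here is a minimal sketch assuming the standard SI equations $dS/dt = -\beta S I / N$ and $dI/dt = \beta S I / N$, where the contact rate $\beta$ is a symbol introduced here for illustration. Substituting $S = N - I$ gives $dI/dt = \beta I (1 - I/N)$, whose solution is a logistic curve with $K = N$ and $r = \beta$. The code integrates the equations numerically and compares the result with that closed form; all parameter values are made up.

```python
import math

def si_closed_form(t, n_total, beta, i0):
    """Logistic solution of dI/dt = beta * I * (1 - I / n_total), I(0) = i0."""
    return n_total / (1.0 + (n_total / i0 - 1.0) * math.exp(-beta * t))

def si_simulate(n_total, beta, i0, t_end, dt=0.01):
    """Integrate the SI model with a classical 4th-order Runge-Kutta scheme."""
    def rate(i):
        s = n_total - i                  # susceptible = total - infected
        return beta * s * i / n_total    # flow of people from S to I

    i = i0
    for _ in range(int(round(t_end / dt))):
        k1 = rate(i)
        k2 = rate(i + 0.5 * dt * k1)
        k3 = rate(i + 0.5 * dt * k2)
        k4 = rate(i + dt * k3)
        i += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return i

# Hypothetical values: 1000 people, one initial case, beta = 0.3 per day
numeric = si_simulate(1000.0, 0.3, 1.0, t_end=40.0)
exact = si_closed_form(40.0, 1000.0, 0.3, 1.0)
```

In this parametrization the inflection time is $t_0 = \ln(N/I(0) - 1)/\beta$, the moment at which half of the population has been infected.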
We use the logistic model defined above, with a negative binomial distribution as likelihood, to obtain the posterior predictive distribution of our model, from which we sample to generate new data based on our estimated posteriors. Please do not be put off by this; if you want a rough idea of the concept behind it, scroll down to the video titled "The Bayesian Trap" by Veritasium. In short, the figures show which evolutions of the curves are expected to be observed, given this dataset and our model. Note that the predictions take the uncertainty into account: in the cases where few data points are available the uncertainty, i.e. the spread of the predictions, grows.
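The sampling step can be sketched as follows, assuming posterior draws of $(K, r, t_0)$ are already available from a fit. The draws used below are made-up placeholders, the function names are hypothetical, and the dispersion is restricted to an integer so that a negative binomial count can be built from geometric variables using only the standard library. This is not the code behind the figures.

```python
import math
import random

def logistic(t, K, r, t0):
    """Mean curve N(t) = K / (1 + exp(-r (t - t0)))."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def sample_neg_binomial(mu, size, rng):
    """One negative binomial count with mean mu and integer dispersion `size`.

    Sums `size` geometric variables (failures before a success), each with
    success probability p = size / (size + mu), so the total has mean mu.
    """
    if mu <= 0:
        return 0
    p = size / (size + mu)
    log_q = math.log1p(-p)               # log(1 - p), strictly negative
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(size))

def posterior_predictive(posterior_draws, days, size, rng):
    """One simulated trajectory per posterior draw (K, r, t0)."""
    return [[sample_neg_binomial(logistic(t, K, r, t0), size, rng) for t in days]
            for (K, r, t0) in posterior_draws]

rng = random.Random(0)
draws = [(10000.0, 0.20, 30.0), (12000.0, 0.18, 32.0)]   # placeholders only
paths = posterior_predictive(draws, range(0, 60, 10), size=5, rng=rng)
```

Each trajectory corresponds to one faint line: the spread across trajectories combines the uncertainty in the parameters with the sampling noise of the likelihood.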
The Gompertz curve is an alternative model that can be taken at this point as an upper bound. We have noticed that the logistic model tends to place the inflection point close to the end of the available data, and therefore most likely gives a lower-bound prediction. We will not discuss its origins here, but simply mention that this curve is also a sigmoid function.
Examples of uses for Gompertz curves include:
where:
Similarly to what happens with the logistic function, the variables $N(0)$, $a$ and $c$ may be estimated from the time series. For this function, these quantities are related to the counts and count rates as,
Some of the quantities above can be estimated by tracking the data. In that sense, knowing (or estimating) the number of affected people $N(t_0)$ and the maximum rate, more commonly known as the peak of the distribution, one can invert the equations to predict $N(0)$.
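A minimal sketch of this inversion, assuming the common parametrization $N(t) = a\,\exp\left[\ln(N(0)/a)\,e^{-ct}\right]$, which may differ from the exact form used for the figures. Under that assumption the inflection point satisfies $N(t_0) = a/e$ and the maximum rate is $a c / e$, so the count and rate at the peak determine $a$, $c$ and $N(0)$. Function names and numbers are hypothetical.

```python
import math

def gompertz(t, a, c, n0):
    """Gompertz curve with asymptote a, rate c and initial value n0 (assumed form)."""
    b = math.log(a / n0)                 # displacement fixed by N(0) = a * exp(-b)
    return a * math.exp(-b * math.exp(-c * t))

def gompertz_from_peak(n_at_peak, peak_rate, t_peak):
    """Invert N(t0) = a/e and N'(t0) = a*c/e, then recover N(0)."""
    a = math.e * n_at_peak
    c = peak_rate / n_at_peak
    n0 = a * math.exp(-math.exp(c * t_peak))   # uses b = exp(c * t0)
    return a, c, n0

# Made-up example: asymptote 10000, c = 0.1 per day, N(0) = 10
a_true, c_true, n0_true = 10000.0, 0.1, 10.0
t_peak = math.log(math.log(a_true / n0_true)) / c_true
a_est, c_est, n0_est = gompertz_from_peak(a_true / math.e,
                                          a_true * c_true / math.e, t_peak)
```

For the same count at the inflection point, this Gompertz form plateaus at $e \approx 2.72$ times that count while the logistic curve plateaus at twice that count, which is consistent with treating the two models as upper and lower bounds.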