
In this figure we compare the number of COVID-19 fatalities reported to date, shown as colored dots, with multiple projections drawn from the posterior predictive distribution. These projections are shown as faint solid lines (the more lines that overlap along a path, the more likely that path is). We define time zero for each country as the day it reported its first fatality. The vertical grid lines mark important events that may have affected the growth rate, such as the separate lockdowns (LD) applied by China, Italy, and Spain. All curves are drawn from a logistic model and predict the number of fatalities (N) for each country analyzed here.

A population will grow its size according to a growth rate. In the case of exponential growth, this rate stays the same regardless of the population size, inducing the population to grow faster and faster as it gets larger, without an end.

- In nature, populations can only grow exponentially during some period, but inevitably the growth rate will ultimately be limited for example by the resource availability.
- In logistic growth, the population growth rate gets smaller and smaller as population size approaches a maximum. This maximum is, in essence, a product of overpopulation limiting the population's resources.
- Exponential growth produces a J-shaped curve, while logistic growth produces an S-shaped curve.
- When we read about bending the curve, we are talking about plotting the data on a logarithmic scale, where the J-shaped curve becomes a straight line. The moment this straight line bends downwards, we start seeing the limiting factors and are getting close to the center of the S-shaped curve, which on this scale looks like an inverted J. Remember that this is only a matter of how the data is plotted; it does not affect the data itself.

In theory, any kind of organism could take over the Earth just by reproducing. For instance, imagine that we started with a single pair of male and female rabbits. If these rabbits and their descendants reproduced at top speed "like bunnies" for 777 years, without any deaths, we would have enough rabbits to cover the entire state of Rhode Island. And that's not even so impressive – if we used E. coli bacteria instead, we could start with just one bacterium and have enough bacteria to cover the Earth with a 111-foot layer in just 36 hours!

As you've probably noticed, there isn't a 111-foot layer of bacteria covering the entire Earth (at least, not at my house), nor have bunnies taken possession of Rhode Island. Why, then, don't we see these populations getting as big as they theoretically could? E. coli, rabbits, and all living organisms need specific resources, such as nutrients and suitable environments, in order to survive and reproduce. These resources aren’t unlimited, and a population can only reach a size that matches the availability of resources in its local environment.

Population ecologists use a variety of mathematical methods to model population dynamics (how populations change in size and composition over time). Some of these models represent growth without environmental constraints, while others include "ceilings" determined by limited resources. Mathematical models of populations can be used to accurately describe changes occurring in a population and, importantly, to predict future changes.

*end of the khanacademy citation*

The figure above shows how the logistic and exponential models are constructed; to understand them better you can watch the video "Exponential growth and epidemics" below.

After reading this text, it should be clear that the growth of the virus cannot remain exponential indefinitely; it has to flatten at some point. One function with this behavior is the logistic model, used here to predict the number of deaths.

If we solve the equation on the right of the previous figure, we obtain the logistic function. A logistic function or logistic curve is S-shaped. This type of curve is known as a sigmoid and its equation is as follows:

$$N(t) = \frac{K}{1 + e^{-r(t-t_0)}},$$

where:

- $e$ = the base of the natural logarithm, also known as Euler's number,
- $t_0$ = the $t$-value of the sigmoid where the rate starts to decrease, the midpoint of its evolution and the inflection point of the sigmoid's curve,
- $K$ = the curve's maximum value; in this case the maximum number of deaths,
- $r$ = the logistic growth rate, or steepness of the curve.
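As a minimal sketch of the function defined above, the following Python snippet evaluates the logistic curve and its derivative and checks numerically that the curve passes through $K/2$ at $t_0$, where its slope is $rK/4$. The parameter values are hypothetical and chosen only for illustration; they are not fitted to any country.

```python
import math

def logistic(t, K, r, t0):
    """Logistic curve N(t) = K / (1 + exp(-r (t - t0)))."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def logistic_rate(t, K, r, t0):
    """Derivative of the curve, dN/dt = r N (1 - N / K)."""
    n = logistic(t, K, r, t0)
    return r * n * (1.0 - n / K)

# Hypothetical parameter values, chosen only for illustration.
K, r, t0 = 10_000.0, 0.2, 40.0

midpoint = logistic(t0, K, r, t0)          # value at the inflection point, K / 2
peak_rate = logistic_rate(t0, K, r, t0)    # maximum growth rate, r K / 4
plateau = logistic(t0 + 200.0, K, r, t0)   # far past t0 the curve saturates at K
```

Before $t_0$ the daily increments keep growing; after $t_0$ they shrink, which is what produces the S shape.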

If one wants to fit a logistic function to infection or death counts, one needs to calibrate the parameters $K$, $t_0$ and $r$ as the epidemic evolves. Tracking them carefully can give a zeroth-order intuition about the effectiveness of the measures taken to contain the disease. Interestingly, these quantities may be estimated by knowing that the following relations are satisfied:

- $t_0 = \ln(K)/r$,
- $n(t_0) = K/2$,
- $\frac{dn}{dt}(t_0) = rK/4$,
- time to double, i.e. the time $t_2$ at which the count reaches twice its value at time $t$: $t_2 = \frac{1}{r}\ln\left[\frac{2K e^{rt}}{K - e^{rt}}\right]$, valid while $e^{rt} < K$.

Some of the quantities above can be estimated by tracking the data. In that sense, knowing (or estimating) the number of affected people $n(t_0)$ and the maximum rate, more commonly known as the peak of the distribution, one can invert the equations to predict $K$.

A logistic function represents a simplified form of the more complete SI (Susceptible, Infected) models (wiki). An SI model is a dynamical system that simulates the interaction and evolution rates of a population with N=S+I elements, where the dynamical variables are:

- S: people susceptible to being infected by the disease.
- I: people infected by the disease.

The logistic function arises as the solution of this dynamical system, which describes the flow of people from S to I. Here I is taken as the total cumulative number of infected cases, that is, people who have been infected regardless of whether they recover or not. Notice that with this definition I grows monotonically until it reaches a maximum, thus mimicking sigmoid-type behavior. This can be naturally translated to the number of deaths, with maximum $K$, if there exists a known empirical relation between the two. This applies in our particular study.
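To make the link between the SI model and the logistic curve concrete, here is a small sketch under stated assumptions. It takes the standard SI equation $dI/dt = \beta S I / N$ with a constant population $N = S + I$, integrates it with simple forward Euler steps, and compares the result with the closed-form logistic solution. The transmission rate `beta`, the population size and the initial number of infected are hypothetical values, and the function names are illustrative only.

```python
import math

def simulate_si(population, infected0, beta, t_end, dt=0.01):
    """Integrate dI/dt = beta * S * I / population with forward Euler steps."""
    infected = infected0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        susceptible = population - infected  # S = N - I, the population is constant
        infected += dt * beta * susceptible * infected / population
    return infected

def logistic_solution(population, infected0, beta, t):
    """Closed-form SI solution: a logistic curve with K = population and r = beta."""
    ratio = (population - infected0) / infected0
    return population / (1.0 + ratio * math.exp(-beta * t))

# Hypothetical values, chosen only for illustration.
population, infected0, beta = 1000.0, 1.0, 0.3

numeric = simulate_si(population, infected0, beta, t_end=30.0)
exact = logistic_solution(population, infected0, beta, t=30.0)
relative_error = abs(numeric - exact) / exact
```

The two results agree up to the discretization error of the Euler scheme, which shrinks as `dt` is reduced.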

We combine the logistic model defined above with a negative binomial distribution as likelihood to obtain the posterior predictive distribution of our model, from which we sample to generate new data based on our estimated posteriors. Please do not be put off by this; if you want a rough idea of the concept behind it, go further down to the video titled "The Bayesian Trap" by Veritasium. In short, the figures show which evolutions of the curves are expected to be observed, given the data and our model. Note that the predictions take the uncertainty into account: in cases where few data points are available, the uncertainty, i.e. the spread of the predictions, grows.
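The idea of sampling from a posterior predictive distribution can be sketched with NumPy as follows. This is not the inference code behind the figures: the posterior draws of $K$, $r$ and $t_0$ are invented stand-ins for MCMC output, the dispersion `alpha` of the negative binomial is an assumed value, and the noise is drawn independently for each day. Each posterior draw yields one simulated trajectory, the analogue of one faint line in the figure.

```python
import numpy as np

def logistic(t, K, r, t0):
    """Expected cumulative count N(t) = K / (1 + exp(-r (t - t0)))."""
    return K / (1.0 + np.exp(-r * (t - t0)))

def sample_posterior_predictive(t, K_draws, r_draws, t0_draws, alpha, rng):
    """Simulate one noisy trajectory per posterior draw."""
    # Mean curve for every draw: rows are draws, columns are time points.
    mu = logistic(t[None, :], K_draws[:, None], r_draws[:, None], t0_draws[:, None])
    # Negative binomial with mean mu and dispersion alpha, in NumPy's (n, p) form.
    p = alpha / (alpha + mu)
    return rng.negative_binomial(alpha, p)

rng = np.random.default_rng(0)
n_draws = 500
t = np.arange(0.0, 61.0)  # days since the first reported fatality

# Invented stand-ins for MCMC output; a real analysis would use the fitted posterior.
K_draws = rng.normal(10_000.0, 1_500.0, n_draws).clip(min=1_000.0)
r_draws = rng.normal(0.20, 0.02, n_draws).clip(min=0.05)
t0_draws = rng.normal(40.0, 3.0, n_draws)

sims = sample_posterior_predictive(t, K_draws, r_draws, t0_draws, alpha=20.0, rng=rng)
lower, upper = np.percentile(sims, [5, 95], axis=0)  # 90% predictive band per day
```

The band between `lower` and `upper` widens when the posterior draws disagree more, which is the numerical counterpart of the growing spread of the predictions when few data points are available.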

The Gompertz curve is an alternative model that could be taken at this point as an upper bound. We have noticed that the logistic model tends to place the inflection point close to the end of the available data, and therefore most likely gives a lower-bound prediction. We are not going to discuss its origins, but simply mention that this curve is also a sigmoid function.

Examples of uses for Gompertz curves include:

- Modelling of growth of tumors
- Modelling market impact in finance
- Detailing population growth in animals of prey, with regard to predator-prey relationships
- Examining disease spread
- Modelling bacterial cells within a population

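The curve can be written as

$$n(t) = N(0)\, e^{-b e^{-a t}},$$

where:

- $N(0)$ sets the scale of the curve; in this form it is the value at which the curve saturates, playing the role of $K$ in the logistic model
- $a$ denotes the rate of growth
- $b=e^{a c}$ is a positive number
- $c$ denotes the displacement in time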

Similarly to what happens with the logistic function, the parameters $N(0)$, $a$ and $c$ may be estimated by inspecting the time series. For this function, these quantities are related to the counts and count rates as follows:

- $t_0 = \ln(b)/a$,
- $n(t_0) = N(0)/e$,
- $\frac{dn}{dt}(t_0) = a N(0)/e$,
- time to double: $t_2 = \frac{1}{a}\ln\left[\frac{b}{\ln\left(\frac{1}{2} e^{b e^{-a t}}\right)}\right]$.
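The relations above can be checked numerically with a short sketch. It assumes the Gompertz curve has the form $n(t) = N(0)\, e^{-b e^{-a t}}$ with $b = e^{a c}$, which is the form consistent with the listed relations. The name `n_max` stands for $N(0)$, and all parameter values are hypothetical.

```python
import math

def gompertz(t, n_max, a, c):
    """Gompertz curve n(t) = n_max * exp(-b * exp(-a t)) with b = exp(a c)."""
    b = math.exp(a * c)
    return n_max * math.exp(-b * math.exp(-a * t))

def gompertz_doubling_time(t, a, c):
    """Time t2 at which the count is twice its value at time t, or None if it never is."""
    b = math.exp(a * c)
    # Equals log(0.5 * exp(b * exp(-a t))), the inner logarithm of the formula.
    remaining = b * math.exp(-a * t) - math.log(2.0)
    if remaining <= 0.0:
        return None  # the count at time t already exceeds half of the plateau
    return math.log(b / remaining) / a

# Hypothetical parameter values, chosen only for illustration.
n_max, a, c = 10_000.0, 0.1, 30.0

t0 = math.log(math.exp(a * c)) / a       # inflection point, ln(b) / a = c
count_at_t0 = gompertz(t0, n_max, a, c)  # n_max / e
t2 = gompertz_doubling_time(20.0, a, c)  # when the count at day 20 has doubled
late = gompertz_doubling_time(40.0, a, c)  # None: too close to the plateau to double
```

Note that `t2` is a point in time, not a duration; the number of days needed to double is `t2 - t`.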

Some of the quantities above can be estimated by tracking the data. In that sense, knowing (or estimating) the number of affected people $n(t_0)$ and the maximum rate, more commonly known as the peak of the distribution, one can invert the equations to predict $N(0)$.
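As a rough illustration of this inversion, the sketch below takes the count and the growth rate observed at the peak and solves the inflection-point relations of both models for their plateau and rate parameters. The input numbers are hypothetical, and the function names are illustrative only.

```python
import math

def logistic_from_peak(count_at_peak, rate_at_peak):
    """Invert n(t0) = K / 2 and dn/dt(t0) = r K / 4 for K and r."""
    K = 2.0 * count_at_peak
    r = 4.0 * rate_at_peak / K
    return K, r

def gompertz_from_peak(count_at_peak, rate_at_peak):
    """Invert n(t0) = N(0) / e and dn/dt(t0) = a N(0) / e for N(0) and a."""
    n_max = math.e * count_at_peak
    a = rate_at_peak / count_at_peak
    return n_max, a

# Hypothetical observations at the peak: 5000 cumulative deaths, 500 deaths per day.
K, r = logistic_from_peak(5000.0, 500.0)
n_max, a = gompertz_from_peak(5000.0, 500.0)
```

For the same peak data the Gompertz plateau is $e/2 \approx 1.36$ times the logistic one, which fits with treating the Gompertz curve as an upper bound and the logistic curve as a lower bound.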