Free time & chunking modeling
  • Versions
    • Latest
    • v0.1
  • Source Code
  • Report a Bug
  1. Notebooks
  2. Model 3: Non-linear recovery
  3. Linear recovery as a random variable
  • Version 0.1
  •  
  • About
  • Development notes
    • Notes
    • 2024-05-16 Meeting Notes
    • Extra primacy parameter
  • Notebooks
    • Data
      • View the data structure
      • Exploratory data analysis
      • Subject-level data
    • Model 1: Original model
      • Main results
      • Parameter identifiability
      • Sensitivity to tau
      • Experiment 3
      • Exploring model predictions
    • Model 2: Include encoding time
      • Main results
    • Model 3: Non-linear recovery
      • Explore model predictions
      • Basic fits
      • Bootstrapping data and fits for parameter uncertainty estimation
      • Extra primacy parameter
      • Linear recovery as a random variable
  • Function reference
    • Aggregate Data
    • Perform bootstrapped estimation
    • Calculate the deviance of a model
    • Get data object from a file
    • Generate a bootstrapped dataset
    • get_data function
    • Inverse Logit Transformation
    • Logit Transformation
    • Calculate the overall deviance
    • Plot Bootstrap Results
    • Plot Linear RV Recovery
    • Preprocesses the data
    • Execute an expression and save the result to a file or load the result from a file if it already exists.
    • Serial Recall Model

On this page

  • Overview
  • Linear recovery with a random variable rate
  • Analytical solutions for the average recovery over time
    • Exponential distribution
    • Gamma distribution
  • Compare with exponential recovery
    • Marginal recovery from an exponential distribution
    • Marginal recovery from a gamma distribution
  • Conversion of parameters
  • Summary
  1. Notebooks
  2. Model 3: Non-linear recovery
  3. Linear recovery as a random variable

Linear recovery as a random variable

  • Show All Code
  • Hide All Code

  • View Source
Author

Ven Popov

Published

May 27, 2024

Modified

May 28, 2024

Overview

In the simulations we have considered two different equations for the recovery of resources over time:

Linear recovery

Resources recover linearly over time at a constant rate r until they reach a maximum value of 1:

\[ R_{new} = \min(1, R + r \cdot t) \]

where \(R\) is the resource level before recovery, \(t\) is the recovery duration, and \(r\) is the recovery rate.

Exponential recovery

Alternatively, resources could recover non-linearly over time at a rate that depends on the current resource level:

\[ R_{new} = R + (1 - R) \cdot (1 - e^{-r t}) = 1 - (1 - R)e^{-r t} \]

which can be simplified in the case of recovery starting from 0 to

\[ R(t) = 1 - e^{-r t} \]

Klaus and I discussed the possibility that the recovery is linear, but the rate of recovery r is a random variable that varies over trials and individuals. Similarly to the evidence accumulation rate in the linear balistic accumulator models, at the aggregate level the recovery rate might be better modeled with the exponential recovery equation, even if the individual recovery is linear. In this notebook, I explore this idea further and derive analytical solutions for the average recovery over time under different assumptions about the distribution of r.

Linear recovery with a random variable rate

Assume that resource recover linearly over time at a constant rate r until they reach a maximum value of 1. Assume that r is a random variable. Let’s plot the average resource recovery over time under different assumptions about the distribution of r.

Code
par(mar = c(4, 0.5, 2, 1), lwd = 2)
curve(
  dunif(x, 0, 1),
  from = 0, to = 4, n = 1001, col = "black",
  xlab = "r", ylab = NA, main = "Distribution of r",
  ylim = c(0, 1.5), yaxt = "n",
)
curve(
  truncnorm::dtruncnorm(x, a = 0, mean = 0.5, sd = 0.3),
  from = 0, to = 4, n = 1001, col = "red", add = TRUE
)
curve(
  dgamma(x, shape = 1, scale = 1),
  from = 0, to = 4, n = 1001, col = "blue", add = TRUE
)
curve(
  dgamma(x, shape = 3, rate = 5),
  from = 0, to = 4, n = 1001, col = "orange", add = TRUE
)
# add legend
legend("topright",
  legend = c("Uniform(0, 1)", "Normal(0.5, 0.3)", "Exp(1)", "Gamma(3, 5)"),
  col = c("black", "red", "blue", "orange"), lty = 1
)

  1. r is uniformly distributed between 0 and 1.
  2. r is distributed as a truncated normal distribution with mean 0.5 and standard deviation 0.3
  3. r is distributed as an exponential distribution with rate 1
  4. r is distributed as a gamma distribution with shape 3 and rate 5

Red line shows the average recovery over time. Black lines show individual trajectories of a few random samples of r. The full distributions of r are shown in the margin.

Code
set.seed(3)
n <- 1000
r_unif <- runif(n, 0, 1)
r_truncnorm <- truncnorm::rtruncnorm(n, a = 0, mean = 0.5, sd = 0.3)
r_exp <- rgamma(n, shape = 1, scale = 1)
r_gamma <- rgamma(n, shape = 3, rate = 5)
t <- seq(0, 5, length.out = 100)

plot_linear_rv_recovery(r_unif, t, title = "r ~ Uniform(0, 1)") + 
  plot_linear_rv_recovery(r_truncnorm, t, title = "r ~ Normal(0.5, 0.3), r >= 0") +
  plot_linear_rv_recovery(r_exp, t, "r ~ Exp(1)") + 
  plot_linear_rv_recovery(r_gamma, t, "r ~ Gamma(3, 5)")

Analytical solutions for the average recovery over time

For simplicity, consider the case of recovery starting from 0. Given a continuous distribution of r, \(f(r)\), we can derive the average recovery over time by integrating two separate quantites over all possible values of r. We have to consider two cases:

  1. Resources have not yet reached their maximum value of 1 and are recovering linearly at a rate r. This is the case for \(r < 1/t\)
  2. Resources have reached their maximum value of 1 and are no longer recovering. This is the case for \(r \geq 1/t\)

The average recovery after time t is then given by the following combination of integrals:

\[ R(t) = \int_{0}^{1/t} r \cdot t \cdot f(r) \, dr + \int_{1/t}^{\infty} 1 \cdot f(r) \, dr \]

Below I give the solution for the exponential and gamma distributions of r.

Exponential distribution

Let’s consider the case where r is distributed as an exponential distribution with rate \(\lambda\). The probability density function of the exponential distribution is given by

\[ f(r) = \lambda e^{-\lambda r} \]

The average recovery over time is then given by

\[\begin{aligned} R(t) &= \int_{0}^{1/t} r \cdot t \cdot \lambda e^{-\lambda r} \, dr + \int_{1/t}^{\infty} 1 \cdot \lambda e^{-\lambda r} \, dr \\ &= \left[ -\frac{t e^{\lambda (-r)} (\lambda r+1)}{\lambda } \right]_{0}^{1/t} + \left[ -e^{-\lambda r} \right]_{1/t}^{\infty} \\ &= \frac{t-e^{-\lambda / t} (\lambda +t)}{\lambda } + e^{-\lambda / t} \\ &= \frac{t}{\lambda} (1 - e^{-\lambda/t}) \end{aligned}\]

Gamma distribution

The probability density function of the gamma distribution is given by

\[ f(r) = \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)} \]

where \(\alpha\) is the shape parameter and \(\lambda\) is the rate parameter. The average recovery over time is then given by

\[ \begin{aligned} R(t) &= \int_{0}^{1/t} r \cdot t \cdot \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)} \, dr + \int_{1/t}^{\infty} 1 \cdot \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)} \, dr \\ &= \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \left( t \int_{0}^{1/t} r^{\alpha} e^{-\lambda r} \, dr + \int_{1/t}^{\infty} r^{\alpha - 1} e^{-\lambda r} \, dr \right) \\ \end{aligned} \]

which after some transformations can be written as

\[ \begin{aligned} R(t) &= \frac{\Gamma \left(\alpha ,\frac{\lambda }{t}\right)}{\Gamma (\alpha )}+\frac{\alpha t}{\lambda}\frac{\gamma \left(\alpha +1,\frac{\lambda }{t}\right)}{\Gamma (\alpha +1)} \\ &= P(\alpha, \lambda / t) + \frac{\alpha t}{\lambda} Q(\alpha + 1, \lambda / t) \end{aligned} \]

where \(\Gamma(a, z)\) is the upper incomplete gamma function, \(\gamma(a, z)\) is the lower incomplete gamma function; and \(P(a, z)\) and \(Q(a, z)\) are the regularized upper and lower incomplete gamma functions, respectively. P and Q are respectively the complimentary cdf and the cdf of the gamma distribution. They can be computed using the pgamma function in R with the lower.tail = FALSE argument for the upper incomplete gamma function and the pgamma function with the lower.tail = TRUE argument for the lower incomplete gamma function.

Compare with exponential recovery

The equations derived above under the assumption of an exponential or gamma distribution of r are obviously not the same as the exponential recovery equation. But is the exponential recovery equation a good approximation of the average recovery over time when r is a random variable following an exponential or gamma distribution?

In this part, I compare the recovery given by the exponential recovery equation with the recovery given by the marginalized linear recovery with r following an exponential distribution or gamma distribution. I will use the analytical solutions derived above to fit the parameters of the exponential and gamma distributions to the exponential recovery equation.

Let’s define our functions:

exp_recovery <- function(r, t) {
  1 - exp(-r * t)
}

marginal_recovery_exponential <- function(lambda, t) {
  t/lambda * (1 - exp(-lambda/t))
}

marginal_recovery_gamma <- function(alpha, lambda, t) {
  P <- pgamma(lambda / t, shape = alpha, lower.tail = FALSE) 
  R <- pgamma(lambda / t, shape = alpha + 1)
  P + alpha * t / lambda * R
}

Marginal recovery from an exponential distribution

Set the true recovery rate to 0.3 and fit the rate parameter of the exponential distribution to the exponential recovery equation.

Code
objective_fn <- function(log_lambda, t, target) {
  pred <- marginal_recovery_exponential(exp(log_lambda), t)
  sqrt(mean((pred - target)^2))
}

set.seed(2)
true_r <- 0.3
t <- seq(0, 60, length.out = 1000)
fit <- optim(1, objective_fn, t = t, target = exp_recovery(true_r, t))
lambda <- exp(fit$par)

curve(
  marginal_recovery_exponential(lambda, t),
  from = 0, to = 20, n = 1001,
  col = "blue", ylim = c(0, 1), xname = "t", ylab = "R(t)"
)
curve(
  exp_recovery(true_r, t),
  col = "red", add = TRUE, xname = "t"
)
legend("bottomright", legend = c("Marginal linear recovery\n(rate is an exponential random variable)", "Exponential recovery"), col = c("blue", "red"), lty = 1)

The overall shape is similar, but the approximation is not great.

Marginal recovery from a gamma distribution

The approximation when the rate follows a gamma distribution is much better::

Code
objective_fn <- function(pars, t, target) {
  pred <- marginal_recovery_gamma(exp(pars[1]), exp(pars[2]), t)
  sqrt(mean((pred - target)^2))
}

set.seed(2)
true_r <- 0.30
t <- seq(0, 200, length.out = 1000)
fit <- optim(c(0, 0), objective_fn, t = t, target = exp_recovery(true_r, t))
pars <- exp(fit$par)

curve(
  marginal_recovery_gamma(pars[1], pars[2], t),
  from = 0, to = 1 / true_r * 5, n = 1001,
  col = "blue", ylim = c(0, 1), xname = "t", ylab = "R(t)"
)
curve(
  exp_recovery(true_r, t),
  col = "red", add = TRUE, xname = "t"
)
legend("bottomright", legend = c("Marginal linear recovery\n(rate is a gamma random variable)", "Exponential recovery"), col = c("blue", "red"), lty = 1)

with an RMSE of 0.005. The fit is basically the same for all values of r.

Summary

The exponential recovery equation is a good approximation of the average recovery over time when the rate of recovery is a random variable following a gamma distribution. The approximation is not as good when the rate of recovery is a random variable following an exponential distribution.

Conversion of parameters

Optional

The section below is not directly relevant to our project, but I found the results quite fascinating. Feel free to skip it if you are not interested in the relationship between the parameters of the gamma distribution and the true recovery rate.

When the rate of recovery is a random variable following a gamma distribution, this is approximated well by the exponential recovery equation. The gamma distribution has two parameters: the shape parameter \(\alpha\) and the rate parameter \(\lambda\), while the exponential recovery equation has only one parameter: the rate of recovery r. How do the parameters of the gamma distribution relate to the recovery rate in exponential recovery equation?

I will simulate the recovery over time for different values of the true recovery rate r in the range [0.1, 5]. Then I will fit to this the marginal recovery predicted by a linear recovery process where the rate follows a gamma distribution. I will then explore the relationship between the parameters of the gamma distribution and the true recovery rate.

Code
t <- seq(0, 200, length.out = 10000)
get_pars <- function(true_r, t) {
  objective_fn <- function(pars, t, target) {
    pred <- marginal_recovery_gamma(exp(pars[1]), exp(pars[2]), t)
    sqrt(mean((pred - target)^2))
  }

  fit <- optim(c(0, 0), objective_fn, t = t, target = exp_recovery(true_r, t))

  pars <- exp(fit$par)
  pars
}

true_r <- seq(0.1, 5, length.out = 100)
pars <- run_or_load(sapply(true_r, get_pars, t = t), "output/gamma_recovery_pars.rds")

par(mfrow = c(1, 2))
plot(true_r, pars[1, ], type = "l", col = "blue", xlab = "Exponential recovery rate (r)", ylab = "alpha")
plot(true_r, pars[2, ], type = "l", col = "red", xlab = "Exponential recovery rate (r)", ylab = "lambda")

I did not expect this. The shape parameter \(\alpha\) is fixed at 2.7937 and only the rate \(\lambda\) changes. Turns out that lambda is linearly related to 1/r (red line is the “data” and blue line is the fitted line from a linear regression):

Code
par(mfrow = c(1, 1), lwd = 3)
plot(1 / true_r, pars[2, ], type = "l", col = "red", xlab = "1 / r", ylab = "lambda")
lines(pars[2, ] ~ I(1 / true_r), col = "blue", lty = 4)

with the following relationship:

\[ \lambda \approxeq 3.831 \,r^{-1} \]

Summary

An exponential recovery processes with rate \(r\) can be approximated well by a linear recovery process where the rate of recovery is a random variable which follows a gamma distribution with shape parameter \(\alpha =\) 2.794 and rate parameter \(\lambda = 3.831 \,r^{-1}\). That is, modeling resource recovery with the following two equations is almost equivalent:

\[ R(t) = 1 - e^{-r t} \]

and

\[ R(t) = min(1, r_{linear}t) \\ \]

\[ r_{linear} \sim \text{Gamma}(2.794, 3.831 \cdot r^{-1}) \]

Back to top
Bootstrapping data and fits for parameter uncertainty estimation
Aggregate Data
Source Code
---
title: "Linear recovery as a random variable"
format: html
author: "Ven Popov"
date: 05/27/2024
date-modified: 05/28/2024
---

```{r init, include = FALSE}
#| message: false
library(ggplot2)
library(dplyr)
library(targets)
library(patchwork)
tar_source()
```

## Overview

In the simulations we have considered two different equations for the recovery of resources over time:

#### Linear recovery

Resources recover linearly over time at a constant rate `r` until they reach a maximum value of 1:

$$
R_{new} = \min(1, R + r \cdot t)
$$

where $R$ is the resource level before recovery, $t$ is the recovery duration, and $r$ is the recovery rate.

#### Exponential recovery

Alternatively, resources could recover non-linearly over time at a rate that depends on the current resource level:

$$
R_{new} = R + (1 - R) \cdot (1 - e^{-r t}) = 1 - (1 - R)e^{-r t}
$$

which can be simplified in the case of recovery starting from 0 to

$$
R(t) = 1 - e^{-r t}
$$

Klaus and I discussed the possibility that the recovery is linear, but the rate of recovery `r` is a random variable that varies over trials and individuals. Similarly to the evidence accumulation rate in the linear balistic accumulator models, at the aggregate level the recovery rate might be better modeled with the exponential recovery equation, even if the individual recovery is linear. In this notebook, I explore this idea further and derive analytical solutions for the average recovery over time under different assumptions about the distribution of `r`.



## Linear recovery with a random variable rate

Assume that resource recover linearly over time at a constant rate `r` until they reach a maximum value of 1. Assume that `r` is a random variable. Let's plot the average resource recovery over time under different assumptions about the distribution of `r`.

```{r}
#| fig-column: margin
#| results: hold
#| fig-width: 3.5
#| fig-height: 3.5

par(mar = c(4, 0.5, 2, 1), lwd = 2)
curve(
  dunif(x, 0, 1),
  from = 0, to = 4, n = 1001, col = "black",
  xlab = "r", ylab = NA, main = "Distribution of r",
  ylim = c(0, 1.5), yaxt = "n",
)
curve(
  truncnorm::dtruncnorm(x, a = 0, mean = 0.5, sd = 0.3),
  from = 0, to = 4, n = 1001, col = "red", add = TRUE
)
curve(
  dgamma(x, shape = 1, scale = 1),
  from = 0, to = 4, n = 1001, col = "blue", add = TRUE
)
curve(
  dgamma(x, shape = 3, rate = 5),
  from = 0, to = 4, n = 1001, col = "orange", add = TRUE
)
# add legend
legend("topright",
  legend = c("Uniform(0, 1)", "Normal(0.5, 0.3)", "Exp(1)", "Gamma(3, 5)"),
  col = c("black", "red", "blue", "orange"), lty = 1
)
```

1. `r` is uniformly distributed between 0 and 1.
2. `r` is distributed as a truncated normal distribution with mean 0.5 and standard deviation 0.3
3. `r` is distributed as an exponential distribution with rate 1
4. `r` is distributed as a gamma distribution with shape 3 and rate 5

Red line shows the average recovery over time. Black lines show individual trajectories of a few random samples of `r`. The full distributions of `r` are shown in the margin.

```{r}
set.seed(3)
n <- 1000
r_unif <- runif(n, 0, 1)
r_truncnorm <- truncnorm::rtruncnorm(n, a = 0, mean = 0.5, sd = 0.3)
r_exp <- rgamma(n, shape = 1, scale = 1)
r_gamma <- rgamma(n, shape = 3, rate = 5)
t <- seq(0, 5, length.out = 100)

plot_linear_rv_recovery(r_unif, t, title = "r ~ Uniform(0, 1)") + 
  plot_linear_rv_recovery(r_truncnorm, t, title = "r ~ Normal(0.5, 0.3), r >= 0") +
  plot_linear_rv_recovery(r_exp, t, "r ~ Exp(1)") + 
  plot_linear_rv_recovery(r_gamma, t, "r ~ Gamma(3, 5)")
```

## Analytical solutions for the average recovery over time

For simplicity, consider the case of recovery starting from 0. Given a continuous distribution of `r`, $f(r)$, we can derive the average recovery over time by integrating two separate quantites over all possible values of `r`. We have to consider two cases:

1. Resources have not yet reached their maximum value of 1 and are recovering linearly at a rate `r`. This is the case for $r < 1/t$
2. Resources have reached their maximum value of 1 and are no longer recovering. This is the case for $r \geq 1/t$

The average recovery after time `t` is then given by the following combination of integrals:

$$
R(t) = \int_{0}^{1/t} r \cdot t \cdot f(r) \, dr + \int_{1/t}^{\infty} 1 \cdot f(r) \, dr  
$$

Below I give the solution for the exponential and gamma distributions of `r`.

### Exponential distribution

Let's consider the case where `r` is distributed as an exponential distribution with rate $\lambda$. The probability density function of the exponential distribution is given by

$$
f(r) = \lambda e^{-\lambda r}
$$

The average recovery over time is then given by

\begin{aligned}
R(t) &= \int_{0}^{1/t} r \cdot t \cdot \lambda e^{-\lambda r} \, dr + \int_{1/t}^{\infty} 1 \cdot \lambda e^{-\lambda r} \, dr \\
&= \left[ -\frac{t e^{\lambda  (-r)} (\lambda  r+1)}{\lambda } \right]_{0}^{1/t} + \left[ -e^{-\lambda r} \right]_{1/t}^{\infty} \\
&= \frac{t-e^{-\lambda / t} (\lambda +t)}{\lambda } + e^{-\lambda / t} \\
&= \frac{t}{\lambda} (1 - e^{-\lambda/t})
\end{aligned}

### Gamma distribution

The probability density function of the gamma distribution is given by

$$
f(r) = \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)}
$$

where $\alpha$ is the shape parameter and $\lambda$ is the rate parameter. The average recovery over time is then given by

$$
\begin{aligned}
R(t) &= \int_{0}^{1/t} r \cdot t \cdot \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)} \, dr + \int_{1/t}^{\infty} 1 \cdot \frac{\lambda^{\alpha} r^{\alpha - 1} e^{-\lambda r}}{\Gamma(\alpha)} \, dr \\
&= \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \left( t \int_{0}^{1/t} r^{\alpha} e^{-\lambda r} \, dr + \int_{1/t}^{\infty} r^{\alpha - 1} e^{-\lambda r} \, dr \right) \\
\end{aligned}
$$

which after some transformations can be written as

$$
\begin{aligned}
R(t) &= \frac{\Gamma \left(\alpha ,\frac{\lambda }{t}\right)}{\Gamma (\alpha )}+\frac{\alpha t}{\lambda}\frac{\gamma \left(\alpha +1,\frac{\lambda }{t}\right)}{\Gamma (\alpha +1)} \\
&= P(\alpha, \lambda / t) + \frac{\alpha t}{\lambda} Q(\alpha + 1, \lambda / t)
\end{aligned}
$$

where $\Gamma(a, z)$ is the [upper incomplete gamma function](https://mathworld.wolfram.com/IncompleteGammaFunction.html), $\gamma(a, z)$ is the lower incomplete gamma function; and $P(a, z)$ and $Q(a, z)$ are the regularized upper and lower incomplete gamma functions, respectively. P and Q are respectively the complimentary cdf and the cdf of the gamma distribution. They can be computed using the `pgamma` function in R with the `lower.tail = FALSE` argument for the upper incomplete gamma function and the `pgamma` function with the `lower.tail = TRUE` argument for the lower incomplete gamma function.

## Compare with exponential recovery

The equations derived above under the assumption of an exponential or gamma distribution of `r` are obviously not the same as the exponential recovery equation. But is the exponential recovery equation a good approximation of the average recovery over time when `r` is a random variable following an exponential or gamma distribution?

In this part, I compare the recovery given by the exponential recovery equation with the recovery given by the marginalized linear recovery with `r` following an exponential distribution or gamma distribution. I will use the analytical solutions derived above to fit the parameters of the exponential and gamma distributions to the exponential recovery equation.

Let's define our functions:

```{r}
#| code-fold: false
exp_recovery <- function(r, t) {
  1 - exp(-r * t)
}

marginal_recovery_exponential <- function(lambda, t) {
  t/lambda * (1 - exp(-lambda/t))
}

marginal_recovery_gamma <- function(alpha, lambda, t) {
  P <- pgamma(lambda / t, shape = alpha, lower.tail = FALSE) 
  R <- pgamma(lambda / t, shape = alpha + 1)
  P + alpha * t / lambda * R
}
```


### Marginal recovery from an exponential distribution

Set the true recovery rate to 0.3 and fit the rate parameter of the exponential distribution to the exponential recovery equation.

```{r}
#| warning: false
#| message: false
#| fig-width: 5.5
#| fig-height: 5
objective_fn <- function(log_lambda, t, target) {
  pred <- marginal_recovery_exponential(exp(log_lambda), t)
  sqrt(mean((pred - target)^2))
}

set.seed(2)
true_r <- 0.3
t <- seq(0, 60, length.out = 1000)
fit <- optim(1, objective_fn, t = t, target = exp_recovery(true_r, t))
lambda <- exp(fit$par)

curve(
  marginal_recovery_exponential(lambda, t),
  from = 0, to = 20, n = 1001,
  col = "blue", ylim = c(0, 1), xname = "t", ylab = "R(t)"
)
curve(
  exp_recovery(true_r, t),
  col = "red", add = TRUE, xname = "t"
)
legend("bottomright", legend = c("Marginal linear recovery\n(rate is an exponential random variable)", "Exponential recovery"), col = c("blue", "red"), lty = 1)

```

The overall shape is similar, but the approximation is not great. 

### Marginal recovery from a gamma distribution

The approximation when the rate follows a gamma distribution is much better::

```{r}
#| warning: false
#| message: false
#| fig-width: 5.5
#| fig-height: 5
objective_fn <- function(pars, t, target) {
  pred <- marginal_recovery_gamma(exp(pars[1]), exp(pars[2]), t)
  sqrt(mean((pred - target)^2))
}

set.seed(2)
true_r <- 0.30
t <- seq(0, 200, length.out = 1000)
fit <- optim(c(0, 0), objective_fn, t = t, target = exp_recovery(true_r, t))
pars <- exp(fit$par)

curve(
  marginal_recovery_gamma(pars[1], pars[2], t),
  from = 0, to = 1 / true_r * 5, n = 1001,
  col = "blue", ylim = c(0, 1), xname = "t", ylab = "R(t)"
)
curve(
  exp_recovery(true_r, t),
  col = "red", add = TRUE, xname = "t"
)
legend("bottomright", legend = c("Marginal linear recovery\n(rate is a gamma random variable)", "Exponential recovery"), col = c("blue", "red"), lty = 1)
```

with an RMSE of `r round(fit$value, 4)`. The fit is basically the same for all values of `r`. 

::: {.callout-important}
## Summary

The exponential recovery equation is a good approximation of the average recovery over time when the rate of recovery is a random variable following a gamma distribution. The approximation is not as good when the rate of recovery is a random variable following an exponential distribution.

:::

## Conversion of parameters

::: {.callout-note}
## Optional

The section below is not directly relevant to our project, but I found the results quite fascinating. Feel free to skip it if you are not interested in the relationship between the parameters of the gamma distribution and the true recovery rate.
:::

When the rate of recovery is a random variable following a gamma distribution, this is approximated well by the exponential recovery equation. The gamma distribution has two parameters: the shape parameter $\alpha$ and the rate parameter $\lambda$, while the exponential recovery equation has only one parameter: the rate of recovery `r`. How do the parameters of the gamma distribution relate to the recovery rate in exponential recovery equation?

I will simulate the recovery over time for different values of the true recovery rate `r` in the range [0.1, 5]. Then I will fit to this the marginal recovery predicted by a linear recovery process where the rate follows a gamma distribution. I will then explore the relationship between the parameters of the gamma distribution and the true recovery rate.

```{r}
#| fig-width: 8
#| fig-height: 4
t <- seq(0, 200, length.out = 10000)
get_pars <- function(true_r, t) {
  objective_fn <- function(pars, t, target) {
    pred <- marginal_recovery_gamma(exp(pars[1]), exp(pars[2]), t)
    sqrt(mean((pred - target)^2))
  }

  fit <- optim(c(0, 0), objective_fn, t = t, target = exp_recovery(true_r, t))

  pars <- exp(fit$par)
  pars
}

true_r <- seq(0.1, 5, length.out = 100)
pars <- run_or_load(sapply(true_r, get_pars, t = t), "output/gamma_recovery_pars.rds")

par(mfrow = c(1, 2))
plot(true_r, pars[1, ], type = "l", col = "blue", xlab = "Exponential recovery rate (r)", ylab = "alpha")
plot(true_r, pars[2, ], type = "l", col = "red", xlab = "Exponential recovery rate (r)", ylab = "lambda")
```

I did not expect this. The shape parameter $\alpha$ is fixed at `r round(median(pars[1, ]),4)` and only the rate $\lambda$ changes. Turns out that lambda is linearly related to 1/r (red line is the "data" and blue line is the fitted line from a linear regression):

```{r}
par(mfrow = c(1, 1), lwd = 3)
plot(1 / true_r, pars[2, ], type = "l", col = "red", xlab = "1 / r", ylab = "lambda")
lines(pars[2, ] ~ I(1 / true_r), col = "blue", lty = 4)
```

with the following relationship:

$$
\lambda \approxeq 3.831 \,r^{-1}
$$

## Summary

An exponential recovery processes with rate $r$ can be approximated well by a linear recovery process where the rate of recovery is a random variable which follows a gamma distribution with shape parameter $\alpha =$ `r round(median(pars[1, ]),3)` and rate parameter $\lambda = 3.831 \,r^{-1}$. That is, modeling resource recovery with the following two equations is almost equivalent:

$$
R(t) = 1 - e^{-r t}
$$

and

$$
R(t) = min(1, r_{linear}t) \\
$$

$$
r_{linear} \sim \text{Gamma}(2.794, 3.831 \cdot r^{-1})
$$