---
title: Statistical programming
title-slide-attributes:
data-background-image: mvtec-cover-statistical-programming-4x3.png
data-background-size: contain
data-background-opacity: "0.1"
subtitle: Overview of probability. Simulation
author: Marc Comas-Cufí
format:
revealjs:
self-contained: true
smaller: true
logo: mvtec-cover-statistical-programming-4x3.png
fontsize: 12pt
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, fig.align = 'center', dev.args = list(bg = 'transparent'),
collapse=TRUE, comment = "#>")
```
## Today's session
```{r, echo=FALSE, results='asis'}
cat(readr::read_lines("session-04-content.md"), sep='\n')
```
# A brief review of probability
## Event {.smaller}
* An _event_ is a set of possible results of a particular experiment.
* The event containing all possible results is called the _sample space_.
* For completeness, the _impossible event_ is also defined; it is denoted $\emptyset$.
* Example:
    * "Face 6" is a possible event when rolling a six-sided die.
* Events are nicely represented with Euler diagrams:
```{r, fig.align='center', fig.cap='Euler diagram for three events A, B and C'}
knitr::include_graphics('session-04-presentation/euler_diagram.svg')
```
## Basic event operations {.smaller}
Given two events $A$ and $B$:
* $A \cap B$. Intersection of events $A$ and $B$: the event in which both $A$ and $B$ occur.
```{r, fig.align='center', out.height="80px"}
knitr::include_graphics('session-04-presentation/Venn0001.svg')
```
* $A \cup B$. Union of events $A$ and $B$: the event in which either $A$ or $B$ (or both) occurs.
```{r, fig.align='center', out.height="80px"}
knitr::include_graphics('session-04-presentation/Venn0111.svg')
```
* $A^c$. Complement of the event $A$: the event in which $A$ does not occur.
```{r, fig.align='center', out.height="80px"}
knitr::include_graphics('session-04-presentation/Venn10.svg')
```
## Probability (of an event) {.smaller}
* The _probability of an event_ measures "how likely" the event is, or "how big" the event is with respect to the sample space.
* The probability of an event $A$ is denoted as $\text{P}(A)$.
* Probabilities are quantities between 0 and 1, i.e. for any event $A$, $0 \leq \text{P}(A) \leq 1$.
* $\text{P}(\emptyset) = 0$ and $\text{P}(\Omega) = 1$, where $\Omega$ denotes the sample space.
What are probabilities?
* __Frequentist__. Probabilities are long run relative frequencies of events.
* __Bayesian__. Probabilities are used to quantify our uncertainty about events.
* Example:
    * The event "Face 6" has probability $1/6$ when rolling a six-sided die.
## Conditional probability {.smaller}
The conditional probability of one event $A$ given another event $B$ is defined as
$$
\text{P}(A|B) = \frac{\text{P}(A\cap B)}{\text{P}(B)}.
$$
* $\text{P}(A|B)$ measures how likely $A$ is to happen once we know that $B$ has happened.
* We can think of conditioning as reducing the sample space to the outcomes where $B$ occurs.
* Example: the event "Face 6" has probability $1/3$ once we are told the result is even.
__Bayes' rule__
For two events $A$ and $B$,
$$
\begin{aligned}
\text{P}(A|B) &= \frac{\text{P}(A)\, \text{P}(B|A)}{\text{P}(B)}\\
&= \frac{\text{P}(A)\, \text{P}(B|A)}{\text{P}(A)\, \text{P}(B|A) + \text{P}(A^c)\, \text{P}(B|A^c)}.
\end{aligned}
$$
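As a quick sanity check (a simulation sketch, not part of the original deck), we can verify Bayes' rule numerically with the die example above, where $A$ = "face is 6" and $B$ = "face is even":
```{r, echo=TRUE}
rolls = sample(1:6, 100000, replace = TRUE)
A = rolls == 6
B = rolls %% 2 == 0
mean(A[B])                      # direct estimate of P(A|B), close to 1/3
mean(A) * mean(B[A]) / mean(B)  # Bayes' rule: P(A) P(B|A) / P(B)
```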
## Simulating events with R (1) {.smaller}
```{r, include=FALSE}
library(tidyverse)
set.seed(10)
```
* We can shuffle a set of possible outcomes with the function `sample()`.
```{r, echo=TRUE}
cards = c('2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A')
sample(cards)
```
* We can pick a given number of elements:
```{r, echo=TRUE}
sample(cards, 5)
```
* By default, `sample()` picks elements without replacement. To sample with replacement:
```{r, echo=TRUE}
sample(cards, 5, replace = TRUE)
```
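Relative frequencies over many draws approximate probabilities (the frequentist view). A minimal sketch, not in the original deck:
```{r, echo=TRUE}
# Estimate P("face 6") as a long-run relative frequency
mean(sample(1:6, 100000, replace = TRUE) == 6)  # close to 1/6
```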
## Simulating events with R (2) {.smaller}
* We can assign a probability to each event:
```{r, echo=TRUE}
breast_cancer = c('yes', 'no')
sample(breast_cancer, 200, replace = TRUE, prob = c(0.004, 0.996))
```
## Activity: Medical diagnosis {.smaller background-color=#CCE5FF}
Suppose you are a woman in your 40s, and you decide to have a medical test for breast cancer called a mammogram.
* If the test is positive, what is the probability you have cancer?
Information:
* The probability of having breast cancer in your 40s is 0.004.
* The test has a __sensitivity__ of 80%. In other words, if you have breast cancer, the test will be positive with probability 0.8.
* The test has a __specificity__ of 90%. In other words, if you don't have breast cancer, the test will be negative with probability 0.9.
## Activity: Medical diagnosis {.smaller background-color=#CCE5FF}
```{r, echo=TRUE}
library(tidyverse)
N = 100000
test_result_sampling = function(breast_cancer){
  if(breast_cancer == 'yes'){
    sample(c('+', '-'), 1, prob = c(0.8, 0.2))  # sensitivity: P(+ | cancer) = 0.8
  }else{
    sample(c('+', '-'), 1, prob = c(0.1, 0.9))  # specificity: P(- | no cancer) = 0.9
  }
}
women40s = tibble(
  breast_cancer = sample(c('yes', 'no'), N, replace = TRUE, prob = c(0.004, 0.996)),
  test_result = map_chr(breast_cancer, ~test_result_sampling(.x))
)
```
* Among the women with a positive test, what proportion have breast cancer?
```{r, include=FALSE}
women40s %>%
  group_by(test_result) %>% # or filter(test_result == '+')
  summarise(with_breast_cancer = mean(breast_cancer == 'yes'))
```
## Activity: Medical diagnosis {.smaller background-color=#CCE5FF}
Use probability theory to calculate the exact probability of a woman having breast cancer given that she got a positive test.
__Hint.__ Consider the following events:
* $A$ = "To have cancer", $A^c$ = "Not to have cancer" and
* $B$ = "Test is positive".
## Independence between events {.smaller}
* Two events $A$ and $B$ are said to be independent, denoted $A \perp B$, if
$$
\text{P}(A) = \text{P}(A|B).
$$
* An equivalent, more practical, definition: $A$ and $B$ are independent if
$$
\text{P}(A \cap B) = \text{P}(A) \text{P}(B).
$$
## Conditional independence {.smaller}
* Two events $A$ and $B$ are said to be conditionally independent given $C$, denoted $A \perp B | C$, if
$$
\text{P}(A|C) = \text{P}(A|B \cap C),
$$
or, equivalently,
$$
\text{P}(A \cap B | C) = \text{P}(A |C)\text{P}(B |C).
$$
## Activity {.smaller background-color=#CCE5FF}
* Check if the event __A = "the face is even"__ and the event __B = "the face is less than or equal to 4"__ are independent when rolling a six-sided die (a simulation sketch follows below).
* Check if the event __A = "the face of the blue die is 6"__ and the event __B = "the face of the red die is 6"__ are independent when rolling a blue and a red six-sided die.
* Check if the event __A = "the face of the blue die is 6"__ and the event __B = "the face of the red die is 6"__ are conditionally independent given the event __C = "the sum is even"__, when rolling a blue and a red six-sided die.
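One possible way to check the first case empirically (a sketch; the exact probabilities can also be computed by counting faces):
```{r, echo=TRUE, eval=FALSE}
rolls = sample(1:6, 100000, replace = TRUE)
A = rolls %% 2 == 0   # face is even
B = rolls <= 4        # face is <= 4
mean(A & B)           # estimate of P(A and B)
mean(A) * mean(B)     # estimate of P(A) P(B); equal if A and B are independent
```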
## Activity: Monty Hall problem {background-color=#CCE5FF}
> Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
<!--
Consider these two events: $A$ = "car is selected" and $B$ = "goat is selected".
* __Not switching.__
* We start on event $B$ with probability $\frac{2}{3}$.
* Starting on event $A$ with probability $\frac{1}{3}$.
* __Switching.__
* We start on event $B$ with probability $\frac{2}{3}$. After switching, we stay on event $A$.
* Starting on event $A$ with probability $\frac{1}{3}$. After switching, we stay on event $B$.
-->
```{r, eval=FALSE}
monty_hall = function(CHANGE = FALSE){
  DOORS = sample(c('car', 'goat', 'goat'))
  PICKED = sample(1:3, 1)
  # The host opens a goat door that was not picked. rep(..., 2) guards against
  # sample()'s special case: with a length-1 vector x, sample(x, 1) draws from 1:x.
  DISCARD = sample(rep((1:3)[(DOORS != 'car') & (1:3 != PICKED)], 2), 1)
  if(CHANGE){
    FINAL_PICK = setdiff(1:3, c(PICKED, DISCARD))
  }else{
    FINAL_PICK = PICKED
  }
  DOORS[FINAL_PICK]
}
prop.table( table(replicate(1000, monty_hall(CHANGE = FALSE))) )
prop.table( table(replicate(1000, monty_hall(CHANGE = TRUE))) )
```
## Random variables {.smaller}
```{r}
theme_set(theme_minimal())
```
* A random variable (r.v.) is a function from a sample space to a measurable space, like the real numbers.
* We can think of a random variable as a variable taking values randomly.
* If the values are discrete, the function assigning a probability to each value is called the _probability mass function_ (pmf).
```{r, fig.align='center', out.height="300px"}
knitr::include_graphics('session-04-presentation/Random_Variable_as_a_Function-en.svg')
```
## Discrete random variables {.smaller}
* Discrete r.v. take values in a countable set of elements, for example a finite set of numbers or the integers.
* Discrete r.v. can be characterised by giving $\text{P}(X = x)$ for each possible value $x$. The function $p(x) = \text{P}(X = x)$ is called the _probability mass function_ (pmf).
* __Example:__ The discrete r.v. $X$ is uniquely determined by
    * $X \in \{1, 2, 3\}$ and
    * the pmf: $p(1) = 0.25$, $p(2) = 0.5$ and $p(3) = 0.25$.
```{r, fig.height=3, out.width="80%"}
pmf = tibble(
  X = c(1,2,3),
  Probability = c(0.25, 0.5, 0.25)
)
ggplot(data = pmf) +
  geom_bar(aes(x=X, y = Probability), col = 'black', width = 1, stat = 'identity', alpha = 0.4) +
  labs(title = 'Probability mass function', x = '') +
  scale_y_continuous(limits = c(0,1)) +
  scale_x_continuous(breaks = 1:3, minor_breaks = NULL, labels = sprintf("X=%d", 1:3))
```
<!-- * Valid events for a discrete random variable $X$ are: $X = x$, $X \leq x$, $X \neq x$, ...-->
## Continuous random variables {.smaller}
* Continuous random variables take values in the real line.
* Possible events for a continuous r.v. are combinations of subsets of the real line.
* Continuous r.v. are characterised by providing $\text{P}(X \in E)$ for any event $E$.
* The _probability density function_ (pdf) of a continuous r.v. is a function $f(x)$ such that $\text{P}(X \in E) = \int_{x \in E} f(x)\, dx$.
* __Example:__ The continuous r.v. $X$ is uniquely determined by the pdf $f(x)$ defined as $f(x) = \frac{1}{3}$ if $x \in (0,3)$ and $f(x) = 0$ otherwise.
```{r, fig.height=2.5, out.width="80%"}
df = tibble(
  X = seq(-2,5,length.out=2000),
  Density = if_else(0<=X & X <=3, 1/3, 0)
)
yup = 0.003
ggplot(data = df) +
  geom_area(aes(x=X, y = Density), alpha = 0.4) +
  geom_segment(aes(x=0, xend = 3, y = 1/3, yend = 1/3), size = 0.5) +
  geom_segment(aes(x=-2, xend = 0, y = yup, yend = yup), size = 0.5) +
  geom_segment(aes(x=3, xend = 5, y = yup, yend = yup), size = 0.5) +
  labs(title = 'Probability density function', x = 'x') +
  scale_y_continuous(limits = c(0,0.6))
```
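A quick empirical check of this pdf (a sketch, not in the original deck): simulate from the uniform distribution on $(0,3)$ and estimate $\text{P}(X \leq 1)$, which should be $\int_0^1 \frac{1}{3}\, dx = 1/3$.
```{r, echo=TRUE}
x = runif(100000, min = 0, max = 3)  # draws with density 1/3 on (0, 3)
mean(x <= 1)                         # estimate of P(X <= 1) = 1/3
```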
<!--
## Cumulative distribution function
* The _cumulative distribution function_ cdf is the function $F(x)$ such that $F(x) = \text{P}(X \leq x)$.
* $F(x) = \sum_{z < x} p(z)$ for a discrete r.v. and
* $F(x) = \int_{z < x} f(z) dz$ for a continuous r.v.
-->
## The expected value or mean of a r.v. {.smaller}
* The _expected value_ of a r.v. $X$, denoted $\mathbb{E}[X]$, is the average value we expect over a long run of realisations.
* For a discrete r.v. $X$, the expected value is
$$
\mathbb{E}[X] = \sum_{x} x \;p(x) \;\;\;\;\;\;\;\;\;\;[\sum_{x} x \;\text{P}(X = x)]
$$
* For a continuous r.v. $X$, the expected value is
$$
\mathbb{E}[X] = \int_{-\infty}^{\infty} x \;f(x) dx
$$
* Example: the expected value of the r.v. given by the face shown when rolling a six-sided die is
$$
1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = 3.5
$$
```{r, echo=TRUE}
mean( sample(1:6, 1000000, replace = TRUE) )
```
## More expected values {.smaller}
* The variance of a r.v., denoted $\text{var}[X]$, is defined as
$$
\text{var}[X] = \mathbb{E}[ \,(X - \mathbb{E}[X])^2\, ].
$$
* The covariance between two r.v.'s $X$ and $Y$, denoted $\text{cov}[X,Y]$, is defined as
$$
\text{cov}[X,Y] = \mathbb{E}[ \,(X - \mathbb{E}[X]) (Y - \mathbb{E}[Y]) \, ].
$$
* See visualisation at <https://seeing-theory.brown.edu/basic-probability/index.html>
* For any function $g(x)$, we can define the expected value of $g(X)$ as
    * $\mathbb{E}[g(X)] = \sum_{x} g(x) \;p(x)$ when $X$ is discrete, or
    * $\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x) \;f(x) dx$ when $X$ is continuous.
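These expected values can all be approximated by simulation. A minimal sketch (not in the original deck) for the die example, with $g(x) = x^2$:
```{r, echo=TRUE}
x = sample(1:6, 100000, replace = TRUE)
mean(x^2)  # E[X^2] = 91/6, approximately 15.17
var(x)     # var[X] = E[X^2] - E[X]^2, approximately 2.92
```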
# Some common probability distributions
## Categorical distribution {.smaller}
The categorical distribution is the distribution obtained after associating the natural numbers $1$ to $k$ with a set of events having probabilities $\boldsymbol\pi = (\pi_1, \dots, \pi_k)$.
* For a categorical r.v. $X$, denoted $X \sim Cat(\boldsymbol\pi = (\pi_1, \dots, \pi_k))$,
    * $\mu = \mathbb{E}[X] = \sum_{i \leq k} i\; \pi_i$ and
    * $\text{var}[X] = \sum_{i \leq k} (i - \mu)^2\; \pi_i$.
```{r, fig.height=3, out.width="80%"}
pmf = tibble(
  X = 1:4,
  Probability = c(1/8, 2/8, 3/8, 2/8)
)
ggplot(data = pmf) +
  geom_bar(aes(x=X, y = Probability), col = 'black', width = 1, stat = 'identity', alpha = 0.4) +
  labs(title = latex2exp::TeX('Probability mass function for a $Cat(\\mathbf{\\pi} = (1/8, 2/8, 3/8, 2/8))$'), x = '') +
  scale_y_continuous(limits = c(0,0.5)) +
  scale_x_continuous(breaks = 1:4, minor_breaks = NULL, labels = sprintf("X=%d", 1:4))
```
## Gaussian (normal) distribution {.smaller}
The gaussian (normal) distribution is the distribution whose density decays exponentially (at a rate governed by $1/\sigma^2$) for values far from a certain centre ($\mu$).
* For a gaussian r.v. $X$, $X \sim N(\mu, \sigma)$,
    * $\mathbb{E}[X] = \mu$ and
    * $\text{var}[X] = \sigma^2$.
```{r, fig.height=3, out.width="80%"}
df = tibble(
  X = seq(-3,3,length.out=100),
  Density = dnorm(X)
)
ggplot(data = df) +
  geom_area(aes(x=X, y = Density), col = 'black', alpha = 0.4) +
  labs(title = latex2exp::TeX('Density function for a $N(\\mu = 0,\\sigma = 1)$'), x = 'x') +
  scale_y_continuous(limits = c(0,0.5))
```
## Binomial distribution {.smaller}
The binomial distribution counts the number of successes after repeating a certain experiment $n$ times, when the probability of success is $\pi$.
* For a binomial r.v. $X$, $X \sim Bin(n, \pi)$,
    * $\mathbb{E}[X] = n \pi$ and
    * $\text{var}[X] = n \pi (1-\pi)$.
```{r, fig.height=3, out.width="80%"}
pmf = tibble(
  X = 0:10,
  Probability = dbinom(X, size = 10, prob = 0.25)
)
ggplot(data = pmf) +
  geom_bar(aes(x=X, y = Probability), col = 'black', width = 1, stat = 'identity', alpha = 0.4) +
  labs(title = latex2exp::TeX('Probability mass function for a $Bin(n = 10,\\pi = 0.25)$'), x = '') +
  scale_y_continuous(limits = c(0,0.3)) +
  scale_x_continuous(breaks = 0:10, minor_breaks = NULL, labels = sprintf("X=%d", 0:10))
```
## Poisson distribution {.smaller}
The Poisson distribution counts the number of successes detected in an interval of time or space, when the expected number of successes is $\lambda$.
* For a Poisson r.v. $X$, $X \sim Poiss(\lambda)$,
    * $\mathbb{E}[X] = \lambda$ and
    * $\text{var}[X] = \lambda$.
```{r, fig.height=3, out.width="80%"}
pmf = tibble(
  X = 0:13,
  Probability = dpois(X, lambda = 3)
)
ggplot(data = pmf) +
  geom_bar(aes(x=X, y = Probability), col = 'black', width = 1, stat = 'identity', alpha = 0.4) +
  labs(title = latex2exp::TeX('Probability mass function for a $Pois(\\lambda = 3)$'), x = '') +
  scale_y_continuous(limits = c(0,0.3)) +
  scale_x_continuous(breaks = 0:13, minor_breaks = NULL, labels = sprintf("X=%d", 0:13))
```
## Generating r.v. with R {.smaller}
* Categorical r.v.
```{r, echo=TRUE}
sample(1:4, 6, prob = c(1/8,2/8,3/8,2/8), replace=TRUE)
```
* Gaussian r.v.
```{r, echo=TRUE}
rnorm(6, mean = 0, sd = 1)
```
* Binomial r.v.
```{r, echo=TRUE}
rbinom(6, size = 10, prob = 0.25)
```
* Poisson r.v.
```{r, echo=TRUE}
rpois(6, lambda = 3)
```
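As a check (a sketch, not in the original deck), the empirical moments of a large simulated sample should match the theoretical ones; e.g. for the binomial above, $\mathbb{E}[X] = n\pi = 2.5$ and $\text{var}[X] = n\pi(1-\pi) = 1.875$:
```{r, echo=TRUE}
x = rbinom(100000, size = 10, prob = 0.25)
mean(x)  # close to 10 * 0.25 = 2.5
var(x)   # close to 10 * 0.25 * 0.75 = 1.875
```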
## Activity {background-color=#CCE5FF}
Think about possible variables that can be modelled with one of the four distributions seen above:
* Categorical distribution
* Gaussian distribution
* Binomial distribution
* Poisson distribution
Think about possible parameters to use.
<!--
## Conditional independence for r.v.'s
* For random variables, conditional independence, $X \perp Y | Z$, holds if for some functions $g$ and $h$ we have
$$
p(x, y| z) = g(x, z) h(y, z)
$$
for all possible values $x$, $y$ and $z$ of $X$, $Y$ and $Z$ respectively.
* If $X$ and $Y$ are independent, then $cor(X,Y) = 0$.
```{r, fig.height=3, out.width="80%"}
set.seed(0)
X = tibble(x = replicate(1000, cor(rnorm(100), rbinom(100, 10, 0.4))))
ggplot(data=X) +
geom_histogram(aes(x = x), col = 'black', alpha = 0.4, bins = 20) +
labs(y = 'Frequencies', title = latex2exp::TeX('Correlation between independent $X$ and $Y$ distribution observed with simulation'),
subtitle = latex2exp::TeX('1000 simulations with $X\\sim N(\\mu = 0, \\sigma = 1)$ and $Y \\sim Bin(n=10, \\pi = 0.4)$'))
```
-->
## Joint probability distribution
The _joint probability distribution_ is a multivariate model for random variables. The joint probability distribution of r.v. $X_1, \dots, X_k$ is completely determined by providing
* the probability of all possible events for discrete r.v.'s
$$
p(x_1,\dots,x_k) = \text{P}(X_1 = x_1 \cap \cdots \cap X_k = x_k),
$$
* or the multivariate density for continuous r.v.'s
$$
f(x_1,\dots,x_k).
$$
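For discrete r.v.'s, a joint pmf can be estimated from simulated data with a cross table. A minimal sketch (a hypothetical example with two independent dice):
```{r, echo=TRUE}
x1 = sample(1:6, 100000, replace = TRUE)
x2 = sample(1:6, 100000, replace = TRUE)
prop.table(table(x1, x2))[1:3, 1:3]  # estimated joint pmf; each cell close to 1/36
```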
## Transformations of random variables
* If we transform a r.v. with a function $g(x)$, we get another r.v.
* Linear transformations. If we linearly transform a r.v. $\boldsymbol X=(X_1,\dots,X_k)$ with $f(\boldsymbol x) = \boldsymbol A \boldsymbol x + \boldsymbol b$, then
    * $\mathbb{E}[f(\boldsymbol X)] = \boldsymbol A \boldsymbol\mu + \boldsymbol b$ where $\boldsymbol\mu = \mathbb{E}[\boldsymbol X]$, and
    * $\text{cov}[f(\boldsymbol X)] = \boldsymbol A \boldsymbol\Sigma \boldsymbol A^{T}$ where $\boldsymbol\Sigma = \text{cov}[\boldsymbol X]$.
* For non-linear transformations we can simulate the r.v. to obtain these expected values.
__Example:__ The log-normal distribution is defined as the distribution obtained after exponentiating a gaussian r.v.: $f(x) = e^x$, $Y = e^X$, where $X \sim N(\mu,\sigma)$. Check, by simulating a gaussian r.v. (see the sketch below), that
* $\mathbb{E}[Y] = e^{\mu+\sigma^2/2}$ and
* $\text{var}[Y] = (e^{\sigma^2}-1) e^{2\mu+\sigma^2}$.
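A minimal sketch for this check; the parameter values $\mu = 0$, $\sigma = 0.5$ are just an assumption for illustration:
```{r, echo=TRUE}
mu = 0; sigma = 0.5
y = exp(rnorm(1000000, mean = mu, sd = sigma))
mean(y); exp(mu + sigma^2/2)                        # both approximately 1.13
var(y);  (exp(sigma^2) - 1) * exp(2*mu + sigma^2)   # both approximately 0.36
```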
## The central limit theorem
Let $X_1, \dots, X_n$ be independent, identically distributed r.v. with expected value $\mathbb{E}[X]$ and variance $\text{var}[X]$. Then, when $n$ is large, approximately
$$
\bar{X} = \frac{X_1 + \cdots + X_n}{n} \sim N(\mathbb{E}[X], \sqrt{\text{var}[X]/n})
$$
<hr>
The next plot shows the distribution of 1000 realisations of a r.v. obtained as the average of 30 binomial r.v.'s with parameters $n=10$ and $\pi=0.25$. The blue line represents the gaussian distribution with $\mu=n \pi$ and $\sigma=\sqrt{n \pi (1-\pi) / 30}$.
```{r, fig.height=3, out.width="80%"}
n_ = 10
pi_ = 0.25
X_mean = tibble(isim = 1:1000) %>%
  mutate(
    X = map(isim, ~rbinom(30, n_, pi_)),
    x_n = map_dbl(X, length),
    x_m = map_dbl(X, mean),
    x_s = map_dbl(X, sd))
MU = n_ * pi_
SIGMA = sqrt(n_ * pi_ * (1-pi_) / 30)
X_norm = tibble(x = seq(1.5, 3.5, length.out=1000),
                fx = dnorm(x, MU, SIGMA))
ggplot() +
  geom_histogram(data=X_mean, aes(x=x_m, y = ..density..), bins = 20, alpha=0.4, col = 'black') +
  geom_line(data=X_norm, aes(x=x, y = fx), col = 'blue') +
  labs(x = latex2exp::TeX("$\\frac{X_1+\\cdots+X_{30}}{30}$") )
```
## The central limit theorem
Let $X_1, \dots, X_n$ be independent, identically distributed r.v. with expected value $\mathbb{E}[X]$ and variance $\text{var}[X]$. Then, when $n$ is large, approximately
$$
\frac{\bar{X} - \mathbb{E}[X]}{\sqrt{\text{var}[X]/n}} \sim N(0, 1)
$$
<hr>
The next plot shows the distribution of 1000 standardised realisations of a r.v. obtained as the average of 30 binomial r.v.'s with parameters $n=10$ and $\pi=0.25$. The blue line represents the gaussian distribution with $\mu=0$ and $\sigma=1$.
```{r, fig.height=3, out.width="80%"}
Z_mean = X_mean %>%
  mutate(
    z_m = (x_m - MU) / SIGMA )
Z_norm = tibble(x = seq(-3, 3, length.out=1000),
                fx = dnorm(x))
ggplot() +
  geom_histogram(data=Z_mean, aes(x=z_m, y = ..density..), bins = 20, alpha=0.4, col = 'black') +
  geom_line(data=Z_norm, aes(x=x, y = fx), col = 'blue') +
  labs(x = latex2exp::TeX("$\\frac{\\bar{X} - E\\[X\\]}{\\sqrt{var\\[X\\]/n}}$") )
```
## Sampling distribution
If $x_1, \dots, x_n$ is an i.i.d. sample with expected value $\mathbb{E}[X]$, then, when $n$ is large, approximately
$$
\frac{\bar{x} - \mathbb{E}[X]}{s_x/\sqrt{n}} \sim t_{n-1}
$$
<hr>
The next plot shows the distribution of 1000 standardised realisations of a r.v. obtained as the average of 30 binomial r.v.'s with parameters $n=10$ and $\pi=0.25$. The normal and Student's t distributions are compared.
```{r, fig.height=3, out.width="80%"}
T_mean = X_mean %>%
  mutate(
    t_m = (x_m - MU) / (x_s/sqrt(x_n))
  )
T_norm = tibble(x = seq(-5,5, length.out=1000),
                fx = dt(x, 29))
ggplot() +
  geom_histogram(data=T_mean, aes(x=t_m, y = ..density..), bins = 20, alpha=0.4, col = 'black') +
  geom_line(data=Z_norm, aes(x=x, y = fx, col = 'Normal')) +
  geom_line(data=T_norm, aes(x=x, y = fx, col = 't-Student')) +
  labs(x = latex2exp::TeX("$\\frac{\\bar{X} - E\\[X\\]}{\\sqrt{var\\[X\\]/n}}$"), col = '') +
  scale_color_manual(values = c('blue', 'red')) +
  theme(legend.position = 'top')
```
## Final activity {background-color=#CCE5FF}
* Fix some r.v. $Y$ obtained after applying some transformation $f(x)$.
    * For example, transform a certain binomial distribution with the function $f(x) = \sin(x)$.
* Take a sample of size $50$ and calculate the mean of this sample. We call it $x_1$.
* Repeat the process 1000 times, obtaining a sample of means: $X_m = \{x_1, \dots, x_{1000}\}$.
* Visualise the resulting distribution of the sample $X_m$.
* Think about how we can use the $x_i$ to obtain information about $\mathbb{E}[Y]$ (a starting sketch follows below).
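One possible starting sketch for this activity; the binomial parameters here are just an assumption:
```{r, echo=TRUE, eval=FALSE}
# 1000 means of samples of size 50 from Y = sin(X), with X ~ Bin(10, 0.25)
x_m = replicate(1000, mean(sin(rbinom(50, size = 10, prob = 0.25))))
ggplot(data = tibble(x_m)) +
  geom_histogram(aes(x = x_m), bins = 20, col = 'black', alpha = 0.4)
```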
# That's all for today
## Next week session
```{r, echo=FALSE, results='asis'}
cat(readr::read_lines("session-05-content.md"), sep='\n')
```