Flashcards in Bayesian inference Deck (19)

1

## What is Bayesian data analysis?

### • Bayesian data analysis uses probability to represent uncertainty in all parts of a statistical model

2

## Describe probability theory

###
• A random variable X is a variable that takes different values, x (its observed values), in different realizations, each value with a defined probability

• (Conventionally, upper case letters denote random variables; the corresponding lower case letters denote their realizations.)

3

## What are realisations in probability theory?

### A realization, or observed value, of a random variable is the value that is actually observed (what actually happened).

4

## What is independence in probability?

###
• Two random variables are independent if the value of one carries no information about the value of the other

o One event can occur without affecting the probability of the other event occurring

o Two events are independent if the probability of one event is the same no matter the outcome of the other

E.g. the chances of rolling a 1 after flipping a head on a coin is still 1/6

Conditional probability will be the same as marginal probability for independent events

5

## What is marginal probability?

###
P(X = x)

The probability of one event happening, regardless of the outcomes of all other events

You can think of marginal probability as being the probability totals in the ‘margins’ of the probability tables

6

## What is joint probability?

###
P(X = x ∧ Y = y)

The likelihood of two events occurring together

o The joint probability is the product of the marginal probabilities (i.e. multiply the marginal probabilities together) only if the two events are independent of each other

P(X = x ∧ Y = y) = P(X = x) × P(Y = y)
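A minimal sketch of the joint-probability rule above, using the die-and-coin example from the independence card (both events are independent, so the joint probability is just the product of the marginals):

```python
# Joint probability of two independent events: a die showing 1
# and a coin showing heads.
p_roll_1 = 1 / 6   # marginal P(die = 1)
p_heads = 1 / 2    # marginal P(coin = heads)

# P(X = 1 ∧ Y = heads) = P(X = 1) * P(Y = heads), valid only
# because the die and the coin are independent.
p_joint = p_roll_1 * p_heads
print(p_joint)  # 1/12 ≈ 0.0833
```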

7

## What is conditional probability?

###
P(X = x|Y = y) = P(X = x ∧ Y = y) / P(Y = y)

• The probability of an event ( X ), given that another ( Y ) has already occurred.

• If data are obtained from two (or more) random variables, the probabilities for one may depend on the value of the other(s)

• (in this case, these events are NOT independent)
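The definition above can be checked by brute-force enumeration. This sketch uses an assumed example (two fair six-sided dice, not from the cards): the probability that the first die shows 6, given that the two dice total 10. The two events are clearly not independent.

```python
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Joint: P(X = 6 ∧ X + Y = 10) -- only the pair (6, 4) qualifies.
p_joint = sum(1 for x, y in outcomes if x == 6 and x + y == 10) / 36

# Marginal of the conditioning event: P(X + Y = 10) -- (4,6), (5,5), (6,4).
p_total_10 = sum(1 for x, y in outcomes if x + y == 10) / 36

# P(X = x | Y = y) = P(X = x ∧ Y = y) / P(Y = y)
p_conditional = p_joint / p_total_10
print(p_conditional)  # 1/3
```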

8

## What are discrete and continuous probabilities?

###
o Discrete: summing

Splitting the data up into chunks (bins), e.g. a .01 probability of an adult being over 210cm, a .03 probability of being 200-210cm, etc., dividing all heights into chunks whose probabilities sum to 1

o Continuous: integration

Keep splitting the chunks into smaller and smaller/more precise bins and you eventually get a smooth curve instead of chunks, from which probabilities are obtained by integration

• The probability distribution always totals 100% (1), no matter how finely it is divided: discrete probabilities sum to 1, continuous densities integrate to 1

9

## What is probability density?

###
Probability density

• For continuous-valued random variables, denoted by x ∈ ℝ, the distribution is described by the cumulative distribution function instead of by individual probabilities

o Or by its derivative, the probability density function

10

## Describe Bayesian probability theory

###
• Bayesian probability theory:

o Probability is a quantification of the degree of confidence we have for something to be the case based on our current knowledge, including prior knowledge and the new data.

• Bayesian methods enable statements to be made about the partial knowledge available (based on data) concerning some situation or ‘state of nature’ (observable or as yet unobserved) in a systematic way, using probability as a measure of uncertainty

• The guiding principle is that the state of knowledge about anything unknown is described by a probability distribution

11

## Principles of Bayes theorem

###
• The posterior probability of a model given the data

• If you’re uncertain about something, the uncertainty is described by a probability distribution called your prior distribution

• You then obtain relevant data; the new data change your uncertainty, which is then described by a new probability distribution called your posterior distribution

o Most of Bayesian inference is about how to go from prior to posterior

o The way Bayesians go from prior to posterior is to use the laws of conditional probability

o Can be called Bayes’ rule or Bayes’ theorem

12

## Describe the Bayes theorem equation

###
P(M|D) = P(D|M) × P(M) / P(D)

M: model, D: data

P(M|D): the posterior probability of the model given the data

P(D|M): the likelihood; the probability of the data given the model

P(M): the prior probability of the model

P(D): the marginal probability of the data, taking the evidence from all models into account
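The equation can be written directly as code. A minimal sketch: P(D) is the normalizing sum over all models, so the posteriors always sum to 1. The likelihoods (.25 and 1) and equal priors here are the numbers used in the worked example on the following cards.

```python
def posteriors(likelihoods, priors):
    # P(D) = sum over models of P(D|M) * P(M)
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    # P(M|D) = P(D|M) * P(M) / P(D), for each model
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Two models, M=1 and M=0, with likelihoods .25 and 1 and equal priors.
post = posteriors([0.25, 1.0], [0.5, 0.5])
print(post)       # [0.2, 0.8]
print(sum(post))  # 1.0 -- P(D) normalizes the posterior
```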

13

## How to work out P(D|M)

### P(D|M) = P(y1 = 0 ∧ y2 = 0 | M = 1) = P(y1 = 0 | M = 1) × P(y2 = 0 | M = 1) = 0.5 × 0.5 = 0.25

14

## How to work out P(D)

###
o The probability of the data taking into account the evidence for all models (M=1 and M=0)

o P(D) = P(y|M=1) × P(M=1) + P(y|M=0) × P(M=0)

o P(D) = (.25 × .5) + (1 × .5)

o P(D) = .125 + .5

o P(D) = .625

15

## final step of bayes theorem- how to work out P(M|D)

###
o P(M|D) = P(D|M) × P(M) / P(D)

o P(M|D) = P(y|M=1) × P(M=1) / P(D)

o P(M|D) = (.25 × .5) / .625

o P(M|D) = .125/.625

o P(M|D) = .2
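The three cards above can be run end to end. A sketch, assuming the setup implied by the numbers (model M=1 gives each observation probability 0.5, e.g. a fair coin; model M=0 makes the observed data certain; two observations y1 = 0, y2 = 0; equal priors):

```python
# Priors
p_m1 = 0.5
p_m0 = 0.5

# Step 1: likelihoods
p_d_given_m1 = 0.5 * 0.5  # P(D|M=1) = P(y1=0|M=1) * P(y2=0|M=1) = 0.25
p_d_given_m0 = 1.0        # P(D|M=0): the data are certain under M=0

# Step 2: evidence P(D), summed over both models
p_d = p_d_given_m1 * p_m1 + p_d_given_m0 * p_m0  # 0.125 + 0.5 = 0.625

# Step 3: posterior P(M=1|D)
p_m1_given_d = p_d_given_m1 * p_m1 / p_d
print(p_m1_given_d)  # 0.2
```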

16

## Critique of Null Hypothesis Significance Testing

###
• If H0 is correct, then this datum (D) cannot occur. D has occurred. Therefore, H0 is false

o i.e. arguing that because D has occurred, H0 must be false; with probabilistic hypotheses this deductive rule does not hold

• P(D|H0) ≠ P(H0|D)

o What we really want to know is the probability that the hypothesis is false (i.e. the probability of the model) given that the data have occurred: P(H0|D)

• P(D|H0) is the likelihood function

• P(H0|D) is the posterior probability

• A primary motivation for Bayesian thinking is that it facilitates a common-sense interpretation of statistical conclusions

o For instance, a Bayesian (probability) interval for an unknown quantity of interest can be directly regarded as having a high probability of containing the unknown quantity

o A frequentist (confidence) interval may strictly be interpreted only in relation to a sequence of similar inferences that might be made in repeated practice
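The gap between P(D|H0) and P(H0|D) can be made concrete with Bayes' theorem. The numbers below are assumed purely for illustration (they are not from the cards): even when the data have only a .05 probability under H0, H0 can remain quite probable given the data.

```python
# Assumed illustrative numbers:
p_d_given_h0 = 0.05   # likelihood of the data under H0
p_d_given_h1 = 0.20   # likelihood of the data under the alternative
p_h0 = 0.80           # prior probability of H0

# P(D) summed over both hypotheses, then Bayes' theorem.
p_d = p_d_given_h0 * p_h0 + p_d_given_h1 * (1 - p_h0)
p_h0_given_d = p_d_given_h0 * p_h0 / p_d

print(p_h0_given_d)  # 0.5 -- far larger than P(D|H0) = 0.05
```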

17

## What is used for significance testing in the frequentist and Bayesian approaches?

###
Frequentist--> p value (null hypothesis significance test)

Bayes--> Bayes factor

18

## What is used for estimation with uncertainty in the frequentist and Bayesian approaches?

###
Freq: Maximum likelihood estimate with confidence intervals

Bay: Posterior distribution with highest density interval

19