## 16 July, 2015

### Bayes' Rule

(This post acts as sort of a prequel to my post on Bayesian inference)

I've talked enough about it, so I figured I would make an actual post on a probability rule that I've been using quite a bit - Bayes' rule.

## Purpose

I want to jump straight into a probability example first. Much like the infield fly rule or the offside rule in soccer, I think Bayes' rule makes much more sense when you understand what it's trying to do before learning the technical details.

Suppose a manager has exactly two pinch hitters available to him - let's call them Adam and José.  Adam has a $0.350$ OBP and José has a $0.300$ OBP. The manager calls Adam 70% of the time and calls José 30% of the time.

So without knowing who the manager will call, how do we calculate what the probability is that the pinch hitter will get on-base? Like so:

$P(\textrm{ On-Base }) = P(\textrm{ On-Base } | \textrm{ Adam })P(\textrm{ Adam })+P(\textrm{ On-Base } | \textrm{ José })P(\textrm{ José })$

The notation $P(\textrm{ On-Base }|\textrm{ Adam })$ means the probability of getting on-base "given" that Adam was chosen to pinch hit - that is, if we knew that the manager selected Adam, there would be a $0.350$ probability of getting on-base. As stated above, $P(\textrm{ Adam })$ - the probability the manager selects player Adam - is $0.7$. Similarly, $P(\textrm{ On-Base }|\textrm{ José }) = 0.300$ and $P(\textrm{ José }) =0.3$. Plugging numbers into the formula above,

$P(\textrm{ On-Base }) = (0.350)(0.7) + (0.300)(0.3) = 0.335$

The OBP is $0.350$ with probability $0.7$, and $0.300$ with probability $0.3$, so overall there is a $0.335$ probability that the pinch hitter will get on-base.
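The arithmetic here is simple enough to check in a few lines of Python (the variable names are just my own labels for the quantities above):

```python
# Overall on-base probability, averaging over which pinch hitter
# the manager calls. Numbers come from the example above.
p_adam, p_jose = 0.7, 0.3          # P(Adam), P(José)
obp_adam, obp_jose = 0.350, 0.300  # P(On-Base | Adam), P(On-Base | José)

p_on_base = obp_adam * p_adam + obp_jose * p_jose
print(round(p_on_base, 3))  # 0.335
```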

Now, let's flip what you know around - suppose you know that the pinch hitter got on base, but not which pinch hitter it was. Which player do you think was picked, Adam or José? Logically it's more likely to be Adam - but how much more likely? Can you give probabilities?

## Bayes' Rule

This is the basic idea of Bayes' rule - it flips conditional probabilities around. Instead of $P(\textrm{ On-Base } | \textrm{ Adam })$, it allows you to find $P(\textrm{ Adam } | \textrm{ On-Base })$. For two events $A$ and $B$, the basic formulation is

$P(B | A) = \dfrac{P(A|B)P(B)}{P(A)}$

$P(A)$ on the bottom can be calculated as

$P(A) = P(A | B)P(B) + P(A | \textrm{ Not B }) P( \textrm{ Not B })$

Note that this is why I specified above that there were exactly two pinch hitters - saying "Not Adam" is the same thing as saying "José". If there are more than two pinch hitters available, the formula above can be expanded.
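For the record, the expanded version is the law of total probability: if $B_1, B_2, \dots, B_n$ are mutually exclusive events and exactly one of them must happen (say, the manager must pick exactly one of $n$ pinch hitters), then

$P(A) = \displaystyle\sum_{i=1}^{n} P(A | B_i)P(B_i)$

The two-hitter formula above is just the $n = 2$ case.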

Applying this to the on-base probabilities, we have

$P(\textrm{ Adam }| \textrm{ On-Base }) = \dfrac{P(\textrm{ On-Base }|\textrm{ Adam })P(\textrm{ Adam })}{P(\textrm{ On-Base })}$

$=\dfrac{P(\textrm{ On-Base }|\textrm{ Adam })P(\textrm{ Adam })}{ P(\textrm{ On-Base } | \textrm{ Adam })P(\textrm{ Adam })+P(\textrm{ On-Base } | \textrm{ José })P(\textrm{ José })}$

Plugging in numbers, we get

$P(\textrm{ Adam }| \textrm{ On-Base }) = \dfrac{(0.350)(0.7)}{0.335} \approx 0.731$

And similarly,

$P(\textrm{ José }| \textrm{ On-Base }) = \dfrac{(0.300)(0.3)}{0.335} \approx 0.269$

So, given that the pinch hitter got on base, there was approximately a 73.1% chance it was Adam and approximately a 26.9% chance it was José.
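The whole flip can be verified numerically - this Python snippet just re-implements the formulas above with the example's numbers:

```python
# Bayes' rule: flip P(On-Base | player) around to P(player | On-Base).
p_adam, p_jose = 0.7, 0.3          # P(Adam), P(José)
obp_adam, obp_jose = 0.350, 0.300  # P(On-Base | Adam), P(On-Base | José)

# Denominator: overall on-base probability (0.335, as computed earlier)
p_on_base = obp_adam * p_adam + obp_jose * p_jose

p_adam_given_ob = obp_adam * p_adam / p_on_base
p_jose_given_ob = obp_jose * p_jose / p_on_base
print(round(p_adam_given_ob, 3), round(p_jose_given_ob, 3))  # 0.731 0.269
```

Note that the two posterior probabilities sum to 1, as they should - given that someone got on base, it was either Adam or José.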

## One More Example

Adam tests positive for PED use. Let's suppose 10% of all MLB players are using PEDs. The particular test Adam took has 95% sensitivity and 95% specificity - that is, if a player is using PEDs the test will correctly say so 95% of the time, and if a player is not using PEDs the test will correctly say so 95% of the time. Given a positive test, what is the probability that Adam is actually using PEDs? It's not 95%! We have to use Bayes' rule to figure it out.

I'm going to use "+" to indicate a positive test (indicating the test says that the player is using drugs) and a "-" to indicate a negative test.

$P(\textrm{ PEDs } | +) = \dfrac{P( + | \textrm{ PEDs })P(\textrm{ PEDs })}{P( + | \textrm{ PEDs })P(\textrm{ PEDs })+P( + | \textrm{ Not PEDs })P(\textrm{ Not PEDs }) }$

Let's figure these out one by one. As stated in the problem description, if a person is using PEDs, the test will identify so 95% of the time. Hence, $P( + | \textrm{ PEDs }) = 0.95$. Furthermore, 10% of all players are using PEDs, so $P(\textrm{ PEDs }) = 0.1$ and $P(\textrm{ Not PEDs }) = 0.9$. Lastly, since $P( - | \textrm{ Not PEDs }) = 0.95$ as stated in the problem description, it must be that $P( + | \textrm{ Not PEDs }) = 0.05$.

Plugging all these numbers in, we get

$P(\textrm{ PEDs } | +) = \dfrac{(0.95)(0.1)}{(0.95)(0.1) + (0.05)(0.9)} \approx 0.679$

So given that Adam tests positive for PEDs, there's actually only about a 2/3 chance that he's using. It seems counter-intuitive given that the test is pretty good - 95% sensitivity and specificity - but since most players (90%) aren't using, there are bound to be a lot of false positives. If Adam gets suspended over this particular test, he has a very, very good argument that he's been wronged.

Put another way - suppose you have 200 MLB players. 180 (90% of the total) are clean, and 20 (10% of the total) are using PEDs. Of the 180 that are clean, 171 (95% of the clean) test negative and 9 (5% of the clean) test positive. Of those that are using, 19 (95% of the PED users) test positive and 1 (5% of the PED users) tests negative.

This gives 19 PED users testing positive and 9 clean players testing positive, so the probability of being a PED user given testing positive is 19/(9+19) ≈ 0.679.
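That counting argument translates directly into code - here's a short Python version of it, using the same made-up 200-player population:

```python
# Recreate the counting argument: 200 players, 10% using PEDs,
# a test with 95% sensitivity and 95% specificity.
n_players = 200
users = int(0.10 * n_players)    # 20 players using PEDs
clean = n_players - users        # 180 clean players

true_positives = 0.95 * users    # 19 users test positive
false_positives = 0.05 * clean   # 9 clean players test positive

p_ped_given_positive = true_positives / (true_positives + false_positives)
print(round(p_ped_given_positive, 3))  # 0.679
```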

(Note that I made all these numbers up. I'm sure that the tests MLB actually uses have higher specificity and sensitivity than 95%, and I have no idea what proportion of all MLB players are using PEDs)

## From Bayes' Rule to Inference

So how do we go from the rule to inference? Given some sort of model with parameter $\theta$, we can calculate $p(x | \theta)$ -  the probability of seeing the data that you saw given a particular value of the parameter. You may recognize this as the likelihood from earlier posts. Bayesians use Bayes' rule to flip around what's inside the probability statement and calculate $p(\theta | x)$  - the probability of a particular value of the parameter given the data that you saw - by

$p(\theta | x) = \dfrac{p(x | \theta)p(\theta)}{\int p(x | \theta)p(\theta) d\theta}$

where $p(\theta)$ is the prior distribution chosen by the Bayesian and $p(\theta | x)$ is the posterior distribution that is calculated. Inference about $\theta$ is then performed using the posterior.

That's Bayesian inference in a nutshell - start with a model, calculate the probability of seeing the data $x$ given a parameter $\theta$, and then use Bayes' rule to flip that around to the probability of  the parameter $\theta$ given that you saw the data $x$.
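As a rough illustration of that recipe (a toy example of my own, not anything from real MLB data), here's a grid approximation of a posterior: model plate appearances as independent coin flips with unknown on-base probability $\theta$, put a flat prior on $\theta$, and observe 10 times on base in 30 plate appearances.

```python
# A minimal sketch of Bayesian inference by grid approximation.
# The setup (10 on base in 30 plate appearances, flat prior) is invented
# purely for illustration.
from math import comb

grid = [i / 100 for i in range(1, 100)]   # candidate theta values
prior = [1 / len(grid)] * len(grid)       # flat prior p(theta)

on_base, trials = 10, 30
# Binomial likelihood p(x | theta) at each grid point
likelihood = [comb(trials, on_base) * t**on_base * (1 - t)**(trials - on_base)
              for t in grid]

# Bayes' rule: posterior is proportional to likelihood times prior,
# normalized so it sums to 1 (the integral becomes a plain sum over the grid)
unnorm = [l * p for l, p in zip(likelihood, prior)]
posterior = [u / sum(unnorm) for u in unnorm]

# Posterior mean: with a flat prior this lands near (10+1)/(30+2) = 0.344
post_mean = sum(t * p for t, p in zip(grid, posterior))
print(round(post_mean, 3))
```

The point of the sketch is that the scary-looking integral in the denominator of Bayes' rule just becomes a sum over the grid, and the posterior mean (about $0.344$) gets pulled slightly away from the raw $10/30 \approx 0.333$ toward the prior.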

(Oh, and then do checking to make sure your model fits - but that's another post)