# candy

## Reese’s Pieces and Bayesian Inference

From http://en.wikipedia.org/wiki/Reese%27s_Pieces#/media/File:Reeses-pieces-loose.JPG.

I eat Reese’s Pieces almost every day after lunch, and they come in three colors: orange, yellow, and brown.

I’ve wondered for a while whether the three colors occur in equal proportions, so for lunch today, I thought I’d try to infer the occurrence rates using Bayes’ Theorem.

Bayes’ Theorem provides a quantitative way to update your estimate of the probability for some event, given some new information. In math, the theorem looks like

$$P\left( H | E \right) = \dfrac{ P\left( E | H \right) P\left( H \right)}{P\left( E \right)}.$$

In words: the probability for event $H$ to happen, given that some condition $E$ is met, is the probability that $E$ is met, given that $H$ happened, times the probability for $H$ to happen at all, divided by the probability for $E$ to be met at all.

$P(H)$ is called the “prior” and represents your initial estimate for the probability that $H$ occurs. $P(E)$, sometimes called the “evidence”, is the overall probability that $E$ occurs and acts as a normalizing constant. $P\left(E | H \right)$ is called the “likelihood”, and $P(H | E)$ is the “posterior”, the thing we know AFTER $E$ is satisfied. $P(H | E)$ is usually the thing we’re trying to calculate.
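To make the formula concrete, here is a minimal sketch in Python that applies Bayes’ Theorem to a small set of competing hypotheses. The function name `bayes_posterior` and the two candidate hypotheses (orange makes up 1/3 or 1/2 of the bag, each given equal prior weight) are made up for illustration; they are not part of my actual calculation.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(H | E) for each hypothesis H, given P(H) and P(E | H)."""
    # P(E) is the sum over all hypotheses of P(E | H) * P(H)
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


# Hypothetical setup: orange makes up either 1/3 or 1/2 of the candies,
# and both hypotheses start out equally plausible.
priors = [0.5, 0.5]
# E = "the first candy I draw is orange", so P(E | H) is just the
# orange frequency under each hypothesis.
likelihoods = [1 / 3, 1 / 2]

posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

A single orange draw shifts the weight from an even split to roughly 0.4 for the 1/3 hypothesis and 0.6 for the 1/2 hypothesis.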

So for my case, $H$ will be the hypothesis that a certain color occurs with some frequency $f$, and $E$ will be my experimental data.

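For a given frequency $f_{\rm orange}$ of oranges (or browns or yellows), the probability $P(E | f_{\rm orange})$ that I draw $N_{\rm orange}$ oranges and $N_{\rm not\ orange}$ candies of other colors is $\propto f_{\rm orange}^{N_{\rm orange}} \left(1 - f_{\rm orange}\right)^{N_{\rm not\ orange}}$. If the prior on $f_{\rm orange}$ is flat, the posterior $P(f_{\rm orange} | E)$ is proportional to the same expression. As I select more and more candies, I can keep re-evaluating $P(f_{\rm orange} | E)$ over the whole allowed range of $f_{\rm orange}$ (0 to 1) and find the value that maximizes it.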
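As a rough illustration of that procedure, the sketch below evaluates $f^{N} (1 - f)^{N_{\rm not}}$ on an evenly spaced grid and picks out the peak. It assumes a flat prior, and the function name `posterior_on_grid`, the grid size, and the example counts (3 oranges, 1 non-orange) are placeholders chosen for illustration, not values from the notebook.

```python
def posterior_on_grid(n_color, n_other, n_grid=101):
    """Evaluate the posterior for a color's frequency f on a grid from 0 to 1.

    Assumes a flat prior, so the posterior is proportional to
    f**n_color * (1 - f)**n_other.
    """
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    unnormalized = [f**n_color * (1.0 - f)**n_other for f in grid]
    # Normalize so the curve integrates to ~1 (simple Riemann sum)
    df = grid[1] - grid[0]
    area = sum(unnormalized) * df
    return grid, [u / area for u in unnormalized]


# Hypothetical counts: 3 candies of the color of interest, 1 of another color
grid, posterior = posterior_on_grid(n_color=3, n_other=1)

# The most probable frequency is the grid point where the posterior peaks
best_index = max(range(len(grid)), key=lambda i: posterior[i])
f_best = grid[best_index]
print(f_best)
```

For these counts the peak sits at the observed fraction, 3/4, which is what you would expect when the prior is flat.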

Closing my eyes, I pulled ten different candies out of the bag, with the following results in sequence: brown, orange, orange, yellow, orange, orange, orange, brown, orange, yellow, orange. These results obviously suggest orange has a higher frequency than yellow or brown.

This IPython notebook implements the calculation I described, and the plots below show how $P$ changes after a certain number of trials $n_{\rm trials}$:

So, for example, before I did any trials ($n_{\rm trials} = 0$), I assumed all frequencies were equally likely for each color. After the first trial, when I chose a brown candy, the probability that brown has a higher frequency than the other colors goes up. After three trials (brown, orange, orange), orange takes the lead, and since I hadn’t seen any yellows, there’s a non-zero probability that yellow’s frequency is actually zero. We can see how the probabilities settle down after ten trials.
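The trial-by-trial updating described above can be sketched as follows. This is not the notebook itself, just a minimal stand-in that assumes a flat prior at $n_{\rm trials} = 0$ and treats each color separately as "this color" versus "anything else". The helper names are hypothetical, and the draw sequence is copied from the text exactly as listed (the list contains eleven entries).

```python
# Draw sequence copied from the text as listed (eleven entries)
DRAWS = ["brown", "orange", "orange", "yellow", "orange", "orange",
         "orange", "brown", "orange", "yellow", "orange"]


def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def sequential_posteriors(draws, color, n_grid=101):
    """Posterior over the frequency f of `color` after 0, 1, 2, ... draws."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    posterior = normalize([1.0] * n_grid)  # flat prior before any trials
    history = [posterior]
    for draw in draws:
        # Likelihood of one draw: f if it matches the color, (1 - f) otherwise
        likelihood = [f if draw == color else 1.0 - f for f in grid]
        posterior = normalize([p * l for p, l in zip(posterior, likelihood)])
        history.append(posterior)
    return grid, history


def most_probable(grid, posterior):
    """Grid value of f where the posterior is largest."""
    best = max(range(len(grid)), key=lambda i: posterior[i])
    return grid[best]


for n_trials in (1, 3, 10):
    for color in ("orange", "yellow", "brown"):
        grid, history = sequential_posteriors(DRAWS, color)
        print(n_trials, color, most_probable(grid, history[n_trials]))
```

Each entry of `history` is the posterior after that many draws, so `history[0]` is the flat prior and `history[3]` corresponds to the state after brown, orange, orange. With a flat prior the peak simply tracks the observed fraction of each color so far; the width of the curve, which this sketch does not summarize, is what tells you how much to trust that peak.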

Based on this admittedly simple experiment, it seems that oranges have a frequency about twice that of yellows and browns. Although not as much fun, if I’d bothered to check Wikipedia, I would have seen that “The goal color distribution is 50% orange, 25% brown, and 25% yellow”, which is totally consistent with my estimate.