*If you have made it this far, great! Below are a couple of bonus questions. How far can you reach?*
-----------------------------------

5. Cows and Mutant Cows
-------------------------

Farmer Jöns has a huge number of cows. Due to a recent radioactive leak in a nearby power plant he fears that some of them have become *mutant cows*. Jöns is interested in measuring the effectiveness of a diet on normal cows, but not on mutant cows (which might produce excessive amounts of milk, or nearly no milk at all!). The following data set contains the amount of milk for cows on a diet and cows on a normal diet:

```{r}
diet_milk <- c(651, 679, 374, 601, 4000, 401, 609, 767, 3890, 704, 679)
normal_milk <- c(798, 1139, 529, 609, 553, 743, 3,151, 544, 488, 15, 257, 692, 678, 675, 538)
```

Some of the data points might come from mutant cows (aka outliers).

**→ Jöns now wants to know: Was the diet any good, does it result in better milk production for non-mutant cows?**

**Hint:** Basically we have an outlier problem. A conventional trick in this situation is to replace the normal distribution with a distribution that has wider tails, so that the estimates are driven mostly by the central values and are much less affected by the far away values (this is a little bit like trimming away some amount of the data on the left and on the right). A good choice for such a distribution is the t-distribution, which is like the normal but with a third parameter called the "degrees of freedom". The lower the "degrees of freedom" the wider the tails, and when this parameter is larger than about 50 the t-distribution is practically the same as the normal. A good choice for the problem with the mutant cows would be to use a t-distribution with around 3 degrees of freedom:

```
y ~ student_t(3, mu, sigma);
```

Of course, you could also estimate the "degrees of freedom" as a free parameter, but that might be overkill in this case...

6. Chickens and diet
-------------------------

Farmer Jöns has a huge number of cows. He also has chickens.
He tries different diets on them too with the hope that they will produce more eggs. Below is the number of eggs produced in one week by chickens on a diet and chickens eating normal chicken stuff:

```{r}
diet_eggs <- c(6, 4, 2, 3, 4, 3, 0, 4, 0, 6, 3)
normal_eggs <- c(4, 2, 1, 1, 2, 1, 2, 1, 3, 2, 1)
```

**→ Jöns now wants to know: Was the diet any good, does it result in the chickens producing more eggs?**

**Hint:** The Poisson distribution is a discrete distribution that is often a reasonable choice when one wants to model count data (like, for example, counts of eggs). The Poisson has one parameter $\lambda$ which stands for the mean count. In Stan you would use the Poisson like this:

```
y ~ poisson(lambda);
```

where `y` would be a single integer or an integer array of length `n` (defined like `int y[n];`) and `lambda` a real number bounded at 0.0 (`real