Thanks for pointing this out, Alex. Note that if you plot the
birthday distribution for i in 1:100, you asymtotically approach
1 for values of i > 50 or so. At 0.99, the values are not
particularly well differentiated (because the CDF is nearly 1
and nearly flat), so I'll accept a small range of answers for
this subpart.
On numerical stability in general, you should probably try to
use the summation of the log terms for the Negative Binomial
problem this week rather than actually multiplying the
probabilities. Just remember to exp() the numbers you get back
to restore them to the proper scale.
----- Original Message -----
From: "Alex Montgomery" <ahm(a)stanford.edu>
To: "Olivia Lau" <olau(a)fas.harvard.edu>
Sent: Friday, October 08, 2004 12:13 AM
Subject: Assignment 2 Solution Set
> Hi Olivia-
>
> One minor discrepancy in the solution set: You need 57 people
> for 0.99 (if you work it out analytically, it comes to
> 0.9901225) - if you make sims high enough, it ends up with
> this answer, too.
>
> -A
>
Hi, everyone.
For the beta binomial, skip step 1 of Gary's instructions. Draw pi.tilde
from the beta distribution and then follow the rest of the steps.
pi.tilde is just one draw from the beta distribution, for one simulation
from the beta binomial. You can use rbeta() to get this value, but you
will have to reparameterize gamma and mu to get the inputs (alpha and
beta) to the rbeta() function.
You cannot use rbeta() for question 5.
______________________________
Hi Olivia, I'm trying to follow the steps on how to simulate the
beta-binomial but no luck. My questions are (sorry if the answers are very
obvious): In step 1: how do we define pi_j at this point? Could it just be
any random variable from 0 to 1? In step 3: I understand it is that
pi_tilda is just one draw from the beta distribution. Is that correct? Can
we use rbeta() to get this value? If we can, the parameters u and and
gamma correspond to the parameters needed in the rbeta() comand? Thanks a
lot, Cecilia
Olivia.
Does anybody know
1) how to find mode, the most frequent value in a variable,
NOT R attribution of an object?
2) how to make histogram which can distinguish the frequency of value 0 cases
and that of value 1 cases?
Thank you in advance,
Kentaro
We were a little curious about how to plot the PDFs in 3 and 5.
In 3, we should be plotting from the simulation, right? That is, is
the PMF going to be a histogram of simulations? Or can we plot from
the probability mass function itself.
What about 5? I don't see an obvious way to simulate it from other
more basic distributions. If we use the functional form
(reparameterized by King in class), I'm not sure what should be on the
X axis.
Thanks!
/\llan
Hi,
Thanks to Jens for pointing this out. In part 6, it should be: "Write a
function which generates iid draws from the Beta-Binomial distribution for
user-specified values of mu, gamma and n. (Not pi and n -- pi is
unobserved.)
A corrected version is online. Yours,
Olivia.
If anyone out there as big a geek as me, they might like this article
on Benford's Law from the American Scientist
Hill, T.P.; "The First Digit Phenomenon," American Scientist, 86:358, 1998.
PDF: www.math.gatech.edu/~hill/ publications/cv.dir/1st-dig.pdf
I was kind of curious about the coin flipping thing. Sure enough, the
algebra is a little prickly, but the simulation was a breeze.
Code if anyone wants to tinker:
#Simulates flipping a coin 200 times to see if six heads or tails arrive in
a row
#Based on class exercise of TP Hill to show use of probability in auditing
sixheads <- function(sims) {
counter <- 0
for(i in 1:sims) {
flips <- berns(.5, 200) ## toss a coin 200 times
flips <- paste(flips, collapse="") ## make a string for grep
if(any(grep("111111", flips))) {counter <- counter + 1}
##look for the string of 6 heads
else if(any(grep("000000", flips))) {counter <- counter +
1} ##and tails
}
counter/sims ##returns percentage where a string occured
}
A couple of dumb questions regarding problem 6. I am unclear about the
correct interpretation of the steps to simulate the beta-binominal outlined
in the lecture notes.
There it says:
1. Begin with N Bernoulli trials with parameter pi.j, j = 1, . . . ,N (not
necessarily
independent or identically distributed)
So pi.j is allowed to vary across each of N trials, but not across
simulations of each N, right? In the former interpretation we would have as
many pi.js as Ns, but each pi.j remains constant across simulations. In the
latter interpretation we would have N*sims pi.js.
2.1 Choose mu = E(pi.j)
I guess this refers to the mean of pi.j over all Ns and not the mean of pi.j
across simulations of each N? So for each simulation, mu will be a single
number.
3. Draw pi.tilda from Beta(pi|mu,gamma)
Which pi (given mu & gamma) is this referring to? In the slide this pi has
no subscript or tilda or anything.
Thanks,
Jens
Hi,
In anticipation of the deluge tonight (since most of the study groups are
meeting tonight), you should be aware that I will only be answering email
intermittently from 6pm to 11pm tonight, but you should feel free do drop
by the basement of the Carpenter Center and look for me. I will respond
to all email (sent during that period) by Wed morning.
Also, some of you sent email on Thursday afternoon last week -- I have
class all Thursday afternoon until section, so you should not expect a
reply (from me) during that period.
Yours,
Olivia.
Hi, everyone.
All of you who use the HMDC VNC connection tool
("Double_click_me") need to download a new version available at:
http://www.hmdc.harvard.edu/HMDC_VNC.EXE
Please do so IMMEDIATELY. This should cut down on the number of
multiple sessions.
And those of you with multiple sessions need to kill them
immediately. This is getting out of hand.
Yours,
Olivia