________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu [gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of gov2001-l-request at lists.fas.harvard.edu [gov2001-l-request at lists.fas.harvard.edu]
Sent: Sunday, February 07, 2010 11:40 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 23
Send gov2001-l mailing list submissions to
gov2001-l at lists.fas.harvard.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l
or, via email, send a message with subject or body 'help' to
gov2001-l-request at lists.fas.harvard.edu
You can reach the person managing the list at
gov2001-l-owner at lists.fas.harvard.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of gov2001-l digest..."
Today's Topics:
1. Lamont (Meryl Federman)
2. factorials and plots in R (Maya Sen)
3. The gradient of likelihood functions (Lin, Eric)
4. Re: The gradient of likelihood functions (Gary King)
5. last call for co-authors (Maya Sen)
6. Re: last call for co-authors (Meryl Federman)
7. Combinations and permutations (Malekovic, Nino)
----------------------------------------------------------------------
Message: 1
Date: Sun, 7 Feb 2010 12:28:09 -0500
From: Meryl Federman <federman at fas.harvard.edu>
Subject: [gov2001] Lamont
To: Class List for Gov 2001/E-2001 <gov2001-l at lists.fas.harvard.edu>
Message-ID: <ADEC34F3-76F3-4940-A4FD-F2319824A505 at fas.harvard.edu>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Anyone who wants to, come by the Lamont cafe to work on the problem
set at 2:30!
~Meryl
Hi Meryl,
Do you mean 2.30 tomorrow afternoon?
Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School
------------------------------
Message: 2
Date: Sun, 7 Feb 2010 13:43:27 -0500
From: Maya Sen <msen at fas.harvard.edu>
Subject: [gov2001] factorials and plots in R
To: "Class List for Gov 2001/E-2001" <gov2001-l at lists.fas.harvard.edu>
Message-ID:
<16e0be401002071043x35870befs8d52812da19be6e3 at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hey all,
A question came up on how to do factorials in R. The answer is that you can
use the factorial function ("factorial"). So, for example:
> factorial(6)
[1] 720
> 6*5*4*3*2*1
[1] 720
Another question that came up was how to save the plots and figures that you
create in R. There are a few ways to do it, but I like to save my plots as
PDFs using the following type of command:
pdf(file = "myPlot.pdf", width = 5, height = 5, family = "Helvetica",
    pointsize = 10)
## opens device to save plot; you can specify size, font, etc.
plot(xVec, yVec)
## creates plot -- you can also use "hist" or "curve" here, of course
dev.off()
## closes device and finishes saving
If you move forward saving your plots as PDFs, then you should compile your
LaTeX code using pdflatex. Otherwise, you might get an error.
hope that helps,
Maya
When we want to compare two situations to see where the standard errors are likely to be higher, I take it we can look at the shape of the peak of the likelihood function. I know we'll get into the detailed calculations this week, but as far as eyeballing the shape goes, can we compare two situations (a steep hill vs. a flatter hill)? To do this, we need the likelihood on the y-axis to be on the same scale.
I remember that we cannot directly compare the likelihood value from one dataset to another, but we can compare the shapes to assess the precision of the estimates (which requires the same y-axis scale), right?
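One way to eyeball this in R is to rescale each likelihood so that its maximum equals 1, which puts both curves on the same y-axis. The sketch below is only an illustration under assumed data: two made-up binomial samples with the same success proportion but different sizes (the names rel_lik, small and large are not from the course materials).

```r
# Hypothetical data: same success proportion (0.3), different sample sizes.
theta <- seq(0.01, 0.99, by = 0.001)

# Binomial log-likelihood, dropping the constant that does not involve theta.
loglik <- function(theta, y, n) y * log(theta) + (n - y) * log(1 - theta)

# Rescale each curve so its maximum is 1, putting both on the same y-axis.
rel_lik <- function(theta, y, n) {
  ll <- loglik(theta, y, n)
  exp(ll - max(ll))
}

small <- rel_lik(theta, y = 3,  n = 10)
large <- rel_lik(theta, y = 30, n = 100)

# Width of the region where the relative likelihood exceeds 0.5:
# a rough, eyeball-style measure of how flat each hill is.
width_small <- diff(range(theta[small > 0.5]))
width_large <- diff(range(theta[large > 0.5]))

plot(theta, small, type = "l", lty = 2,
     xlab = "theta", ylab = "relative likelihood")
lines(theta, large, lty = 1)
legend("topright", legend = c("n = 10", "n = 100"), lty = c(2, 1))
```

In this made-up case the n = 100 curve is the steeper hill, which goes with a more precise estimate; the 0.5 cutoff is an arbitrary choice for illustration.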
In relation to Problem 3 of the 2nd homework, I am a bit confused regarding
likelihood of several observed independent events.
During last week's section, it was briefly mentioned that the likelihood of
several independent events produced by the same data-generating process can
be computed in the same way as we do for probabilities: by multiplying the
likelihood of each independent event.
Also, on a recent thread Maya wrote: "For more on combining these very
simple likelihoods, see the discussion in UPM on page 23". However, in page
23 I only found this brief mention: "the likelihoods of the same model
applied to independent data sets may be combined by taking their product".
However, does it make any sense to compound the likelihoods of separate data
sets? I thought that measures of likelihood cannot be compared across
datasets. Intuitively, it would make sense that successive observations
ought to improve our understanding of the underlying data-generating
process, but the whole notion that likelihood cannot be compared across
datasets is confusing me. Thank you.
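A small R sketch may help separate the two ideas. Multiplying likelihoods from independent data sets, all evaluated at the same parameter value, gives the likelihood of the pooled data; that is a different operation from comparing the likelihood value of one data set against that of another. The data below are invented for illustration, and the Bernoulli model is an assumption, not the homework's model.

```r
# Two hypothetical independent data sets from the same Bernoulli process.
y1 <- c(1, 0, 0, 1, 0)
y2 <- c(0, 0, 1, 0, 0, 1, 1, 0)

# Likelihood of one data set: product over its independent observations.
lik <- function(theta, y) prod(dbinom(y, size = 1, prob = theta))

theta <- 0.4
combined <- lik(theta, y1) * lik(theta, y2)  # product of the two likelihoods
pooled   <- lik(theta, c(y1, y2))            # likelihood of the pooled data
all.equal(combined, pooled)

# Maximizing the combined log-likelihood uses the information in both sets.
# The interval avoids theta = 0 and theta = 1, where log(0) would appear.
fit <- optimize(function(t) log(lik(t, y1)) + log(lik(t, y2)),
                interval = c(0.001, 0.999), maximum = TRUE)
fit$maximum  # close to the pooled proportion mean(c(y1, y2))
```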
When asked what the max likelihood estimate is, do you report the likelihood, or the value theta that achieves the max likelihood?
EXL
--
ERIC LIN
Technology and Operations Management
Harvard Business School
Boston, MA 02163
elin at hbs.edu
mobile: +1.216.225.2545
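On the question of what to report: the maximum likelihood estimate is the value of theta at which the likelihood peaks, not the height of the peak. A minimal R sketch, assuming made-up data (7 successes in 20 trials) and a binomial model, shows the two quantities side by side.

```r
# Hypothetical data: 7 successes in 20 Bernoulli trials.
y <- 7
n <- 20

# Binomial log-likelihood as a function of theta.
loglik <- function(theta) dbinom(y, size = n, prob = theta, log = TRUE)

fit <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)

theta_hat  <- fit$maximum    # the MLE: the value of theta at the peak
max_loglik <- fit$objective  # the log-likelihood at the peak, not the estimate
```

Here theta_hat should land near y/n, while max_loglik is only meaningful relative to other values of the same likelihood function.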
Dear Guys:
Since I am new to R, I am trying to get my feet wet with the software. I
have a random dataset. However, the dataset is in "list" format, instead of
"matrix" format. In addition, the dataset has both numeric and categorical
data.
In online tutorials, they recommend using the command "do.call." I used
that, but that didn't work. I tried to use a loop, but I don't know how to
call on each individual list so that it is added to the previous one. Any help?
Yousuf
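The structure of the list in the question is not shown, so the following is only a sketch under an assumed layout: a list of records, each a named list with numeric and character fields (records, rows and dat are illustrative names). Because a matrix can hold only one type, mixed data would be coerced to character there, so a data frame is probably the better target.

```r
# Hypothetical input: a list of records, each itself a named list
# mixing numeric and categorical (character) fields.
records <- list(
  list(id = 1, group = "a", score = 2.5),
  list(id = 2, group = "b", score = 3.1),
  list(id = 3, group = "a", score = 4.0)
)

# Turn each record into a one-row data frame, then stack the rows.
# do.call() passes the list elements to rbind() as separate arguments.
rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
dat <- do.call(rbind, rows)

str(dat)
```

If the list is shaped differently (for example, one element per column rather than per row), as.data.frame() on the whole list may be all that is needed.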
As regards the 2nd problem in our homework, what I asked is similar to the way the extended beta-binomial distribution was developed in UPM.
In the book, Professor King started with the binomial distribution. He then relaxed the assumption that π (pi) is constant and modeled π with a beta distribution. He applied the beta distribution and Bayes' rule to update the binomial distribution, and that is how the beta-binomial distribution came into being.
My question here is: do we have to use a multinomial distribution to model the fact that we do not know how many banks in our data set failed (anything from 5 to 15), and then use that multinomial distribution and Bayes' rule to update our binomial distribution from the 1st problem and reach our pdf for the 2nd problem?
Isn't there a simpler approach?
Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School
________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu [gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of gov2001-l-request at lists.fas.harvard.edu [gov2001-l-request at lists.fas.harvard.edu]
Sent: Saturday, February 06, 2010 7:59 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 20
Today's Topics:
1. Distribution and modeling (Lin, Eric)
2. Re: Distribution and modeling (Maya Sen)
3. Re: Model building, and "the rules" (Gary King)
4. Re: Distribution and modeling (Gary King)
5. Likelihood of independent events being produced by the same
model (Miguel Solano)
6. Binomial PDF when number of "successes" is unknown
(Malekovic, Nino)
----------------------------------------------------------------------
Message: 1
Date: Sat, 6 Feb 2010 14:25:21 -0500
From: "Lin, Eric" <elin at hbs.edu>
Subject: [gov2001] Distribution and modeling
To: Class List for Gov 2001/E-2001 <gov2001-l at lists.fas.harvard.edu>
Message-ID: <C7932C51.4A5B%elin at hbs.edu>
Content-Type: text/plain; charset="iso-8859-1"
I'm a little confused about the concept of modeling using a distribution - particularly, what level of analysis we are applying it to...
If we have a dataset of say n=100, that is a realization of a random process. We could plot that data and look at that distribution, but if I understand it right, that is not the distribution of interest. I think that when we use distributions to model, we are talking about random variables, so the distribution we pick is to model what the data could have looked like given an underlying data generation process, right?
So, to make things simple, take one of the 100 observations, obs_i. When we pick the distribution, are we modeling what value that single observation could have taken given an underlying process? Are we not modeling what the collective 100-observation sample would look like for a given sample draw?
Or is this the same thing? If it is the same thing, will the distribution we use to characterize the potential values of a given observation always match the distribution for the sample?
EXL
Hi all,
In the second problem of our homework, we have to calculate a maximum likelihood estimate, but y (the number of "successes") can vary in the interval between 5 and 15.
Do we just treat the number of successes as a random variable on {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}, in which case we work with a set of binomial pdfs and calculate an MLE interval instead of a single maximum likelihood estimate? In that case our pdf would look like P(y|π) = (100!/(y!(100-y)!)) * π^y * (1-π)^(100-y), y ∈ {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}.
If anyone has a different idea, please share with the rest of us.
Thanks,
Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School
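The "set of binomial pdfs" idea in this message can be sketched in R by maximizing the binomial likelihood separately for each candidate y and looking at the range of the resulting estimates. This only illustrates the approach as proposed in the message; whether it is the intended solution to the homework is not established here, and the variable names are made up.

```r
n <- 100          # number of trials, as in the pdf written in the message
y_values <- 5:15  # candidate numbers of "successes"

# For each candidate y, find the value of pi that maximizes the
# binomial log-likelihood (bounds keep pi away from 0 and 1).
mle_for_y <- sapply(y_values, function(y) {
  optimize(function(p) dbinom(y, size = n, prob = p, log = TRUE),
           interval = c(0.001, 0.999), maximum = TRUE)$maximum
})

# The "MLE interval" under this reading: smallest and largest estimates.
mle_range <- range(mle_for_y)
```

Each numerical maximum should sit close to the analytic binomial MLE y/n, so the range runs from about 5/100 to 15/100.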
________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu [gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of gov2001-l-request at lists.fas.harvard.edu [gov2001-l-request at lists.fas.harvard.edu]
Sent: Saturday, February 06, 2010 12:00 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 19
Today's Topics:
1. Re: Bayes Rule (Maya Sen)
2. looking for a co-author? (Maya Sen)
3. Model building, and "the rules" (Lin, Eric)
----------------------------------------------------------------------
Message: 1
Date: Fri, 5 Feb 2010 14:40:07 -0500
From: Maya Sen <msen at fas.harvard.edu>
Subject: Re: [gov2001] Bayes Rule
To: "Class List for Gov 2001/E-2001" <gov2001-l at lists.fas.harvard.edu>
Message-ID:
<16e0be401002051140h22f5ebe9l4526652c154596ff at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Bob,
So, you've got two questions here. The first one is if there are conditions
that must be in place prior to using Bayes Rule. I'm not exactly sure of
what conditions you're looking for here, but you do have to have all of the
constituent parts of the equation. Bayes Rule is
P(A|B) = P(B|A)*P(A)/P(B)
where
P(B|A) = the conditional probability of B given A
P(A) = the marginal probability of A
P(B) = the marginal probability of B
A lot of times you don't have the marginal probabilities of A and B, in
which case you can use the law of total probability (which we talked about
in class):
P(B) = P(B|A_1)P(A_1) + ... + P(B|A_n)P(A_n)
In terms of what's the simplest way to identify and label the correct
events, I think it's useful to see what in the problem is giving you clues
about P(A), P(B), etc. You should also look for clues that you are being
provided with a conditional probability by identifying words such as "given
that" etc.
It also helps to work through examples, so I've taken the following example
directly from the Wikipedia page on Bayes Rule:
Suppose there is a school with 60% boys and 40% girls as students. The
female students wear trousers or skirts in equal numbers; the boys all wear
trousers. An observer sees a (random) student from a distance; all the
observer can see is that this student is wearing trousers. What is the
probability this student is a girl? The correct answer can be computed using
Bayes' theorem.
The event *A* is that the student observed is a girl, and the event *B* is
that the student observed is wearing trousers. To compute P(*A*|*B*), we
first need to know:
- P(*A*), or the probability that the student is a girl regardless of any
other information. Since the observer sees a random student, meaning that
all students have the same probability of being observed, and the fraction
of girls among the students is 40%, this probability equals 0.4.
- P(*B*|*A*), or the probability of the student wearing trousers given
that the student is a girl. As they are as likely to wear skirts as
trousers, this is 0.5.
- P(*B*), or the probability of a (randomly selected) student wearing
trousers regardless of any other information. Since P(*B*) =
P(*B*|*A*)P(*A*) + P(*B*|*A*')P(*A*'), this is 0.5×0.4 + 1×0.6 = 0.8.
Given all this information, the probability of the observer having spotted a
girl given that the observed student is wearing trousers can be computed by
substituting these values in the formula:
P(A|B) = P(B|A)P(A)/P(B) = .5*.4/.8 = .25
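The same arithmetic can be checked in R. This is a minimal sketch using the numbers from the example above; the variable names are made up for illustration.

```r
# Inputs from the trousers example.
p_girl <- 0.4                  # P(A): student is a girl
p_boy  <- 1 - p_girl           # P(A'): student is a boy
p_trousers_given_girl <- 0.5   # P(B|A)
p_trousers_given_boy  <- 1.0   # P(B|A'): all boys wear trousers

# Law of total probability: P(B) = P(B|A)P(A) + P(B|A')P(A').
p_trousers <- p_trousers_given_girl * p_girl + p_trousers_given_boy * p_boy

# Bayes' rule: P(A|B) = P(B|A)P(A) / P(B).
p_girl_given_trousers <- p_trousers_given_girl * p_girl / p_trousers
p_girl_given_trousers
```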
hope that helps --
Maya
On Thu, Feb 4, 2010 at 5:43 PM, Bobby L. Woods <blwoods at fas.harvard.edu> wrote:
> Are there conditions that must be met prior to using Bayes Rule, or can it be
> used for any probability question? Also, what is the simplest way to identify
> and label the correct events?
>
> Thanks,
> Bob