Gov2001 February 2010

gov2001@lists.gking.harvard.edu

35 participants
72 discussions

by nino_malekovic＠hks11.harvard.edu

Hi Meryl,

Do you mean 2.30 tomorrow afternoon?

Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School

________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu On Behalf Of gov2001-l-request at lists.fas.harvard.edu
Sent: Sunday, February 07, 2010 11:40 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 23

Message: 1
Date: Sun, 7 Feb 2010 12:28:09 -0500
From: Meryl Federman <federman at fas.harvard.edu>
Subject: [gov2001] Lamont

Anyone who wants to, come by the Lamont cafe to work on the problem set at 2:30! ~Meryl

14 years, 2 months

The gradient of likelihood functions

by elin＠hbs.edu

When we want to compare two situations to see where the standard errors are likely to be higher, I take it we can look at the shape of the peak of the likelihood function. I know we'll get into how to do the detailed calculations this week, but as far as eyeballing the shape goes, can we make comparisons between two situations (comparing a steep hill vs. a flatter hill)? To do this, we need to have the y-axis of the likelihood on the same scale. I remember that we cannot directly compare the likelihood value from one dataset to another, but we can compare the shapes to assess the precision of the estimates (which requires the same y-axis scale), right?
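One way to eyeball the steep hill vs. flat hill comparison is sketched below. It is only an illustration under assumptions that are not in the message above: a binomial model, two made-up data sets with the same observed proportion (6 of 20 and 60 of 200), and a common scale obtained by subtracting each log-likelihood's own maximum so that both curves peak at zero. The curvature formula used for the rough standard errors is the standard second derivative of the binomial log-likelihood at its peak.

```r
# Binomial log-likelihood in p, dropping the constant choose(n, y),
# which does not depend on p and so does not affect the shape.
loglik <- function(p, y, n) y * log(p) + (n - y) * log(1 - p)

p_grid <- seq(0.01, 0.99, by = 0.001)

# Two hypothetical data sets with the same observed proportion (0.3)
ll_small <- loglik(p_grid, y = 6,  n = 20)
ll_large <- loglik(p_grid, y = 60, n = 200)

# Put both curves on a common scale: each one now peaks at 0
rel_small <- ll_small - max(ll_small)
rel_large <- ll_large - max(ll_large)

plot(p_grid, rel_small, type = "l", ylim = c(-10, 0),
     xlab = "p", ylab = "log-likelihood minus its maximum")
lines(p_grid, rel_large, lty = 2)
legend("topright", legend = c("n = 20", "n = 200"), lty = c(1, 2))

# Curvature (second derivative) of the log-likelihood at the peak y / n
curvature <- function(y, n) {
  p_hat <- y / n
  -n / (p_hat * (1 - p_hat))
}

# A sharper peak (more negative curvature) gives a smaller standard error
se_small <- sqrt(-1 / curvature(6, 20))
se_large <- sqrt(-1 / curvature(60, 200))
```

The heights of the two raw curves are not comparable, which is why each one is shifted by its own maximum before plotting; only the steepness around the peak carries the information about precision.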

14 years, 2 months

factorials and plots in R

by msen＠fas.harvard.edu

Hey all,

A question came up on how to do factorials in R. The answer is that you can use the factorial function ("factorial"). So, for example:

> factorial(6)
[1] 720
> 6*5*4*3*2*1
[1] 720

Another question that came up was how to save the plots and figures that you create in R. There are a few ways to do it, but I like to save my plots as PDFs using the following type of command:

pdf(file = "myPlot.pdf", width = 5, height = 5, family = "Helvetica", pointsize = 10) ## opens device to save plot; you can specify size, font, etc.
plot(xVec, yVec) ## creates plot -- you can also use the "hist" or "curve" functions here, of course
dev.off() ## closes device and finishes saving

If you go on to save your plots as PDFs, then you should compile your LaTeX code using pdflatex. Otherwise, you might get an error.

Hope that helps,
Maya
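A small follow-on sketch, not part of the original message: factorials grow fast enough that factorial() overflows for fairly modest inputs, so for binomial-type calculations it can be safer to use choose() or to work on the log scale. The numbers 100 and 10 below are purely illustrative.

```r
# factorial(x) is computed as gamma(x + 1), so it overflows to Inf once the
# result exceeds the largest representable double (about 1.8e308).
f170 <- factorial(170)                     # still finite
f171 <- suppressWarnings(factorial(171))   # Inf

# Binomial coefficient 100! / (10! * 90!) without forming the factorials
n_choose_y <- choose(100, 10)

# The same quantity on the log scale, two equivalent ways
log_n_choose_y <- lchoose(100, 10)
log_via_lgamma <- lgamma(101) - lgamma(11) - lgamma(91)
```

Working with lchoose() or lgamma() keeps everything finite even when the factorials themselves would not be.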

14 years, 2 months

Lamont

by federman＠fas.harvard.edu

Anyone who wants to, come by the Lamont cafe to work on the problem set at 2:30! ~Meryl

14 years, 2 months

Likelihood of independent events being produced by the same model

by msolano＠fas.harvard.edu

In relation to Problem 3 of the 2nd homework, I am a bit confused about the likelihood of several observed independent events. During last week's section, it was briefly mentioned that the likelihood of several independent events produced by the same data-generating process can be computed in the same way as we do for probabilities: by multiplying the likelihoods of the individual events. Also, on a recent thread Maya wrote: "For more on combining these very simple likelihoods, see the discussion in UPM on page 23". On page 23, though, I only found this brief mention: "the likelihoods of the same model applied to independent data sets may be combined by taking their product".

However, does it make any sense to compound the likelihoods of separate data sets? I thought that measures of likelihood cannot be compared across datasets. Intuitively, it would make sense that successive observations ought to improve our understanding of the underlying data-generating process, but the whole notion that likelihood cannot be compared across datasets is confusing me.

Thank you.
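A minimal sketch of the product rule quoted from UPM, under assumptions that are not in the message: a binomial model and two invented independent data sets (3 successes in 20 trials, 12 in 80). It multiplies the two likelihood curves for the same parameter and compares the result with the likelihood of the pooled data.

```r
# Binomial likelihood of success probability p given y successes in n trials
lik <- function(p, y, n) dbinom(y, size = n, prob = p)

p_grid <- seq(0.001, 0.999, by = 0.001)

# Two hypothetical independent data sets from the same process
y1 <- 3;  n1 <- 20
y2 <- 12; n2 <- 80

# Product of the separate likelihoods vs. likelihood of the pooled data
lik_combined <- lik(p_grid, y1, n1) * lik(p_grid, y2, n2)
lik_pooled   <- lik(p_grid, y1 + y2, n1 + n2)

# The two curves differ only by a factor that does not involve p
ratio <- lik_combined / lik_pooled

# So they peak at the same parameter value
p_hat_combined <- p_grid[which.max(lik_combined)]
p_hat_pooled   <- p_grid[which.max(lik_pooled)]
```

Multiplying is different from comparing heights: the product is still a function of the same parameter, and only its shape and the location of its peak are used, so the arbitrary constant attached to each data set's likelihood does no harm.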

14 years, 2 months

Convention

by elin＠hbs.edu

When asked what the maximum likelihood estimate is, do you report the likelihood, or the value of theta that achieves the maximum likelihood?

EXL

--
ERIC LIN
Technology and Operations Management
Harvard Business School
Boston, MA 02163
elin at hbs.edu
mobile: +1.216.225.2545
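The distinction in the question can be made concrete with a short sketch. In the usual convention, the maximum likelihood estimate is the parameter value at which the likelihood peaks, while the height of the peak is a separate number. The data below (10 successes in 100 trials) and the binomial model are assumptions chosen only for illustration.

```r
# Hypothetical data: 10 successes in 100 trials
y <- 10
n <- 100

# Log-likelihood of the success probability p
loglik <- function(p) dbinom(y, size = n, prob = p, log = TRUE)

# Numerically maximize over p in (0, 1)
fit <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)

theta_hat  <- fit$maximum    # the estimate: a value of the parameter
max_loglik <- fit$objective  # the height of the log-likelihood at that value
```

Here theta_hat lands close to y / n, which is what would be reported as the estimate; max_loglik on its own says little because its scale depends on the data set.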

14 years, 2 months

gov2001-l Digest, Vol 55, Issue 20

by usuf.marvi＠gmail.com

Dear Guys:

Since I am new to R, I am trying to get my feet wet with the software. I have a random dataset. However, the dataset is in "list" format instead of "matrix" format. In addition, the dataset has both numeric and categorical data. Online tutorials recommend using the command "do.call". I tried that, but it didn't work. I also tried to use a loop, but I don't know how to call on each individual list so that it is added to the previous one. Any help?

Yousuf
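On the list-to-table question in the message above, here is one possible sketch. The structure of the actual dataset is not given, so the example assumes a list in which each element is one record with the same named fields; the names id, income, and region are invented. Because a matrix can hold only one type, a data frame is used for the mixed numeric and categorical columns.

```r
# Hypothetical data: each list element is one observation
records <- list(
  list(id = 1, income = 42.5, region = "north"),
  list(id = 2, income = 37.0, region = "south"),
  list(id = 3, income = 51.2, region = "north")
)

# Turn each record into a one-row data frame, then stack all rows at once.
# do.call(rbind, rows) is the same as rbind(rows[[1]], rows[[2]], rows[[3]]).
rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
dat <- do.call(rbind, rows)

# Store the categorical column as a factor
dat$region <- factor(dat$region)
str(dat)

# Loop version: start from NULL and append one row per record
dat_loop <- NULL
for (r in records) {
  dat_loop <- rbind(dat_loop, as.data.frame(r, stringsAsFactors = FALSE))
}
```

The do.call version is usually preferable for long lists, since the loop copies the growing data frame on every pass.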

14 years, 2 months

Fwd: R help

by usuf.marvi＠gmail.com

14 years, 2 months

gov2001-l Digest, Vol 55, Issue 20

by nino_malekovic＠hks11.harvard.edu

As regards the 2nd problem in our homework, what I asked is similar to the way the extended beta-binomial distribution was developed in UPM. In the book, Professor King started with the binomial distribution. He then relaxed the assumption that π (pi) is constant and modeled π using the beta distribution. He applied the beta distribution and Bayes' rule in order to update the binomial distribution, and that is how the beta-binomial distribution came into being.

My question here is: do we have to use a multinomial distribution to model the fact that we do not know how many banks in our data set failed (anything from 5 to 15), and then use that multinomial distribution and Bayes' rule in order to update our binomial distribution from the 1st problem and reach our pdf for the 2nd problem? Isn't there a simpler approach?

Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School

________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu On Behalf Of gov2001-l-request at lists.fas.harvard.edu
Sent: Saturday, February 06, 2010 7:59 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 20

Message: 1
Date: Sat, 6 Feb 2010 14:25:21 -0500
From: "Lin, Eric" <elin at hbs.edu>
Subject: [gov2001] Distribution and modeling

I'm a little confused on the concept of modeling using a distribution, particularly what level of analysis we are applying it to.

If we have a dataset of, say, n=100, that is a realization of a random process. We could plot that data and look at that distribution, but if I understand it right, that is not the distribution of interest. I think that when we use distributions to model, we are talking about random variables, so the distribution we pick is to model what the data could have looked like given an underlying data generation process, right?

So, to make things simple, take one out of the 100 observations, obs_i. When we pick the distribution, we are modeling what value that single observation could have taken given an underlying process? We are not modeling what the collective 100-observation sample would look like for a given sample draw?

Or is this the same thing? If it is the same thing, will the distribution we use to characterize the potential values of a given observation always match the distribution for the sample?

EXL
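The quoted "Distribution and modeling" question can be explored with a small simulation. This is only a sketch under assumptions that do not come from the thread: observations are independent draws from one normal distribution with invented mean 2 and standard deviation 1.5, and the process is re-run 5000 times. Each row of the matrix is one possible data set of 100 observations; each column follows a single observation across the repeated runs.

```r
set.seed(2001)

n      <- 100    # observations per data set
n_sims <- 5000   # hypothetical re-runs of the data-generating process
mu     <- 2      # assumed mean
sigma  <- 1.5    # assumed standard deviation

# Row = one realized data set; column = one observation across re-runs
draws <- matrix(rnorm(n * n_sims, mean = mu, sd = sigma),
                nrow = n_sims, ncol = n)

one_dataset <- draws[1, ]   # what we actually get to see: one realization
obs_i       <- draws[, 1]   # values observation 1 could have taken

# Summaries of the single data set and of a single observation's distribution
summary_dataset <- c(mean = mean(one_dataset), sd = sd(one_dataset))
summary_obs_i   <- c(mean = mean(obs_i), sd = sd(obs_i))
```

In this identically distributed setup the histogram of one data set and the distribution of a single observation both resemble the same normal curve. If each observation had its own mean (for example, through covariates), the pooled histogram would be a mixture and need not match the distribution of any one observation.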

14 years, 2 months

Binomial PDF when number of "successes" is unknown

by nino_malekovic＠hks11.harvard.edu

Hi all,

In the second problem of our homework, we have to calculate a maximum likelihood estimate, but y (the number of "successes") can vary in the interval between 5 and 15. Do we just treat the number of successes as a random variable on {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}, in which case we work with a set of binomial pdfs and calculate an MLE interval instead of a single maximum likelihood estimate? In that case our pdf would look like

p(y | π) = [100! / (y! (100 - y)!)] * π^y * (1 - π)^(100 - y), y ∈ {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}.

If anyone has a different idea, please share with the rest of us.

Thanks,
Nino Malekovic
MPA Candidate, Class 2011
Harvard Kennedy School

________________________________________
From: gov2001-l-bounces at lists.fas.harvard.edu On Behalf Of gov2001-l-request at lists.fas.harvard.edu
Sent: Saturday, February 06, 2010 12:00 PM
To: gov2001-l at lists.fas.harvard.edu
Subject: gov2001-l Digest, Vol 55, Issue 19

Message: 1
Date: Fri, 5 Feb 2010 14:40:07 -0500
From: Maya Sen <msen at fas.harvard.edu>
Subject: Re: [gov2001] Bayes Rule

Hi Bob,

So, you've got two questions here. The first one is whether there are conditions that must be in place prior to using Bayes Rule. I'm not exactly sure what conditions you're looking for here, but you do have to have all of the constituent parts of the equation. Bayes Rule is

P(A|B) = P(B|A)*P(A)/P(B)

where
P(B|A) = the conditional probability of B given A
P(A) = the marginal probability of A
P(B) = the marginal probability of B

A lot of times you don't have the marginal probabilities of A and B, in which case you can use the law of total probability (which we talked about in class):

P(B) = P(B|A_1)P(A_1) + ... + P(B|A_n)P(A_n)

As for the simplest way to identify and label the correct events, I think it's useful to see what in the problem is giving you clues about P(A), P(B), etc. You should also look for clues that you are being provided with a conditional probability by identifying words such as "given that", etc.

It also helps to work through examples, so I've taken the following example directly from the Wikipedia page on Bayes Rule:

Suppose there is a school with 60% boys and 40% girls as students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance; all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem.

The event A is that the student observed is a girl, and the event B is that the student observed is wearing trousers. To compute P(A|B), we first need to know:

- P(A), or the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.
- P(B|A), or the probability of the student wearing trousers given that the student is a girl. As they are as likely to wear skirts as trousers, this is 0.5.
- P(B), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since P(B) = P(B|A)P(A) + P(B|A')P(A'), this is 0.5×0.4 + 1×0.6 = 0.8.

Given all this information, the probability of the observer having spotted a girl given that the observed student is wearing trousers can be computed by substituting these values in the formula:

P(A|B) = P(B|A)P(A)/P(B) = .5*.4/.8 = .25

hope that helps -- Maya

On Thu, Feb 4, 2010 at 5:43 PM, Bobby L. Woods <blwoods at fas.harvard.edu> wrote:
> Are there conditions that must be met prior to using Bayes Rule, or can it be used
> for any probability question? Also, what is the simplest way to identify and
> label the correct events?
>
> Thanks,
> Bob
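The trousers example quoted above can be checked in a few lines of R. The probabilities are the ones given in the example; the variable names are only illustrative.

```r
# Inputs stated in the example
p_girl                <- 0.4   # P(A)
p_boy                 <- 1 - p_girl
p_trousers_given_girl <- 0.5   # P(B|A)
p_trousers_given_boy  <- 1.0   # P(B|A')

# Law of total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
p_trousers <- p_trousers_given_girl * p_girl +
  p_trousers_given_boy * p_boy

# Bayes Rule: P(A|B) = P(B|A)P(A) / P(B)
p_girl_given_trousers <- p_trousers_given_girl * p_girl / p_trousers
p_girl_given_trousers
```

Labeling each input with the event notation, as in the comments, is one way to keep track of which quantity is conditional and which is marginal.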

14 years, 2 months
