Hi,
I'm not sure whether I'm missing something "obvious" (retrospectively, of
course) with the sample() function:
What I've got for d) so far is:
beta.ests <- matrix(ncol=4, nrow=100)
# 1st col == 100 beta.zero estimates, 2nd col == 100 beta.1 estimates, and so on...
colnames(beta.ests) <- c("beta.0", "beta.1", "beta.2", "beta.3")
N <- nrow(data) # number of observations
for (i in 1:100){
  # sample() can't be used directly (I think?) because it returns an element from a vector
  # rather than a row from a dataframe or matrix. However, the elements returned can then
  # be used to select the rows to use in a sampled subset
  # select 50 row numbers between 1 and N (1000 here), allowing the same row to be selected more than once
  sample.selector <- sample(1:N, 50, replace=TRUE)
  sampled.subset <- data[sample.selector,] # selects rows
  # constructs linear model based on sample (data= takes the object itself, not its name in quotes)
  sample.lm <- lm(y ~ x1 + x2 + x3, data=sampled.subset)
  # extracts the betas from the lm object and places them in row i of beta.ests
  beta.ests[i,] <- as.numeric(sample.lm$coefficients)
}
rm(i)
The problem, I've discovered, is with the following:
sampled.subset <- data[sample.selector,]
because sample.selector could contain the same number twice and the rows to
select are only meant to be specified once.
I know this *could* be sorted (I think) by stripping sample.selector to only
include unique values, then appending nonunique rows to the sample dataframe
the required number of times afterwards, but is there a more appropriate
(e.g. one/two command) way of doing this instead?
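For what it's worth, repeated row numbers do not seem to need any special handling: indexing a data frame with a vector that contains the same number several times simply returns that row several times. A minimal sketch, using a made-up data frame (`toy` and `idx` are hypothetical names, not part of the problem set):

```r
# Hypothetical toy data frame standing in for the course data set
toy <- data.frame(id = 1:5, x = c(10, 20, 30, 40, 50))

# Row 2 appears three times in the index vector
idx <- c(2, 2, 5, 2)
picked <- toy[idx, ]

print(picked)      # four rows; row 2 is repeated three times
nrow(picked)       # 4
picked$x           # 20 20 50 20

# R adds suffixes to the row names of repeated rows so that they stay unique
rownames(picked)
```

So `data[sample.selector,]` should already return 50 rows, duplicates included, which is what sampling with replacement requires.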
Thanks in advance for your responses...
Jon Minton
hi,
some of you have reported problems with accessing the Homework
Dropbox. We're working with the technical support people to get
this fixed as soon as possible. If it doesn't work by the time
you want to upload your R code and writeup, please just send them
as email attachments to Justin and me.
And don't forget, the deadline for submitting your homework is 6
pm EST tonight. Please bring a hardcopy of your writeup to section.
cheers,
Holger
--
Holger Lutz Kern
Graduate Student
Department of Government
Cornell University
Institute for Quantitative Social Science
Harvard University
1737 Cambridge Street N350
Cambridge, MA 02138
www.people.cornell.edu/pages/hlk23
Hi Everyone,
I am looking for someone (or two someones) to team up with for the
final paper. I'm a third-year PhD student in the government
department. My interests are in African political economy, but
encompass development economics and political economy more generally.
For the paper I would prefer to find an article on a topic within my
fields of interest. I would be willing to head in other directions
(both regionally and substantively) if I had a chance to get some
experience with individual or household level data. Please email me
if you're interested.
Best,
Andy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Andy Harris
Ph.D. Candidate
Department of Government
Harvard University
Hi all,
several Gov 2001 announcements:
i) We are having an intro to LaTeX session this FRIDAY, 3:30-5,
in CGIS N354. It's designed for those of you who haven't used
LaTeX before and want to learn how to typeset professional-looking
documents. We will explain LaTeX basics and then show you
how to use LaTeX on the server and your own machine. For those of
you who use Windows machines, we'll be around afterwards to help
you set up MiKTeX and WinEdt on your laptop.
ii) Don't forget to submit ps 1 electronically by THURSDAY 6pm
EST. Also, please bring a hardcopy of your writeup to section.
iii) Find one or (at most) two coauthors for the final paper if
you haven't done so already. You can post your research interests
to the mailing list to contact like-minded classmates.
cheers,
Holger
Hi all,
here's a small clarification regarding the final paper. You will
have to write the paper with either one or two coauthors, but not
more than that. However, we strongly prefer groups of two, and if
you write the paper with two others, your group will be held to a
slightly higher standard than groups consisting of two members.
Holger
You can install multiple versions of R, though R code is typically not backwards compatible.
Gary
-----Original Message-----
From: Holger Lutz Kern <hlk23 at cornell.edu>
Date: Saturday, Feb 10, 2007 12:16 pm
Subject: Re: [gov2001-l] IMPORTANT: CHANGE of ps 1
To: gov2001-l at lists.fas.harvard.edu
Reply-To: gov2001-l at lists.fas.harvard.edu
>Matt,
>
>I *think* you can have several R versions installed at the same
>time. I don't use WinEdt together with R, so I can't make any
>predictions about what will happen. Maybe someone else knows?
>
>Holger
>
>
>
>
>Matt Chingos wrote:
> Should I uninstall the old version first? Also, is this going to screw
> up how WinEdt and R work together on my computer?
>
> Justin Ryan Grimmer wrote:
>> Matt,
>>
>> Please go to
>>
>> http://www.r-project.org/
>>
>> and download R.2.4.1, this should eliminate the problem,
>>
>> Justin
>>
>> On Sat, 10 Feb 2007, Matt Chingos wrote:
>>
>>> Windows, with version 2.3.1
>>>
>>> Holger Lutz Kern wrote:
>>>> Matt,
>>>>
>>>> what R version and operating system do you use? It loads
>>>> seamlessly on my Windows machine with R 2.4.1.
>>>>
>>>> Holger
>>>>
>>>> Matt Chingos wrote:
>>>>> Same error message:
>>>>>
>>>>> > load("C:/Documents and Settings/Matt/My Documents/Courses/Gov
>>>>> 2001/data_ps1.RData")
>>>>> > ls()
>>>>> [1] "data"
>>>>> > data
>>>>> Error in data.frame(y = c("-0.1328318193", " 1.7342160480"), x1 = c(" 3.9920993", :
>>>>> row names contain missing values
>>>>> In addition: Warning message:
>>>>> corrupt data frame: columns will be truncated or padded with NAs in:
>>>>> format.data.frame(x, digits = digits, na.encode = FALSE)
>>>>> >
>>>>>
>>>>>
>>>>> Holger Lutz Kern wrote:
>>>>>> Matt,
>>>>>>
>>>>>> please try saving it to your working directory and opening it
>>>>>> from there using load().
>>>>>>
>>>>>> Holger
>>>>>>
>>>>>>
>>>>>>
>>>>>> Matt Chingos wrote:
>>>>>>> Is anyone else getting the following error message when they try to load
>>>>>>> the data for the homework? If so, do you know what it means?
>>>>>>>
>>>>>>> >
>>>>>>> load(url("http://isites.harvard.edu/fs/docs/icb.topic137586.files/Problem_set_1/data_…"))
>>>>>>> >
>>>>>>> > data
>>>>>>> Error in data.frame(y = c("-0.1328318193", " 1.7342160480"), x1 = c("
>>>>>>> 3.9920993", :
>>>>>>> row names contain missing values
>>>>>>> In addition: Warning message:
>>>>>>> corrupt data frame: columns will be truncated or padded with NAs in:
>>>>>>> format.data.frame(x, digits = digits, na.encode = FALSE)
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>> Holger Lutz Kern wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> we have slightly changed the last part of the problem set to make
>>>>>>>> it a little bit easier for those of you who haven't used R
>>>>>>>> before. The question now reads:
>>>>>>>>
>>>>>>>> "Randomly draw 100 datasets with 50 observations each from the
>>>>>>>> dataset in part (b)
>>>>>>>> with replacement (e.g. allowing the same row to be drawn multiple
>>>>>>>> times for a single
>>>>>>>> dataset) using sample(). Estimate the same regression as in (b)
>>>>>>>> for each of the 100
>>>>>>>> datasets using lm() and either a for() loop or the apply()
>>>>>>>> function. Report the
>>>>>>>> means and standard deviations of the beta estimates over all 100
>>>>>>>> datasets in the form
>>>>>>>> of a nicely formatted table."
>>>>>>>>
>>>>>>>> The problem set on the website has also been updated (but you
>>>>>>>> might have to hit the reload button in your browser to get to see
>>>>>>>> the new version).
>>>>>>>>
>>>>>>>>
>>>>>>>> Holger
>>>>>>>>
>>>>>>>>
After wasting an hour on this, I found a solution; use it to save yourself some time.
If you're having trouble importing .pdf plots from R into LaTeX, use .eps instead:
dev.copy2eps(file="myplot.eps")
and then in LaTeX/editor:
\scalebox{0.7}{\includegraphics{myplot}}
assuming you have copied the file to the appropriate directory (where your .tex file is).
The example on one of the cheat sheets (Running R, LATEX, VNC FAQ) doesn't work because the .pdf file doesn't have a bounding box...
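If there is no plot window to copy from (for example when running a script on the server), the EPS file can also be written directly with the postscript() device. This is only a sketch of that alternative, not the cheat sheet's method; the file name and the plot are made up, and the file goes to a temporary directory so nothing gets overwritten:

```r
# Hypothetical file name in a temporary directory
eps.file <- file.path(tempdir(), "myplot.eps")

# onefile = FALSE, horizontal = FALSE and paper = "special" are the settings
# that make postscript() write an EPS file rather than a plain PostScript page
postscript(eps.file, width = 6, height = 4,
           onefile = FALSE, horizontal = FALSE, paper = "special")
plot(1:10, (1:10)^2, type = "b", xlab = "x", ylab = "x squared")
invisible(dev.off())  # close the device so the file is actually written

# Check that the file carries the bounding box comment that LaTeX needs
has.bbox <- any(grepl("^%%BoundingBox", readLines(eps.file)))
print(has.bbox)
```

The resulting file can then be copied next to the .tex file and included with \includegraphics as above.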
Hi all,
we have slightly changed the last part of the problem set to make
it a little bit easier for those of you who haven't used R
before. The question now reads:
"Randomly draw 100 datasets with 50 observations each from the
dataset in part (b) with replacement (e.g. allowing the same row
to be drawn multiple times for a single dataset) using sample().
Estimate the same regression as in (b) for each of the 100
datasets using lm() and either a for() loop or the apply()
function. Report the means and standard deviations of the beta
estimates over all 100 datasets in the form of a nicely formatted
table."
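To make the steps concrete, here is a rough sketch of the resampling procedure on simulated data. Everything in it is an assumption for illustration: the data frame `sim`, its coefficients, the seed and the helper `one.draw` are invented, and it uses sapply() (a relative of apply()) rather than a for() loop. It is not the course solution.

```r
set.seed(2001)  # arbitrary seed, only so the sketch is reproducible

# Simulated stand-in for the course data (the real data set is not shown here)
n <- 1000
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 - 0.5 * sim$x2 + 0.25 * sim$x3 + rnorm(n)

# One resample: draw 50 row numbers with replacement, refit, return the betas
one.draw <- function(i) {
  rows <- sample(nrow(sim), 50, replace = TRUE)
  coef(lm(y ~ x1 + x2 + x3, data = sim[rows, ]))
}

# sapply() returns a 4 x 100 matrix; transpose so each row is one resample
beta.ests <- t(sapply(1:100, one.draw))

# Mean and standard deviation of each coefficient over the 100 resamples
results <- cbind(mean = colMeans(beta.ests), sd = apply(beta.ests, 2, sd))
print(round(results, 3))
```

The `results` matrix has one row per coefficient and could be passed to a table-formatting tool for the writeup.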
The problem set on the website has also been updated (but you
might have to hit the reload button in your browser to get to see
the new version).
Holger
hi all,
I've posted the R code from the session today to the website
under Computing Documentation/Using R and Zelig. This folder also
contains 2 cheat sheets with R commands that might come in handy.
Holger
hi all,
quick reminder: we will have an introduction to R session TODAY
at 4 pm in CGIS-Knafel South 354. All of you who haven't used R
before should try to attend.
cheers,
Holger