Jon,
There are many ways to do a bootstrap. The two main ones are non-parametric,
with re-sampling of the data, or parametric, with re-sampling of the
residuals either from the estimated model or from some specified distribution
for the error term. Then there are variants such as the bias-corrected
bootstrap, the wild bootstrap, double bootstraps, etc., and there are also
related techniques such as sub-sampling, where you break the dataset into
blocks and compute the estimate in each, or the jackknife. A good reference
on these is:
Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and their
Application. Cambridge Series in Statistical and Probabilistic Mathematics.
Cambridge: Cambridge University Press.
Which one you want to use depends on the application and the sampling
experiment you have in mind. In most cases the non-parametric bootstrap will
be more conservative, because it makes almost no assumptions and corresponds
to super-population inference (you resample the data itself).
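For concreteness, here is a minimal sketch of the non-parametric ("pairs")
bootstrap in the OLS setting; it assumes Y and X are a numeric response
vector and a single numeric predictor already in the workspace (for a matrix
of predictors you would index rows with X[idx, ] instead):

out <- lm(Y ~ X)
M <- 1000
b.boot <- matrix(NA, M, length(coef(out)))
for (m in 1:M) {
  # resample observation indices with replacement, keeping (Y, X) pairs together
  idx <- sample(length(Y), replace = TRUE)
  b.boot[m, ] <- coef(lm(Y[idx] ~ X[idx]))
}
apply(b.boot, 2, sd)  # bootstrap standard errors of the coefficients

The column standard deviations of b.boot estimate the sampling variability of
the OLS coefficients without assuming anything about the error distribution.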
If you don't like the non-parametric bootstrap in the context of OLS you may
want to try a parametric bootstrap like:
out <- lm(Y ~ X)
M <- 1000
b.boot <- matrix(NA, M, length(coef(out)))
for (m in 1:M) {
  # resample residuals with replacement, keeping the fitted values fixed
  Y.boot <- fitted(out) + sample(resid(out), replace = TRUE)
  b.boot[m, ] <- coef(lm(Y.boot ~ X))
}
The distribution of the coefficient estimates across the M replications is
the bootstrap distribution, and its standard deviations are the bootstrap
standard errors. You can also draw the residuals from a specified
distribution. Essentially, here you keep the Xs fixed and just resample
residuals.
Jens
-----Original Message-----
From: gov2001-l-bounces at lists.fas.harvard.edu [mailto:gov2001-l-bounces at lists.fas.harvard.edu] On Behalf Of Jon Bischof
Sent: Tuesday, April 15, 2008 8:55 AM
To: gov2001-l at lists.fas.harvard.edu
Subject: [gov2001-l] bootstrap
Hi all,
I understand that a bootstrap (like we did in an early problem set) is
one way to estimate the variability of a point estimate. However, I am
not sure how one should interpret bootstrapped standard errors: it
seems like they should almost always be larger than the original (say,
OLS) errors, because each simulation uses a subset of the data and
hence fewer degrees of freedom. How can one use the bootstrap to show
that the standard errors from an estimation technique applied to the
whole dataset underestimate the variability in the point estimates?
Does the difference between the regular and bootstrapped standard
errors just have to be huge?
--
Jon Bischof
Graduate Student
Department of Government
Harvard University
_______________________________________________
gov2001-l mailing list
gov2001-l at lists.fas.harvard.edu
http://lists.fas.harvard.edu/mailman/listinfo/gov2001-l