I'm afraid we have had problems with this before, but I think(!) that it
is fixed in the new version of Amelia for Windows. It is fixed in the
underlying Amelia for Gauss code. Have you updated Amelia for Windows
since June or so?

In case I'm forgetting something, Ken or James will fill us in...

Sorry for the troubles.
Gary
On Sun, 29 Sep 2002, Torben Iversen wrote:
> I have used Amelia without problems in the past, so in general I don't find
> it hard to use. And I feel bad wasting your time on this. But, OK, I'd love
> to be able to move on. So here's the info:
>
> Under time series options, I choose the relevant time-series variable
> (AMts=2), the cross-sectional variable (AMcs=1), and I ask to include the
> dependent variable with a lag (AMlagvs=3). AMtstep is set to 1 (annual data).
> All other globals are retained at their defaults. I did try to switch to EM
> under AMmthd, and at some point I also tried to change AMempri (though I
> don't think that's possible with time-series). But the same message about
> names exceeding variables always popped up.
>
> My data is sorted by country and years (14 countries * 31 years = 434 obs).
> I have 27 variables with missingness heavily concentrated on the dependent
> variable (there are only 61 observations on redistribution).
>
> Torben
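
The data layout described above (14 countries by 31 years, a lagged dependent variable, 61 observed values) can be sketched as follows. This is only an illustration of the arithmetic and of lagging within each cross-sectional unit; it is not Amelia's internal code, and the function name `add_lag` and the field names `country`, `year`, and `redist` are hypothetical.

```python
def add_lag(rows, unit_key, time_key, var, step=1):
    """Return copies of rows with `var` lagged by `step` periods within each unit.

    Rows with no earlier period in the same unit get None for the lag,
    so values never carry over from one country to the next.
    """
    lookup = {(r[unit_key], r[time_key]): r[var] for r in rows}
    out = []
    for r in rows:
        new = dict(r)
        new[var + "_lag"] = lookup.get((r[unit_key], r[time_key] - step))
        out.append(new)
    return out

# Figures taken from the message above.
n_countries, n_years, n_observed_dv = 14, 31, 61
n_obs = n_countries * n_years
share_missing = 1 - n_observed_dv / n_obs
print(n_obs, round(share_missing, 3))

# Hypothetical toy panel, sorted by country and then year.
rows = [
    {"country": 1, "year": 1990, "redist": 0.3},
    {"country": 1, "year": 1991, "redist": None},
    {"country": 2, "year": 1990, "redist": 0.5},
]
lagged = add_lag(rows, "country", "year", "redist")
```

With these figures roughly 86% of the dependent variable is missing, and the first year of every country is additionally missing on the lagged variable.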
-
amelia mailing list served by Harvard-MIT Data Center
List Address: amelia(a)latte.harvard.edu
Subscribe/Unsubscribe: http://lists.hmdc.harvard.edu/?info=amelia
Sorry for the inelegance (we're working on a new GUI that should make
things like this easier). What options are you choosing?
Gary
On Sun, 29 Sep 2002, Torben Iversen wrote:
> Gary,
>
> I tried to use Amelia on our data. The program seems to read the data fine
> (right number of variables and observations), and I'm following the
> directions for cross-sectional time series. However, when I hit "run" I get
> the error message "_AMvarnm has too many variable names to match each
> variable in the data set." What does that mean? I did not change the Amelia
> default variable names (var1, var2, ...).
>
> I'm sure I'm just making a simple mistake, but I can't figure out what it
> is.
>
> Thanks for any suggestions you may have,
>
> Torben
>
I agree with the main point below. Refusals to answer that are actual
answers (I don't have an opinion about the national helium reserve) should
not be imputed. But things like income can be imputed since everyone has
an income (positive or negative) even if they won't tell us. Some of this
is discussed in our APSR article.

Optimally, we would know which was which before we do imputation, and
we'd only impute where the value actually exists, though it is missing.
In this case, it's pretty straightforward. We just use Amelia to deal with
all the real cases and somehow model the others separately. (One way to
model the others would be to use a pair of conditional models. E.g.,
first a logit to model whether or not someone ventures an opinion on the
national helium reserve, and then, conditional on having an opinion, you
could use something like an ordinal probit to analyze the opinions.)
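
The bookkeeping behind that split can be sketched as below, assuming the survey codes let us tell a genuine "no opinion" apart from a withheld answer. The codes `NO_OPINION` and `None` and the function name are hypothetical, and the sketch only prepares the two pieces (the outcome for a first-stage logit and the values left for imputation); it does not fit either model.

```python
NO_OPINION = "no_opinion"  # hypothetical code: a real answer, so not imputed
WITHHELD = None            # hypothetical code: a value exists but is missing

def split_for_imputation(responses):
    """Separate genuine non-opinions from values that can be imputed.

    Returns (to_impute, has_opinion):
      to_impute   - responses with genuine non-opinions dropped; None entries
                    are the values an imputation model would fill in
      has_opinion - 0/1 indicator per respondent, the outcome one could use
                    in a first-stage logit
    """
    to_impute = [r for r in responses if r != NO_OPINION]
    has_opinion = [0 if r == NO_OPINION else 1 for r in responses]
    return to_impute, has_opinion

responses = [3, WITHHELD, NO_OPINION, 5, NO_OPINION, WITHHELD, 1]
to_impute, has_opinion = split_for_imputation(responses)
```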

There is another more difficult situation where the category "refuse to
answer" is a mix of DK's and variables that don't exist. This would be
harder to model and I don't know of a paper that models data like this.
To do it seems hard but straightforward (some kind of mixture model), but
whether it would be worth the extra trouble programming it is another
question.

Incidentally, if you're doing cross-cultural research, you might have a
look at our Anchoring Vignettes idea at http://gking.harvard.edu/vign/

Gary
On Wed, 25 Sep 2002, Randy Stevenson wrote:
>
> Gary,
>
>
> Thanks for your note about panel attrition. Sorry to bother you again,
> but preparing all of this survey data has forced me to think about some
> common problems in light of the imputation technology and I thought this
> one would interest you.
>
> The question is what to do with don't know responses in surveys when the
> DK is potentially meaningful (i.e., it doesn't simply hide a real
> response). Below is an idea on how imputation may help with this
> problem. I haven't found any discussion of this possibility in any
> literature and was wondering if you had thought of this or thought it
> was an idea worth pursuing. At this point it is basically my intuition
> and I would obviously have to pursue a more rigorous exploration if it's
> a worthwhile idea.
>
> Consider, for example, DK responses to a question asking people to place
> themselves on a left/right scale. There are a variety of reasons why
> people might answer DK in this situation. I would group these into two
> main categories. The first are situations in which we expect that there
> is really an underlying answer but that it is not revealed. This
> category includes more than just situations in which the person wants to
> hide the answer. It also includes cases (1) in which the person just
> does not want to invest the cognitive energy to come up with a response,
> (2) in which they don't understand the question, or don't know the
> meaning of the words. Finally, people who answer don't know instead of
> putting themselves in the middle of the scale also belong here. The key
> idea here is that if we could probe the person's attitude further
> (explaining the meaning of the question or assuring them they could
> reveal it to us), we could obtain a meaningful answer.
>
> A second set of reasons for DK responses arises when there is no
> underlying answer that is being denied us. This can happen when the
> respondent is really uncertain where they fit on the scale with which
> they are presented. Maybe they haven't thought about the issue at all.
> Maybe in a case like left/right self placement, they understand what
> the scale means but can't reconcile their conflicting policy views in a
> way that gives them a placement and are unwilling to say they are
> middle of the road, because they don't think they are (for example,
> they could be policy extremists on some left and right policies and
> this doesn't jibe with what they think a centrist on a left/right scale
> is).
>
> In the first set of cases it seems to me that it is perfectly
> appropriate to impute the values of the DK category, but in the second
> it is not. In the second case, we would want to include DK as a valid
> response and model how this response contributes to the dependent
> variable in the explanatory model. I would suggest that we can adopt an
> assumption that since these are non-attitudes, we would expect that they
> can not have an impact on any kind of behavior that we are modelling and
> that they should be poorly predicted from other variables in the
> imputation model.
>
> If we adopt these assumptions, then it seems to me there is a reasonable
> way to proceed which is as follows:
>
> (1) Impute all the DK's.
> (2) In the analysis model, interact the imputed variable with a dummy
> marking the DK's.
> (3) If category 2 dominates the DK's, then the interaction should
> indicate no relationship between the variable and the DV for the imputed
> cases.
> (4) If category 1 dominates, then the interaction will be insignificant.
>
> The real trick for this to be true is that the imputation model for
> people with non attitudes cannot systematically impute values that then
> predict the dependent variable in the same way that people with real
> attitudes (whether missing or not) do.
>
> Anyway, if you have any thoughts or have done any work on this, I would
> love to know. Again, Ray and I are currently adopting a methodology to
> reanalyze a large number of election studies in a lot of countries to
> provide a more definitive picture of cross-national differences in the
> sources of voting behavior (and especially economic voting) and so we
> are trying to adopt the best practice on all these issues. Surprisingly,
> I thought there would be more consensus on this issue, but the DK
> literature seems to be more concerned with pointing out the trouble they
> can cause than in proposing remedies.
>
> Finally, since we will be estimating models with a fair number of
> variables and since we are using multinomial models for multiparty
> elections, including a full set of dummies for every categorical or quasi
> continuous variable (with a don't know category included) doesn't seem
> like a feasible approach.
>
>
> Thanks,
>
> Randy
> ___________________________________________
> Randy T. Stevenson
> Albert Thomas Associate Professor of Political Science
> Dept. of Political Science /MS 24
> Rice University
> 6100 Main St.
> Houston, Texas 77005
>
> phone: 713 348-2104
> fax: 713 348-5273
> email: stevenso(a)ruf.rice.edu
>
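
One reading of the four-step proposal quoted above is sketched here: the analysis model includes the imputed variable, a DK dummy, and their product, so the slope for DK cases is the main coefficient plus the interaction coefficient. The function names and the coefficient values are hypothetical and chosen only to show the two limiting cases; nothing here is estimated from data.

```python
def design_row(x_imputed, was_dk):
    """Regressors for one respondent: intercept, x, DK dummy, x * DK."""
    dk = 1.0 if was_dk else 0.0
    return [1.0, x_imputed, dk, x_imputed * dk]

def slope_for(b_x, b_interaction, was_dk):
    """Implied effect of x on the DV for a DK or non-DK respondent."""
    return b_x + (b_interaction if was_dk else 0.0)

row = design_row(4.0, True)

# If non-attitudes dominate the DK's, the interaction offsets the main
# effect and the implied slope for DK cases is about zero.
non_attitude_slope = slope_for(0.8, -0.8, True)

# If hidden real answers dominate, the interaction is about zero and DK
# cases behave like everyone else.
hidden_answer_slope = slope_for(0.8, 0.0, True)
```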
Yeah, panel attrition is a big issue. Amelia in general is a pretty good
second-best approach for most applications, where the first-best approach
differs from application to application and is generally a lot harder to
use. The question is how well you can do in this case.

What I'd do is to give it a try but to carefully check how well it's
fitting. In principle, there's nothing wrong with doing it, but I'd
compare the imputed to the nonmissing values and see whether they are
roughly similar. E.g., look at a scatterplot of, say, PID time 1 by PID
time 2 with the imputed values in red. The imputed values can be different
than the others because of panel attrition _bias_ that is being corrected
by Amelia through other variables (not in the scatterplot but measured in
the first wave). But if the red dots are far from the others, then I'd go
another step and figure out why. If you can't find a good reason (like
attrition being due to high income), then I'd worry about what Amelia is
doing.

Overall, of course, the more you impute, the more your answers are
dependent on the model.
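
A numeric companion to the scatterplot check described above might look like the sketch below. It assumes you have one completed variable plus a flag for which cases were originally missing; the function name, the toy values, and the choice of a standardized mean gap as the summary are illustrative assumptions, not part of Amelia.

```python
from statistics import mean, stdev

def compare_imputed(values, was_missing):
    """Compare imputed values with observed ones for a single variable.

    Returns the two means and the gap between them in units of the
    observed standard deviation. A large gap is a prompt to look for a
    substantive reason, not proof that the imputation is wrong.
    """
    observed = [v for v, m in zip(values, was_missing) if not m]
    imputed = [v for v, m in zip(values, was_missing) if m]
    gap = (mean(imputed) - mean(observed)) / stdev(observed)
    return {
        "observed_mean": mean(observed),
        "imputed_mean": mean(imputed),
        "std_gap": gap,
    }

# Hypothetical completed variable: the last two values were imputed.
values = [1, 2, 3, 4, 5, 3, 9]
was_missing = [False, False, False, False, False, True, True]
result = compare_imputed(values, was_missing)
```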
Gary
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
On Tue, 24 Sep 2002, Randy Stevenson wrote:
>
> Gary,
>
> I was writing to get your opinion on a common data analysis problem.
> Ray Duch and I are reanalyzing a large number of election studies for
> our book on comparative economic voting. We are using Amelia to deal
> with missing data, but are debating what to do for cases in which a
> respondent in the pre-election survey did not respond in the
> post-election survey. Since we are using quite a few variables from
> most of the post-election surveys, we are asking a lot from Amelia to
> fill in so many missing values (even though for many there is reasonable
> information in the first stage to help impute them). Most of the
> codebooks include an analysis of non-response and whether there seems to
> be anything systematic about who these people are, and most conclude that
> there is little systematic in the non-responses. I assume that you have
> thought about this particular kind of missing data and wondered if you
> had any thoughts.
>
>
> Randy
> ___________________________________________
> Randy T. Stevenson
> Albert Thomas Associate Professor of Political Science
> Dept. of Political Science /MS 24
> Rice University
> 6100 Main St.
> Houston, Texas 77005
>
> phone: 713 348-2104
> fax: 713 348-5273
> email: stevenso(a)ruf.rice.edu
>
We've done some runs like that. Glad it's worked for you.
It's just that after 40 vars, users run out of patience!
(One person told me they ran it with 650,000 observations.)
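
A rough way to see why patience runs out as variables are added, assuming a multivariate normal imputation model: such a model has p means plus p(p+1)/2 distinct variances and covariances, so the parameter count grows roughly with the square of the number of variables. The sketch below only counts parameters; it says nothing about Amelia's actual run time, and the function name is hypothetical.

```python
def mvn_parameter_count(p):
    """Free parameters in a p-variable multivariate normal model:
    p means plus p * (p + 1) / 2 variances and covariances."""
    return p + p * (p + 1) // 2

count_40 = mvn_parameter_count(40)
count_47 = mvn_parameter_count(47)
print(count_40, count_47)
```

Going from 40 to 47 variables raises the count from 860 to 1175 parameters, an increase of about 37%.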
Gary
On Thu, 19 Sep 2002, bob wrote:
> Gary,
>
> I don't know whether you keep track of these things, but I managed to impute
> a data set containing 47 variables and 21,000 cases. It did take almost 12
> hours though (running on my laptop overnight). Is this out of the ordinary?
> The guidelines seemed to suggest that 40 variables was pretty much the
> outer limit of the program.
>
> Bob
>
> Robert Mattes
> Democracy in Africa Research Unit
> University of Cape Town
> Afrobarometer
>
pls see below...
On Thu, 19 Sep 2002 adamt(a)who.int wrote:
>
> > Dear Dr King, I am currently writing a paper on the work that I have been
> > doing using Amelia, and for which your team and yourself had provided me
> > with great help in using the software.
> >
> > This work involves the development of a model to predict hospital unit
> > cost (the dependent variable) using a set of explanatory variables- for
> > which missing values have been imputed by Amelia. I have been trying to
> > find out from the economic literature on multiple imputation what kind of
> > tests were used to discuss the goodness of fit of the models that were
> > developed after imputation-i.e., other than the adjusted R squared or the
> > F statistics, for example, which cannot be computed from the average
> > equation. I was not able to find any reference to model validation from my
> > search and I am not sure how I can defend my model or provide a way to
> > validate it especially that I do not have a gold standard to compare my
> > results with. I would very much appreciate if you could let me know of any
> > references or have any suggestions about ways to convince an economic
> > readership that the average equation fits well.
Any quantity (including R^2 if that makes sense to you) can be computed.
You follow the same rules for combining these quantities across the
multiply imputed data sets as for any other quantity. See
http://gking.harvard.edu/amelia/node3.html
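
As a minimal sketch of the combining rules referred to above (the standard multiple-imputation rules, often called Rubin's rules): the point estimate is the average across the m completed data sets, and its variance is the average within-data-set variance plus (1 + 1/m) times the between-data-set variance of the estimates. The function name and the toy numbers are hypothetical. For a quantity reported without a standard error, such as R^2 or a root mean square error, only the averaging step applies.

```python
from math import sqrt
from statistics import mean, variance

def combine(estimates, std_errors):
    """Combine one quantity estimated separately on m imputed data sets.

    Returns (point_estimate, standard_error).
    """
    m = len(estimates)
    q_bar = mean(estimates)                       # average of the m estimates
    within = mean([se ** 2 for se in std_errors]) # average sampling variance
    between = variance(estimates)                 # sample variance, divisor m - 1
    total_var = within + (1 + 1 / m) * between
    return q_bar, sqrt(total_var)

# Hypothetical coefficient estimates from m = 5 imputed data sets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
std_errors = [0.2, 0.2, 0.2, 0.2, 0.2]
q_bar, se = combine(estimates, std_errors)
```

The combined standard error exceeds each of the individual ones because the between-data-set term carries the extra uncertainty due to the missing values.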
I'd add that I don't find R^2 of much use (see
http://gking.harvard.edu/files/abs/mist-abs.shtml and
http://gking.harvard.edu/files/abs/truth-abs.shtml for example) and would
instead look at some plots, such as the residuals by yhat or by some of
the X's. For the latter, you can look at the scatterplot of all M
datasets together (the nonmissing points will be plotted on top of one
another of course; the others will spread out).
> >
> > Another related question: I used the ado file prepared by Kenneth Scheve
> > to estimate the combined beta coefficients and standard errors from the
> > five data sets generated by Amelia. The STATA output does not provide the
> > root mean square error which I need to estimate the fundamental
> > uncertainty around the predicted values. Does this mean that using the
> > multiple imputation technique can only allow for parameter uncertainty?
> > and if so, is there a way to justify this in the paper?
No, multiple imputation includes both fundamental and estimation
uncertainty. If you want this quantity, you can compute it for each data
set and combine them as with other quantities.
> >
> > I would very much appreciate your input on these two questions.
best of luck with your research.
Gary
: Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
: Center for Basic Research Direct (617) 495-2027 :
: in the Social Sciences Assistant (617) 495-9271 :
: 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
: Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
> >
> > Best regards, Taghreed
> >
> >
> > >-----Original Message-----
> > >From: Gary King [mailto:king@harvard.edu]
> > >Sent: Tuesday, 19 March 2002 19:21
> > >To: adamt
> > >Cc: evansd; James Honaker; Kenneth Scheve
> > >Subject: Re: problem with Amelia
> > >
> > >
> > >
> > >why don't you first see whether Amelia (through DataLoad) loaded the data
> > >in properly. Load it in and look at the descriptive statistics to make
> > >sure. If that doesn't work, try the new version of Amelia which has a new
> > >DataLoad incorporated. If that doesn't do it, you can save the data in
> > >ascii, and load it into Amelia that way, which we know always works.
> > >It sounds like that is the issue, but let me know if not. I'm CCing my
> > >coauthors in case they have other ideas.
> > >
> > >Gary
> > >
> > > : Gary King, King(a)Harvard.Edu http://GKing.Harvard.Edu :
> > > : Center for Basic Research Direct (617) 495-2027 :
> > > : in the Social Sciences Assistant (617) 495-9271 :
> > > : 34 Kirkland Street, Rm. 2 HU-MIT DC (617) 495-4734 :
> > > : Harvard U, Cambridge, MA 02138 eFax (928) 832-7022 :
> > >
> > >On Tue, 19 Mar 2002 adamt(a)who.ch wrote:
> > >
> > > >
> > > > > Dear Dr King,
> > > > >
> > > > > My name is Taghreed Adam and I am working in WHO in Chris Murray's
> > > > > cluster. Chris asked me to use Amelia to replace the missing values
> > > > > in the dataset that I am working with. This is prior to running a
> > > > > regression to predict unit costs per bed day in hospitals. The
> > > > > dataset I am using for Amelia includes 21 variables: the log of the
> > > > > unit cost per bed day ($), a set of explanatory variables such as
> > > > > log of GDP per capita, occupancy rate (%), average length of stay
> > > > > (days), etc. There are also variables that describe the nature of
> > > > > the unit cost data, e.g. whether capital costs, drugs, and other
> > > > > incidental costs are included (all dummies). Then there are
> > > > > descriptors like the country code, the region code, whether it is a
> > > > > public or private hospital (dummy), etc. The total number of
> > > > > hospitals for which we have observations is 1097. The maximum
> > > > > percentage of missingness of observations per variable is 53%,
> > > > > i.e., the least number of observations I have for any variable is
> > > > > around 600. They are all numeric variables.
> > > > >
> > > > > What I first did is to make sure that all variables included in the
> > > > > model are normally distributed. The data is saved in Excel version
> > > > > 2, with no headings.
> > > > > In Amelia, I set the _AMempri option to 1 to control for the high
> > > > > degree of missingness of some of the variables, I identified fully
> > > > > observed variables using _AMfully, and the one nominal variable for
> > > > > which there is missing data using _AMnoms. (The other nominal
> > > > > variables do not have missing data.)
> > > > >
> > > > > What happens when I run Amelia is either that it crashes just after
> > > > > I specify the input file name or it gives me the following message:
> > > > > elements of m can not be zero. I tried to check whether I have any
> > > > > variable that is coded as a string variable that might explain this
> > > > > message but it is not the case. I have no observations or variables
> > > > > that are all zeros or missing. I do not think it is a memory problem
> > > > > as my computer has 550 MHz memory and I do not work with other
> > > > > software while it is running.
> > > > >
> > > > > I tried to delete some of the variables, e.g., some of the dummies
> > > > > or those that might be highly correlated with other variables and I
> > > > > tried to run it again with a total of 8 variables. I used Stata this
> > > > > time as the type of input file. It started running about 7 hours ago
> > > > > and is still running but clearly something is wrong as the number of
> > > > > iterations is now 115000.
> > > > >
> > > > > I would be grateful if you could give me some advice on what could
> > > > > be the source of the problem and what else I can try to do. I would
> > > > > be happy to call you if it will make it easier to discuss. We have
> > > > > discussed this question extensively with Josh Salomon who has also
> > > > > run out of ideas about what we can try next.
> > > > >
> > > > > I am looking forward to hearing from you.
> > > > > Yours sincerely,
> > > > > Taghreed
> > > > >
> > > > >
> > > > > Dr Taghreed Adam
> > > > > Global Programme on Evidence for Health Policy (GPE) and;
> > > > > Child and Adolescent Health Department ( CAH)
> > > > >
> > > > > World Health Organization
> > > > >
> > > > > 20 Avenue Appia
> > > > >
> > > > > CH-1211 Geneva 27
> > > > >
> > > > > Switzerland
> > > > >
> > > > > Tel: +41 22 791 3487
> > > > > Fax: +41 22 791 4328
> > > > > office: 3164
> > > > > e-mail: adamt(a)who.int
> > > > >
> > > > >
> > > >
>