After running the following R command:
a.out <- amelia(bd2, m = 5, idvars = c("bd_id", "cty"), noms = c("race4", "outc"))
I received this error message:
"Error in if (sum(non.vary == 0)) { :
argument is not interpretable as logical"
What does that mean?
I am using the current versions of R and Amelia on Windows.
The data contains more than 100,000 records, with 6 variables. They are all factors except bwt3, a continuous variable.
See below the first rows of the data frame bd2:
> head(bd2)
bd_id       race4  outc  cty  bwt3
1989-35129  1      1     0    2665
1989-43790  2      1     0    3685
1989-44528  1      1     0    3402
1989-48485  1      1     0    3175
1989-49402  1      1     0    3742
1989-55241  4      1     0    3941
Thank you for any help.
Ndiaye
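A note on the message itself, as a minimal base-R sketch rather than a diagnosis of this particular data set: `if ()` raises "argument is not interpretable as logical" when its condition is a non-logical `NA`, which is what `sum(x == 0)` returns when `x` contains an `NA`. The vector `non.vary` below is a made-up stand-in (what it holds inside Amelia is an assumption here), and `bd_toy` is a hypothetical data frame, not the poster's `bd2`. The second half shows one simple check that may help locate a problematic column: counting the distinct observed values per column.

```r
# Part 1: how `if ()` can fail on a numeric NA (base R only).
# `non.vary` is a stand-in; its real contents inside Amelia are an assumption.
non.vary <- c(1.2, NA, 0.7)
total <- sum(non.vary == 0)   # NA, because one of the comparisons is NA

msg <- tryCatch(
  {
    if (total) "some column does not vary" else "all columns vary"
  },
  error = function(e) conditionMessage(e)
)
print(msg)   # the error text raised by `if ()`

# Part 2: count distinct non-missing values per column of a toy data frame.
# Columns with fewer than two distinct values cannot vary.
bd_toy <- data.frame(
  race4 = factor(c(1, 2, 1, 4)),
  outc  = factor(c(1, 1, 1, 1)),
  bwt3  = c(2665, 3685, NA, 3175)
)
n_distinct <- sapply(bd_toy, function(col) length(unique(col[!is.na(col)])))
print(n_distinct)
print(names(n_distinct)[n_distinct < 2])   # candidate non-varying columns
```

Running the same per-column count on the real data frame (excluding the ID variables) would show whether any variable is constant or entirely missing; whether that is what triggers the error in this case is not confirmed here.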
... and variables measured just on one occasion.
Hello, Amelia authors and users,
I'm doing a longitudinal study on the motivational development after transition to secondary school. Some constructs, e.g. learning goal orientation, were measured at each of our three waves. Other characteristics were just measured on one occasion, some because they ought to be time-invariant (e.g. socioeconomic status), others because new questions and hypotheses developed after the first results (e.g. about the influence of parental support).
To use the time-series option, I rearranged my data in SPSS so that I got one cell per variable and time point, and added a time index variable. The values of the time-invariant variables were kept constant over the measurement occasions, i.e. each case got one and the same value for each time point. I did the same for the other variables that were measured on only one occasion.
Here is my problem:
The imputations of missing values in the time-invariant variables (and likewise in the one-time measurements) yield a different estimate at each time point. How should I deal with this? Should I just take the first value, or calculate the mean of the values? Or do I have to handle the problem differently right from the beginning? Is it appropriate to treat time-invariant variables the same way as time-varying variables that were measured on only one occasion?
I would be grateful for some help,
greetings from Germany,
Felix Brümmer
-
Amelia mailing list served by Harvard-MIT Data Center
[Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
More info about Amelia: http://gking.harvard.edu/amelia
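The two options raised in the question above (take the first value, or take the mean) can be sketched in base R as follows. This is only an illustration of the mechanics, not a recommendation from the Amelia authors; the data frame `imp`, its column names, and its values are hypothetical stand-ins for one completed data set in long format.

```r
# Hypothetical long-format data after ONE imputation: 2 persons x 3 waves.
# `ses` is meant to be time-invariant, but person 1 received a different
# imputed value at each wave.
imp <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  wave = c(1, 2, 3, 1, 2, 3),
  ses  = c(0.40, 0.55, 0.25, 1.10, 1.10, 1.10)
)

# Option 1: replace each person's values with their within-person mean.
imp$ses_mean <- ave(imp$ses, imp$id, FUN = mean)

# Option 2: carry the first wave's value to all waves.
# This assumes rows are sorted by wave within each person.
imp$ses_first <- ave(imp$ses, imp$id, FUN = function(x) x[1])

print(imp)
```

Either step would have to be repeated separately within each of the m completed data sets, so that the between-imputation variation is kept.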
Hello, Amelia authors and users,
My co-author and I wonder whether we can separate different types of
missing data when using Amelia, namely the Don't Knows (DKs) and the
Not Applicables (NAs). Is it possible to impute values for the DKs
but not for the NAs?
Thank you very much. We have been really enjoying Amelia!
Best,
Shanruo Ning Zhang
Assistant Professor
Political Science, California Polytechnic State University, San Luis
Obispo
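One possible workaround for the question above, sketched in base R under stated assumptions: record which cells are "not applicable" before recoding, let every missing cell be imputed, and then set the not-applicable cells back to missing in each completed data set. The codes 8 and 9, the data frame `svy`, and the hand-made list `completed` are all hypothetical; in Amelia the completed data sets are stored in the `imputations` element of the output, for which `completed` is a stand-in. Whether imputing and then blanking is statistically appropriate for a given analysis is not settled by this sketch.

```r
# Hypothetical survey item: 1-5 are answers, 8 = "don't know",
# 9 = "not applicable". These codes are assumptions for illustration.
svy <- data.frame(
  q1  = c(3, 8, 9, 5, 8),
  age = c(30, 41, 25, NA, 52)
)

# Remember the not-applicable cells BEFORE recoding anything to NA.
not_applicable <- svy$q1 == 9

# Recode both kinds of missingness to R's NA so they can be imputed.
svy$q1[svy$q1 %in% c(8, 9)] <- NA

# Stand-in for a list of completed data sets (filled in by hand here).
completed <- list(
  transform(svy, q1 = ifelse(is.na(q1), 4, q1), age = ifelse(is.na(age), 37, age)),
  transform(svy, q1 = ifelse(is.na(q1), 2, q1), age = ifelse(is.na(age), 33, age))
)

# Put the not-applicable cells back to missing in every completed data set.
restored <- lapply(completed, function(d) {
  d$q1[not_applicable] <- NA
  d
})
print(restored[[1]])
```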
Hi Matt
Yes, that was it.
Thanks
Peter
-----Original Message-----
>From: Matt Blackwell <blackwel(a)fas.harvard.edu>
>Sent: Jan 18, 2010 7:02 PM
>To: Peter Flom <peterflomconsulting(a)mindspring.com>
>Cc: "amelia(a)lists.gking.harvard.edu" <amelia(a)lists.gking.harvard.edu>
>Subject: Re: [amelia] tscsPlot error
>
>Hi Peter,
>
>Quick question. Do any of these cross sections contain completely
>missing data on a variable? Or do they only have one observation?
>These situations can lead to problems when you use the "intercs"
>option.
>
>Cheers,
>matt.
>
>On Fri, Jan 15, 2010 at 8:55 AM, Peter Flom
><peterflomconsulting(a)mindspring.com> wrote:
>> Good morning,
>>
>> I am still working with the data set described in my previous e-mails. Definitely making progress by adding more info. But I now get an error that I do not understand
>>
>>
>> If I run:
>> <<<
>>
>> susanMI.out <- amelia(susan2, m = 5, noms = "married", ts = 'time', cs = 'id',
>> intercs = T,
>> sqrts = "unprot_vag_sex")
>>>>>
>> and then (e.g.)
>> <<
>> tscsPlot(susanMI.out, cs = 11, var = "unprot_vag_sex")
>>>>>
>>
>> I get an error
>>
>> Error in identical(m, vector(mode = "logical", length = length(m))) :
>> subscript out of bounds
>>
>> This occurs with some values of cs but not others. I checked that those id numbers do, in fact, exist in the data.
>>
>> Any help appreciated.
>>
>> thanks
>>
>> Peter
>>
>> Peter L. Flom, PhD
>> Statistical Consultant
>> Website: http://www DOT statisticalanalysisconsulting DOT com/
>> Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
>> Twitter: @peterflom
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
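The check Matt asks about (cross sections with a completely missing variable, or with only one observation) can be sketched in base R like this. The data frame `panel` and its values are hypothetical; only the column name `unprot_vag_sex` and the idea of an `id` column come from the thread.

```r
# Hypothetical panel: 3 cross-sections (ids 10, 11, 12), 4 time points each.
panel <- data.frame(
  id   = rep(c(10, 11, 12), each = 4),
  time = rep(1:4, times = 3),
  unprot_vag_sex = c(2, 0, NA, 5,
                     NA, NA, NA, NA,
                     NA, 7, NA, NA)
)

# Number of observed (non-missing) values of the variable per cross-section.
n_obs <- tapply(!is.na(panel$unprot_vag_sex), panel$id, sum)
print(n_obs)

# Cross-sections with no observed values, and with exactly one.
print(names(n_obs)[n_obs == 0])
print(names(n_obs)[n_obs == 1])
```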
Good morning,
I am still working with the data set described in my previous e-mails. I am definitely making progress by adding more info, but I now get an error that I do not understand.
If I run:
<<<
susanMI.out <- amelia(susan2, m = 5, noms = "married", ts = "time", cs = "id",
                      intercs = TRUE,
                      sqrts = "unprot_vag_sex")
>>>
and then (e.g.)
<<<
tscsPlot(susanMI.out, cs = 11, var = "unprot_vag_sex")
>>>
I get an error:
Error in identical(m, vector(mode = "logical", length = length(m))) :
subscript out of bounds
This occurs with some values of cs but not others. I checked that those id numbers do, in fact, exist in the data.
Any help appreciated.
thanks
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
Hi again
I am making some progress.... but...
My data set has data on 164 women, each measured at 4 time points.
The DV is number of unprotected sexual acts in last 3 months. Since this is a data set of commercial sex workers, this variable is highly skewed.
The IVs are age, marital status, highest grade of school, and income.
Marital status is (naturally) nominal. The others are numeric.
When I run
susanMI.out <- amelia(susan2, m = 5, ts = "time", noms = "married", cs = "id",
                      intercs = TRUE,
                      sqrts = "unprot_vag_sex",
                      polytime = 0)
I get warnings about noninvertible matrices and highly collinear variables, but none of the variables are that highly collinear.
If I run without the polytime option, I get no errors, and the overall distribution of the variables is pretty good, but the distribution within people is not good at all. That is, running
tscsPlot(susanMI.out, cs =2, var = "unprot_vag_sex")
shows that where the DV is missing, the imputed values aren't even close to the (admittedly high) value where it is present. In this case, only one value was present.
But for woman number 8, three values were present, all were 0, but the imputed value for the missing time was about 35, and the range was 0 to over 100.
Adding lags = "unprot_vag_sex" and leads = "unprot_vag_sex" made the range much smaller, but the values were still very far off.
I had thought that polytime = 0 would set constant values ... but it led to the warnings above.
Thanks in advance for any help, and sorry to be so long-winded, but I thought all these details would matter.
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
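One possible source of the collinearity warnings, offered as an assumption rather than a confirmed diagnosis: if `intercs = TRUE` with `polytime = 0` adds one indicator column per woman, then any covariate that is constant within each woman (highest grade of school, perhaps marital status) is an exact linear combination of those indicators, even though it is not highly correlated with any single variable. The base-R sketch below shows the effect on a hypothetical three-person data set `toy`.

```r
# Hypothetical data: 3 women, 4 time points each.
toy <- data.frame(
  id     = factor(rep(1:3, each = 4)),
  grade  = rep(c(8, 10, 12), each = 4),            # constant within each woman
  income = c(5, 6, 5, 7, 2, 3, 3, 2, 9, 8, 9, 9)   # varies within each woman
)

# Design matrix with one indicator per woman plus the two covariates.
X <- model.matrix(~ 0 + id + grade + income, data = toy)

n_cols <- ncol(X)
x_rank <- qr(X)$rank
print(c(columns = n_cols, rank = x_rank))   # rank below column count: exact collinearity

# Pairwise correlations between covariates alone do not reveal this.
print(cor(toy$grade, toy$income))
```

A rank smaller than the number of columns means the matrix of cross-products cannot be inverted, which is consistent with (though not proof of) the warnings described above.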
Hello,
I just downloaded Amelia (and Zelig a little while ago). They look great!
I'm a statistician/data analyst, mostly working for psychologists, social scientists, doctors, etc.
My question:
I have a data set with 4 time points. The DV is a count, very overdispersed and right-skewed: it is the number of unprotected sexual acts. The IVs are marital status, age, income, and highest grade of school. There are various patterns of missingness.
My questions:
1) Age clearly increases in a known pattern. It doesn't really need to be imputed. Can I make this happen in Amelia, or should I do it separately, before running Amelia?
2) I tried running
susanMI.out <- amelia(susan, m = 5, ts = "time", noms = "married", cs = "id", intercs = TRUE, polytime = 1)
and got an error
The number of observations is too low to estimate the number of
parameters. You can either remove some variables, reduce
the order of the time polynomial, or increase the empirical prior.
which is clear, but I am not sure what to do here.
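A back-of-the-envelope count may show why this message appears. The sketch assumes a multivariate normal model with p columns has p means plus p(p+1)/2 covariance terms, and that `intercs = TRUE` adds (polytime + 1) columns per cross-section while `intercs = FALSE` adds `polytime` columns; both are assumptions about the model, not a reading of Amelia's source. The figure of 164 women comes from a related message in this thread, the 5 substantive variables are the DV and four IVs listed above (a nominal variable may expand into more columns), and `rough_param_count` is a hypothetical helper.

```r
# Rough parameter count for a multivariate normal imputation model.
# The rule for how many columns the time polynomial adds is an assumption.
rough_param_count <- function(n_vars, n_units, polytime, intercs) {
  extra_cols <- if (intercs) n_units * (polytime + 1) else polytime
  p <- n_vars + extra_cols
  p + p * (p + 1) / 2          # p means + p(p+1)/2 covariance terms
}

n_units <- 164   # women, taken from a related message in the thread
n_waves <- 4
n_rows  <- n_units * n_waves

with_intercs    <- rough_param_count(5, n_units, polytime = 1, intercs = TRUE)
without_intercs <- rough_param_count(5, n_units, polytime = 1, intercs = FALSE)

print(c(rows = n_rows,
        params_with_intercs = with_intercs,
        params_without_intercs = without_intercs))
```

Under these assumptions, dropping `intercs`, lowering `polytime`, or removing variables each shrink the count, which matches the remedies listed in the error text.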
Thanks for any advice, and thanks to Gary King for writing the software!
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
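For question 1 above, one way to handle age separately before running Amelia is to fill it in deterministically from any observed age and the elapsed time. The sketch below is base R on a hypothetical data frame `toy`; the spacing between waves (`years_between_waves`) is an assumption that would need to be replaced with the study's real interval, and `fill_age` is a hypothetical helper.

```r
# Hypothetical panel: 2 women, 4 time points, age partly missing.
toy <- data.frame(
  id   = rep(1:2, each = 4),
  time = rep(1:4, times = 2),
  age  = c(30, NA, 30.5, NA,
           NA, NA, 41, NA)
)

years_between_waves <- 0.25   # ASSUMPTION: waves are three months apart

# Fill missing ages from the first observed age and the elapsed time.
fill_age <- function(age, time, step) {
  if (all(is.na(age))) return(age)           # nothing to anchor on
  ref <- which(!is.na(age))[1]               # first wave with an observed age
  projected <- age[ref] + (time - time[ref]) * step
  ifelse(is.na(age), projected, age)         # keep observed values as they are
}

toy$age_filled <- toy$age
for (i in unique(toy$id)) {
  rows <- toy$id == i
  toy$age_filled[rows] <- fill_age(toy$age[rows], toy$time[rows],
                                   years_between_waves)
}
print(toy)
```

Women with no observed age at any wave are left missing by this helper, so their age would still need to be imputed or handled some other way.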