Hi Matt,
I am using Amelia version 1.7.4.
I had been including the logical variables (those with 0/1 responses) in
noms. I have now moved them to idvars so that they are ignored.
Some categorical variables have more than 1000 levels; I removed those
variables.
I am using the following variables (1498 levels) in the imputation:
*Var Levels*
"destinationcountry" = 155,
"cartype" = 39,
"browser.x" = 13,
"interactionchannel" = 4,
"paymentmethod" = 7,
"segmentname" = 3,
"geo_country" = 146,
"geo_region" = 378,
"operating_system" = 62,
"browser.y" = 18,
"language" = 35,
"latitude" = 235,
"longitude" = 243,
"device_model_id" = 146
This time I am getting the following memory limit error:
Error: cannot allocate vector of size 4.0 Gb
In addition: There were 30 warnings (use warnings() to see them)
Warnings:
1: In amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ... :
You've set the polynomials of time to zero with no interaction with
the cross-sectional variable. This has no effect on the imputation.
2: In amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ... :
The number of categories in one of the variables marked nominal has greater
than 10 categories. Check nominal specification.
12: In amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ... :
The variable NA is perfectly collinear with another variable in the data.
13: In ifelse(x[, i] == values[j], 1, 0) :
Reached total allocation of 8077Mb: see help(memory.size)
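A rough way to see why a run like this might exhaust memory: assuming each nominal variable is expanded into indicator columns stored as 8-byte doubles (an assumption about Amelia's internals, suggested only by the ifelse() call in warning 13), the expanded matrix grows with rows times total levels. The sketch below uses a made-up toy data frame; `toy`, `n_levels`, `dense_gib` and `est_gib` are hypothetical names and not part of Amelia.

```r
# Toy stand-in for `dt`; the values are made up for illustration.
toy <- data.frame(
  cartype     = c("suv", "mini", NA, "suv"),
  segmentname = c("a", "b", "b", NA),
  stringsAsFactors = FALSE
)
noms <- c("cartype", "segmentname")

# Distinct non-missing values per nominal variable.
n_levels <- vapply(
  toy[noms],
  function(col) length(unique(col[!is.na(col)])),
  integer(1)
)

# Size in GiB of a dense numeric (8-byte double) matrix.
dense_gib <- function(n_rows, n_cols) n_rows * n_cols * 8 / 1024^3

# Estimate for the indicator columns alone, assuming one column per level.
est_gib <- dense_gib(nrow(toy), sum(n_levels))
```

With the figures reported in this thread (761,592 observations and 1498 levels), dense_gib(761592, 1498) comes to roughly 8.5 GiB for a single copy of such a matrix, which would be in the same range as the 8077Mb allocation limit in the warning. This is only a back-of-the-envelope estimate under the stated assumption.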
*Code:*
dt.out <- amelia(x = dt, m = 3,
                 idvars = c("device_unique_id", "AirportTransaction", "status",
                            "is_remarketing", "post_click_conv",
                            "post_view_conv"),
                 ts = "pickupdate", cs = "destinationcountry",
                 priors = NULL, lags = NULL, empri = 0.01*nrow(dt),
                 polytime = 0, intercs = FALSE, p2s = 2, incheck = TRUE,
                 ords = NULL,
                 noms = c("cartype", "browser.x", "interactionchannel",
                          "paymentmethod", "segmentname", "geo_country",
                          "geo_region", "operating_system", "browser.y",
                          "language", "latitude", "longitude",
                          "device_model_id"))
Regards,
On Sat, Jan 30, 2016 at 11:43 PM, Matt Blackwell <mblackwell(a)gov.harvard.edu> wrote:
> Hi Mithilesh,
>
> My guess is that you might be asking too much of the data here. You are
> including a separate quadratic function of time for each cross-sectional
> unit in the data (polytime = 2, intercs=TRUE) and this might be problematic
> if some of the characteristics of the cross-sectional unit are constant
> within unit. Can you try to run Amelia with intercs = FALSE and see if (a)
> things speed up and (b) if the error message disappears?
>
> Also, what version of Amelia are you using? There was a bug with that
> error message in previous versions, but it should be fixed in 1.7.4.
>
> Cheers,
> Matt
>
> On Sat, Jan 30, 2016 at 12:53 PM, Mithilesh Kumar <mithileshk.in(a)gmail.com> wrote:
>
>> Hi Matt,
>>
>> After running with the ridge prior for 4 hours, I am getting the following error:
>>
>> *Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> contrasts can be applied only to factors with 2 or more levels*
>> *In addition: There were 19 warnings (use warnings() to see them)*
>>
>> I am using the following code:
>>
>> dt.out <- amelia(x = dt, m = 3, idvars = "device_unique_id",
>>     ts = "pickupdate", cs = "destinationcountry",
>>     priors = NULL, lags = NULL, empri = 0.01*nrow(dt), polytime = 2,
>>     intercs = TRUE, p2s = 2, incheck = TRUE, ords = NULL,
>>     noms = c("cartype", "AirportTransaction", "status", "browser.x",
>>              "interactionchannel", "paymentmethod", "segmentname",
>>              "ip_address", "geo_country", "geo_region",
>>              "operating_system", "browser.y", "language",
>>              "creative_freq", "creative_rec", "user_group_id",
>>              "is_remarketing", "post_click_conv", "post_view_conv",
>>              "advertiser_frequency", "advertiser_recency", "latitude",
>>              "longitude", "device_model_id"))
>>
>> Regards,
>>
>> On Sat, Jan 30, 2016 at 9:35 AM, Matt Blackwell <mblackwell(a)gov.harvard.edu> wrote:
>>
>>> Hi Mithilesh,
>>>
>>> It's not so much a limitation on the number of observations, but you are
>>> asking a lot of Amelia here. If there are 28 categorical variables, each
>>> with more than 10 categories (and you have marked them so), then you are
>>> adding roughly 280 variables to the imputation model, which is quite a
>>> few. But that shouldn't be too bad, given the size of your data. It seems
>>> more likely to be the extremely high missingness rate. You might try using
>>> the ridge prior ("empri" argument in the amelia function). See section
>>> 4.7.1 of the vignette for more information about this setting:
>>>
>>> https://cran.r-project.org/web/packages/Amelia/vignettes/amelia.pdf
>>>
>>> Cheers,
>>> Matt
>>>
>>> ~~~~~~~~~~~
>>> Matthew Blackwell
>>> Assistant Professor of Government
>>> Harvard University
>>> url: http://www.mattblackwell.org
>>>
>>> On Fri, Jan 29, 2016 at 10:54 PM, Mithilesh Kumar <mithileshk.in(a)gmail.com> wrote:
>>>
>>>> I have 761,592 observations of 31 variables on users' behaviour towards
>>>> online ads. Of the 31 variables, 28 are categorical, and many of them
>>>> have more than 10 categories. I am using Amelia for missing data
>>>> imputation.
>>>>
>>>> It is taking a very long time. Are there ways to make it faster? What
>>>> are Amelia's limits on the number of observations?
>>>>
>>>> Is there any R package that performs better on large datasets for
>>>> missing data imputation?
>>>>
>>>> I checked for complete cases: there are only 172, which is a very small
>>>> number compared with the total dataset.
>>>>
>>>> --
>>>> Mithilesh Kumar
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Amelia mailing list served by HUIT
>>>> [Un]Subscribe/View Archive: http://lists.gking.harvard.edu/?info=amelia
>>>> More info about Amelia: http://gking.harvard.edu/amelia
>>>> Amelia mailing list
>>>> Amelia(a)lists.gking.harvard.edu
>>>>
>>>> To unsubscribe from this list or get other information:
>>>>
>>>> https://lists.gking.harvard.edu/mailman/listinfo/amelia
>>>>
>>>
>>>
>>
>>
>> --
>> Mithilesh Kumar
>>
>
>
--
Mithilesh Kumar