Dear James,
I am now implementing Mamadou's suggestions. I previously tried on a 64 GB
PC, where I was able to monitor how much RAM was consumed (in the Windows
settings); I saw the same error when R was using 50 GB of RAM. I will
implement your suggestions and contact the HPC Lab to check whether R is
properly configured on Linux.
Many thanks for your suggestions,
Best,
Ömer
_________________________________
Ömer Faruk Örsün
PhD Candidate
Department of International Relations
Koç University
CAS 289
_________________________________
On Thu, Feb 7, 2013 at 8:50 PM, Honaker, James <jhonaker(a)iq.harvard.edu> wrote:
Dear Ömer,
I'd second some of Matt's points. The 95 variables you point to are not
too extreme, but when you turn on "intercs" you create very many more
variables. Exactly how many more depends on the number of unique values,
v, in your cross-sectional variable "cs" and the degree/order, k, of your
spline or polynomial of time: you will be adding v*k variables. If you
have 100 countries (for example) and a 5th-order spline, you have added
500 variables (to the 95 you started with). In very large settings, I
would build up slowly from simple models that work computationally in
your environment, then increase the complexity of the model and see how
large you can get before it fails. If it fails right off the bat in the
simplest settings, that might point to a problem elsewhere (a common slip
I've fallen into is too many unique values in the "cs" variable, such as
a country code that includes the year).
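To make the arithmetic concrete, here is a minimal R sketch; the values of v and k are the hypothetical ones from the example above (100 countries, a 5th-order spline), not figures from the actual data set:

```r
# Extra variables created by "intercs = TRUE" with a spline/polynomial of time:
v <- 100  # unique values in the "cs" variable (hypothetical example)
k <- 5    # degree/order of the spline or polynomial of time (hypothetical)
added <- v * k
total <- 95 + added  # added on top of the original 95 variables
c(added = added, total = total)  # added = 500, total = 595
```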
As for large-memory machines, forgive me if my comments are too obvious,
but one pragmatic tip is to see if there is any way you can get load
monitoring of your R process, and check how much memory your job holds
before it fails, perhaps with something as simple as the Linux "top"
command. You can do a little of this within R using gc(). In my
experience in some high-performance settings, you have to ask the admins
to adjust your privileges before you can actually get the full amount of
theoretically available memory, once the cluster has been burned by some
user with a forgotten job that never terminates and a memory leak. Also,
if they don't run R commonly, it might not be configured to take
advantage of the server's capabilities.
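A minimal sketch of the in-R monitoring mentioned above, using only base R (the example data frame is hypothetical, just to show object.size()):

```r
# gc() triggers a garbage collection and reports memory used by this
# R session; the "used" columns show current and peak consumption.
mem <- gc()
print(mem)

# object.size() gives the approximate footprint of a single object,
# e.g. a (hypothetical) data frame:
x <- data.frame(a = rnorm(1e5), b = rnorm(1e5))
print(object.size(x), units = "Mb")
```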
James.
________________________________________
From: amelia-bounces(a)lists.gking.harvard.edu [amelia-bounces(a)lists.gking.harvard.edu] On Behalf Of OMER FARUK Orsun [oorsun(a)ku.edu.tr]
Sent: Thursday, February 07, 2013 1:04 PM
To: Ndiaye, Mamadou
Cc: amelia(a)lists.gking.harvard.edu
Subject: Re: [amelia] Error of " resulting vector exceeds vector length
limit in 'AnswerType'"
Dear Ndiaye,
Thanks a lot for your suggestion.
Best,
Ömer
_________________________________
Ömer Faruk Örsün
PhD Candidate
Department of International Relations
Koç University
CAS 289
_________________________________
On Thu, Feb 7, 2013 at 7:41 PM, Ndiaye, Mamadou <MNdiaye@publichealthmdc.com> wrote:
To work around the memory limitation impeding R, I found the package SOAR
very useful:
http://cran.r-project.org/web/packages/SOAR/vignettes/SOAR.pdf
Thank you
M. Ndiaye
From: amelia-bounces(a)lists.gking.harvard.edu [mailto:amelia-bounces(a)lists.gking.harvard.edu] On Behalf Of Matt Blackwell
Sent: Thursday, February 07, 2013 11:24 AM
To: OMER FARUK Orsun
Cc: amelia@lists.gking.harvard.edu
Subject: Re: [amelia] Error of " resulting vector exceeds vector length
limit in 'AnswerType'"
Hi Ömer,
First, note that you may not have enough observations to get good
imputations with that many variables. Amelia might have poor properties in
that case. You can save a lot of hassle here by not interacting the
polynomials of time with the cross-section (setting "intercs = FALSE").
I imagine you have 500 GB of hard disk space, not RAM, but either way,
this is probably related to the maximum vector size that R can handle,
which is currently 2^31 - 1 elements. Obviously that is *very* large, but
if you want to go beyond it you would have to use R 3.0.0 (still under
development), which will allow longer vectors on certain machines. For
more information, see this help file in R:
?"Memory-limits"
If you are on Windows, you might be able to increase the amount of memory
dedicated to the R process.
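You can verify that limit from within any R session: 2^31 - 1 is exactly the largest 32-bit signed integer, which base R exposes as .Machine$integer.max.

```r
# The maximum length of an R vector (before R 3.0.0) is 2^31 - 1,
# i.e. the largest value representable by a 32-bit signed integer:
.Machine$integer.max                # 2147483647
.Machine$integer.max == 2^31 - 1    # TRUE

# R's own documentation on these constraints:
# ?"Memory-limits"
```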
Hope that helps!
Cheers,
matt.
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Political Science
University of Rochester
url:
http://www.mattblackwell.org
On Thu, Feb 7, 2013 at 12:10 PM, OMER FARUK Orsun <oorsun(a)ku.edu.tr> wrote:
Hi Matt,
Many thanks for your response. The missingness in my data is severe; as a
result, I might need to include all available data. Is there another way
to avoid memory-related errors, given that I have a 500 GB RAM computer?
Best Regards,
Ömer
_________________________________
Ömer Faruk Örsün
PhD Candidate
Department of International Relations
Koç University
CAS 289
_________________________________
On Thu, Feb 7, 2013 at 4:54 PM, Matt Blackwell <m.blackwell(a)rochester.edu> wrote:
Hi Ömer,
It seems as though you are running into memory issues with R itself. Note
that using "intercs = TRUE" and "polytime = 2" will add 3*K variables to
the data, where K is the number of dyads in the data. Given your
description of the data, that could be an extremely large data set. You
might want to run Amelia on a smaller subset of the data to see how the
imputations go and then tentatively test out smaller imputation models.
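A minimal sketch of this subset-first approach; it uses the freetrade example data shipped with Amelia rather than the (unavailable) dyadic data from the thread, and the cutoff year is an arbitrary choice for illustration:

```r
library(Amelia)

# Example data bundled with Amelia (NOT the dyadic data from this thread).
data(freetrade)

# Run the imputation on a small subset first to confirm the model runs,
# then scale up toward the full data.
sub <- freetrade[freetrade$year <= 1985, ]
a.sub <- amelia(sub, m = 2, ts = "year", cs = "country", p2s = 0)
summary(a.sub)
```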
Hope that helps!
Cheers,
matt.
~~~~~~~~~~~
Matthew Blackwell
Assistant Professor of Political Science
University of Rochester
url:
http://www.mattblackwell.org
On Thu, Feb 7, 2013 at 7:24 AM, OMER FARUK Orsun <oorsun(a)ku.edu.tr> wrote:
Dear Listers,
I am using Amelia II (Version 1.6.4) on a 500 GB computer; my data
consist of directed dyads, and my imputation model has 94 variables and
493,853 observations. I use the following command:
library(Amelia)
library(foreign)
mydata <- read.dta("data.dta")
set.seed(1234)
a.out <- amelia(mydata, m = 10, p2s = 2, tolerance = 0.005,
                empri = .1 * nrow(mydata), ts = "year", cs = "dyadid",
                polytime = 2, intercs = TRUE)
After 7 hours, I receive the following message:
amelia starting
beginning prep functions
Error in cbind(deparse.level, ...) :
resulting vector exceeds vector length limit in 'AnswerType'
I've already searched the Amelia II archives and the R archives, but I
was not able to locate a solution.
I would deeply appreciate any help!
Best Regards,
Ömer
_________________________________
Ömer Faruk Örsün
PhD Candidate
Department of International Relations
Koç University
CAS 289
_________________________________
--
Amelia mailing list served by HUIT
[Un]Subscribe/View Archive:
http://lists.gking.harvard.edu/?info=amelia
More info about Amelia:
http://gking.harvard.edu/amelia
Amelia mailing list
Amelia@lists.gking.harvard.edu
To unsubscribe from this list or get other information:
https://lists.gking.harvard.edu/mailman/listinfo/amelia