New subject: FW: Does whether or not the input file has a header row (variable names) affect how Amelia works? [WAT Issue #2]

29 Jun 2009

Please disregard all of the issues/questions I raised in my email below,
EXCEPT for one:

Q:  Does whether or not the input file has a header row (variable names)
affect how Amelia works?

Matt Blackwell's response to my first issue (subj: Amelia for R produces no
imputed data output files [WAT Issue #1]) resolved the other issues in my
earlier email below.

I changed the subject line of this message accordingly...

DISCUSSION:   It seems that Amelia (and AmeliaView) assume that the input
data set has a header row.

However, I cannot find any discussion in the documentation to confirm this.

I have observed the following:

-- When I write the data.frame to a csv file to be read by AmeliaView: if
the csv file has no header row, then under AmeliaView -> Summarize Data ->
"Missing:  x / [total]", the "total" listed is one less than the number of
rows actually in the data set.

-- When I pass the data.frame to Amelia for R directly, it doesn't seem to
have this problem.
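
The off-by-one row count described above can be reproduced in base R alone, without Amelia. This is a minimal sketch, assuming the csv file is read with a header row expected (the default for read.csv); the data frame `df` and the temporary file are hypothetical, and it says nothing about how AmeliaView itself reads files:

```r
# Hypothetical 4-row data frame standing in for the real data set.
df <- data.frame(a = c(1, 2, NA, 4), b = c(10, NA, 30, 40))

# Write it WITHOUT a header row, as in the case described above.
path <- tempfile(fileext = ".csv")
write.table(df, path, sep = ",", row.names = FALSE, col.names = FALSE)

# read.csv() defaults to header = TRUE, so the first data row is
# consumed as variable names and one observation is lost.
with_header_assumed <- read.csv(path)
nrow(with_header_assumed)   # one less than nrow(df)

# Stating header = FALSE keeps every row.
no_header <- read.csv(path, header = FALSE)
nrow(no_header)             # equal to nrow(df)
```

If the reader can be told explicitly whether a header row is present, the count matches; otherwise writing the file with col.names = TRUE avoids the mismatch.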

To prevent any problems of this nature, should Amelia and AmeliaView have an
input parameter telling them whether or not the input data set has a header
row?

Wayne Thornton

thornton(a)fas.harvard.edu 

  _____  

From: owner-amelia_at_lists_gking_harvard_edu(a)mail.hmdc.harvard.edu
[mailto:owner-amelia_at_lists_gking_harvard_edu@mail.hmdc.harvard.edu] On
Behalf Of Wayne Thornton
Sent: Sunday, June 28, 2009 16:24
To: amelia(a)lists.gking.harvard.edu
Subject: [amelia] Amelia output extracted from output[[ ]] looks odd [WAT
Issue #2]

RE:  Amelia output extracted from output[[ ]] looks odd  [WAT Issue #2]

PROBLEM:  After running Amelia to generate 5 imputed files, the output files
extracted using output[[ ]]  look odd....

BACKGROUND:  Here is my command line to run Amelia:

      # *******************
      # CONTROL PANEL
      # *******************
      impruns   <- 5
      tolX      <- 0.0001
      empriX    <- 100
      autopriX  <- 0.05
      resampleX <- 100
      # *******************
      # CONTROL PANEL
      # *******************

      imputed <- amelia(DATA8i,
                        m = impruns, p2s = 2, idvars = c(3, 4, 5),
                        ts = 1, cs = 2, polytime = NULL,
                        startvals = 0, tolerance = tolX,
                        noms = nomIV8i, ords = ordIV8i,
                        incheck = TRUE, collect = FALSE,
                        outname = "DATA8imp",
                        write.out = TRUE, archive = TRUE, keep.data = TRUE,
                        empri = empriX, autopri = autopriX,
                        bounds = IVlims, max.resample = resampleX)

After a run I am able to extract output info from...

      imputed[[ ]] 

The user guide (p.27, under "Output") says:

"...you can refer to any of the datasets by referencing output[[i]], where i
is the number of the dataset you wish to reference.

These datasets will be returned in the same format which you passed
them...."

However, the files imputed[[1]], imputed[[2]], etc. are quite different
from the original input file, and different from each other.

-- The input file is a data frame (1044 x 487) with no header.

-- Output files:

 Element          Dimensions     Contents
 imputed[[1]]     1044 x 2435    numeric; looks like imputed values
 imputed[[2]]     1 x 1          "5"
 imputed[[3]]                    TRUE / FALSE
 imputed[[4]]     483 x 2415     numeric; does NOT look like imputed values
 imputed[[5]]     483 x 5        numeric; does NOT look like imputed values

NOTE on imputed[[1]]:  2435 = 5 * 487.

NOTE on imputed[[4]]:  483 = number of IVs minus 4. The data set includes
3 identity variables, 1 time series var, 1 cross-section var.

These output files raise the following comments/questions:

(1)  Contrary to the info in the user guide, the output files extracted from
output[[i]] do not match the format of the input file.

(2)  Does whether or not the input file has a header row (variable names)
affect how Amelia works?

(This question may be an artifact of my lack of understanding about working
with data frames... But if you read in the output csv file and compute
nrow(file), the result is one less than the number of rows actually in the
csv file.)

(3)  Is the first output file [[1]] the 5 sets of imputed data?

(4) I have no idea what the other files are... Are they for diagnostics?
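
One generic way to see what each element of a returned list holds is to print its class and dimensions. The sketch below uses a hypothetical stand-in list `out` and a hypothetical helper `describe`; it is not Amelia's actual return structure, and only base R functions are used:

```r
# Hypothetical stand-in list; NOT Amelia's actual return structure.
out <- list(matrix(0, nrow = 4, ncol = 6), 5, c(TRUE, FALSE))

# One-line description of an element: its class plus its dimensions,
# or its length when it has no dim attribute.
describe <- function(x) {
  d <- dim(x)
  size <- if (is.null(d)) length(x) else paste(d, collapse = " x ")
  paste(class(x)[1], size)
}
summary_lines <- vapply(out, describe, character(1))
print(summary_lines)

# names() shows element labels if the list has any;
# str() gives a compact overview of the top level.
names(out)
str(out, max.level = 1)
```

Running the same three calls on imputed would show whether its elements carry names that identify them.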

Thanks,

Wayne

SUBMITTED BY:     Wayne A. Thornton

                              Harvard Univ.

                        thornton(a)fas.harvard.edu

                        781-492-3131

