
Problem when imputing second level variable

Posted: Tue Jan 29, 2013 9:22 pm
by amitlazarus
Hello all,
I'm trying to impute missing values in both level 1 and level 2 variables (individual and school respectively in this case), where some are continuous and others are categorical (ordered and non-ordered). The model also includes a few explanatory variables, from both levels of analysis, that have no missing values.

The command I used (in Stata) was:
realcomImpute X1 o.X2 m.W1 W2 X3 X4 W3 W4 W5 using MI3.dat, replace numresponses(4) level2id(school) cons(cons)
where the X variables are level 1 variables and the W variables are level 2 variables.

The problem is that the imputed values I get for the continuous level 2 variable are very different from the original data. For example, the SD of the variable in the imputed data sets is about 6 times larger than in the original data set. Moreover, the range is much wider in the imputed data sets (the Max and Min values are much farther apart). As a result, the coefficients of the relevant variables in the analytical model drop dramatically and their SEs are very large.
Is there any way to use PMM in realcom or set Max or Min boundaries in the imputation of this variable? If not, is there any other way of dealing with this problem?
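
For readers who want to reproduce this kind of comparison, here is a minimal sketch in Python (standard library only) of one way to summarise a level 2 variable before and after imputation. The names `school` and `W2` are taken from the command above. The helper `level2_summary` and all data values are made up for illustration and are not output from REALCOM. Because a level 2 variable is constant within a school, the sketch keeps one value per school, so that large schools do not count more than small ones in the SD.

```python
from statistics import stdev

def level2_summary(rows, id_key="school", var_key="W2"):
    """Summarise a level 2 variable using one value per level 2 unit."""
    per_unit = {}
    for row in rows:
        value = row[var_key]
        if value is not None:  # skip units whose value is missing
            # keep the first value seen for each school
            per_unit.setdefault(row[id_key], value)
    values = list(per_unit.values())
    return {
        "n_units": len(values),
        "sd": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Made-up pupil-level records: W2 is missing for school 3 in the original data.
original = [
    {"school": 1, "W2": 10.0}, {"school": 1, "W2": 10.0},
    {"school": 2, "W2": 14.0},
    {"school": 3, "W2": None}, {"school": 3, "W2": None},
    {"school": 4, "W2": 12.0},
]

# One made-up imputed data set in which school 3 received the value 30.0.
imputed = [dict(r, W2=30.0) if r["W2"] is None else r for r in original]

obs = level2_summary(original)
imp = level2_summary(imputed)
sd_ratio = imp["sd"] / obs["sd"]  # values well above 1 indicate inflated spread

print("observed:", obs)
print("imputed: ", imp)
print("SD ratio:", round(sd_ratio, 2))
```

Running the same summary on each imputed data set in turn shows whether the inflated SD and range appear in all of them or only in a few.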

Any help will be much appreciated.
Kind regards,
Amit

Re: Problem when imputing second level variable

Posted: Tue Feb 05, 2013 10:33 pm
by Harvey Goldstein
There is no way to set limits on the imputed values in the REALCOM version on the web site. I was not aware that there was a bug of this sort, certainly not in the latest version. It is possible that there is a problem with the Stata interface. So, first check that you have the latest version. If there is still a problem, I suggest that you prepare the necessary input files for REALCOM outside of Stata (you can do this directly, see the manual, or through MLwiN) and try again. If that doesn't work, please send me those files and I will have a look.
Harvey Goldstein

Re: Problem when imputing second level variable

Posted: Thu Feb 07, 2013 8:09 pm
by amitlazarus
Thanks so much for your time and help.
I downloaded realcom-impute very recently and I'm working with Stata 12, so I don't think that's the problem. Following your advice, I am now re-running the imputation after preparing the input file for realcom via MLwiN 2.26. It will probably take at least 24 hours (if not 40) to produce the imputed data sets, but I will let you know whether it worked better this time around.

Thanks again, it is very much appreciated.
Kind regards,
Amit.

Re: Problem when imputing second level variable

Posted: Sat Feb 09, 2013 7:30 pm
by amitlazarus
Dear Prof. Goldstein,

To update my previous response: I tried preparing the files for the imputation through MLwiN before loading them into realcom, but the results came out more or less the same. The SD in the imputed files is still much higher (now about 2-3 times higher) and the range in most of them is much wider. Are there any other possible causes for this?
I will be happy to send you the files if you are still willing to take a look. If so, how should I send them, and which of the files do you need?

Thanks so much for your time and willingness to help.
Kind regards,
Amit.