imputation of ordered categorical level-2 variables

matryoshka


Hi, I seem to be having a problem when trying to impute ordinal level-2 variables. Within each imputation, either all missing observations are assigned the lowest value (1) or, less commonly, all are assigned the highest value. This does not seem to depend on the explanatory variables or on the correlation with other responses. The issue also does not arise when the same variable is imputed as if it were an unordered categorical level-2 variable.
Thanks in advance!

Edit: I thought it would probably help to provide an example for others to replicate. Perhaps this is just a silly mistake I made. I am using the example dataset for the realcomImpute command in Stata ("prac2full.dta"), which can be downloaded here (http://missingdata.lshtm.ac.uk/examplea ... eStata.zip). I generate a random level-2 variable, divide it into quartiles, and set about 10% of its values to missing completely at random. I used the default imputation settings for the example below but, from what I recall, changing them does not seem to make a difference.

Code:

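* school-level random variable: school mean of a pupil-level N(0,1) draw
gen r=invnormal(uniform())
bys school: egen randomlevel2variable=mean(r)
* cut it into quartiles (cut points are computed over the pupil-level records)
gen quartile=.
quietly sum randomlevel2variable, d
replace quartile=1 if randomlevel2variable<r(p25)
replace quartile=2 if randomlevel2variable<r(p50) & quartile==.
replace quartile=3 if randomlevel2variable<r(p75) & quartile==.
replace quartile=4 if quartile==.
* set quartile to missing where a second random school-level variable is below its 10th percentile
gen s=uniform()
bys school: egen mmissing=mean(s)
quietly sum mmissing, d
replace quartile=. if mmissing<r(p10)
drop r s randomlevel2variable mmissing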

sort school

realcomImpute nlitpre o.quartile nlitpost fsmn gend using prac2fullMIInput.dat, replace numresponses(2) level2id(school) cons(cons)

*** after imputation ****

realcomImputeLoad

mi convert flong, clear

tab quartile _mi_m


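           |                        _mi_m
  quartile |     0      1      2      3      4      5      6  ...
-----------+-----------------------------------------------------
         1 | 1,100  1,566  1,566  1,566  1,566  1,566  1,566  ...
         2 | 1,147  1,147  1,147  1,147  1,147  1,147  1,147  ...
         3 | 1,114  1,114  1,114  1,114  1,114  1,114  1,114  ...
         4 | 1,046  1,046  1,046  1,046  1,046  1,046  1,046  ...
-----------+-----------------------------------------------------
     Total | 4,407  4,873  4,873  4,873  4,873  4,873  4,873  ...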

(Rows are the quartiles; columns are the imputations, with 0 being the original data.)
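
The pattern may be easier to see if the tabulation is restricted to the records that were missing in the original data. The following is only a minimal sketch, assuming the data are in flong style after `mi convert flong`, so that `_mi_id` identifies the same record across imputations and `_mi_m == 0` holds the original data. The variable name `wasmissing` is a hypothetical helper and is not part of the commands above.

```stata
* flag records whose quartile was missing in the original data (m = 0);
* within each _mi_id the records are sorted by _mi_m, so [1] is the m = 0 copy
bysort _mi_id (_mi_m): gen byte wasmissing = missing(quartile[1])

* distribution of the imputed values only, one column per imputation
tab quartile _mi_m if wasmissing & _mi_m > 0
```

If the behaviour described above is present, each imputation column of this table should contain a single non-empty category.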
Harvey Goldstein


You seem to have uncovered a bug. I can't be sure but will have a look at this and see if I can see what is happening.
Harvey Goldstein
matryoshka


Harvey Goldstein wrote: "You seem to have uncovered a bug. I can't be sure but will have a look at this and see if I can see what is happening."

Is there any news on this issue? Thanks for looking into it.