Imputation to level 2 variable: vary within group?

AyaH
Posts: 10
Joined: Tue Oct 11, 2011 5:30 pm

Imputation to level 2 variable: vary within group?

Post by AyaH »

Dear Forum users,

My data have 2 levels: individuals (level 1) are nested within households (level 2; the identifier is household_id, which ranges from 1 to several tens of thousands).

One of the response variables in my imputation model is a level 2 variable (household income), which should have a constant value within a household.
Household income in the observed data is indeed constant within each household, and if it is missing, it is missing for all household members.

However, I found that the imputed values of household income varied across individuals within a household.
Is there a way to impute a single value that is shared by all individuals in a household?

I prepared the data with Stata's 'realcomimpute' command. The response variables were ordered with level 1 first, followed by level 2 (as instructed).
The predictors included continuous and categorical variables at both level 1 and level 2.

I would highly appreciate your input.

Thank you in advance for your time.
ChrisCharlton
Posts: 1348
Joined: Mon Oct 19, 2009 10:34 am

Re: Imputation to level 2 variable: vary within group?

Post by ChrisCharlton »

Realcom-Impute works out which responses are at each level by checking whether they vary within the provided level identifier. You could check that the file generated by the Stata 'realcomimpute' command is correct by comparing it with the example on page 6 of the Realcom-Impute guide. The level identifier column should follow directly after the columns that contain the response variables. You should also be able to check which responses are being used at each level by looking at the equation within Realcom-Impute as well as the output in the command window. If these don't look correct, I would suggest checking the data again to ensure that the values really do not vary within each group. It is probably worth doing this with the files generated by the Stata command in case an error has been introduced when the files were exported.
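
For anyone wanting to run the within-group check described above outside Realcom-Impute, here is a minimal sketch in Python. It is not the rule Realcom-Impute itself applies, only an illustration of the idea. The function name and the toy data are hypothetical, and it assumes missing values are coded as None or as a numeric sentinel rather than NaN (NaN values do not compare equal to each other, so they would be counted as distinct).

```python
from collections import defaultdict

def groups_with_varying_values(group_ids, values):
    """Return the sorted group ids whose values are not constant within the group.

    The missing marker is treated as a value of its own, so a group
    that is missing for only some of its members is flagged as well.
    """
    seen = defaultdict(set)
    for gid, val in zip(group_ids, values):
        seen[gid].add(val)
    return sorted(gid for gid, vals in seen.items() if len(vals) > 1)

# Toy data: household 2 is missing for every member, household 3 varies.
household_id = [1, 1, 2, 2, 3, 3]
income = [500.0, 500.0, None, None, 300.0, 320.0]
print(groups_with_varying_values(household_id, income))
```

An empty list would mean the variable is constant within every group, which is what a level 2 response should look like.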
AyaH
Posts: 10
Joined: Tue Oct 11, 2011 5:30 pm

Re: Imputation to level 2 variable: vary within group?

Post by AyaH »

Dear Chris,

Thank you for your swift and helpful reply. I was not aware of the updated manual, so thank you for referring me to it.

I have checked the Stata code for realcomimpute and the .txt file generated by it. Both appeared correct. Also, wherever household income was observed, it was constant within households.

I ran 2 imputations, one for each of two cross-sectional survey waves (years xxxx and yyyy), which I imputed separately. In one of the imputations, the imputed level 2 values were correctly constant within households. In the other, they were inconsistent within households.

The only difference between the two data files was the range of the household IDs. In the data file with the failed imputation, the IDs were longer, running from 80,000 to 120,000, whereas in the other they stayed below 100,000. When I recoded the household IDs in the failing dataset to run from 0 to 50,000, the imputed household income became consistent within each household.

Thank you for your time, once again.
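
The recoding workaround described in this post can be sketched as follows. This is an illustration in Python rather than the Stata code actually used, and it maps the IDs to 1, 2, 3, ... in order of first appearance, which is one possible choice and not necessarily the scheme used above. The function name recode_ids and the sample IDs are hypothetical.

```python
def recode_ids(ids):
    """Map each distinct id to 1, 2, 3, ... in order of first appearance.

    Returns the recoded list and the original-to-new mapping, so the
    original ids can be restored after imputation.
    """
    mapping = {}
    recoded = []
    for original in ids:
        if original not in mapping:
            mapping[original] = len(mapping) + 1
        recoded.append(mapping[original])
    return recoded, mapping

household_id = [80000, 80000, 120000, 95000, 120000]
new_id, mapping = recode_ids(household_id)
print(new_id)
print(mapping)
```

Keeping the mapping makes it possible to merge the imputed datasets back onto the original household identifiers afterwards.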
ChrisCharlton
Posts: 1348
Joined: Mon Oct 19, 2009 10:34 am

Re: Imputation to level 2 variable: vary within group?

Post by ChrisCharlton »

Thank you for confirming the cause of the error. I am curious as to where this apparent truncation is taking place. Would you be able to check the household id column in the file exported from Stata to see whether these high-valued IDs match the original data there? The file is loaded into Realcom-Impute using the Matlab importdata function so they should be read with enough precision to hold the whole record.
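
One way to make this comparison systematic is sketched below in Python. The layout of the exported file is an assumption here: the sketch treats each data row as whitespace-delimited numbers and takes the position of the ID column as a parameter, so it would need adjusting to the real file. The function name and the sample rows are hypothetical; the third row deliberately shows what a rounded ID would look like.

```python
def mismatched_ids(exported_lines, original_ids, id_column):
    """Compare the id column of exported text rows with the original ids.

    Returns a list of (row_number, exported_token, original_id) for every
    row where the two differ numerically. Row numbers start at 1.
    """
    if len(exported_lines) != len(original_ids):
        raise ValueError("row counts differ between export and original data")
    mismatches = []
    rows = zip(exported_lines, original_ids)
    for row_number, (line, expected) in enumerate(rows, start=1):
        token = line.split()[id_column]
        if float(token) != float(expected):
            mismatches.append((row_number, token, expected))
    return mismatches

# Hypothetical rows: response, household id, predictor.
exported = ["512.5 80000 1", "498.0 120000 0", "301.0 1.2e+05 1"]
original = [80000, 120000, 120500]
print(mismatched_ids(exported, original, id_column=1))
```

An empty result would indicate that the IDs survived the export unchanged, which would point to a later step in the pipeline.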
AyaH
Posts: 10
Joined: Tue Oct 11, 2011 5:30 pm

Re: Imputation to level 2 variable: vary within group?

Post by AyaH »

Dear Chris,

I am sorry for taking so long to reply.
I have checked the .txt file generated by Stata, and the high-valued IDs appeared intact. Just to be sure (and out of curiosity), I imported the .txt file into Excel, and the high-valued IDs remained correct there too.

Thank you for thinking about this.
ChrisCharlton
Posts: 1348
Joined: Mon Oct 19, 2009 10:34 am

Re: Imputation to level 2 variable: vary within group?

Post by ChrisCharlton »

Thank you very much for confirming this. It does sound like the problem is somewhere within the Matlab code, then. The Matlab documentation I linked previously suggests that the data read should remain in double precision. However, that documentation is for Matlab 2021a, whereas Realcom runs under the 2012b runtime, so the function's behaviour may have changed over the years.
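
As a quick sanity check on the precision point, the sketch below round-trips IDs of the size discussed in this thread through IEEE 754 double and single precision. It is written in Python, not Matlab, and says nothing about what importdata does under the 2012b runtime; it only shows what the number formats themselves can hold. The helper name roundtrip is hypothetical.

```python
import struct

def roundtrip(value, fmt):
    """Pack a float into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

ids = [80000, 99999, 100000, 120000]
for household_id in ids:
    as_double = roundtrip(float(household_id), "<d")  # 64-bit double
    as_single = roundtrip(float(household_id), "<f")  # 32-bit single
    print(household_id, as_double == household_id, as_single == household_id)
```

Both formats store integers of this size exactly, so if the IDs are being altered, storage precision alone would not seem to explain it. Where the change actually happens remains open.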