
non-linear relationship & restricting imputation of variable

Posted: Tue Apr 23, 2013 3:29 pm
by sdblspin
Hello,

I am a Stata user and have been moving my data in and out of REALCOM using the realcomImpute set of commands.
I have two levels of data: repeated measures on the same individuals over time.

Consider, for example, that I want to examine two trajectories of cognitive ability (a continuous measure) from (a) 11-50y and (b) 23-50y. I want to impute missing data on covariates. My data gives rise to several questions:

- I would like to adjust my trajectories for “baseline” BMI, i.e. BMI at (a) 11y and (b) 23y. BMI is not linearly related to age, so in the equations set up in REALCOM, I would like age and age-squared to predict missing BMI. Does this then mean that all equations being set up for all responses will need to have age and age-squared as predictors? I am not sure how to use a variable (age-squared) to predict one response and not the others.

- For the 23-50y trajectory, I would like to adjust for number of children (a time-varying covariate). Currently REALCOM imputes “number of children” for ages younger than 23y. Is there a way to restrict the age range over which this variable is imputed? I have thought of (a) setting number of children to 0 for ages younger than 23y and (b) running two sets of imputations: one for the 11-50y trajectory, where number of children wouldn’t feature as a covariate, and one for the 23-50y trajectory, where I would include number of children. I don’t think that (b) is an ideal option, as it seems to me it would impute some information twice. If I use (a), there would be no variation in this variable within or between individuals for ages < 23. I think this would be OK, but would like someone to confirm this.

Many thanks for your help.
S

Re: non-linear relationship & restricting imputation of vari

Posted: Wed Apr 24, 2013 1:26 pm
by Harvey Goldstein
Regarding your first question, I would suggest simply including the squared term as a predictor for every response. Since this is just for imputation purposes, and assuming that the relationship with age is very close to linear for the non-BMI responses, it should be fine.
Regarding your second question, option (a) would be fine, but make sure that you treat number of children as ordered categorical; otherwise this won't lead to a proper specification.
Harvey Goldstein
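
The two data-preparation steps discussed in this thread (adding an age-squared predictor, and fixing number of children at 0 below age 23) could be sketched roughly as follows. This is only an illustration in plain Python, not Stata or realcomImpute syntax; the records and the names `rows`, `bmi`, `n_children`, `age_sq` and `prepare` are all hypothetical. Note that option (a) as written overwrites any value recorded below age 23, observed or missing.

```python
# Hypothetical long-format records: one row per person per measurement occasion.
# None marks a missing value that the imputation model would later fill in.
rows = [
    {"id": 1, "age": 11, "bmi": 17.2, "n_children": None},
    {"id": 1, "age": 23, "bmi": None, "n_children": 0},
    {"id": 1, "age": 50, "bmi": 27.5, "n_children": None},
    {"id": 2, "age": 16, "bmi": None, "n_children": None},
    {"id": 2, "age": 36, "bmi": 24.1, "n_children": 2},
]

MIN_AGE_FOR_CHILDREN = 23  # start of the second trajectory in the question


def prepare(rows, min_age=MIN_AGE_FOR_CHILDREN):
    """Return copies of rows with age_sq added and n_children set to 0 below min_age."""
    prepared = []
    for row in rows:
        new_row = dict(row)  # copy, so the original records are left untouched
        # Age-squared is added once and offered as a predictor for every response.
        new_row["age_sq"] = row["age"] ** 2
        # Option (a): fix number of children at 0 for ages below min_age,
        # so nothing is left to impute for those occasions.
        if row["age"] < min_age:
            new_row["n_children"] = 0
        prepared.append(new_row)
    return prepared


prepared = prepare(rows)
```

Values of `n_children` that are still missing at age 23 or older are left as missing for the imputation step. Declaring the variable as ordered categorical is done when setting up the imputation model and is not shown here.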