longitudinal dataset

Welcome to the forum for REALCOM users. Feel free to post your question about REALCOM here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, as we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go REALCOM (Developing multilevel models for REAListically COMplex social science data) >> http://www.bristol.ac.uk/cmm/software/realcom/
caroline1
Posts: 6
Joined: Fri Jul 19, 2013 5:20 pm

Post by caroline1 »

Hi REALCOM users,

I’m new to multiple imputation and to this software, and wanted to describe how I am using it to see if anyone thinks I am doing something wrong.

I have a longitudinal dataset with repeated measures clustered within patients. The number of time points and their time spacing varies across patients. There are a total of 168,708 observations and 14,429 unique patients. My analysis model is a 2-level random intercept logistic regression model (random intercept for the patient), with the unit of analysis being patient-day. The analysis model includes independent variables measured at the day level, year level, time-invariant ones, and also includes physician-level demographic variables.
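For readers who prefer notation, the analysis model described above can be written roughly as follows. This is only a sketch: the symbols are assumptions introduced here, with i indexing patient-days, j indexing patients, x_ij collecting all the covariates (day-level, year-level, time-invariant and physician-level), and u_j the patient random intercept.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid \mathbf{x}_{ij}, u_j)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
```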

I am using the realcomImpute command in Stata. As explanatory variables in the imputation model I have included all the fully observed variables from the analysis model, the dependent variable, and some additional variables that might predict the probability of a value being missing. Here is my code (there are too many explanatory variables to list, so I have put explanatory_variables in their place):

realcomImpute m.comorbiditystatus_year_grp m.insurance_rev explanatory_variables using dxhyp.dat, replace numresponses(2) level2id(Patient_Key) cons(cons)

In REALCOM, I left the model specification, the MCMC estimation settings, and the values in the Impute procedure at their defaults.

A few specific questions:
• Does this approach seem OK in general for the type of data I have? Or are there modifications I should make, for example to what some of the default values are?
• In an imputation model for longitudinal data, does one typically include the value of the variable with missing values at the prior time point as an explanatory variable in the model?
• I’m not sure I understand how REALCOM comes up with the default model specification. Are there situations where that should be changed?
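
On the second question, one way to make the previous value available as a predictor is to build a within-patient lag before calling realcomImpute. The following is a minimal sketch on toy data, assuming a hypothetical visit-day variable called day; only Patient_Key and insurance_rev are names taken from the post. Whether the lag belongs in the imputation model is still the open question, and note that the lag of a partially observed variable will itself contain missing values.

```stata
* Toy data: hypothetical values, not from the real dataset
clear
input Patient_Key day insurance_rev
1 1 2
1 4 .
1 9 3
2 2 1
2 3 1
end

* Sort by visit day within patient, then take the previous row's value.
* The first record of each patient has no predecessor, so its lag is missing.
bysort Patient_Key (day): generate lag_insurance_rev = insurance_rev[_n-1]

list, sepby(Patient_Key)
```

Because the time spacing varies across patients, this lag is "previous observed visit" rather than "previous day", which may or may not be what the model needs.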

Any comments would be much appreciated!!

Thanks,
Caroline
ChrisCharlton
Posts: 1390
Joined: Mon Oct 19, 2009 10:34 am

Re: longitudinal dataset

Post by ChrisCharlton »

As your questions are fairly general, rather than being specifically related to Realcom, you might also want to ask them on the missing data discussion group (https://groups.google.com/forum/#!forum/missing-data) if you don't get any answers here.
caroline1
Posts: 6
Joined: Fri Jul 19, 2013 5:20 pm

Re: longitudinal dataset

Post by caroline1 »

Thanks for the advice, I will try that.
drsatpalsandhu
Posts: 7
Joined: Tue Oct 29, 2013 12:14 pm

Re: longitudinal dataset

Post by drsatpalsandhu »

Hi
Can we extend Realcom Impute (through the Stata realcomImpute commands) to a three-level data structure? For example, in a one-stage meta-analysis of longitudinal studies, repeated measurements (level 1) are nested within individuals (level 2), which are further nested within studies (level 3).
Regards
Sandhu
Jamoo
Posts: 36
Joined: Wed Oct 05, 2011 2:33 pm

Re: longitudinal dataset

Post by Jamoo »

Hi Sandhu,
drsatpalsandhu wrote:
"Hi
Can we extend Realcom Impute (through stata realcomImpute commands) to three level data structure. For example in case of one-stage meta analysis of longitudinal studies where repeated measurements (level 1) nested within individual (level 2), which are further nested in studies (level 3).
Regards
Sandhu"

Harvey Goldstein writes (http://www.bristol.ac.uk/media-library/ ... tation.pdf, p. 1):
"Currently, only 2-level hierarchical data can be handled, although in some cases it will [be] possible to substitute fixed for random effects. Thus, in a 3-level structure a fixed (dummy) variable for each level 3 unit could be used and a similar procedure for a cross classified model."

Assuming this is still the only way to specify a third level (which I think it is), it seems like it would be possible. Chris/Harvey, please correct me if I'm wrong, but I think in practical terms you do this by adding a dummy variable for each study (n dummies for n studies). This seems like it would be fine if you have a limited number of studies.
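
In Stata, one way to create the study dummies is tabulate with the generate() option. This is a rough sketch on toy data with a hypothetical identifier called study. How the dummies are then passed to realcomImpute, and whether one has to be left out to avoid collinearity with cons, are assumptions to check against the realcomImpute help file.

```stata
* Toy data: hypothetical study and patient identifiers
clear
input study patient
1 101
1 102
2 201
3 301
3 302
end

* tabulate with generate() creates one 0/1 indicator per distinct value:
* study_1, study_2, study_3
quietly tabulate study, generate(study_)

* With a constant in the model the full set of indicators is collinear,
* so one (here the first) would normally be dropped as the reference study.
drop study_1

describe study_*
```

With a few hundred level 3 units the same command works, but it produces a few hundred extra columns.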

So that's my attempt at an answer to your question. I have a linked question to ask:

In my case, I have around 400 unique level 3 units - is it possible that a Realcom model could handle this number of additional dummy variables?
ChrisCharlton
Posts: 1390
Joined: Mon Oct 19, 2009 10:34 am

Re: longitudinal dataset

Post by ChrisCharlton »

There are no hard-coded limits on the number of variables that you can specify in Realcom; however, depending on the size of your data, you may run into memory limits or other computational problems (and the models may run very slowly). I would suggest that you just try the model and see how far you get.