Realcom Stata and Longitudinal Data: Datafiles and Import

Go REALCOM (Developing multilevel models for REAListically COMplex social science data) >> http://www.bristol.ac.uk/cmm/software/realcom/
AlexandderWuttke
Posts: 5
Joined: Tue Jul 21, 2015 9:32 am

Realcom Stata and Longitudinal Data: Datafiles and Import

Post by AlexandderWuttke »

I am confused about how to impute longitudinal data with Stata and realcomImpute, and about the purpose of the different datasets that are created. I will first describe the steps I took.

I open my Stata 14 file, which is in long format, generate the necessary constant, drop all cases with missing values in the explanatory variables (the MI procedure with RealCom doesn't work otherwise), and shorten the dataset for testing purposes:

Code: Select all

use "long_old - Kopie.dta", clear
gen cons = 1
drop if missing(sex) | missing(wbt_)
sort wbt_
keep in 1/3000

I convert it to Stata 13 format and sort it by the ID variable:

Code: Select all

quietly saveold long_old_small, replace
sort unique_id

I export it to RealCom:

Code: Select all

realcomImpute wahlnorm_ sex wbt_ using long_old_small.dta, numresponses(1) replace level2id(unique_id) cons(cons)

Now my old data file (long_old_small) has been replaced with a file of only 61 KB. I cannot open this new long_old_small.dta in Stata, because Stata says it is not in Stata format.
Furthermore, a second dataset (long_old_small_wts.dta) of only 15 KB was created. I cannot open it with Stata either.
I am already confused at this point. Why two files?

I start Realcom as an administrator (otherwise I get an error message after the imputation), open long_old_small.dta (is this the right file?) and start the imputation.

This is where I start to need your help.

As the RealComImpute Stata guide advises, the next step is to execute -realcomImputeLoad- in Stata.
Executing the command without data in memory leads to an error message.

Which data am I supposed to have in memory?

As I pointed out earlier, I can no longer open the data file long_old_small.dta that is saved on disk.
So do I just have to keep Stata open while RealCom does the imputing, so that the data are still in memory?

In that case -realcomImputeLoad- appends 5 variables, named "_1_wahlnorm_" and so on, to my dataset in long format. I imputed 5 datasets.
Furthermore, 5 data files named imp1-imp5 were created, each containing a single variable named wahlnorm_1, wahlnorm_2, ...
What do I do with these data?

Do I need to append these variables to my dataset as well with -mi import flongsep-, as suggested here: https://www.cmm.bristol.ac.uk/for ... 6e33#p1253 ?


I have searched the forums and manuals. I hope one of you can clarify how to use RealCom.


Edit: If I execute -realcomImputeLoad- twice, it also appends the variables wahlnorm_1, wahlnorm_2, ... that were stored in the data files named imp1-imp5, next to the variables _1_wahlnorm_, ... Both sets of variables seem to be identical, though.
ChrisCharlton
Posts: 1348
Joined: Mon Oct 19, 2009 10:34 am

Re: Realcom Stata: Datafiles and Importing of Longitudinal Data

Post by ChrisCharlton »

The file name that you provide to the -realcomImpute- command is the name of the output file (in text format) that will be sent to Realcom. This is created from the currently loaded data, as well as the specification that you give -realcomImpute- and does not have to exist prior to running the command. The _wts file that is also created is related to a currently unsupported weighting functionality, and so can be ignored.
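
Given that the file named after "using" is overwritten with Realcom's text input, one way to keep the Stata dataset intact is to export under a different name. The following is only a sketch: the output name realcom_input.txt is hypothetical, and it assumes that -realcomImpute- accepts an arbitrary file name here. The variables and options are the ones from the question.

```stata
* Sketch only: realcom_input.txt is a hypothetical name, not from the thread
sort unique_id                          // sort first so the saved copy has the same row order as the export
quietly saveold long_old_small, replace // Stata copy that still contains the missing values
* Export under a different name so long_old_small.dta is not overwritten
realcomImpute wahlnorm_ sex wbt_ using realcom_input.txt, numresponses(1) replace level2id(unique_id) cons(cons)
```

With this arrangement long_old_small.dta stays readable in Stata, and realcom_input.txt is the file to open in Realcom.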

When you execute the -realcomImputeLoad- command, you need to make sure that the data containing the missing values are in memory and that your current working directory is the location where Realcom has saved the file "impvals.txt", as well as the imputed data files. If this is the case, -realcomImputeLoad- should read in the imputed data and store them in the wide mi style used by Stata (see http://www.stata.com/help.cgi?mi_styles). Once this has happened, you can use the data as if they had been imputed within Stata. At this point you can also get Stata to change the mi style if you prefer another way of storing the data.
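
The loading step described above might look roughly like the sketch below. The directory is a placeholder, and the sketch assumes that long_old_small.dta still holds the Stata data with missing values (that is, it was not overwritten by the export) in the same row order as when it was exported. -realcomImputeLoad- is called without arguments, as in the question. -mi describe- and -mi convert- are standard Stata mi commands.

```stata
* Sketch only: the path below is a placeholder, not from the thread
cd "C:/realcom_output"          // hypothetical folder where Realcom saved impvals.txt and the imputed files
use long_old_small.dta, clear   // assumes this file still contains the original data with missing values
realcomImputeLoad               // reads the imputed values into the data in memory (mi wide style)
mi describe                     // check the mi style and the number of imputations
mi convert flong, clear         // optional: switch to another mi style
```

After this, the usual -mi estimate- prefix should be usable as with data imputed within Stata.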
AlexandderWuttke
Posts: 5
Joined: Tue Jul 21, 2015 9:32 am

Re: Realcom Stata and Longitudinal Data: Datafiles and Import

Post by AlexandderWuttke »

Thank you so much, Chris.

The link you provided really clarified things for me. I didn't know that wide and long styles also matter in the context of imputed datasets. The talk about wide and long format here and in the manual confused me, as I am dealing with longitudinal data (in long format).
As I understand it now, my data are both in long format (for the multiple measurements of each individual: one row per observation) and in wide format (for the imputed values). This makes sense. Thanks!
AlexandderWuttke
Posts: 5
Joined: Tue Jul 21, 2015 9:32 am

Re: Realcom Stata and Longitudinal Data: Datafiles and Import

Post by AlexandderWuttke »

Dear Chris,

I am sorry, I've encountered yet another problem. Maybe you could help me again.

Reading in the dataset works fine, as does starting the imputation procedure.

But after a while the imputation quits and displays an error message:
"Error using reshape.
To RESHAPE the number of elements must not change."

Error in plotchains>makechain (line 64)
Error in plotchains (line 7)

And then it goes on with the usual error message regarding mcmdriver and the gui...


I ran the imputation twice.
Settings First Imputation:
15,000 Iterations (500 for each imputed dataset)
Screen Refresh Rate 300
Burn In 1500
The Imputation stopped after Iteration 1800

Settings Second Imputation:
300 Iterations (30 for each imputed dataset)
Screen Refresh Rate 50
Burn In Length 50
The Imputation stopped after Iteration 100

So it seems to me that the error occurs when the burn-in is finished and Realcom tries to create the first imputed dataset.

Fortunately, the error message is rather specific. Does it have to do with irregularities in my multilevel (longitudinal) data structure, or do you have any hint where to look for the mistake?
ChrisCharlton
Posts: 1348
Joined: Mon Oct 19, 2009 10:34 am

Re: Realcom Stata and Longitudinal Data: Datafiles and Import

Post by ChrisCharlton »

As the error is in plotchains, this would suggest that it is related to plotting the chains that you have decided to monitor. It might be worth reviewing these and also seeing whether selecting different parameters for monitoring makes any difference.
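
A quick arithmetic check on the two runs reported above is at least consistent with a plotting-related cause. In both runs the stopping iteration equals the burn-in length plus one screen refresh interval. This is only an observation from the posted settings, not a confirmed diagnosis.

```stata
* Run 1: burn-in 1500, screen refresh rate 300, stopped after iteration 1800
display 1500 + 300
* Run 2: burn-in 50, screen refresh rate 50, stopped after iteration 100
display 50 + 50
```

If the pattern holds, rerunning with a different refresh rate would be one way to see whether the stopping point moves with it.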