
Don't have MLWin

Posted: Thu Mar 28, 2013 5:58 pm
by shakespeare
I don't own MLWin (I'm a SAS user). I'm looking at an article from Dec 2011 in the Journal of Statistical Software that says Realcom Impute can be used with any software, but it's not clear from the article how I could use it with SAS. The algorithm for use with MLWin is:

1) Run a model in MLWin
2) Use MLWin to save an imputation model
3) Use the model exported to REALCOM to carry out the imputation

It looks like I could export my SAS file to Stata, build the equations using REALCOM and then carry out the imputation. Am I right?

Re: Don't have MLWin

Posted: Fri Mar 29, 2013 6:33 pm
by ChrisCharlton
If you have access to Stata then you will probably find it easiest to use the realcom-impute Stata package (see http://missingdata.lshtm.ac.uk/index.ph ... Itemid=102).

If not then you will need to generate an input (text) file that matches the specification described on page 6 of the Realcom-Impute manual (http://www.bris.ac.uk/cmm/software/real ... tation.pdf). You can then use this file with Realcom-Impute to generate complete data sets, which you could then use in the package of your choice to run your model of interest and combine the results.
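For the "combine the results" step, one common choice is Rubin's rules. The Python sketch below is only an illustration under that assumption: it pools a single coefficient across the completed data sets, and the function name and the toy numbers are made up, not taken from the Realcom-Impute manual.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one parameter across m completed data sets using Rubin's rules.

    estimates: the point estimate from each completed data set
    variances: the squared standard error from each completed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)

# Toy values for illustration only (three imputed data sets).
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, round(std_error, 4))
```

You would fit your model of interest once per completed data set in your own package, collect each coefficient and its standard error, and pass the estimates and squared standard errors to a function like this.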

Re: Don't have MLWin

Posted: Mon Apr 01, 2013 7:11 pm
by shakespeare
This was quite helpful. I used SAS to create a text file that I opened in Excel to add the three descriptor lines at the top. I saved this as a tab delimited file and opened it in WordPad to check. It looks just like the example in the second link. So far so good. I opened this file in REALCOM and it seemed to understand everything. I have a two level model with all my variables at level one. The level 2 variable is correctly preselected, the responses are marked selected, and the explanatory variables say lev1+lev2 for cons and level 1 for all my auxiliary variables. I've tried to set up the estimation as per the paper with 4501 iterations, a burn-in of 100, and files to be produced every 500 iterations. I double checked my path to be sure my output is going where it should. I'm seeing an error when I try to run the estimation that says, "Error using + Matrix dimensions must agree." Of course, this spawns a whole host of other error messages. I'm not sure what that means, but I noticed the line preceding the initial error counts the number of cases and level 2 units. The number of cases is correct, but it's reporting 120 level 2 units when there are only 100. I copied that column out of the text file and ran it through SAS to count what was there and it came up 100, which is correct, so I'm not sure why REALCOM is having a problem. Any ideas?

Re: Don't have MLWin

Posted: Mon Apr 01, 2013 9:10 pm
by ChrisCharlton
Like MLwiN, Realcom-Impute determines the number of level-2 units by the number of times that the code changes within the ID variable. This means that the data will need to be sorted by this ID before importing it. If it isn't sorted then you will see more units than you expect, so this may explain what you are seeing. You will also need to make sure that you either use the same value for missing as Realcom-Impute expects (-9.999e29) or you tell it which missing code you are using in your data.
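To see how an unsorted ID column inflates the unit count, here is a minimal Python sketch (not Realcom-Impute's own code) that counts units as described above, one per run of identical consecutive IDs, and compares that with the number of distinct IDs. The function names and the toy ID list are made up for illustration.

```python
from itertools import groupby

def count_runs(ids):
    """Count units as one per run of identical consecutive IDs."""
    return sum(1 for _ in groupby(ids))

def count_distinct(ids):
    """Count units the way a frequency table would: distinct ID values."""
    return len(set(ids))

def sort_rows_by_id(rows, id_index):
    """Sort data rows by the level-2 ID column.

    sorted() is stable, so the original order of rows within each unit is kept.
    """
    return sorted(rows, key=lambda row: row[id_index])

# Toy ID column: 3 distinct units, but the rows are not grouped by unit.
ids = [3, 1, 1, 3, 2, 2, 1]
print(count_runs(ids))          # 5 runs when unsorted
print(count_distinct(ids))      # 3 distinct IDs
print(count_runs(sorted(ids)))  # 3 runs once sorted
```

If the run count and the distinct count differ for your ID column, the file is not sorted by that ID.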

Re: Don't have MLWin

Posted: Mon Apr 01, 2013 10:32 pm
by shakespeare
I'm using 9999 for missing and I specified that in the dialog. However, I failed to sort on the level 2 id. Thanks for the tip.

Re: Don't have MLWin

Posted: Tue Apr 02, 2013 12:51 pm
by shakespeare
So I re-ran everything and the count of the second level units is now correct, but I received the same error. I looked to see if what I'm doing differs from the documentation. I'm using 9999 as missing but it's declared as such in the program, so that shouldn't be a problem. I crosstabbed my binary variables before beginning and cell sizes look ok. There are some small amounts of missing data but no valid values with small numbers. For the checklist on the last page, there are no missing values in my auxiliary variables. It's possible there could be dependencies among the variables. On single level models I usually examine VIFs to screen predictors for collinearity, but I'm not sure of the best way to screen in the multilevel case. My data are properly ordered within level 2 units and I have no level 2 responses. I don't think the IGLS iterations item applies. I have no records for which all responses are missing and I have added the CONS. All my binary variables are coded as 2. I have 100 level two units for 6 responses. That should be enough, even with 6 auxiliary variables, right? All my predictors are at level 1 anyway as I have no level 2 variables other than location. It seems deleting variables to rule out linear dependencies is the most likely course of action, but I don't really know how to interpret the error message that I'm getting.
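As a rough screen for exact linear dependencies, the Python sketch below flags any column that is numerically a linear combination of the columns before it, using Gram-Schmidt orthogonalisation. It is only an illustration: it ignores the multilevel structure, the function name and toy data are made up, and nothing here establishes that a dependency is what triggers the "Matrix dimensions must agree" error.

```python
import math

def dependent_columns(rows, tol=1e-8):
    """Return indices of columns that are linear combinations of earlier columns.

    rows: list of equal-length numeric rows (complete cases only).
    A column is flagged when its residual after projecting out the columns
    kept so far is negligible relative to its own length.
    """
    n_cols = len(rows[0])
    cols = [[float(r[j]) for r in rows] for j in range(n_cols)]
    basis = []    # orthogonalised versions of the columns kept so far
    flagged = []
    for j, col in enumerate(cols):
        resid = col[:]
        for b in basis:
            coef = sum(x * y for x, y in zip(resid, b)) / sum(y * y for y in b)
            resid = [x - coef * y for x, y in zip(resid, b)]
        norm = math.sqrt(sum(x * x for x in col))
        resid_norm = math.sqrt(sum(x * x for x in resid))
        if norm == 0 or resid_norm <= tol * norm:
            flagged.append(j)
        else:
            basis.append(resid)
    return flagged

# Toy data: columns are cons, x1, x2 and x3, where x3 = x1 + x2.
rows = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 2],
    [1, 2, 0, 2],
    [1, 0, 3, 3],
]
print(dependent_columns(rows))  # [3]
```

A typical culprit this would catch is a full set of dummy variables alongside CONS. Near-collinearity that is not exact will pass this check, so VIF-style diagnostics are still worth looking at.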

Re: Don't have MLWin

Posted: Tue Apr 02, 2013 4:52 pm
by ChrisCharlton
In that case I think that we will need to see an example of the files that you are having trouble with. If you could send this to Professor Goldstein (contact details here: http://www.bris.ac.uk/cmm/team/hg/) then he should be able to check this next week.

Re: Don't have MLWin

Posted: Thu Apr 04, 2013 7:02 pm
by shakespeare
I tinkered with this and finally got it to run. Not sure exactly what made it go, but dropping one of my variables seemed to help. Another thing I figured out is that it's not enough to change the missing data value. One must also click on the adjacent set missing value button to activate that new value. One problem I'm having is that I'm only getting imputations for my first column of data. The other 5 response variables still have 9999 for the missing data. Any idea why? The documentation is more for MLWin users than for stand-alone users like myself, so it's hard to get a picture of why only part of my data is being imputed.

Re: Don't have MLWin

Posted: Mon Apr 08, 2013 3:14 pm
by ChrisCharlton
I am not sure why it would only apply the change to one variable; it's possible that there is a bug. Could you check whether it works okay if you replace the 9999 values with the default missing value before importing the data?
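If it helps, the recoding could be done on the text file itself before import. The Python sketch below is a rough illustration that assumes a tab delimited file with three descriptor lines at the top, as described earlier in the thread, and writes the default missing value as the text "-9.999e29". That textual form and the function names are assumptions, so check them against the manual.

```python
DEFAULT_MISSING = "-9.999e29"  # assumed textual form of the default missing value

def _is_code(field, code):
    """True if the field parses as a number equal to the missing code."""
    try:
        return float(field) == code
    except ValueError:
        return False

def recode_missing(lines, old_code=9999.0, new_code=DEFAULT_MISSING, header_lines=3):
    """Replace old_code with new_code in every data field.

    lines: the file's lines; trailing line endings are stripped.
    The first header_lines lines (the descriptor lines) are passed through as-is.
    """
    out = []
    for i, raw in enumerate(lines):
        line = raw.rstrip("\r\n")
        if i < header_lines:
            out.append(line)
            continue
        fields = line.split("\t")
        out.append("\t".join(new_code if _is_code(f, old_code) else f for f in fields))
    return out

# Toy file contents; the three descriptor lines are placeholders only.
example = ["descriptor 1", "descriptor 2", "descriptor 3", "1\t9999\t2.5", "2\t3\t9999"]
for line in recode_missing(example):
    print(line)
```

Note that this replaces 9999 in every data column, so it would also overwrite a legitimate value of 9999 (for example in an ID column); restrict it to the response columns if that could happen in your data.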

Re: Don't have MLWin

Posted: Mon Apr 08, 2013 6:09 pm
by shakespeare
I solved part of the problem. I went into the variable selection and instead of accepting the defaults, I manually selected the variables and it imputed the continuous variables. The categorical variables were still missing, however. My categorical variables are binary and unordered multicategory variables that I have designated type 2. When I put in the default missing value REALCOM throws an error and never runs. I'm thinking some of the problems I'm having may be because I'm using Excel to add the three lines of description at the top of the data file. I'll try using SAS to create the entire file. Does REALCOM prefer fixed format or tab delimited?