Random intercept vs. random slopes in 2LevelImpute

richardparker · Post by **richardparker** » Fri Nov 14, 2014 2:28 pm

Hi - looking at your input string, it's likely that the choice of distribution for the MOI is causing the problem. It's specified as 'Multivariate Normal', but only one response variable for the MOI is specified (ivh), rather than the larger number (>1) of response variables Stat-JR will be expecting given the distribution chosen. Currently, the 2LevelImpute template supports normal, binary, Poisson and multivariate normal distributions in the MOI, but if none of those is suitable for your needs you could use the imputed datasets to fit an alternative distribution via other means (including via other Stat-JR templates).

You'll only be able to download the datasets via Dataset>Choose, and then Dataset>Download, so you won't see all those files listed, as not all of them are .dta files. We ironed out a few issues with the more generic Download function (the green button which appears after your model has finished running) in the latest version of Stat-JR, so that might work better with the 2LevelImpute template now (I've just tried it, albeit with a different model). It's fine (if a bit fiddly) to download the datasets via the method you're using, though: it's just a question of distinguishing the ones produced by the latest model run from other datasets. In fact, you could move the datasets shipped with Stat-JR out of the StatJR\datasets folder and temporarily store them elsewhere, just to de-clutter the list of datasets in TREE, which might help. As long as you leave the tutorial dataset (TREE expects to find this when it starts up), the tutmiss dataset if you're regularly running the 2LevelImpute eBook, and of course your own datasets, that should be fine.

Yes, you should be able to download the imputed datasets; these are saved, in TREE's datasets list, as...

Imputation_Model_impute_datafile_chainA_iterB
...and...
impute_datafile_chainA_iterB

Note, there's no need to download both, as they're the same: i.e. it's only necessary to download either Imputation_Model_impute_datafile_chain0_iter0, Imputation_Model_impute_datafile_chain1_iter0, etc. or impute_datafile_chain0_iter0, impute_datafile_chain1_iter0, etc. They're simply saved twice, with and without the Imputation_Model prefix (you only get one type if you do so via the green Download button).

Also, don't confuse these with the level 2 datasets (e.g. impute__L2Data_chainA_iterB or Imputation_Model_impute__L2Data_chainA_iterB).
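The naming scheme described above can be used to pick out one copy of each imputed dataset from a longer list. The following is a minimal sketch only: it assumes you have the dataset names as plain strings (with or without a .dta extension), and the `pick_imputed` helper and the example listing are hypothetical, not part of Stat-JR.

```python
import re

# Matches the level 1 imputed datasets named in the post above. The optional
# first group catches the duplicate copies saved with the Imputation_Model
# prefix; an optional .dta extension is tolerated (an assumption).
IMPUTED = re.compile(
    r"^(Imputation_Model_)?impute_datafile_chain(\d+)_iter(\d+)(?:\.dta)?$"
)


def pick_imputed(names):
    """Return one name per (chain, iter), preferring the unprefixed copy."""
    chosen = {}
    for name in names:
        match = IMPUTED.match(name)
        if match is None:
            continue  # skips the L2Data files and any shipped datasets
        prefix, chain, iteration = match.groups()
        key = (int(chain), int(iteration))
        # Keep the unprefixed name when both copies are present.
        if key not in chosen or prefix is None:
            chosen[key] = name
    return [chosen[key] for key in sorted(chosen)]


# Hypothetical listing, for illustration only.
example = [
    "tutorial",
    "Imputation_Model_impute_datafile_chain0_iter0",
    "impute_datafile_chain0_iter0",
    "impute_datafile_chain1_iter0",
    "impute__L2Data_chain0_iter0",
    "Imputation_Model_impute__L2Data_chain0_iter0",
]
print(pick_imputed(example))
```

The level 2 datasets are excluded simply because their names contain `impute__L2Data` rather than `impute_datafile`.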

Best wishes,

Richard

shakespeare · Post by **shakespeare** » Fri Nov 14, 2014 4:42 pm

Thank you. The documentation for 2LevelImpute talks about the latent normal model used in the MOI (at least for single level models). It says:

For an ordered categorical response a similar procedure is used with additionally a set of thresholds defined on the standard normal scale that delineate the ordered categories.

The procedure builds on the probit model for binary responses. Does this not mean an ordered outcome can be fit? I'm confused.

I can definitely move the example files around as you suggest. One question I have is whether I need to request an update of Stat-JR. Also, it looks like my version of the 2LevelImpute template is dated 9/10/14. Is this current?

richardparker · Post by **richardparker** » Tue Nov 18, 2014 2:31 pm

Hi,

Sorry for the confusion. You can fit categorical response and/or categorical explanatory variables in the imputation model. Currently, however, there isn't support for categorical (other than binary) response variables in the MOI, but we hope to resolve this soon. In the meantime you can take the imputed datasets and fit them in a MOI specified via other means (e.g. fitting your MOI via another Stat-JR template, or via another software package): i.e. the lack of support for categorical responses in the MOI doesn't impact on 2LevelImpute's ability to handle categorical variables in the imputation model.
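If the MOI is fitted to each imputed dataset outside 2LevelImpute, the per-dataset results then need combining. A standard way to do that for a single coefficient is Rubin's rules; the sketch below is offered on that assumption and is not something this thread says the template does. The function name and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimate from each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)    # pooled point estimate
    u_bar = mean(variances)    # average within-imputation variance
    b = variance(estimates)    # between-imputation variance (m - 1 divisor)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Made-up numbers purely to show the call.
estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_variance ** 0.5)
```

The pooled standard error is the square root of the total variance, so it grows when the imputed datasets disagree with each other.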

You can get a new version of Stat-JR via: http://www.cmm.bris.ac.uk/clients/newstatjrdownload/. There isn't a version of the 2LevelImpute template newer than the version date you noted.

Best wishes,

Richard

shakespeare · Post by **shakespeare** » Tue Nov 18, 2014 6:28 pm

Thanks for your response. This raises a few questions:

1) Do the answers to questions specifying the model of interest have any effect?
2) If so, what distribution should be chosen for multinomial outcomes?
3) Should I choose yes for the MVN update for beta?
4) It seems that there should be a question at this point re: whether there are random slopes. I did not see it, and I don't want random slopes in this case, but I was wondering if there is a bug in the template.
5) I have missing data on my outcome (ivh). Should I include that as a response variable in the imputation model? In general, for any variable that is an outcome, i.e., response in the MOI, should it be included as a response in the imputation model if it has missing data? Seems so, but I wasn't sure this was permissible within the framework of the software.

Thx.

richardparker · Post by **richardparker** » Wed Nov 19, 2014 12:52 pm

Hi - I've answered your questions below - hope that helps.

1) Do the answers to questions specifying the model of interest have any effect?

2) If so, what distribution should be chosen for multinomial outcomes?

The MOI inputs in the 2LevelImpute template only have an effect on how the MOI is fit; it doesn't have any effect on the imputation model.

3) Should I choose yes for the MVN update for beta?

If you've chosen a multivariate normal response in your MOI, you'll be asked this question about how you wish to estimate it (your answer to that question only pertains to how the MOI is fit, not the imputation model). We advise choosing 'Yes' for the MVN update for beta, because it improves mixing.

4) It seems that there should be a question at this point re: whether there are random slopes. I did not see it, and I don't want random slopes in this case, but I was wondering if there is a bug in the template.

The 2LevelImpute template doesn't allow for random slopes (or coefficients) in the MOI if you're modelling a multivariate normal response in the MOI. There are other Stat-JR templates available which do allow for such models to be fitted (e.g. to imputed datasets; see 2LevelMVNormalRS and 2LevelMVNormalRScc available in the zipped folder of 'further multivariate normal and mixed response models' accessible towards the bottom of this page: http://www.bristol.ac.uk/cmm/software/statjr/downloads, for example).

5) I have missing data on my outcome (ivh). Should I include that as a response variable in the imputation model? In general, for any variable that is an outcome, i.e., response in the MOI, should it be included as a response in the imputation model if it has missing data? Seems so, but I wasn't sure this was permissible within the framework of the software.

Yes, that's permissible within the framework of this software. There is a new 'Missing Data' module (# 14) as part of the LEMMA (free) online course (http://www.bristol.ac.uk/cmm/learning/o ... index.html) which provides useful guidance, including the following: "clearly any variable we wish to impute missing values for must be included. Furthermore, all variables involved in the model of interest must be included, irrespective of whether they have missing values, including what will be the outcome variable."

Best wishes,

Richard

shakespeare · Post by **shakespeare** » Wed Nov 19, 2014 1:34 pm

That pretty much confirms what I was thinking. Thanks for your help. I'm running the ivh model now. We'll see how things turn out...

shakespeare · Post by **shakespeare** » Thu Nov 20, 2014 4:28 pm

Ok. I'm getting output. Will evaluate when I have time. One problem I'm having is that when I go to Dataset>Choose I can only see .dta files. Nothing else is visible, including the .svg files. Without the graphs, it will be difficult to assess the mixing properties of my imputations. What do you think I should do?

richardparker · Post by **richardparker** » Thu Nov 20, 2014 4:52 pm

Hi - if the template runs to completion, then the svg files are available for viewing in the results pane (via the drop-down list). If there's been a crash when it's fitting the MOI, however, then this process (of being uploaded onto the results pane drop-down list) may not have completed (or you may naturally have navigated away from that screen by choosing another dataset), and, as you point out, svg files aren't available in the list of datasets (i.e. via Dataset>Choose; only .dta files will appear there). In that case you could run diagnostics on the chains independently: e.g. using the MCMCColumnDiagnostics template in Stat-JR (if you select that template and run it on a chain, diagnostic plots will be available for viewing as sixway.svg from the drop-down list in the results pane), or by exporting/importing your datasets into other software packages (e.g. MLwiN, R, etc.) and using their facilities (MLwiN has a variety of MCMC diagnostics, as does the coda package in R; both will import .dta files, R via the foreign package).
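As an illustration of checking a chain independently, here is a rough sketch that estimates lag autocorrelations and a crude effective sample size. It assumes the chain has already been loaded into a plain list of numbers (for example from an exported .dta file); the function names are hypothetical, and the estimator is deliberately simple, so it is no substitute for the diagnostics in MLwiN or coda.

```python
def autocorr(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mu = sum(chain) / n
    var = sum((x - mu) ** 2 for x in chain)
    if var == 0:
        raise ValueError("chain is constant")
    cov = sum((chain[i] - mu) * (chain[i + lag] - mu) for i in range(n - lag))
    return cov / var


def effective_sample_size(chain, max_lag=None):
    """Crude ESS: n / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    which is a simplifying assumption rather than a recommended rule.
    """
    n = len(chain)
    if max_lag is None:
        max_lag = n // 2
    tau = 1.0
    for lag in range(1, max_lag + 1):
        rho = autocorr(chain, lag)
        if rho <= 0:
            break
        tau += 2 * rho
    return n / tau


# Two artificial chains: one that sticks, one that alternates.
sticky = [0.0] * 50 + [1.0] * 50
alternating = [1.0, -1.0] * 50
print(effective_sample_size(sticky), effective_sample_size(alternating))
```

An effective sample size far below the number of stored iterations suggests strong autocorrelation, which is the same symptom a poorly mixing trace plot shows.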

shakespeare · Post by **shakespeare** » Thu Nov 20, 2014 6:40 pm

I don't see them in the viewing pane; there's a work file there. Thanks for the workaround using the diagnostics template. I'll give that a try.

shakespeare · Post by **shakespeare** » Mon Dec 01, 2014 9:46 pm

We've been on holiday across the pond and I was swamped before that, so I just got to feed my results into MLwiN for a look at the diagnostic plots on my ivh model. The time series trace doesn't look great and the density trace was skewed and multimodal. I used 5000 iterations between imputations. The Raftery-Lewis Nhat was 23714,6454 which I'm not exactly sure how to interpret in the MI context. Does this mean I need 23k+ iterations between imputations? I ran 50k total the first time, so that would be my guess. Since 23k+ iterations between imputations is not practical, I'm wondering about modeling this outcome as continuous. I have four categories in ivh, although I could shrink it to two if need be. I can't model proportions since I'm not fitting an intercept-only model - this is a regression. I'm basing my approach on the MLwiN docs, but would be interested in your expertise. What do you think I should do (besides buy a supercomputer)? Thx.

www.cmm.bristol.ac.uk/forum
