How to run multilevel models with missing data using Stat-JR and MLwiN?

gromatics
Posts: 2
Joined: Wed Dec 27, 2023 6:48 am

How to run multilevel models with missing data using Stat-JR and MLwiN?

Post by gromatics »

Hi, I'm a Stat-JR user who wants to run multilevel models with missing data using MLwiN. I have a three-level dataset with students nested within schools nested within countries, and I have some missing values in the outcome and predictor variables. I want to use multiple imputation to handle the missing data and then fit a random intercept model with MLwiN. I have read the Stat-JR documentation and the multiple imputation template, but I'm still confused about how to do this. Can anyone help me with the following questions?

• How do I specify the imputation model and the analysis model in Stat-JR? Do I need to use the same variables and levels for both models?

• How do I choose the number of imputations and iterations for the imputation process? What are the criteria or rules of thumb for this?

• How do I export the imputed datasets from Stat-JR to MLwiN? Do I need to use the realcomimpute command or the mi: prefix in MLwiN?

• How do I combine the results from the imputed datasets using Rubin's rules? Can Stat-JR or MLwiN do this automatically or do I need to do it manually?

I would appreciate any guidance or advice on how to run multilevel models with missing data using Stat-JR and MLwiN. Thank you in advance.
richardparker
Posts: 61
Joined: Fri Oct 23, 2009 1:49 pm

Re: How to run multilevel models with missing data using Stat-JR and MLwiN?

Post by richardparker »

Hi - you may find the document "Imputation for Multilevel Models with Missing Data Using Stat-JR" helpful: https://www.bristol.ac.uk/cmm/media/sof ... statjr.pdf

For a 3-level model, you will need to use the Stat-JR template NLevelImpute. This has not been as widely tested as the 2-level-only version (2LevelImpute), but the inputs are very similar to those for 2LevelImpute (which are in turn described in more detail in the document linked to above). You may also be interested to know that Blimp (https://www.appliedmissingdata.com/blimp) offers imputation for 3-level models, but using a fully conditional specification (as opposed to joint modelling, as in Stat-JR).

"How do I specify the imputation model and the analysis model in Stat-JR? Do I need to use the same variables and levels for both models?"
As the document linked above indicates, the template questions ask you to specify both your analysis model (model of interest; MOI) and your imputation model. Since the assumptions of the two models must not conflict, it is strongly advisable to include the same levels in both models. For the same reason, the imputation model should contain all the variables in the analysis model, but it can also contain additional (auxiliary) variables (e.g. ones that predict missingness or the underlying missing values).

"How do I choose the number of imputations and iterations for the imputation process? What are the criteria or rules of thumb for this?"
With regard to choosing the number of iterations, note that Stat-JR generates imputed datasets using Markov chain Monte Carlo (MCMC) methods, which sample from the posterior distribution. MCMC chains typically don't sample from the posterior distribution immediately, however, so the initial section of the chain is often discarded as a 'warm-up' period (the user determines how long to make the warm-up, e.g. by inspecting relevant diagnostics). Once the number of iterations prior to the first imputation has been determined, there is also the question of how many chain iterations to leave between subsequent imputed datasets. This depends on how autocorrelated the chains are: the higher the autocorrelation, the more iterations need to pass before an imputation drawn from the chain is effectively independent of the previous one. Stat-JR provides a number of diagnostic plots to help with this (e.g. the outputs with an .svg extension, as described in "What is returned in the results pane?" in the document linked to above). With regard to the number of imputations, this is a more general multiple imputation question, for which there is quite a lot of advice published elsewhere.
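
To make the warm-up / spacing idea concrete, here is a rough Python sketch (not Stat-JR code, and not how the template works internally). It assumes you have exported one parameter chain as a plain list of numbers; the function names, the 0.05 threshold and the simulated AR(1) chain are all made up for illustration. It estimates the autocorrelation at increasing lags, suggests a gap at which draws look roughly independent, and lists the iterations at which imputations would then be taken.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Stand-in for a real MCMC chain: an AR(1) series with coefficient phi."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def autocorr(chain, lag):
    """Sample autocorrelation of the chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    if var == 0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


def suggest_gap(chain, threshold=0.05, max_lag=500):
    """Smallest lag at which |autocorrelation| falls below the (arbitrary) threshold."""
    for lag in range(1, min(max_lag, len(chain) - 1) + 1):
        if abs(autocorr(chain, lag)) < threshold:
            return lag
    return None  # never dropped below the threshold within max_lag


def imputation_iterations(burn_in, gap, n_imputations):
    """Iterations at which imputed datasets are drawn: first after burn_in, then every gap."""
    return [burn_in + k * gap for k in range(n_imputations)]


chain = simulate_ar1(phi=0.9, n=5000)  # hypothetical, highly autocorrelated chain
print("suggested gap between imputations:", suggest_gap(chain))
print(imputation_iterations(burn_in=1000, gap=100, n_imputations=5))
```

This is only a numerical companion to the diagnostic plots; the plots themselves remain the better guide to whether the warm-up is long enough.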

"How do I export the imputed datasets from Stat-JR to MLwiN? Do I need to use the realcomimpute command or the mi: prefix in MLwiN?"
In terms of getting the imputed data out of Stat-JR - these should be included in the big zip file you get if you click the "Download" button after running NLevelImpute. Alternatively, they'll all end up in the general dataset list so you can switch to each dataset via the Dataset > Choose menu and then download it with Dataset > Download, or if you want to look at it first Dataset > View > Download.

"How do I combine the results from the imputed datasets using Rubin's rules? Can Stat-JR or MLwiN do this automatically or do I need to do it manually?"
Stat-JR does combine the results from the imputed datasets using Rubin's rules (see the document linked to above); alternatively, you can apply Rubin's rules yourself (via your software of choice) if you export the imputed datasets. NB if you want to analyse the models in MLwiN after imputation, you would have to run a model on each imputed dataset individually in MLwiN and then apply Rubin's rules yourself, as the multiple imputation functionality in MLwiN only supports imputations in the format created by Realcom-Impute.
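
If you do end up pooling by hand, Rubin's rules for a single parameter are short enough to code yourself. The Python sketch below assumes you have already fitted the model to each imputed dataset and copied out the point estimate and standard error for one coefficient; the function name and the example numbers are invented for illustration. It returns the pooled estimate, the pooled standard error (within-imputation variance plus (1 + 1/m) times the between-imputation variance) and Rubin's original degrees of freedom.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates  : point estimates, one per imputed dataset
    std_errors : the corresponding standard errors
    Returns (pooled_estimate, pooled_std_error, degrees_of_freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)                      # pooled point estimate
    w = mean([se ** 2 for se in std_errors])     # within-imputation variance
    b = variance(estimates)                      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b                      # total variance

    # Rubin's (1987) degrees of freedom; infinite if the estimates do not vary
    if b > 0:
        df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    else:
        df = math.inf
    return q_bar, math.sqrt(t), df


# Made-up estimates and standard errors from three imputed datasets
est, se, df = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(f"pooled estimate = {est:.3f}, SE = {se:.3f}, df = {df:.3f}")
# prints: pooled estimate = 2.000, SE = 1.528, df = 6.125
```

You would call this once per parameter (fixed effects and, with more care, variance components, which are often pooled on a transformed scale).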
tonyadams
Posts: 6
Joined: Fri Jun 16, 2023 9:24 am

Re: How to run multilevel models with missing data using Stat-JR and MLwiN?

Post by tonyadams »

To run multilevel models with missing data in Stat-JR and MLwiN, you can use Stat-JR for imputation, specifying relevant predictors in the imputation model and defining your multilevel analysis model separately.
Manisoa12
Posts: 5
Joined: Tue Jul 11, 2023 3:00 am

Re: How to run multilevel models with missing data using Stat-JR and MLwiN?

Post by Manisoa12 »

You can run multilevel models with missing data by first creating your multilevel analysis model and then using Stat-JR for imputation to identify important predictors.