R2MLwiN with mice imputed data (with sampling weights)

R2MLwiNuser
Posts: 2
Joined: Thu Nov 30, 2017 4:21 pm


Post by R2MLwiNuser » Thu Nov 30, 2017 4:41 pm

Hello,

I'm having difficulty 'pooling' multilevel regression results using data that has been multiply imputed using MICE software through the R2MLwiN package (I'm a little stuck!).

What I am trying to do
Run 10 imputed datasets through the runMLwiN() function, and apply survey weights at level 1.

This is the code I am using

Code:

F1 <- Y ~ 1 + x1 + x2 + (1 | level2) + (1 | level1)

# standardised sits inside the weighting list (runMLwiN reads weighting$standardised)
model <- with(imputed, runMLwiN(Formula = F1,
  estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE))))
# Error in eval(predvars, data, env) : object 'Level1weight' not found   <-- NOT WORKING

If I do this:

model <- with(imputed, runMLwiN(Formula = F1,
  estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)),
  data = imputed_as_dataframe))
# <-- WORKS, but treats the imputed datasets as one big dataset!

Level1weight is in the imputed data (a mids object, not a data.frame). Formula = F1 seems to work on its own; the error only comes up when I try to apply the weights. If I remove the sampling weights, I can obtain the pooled estimates. For example, see the following code:

Code:

model <- with(imputed, runMLwiN(Formula = F1))
summary(pool(model))   # <-- THIS WORKS

I just can't seem to apply survey weights AND get pooled estimates!

Alternative approach that I'm thinking about
I've deconstructed the imputed data into 10 datasets and am running the following code:

Code:

model1 <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed1)
model2 <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed2)
# ...
model10 <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed10)


When I apply the weights like this, I am able to run the models without errors, BUT I'm unsure how to pool the results!

Is it possible to pool the results of several objects stored as "Formal class mlwinfitIGLS"?

Thanks so much. Please let me know if I can clarify anything!

ChrisCharlton
Posts: 1112
Joined: Mon Oct 19, 2009 10:34 am

Re: R2MLwiN with mice imputed data (with sampling weights)

Post by ChrisCharlton » Thu Nov 30, 2017 5:49 pm

It looks as if R2MLwiN is unable to find the weighting variable in the data.frame that it receives from the with() function provided by mice. Possible things to try are:
  • Modify the imputed object to contain the weighting variable. Looking at the structure of this object this looks like it would be fiddly.
  • Rather than using the name of the weight column, pass in the vector directly, i.e. if it was in a vector named Level1weight you would use the following syntax:

    Code:

    # list() rather than c(): c(NA, Level1weight) would flatten the weights into one long vector
    model <- with(imputed, runMLwiN(Formula = F1,
      estoptions = list(weighting = list(weightvar = list(NA, Level1weight), standardised = TRUE))))

    This should work, but hasn't been tested much.
  • Use mice to run the models without the weights and then replace the fits in the output with versions that do use the weights (obviously this will mean that the modelling will take twice as long). For example:

    Code:

    F1 <- Y ~ 1 + x1 + x2 + (1 | level2) + (1 | level1)
    model <- with(imputed, runMLwiN(Formula = F1))
    # Overwrite each unweighted fit with its weighted counterpart
    model$analyses[[1]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed1)
    model$analyses[[2]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed2)
    # ...
    model$analyses[[10]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed10)
    summary(pool(model))
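
The replacement step above can also be written as a loop, so that there is no need to type one line per imputation. This is only a minimal sketch: `refit_analyses` and `fit_weighted` are hypothetical names (not part of mice or R2MLwiN), and the toy objects at the end merely stand in for a real fitted object and real completed datasets.

```r
# Hypothetical helper: swap every fit stored in model$analyses for a refit
# produced by fit_fun() on the matching completed dataset.
refit_analyses <- function(model, datasets, fit_fun) {
  stopifnot(length(model$analyses) == length(datasets))
  for (i in seq_along(datasets)) {
    model$analyses[[i]] <- fit_fun(datasets[[i]])
  }
  model
}

# Intended use with the objects from this thread (not run here; needs mice,
# R2MLwiN and MLwiN):
# datasets <- lapply(seq_len(imputed$m), function(i) mice::complete(imputed, i))
# fit_weighted <- function(d) runMLwiN(Formula = F1, data = d,
#   estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"),
#                                      standardised = TRUE)))
# model <- refit_analyses(with(imputed, runMLwiN(Formula = F1)), datasets, fit_weighted)
# summary(pool(model))

# Toy stand-ins so the helper can be tried without mice or MLwiN
toy_model <- list(analyses = list("unweighted 1", "unweighted 2"))
toy_data  <- list(data.frame(y = 1:3), data.frame(y = 4:6))
toy_model <- refit_analyses(toy_model, toy_data, function(d) mean(d$y))
```

Like the manual version, this still fits every model twice (once without weights, once with).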
    

R2MLwiNuser
Posts: 2
Joined: Thu Nov 30, 2017 4:21 pm

Re: R2MLwiN with mice imputed data (with sampling weights)

Post by R2MLwiNuser » Thu Nov 30, 2017 6:44 pm

Hi Chris,

First of all, thank you so much for the quick reply!

1st/2nd suggestion
1st: Modify the imputed object to contain the weighting variable. Looking at the structure of this object this looks like it would be fiddly.
2nd: Rather than using the name of the weight column, pass in the vector directly, i.e. if it was in a vector named Level1weight you would use the following syntax:

Code:

model <- with(imputed, runMLwiN(Formula = F1,
  estoptions = list(weighting = list(weightvar = list(NA, Level1weight), standardised = TRUE))))

Removing the quotes did not work in my case. My imputed object does contain the Level1weight variable; however, with the quotes removed R reports that the object is not found.

3rd suggestion
Use mice to run the models without the weights and then replace the fits in the output with versions that do use the weights (obviously this will mean that the modelling will take twice as long). For example:

Code:

F1 <- Y ~ 1 + x1 + x2 + (1 | level2) + (1 | level1)
model <- with(imputed, runMLwiN(Formula = F1))
model$analyses[[1]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed1)
model$analyses[[2]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed2)
# ...
model$analyses[[10]] <- runMLwiN(Formula = F1, estoptions = list(weighting = list(weightvar = c(NA, "Level1weight"), standardised = TRUE)), data = imputed10)
summary(pool(model))

This approach worked! I am now able to impute a dataset using MICE, convert each imputation (.imp = 1..n) to an individual data.frame, and then run these imputed datasets through R2MLwiN's runMLwiN() function with sampling weights at level 1 to obtain pooled estimates.

Complicated but it works! Thank you so much.

I do have a question about model$analyses[[#]]: does it store the results of each multilevel analysis (one per imputed dataset) as its own element, so that the mice package's pool() function can then pool the estimates of all the models stored in model$analyses? In other words, does model$analyses hold the estimates from my analysis of each imputed dataset, and does pool() then combine those estimates?

ChrisCharlton
Posts: 1112
Joined: Mon Oct 19, 2009 10:34 am

Re: R2MLwiN with mice imputed data (with sampling weights)

Post by ChrisCharlton » Fri Dec 01, 2017 10:26 am

I should probably have been clearer regarding passing the weights as a vector. Here is the code within the runMLwiN function for handling the weights:

Code:

  # Extract weights and add to output data
  if (!is.null(weighting)) {
    for (i in 1:length(weighting$weightvar)) {
      if (!is.na(weighting$weightvar[i])) {
        if (is.character(weighting$weightvar[[i]])) {
          wtvar <- model.frame(as.formula(paste0("~", weighting$weightvar[[i]])), data = data, na.action = NULL)
          indata <- cbind(indata, wtvar)
        } else {
          if (is.vector(weighting$weightvar[[i]])) {
            indata <- cbind(indata, weighting$weightvar[[i]])
            wtname <- paste0("_WEIGHT", (length(weighting$weightvar) - i) + 1)
            vnames <- colnames(indata)
            vnames[length(vnames)] <- wtname
            colnames(indata) <- vnames
            weighting$weightvar[[i]] <- wtname
          } else {
            stop("Invalid weights specification")
          }
        }
      }
    }
    if (is.null(fpsandwich))
      fpsandwich <- TRUE
    if (is.null(rpsandwich))
      rpsandwich <- TRUE
    if (is.null(weighting$standardised))
      weighting$standardised <- TRUE
  }
If you specify the weights as a string then it tries to find the variable either in the data passed to runMLwiN via the data argument or, if this is not specified, in the parent frame. The second of these is happening in your case.
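
To make the string lookup concrete, here is a minimal sketch in plain R that uses the same model.frame() call as the excerpt above. `lookup_weight` and `survey` are made-up names for illustration only, and the weights are invented.

```r
# Same lookup the excerpt performs for a weight given as a string
lookup_weight <- function(name, data = NULL) {
  model.frame(as.formula(paste0("~", name)), data = data, na.action = NULL)
}

survey <- data.frame(y = c(3.1, 2.4, 5.0), Level1weight = c(1.2, 0.8, 1.0))

# With data supplied, the column is found inside the data.frame
found <- lookup_weight("Level1weight", data = survey)

# Without data, the name is looked up in the calling environments instead;
# no object called Level1weight exists there, so the lookup fails
not_found <- tryCatch(lookup_weight("Level1weight"), error = function(e) e)
```

The second call fails in the same way as the "object 'Level1weight' not found" error reported at the start of this thread.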

As an alternative to this you should be able to pass in a vector directly, which it then appends to its working data.frame. If this was stored in a vector named Level1weight then you could pass this directly. If instead it was in a column of a data.frame named mydata then you would pass it as mydata$Level1weight. If this column is directly within your imputed object then you would pass this as something like imputed$Level1weight. You can find the exact name with the function:

Code:

str(imputed)
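
One detail that may matter when passing the weights as a vector: the runMLwiN excerpt above indexes weighting$weightvar with one entry per level, and in R c(NA, <vector>) flattens everything into a single long vector, whereas list(NA, <vector>) keeps one entry per level. The sketch below only illustrates that base R behaviour with made-up weights; whether a given R2MLwiN version accepts the list form is an assumption to check against its documentation.

```r
Level1weight <- c(1.2, 0.8, 1.0)   # made-up weights for three level-1 units

flat <- c(NA, Level1weight)        # c() flattens: one vector of length 4
kept <- list(NA, Level1weight)     # list() keeps two entries: NA (level 2), weights (level 1)

# Column name the excerpt would build for entry i = 2 of a two-entry weightvar
i <- 2
wtname <- paste0("_WEIGHT", (length(kept) - i) + 1)
```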
Yes, contained within the model object is a list of the outputs for each of the imputed models. If you run the R function to view the structure of the class:

Code:

str(model)
Then you should see that within this it contains details of the original model and imputation calls, information regarding the number of missing values in each variable and a list of output objects from each model fit. It is the last of these that you are changing in your case. The pool function will then use this information to apply Rubin's rules to the estimates giving you overall estimates. If you didn't want to run as many models you could of course just manually run the individual models and then apply Rubin's rules yourself.
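
For anyone who wants to apply Rubin's rules by hand, here is a minimal sketch in base R. `pool_rubin` is a hypothetical helper and the numbers are made up. It returns pooled estimates and standard errors only, without the degrees of freedom that pool() also reports.

```r
# Pool m sets of estimates with Rubin's rules.
# estimates: m x p matrix, one row of coefficients per imputed dataset
# variances: m x p matrix of the matching squared standard errors
pool_rubin <- function(estimates, variances) {
  m     <- nrow(estimates)
  qbar  <- colMeans(estimates)         # pooled point estimates
  ubar  <- colMeans(variances)         # within-imputation variance
  b     <- apply(estimates, 2, var)    # between-imputation variance
  total <- ubar + (1 + 1 / m) * b      # total variance
  data.frame(estimate = qbar, se = sqrt(total))
}

# Made-up numbers: two coefficients across three imputations
est <- rbind(c(1.0, 0.50), c(1.2, 0.40), c(1.1, 0.60))
colnames(est) <- c("Intercept", "x1")
se2 <- matrix(0.04, nrow = 3, ncol = 2, dimnames = list(NULL, colnames(est)))

pooled <- pool_rubin(est, se2)
print(pooled)
```

With real fits, each row of the two matrices would be filled from one fitted model's fixed-part estimates and the diagonal of their covariance matrix (for example via coef() and vcov(), assuming those methods are available for the fit class).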
