check the homoscedasticity assumption in logistic models

Welcome to the forum for R2MLwiN users. Feel free to post your question about R2MLwiN here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go to R2MLwiN: Running MLwiN from within R >> http://www.bris.ac.uk/cmm/software/r2mlwin/
Post Reply
adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

check the homoscedasticity assumption in logistic models

Post by adeldaoud » Tue Mar 07, 2017 11:24 pm

Hi

I would like to check the homoscedasticity assumption in my logistic multilevel model. According to this post viewtopic.php?f=1&t=2006&e=1&view=unread#p4567, “You can assess the homoskedasticity assumption by plotting the predicted random effects against the covariates.”.

In the MLwiN envirioment, I assume this is achieved by plotting “the standardised residual x fixed part prediction” for the second level. I wonder how I can achieve this in R2mlwin? I cannot find anything on this in the manual.

Thanks in advance

ChrisCharlton
Posts: 1083
Joined: Mon Oct 19, 2009 10:34 am

Re: check the homoscedasticity assumption in logistic models

Post by ChrisCharlton » Wed Mar 08, 2017 12:04 pm

The following is an example of how to replicate this residuals plot within R:

Code: Select all

library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel <- runMLwiN(normexam ~ 1 + standlrt + (1 | school) + (1 | student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
pred = data.frame(school=tutorial$school, predexam=predict(mymodel))
predexam.mean <- summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")

ChrisCharlton
Posts: 1083
Joined: Mon Oct 19, 2009 10:34 am

Re: check the homoscedasticity assumption in logistic models

Post by ChrisCharlton » Wed Mar 08, 2017 2:36 pm

If I am interpreting what George said correctly he is suggesting plotting the residuals vs each of the predictors in the model (I assume averaged within level-2 units). The R code for this would be the same as above, except that the prediction would be replaced by the variable being checked.

adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

Re: check the homoscedasticity assumption in logistic models

Post by adeldaoud » Wed Mar 08, 2017 5:43 pm

Thanks for the clarifications, Chris, and some additional questions:

1)
Does the predict(mymodel) function takes the mean of the covariates or the actual data values for each observation (and plug them into the model)? [it looks like it is taking the actual values…]

What if I want to set some covariates to their mean, and but usual actual values for one?

2)
If I am interpreting what George said correctly he is suggesting plotting the residuals vs each of the predictors in the model (I assume averaged within level-2 units). The R code for this would be the same as above, except that the prediction would be replaced by the variable being checked.
I am re-fitting the model with two variables, to get predictions on each variable. However, I am getting the exact same predictions back (same plots). Can you see what I am doing wrong?

library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel <- runMLwiN(normexam ~ 1 + standlrt + sex + (1 | school) + (1 | student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))

# predicting on standlrt
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "standlrt"))
predexam.mean <- summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")


# predicting on sex
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "sex"))
predexam.mean <- summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")

ChrisCharlton
Posts: 1083
Joined: Mon Oct 19, 2009 10:34 am

Re: check the homoscedasticity assumption in logistic models

Post by ChrisCharlton » Wed Mar 08, 2017 6:18 pm

1) The predict function by default will give you the sum of the covariates multiplied by their parameter estimates. It does however have an option to supply you own data, so you could give it a data frame where you have replaced the values with their means.

2) If you wanted to replicate the behaviour of the MLwiN residuals window where it plots residuals by an average for variable instead of a fixed-part prediction the code would just be something like the following:

Code: Select all

library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel <- runMLwiN(normexam ~ 1 + standlrt + (1 | school) + (1 | student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
standlrt.mean <- summaryBy(standlrt~school, data=tutorial, FUN=mean)$standlrt.mean
plot(standlrt.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="standlrt", ylab="std(Intercept)")

adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

Re: check the homoscedasticity assumption in logistic models

Post by adeldaoud » Thu Mar 09, 2017 3:10 pm

Thanks, again, Chris.

Regarding the behavior of the predict() function.

1) does it take the sample data stored in the mymodel object?

2) not getting what the difference between the terms= argument and the params= argument (from the manual and my own experimentation)? Could you please help me understand this?

3) Beyond testing homoscedasticity, I would like to produce a couple of prediction holding a set of covariates at their mean, and only vary one variable (the range specified by me, and not the sample). [is this where you propose I feed the predict() with new data?]

Many thanks

ChrisCharlton
Posts: 1083
Joined: Mon Oct 19, 2009 10:34 am

Re: check the homoscedasticity assumption in logistic models

Post by ChrisCharlton » Thu Mar 09, 2017 3:43 pm

1) Unless you specify an alternative data frame with the newdata= argument in the function call, then yes this is the data that it uses.

2) The terms= argument allows you to choose by a subset of data via the variable names, params= allows to to choose via a subset of data via the parameter names (FP_...). The terms= parameter is only used if you have also specified type="terms", however in this case it takes precedence over the params= argument. The params= parameter applies to all prediction types.

3) Yes, my suggestion here was to create a new data frame which has the same variable names as the original data, but has the variables not being investigated held at their means. You would then pass this to the predict function via the newdata= argument.

Post Reply