Page **1** of **1**

### Identifying residuals

Posted: **Wed Jul 04, 2018 4:03 pm**

by **pwalthery**

Hello,

I have been fitting a straightforward variance component two-level model, with did and lev2ID thelevel one and level 2 ids , and would like to label my level 2 residuals in a caterpillar plot. However, I am not able to match mymodel2e@residual$lev_2_resi_est_Intercept to mymodel2e@data$lev2ID, as their size are different. How can I identify my level 2 observations?

Thank you

### Re: Identifying residuals

Posted: **Thu Jul 05, 2018 9:42 am**

by **ChrisCharlton**

The residuals should be in the same order as the units in the original data, so if your ID variables are sorted, are unique across level-2 units and you had a two-level model then you could calculate the IDs with a line such as:

Code: Select all

`residualsID <- unique(mymodel2e@data$lev2ID)`

Unfortunately the current

**caterpillar()** function provided with R2MLwiN does not provide a way to add your own labels to the plot, so you would need to adapt it or make the plot manually. The following is an example of how you might do this for the first example in the example at

http://www.bristol.ac.uk/cmm/media/r2ml ... ide03.Rout:

Code: Select all

```
caterpillarlab <- function(y, x, ranks, qtlow, qtup, xlab = "", ylab = "", xlim = NULL, ylim = NULL, main = "") {
# This function draws a caterpillar plot
if (is.null(ylim)){
ylim = c(min(qtlow), max(qtup))
}
plot(x, y[rankno], xlim = xlim, ylim = ylim, pch = 15, xlab = xlab, ylab = ylab, main = main, xaxt="n")
points(x, qtlow[rankno], pch = 24, bg = "grey")
points(x, qtup[rankno], pch = 25, bg = "grey")
for (i in 1:length(x)) {
lines(rep(x[i], 2), c(qtlow[rankno][i], qtup[rankno][i]))
}
if (!is.null(names(x))) {
axis(side=1, at=x, labels = names(x)[rankno], cex.axis = 0.5)
} else {
if (is.factor(x)) {
axis(side=1, at=x, labels = levels(x)[rankno], cex.axis = 0.5)
} else {
axis(side=1, at=x, labels = x[rankno])
}
}
}
library(R2MLwiN)
data(tutorial, package = "R2MLwiN")
(mymodel1 <- runMLwiN(normexam ~ 1 + (1 | school) + (1 | student), data = tutorial, estoptions = list(resi.store = TRUE)))
residuals <- mymodel1@residual$lev_2_resi_est_Intercept
residualsCI <- 1.96 * sqrt(mymodel1@residual$lev_2_resi_var_Intercept)
residualsRank <- rank(residuals)
rankno <- order(residualsRank)
residualsID <- unique(mymodel1@data$school)
xaxis <- 1:65
names(xaxis) <- residualsID
# Note, xaxis could have been a factor instead, as below:
# xaxis <- factor(1:65, labels=unique(mymodel1@data$school))
caterpillarlab(y = residuals, x = xaxis, ranks=rankno, qtlow = (residuals - residualsCI), qtup = (residuals + residualsCI), xlab = 'Rank', ylab = 'Intercept')
```

### Re: Identifying residuals

Posted: **Thu Jul 05, 2018 1:03 pm**

by **pwalthery**

Thank you for the reply as well as for the graphing tips.

It looks like there is something wrong with the model specification then, as

Code: Select all

`length(unique(mymodel0e@data$lev2ID))`

provides the expected answer (25), but

Code: Select all

`length(unique(mymodel2e@residual$lev_2_resi_est_Intercept))`

gives 943. This is indeed a two level variance component model, in its simplest form:

Code: Select all

```
mymodel2e <- runMLwiN(mwept.nenj ~ 1 +(1 | lev2ID)+(1 | pid),
data = reg.dat ,estoptions = list(resi.store = T))
```

PID (level 1) uniquely identifies the observations and lev2ID is distributed as follows:

Code: Select all

```
table(reg.dat$lev2ID)
11 12 21 22 23 24 31 32 33 34 35 41 42 51 52 53 54 61 62 71 72 81 82 91 92
230 73 76 26 144 80 25 99 37 64 129 156 41 39 61 53 40 169 42 111 25 63 84 58 127
```

Am I missing something obvious? Both PID and lev2ID are numerical and reg.dat is data frame.

### Re: Identifying residuals

Posted: **Thu Jul 05, 2018 1:11 pm**

by **pwalthery**

In case this is helpful , here is the R2MLwiN output:

Code: Select all

```
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: 3) multilevel model (Normal)
N min mean max N_complete min_complete mean_complete max_complete
lev2ID 25 25 82.08 230 25 25 82.08 230
Estimation algorithm: IGLS Elapsed time : 0.3s
Number of obs: 2052 (from total 2052) The model converged after 3 iterations.
Log likelihood: -3389.5
Deviance statistic: 6779
---------------------------------------------------------------------------------------------------
The model formula:
mwept.nenj ~ 1 + (1 | lev2ID) + (1 | pid)
Level 2: lev2ID Level 1: pid
---------------------------------------------------------------------------------------------------
The fixed part estimates:
Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval]
Intercept 4.70063 0.03188 147.46 0 *** 4.63815 4.76311
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the lev2ID level:
Coef. Std. Err.
var_Intercept 0.86884 0.06859
---------------------------------------------------------------------------------------------------
The random part estimates at the pid level:
Coef. Std. Err.
var_Intercept 0.85364 0.05156
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
```

### Re: Identifying residuals

Posted: **Thu Jul 05, 2018 1:20 pm**

by **ChrisCharlton**

That does look like something is wrong, as I would expect **mymodel2e@residual$lev_2_resi_est_Intercept** to have a length of 25. Could you confirm that your data are sorted by the level-2 identifier? If so, could you add the **debugmode=TRUE** option into the **estoptions** part of your specification and then check the **Model->Hierarchy viewer** within MLwiN once it opens to check that it matches your expectations. Finally I would suggest looking at the contents of mymodel2e@residual$lev_2_resi_est_Intercept to see whether it looks sensible or if it has been padded with NA values.

### Re: Identifying residuals

Posted: **Fri Jul 13, 2018 1:42 pm**

by **pwalthery**

Oops. The level 2 variable was not sorted. I had completely forgotten this was a requirement.

Sorry for wasting your time.