Different ESS in MLwiN and R2MLwiN

Welcome to the forum for R2MLwiN users. Feel free to post your question about R2MLwiN here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go to R2MLwiN: Running MLwiN from within R >> http://www.bris.ac.uk/cmm/software/r2mlwin/
andrewjdbell
Posts: 17
Joined: Mon Jun 03, 2013 3:19 pm

Different ESS in MLwiN and R2MLwiN

Post by andrewjdbell »

I'm getting quite different ESS scores in my models comparing R2MLwiN and MLwiN, using the same models and data.

With MLwiN, I get the following results:

Code: Select all

	       model	S.E.	ESS
Response	deaths		
			
Fixed Part			
cons	         -2.227	0.028	1159
(age-gm)	  0.048	0.001	511
(age-gm)^2     0.001	0.000	18439
(year-gm)	 -0.031	0.001	512
			
Random Part			
Level: birth_year			
Var(cons)	  0.044	0.006	910595
Level: year			
Var(cons)	  0.031	0.005	830212
Level: c1			
Var(cons)			
Var(bcons.1) 1.000	0.000	0
			
Units: birth_year	101		
Units: year	95		
Units: c1	6163		
Estimation: 	MCMC		
DIC: 	3558279.237		
pD: 	195.819		
Burnin: 	500000		
Chain Length: 	1000000		
Thinning: 	1
And in R2MlwiN I'm getting the following:

Code: Select all

-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- 
MLwiN (version: 3.01)  multilevel model (Poisson) 
             N min     mean max N_complete min_complete mean_complete max_complete
year        95  23 64.87368  91         95           23      64.87368           91
birth_year 101  17 61.01980  91        101           17      61.01980           91
Estimation algorithm:  MCMC      Cross-classified              Elapsed time : 1729.97s 
Number of obs:  6163 (from total 6163)          Number of iter.: 1e+06  Chains: 1  Burn-in: 5e+05 
Bayesian Deviance Information Criterion (DIC)
Dbar      D(thetabar)    pD      DIC
3558095.934 3557900.101 195.833    3558291.767 
--------------------------------------------------------------------------------------------------- 
The model formula:
log(deaths) ~ 1 + offset(logexp) + agec + agec2 + yearc + (1 | 
    year) + (1 | birth_year)
Level 3: year     Level 2: birth_year     Level 1: l1id      
--------------------------------------------------------------------------------------------------- 
The fixed part estimates:  
               Coef.   Std. Err.        z     Pr(>|z|)       [95% Cred.   Interval]     ESS 
Intercept   -2.20986     0.02754   -80.25            0  ***    -2.26254    -2.15503     471 
agec         0.04807     0.00060    80.17            0  ***     0.04703     0.04927      20 
agec2        0.00122     0.00000  2277.38            0  ***     0.00122     0.00122   16221 
yearc       -0.03163     0.00089   -35.61   1.063e-277  ***    -0.03339    -0.02993     728 
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1  
--------------------------------------------------------------------------------------------------- 
The random part estimates at the year level: 
                  Coef.   Std. Err.   [95% Cred.   Interval]      ESS 
var_Intercept   0.03078     0.00462      0.02303     0.04107   954338 
--------------------------------------------------------------------------------------------------- 
The random part estimates at the birth_year level: 
                  Coef.   Std. Err.   [95% Cred.   Interval]      ESS 
var_Intercept   0.04389     0.00639      0.03311     0.05806   178671 
--------------------------------------------------------------------------------------------------- 
The random part estimates at the l1id level: 
                Coef.   Std. Err.   [95% Cred.   Interval]     ESS 
var_bcons_1   1.00000     0.00000      1.00000     1.00000   1e+06 
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- 
The latter was made with the following code:

Code: Select all

model1 <- runMLwiN(log(deaths) ~ 1 + offset(logexp) + agec + agec2 +
  yearc + (1 | year) + (1 | birth_year), D= "Poisson",
  estoptions = list(xc = TRUE, EstM = 1,
      mcmcOptions = list(hcen = 3),
      mcmcMeth = list(burnin=500000, iterations=1000000)
      ),
  data=UK_fem2)
So for my purposes, the problematic one (in R2MLwiN) is the ESS associated with the age-gm term.
The variables are all the same (and centred the same), both models are cross-classified and have hierarchical centring specified. Both have starting values from IGLS and long burn ins (500k) and chains (1mil). I haven't messed with the default priors. Is there a default that might be different with one and not the other?

Thanks!
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

R2MLwiN uses the effectiveSize() function from the coda package to calculate the ESS, which has the following code:

Code: Select all

effectiveSize <- function(x)
{
  if (is.mcmc.list(x))
    {
      ##RGA changed to sum across all chains
      ess <- do.call("rbind",lapply(x,effectiveSize))
      ans <- apply(ess,2,sum)
    }
  else
    {
      x <- as.mcmc(x)
      x <- as.matrix(x)
      spec <- spectrum0.ar(x)$spec
      ans <- ifelse(spec==0, 0, nrow(x) * apply(x, 2, var)/spec)
    }
  return(ans)
}
whereas MLwiN uses a different algorithm, which looks something like:

Code: Select all

ESS <- function(x) {
	n <- length(x)
	meanx <- mean(x)
	varx <- var(x)
	nbig <- 1000
	ybig <- rep(0, nbig)
	
	for (k in 1:nbig) {
		for (t in 0:(n - k)) {
			ybig[k] <- ybig[k] + ((x[t + 1] - meanx) * (x[t + k] - meanx))
		}
		ybig[k] <- ybig[k] / (n * varx)
	}

	act <- 0.0
	for (i in 2:nbig) {
		if ((i > 5) && (ybig[i] < 0.1)) {
			break
		}
		act <- act + ybig[i]
	}
	act <- act * 2
	act <- act + 1
	return(as.integer(n / act))
}
(NOTE: This will be slow as it isn't vectorised).
Would this explain the differences that you are seeing?
andrewjdbell
Posts: 17
Joined: Mon Jun 03, 2013 3:19 pm

Re: Different ESS in MLwiN and R2MLwiN

Post by andrewjdbell »

Possibly - not sure my coding ability is good enough to be sure - I'll see if I can work it out! Although given the parameter estimates are (slightly) different as well, possible that there's something else that's different as well?

(normally wouldn't be fussed by the slight differences in MCMC but as they have the same (IGLS) starting values and uber-long chains you'd expect them to be identical?)

Btw the hierarchical centering doesn't seem to make a difference in either software so it isn't anything to do with that...

Thanks Chris!
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

Another thing to check if the parameter estimates are different with the same starting values is what you are using as the seed. MLwiN will use a seed of thirteen if you start with a blank worksheet whereas the default seed in R2MLwiN is one (which matches the seed used in the sample worksheets).
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

Here is a more vectorised version of the MLwiN algorithm (which also fixes an off-by-one error in the final for loop):

Code: Select all

ESS <- function(x) {
	n <- length(x)
	meanx <- mean(x)
	varx <- var(x)
	nbig <- 1000
	ybig <- rep(0, nbig)
	
	for (k in 1:nbig) {
		ybig[k] <- sum((x[1:(n - k + 1)] - meanx) * (x[k:n] - meanx)) / (n * varx)
	}

	act <- 0.0
	for (i in 2:nbig) {
		if ((i > 6) && (ybig[i] < 0.1)) {
			break
		}
		act <- act + ybig[i]
	}
	act <- act * 2
	act <- act + 1
	return(as.integer(n / act))
}
andrewjdbell
Posts: 17
Joined: Mon Jun 03, 2013 3:19 pm

Re: Different ESS in MLwiN and R2MLwiN

Post by andrewjdbell »

Thanks for this Chris! That function seems to work and gives numbers that are pretty comparable to what I got in MLwiN.

I guess the question then is: which one is right? I'd love to just use the MLwiN ones as it gives me the answer I want, but would be nice to have more of a justification than that (unfortunately my knowledge of time-series statistics is letting me down in understanding where the differences are coming from...)

Thanks again!
Andy
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

My suspicion is that they are both right, but just calculating slightly different things. I'll mention this question to Bill to see whether he can provide a more definitive explanation.
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

I talked to Bill about this and he pointed out that to speed things up the MLwiN ESS code it only calculates the first 1000 correlations, and then halts the calculation of the ESS earlier if the correlation is lower than 0.1 after the first 5. You could therefore test whether this is accounting for the differences by increasing the values of nbig and removing the statement:

Code: Select all

		if ((i > 5) && (ybig[i] < 0.1)) {
			break
		}
from the function.

He also noticed that you are entering the classifications in a different order between MLwiN and R2MLwiN and was therefore wondering whether perhaps you had specified hierarchical centring but chosen a different level between the two versions.
andrewjdbell
Posts: 17
Joined: Mon Jun 03, 2013 3:19 pm

Re: Different ESS in MLwiN and R2MLwiN

Post by andrewjdbell »

Aha - yep that's what makes the difference! (the code change, not the hierarchical centering) Thanks. Weird that it makes that much difference, given the correlations are so high between iterations in this model (see 6-way plot below). Any idea why it makes such a big difference, and would I be right in thinking that the R version is probably the 'correct' one (or are they both equally valid?)
Attachments
m13_fp_age_cent_sixway.png
m13_fp_age_cent_sixway.png (65.12 KiB) Viewed 16421 times
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: Different ESS in MLwiN and R2MLwiN

Post by ChrisCharlton »

Thanks for checking this, that's good to know. Yes, I suspect that the R version is more accurate as Bill said that the early break was added to speed up the calculation. I'll forward this thread on to him to see if he has anything to add.
Post Reply