Cross-classified logit model: Getting the ICCs

johannesmueller · Post by **johannesmueller** » Fri Oct 12, 2018 7:11 pm

Dear all,

I am trying to calculate the ICCs of a cross-classified logit model with three non-hierarchically nested random terms. I can find information on how to do this for a two-level logit model, and for more complex cross-classified models with a normally distributed dependent variable, but I am afraid I am somewhat stuck on the compounded complications that arise when the two are combined.

This is where I currently stand, lacking the crucial steps:

Code:


*three non-nested random terms
	matrix b = (.20,.20,.20,.20,.20) 
	runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsb(b) discrete(distribution(binomial) link(logit) denom(cons))  nopause    
	runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause  
	runmlwin, or		  
	mcmcsum
	mcmcsum, getchains

I would be very grateful for any insight into how I can obtain the ICCs for a 3-way cross-classified logit model.

Thanks a million in advance!

billb · Post by **billb** » Mon Oct 15, 2018 1:22 pm

Hi Johannes,
I am not sure whether you mean ICC or VPC here, but I can offer the paper I wrote on variance partitioning in multilevel logistic models (Browne et al., 2005, JRSS A, https://rss.onlinelibrary.wiley.com/doi ... 04.00365.x ). It covers overdispersed models, which are similar to 3-level nested models, and offers several approaches. The simplest, the latent variable approach, should work equally well with crossed models. Of course, ICCs are slightly different from VPCs as they apply to 2 individuals in a cluster, but in your example with only variance components this should be fine.
Best wishes,
Bill.

johannesmueller · Post by **johannesmueller** » Fri Mar 29, 2019 11:43 am

Hi Bill,

Thank you very much for your reply and the suggestion. To be honest, I appreciated your article a lot, but it is also a bit beyond my current grasp of the methodology; I am at heart an applied researcher, with all the pros and cons that entails.

Hence, I was wondering whether you could possibly give me a more applied, layman's answer?

Where would I go from here?
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause

Thank you very much in advance!

Best
J

billb · Post by **billb** » Fri Mar 29, 2019 1:11 pm

Hi Johannes,
My suggestion is to fit your model as you are doing, then extract the variance for each classification and add pi^2/3 for the binomial variation at level 1; summing these gives the total variance. The ICC for any particular classification is then its variance / total variance.
Does that make sense?
Best wishes,
Bill.
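
Written out in symbols, the latent-variable calculation described in the post above looks as follows. The subscripts are only labels for the three classifications in the thread's model (cluster1 to cluster3); the notation is an illustrative assumption, not taken from the cited paper.

```latex
\[
\mathrm{VPC}_k
  = \frac{\sigma^2_k}
         {\sigma^2_{\text{cluster1}} + \sigma^2_{\text{cluster2}} + \sigma^2_{\text{cluster3}} + \pi^2/3},
\qquad k \in \{\text{cluster1}, \text{cluster2}, \text{cluster3}\}
\]
```

Here \(\pi^2/3 \approx 3.29\) is the variance of the standard logistic distribution, which stands in for the level-1 variance on the latent scale.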

johannesmueller · Post by **johannesmueller** » Sun Mar 31, 2019 11:44 am

Dear Bill,

Thank you so much for getting back to me.
This sounds great.
Just to be 100% sure: This should get me to my goal of having the ICCs from an additive cross-classified model, correct?
In the null model, are the ICCs and VPCs identical, or is some adjustment needed to also obtain the VPC?

Code:

* Initial (non-MCMC) fit to provide starting values
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
	level2(cluster3: cons) level1(ID: ) discrete(distribution(binomial) link(logit) denom(cons)) nopause
matrix b = e(b)
matrix V = e(V)
* Cross-classified MCMC fit, started from the previous estimates
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
	level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
runmlwin, or
mcmcsum
mcmcsum, getchains
* ICC/VPC per classification: its variance over the total variance
display [RP4]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
display [RP3]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
display [RP2]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)

Best

billb · Post by **billb** » Mon Apr 01, 2019 6:27 am

Hi Johannes,
The VPC and ICC have the same formulae for all random intercept models (including your null model). It is only in random slopes models, where the variability at higher levels depends on predictor variables, that the 2 concepts diverge.
Best wishes,
Bill.
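
The display lines earlier in this thread plug the posterior means of the variances into the formula. One alternative, sketched here in Python rather than runmlwin syntax, is to apply the formula to every stored MCMC iteration, which also yields an interval for each VPC. The chains below are simulated stand-ins (gamma draws with mean 0.05), and the names `vpc_per_draw` and `summarise` are hypothetical helpers; nothing here is output from the model in the thread.

```python
import math
import random
import statistics

# Level-1 variance on the latent logistic scale (pi^2 / 3).
LOGISTIC_VAR = math.pi ** 2 / 3


def vpc_per_draw(draw):
    """Return one VPC per classification for a single draw of the variances."""
    total = sum(draw) + LOGISTIC_VAR
    return [v / total for v in draw]


def summarise(values):
    """Mean and a 95% interval taken from the empirical percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int(0.025 * (n - 1))]
    upper = ordered[int(0.975 * (n - 1))]
    return statistics.fmean(ordered), lower, upper


# ASSUMPTION: simulated stand-in for three stored variance chains
# (one column per classification), not real runmlwin output.
random.seed(1)
chains = [[random.gammavariate(4.0, 0.0125) for _ in range(3)]
          for _ in range(5000)]

vpcs = [vpc_per_draw(draw) for draw in chains]
for k, name in enumerate(["cluster1", "cluster2", "cluster3"]):
    mean, lower, upper = summarise([row[k] for row in vpcs])
    print(f"{name}: mean={mean:.4f}, 95% interval=({lower:.4f}, {upper:.4f})")
```

With real output, the simulated `chains` would be replaced by the stored variance chains, one row per iteration.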

johannesmueller · Post by **johannesmueller** » Mon Apr 01, 2019 2:07 pm

Thank you so much for all your help, Bill, absolutely fantastic and highly appreciated!
Have a great week,
Johannes

johannesmueller · Post by **johannesmueller** » Wed Apr 03, 2019 5:04 pm

Dear Bill, dear all,

I am afraid I have one last question on this matter: the results for the ICCs/VPCs are very sensitive to the chosen starting values.

Code:

*three non-nested random terms
	matrix b = (.20,.20,.20,.20,.20) 
	runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsb(b) discrete(distribution(binomial) link(logit) denom(cons))  nopause    
	runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause  
		
		display [RP4]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
		display [RP3]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
		display [RP2]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)

If I now re-run the last bit several times, the results vary substantially, with ICC1 ranging from about 0.015 down to 0.005.

That is:

Code:

runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause  
		runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause  
		runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause  
		....

I know that ICCs/VPCs from an empty logit model are approximations and somewhat shaky, and the ICCs above are very small, but I would like to understand what I can do to minimize this issue. Would looping over the last snippet and running it 10 (or 100?) times be a solution, assuming that, because of "initsprevious", the differences should shrink over time? Or is there a more elegant approach, such as bootstrapping, that would help?

In a similar vein, whether I start with "matrix b = (.20,.20,.20,.20,.20)" and pass these arbitrary values to the first model via initsb(b), or whether I leave this out and begin with a first model that has no manually set starting values, i.e.

Code:

	runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )   discrete(distribution(binomial) link(logit) denom(cons))  nopause

and then

Code:

runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
		level2(cluster3: cons) level1(ID: )  mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons))  nopause

has a large impact on the final result, even after updating the model several times. Which would be preferable: with or without the manually set initial values? Am I right to assume that it shouldn't matter eventually, once I have run the model 100 times?

Thank you very much for your time in advance.

Best
Johannes

PS. This is likely related to a small sample size issue; here I have 3,000 individuals in three cross-classified clusters. When using an alternative dataset with 100,000 individuals, the changes in the ICCs seem to be smaller.

billb · Post by **billb** » Thu Apr 04, 2019 7:05 am

Morning Johannes,
Looking at the syntax, I assume you are using runmlwin in Stata? It is probably worth reading a little more about how MCMC works, e.g. in my MCMC in MLwiN book. The IGLS algorithm for ML estimation converges to the ML estimates. MCMC instead constructs, via simulation, dependent samples from the posterior distribution, and the sample mean is then the estimate of the posterior mean. However, MCMC is run for a finite number of iterations, convergence is to a distribution rather than to a point, and the accuracy of the estimates will depend on how correlated the chains are. You should therefore always investigate the mixing of the chains and diagnostics such as the effective sample size (ESS), and not just run for the default number of iterations. I suspect the differences you are seeing arise because you haven't run the MCMC sampling for long enough; sometimes you need to run for millions of iterations (or at least many thousands).
Best wishes,
Bill.
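
To illustrate why the correlation within a chain matters, here is a small sketch in Python (again, not runmlwin syntax). It estimates an effective sample size as n / (1 + 2 * sum of autocorrelations), summing lags until the first non-positive autocorrelation. That truncation rule is a simple assumption chosen for illustration and may differ from the rule behind the ESS that MLwiN reports. The two AR(1) chains are simulated stand-ins for a well-mixing and a poorly mixing parameter, and `ar1_chain` and `effective_sample_size` are hypothetical helper names.

```python
import random


def effective_sample_size(chain, max_lag=1000):
    """Estimate ESS as n / (1 + 2 * sum of positive-lag autocorrelations).

    ASSUMPTION: the sum stops at the first non-positive autocorrelation,
    a simple truncation rule used here for illustration only.
    """
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    var = sum(c * c for c in centred)
    if var == 0.0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(n, phi, seed):
    """Simulate an AR(1) series; phi near 1 mimics a slowly mixing chain."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


well_mixed = ar1_chain(5000, 0.10, seed=1)
sticky = ar1_chain(5000, 0.95, seed=1)

print("ESS, weakly correlated chain:  ", round(effective_sample_size(well_mixed)))
print("ESS, strongly correlated chain:", round(effective_sample_size(sticky)))
```

Both chains have 5000 iterations, yet the strongly correlated one carries far less independent information. Posterior means based on it will move more from run to run, which is the kind of instability described in the question.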

www.cmm.bristol.ac.uk/forum
