Dear Bill, dear all,
I am afraid I have a last question on this matter. The results for the ICCs/VPCs are very sensitive to the chosen starting values.
Code: Select all
*three non-nested random terms
matrix b = (.20,.20,.20,.20,.20)
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsb(b) discrete(distribution(binomial) link(logit) denom(cons)) nopause
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
display [RP4]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
display [RP3]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
display [RP2]var(cons)/([RP4]var(cons) + [RP3]var(cons) + [RP2]var(cons) + (_pi^2)/3)
If I now re-run the last bit several times the results change in an order of magnitude from ICC1 being ~0.015 to 0.005.
(i.e.:
Code: Select all
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
....
I know, ICCs/VPCs from an empty logit model are approximations and somewhat shaky, and these ICCs above are very small, but I would like to understand what I can do in order to minimize this issue. Would looping over the last snippet and running it 10 (100?) times be a solution (assuming that because of "initsprevious" the difference should get smaller over time?) Is there a more elegant way like bootstrapping that would be beneficial?
In a similar vein, whether I start with "matrix b = (.20,.20,.20,.20,.20)" and then refer to these arbitrary values with initsprevious (first model) or whether I leave this out in the very first line (i.e.
Code: Select all
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) discrete(distribution(binomial) link(logit) denom(cons)) nopause
and then
Code: Select all
runmlwin DV cons, level4(cluster1: cons) level3(cluster2: cons) ///
level2(cluster3: cons) level1(ID: ) mcmc(cc on) initsprevious discrete(distribution(binomial) link(logit) denom(cons)) nopause
has a large impact on the final result even after updating the model several times. What would be preferable, w or w/o the manually set initial values? Am I right to assume that it shouldn't matter eventually once I have run the model 100 times?
Thank you very much for your time in advance.
Best
Johannes
PD. This is likely related to a small sample size issue; there, I have 3000 individuals in three cross-classified clusters. When using an alternative database with 100.000 the changes in ICCs seem to be smaller.