Bootstrap runmlwin

johannesmueller · Post by **johannesmueller** » Fri Jul 02, 2021 4:06 pm

Dear all

I would like to bootstrap over two regressions, one estimate in Stata, and one using runmlwin.

This is the way I would set it up. Would there be any concerns associated with such an approach from (run)MLwiN side?

Many thanks in advance!

Code: Select all

program bootMLM
	glm A B C, r  	 // first stage 
	predict R, response

	runmlwin DV cons INDVAR B R , ///	
		level3(GEO: cons) level2(GEO2: cons) level1(ID) discrete(distribution(binomial) link(logit) denom(cons)) nopause forcesort zratio level(99)
		drop R 
end program 

bootstrap, reps(500) cluster(GEO): bootMLM

ChrisCharlton · Post by **ChrisCharlton** » Fri Jul 09, 2021 1:37 pm

I checked this with George, and he recommended that you would probably want to change the runmlwin estimation option to be pql2 instead of the default mql1 for this.

johannesmueller · Post by **johannesmueller** » Fri Jul 09, 2021 5:09 pm

Thank you for getting back to me, Chris, highly appreciated.

It turns out that I am stuck at an earlier stage; accessing the results after the bootstrap has finished.

The program below runs all iterations, but culminates in a "name conflict". When adding "eclass",

Code: Select all

program bootMLM, eclass

the program does not run through either.

Would you have any suggestions on how I can access the bootstrapped results from (run)MLwiN?

That'd be fantastic!

ChrisCharlton · Post by **ChrisCharlton** » Mon Jul 12, 2021 10:24 am

This is not something that we are particularly familiar with, however the examples I found seem to add rclass rather than eclass to the bootstrap program and appear to require you to pass on any required outputs from the estimation command using return.

Bootstrap sampling and estimation

How do I write my own bootstrap program

johannesmueller · Post by **johannesmueller** » Fri Jul 23, 2021 11:00 am

Hi Chris,

Many thanks for these links.

I am afraid rclass is not workable; I have run a trial. Generally, I think eclass would contain estimation results and rclass e.g. R2 / log-likelihood / etc., but then again I am not sure I have a firm grasp as to how MLwiN returns the results to Stata.

I am essentially facing two hurdles that impede me from leveraging MLwiN and I would be grateful for any help.

Do you have any worked bootstrap example / Do File?

I appreciate that making MLwiN compatible with the margins command in Stata likely is a very resource-consuming undertaking. At the same time, as the emphasis on the visual communication of findings is growing, all the more so in models containing interactions, it seems like being able to obtain marginal effects and predicted probabilities (with CIs) would be important. There were some conversations here on leveraging MCMC models for that purpose, but I am not entirely sure whether they have led to a best-case practice worked example that could be deployed? Anything that would lower the set-up costs of getting this to work would be great

Essentially, I am fully convinced of the superior performance of MLwiN compared to standard xtmixed /meglm routines in Stata but it feels tricky to ensure that all analytical steps can be conducted consistently based on MLwiN / runmlwin.

Best wishes
J

johannesmueller · Post by **johannesmueller** » Fri Jul 23, 2021 11:28 am

PD. To be more specific, when I run a sandbox example (hence only 10 repetitions), and using e-class, all seems to work well until I get a "name conflict"

Code: Select all

capture program drop myreg
program myreg, eclass 
	tempname bb
	capture drop Res_X_uhat
						
	reg A B C if pickone_c==1, r 
	predict Res_X_uhat, res
						
	runmlwin DV cons A B Res_X_uhat, ///	 
		level3(GEO: cons) level2(GEO2: cons) level1(ID) discrete(distribution(binomial) link(logit) denom(cons)) ///
		nopause forcesort zratio maxiterations(5000) level(99)
			matrix `bb'=e(b)
			ereturn post `bb'
			ereturn local cmd="bootstrap"
end 
bootstrap _b, reps(10) seed(123)  nowarn force cluster(GEO2): myreg

i.e. this is the output:

Bootstrap replications (10)
1 ---+--- 2 - -+-- 3 ---+--- 4 ---+--- 5
..........
name conflict
r(507);

ChrisCharlton · Post by **ChrisCharlton** » Fri Jul 23, 2021 12:36 pm

If you could adapt your example to a dataset that I could use to replicate this (for example the MLwiN tutorial data) then I can attempt to work out where this error is coming from.

johannesmueller · Post by **johannesmueller** » Fri Jul 23, 2021 3:51 pm

Dear Chris,

Thank you very much for your very generous offer!

Please find a stylized sandbox example below, it is based on the "bang" data and the example given in the runmlwin helpfile.
Substantively, it of course doesn't make much sense; it assumes d_pray to be related to use only through d_lit.
It does a good job at reproducing the "name conflict r(507);" challenge I am facing, though.

It would be tremendous if you could look into it. Many thanks again and have a nice weekend in advance!

Best wishes

Code: Select all

					clear all
					. use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
					. generate lc1 = (lc==1)
					. generate lc2 = (lc==2)
					. generate lc3plus = (lc>=3)
					
					foreach var of varlist hindu urban {
						bysort district: egen M_`var' =mean(`var')
						}
						egen pickone_c= tag(district)
					
					capture program drop myreg
					program myreg, eclass
						tempname bb
						capture drop R
						
						reg  d_lit M_urban M_hindu d_pray if pickone_c==1, r  // district level regression
						predict R, res
						
						runmlwin use cons lc1 lc2 lc3plus age d_lit M_urban M_hindu R, ///
							level2(district: cons) level1(woman) discrete(distribution(binomial) link(logit) denominator(cons)) ///
							nopause forcesort zratio level(99)
						matrix `bb'=e(b)
						ereturn post `bb'
						*ereturn local cmd="bootstrap"
					end 
					bootstrap _b, reps(4) seed(123) force cluster(district): myreg

ChrisCharlton · Post by **ChrisCharlton** » Mon Jul 26, 2021 2:38 pm

It seems that the bootstrap command doesn't like something about the way that we have named the output model parameters (I suspect the var(cons) parameter). If I adjust your example to rename this before returning the results from myreg then the following does appear to give results:

Code: Select all

clear all
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)

foreach var of varlist hindu urban {
	bysort district: egen M_`var' =mean(`var')
}
	
egen pickone_c = tag(district)

capture program drop myreg
program myreg, eclass
	preserve
	reg d_lit M_urban M_hindu d_pray if pickone_c==1, r  // district level regression
	predict R, res
	
	runmlwin use cons lc1 lc2 lc3plus age d_lit M_urban M_hindu R, ///
		level2(district: cons) level1(woman) discrete(distribution(binomial) link(logit) denominator(cons)) ///
		nopause forcesort zratio level(99)

	tempname bb
	matrix `bb' = e(b)
	matrix colnames `bb' = cons lc1 lc2 lc3plus age d_lit M_urban M_hindu R var_cons bcons
	ereturn post `bb'
		
	restore
end 
bootstrap _b, reps(4) seed(123) cluster(district): myreg

johannesmueller · Post by **johannesmueller** » Mon Jul 26, 2021 4:40 pm

Dear Chris,

Your solution worked like a charm, many thanks for your guidance!!

May I follow up with one further question? Suppose we expect the effect of educ to decline in d_lit. Without predicted probability and marginal effects plots it is hard to interpret this logit interaction model credibly.

Is there any way to obtain the predicted probabilities of use at various levels of educ and d_lit (ideally with CIs) as well as the marginal effects? I have seen the suggestion to use MCMC chains for that here in the forum, but I am not entirely sure how this would be worked out. Would you have any good case practice on how to get there, i.e. how "imitate" the margins command of Stata when the model was estimated in MLwiN?

Code: Select all

					 
ar all
use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear
generate lc1 = (lc==1)
generate lc2 = (lc==2)
generate lc3plus = (lc>=3)

foreach var of varlist hindu urban {
	bysort district: egen M_`var' =mean(`var')
}
	 
*Stata	 
logit use cons lc1 lc2 lc3plus age  M_urban M_hindu i.educ##c.d_lit, vce(cluster district) // using "logit" rather than "melogit" for illustration to speed up estimation 
est sto m
margins, noatlegend at(educ==(1(1)4) d_lit==(0(0.1)0.3)) atmeans post  
marginsplot, xdimension(d_lit) recast(line) 

est resto m
margins, dydx(d_lit) at(educ==(1(1)4)) atmeans post    
marginsplot, xdimension(educ) recast(line) yline(0)
 
 
 
*MLwiN
gen double educ_X_d_lit = educ*d_lit // new moderation of interest
  
runmlwin use cons lc1 lc2 lc3plus age d_lit M_urban M_hindu  educ educ_X_d_lit , ///
		level2(district: cons) level1(woman) discrete(distribution(binomial) link(logit) denominator(cons)) ///
		nopause forcesort zratio level(99)

www.cmm.bristol.ac.uk/forum

Bootstrap runmlwin

Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin

Re: Bootstrap runmlwin