Non-convergence of runmlwin model

andrewjdbell · Post by **andrewjdbell** » Tue Jul 07, 2015 11:58 am

I have a model which I can make converge in MLwiN, but cannot make converge in runmlwin.

Specifically, I have a 2-level negative binomial model with quite a few predictor variables. When I exclude a variable (let's call it X1) from my model, the model doesn't converge when run from Stata (even with maxi() set high). The model runs fine when X1 is included, although the coefficient associated with X1 is very non-significant.

However, when I run the model with the variable X1 included, and without the 'nopause' option, press 'resume macro' to let the model run in MLwiN, then from the MLwiN GUI, remove the variable X1 and then press 'More', the model then converges fine. But I then can't send this model back to Stata.

With runmlwin, I tried using the initsprevious option, and then the initsb and initsv options (with matrices based on the model with the relevant rows/columns removed), but this didn't seem to help. Any ideas to solve the problem / what the difference between my two methods could be?

Thanks,
Andy

ChrisCharlton · Post by **ChrisCharlton** » Tue Jul 07, 2015 12:21 pm

Does the model converge if you run the model that you want without nopause, then choose "Abort Macro" in MLwiN, followed by pressing "start"?

When the model is failing to converge, are the estimates that it gets to anywhere near those when it does converge?

andrewjdbell · Post by **andrewjdbell** » Tue Jul 07, 2015 1:37 pm

Hi Chris,

1) no, doesn't converge pressing start;

2) They are the same order of magnitude and in the same direction, so not way off/exploding, but the differences aren't negligible (e.g. a coefficient of 0.14 coming out as 0.20). It also seems to be stuck oscillating between two values.

Cheers,
Andy

ChrisCharlton · Post by **ChrisCharlton** » Tue Jul 07, 2015 2:12 pm

That suggests that the difference is in how the model is being set up, rather than a difference in how MLwiN is fitting it. Would you be able to run the model with the viewfullmacro option added, and then post the generated script?

Is it just removing the X1 variable that lets it converge for you, or do other variables have a similar effect?

andrewjdbell · Post by **andrewjdbell** » Tue Jul 07, 2015 3:03 pm

No problem: so this is the script from the model with the variables that converged:

ECHO 0
NOTE ***********************************************************************
NOTE MLwiN macro created by runmlwin Stata command: 7 Jul 2015, 15:54:56
NOTE See: http://www.bristol.ac.uk/cmm/runmlwin for help
NOTE ***********************************************************************

NOTE Initialise MLwiN storage
INIT 3 10000 1500 60 30

OPTS 0
NOTE Don't use worksheet for matrix storage
MEMS 1

MONI 0
NOTE Import the Stata data set into MLwiN
RSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000002.tmp'

NOTE Specify the response variable(s)
RESP 'protests_urban'

NOTE Response distribution, link function and denominator
RDIS 1 2
LFUN 3

NOTE Specify the level identifier(s)
IDEN 2 'Country'
IDEN 1 'Year'

NOTE Specify covariate(s) used anywhere in the model
ADDT 'cons'
ADDT 'lnupop_wi'
ADDT 'lnurbz_wi'
ADDT 'upz_5_wi'
ADDT 'upg_5_wi'
ADDT 'allelect_wi'
ADDT 'rchange_lag_wi'
ADDT 'fotp_score_wi'
ADDT 'fotp_sq_wi'
ADDT 'democp_lag_wi'
ADDT 'democ_sq_wi'
ADDT 'lngdp_pc_ppp_wi'
ADDT 'gdpgrow_us_wi'
ADDT 'armcon_urban_wi'
ADDT 'lnupop_bw'
ADDT 'lnurbz_bw'
ADDT 'upz_5_bw'
ADDT 'upg_5_bw'
ADDT 'allelect_bw'
ADDT 'rchange_lag_bw'
ADDT 'fotp_score_bw'
ADDT 'fotp_sq_bw'
ADDT 'democp_lag_bw'
ADDT 'democ_sq_bw'
ADDT 'lngdp_pc_ppp_bw'
ADDT 'gdpgrow_us_bw'
ADDT 'armcon_urban_bw'
ADDT 'ethno_f_bw'
ADDT 'ethno_p_bw'
ADDT 'EA_dum'
ADDT 'MA_dum'
ADDT 'NA_dum'
ADDT 'SA_dum'
ADDT '__000003'
ADDT '__000004'
ADDT '__000005'
ADDT '__000006'
ADDT '__000007'
ADDT '__000008'
ADDT '__000009'
ADDT '__00000A'
ADDT '__00000B'
ADDT '__00000C'
ADDT '__00000D'
ADDT '__00000E'
ADDT '__00000F'
ADDT '__00000G'
ADDT '__00000H'
ADDT '__00000I'
ADDT '__00000J'
ADDT '__00000K'
ADDT '__00000L'
ADDT '__00000M'
ADDT '__00000N'
ADDT '__00000O'
ADDT '__00000P'
ADDT '__00000Q'

NOTE Specify level 2 random part covariate(s)
SETV 2 'cons'

NOTE Set estimation method to be RIGLS
METH 0

NOTE Set maximum number of (R)IGLS iterations
MAXI 100

NOTE Set estimation method to be PQL2
LINE 1 2

NOTE Fit the model
STAR
BATC 1
NEXT
MONI 1
ITNU 0 b21
CONV b22

NOTE Open the equations window
WSET 15 1
EXPA 3
ESTM 2
NOTE ***********************************************************************

NOTE ***********************************************************************
NOTE Export the model results to Stata
NOTE ***********************************************************************
LINK 1 G30
NAME G30[1] '_Stats'
EDIT 7 G30[1] b21
EDIT 8 G30[1] b22
NAME c1098 '_FP_b'
NAME c1099 '_FP_v'
NAME c1096 '_RP_b'
NAME c1097 '_RP_v'
NAME c1094 '_esample'
SUM '_esample' b1
EDIT 9 G30[1] b1
PSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000005.tmp' '_FP_b' '_FP_v' '_RP_b' '_RP_v' '_Stats'
ERAS '_Stats'
LINK 0 G30
NOTE generate esample for Stata if there a missing values
SWIT b1
CASE 0:
LEAVE
CASE:
CALC '_esample' = abso('_esample' - 1)
PSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000006.tmp' '__000000' '_esample'
ENDS
EXIT

And this is the script from the one that didn't (i.e. with 'ethno_f_bw' and 'ethno_p_bw' excluded):

ECHO 0
NOTE ***********************************************************************
NOTE MLwiN macro created by runmlwin Stata command: 7 Jul 2015, 15:54:58
NOTE See: http://www.bristol.ac.uk/cmm/runmlwin for help
NOTE ***********************************************************************

NOTE Initialise MLwiN storage
INIT 3 10000 1500 58 30

OPTS 0
NOTE Don't use worksheet for matrix storage
MEMS 1

MONI 0
NOTE Import the Stata data set into MLwiN
RSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000002.tmp'

NOTE Specify the response variable(s)
RESP 'protests_urban'

NOTE Response distribution, link function and denominator
RDIS 1 2
LFUN 3

NOTE Specify the level identifier(s)
IDEN 2 'Country'
IDEN 1 'Year'

NOTE Specify covariate(s) used anywhere in the model
ADDT 'cons'
ADDT 'lnupop_wi'
ADDT 'lnurbz_wi'
ADDT 'upz_5_wi'
ADDT 'upg_5_wi'
ADDT 'allelect_wi'
ADDT 'rchange_lag_wi'
ADDT 'fotp_score_wi'
ADDT 'fotp_sq_wi'
ADDT 'democp_lag_wi'
ADDT 'democ_sq_wi'
ADDT 'lngdp_pc_ppp_wi'
ADDT 'gdpgrow_us_wi'
ADDT 'armcon_urban_wi'
ADDT 'lnupop_bw'
ADDT 'lnurbz_bw'
ADDT 'upz_5_bw'
ADDT 'upg_5_bw'
ADDT 'allelect_bw'
ADDT 'rchange_lag_bw'
ADDT 'fotp_score_bw'
ADDT 'fotp_sq_bw'
ADDT 'democp_lag_bw'
ADDT 'democ_sq_bw'
ADDT 'lngdp_pc_ppp_bw'
ADDT 'gdpgrow_us_bw'
ADDT 'armcon_urban_bw'
ADDT 'EA_dum'
ADDT 'MA_dum'
ADDT 'NA_dum'
ADDT 'SA_dum'
ADDT '__000003'
ADDT '__000004'
ADDT '__000005'
ADDT '__000006'
ADDT '__000007'
ADDT '__000008'
ADDT '__000009'
ADDT '__00000A'
ADDT '__00000B'
ADDT '__00000C'
ADDT '__00000D'
ADDT '__00000E'
ADDT '__00000F'
ADDT '__00000G'
ADDT '__00000H'
ADDT '__00000I'
ADDT '__00000J'
ADDT '__00000K'
ADDT '__00000L'
ADDT '__00000M'
ADDT '__00000N'
ADDT '__00000O'
ADDT '__00000P'
ADDT '__00000Q'

NOTE Specify level 2 random part covariate(s)
SETV 2 'cons'

NOTE Set estimation method to be RIGLS
METH 0

NOTE Set maximum number of (R)IGLS iterations
MAXI 100

NOTE Set estimation method to be PQL2
LINE 1 2

NOTE Fit the model
STAR
BATC 1
NEXT
MONI 1
ITNU 0 b21
CONV b22

NOTE Open the equations window
WSET 15 1
EXPA 3
ESTM 2
NOTE ***********************************************************************

NOTE ***********************************************************************
NOTE Export the model results to Stata
NOTE ***********************************************************************
LINK 1 G30
NAME G30[1] '_Stats'
EDIT 7 G30[1] b21
EDIT 8 G30[1] b22
NAME c1098 '_FP_b'
NAME c1099 '_FP_v'
NAME c1096 '_RP_b'
NAME c1097 '_RP_v'
NAME c1094 '_esample'
SUM '_esample' b1
EDIT 9 G30[1] b1
PSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000005.tmp' '_FP_b' '_FP_v' '_RP_b' '_RP_v' '_Stats'
ERAS '_Stats'
LINK 0 G30
NOTE generate esample for Stata if there a missing values
SWIT b1
CASE 0:
LEAVE
CASE:
CALC '_esample' = abso('_esample' - 1)
PSTA 'C:\Users\Andrew\AppData\Local\Temp\ST_00000006.tmp' '__000000' '_esample'
ENDS
EXIT

FYI, the code used to run these models was:

runmlwin protests_urban cons lnupop_wi lnurbz_wi upz_5_wi upg_5_wi ///
allelect_wi rchange_lag_wi fotp_score_wi fotp_sq_wi ///
democp_lag_wi democ_sq_wi ///
lngdp_pc_ppp_wi gdpgrow_us_wi armcon_urban_wi ///
lnupop_bw lnurbz_bw upz_5_bw upg_5_bw ///
allelect_bw rchange_lag_bw fotp_score_bw fotp_sq_bw ///
democp_lag_bw democ_sq_bw ///
lngdp_pc_ppp_bw gdpgrow_us_bw armcon_urban_bw ///
ethno_f_bw ethno_p_bw EA_dum MA_dum NA_dum SA_dum i.Year, ///
level2(Country: cons) level1(Year:) discrete(distribution(nbinomial) pql2 link(log)) ///
rigls maxi(100) viewfullmacro nopause

runmlwin protests_urban cons lnupop_wi lnurbz_wi upz_5_wi upg_5_wi ///
allelect_wi rchange_lag_wi fotp_score_wi fotp_sq_wi ///
democp_lag_wi democ_sq_wi ///
lngdp_pc_ppp_wi gdpgrow_us_wi armcon_urban_wi ///
lnupop_bw lnurbz_bw upz_5_bw upg_5_bw ///
allelect_bw rchange_lag_bw fotp_score_bw fotp_sq_bw ///
democp_lag_bw democ_sq_bw ///
lngdp_pc_ppp_bw gdpgrow_us_bw armcon_urban_bw ///
EA_dum MA_dum NA_dum SA_dum i.Year, ///
level2(Country: cons) level1(Year:) discrete(distribution(nbinomial) pql2 link(log)) ///
rigls maxi(100) viewfullmacro nopause

I haven't noticed any other variables that cause the same problem, although when I exclude others in addition to the ethno_ variables, I get a different error:

->OBEY "C:\Users\Andrew\AppData\Local\Temp\ST_00000007.tmp"

error while obeying batch file C:\Users\Andrew\AppData\Local\Temp\ST_00000007.tmp at line number 98:
calc g9[1 = 'P'^0.5

Numeric error(s) in calculate command. Affected entries replaced with system missing.

In which case, no estimates are produced at all.

andrewjdbell · Post by **andrewjdbell** » Wed Jul 08, 2015 9:59 am

Hi Chris - I've found a solution: it seems that mean-centring my variables solves the problem (which works for me, as I probably should have done that anyway!).
It doesn't explain why MLwiN and runmlwin were doing different things, but it does solve my immediate problem.
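
For reference, grand-mean centring in Stata can be sketched roughly as follows. The post does not say which variables were centred, so the two names in the loop are only placeholders taken from the model above, and the _c suffix is an arbitrary choice:

```stata
* Sketch only: the variable list is hypothetical, not the list actually centred
foreach v of varlist lnupop_bw lnurbz_bw {
    * -summarize, meanonly- leaves the mean in r(mean)
    summarize `v', meanonly
    generate double `v'_c = `v' - r(mean)
}
```

The centred copies would then take the place of the originals in the runmlwin variable list.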

Thanks for your help,
Andy

ChrisCharlton · Post by **ChrisCharlton** » Tue Jul 14, 2015 11:59 am

I think that I have now got to the bottom of the original problem. The difference that you were seeing arises because the variables that you removed have missing values, so the rows containing those missing values are excluded from the estimation. With ethno_p_bw and ethno_f_bw included, MLwiN uses 882 of 1128 cases, whereas when you remove them it uses 954 of 1128. Using More works after removing the variables because MLwiN does not recalculate which rows to include (this may be a bug) and therefore keeps using the smaller sample from when the variables were included. If you run the model through -runmlwin-, it is set up from scratch and this information is not retained. If I run:

drop if ethno_p_bw == .
drop if ethno_f_bw == .

prior to sending the model to MLwiN then it does converge, although interestingly the results aren't identical to those obtained by pressing More (although they are close).
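
A sketch of the same idea that leaves the data in memory untouched, assuming the data set from this thread is loaded. The indicator name full_sample is made up for illustration, and the model line is copied from the reduced command posted earlier in the thread. Because of the /// continuations, it would need to be run from a do-file:

```stata
* Sketch only: fit the reduced model on the rows that the full model used.
* full_sample is a hypothetical name, not something runmlwin creates.
generate byte full_sample = !missing(ethno_p_bw, ethno_f_bw)
tabulate full_sample

* Work on a temporary copy so the dropped rows come back afterwards
preserve
keep if full_sample
runmlwin protests_urban cons lnupop_wi lnurbz_wi upz_5_wi upg_5_wi ///
    allelect_wi rchange_lag_wi fotp_score_wi fotp_sq_wi ///
    democp_lag_wi democ_sq_wi ///
    lngdp_pc_ppp_wi gdpgrow_us_wi armcon_urban_wi ///
    lnupop_bw lnurbz_bw upz_5_bw upg_5_bw ///
    allelect_bw rchange_lag_bw fotp_score_bw fotp_sq_bw ///
    democp_lag_bw democ_sq_bw ///
    lngdp_pc_ppp_bw gdpgrow_us_bw armcon_urban_bw ///
    EA_dum MA_dum NA_dum SA_dum i.Year, ///
    level2(Country: cons) level1(Year:) discrete(distribution(nbinomial) pql2 link(log)) ///
    rigls maxi(100) nopause
restore
```

The tabulate line shows how many rows the two ethno_ variables remove on their own, which makes it easy to confirm that the two models really are being fitted to different samples.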
