### Re: MCMC Macro

Posted:

**Tue Jan 10, 2017 2:22 pm**This is really helpful as well! Thank you so much Chris for all your wonderful helps!!

Best,

YongJoo

Best,

YongJoo

Welcome to the MLwiN User Forum

https://www.cmm.bristol.ac.uk/forum/

https://www.cmm.bristol.ac.uk/forum/viewtopic.php?f=1&t=1903

Page **2** of **3**

Posted: **Tue Jan 10, 2017 2:22 pm**

This is really helpful as well! Thank you so much Chris for all your wonderful helps!!

Best,

YongJoo

Best,

YongJoo

Posted: **Thu Aug 10, 2017 4:38 pm**

Thank you so much again! This has been really helpful and I very much appreciate it!

Best regards,

Yongjoo

Best regards,

Yongjoo

Posted: **Sat Apr 21, 2018 2:38 am**

Dear Chris,

Thank you so much again for your wonderful support! It's been really helpful and I appreciate it a lot!

While making my projects move forward, I've got a few quick questions. Briefly, my project is as follows:

(1) outcome: depressive symptoms score (0-27), which almost looks like zero-inflated Poisson or zero-inflated negative binomial distribution (nearly 35% responded 0, and less and less for the higher scores).

(2) analysis: linear four-level analysis with MCMC

(3) macro (based on your approach)

For a null model without any covariate:

RESP 'depression'

IDEN 4 'region_id'

IDEN 3 'psu_id'

IDEN 2 'family_id'

IDEN 1 'person_id'

RDISt 1 3

RCON

ADDT

ADDT 'cons'

SETV 4 'cons'

SETV 3 'cons'

SETV 2 'cons'

SETV 1 'cons'

ESTM 1

WSET

EXPA 2

WSET

ESTM 2

WSET

EXPA 2

WSET

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 1 1 1 1 1 1

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 1

PUPN c1003 c1004

My question is, if I want to run (a) zero-inflated poisson model and (b) zero-inflated negative binomial model, what I might need to change in this code. I've looked at your previous comments, but couldn't find the ones for zero-inflated models, and would appreciate if you could kindly let me know about it.

Best wishes,

Yongjoo

Thank you so much again for your wonderful support! It's been really helpful and I appreciate it a lot!

While making my projects move forward, I've got a few quick questions. Briefly, my project is as follows:

(1) outcome: depressive symptoms score (0-27), which almost looks like zero-inflated Poisson or zero-inflated negative binomial distribution (nearly 35% responded 0, and less and less for the higher scores).

(2) analysis: linear four-level analysis with MCMC

(3) macro (based on your approach)

For a null model without any covariate:

RESP 'depression'

IDEN 4 'region_id'

IDEN 3 'psu_id'

IDEN 2 'family_id'

IDEN 1 'person_id'

RDISt 1 3

RCON

ADDT

ADDT 'cons'

SETV 4 'cons'

SETV 3 'cons'

SETV 2 'cons'

SETV 1 'cons'

ESTM 1

WSET

EXPA 2

WSET

ESTM 2

WSET

EXPA 2

WSET

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 1 1 1 1 1 1

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 1

PUPN c1003 c1004

My question is, if I want to run (a) zero-inflated poisson model and (b) zero-inflated negative binomial model, what I might need to change in this code. I've looked at your previous comments, but couldn't find the ones for zero-inflated models, and would appreciate if you could kindly let me know about it.

Best wishes,

Yongjoo

Posted: **Sat Apr 21, 2018 3:02 am**

Also, I've tried to run just poisson and negative binomial model with the above setting, just by replacing 1 (for normal distribution) with 3 (for poisson) and 8 (for negative binomial), but in both cases, the model automatically switched to normal linear model. Am wondering what additional procedure I might need to work on... Thank you for all your help again!

Posted: **Mon Apr 23, 2018 10:20 am**

Unfortunately MLwiN cannot currently fit zero-inflated models. You can allow for overdispersion, which goes some way towards accounting for the extra zeros, but this is different.

Posted: **Mon Apr 23, 2018 1:14 pm**

Thank you for your kind response, Chris!

Then, I just tried to run regular Poisson model (four-level with MCMC, offset=1 for all observations). It seems that the null model works, but when I added covariates, I got error messages as this:

- "covariance matrix is not positive-definite"

- "MCMC Error 0002: Matrix must be positive definite for inversion"

With the same covariates, I've got estimates for linear and binomial models, and am wondering what might be a good solution for this. The macro that I've been using for the Poisson model is as follows and would appreciate your support!

RESP 'depression'

IDEN 4 'region_id'

IDEN 3 'psu_id'

IDEN 2 'family_id'

IDEN 1 'person_id'

RDISt 1 1

LFUN 3

DOFFs 1 'cons'

ADDT

ADDT 'cons'

SETV 4 'cons'

SETV 3 'cons'

SETV 2 'cons'

ESTM 1

WSET

EXPA 2

WSET

ESTM 2

WSET

EXPA 2

WSET

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 1 1 1 1 1 1

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 1

PUPN c1003 c1004

MSTO '\M0'

STOR

EMODe 3

EMODe 0

METH 1

EMODe 0

METH 1

ADDT 'age'

CENT 0

ADDT 'menopause' 0

CENT 0

ADDT 'education' 3

CENT 0

ADDT 'income' 4

CENT 0

ADDT 'marrital status' 1

CENT 0

ADDT 'region' 1

CENT 0

ADDT 'comorbidity' 0

ADDT 'bmi'

CENT 0

ADDT 'wsep_3cat' 1

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 3 3 3 1 1 3

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 3

PUPN c1003 c1004

MSTO '\M1'

STOR

Thank you for your help again,

Yongjoo

Then, I just tried to run regular Poisson model (four-level with MCMC, offset=1 for all observations). It seems that the null model works, but when I added covariates, I got error messages as this:

- "covariance matrix is not positive-definite"

- "MCMC Error 0002: Matrix must be positive definite for inversion"

With the same covariates, I've got estimates for linear and binomial models, and am wondering what might be a good solution for this. The macro that I've been using for the Poisson model is as follows and would appreciate your support!

RESP 'depression'

IDEN 4 'region_id'

IDEN 3 'psu_id'

IDEN 2 'family_id'

IDEN 1 'person_id'

RDISt 1 1

LFUN 3

DOFFs 1 'cons'

ADDT

ADDT 'cons'

SETV 4 'cons'

SETV 3 'cons'

SETV 2 'cons'

ESTM 1

WSET

EXPA 2

WSET

ESTM 2

WSET

EXPA 2

WSET

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 1 1 1 1 1 1

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 1

PUPN c1003 c1004

MSTO '\M0'

STOR

EMODe 3

EMODe 0

METH 1

EMODe 0

METH 1

ADDT 'age'

CENT 0

ADDT 'menopause' 0

CENT 0

ADDT 'education' 3

CENT 0

ADDT 'income' 4

CENT 0

ADDT 'marrital status' 1

CENT 0

ADDT 'region' 1

CENT 0

ADDT 'comorbidity' 0

ADDT 'bmi'

CENT 0

ADDT 'wsep_3cat' 1

METH 1

MAXI 20

BATC 1

START

EMOD 3

MCMC 0 500 1 5.8 50 10 3 3 3 1 1 3

MCMC 1 5000 1 c1090 c1091 c1003 c1004 1 3

PUPN c1003 c1004

MSTO '\M1'

STOR

Thank you for your help again,

Yongjoo

Posted: **Mon Apr 23, 2018 1:29 pm**

The error message that you are seeing suggests that the starting values (provided by IGLS) for one or more of your higher level co-variance matrices is non-invertable. A common symptom of this might be a zero element on the diagonal of this matrix. You should be able to check this via the equation window. Assuming that your model specification makes sense a workaround for this would be to specify your own starting values, however I would suggest trying a range of these to test the sensitivity of the estimates to the values that you provide.

Posted: **Mon Apr 23, 2018 1:47 pm**

Hi Chris,

Thank you for your advice!

I've checked the covariance matrix (in the null model where it works) and it seems the variance estimates were:

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.034 (0.016)

- the third highest (family): 1.0009 (0.042)

by MCMC estimation

- the highest level (region): 0.005(0.006)

- the second highest (neighborhood): 0.019 (0.017)

- the third highest (family): 0.918 (0.061)

Then, with the set of covariates,

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.021 (0.013)

- the third highest (family): 0.861 (0.037)

by MCMC estimation

- got error messages....

Then, actually, I am wondering what might be good ways to specify my own starting values

- (a) what might be reasonable ranges of those values, and

- (b) technically, how to impose those values in MLwiN (i.e., any function or macro to let MLwiN account for those values?)

Thank you again!

Thank you for your advice!

I've checked the covariance matrix (in the null model where it works) and it seems the variance estimates were:

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.034 (0.016)

- the third highest (family): 1.0009 (0.042)

by MCMC estimation

- the highest level (region): 0.005(0.006)

- the second highest (neighborhood): 0.019 (0.017)

- the third highest (family): 0.918 (0.061)

Then, with the set of covariates,

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.021 (0.013)

- the third highest (family): 0.861 (0.037)

by MCMC estimation

- got error messages....

Then, actually, I am wondering what might be good ways to specify my own starting values

- (a) what might be reasonable ranges of those values, and

- (b) technically, how to impose those values in MLwiN (i.e., any function or macro to let MLwiN account for those values?)

Thank you again!

Posted: **Mon Apr 23, 2018 2:17 pm**

The estimate (and standard error) of zero for the region level look a bit suspicious. I would suggest checking that your data are sorted correctly (for example by checking the number of units are as expected in the hierarchy window) and whether this estimate makes sense.

I have just also noticed that you are running the MCMC version of your "m0" model as a normal response model (the last value in the MCMC commands is**1**). For Poisson response models this needs to be **3**. MLwiN should check and correct this, but it is worth checking that this is happening. You do change this for the "m1" model.

I would suggest checking the IGLS version of "m1" to see what the estimates look like there, as those will be the actual values used as MCMC starting values.

MCMC in MLwiN uses the values of the current parameter estimates stored on the worksheet as starting values. The fixed-part estimates are stored in the column C1098 and the random-part estimates are stored in C1096. If you change these (for example with the EDIT or JOIN commands) then the new values will be picked up instead of those put there by IGLS.

I can't advise as to what reasonable starting values would be, but you basically want to make sure that wherever you start the chains converge back to the same range of estimates. Note that MCMC in MLwiN also uses the starting values for the random-part co-variance matrix in the Wishart priors, unless you specify your own.

I have just also noticed that you are running the MCMC version of your "m0" model as a normal response model (the last value in the MCMC commands is

I would suggest checking the IGLS version of "m1" to see what the estimates look like there, as those will be the actual values used as MCMC starting values.

MCMC in MLwiN uses the values of the current parameter estimates stored on the worksheet as starting values. The fixed-part estimates are stored in the column C1098 and the random-part estimates are stored in C1096. If you change these (for example with the EDIT or JOIN commands) then the new values will be picked up instead of those put there by IGLS.

I can't advise as to what reasonable starting values would be, but you basically want to make sure that wherever you start the chains converge back to the same range of estimates. Note that MCMC in MLwiN also uses the starting values for the random-part co-variance matrix in the Wishart priors, unless you specify your own.

Posted: **Mon Apr 23, 2018 3:43 pm**

This is really helpful and thank you again for the details!

(1) Re the random part estimates for the region (highest level), I checked the data and it appears that the region (n=16) is sorted correctly. It's min=1, max=16, and categorical=true. In the view/edit data window, I can see that obs are sorted from region_1 thru region_16...

(2) Re the MCMC macro for "m0", thanks for the catch, and it seems I somehow mixed up when I copied and pasted it from my MLwiN window to here, but the actual model was estimated by using the code consistent with "m1" model (using "3").

(3) Re the IGLS version of "m1", I just updated the random part results in my previous post (as follows).

(i) M0 model

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.034 (0.016)

- the third highest (family): 1.0009 (0.042)

by MCMC estimation

- the highest level (region): 0.005(0.006)

- the second highest (neighborhood): 0.019 (0.017)

- the third highest (family): 0.918 (0.061)

(ii) M1 model with the set of covariates

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.021 (0.013)

- the third highest (family): 0.861 (0.037)

by MCMC estimation -> error messages..

(4) Re C1096 and C1098, I didn't know about it and thank you so much for this information.

In my C1096, there are n=4 with min=0 and max=1, and

in my C1098, there are n=14 with min=-0.108 and max=0.529.

Then, could you give me a little bit more information on how I can actually edit this part? I searched MLwiN Macro on EDIT or JOIN, and what I found was something like this: "JOIN C201 20 c201"... So, can I try e.g., "JOIN C1096 xx C1096"??

(5) Re "you basically want to make sure that wherever you start the chains converge back to the same range of estimates", could you tell me a bit more about this? How can I check whether the chains converge back to the same range of estimates?

Again, thank you so much for all these useful comments!

(1) Re the random part estimates for the region (highest level), I checked the data and it appears that the region (n=16) is sorted correctly. It's min=1, max=16, and categorical=true. In the view/edit data window, I can see that obs are sorted from region_1 thru region_16...

(2) Re the MCMC macro for "m0", thanks for the catch, and it seems I somehow mixed up when I copied and pasted it from my MLwiN window to here, but the actual model was estimated by using the code consistent with "m1" model (using "3").

(3) Re the IGLS version of "m1", I just updated the random part results in my previous post (as follows).

(i) M0 model

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.034 (0.016)

- the third highest (family): 1.0009 (0.042)

by MCMC estimation

- the highest level (region): 0.005(0.006)

- the second highest (neighborhood): 0.019 (0.017)

- the third highest (family): 0.918 (0.061)

(ii) M1 model with the set of covariates

by IGLS estimation:

- the highest level (region): 0.000(0.000)

- the second highest (neighborhood): 0.021 (0.013)

- the third highest (family): 0.861 (0.037)

by MCMC estimation -> error messages..

(4) Re C1096 and C1098, I didn't know about it and thank you so much for this information.

In my C1096, there are n=4 with min=0 and max=1, and

in my C1098, there are n=14 with min=-0.108 and max=0.529.

Then, could you give me a little bit more information on how I can actually edit this part? I searched MLwiN Macro on EDIT or JOIN, and what I found was something like this: "JOIN C201 20 c201"... So, can I try e.g., "JOIN C1096 xx C1096"??

(5) Re "you basically want to make sure that wherever you start the chains converge back to the same range of estimates", could you tell me a bit more about this? How can I check whether the chains converge back to the same range of estimates?

Again, thank you so much for all these useful comments!