problems when specifying random slopes with dummy variables

Taratara · Post by **Taratara** » Wed Feb 22, 2017 1:48 pm

Dear all,

I am very glad that I found runmlwin; it has been helping me a lot in my analyses. However, I have now run into a problem when specifying a random slope for a dummy variable: all of the corresponding random-part estimates come out as "0". If I run the same syntax with a metric (continuous) random slope variable, I get normally filled output. There should be a random slope for my dummy variable as well (at least meqrlogit told me so), so I think I may have made a mistake in my model specification. Do you have any idea what the problem might be? (I also got no error message when I ran it without the nopause option and checked the output in MLwiN.)

Thank you!

Tara

Code: Select all

runmlwin sold cons i.pitch, level2(disctrict:cons pitch)  level1(tries) discrete(distribution(binomial) link(logit)denom(cons)) nopause

Code: Select all


Run time (seconds)   =       4.15
Number of iterations =          7
------------------------------------------------------------------------------
        sold |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |   1.403182    .112137    12.51   0.000     1.183398    1.622967
    _1_pitch |  -.3634118   .1172853    -3.10   0.002    -.5932869   -.1335368
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: disctrict           |
                   var(cons) |   1.034937   .2053409      .6324766    1.437398
             cov(cons,pitch) |          0          0             0           0
                  var(pitch) |          0          0             0           0
------------------------------------------------------------------------------

Taratara · Post by **Taratara** » Wed Feb 22, 2017 2:15 pm

Update: I tried it with an MCMC model and an unstructured covariance matrix, and now I get the error message "MCMC Error 0315: Prior variance matrix is not positive definite". The prior model seems to run just fine, however (and to simplify matters I didn't even specify the random slope in the prior model).

Code: Select all

runmlwin sold cons i.pitch, level2(district:cons)  level1(tries) discrete(distribution(binomial) link(logit)denom(cons)pql2)
runmlwin sold cons i.pitch, level2(district:cons pitch)  level1(tries) discrete(distribution(binomial) link(logit)denom(cons)) mcmc(corresiduals(unstruc)) initsprev nopause

ChrisCharlton · Post by **ChrisCharlton** » Wed Feb 22, 2017 2:37 pm

Would it be possible to provide the syntax that you are using for the meqrlogit command as well?

I tried the following example using the MLwiN sample data with both commands and did get a value for the random effect:

Code: Select all

use http://www.bristol.ac.uk/cmm/media/runmlwin/bang, clear

Code: Select all

. meqrlogit use hindu || district: hindu, covariance(unstructured)

Refining starting values: 

Iteration 0:   log likelihood = -1876.8465  
Iteration 1:   log likelihood = -1870.6881  
Iteration 2:   log likelihood = -1863.5003  

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -1863.5003  
Iteration 1:   log likelihood =  -1862.921  
Iteration 2:   log likelihood = -1862.5309  
Iteration 3:   log likelihood = -1862.5164  
Iteration 4:   log likelihood = -1862.5163  

Mixed-effects logistic regression               Number of obs     =      2,867
Group variable: district                        Number of groups  =         60

                                                Obs per group:
                                                              min =          3
                                                              avg =       47.8
                                                              max =        173

Integration points =   7                        Wald chi2(1)      =       9.10
Log likelihood = -1862.5163                     Prob > chi2       =     0.0026

------------------------------------------------------------------------------
         use |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hindu |   .5034708   .1668943     3.02   0.003     .1763639    .8305777
       _cons |  -.5687622   .0891787    -6.38   0.000    -.7435492   -.3939752
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
district: Unstructured       |
                  var(hindu) |   .5146337    .229693       .214579    1.234268
                  var(_cons) |   .3146233   .0884784      .1813075    .5459664
            cov(hindu,_cons) |   -.260808   .1299126     -.5154321   -.0061839
------------------------------------------------------------------------------
LR test vs. logistic model: chi2(3) = 113.20              Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

Code: Select all

MLwiN 3.0 multilevel model                      Number of obs      =      2867
Binomial logit response model
Estimation algorithm: IGLS, PQL2

-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       district |       60          3       47.8        173
-----------------------------------------------------------

Run time (seconds)   =       1.06
Number of iterations =          6
------------------------------------------------------------------------------
         use |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |  -.5680659   .0881795    -6.44   0.000    -.7408945   -.3952372
       hindu |   .5042605   .1601074     3.15   0.002     .1904558    .8180653
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: district            |
                   var(cons) |   .3104756   .0829008       .147993    .4729581
             cov(cons,hindu) |  -.2589796   .1182524     -.4907501   -.0272091
                  var(hindu) |   .4973529   .2428069      .0214601    .9732457
------------------------------------------------------------------------------

The MCMC error that you are seeing will be due to the starting values for the random part covariance matrix containing zeros on the diagonal. As these starting values are used in the priors, this matrix needs to be invertible, hence the error message. Leaving out the variable doesn't work because -runmlwin- will just fill in any missing parameter values with zeros. A workaround would be to specify your own starting values where this matrix has valid values.
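
The positive-definiteness requirement can be checked by hand before starting an MCMC run. Below is a minimal sketch in plain Python (it is not part of runmlwin or MLwiN, and the function name is made up for illustration). It attempts a Cholesky factorisation, which succeeds only for a symmetric positive definite matrix, and applies it to two level-2 covariance matrices posted above: the one with the all-zero slope row from the first model, and the one from the bang example.

```python
import math

def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorisation."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    # A zero or negative pivot means the matrix is not positive definite.
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True

# Level-2 covariance matrices, ordered (cons, slope), taken from the output above.
igls_start = [[1.034937, 0.0], [0.0, 0.0]]
bang_example = [[0.3104756, -0.2589796], [-0.2589796, 0.4973529]]

print(is_positive_definite(igls_start))    # False: zero variance for the slope
print(is_positive_definite(bang_example))  # True
```

Any set of user-supplied starting values for the random part would need to pass a check of this kind for the prior to be usable.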

Taratara · Post by **Taratara** » Wed Feb 22, 2017 3:22 pm

Dear Mr. Charlton,

Thank you!! I've just tried the workaround following the instructions in this post (https://www.cmm.bristol.ac.uk/forum/vie ... 61b7b2265d), using the estimates from the meqrlogit model as my starting values (see below). Ultimately I would like to specify more complex models which meqrlogit can't fit (without taking a couple of days), which is why I switched to MLwiN in the first place. There is no way to specify an unstructured covariance matrix with an algorithm other than MCMC, is there? I ask because I figured that this option (cov, unstructured) might be the reason why I do (or don't) get the 0 values.

Code: Select all

meqrlogit sold i.pitch|| district:pitch, cov(uns)

Code: Select all

Integration points =   7                        Wald chi2(1)       =      1.75
Log likelihood = -742.28789                     Prob > chi2        =    0.1858

------------------------------------------------------------------------------
        sold |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     1.pitch |  -.6352327   .4800912    -1.32   0.186    -1.576194    .3057288
       _cons |   3.393485   .3201014    10.60   0.000     2.766097    4.020872
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
district: Unstructured       |
                  var(pitch) |   .2015089   .5729564      .0007657    53.03248
                  var(_cons) |   13.04298   2.441708       9.03707     18.8246
            cov(pitch,_cons) |   1.621188   2.210978      -2.71225    5.954626
------------------------------------------------------------------------------
LR test vs. logistic regression:     chi2(3) =   459.06   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

.

I then plug these into the MCMC model with:

Code: Select all

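* Fit the model without MCMC first so that e(b) has the right layout
runmlwin sold cons i.pitch, level2(district:cons pitch) level1(tries) discrete(distribution(binomial) link(logit) denom(cons))
matrix b = e(b)
* Fixed part, taken from meqrlogit: cons, 1.pitch
matrix b[1,1] = 3.393485
matrix b[1,2] = -.6352327
* Random part, taken from meqrlogit: var(cons), cov(cons,pitch), var(pitch)
matrix b[1,3] = 13.04298
matrix b[1,4] = 1.621188
matrix b[1,5] = 0.2015089
* Refit by MCMC, starting from b
runmlwin sold cons i.pitch, level2(district:cons pitch) level1(tries) discrete(distribution(binomial) link(logit) denom(cons)) mcmc(corresiduals(unstruc)) initsb(b) nopause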

This does provide me with a random slope; however, I'm not sure if that's the right way to go:

Code: Select all


Burnin                     =        500
Chain                      =       5000
Thinning                   =          1
Run time (seconds)         =         30
Deviance (dbar)            =     494.78
Deviance (thetabar)        =     491.13
Effective no. of pars (pd) =       3.65
Bayesian DIC               =     498.43
------------------------------------------------------------------------------
        sold |     Mean    Std. Dev.     ESS     P       [95% Cred. Interval]
-------------+----------------------------------------------------------------
        cons |   7.447456   .2256756      439   0.000     7.018939    7.880266
    _1_pitch |   -2.20292   .3397999      440   0.000    -2.856248   -1.541322
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |     Mean   Std. Dev.   ESS     [95% Cred. Int]
-----------------------------+------------------------------------------------
Level 2: district            |
                   var(cons) |  32.14499  1.706985   5503   28.97968  35.68283
             cov(cons,pitch) |  4.002844   .212415   5585   3.609307  4.439916
                  var(pitch) |  .4984563  .0264537   5570   .4496161  .5533429
------------------------------------------------------------------------------

ChrisCharlton · Post by **ChrisCharlton** » Wed Feb 22, 2017 3:45 pm

The covariance(unstructured) option is the equivalent of the structure that MLwiN uses by default. If you wanted it to match the default for meqrlogit (covariance(independent)):

Code: Select all

. meqrlogit use hindu || district: hindu

Refining starting values: 

Iteration 0:   log likelihood = -1876.8465  
Iteration 1:   log likelihood = -1875.0169  
Iteration 2:   log likelihood = -1866.4233  

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -1866.4233  
Iteration 1:   log likelihood = -1865.1548  
Iteration 2:   log likelihood = -1865.1417  
Iteration 3:   log likelihood = -1865.1417  

Mixed-effects logistic regression               Number of obs     =      2,867
Group variable: district                        Number of groups  =         60

                                                Obs per group:
                                                              min =          3
                                                              avg =       47.8
                                                              max =        173

Integration points =   7                        Wald chi2(1)      =       6.45
Log likelihood = -1865.1417                     Prob > chi2       =     0.0111

------------------------------------------------------------------------------
         use |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hindu |   .4200996   .1654576     2.54   0.011     .0958087    .7443905
       _cons |  -.5674963      .0844    -6.72   0.000    -.7329173   -.4020752
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
district: Independent        |
                  var(hindu) |   .4120454   .2298717       .138063    1.229739
                  var(_cons) |   .2714203   .0759463      .1568436    .4696969
------------------------------------------------------------------------------
LR test vs. logistic model: chi2(2) = 107.95              Prob > chi2 = 0.0000

To reproduce this in runmlwin, you would use the diag option within the level definition:

Code: Select all

. runmlwin use cons hindu, level2(district: cons hindu, diag) level1(woman:) discrete(distribution(binomial) link(logit) denominator(cons) pql2) nopause

MLwiN 3.0 multilevel model                      Number of obs      =      2867
Binomial logit response model
Estimation algorithm: IGLS, PQL2

-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       district |       60          3       47.8        173
-----------------------------------------------------------

Run time (seconds)   =       1.07
Number of iterations =          6
------------------------------------------------------------------------------
         use |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |  -.5668934    .083392    -6.80   0.000    -.7303387   -.4034481
       hindu |   .4210103   .1607305     2.62   0.009     .1059843    .7360362
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: district            |
                   var(cons) |   .2674414    .072279      .1257772    .4091057
                  var(hindu) |   .3850518    .220236     -.0466029    .8167065
------------------------------------------------------------------------------

Your MCMC set-up looks fine, although I would suggest that you check that the corresiduals option is doing what you expect (see section 19.6 of the MCMC guide: http://www.bristol.ac.uk/cmm/media/soft ... mc-web.pdf).

To check the results you could try a number of different starting values to see if they remain stable. It is also worth looking at the parameter chains to check that they look okay.
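
One further sanity check follows directly from the posted numbers. It is offered here as a suggestion and was not raised in the thread. The intercept-slope correlation implied by a 2x2 covariance matrix is cov / sqrt(var × var). The sketch below, in plain Python with a made-up helper name, applies this to the meqrlogit estimates used as starting values and to the MCMC posterior means reported above.

```python
import math

def implied_correlation(var_intercept, cov, var_slope):
    """Correlation implied by a 2x2 covariance matrix of random effects."""
    return cov / math.sqrt(var_intercept * var_slope)

# (var(cons), cov(cons,pitch), var(pitch)) as posted in the thread.
meqrlogit_start = (13.04298, 1.621188, 0.2015089)
mcmc_posterior_mean = (32.14499, 4.002844, 0.4984563)

for label, params in [("meqrlogit", meqrlogit_start), ("MCMC", mcmc_posterior_mean)]:
    print(f"{label}: implied correlation = {implied_correlation(*params):.6f}")
```

Both values come out very close to 1, meaning that both matrices are almost singular. This may be worth keeping in mind when comparing runs from different starting values and when inspecting the chains. Note that a correlation computed from posterior means is only a rough guide and is not the same as the posterior mean of the correlation.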
