Notation in discrete multivariate models

Post by NilsGYork »

Hello,

I've noticed odd behaviour in the way runmlwin/MLwiN orders and labels the variance component estimates in a multivariate model where one outcome is continuous and normally distributed and the other is binary and modelled using the probit link.

If I estimate a bivariate model with two continuous outcomes, the variance components are labelled as expected: var(cons_1) refers to the random effect(s) on the constant in equation 1, and var(cons_2) refers to the random effect(s) on the constant in equation 2. See the example below.

Code:

    * discrete() is commented out, so both responses are treated as Normal
    runmlwin (binary cons, eq(1)) (continuous cons, eq(2)), ///
        level2(cluster: (cons, eq(1)) (cons, eq(2))) ///
        level1(individual: (cons, eq(1)) (cons, eq(2))) ///
        /* discrete(distribution(binomial normal) link(logit) denominator(cons cons)) */ ///
        nopause cor batch mlwinpath("C:\Program Files (x86)\MLwiN v2.28\x64\mlnscript.exe") maxiterations(150)
 
 --- Begin MLwiN error log --- 
MLN - Software for N-level analysis.   Thu Oct 03 11:52:09 2013


C:\Users\ng526\AppData\Local\Temp\ST_02000002.tmp

C:\Users\ng526\AppData\Local\Temp\ST_02000006.tmp
 --- End MLwiN error log --- 

MLwiN 2.26 multilevel model                     Number of obs      =     99597
Multivariate response model
Estimation algorithm: IGLS

-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
        cluster |      286          1      348.2       2321
-----------------------------------------------------------

Run time (seconds)   =      29.14
Number of iterations =          4
Log likelihood       = -373669.16
Deviance             =  747338.31
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
binary       |
      cons_1 |   .7302724   .0042698   171.03   0.000     .7219036    .7386411
-------------+----------------------------------------------------------------
continuous   |
      cons_2 |   5.385433   .0807643    66.68   0.000     5.227138    5.543728
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: cluster             |
                 var(cons_1) |   .0036802   .0004087      .0028791    .0044813
         corr(cons_1,cons_2) |  -.6049433   .0501553     -.7032458   -.5066407
                 var(cons_2) |    1.52018   .1504653      1.225273    1.815086
-----------------------------+------------------------------------------------
Level 1: individual          |
                 var(cons_1) |   .1994257   .0008948       .197672    .2011795
         corr(cons_1,cons_2) |  -.1043134   .0031382     -.1104642   -.0981626
                 var(cons_2) |    31.2086   .1400372      30.93413    31.48307
------------------------------------------------------------------------------

This notation gets mixed up once I specify the first outcome as binary and use the probit link. var(cons_1) at cluster level still seems to refer to the variance of the cluster-level random effect on the binary outcome (eq1). At the individual level, however, the row labelled var(bcons_1) now holds the error variance of the continuous outcome: its estimate (31.20) matches var(cons_2) in the model above. Meanwhile var(cons_2) at the individual level is fixed at one, which is the constraint that belongs to the binary outcome.

I think this is a bug. Or am I missing something?

Code:

    * Same model, with the first response now declared binomial via discrete()
    runmlwin (binary cons, eq(1)) (continuous cons, eq(2)), ///
        level2(cluster: (cons, eq(1)) (cons, eq(2))) ///
        level1(individual: (cons, eq(1)) (cons, eq(2))) ///
        discrete(distribution(binomial normal) link(logit) denominator(cons cons)) ///
        nopause cor batch mlwinpath("C:\Program Files (x86)\MLwiN v2.28\x64\mlnscript.exe") maxiterations(150)
 
 --- Begin MLwiN error log --- 
MLN - Software for N-level analysis.   Thu Oct 03 11:52:39 2013


C:\Users\ng526\AppData\Local\Temp\ST_02000002.tmp

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\errchk

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre_0

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\nobvar

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post_0

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\errchk

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\bvar

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\errchk

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\bvar

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\errchk

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\bvar

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\pre

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\errchk

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\bvar

C:\Program Files (x86)\MLwiN v2.28\x64\..\discrete\post

C:\Users\ng526\AppData\Local\Temp\ST_02000006.tmp
 --- End MLwiN error log --- 

MLwiN 2.26 multilevel model                     Number of obs      =     99597
Multivariate response model
Estimation algorithm: IGLS, MQL1

-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
        cluster |      286          1      348.2       2321
-----------------------------------------------------------

Run time (seconds)   =      35.22
Number of iterations =          5
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
binary       |
      cons_1 |   .9962659   .0216978    45.92   0.000      .953739    1.038793
-------------+----------------------------------------------------------------
continuous   |
      cons_2 |   5.385259    .080767    66.68   0.000     5.226958    5.543559
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: cluster             |
                 var(cons_1) |   .0953512   .0105829       .074609    .1160933
         corr(cons_1,cons_2) |  -.6043977   .0501581     -.7027058   -.5060897
                 var(cons_2) |   1.520418   .1504496      1.225543    1.815294
-----------------------------+------------------------------------------------
Level 1: individual          |
                var(bcons_1) |   31.20439   .1400105      30.92998    31.47881
        corr(bcons_1,cons_2) |  -.1036713   .0031302     -.1098063   -.0975363
                 var(cons_2) |          1          0             1           1
------------------------------------------------------------------------------

BTW: mlnscript in the 64-bit version is brilliant and can deal with surprisingly large datasets!

Re: Notation in discrete multivariate models

Post by ChrisCharlton »

Thanks for letting us know about this; it does look like a bug. We will investigate and let you know what we find.

Re: Notation in discrete multivariate models

Post by ChrisCharlton »

We have now looked into this further. The mislabelling occurs in multivariate and multinomial models whenever a variable appears in the random part but not in the fixed part of the model. Such variables are added after all the fixed part variables, rather than after the variables for their related response, so the labels end up out of step with the estimates. We have identified a fix but need to check that it has no side effects, so this bug should be fixed in the next update to runmlwin.
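
To make the ordering problem concrete, here is a minimal sketch in plain Stata. It is not runmlwin's source code: the macro names are hypothetical, and which component holds which order is an assumption based on the explanation above. It simply pairs the level 1 variance estimates from the binomial/Normal output with two different term orders, one where the random-only term bcons_1 comes after all fixed part terms and one where it is grouped with its own response.

```stata
* Hypothetical sketch only: not runmlwin's source code.
* Assumed order of the level 1 random terms when the random-only term
* (bcons_1) is added after all fixed part terms.
local returned "cons_2 bcons_1"
* Order assumed by the labels: each term grouped with its own response.
local labelled "bcons_1 cons_2"
* Level 1 variance estimates from the output above, in returned order.
local est "31.20439 1"

* Walk both lists in parallel to show where label and estimate disagree.
forvalues i = 1/2 {
    local r : word `i' of `returned'
    local l : word `i' of `labelled'
    local e : word `i' of `est'
    display "estimate `e' belongs to var(`r') but is printed as var(`l')"
}
```

Under these assumptions the loop reproduces the symptom reported in the first post: the continuous outcome's error variance appears under var(bcons_1), and the value constrained to 1 appears under var(cons_2).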