Correlation greater than one -- in a 2-level bivariate model

Go to runmlwin: Running MLwiN from within Stata >> http://www.bristol.ac.uk/cmm/software/runmlwin/
shaofuhuang
Posts: 3
Joined: Fri Jan 11, 2013 5:15 pm

Post by shaofuhuang »

Hi all
I am trying to fit a 2-level bivariate model with one explanatory factor. My aim is to find the correlation between the two responses at level 2 and at level 1, which I calculate as correlation = cov/sqrt(var1*var2). The problem is that I get a correlation greater than one.

Does anyone have an idea what this implies, and how I can fix it?

--- about the data and the model ---
I only have a very small sample (level 1 n=208, level 2 n=10), and the response variables are not perfectly normally distributed. The runmlwin command is specified as:

. runmlwin (resp1 cons school, eq(1)) (resp2 cons school, eq(2)), level2(class:cons) level1(person:cons)

The level 2 random part is estimated as:
var(cons_1)= 2.921527
var(cons_2)= 1.898666
cov(cons_1,cons_2)= 5.548939

Therefore the correlation = 5.548939/sqrt(2.921527*1.898666) = 2.3560294
----
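
The arithmetic above can be reproduced with a minimal Stata sketch. The scalar names v1, v2 and c12 are hypothetical, chosen only for illustration; the numbers are the level 2 estimates quoted above.

```stata
* Level 2 estimates as quoted in the post (scalar names are illustrative)
scalar v1  = 2.921527
scalar v2  = 1.898666
scalar c12 = 5.548939

* Correlation implied by the estimates: cov / sqrt(var1*var2)
display "implied correlation = " c12/sqrt(v1*v2)

* For a correlation within [-1, 1], |cov| cannot exceed sqrt(var1*var2)
display "largest admissible |cov| = " sqrt(v1*v2)
```

The second value comes to about 2.36, so a covariance of 5.55 is more than twice what these two variances allow.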

Any help is much appreciated

Many thanks

Shaofu
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Post by GeorgeLeckie »

Hi Shaofu,

Your syntax looks ok. But what is the variable school? The name implies that it is a school identifier, but an identifier should not usually be entered as a covariate.

It would help if you could paste your runmlwin output.

Try collapsing each response to class-level means using the -collapse- command and see what the standard correlation on these 10 observations is using the -corr- command. This might give some insight into whether you have a problem at the class level.

Best wishes

George
shaofuhuang

Post by shaofuhuang »

Hi George,

Thank you for helping look into this.
The runmlwin output was:

Code:

MLwiN 2.25 multilevel model                     Number of obs      =       208
Multivariate response model
Estimation algorithm: IGLS

-----------------------------------------------------------
                |   No. of       Observations per Group
 Level Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
          class |       10         12       20.8         25
-----------------------------------------------------------

Run time (seconds)   =       3.27
Number of iterations =          6
Log likelihood       = -1517.1924
Deviance             =  3034.3848
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
resp1        |
      cons_1 |   144.1554   1.534303    93.95   0.000     141.1482    147.1626
    school_1 |   9.048841   1.908777     4.74   0.000     5.307706    12.78998
-------------+----------------------------------------------------------------
resp2        |
      cons_2 |   62.19214   2.478722    25.09   0.000     57.33394    67.05035
    school_2 |   8.599575   3.043134     2.83   0.005     2.635143    14.56401
------------------------------------------------------------------------------

------------------------------------------------------------------------------
   Random-effects Parameters |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Level 2: class               |
                 var(cons_1) |   2.541607   3.732272     -4.773511    9.856725
          cov(cons_1,cons_2) |   5.152835   4.535371     -3.736328      14.042
                 var(cons_2) |    1.54237   9.170894     -16.43225    19.51699
-----------------------------+------------------------------------------------
Level 1: person              |
                 var(cons_1) |   119.3382   11.98735      95.84342     142.833
          cov(cons_1,cons_2) |   9.792209   15.51666     -20.61989    40.20431
                 var(cons_2) |   323.2115   36.27026      252.1231    394.2999
------------------------------------------------------------------------------

The variable school is a school identifier. I entered it into the model because I have only two schools in my data. Students in these two schools behaved quite differently on response 1, so I wanted to separate the class effect from the school effect. Could this cause the problem? The correlation was still greater than one even when I removed the variable school.

Following your suggestion, I ran the commands below
. collapse resp1 resp2, by(class)
. corr resp1 resp2, means

and I got the output

Code:

(obs=10)

    Variable |         Mean    Std. Dev.          Min          Max
-------------+----------------------------------------------------
       resp1 |     149.7403      5.48013     139.7143     156.1429
       resp2 |     67.40309     6.041832     59.05797     78.19186


             |    resp1    resp2
-------------+------------------
       resp1 |   1.0000
       resp2 |   0.8033   1.0000

There was a strong positive correlation (.80) between these two variables at the class level. The correlation at the person level was .15, which was larger than the correlation calculated from the runmlwin output at this level. But I can't work out what this implies.
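
The same formula can be applied to the estimates in the runmlwin output above, as a rough check at both levels. This is only a sketch: the scalar names are hypothetical, and the values are typed in by hand from the table rather than taken from stored results.

```stata
* Level 2 (class) estimates from the output table
scalar l2_v1  = 2.541607
scalar l2_v2  = 1.54237
scalar l2_c12 = 5.152835
display "level 2 implied correlation = " l2_c12/sqrt(l2_v1*l2_v2)

* Level 1 (person) estimates from the output table
scalar l1_v1  = 119.3382
scalar l1_v2  = 323.2115
scalar l1_c12 = 9.792209
display "level 1 implied correlation = " l1_c12/sqrt(l1_v1*l1_v2)
```

These work out to roughly 2.60 at level 2 and 0.05 at level 1, so only the level 2 value is out of range.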

I hope the above information surfaces some new clues.

Many thanks

Shaofu
GeorgeLeckie

Post by GeorgeLeckie »

Hi Shaofu,

I presume in that case that you have coded and entered school into the model as a binary (dummy) variable.

MLwiN, unlike many other packages, does not place any bounds on the correlation, so you can sometimes get estimated correlations which lie outside the range -1 to +1.
With such a high raw correlation and only 10 groups, you are in the territory where you might obtain just such an estimate. If you fitted the model in another package which constrains the correlation to lie between -1 and +1, I suspect that you would obtain an estimate of +1.
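
One way to see that the level 2 estimates are not admissible as a covariance matrix is to look at the determinant and eigenvalues. The sketch below assumes the level 2 estimates from the output earlier in the thread are typed in by hand; the matrix names S, X and lambda are made up for illustration.

```stata
* Level 2 covariance matrix from the runmlwin output (entered by hand)
matrix S = (2.541607, 5.152835 \ 5.152835, 1.54237)

* A valid 2x2 covariance matrix has a non-negative determinant
display "determinant = " det(S)

* Eigenvectors go in X, eigenvalues in lambda; a negative eigenvalue
* means S is not positive semidefinite
matrix symeigen X lambda = S
matrix list lambda
```

Here the determinant is negative (about -22.6), which is equivalent to the implied correlation lying outside -1 to +1.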

Best wishes

George
shaofuhuang

Post by shaofuhuang »

Hi George,

Thank you for providing an interpretation for the result, which is very helpful to me. I'll explore other ways to assess the level-2 correlations or otherwise to avoid it in my analysis.

Many thanks
Shaofu
GeorgeLeckie

Post by GeorgeLeckie »

Hi Shaofu,

I should add that 10 groups would not normally be considered sufficient for a multilevel analysis.

20 or 25 groups often get bandied around as a minimum in several multilevel textbooks, but really the more groups the better.

George