Calculating confidence intervals around level 2 (or 3) predictions

Post by kaiserdominici »

I have around 1700 subjects j nested in 50 groups k that were observed 3 times i, producing the following 3-level growth model (Fig 1):

Image

I would like to calculate the approximate 95% CI around the fitted values of each group, but it looks like my method produces slightly different outcomes from the dedicated function in MLwiN, so I was wondering whether it is my approach that is incorrect or my use of MLwiN for this purpose (in either case, I am making a mistake! :D ).

Let's take the fitted values for the first group, which I predicted using the window below (Fig 2):

Image

The outcome is (Fig 3):

Image

Looking at the first column and first three rows, we see that group 1 is predicted to have an average of 59.259 at time 0 (the intercept), 61.568 at time 1 and 63.878 at time 2.

Now let's go back to the prediction window in figure 2. The help file tells me that:
You can also output any constant k times the estimated standard deviation (standard error) of the prediction function into a chosen column ... If level n resid. function is chosen (n=1,2 in this example) then the s.e. associated with the specified function of residuals at level n is produced.
As you can see at the bottom of the image, I chose to store the Level 3 residual function times 1.96. The outcome is column 2 in figure 3: 3.393 at time 0, 2.693 at time 1 and 3.529 at time 2. This outcome is slightly different from what I get by doing the calculations by hand.

At time 0 (i = 1), the fitted value for group 1 is (I incorrectly used the subscript j rather than k in the left side of the equation):

Image

(You should seriously consider importing the MathJax APIs for this board...)

It is a sum of normally distributed random variables, so the variance of group 1's prediction at time 0 is :

Image

It is not clear to me where the variance matrix for each group is stored in MLwiN, but I know from fitting the same model in R that the variance of the conditional mode of the random effect v0 for group 1 is 2.933. So the variance for group 1 is 0.348^2 + 2.933 = 3.054. The half-width of the 95% CI is therefore ± 1.96 x sqrt(3.054) = ± 3.425, which is close to, but different from, the value of 3.393 calculated through the prediction window.
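Written out for a general time t (coded 0, 1, 2), and assuming, as the sums in this post do, that the fixed-part estimates and the group's residuals are treated as independent, the variance being computed by hand has the following form. The notation is a reconstruction for illustration and not necessarily that of the figures.

```latex
\operatorname{Var}\bigl(\hat{y}_{k}(t)\bigr)
  = \operatorname{Var}(\hat{\beta}_0) + \operatorname{Var}(\hat{v}_{0k})
  + t^{2}\bigl[\operatorname{Var}(\hat{\beta}_1) + \operatorname{Var}(\hat{v}_{1k})\bigr]
  + 2t\bigl[\operatorname{Cov}(\hat{\beta}_0,\hat{\beta}_1)
          + \operatorname{Cov}(\hat{v}_{0k},\hat{v}_{1k})\bigr]
```

Setting t = 0 leaves only the first two terms, which is the sum above.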

Similarly, the variance at time 1 is:

Image

The covariance of B0 and B1 is given in column c1099 and it is -0.044. The variance of v1 (k = 1) from R is 1.228 and the covariance between v0 and v1 for group 1 is -1.167. So the total variance for the fitted value of 61.568 at time 1 is: 0.348^2 + 2.933 + 0.216^2 + 1.228 + 2 x (-0.044) + 2 x (-1.167) = 1.906. The SE x 1.96 is sqrt(1.906) x 1.96 = 2.706, which again is close to, but not equal to, the 2.693 from the Predictions window.
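The two hand calculations can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the numbers are the ones quoted above, the function names are made up for illustration, and the formula assumes the fixed part and the group residuals are independent.

```python
import math

# Values quoted in the post: fixed-part SEs and covariance from MLwiN,
# group-1 residual variances and covariance from the R fit.
se_b0, se_b1 = 0.348, 0.216
cov_b0_b1 = -0.044
var_v0, var_v1 = 2.933, 1.228
cov_v0_v1 = -1.167

def prediction_variance(t):
    """Hand-calculated variance of the group-1 fitted value at time t."""
    fixed = se_b0**2 + t**2 * se_b1**2 + 2 * t * cov_b0_b1
    resid = var_v0 + t**2 * var_v1 + 2 * t * cov_v0_v1
    return fixed + resid

def half_width(t, k=1.96):
    """k times the square root of the hand-calculated variance."""
    return k * math.sqrt(prediction_variance(t))

for t in (0, 1):
    print(f"time {t}: half-width = {half_width(t):.3f}")
```

To three decimals the printed half-widths agree with the 3.425 and 2.706 obtained by hand.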

Time 2 works similarly. Are these just rounding errors or am I mistaken in using the "1.96 SE of level 3 resid. function" from the Predictions window the way I am? On a related matter, where does MLwiN store the covariance matrix for the conditional modes of the random effects for levels 2 and 3?

Thank you and all the best,

k.

Re: Calculating confidence intervals around level 2 (or 3) predictions

Post by ChrisCharlton »

The variances for each group are not stored anywhere on the worksheet, but are instead calculated as required by the RESI command. If you look in the Data Manipulation->Command interface window after using the Model->Predictions window you will see that first the residuals are calculated to correspond to the selected e, u, etc. These are then included in the PRED command to define the required prediction function. If you also request a residual function then the variables that make this up at the requested level are then used as parameters to the RFUN command, and the output residual function is stored in the column you selected (after square-rooting and multiplying by the specified amount).

You should therefore be able to check whether or not the differences you are seeing are due to rounding errors by using the Model->Residuals window (or, if you have more than one higher-level variable, the RFUN and RESI commands) to request the residuals and their variances at the required level and then plugging these numbers into your formulae.
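One rough way to run a rounding check on the hand calculation itself is sketched below. It is not what RFUN does internally. It assumes each quoted input is accurate only to ±0.0005 (three reported decimals), evaluates the hand formula from the first post at every combination of extremes, and reports whether the Predictions-window value lies inside the resulting range. The function names are illustrative.

```python
import itertools
import math

HALF_UNIT = 0.0005  # assumed rounding error of a value reported to 3 decimals

def half_width(se_b0, var_v0, se_b1, var_v1, cov_b, cov_v, t, k=1.96):
    """k * sqrt(variance), using the hand formula from the first post."""
    var = (se_b0**2 + var_v0
           + t**2 * (se_b1**2 + var_v1)
           + 2 * t * (cov_b + cov_v))
    return k * math.sqrt(var)

def rounding_range(inputs, t):
    """Smallest and largest half-width over all +/- HALF_UNIT perturbations."""
    widths = [
        half_width(*(x + s * HALF_UNIT for x, s in zip(inputs, signs)), t=t)
        for signs in itertools.product((-1, 1), repeat=len(inputs))
    ]
    return min(widths), max(widths)

# se(B0), var(v0), se(B1), var(v1), cov(B0,B1), cov(v0,v1) as quoted in the thread
reported = (0.348, 2.933, 0.216, 1.228, -0.044, -1.167)
mlwin = {0: 3.393, 1: 2.693}  # Predictions-window values at time 0 and time 1

for t, target in mlwin.items():
    lo, hi = rounding_range(reported, t)
    print(f"time {t}: hand range [{lo:.4f}, {hi:.4f}], "
          f"MLwiN {target}, inside: {lo <= target <= hi}")
```

If the MLwiN value falls outside the range, rounding of the quoted inputs alone does not account for the gap under this formula, and the residual variances produced by MLwiN itself would be the next thing to compare.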