
Confidence intervals for the sum of residual and coefficient

Posted: Tue Mar 25, 2014 4:35 pm
by Paatrick
I am interested in obtaining confidence intervals for the sum of the empirical Bayes residuals plus the overall regression coefficient after fitting a random coefficient model.

Specifically, I'm fitting a model similar to this one:

Code:

use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear

* Generate a boys' school dummy variable
generate boysch = (schgend==2)

* Generate a girls' school dummy variable
generate girlsch = (schgend==3)

* Fit a two-level random slope model for student age 16 scores and retrieve
* the empirical Bayes estimates of the school random effects
runmlwin normexam cons standlrt girl boysch girlsch, ///
   level2(school: cons standlrt, residuals(u)) ///
   level1(student: cons) nopause nogroup ///
   mlwinpath(C:\Program Files\MLwiN v2.28\i386\MLwiN.exe)

I see that it's then easy to draw a caterpillar plot of the u1 residuals like this:

Code:

// Create caterpillar plot of u1 residual
* Tag one observation per school
egen pickone_school = tag(school)

* Rank the standlrt residuals
egen u1rank = rank(u1) if pickone_school

* Plot a caterpillar plot of the standlrt residuals 
serrbar u1 u1se u1rank if pickone_school, scale(1.96) yline(0)

What I would really like to show, however, is that the effect of "standlrt" is significantly different from zero in all schools. Getting the point estimates was easy:

Code:

// Create caterpillar plot of sum of standlrt coefficient plus u1 residual
* Sum of standlrt coefficient plus u1 residual
gen standlrt_total = [FP1]standlrt + u1

* Plot sum of standlrt coefficient plus u1 residual
graph dot standlrt_total if pickone_school, over(u1rank) vertical 

What I'm struggling with now is calculating confidence intervals around the sum of the standlrt coefficient plus the u1 residual. Is there a straightforward (and correct) way to do this?

Thanks for your consideration!
Patrick

Re: Confidence intervals for the sum of residual and coeffic

Posted: Wed Mar 26, 2014 7:42 pm
by GeorgeLeckie
Hi Patrick,

Interesting question. I have not tried the following out, but here are some thoughts nonetheless.

You want the sampling distribution of b1 + u1_j, where b1 is the overall standlrt coefficient and u1_j is the slope residual for school j.

Var(b1 + u1_j) = Var(b1) + 2*Cov(b1, u1_j) + Var(u1_j)

If you fit the model by IGLS, the problem is that the u1_j are predicted post-estimation, so we do not have Cov(b1, u1_j). I guess the best you could do would be to ignore the covariance term and calculate

Var(b1 + u1_j) ≈ Var(b1) + Var(u1_j)

but this does not seem very satisfactory.
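
For what it is worth, the approximation that drops the covariance term could be sketched in Stata roughly as follows. This is untested and assumes the IGLS model from the first post is still the active estimation result, and that u1, u1se, u1rank, pickone_school and standlrt_total exist as created there. The new variable names (standlrt_total_se, standlrt_total_lo, standlrt_total_hi) are made up for illustration.

```stata
* Assumes the IGLS model above is the active estimation result and that
* u1, u1se, u1rank, pickone_school and standlrt_total already exist.
* The *_se, *_lo and *_hi variable names are illustrative only.

* Approximate SE of b1 + u1_j, ignoring Cov(b1, u1_j)
generate standlrt_total_se = sqrt([FP1]_se[standlrt]^2 + u1se^2)

* Approximate 95% limits
generate standlrt_total_lo = standlrt_total - 1.96*standlrt_total_se
generate standlrt_total_hi = standlrt_total + 1.96*standlrt_total_se

* Caterpillar plot of the school-specific slopes with approximate intervals
serrbar standlrt_total standlrt_total_se u1rank if pickone_school, ///
   scale(1.96) yline(0)

* Count the schools whose approximate interval excludes zero
count if pickone_school & (standlrt_total_lo > 0 | standlrt_total_hi < 0)
```

Because the covariance term is left out, these intervals may be too wide or too narrow, so treat them as a rough guide only.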

Alternatively, you could fit the model by MCMC and retrieve the MCMC chains for b1 and u1_j. You could then calculate the sum b1 + u1_j at every iteration of the MCMC chain. The 2.5th and 97.5th quantiles of these sums give you a 95% interval (strictly speaking a credible interval) which takes into account the sampling variability of, and covariability between, b1 and u1_j. The mean gives you the point estimate.
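
To illustrate just the summarising step, here is a minimal sketch that uses simulated draws in place of real chains. Everything about the data layout is an assumption: one row per school per iteration, with variables named school, iteration, b1 and u1, and made-up values for the draws. How the chains are actually saved and laid out should be checked in the runmlwin help file.

```stata
* Illustration only: simulated draws stand in for real MCMC chains.
* The layout (one row per school per iteration, with variables school,
* iteration, b1 and u1) is an assumption, not runmlwin's actual output.
clear
set seed 12345
set obs 5000                         // 5 schools x 1000 iterations
generate school    = ceil(_n/1000)
generate iteration = mod(_n - 1, 1000) + 1

* b1 is a single draw per iteration, shared by all schools
generate b1 = rnormal(0.55, 0.02) if school == 1
bysort iteration (school): replace b1 = b1[1]

* u1 is a separate draw for each school at each iteration
generate u1 = rnormal(0.05*(school - 3), 0.03)

* School-specific slope at each iteration
generate slope = b1 + u1

* Mean and 2.5th / 97.5th percentiles of the slope within each school
bysort school: egen slope_mean = mean(slope)
bysort school: egen slope_lo   = pctile(slope), p(2.5)
bysort school: egen slope_hi   = pctile(slope), p(97.5)

* One row per school
egen pick = tag(school)
list school slope_mean slope_lo slope_hi if pick, noobs
```

Because b1 and u1 are summed within each iteration before taking quantiles, any dependence between them in the real chains carries through to the interval automatically.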

Best wishes

George