Predictions via the runmlwin interface: a clarification

Welcome to the forum for runmlwin users. Feel free to post your question about runmlwin here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go to runmlwin: Running MLwiN from within Stata >> http://www.bristol.ac.uk/cmm/software/runmlwin/
Post Reply
ewancarr
Posts: 18
Joined: Fri Feb 04, 2011 12:46 pm

Predictions via the runmlwin interface: a clarification

Post by ewancarr »

Apologies if this is a stupid question: my lack of Stata/MLwiN knowledge isn't helping here...

Question: How can I get predictions from MLwiN via runmlwin?

For example, using the predictions window I can select several variables, choose a column,
and then output the predicted values to that column (as in the window below).

Image

I can't see a way to replicate this functionality using the 'predict' command in Stata. Any help would be much appreciated.

Thanks,

Ewan
--
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Predictions via the runmlwin interface: a clarification

Post by GeorgeLeckie »

Hi Ewan,

This is a very good question.

After running runmlwin, you can generate all the predictions that you would normally make using the "Predictions" or "Customised Predictions" windows in MLwiN. However, we have not programmed up a runmlwin specific prediction post-estimation command, so it takes a little bit more work than usual.

Having fitted your two-level model using runmlwin, one way to generate the desired predictions in Stata is to use the generate command:

. generate satlifehat = _b[cons]*cons + _b[Ilikely_4]*Ilikely_4 + _b[unemp_06_cent]*unemp_06_cent + _b[unempXlik4]*unempXlik4

where satlifehat is the new variable where your predictions are stored.
The _b[...] syntax is Stata's way of referencing the parameter estimates in the model.

Why don't you give it a go and let me know how you get on.

Best wishes

George
ewancarr
Posts: 18
Joined: Fri Feb 04, 2011 12:46 pm

Re: Predictions via the runmlwin interface: a clarification

Post by ewancarr »

Perfect! I'll give that a try tomorrow and let you know how it goes..

Ewan
--
ewancarr
Posts: 18
Joined: Fri Feb 04, 2011 12:46 pm

Re: Predictions via the runmlwin interface: a clarification

Post by ewancarr »

Update: the above syntax worked perfectly.. For example:

Image

For those interested, the syntax to achieve this is:

Code: Select all

xi: runmlwin satlife cons ///
          incgm agegm female i.genhlth2 socact socmeet ///
			 discuss i.feelinc2 ptnr children i.optim temp ///
			 i.likely ///
			 gdp06_cent gdpXlik2 gdpXlik3 gdpXlik4, ///
   level2(cntry: cons) ///
   level1(id: cons, weightvar(cweight))
  
cap drop hat_gdp06xlik4
gen hat_gdp06xlik4 = _b[cons]*cons + _b[_Ilikely_4]*_Ilikely_4 + _b[gdp06_cent]*gdp06_cent + _b[gdpXlik4]*gdpXlik4

twoway (connected hat_gdp06xlik4 gdp06_cent if lik4 == 0) ///
       (connected hat_gdp06xlik4 gdp06_cent if lik4 == 1), ///
	   ytitle(Predicted life satisfaction) ///
	   xtitle(National GDP (PPPs, 2006; centred)) ///
	   title(Interaction plot) ///
	   subtitle(job insecurity x  GDP (PPPs)) ///
	   caption(Weighted to control for population size and sample design, size(small)) ///
	   note(Source: European Social Survey (2006); Eurostat (2006)) ///
	   legend(order(1 "lik4 = 0" 2 "lik4 = 1") position(7) ring(0)) ///
	   scheme(s1color) xsize(6) aspectratio(.8)
	   
While I'm here.. I have a follow-up question:

Is there any way of plotting the above graph with confidence intervals?

My guess is that this would involve using 'generate' to extract the standard errors of the parameters
of interest, then calculating confidence intervals (y_hat +/- 1.96 SE_hat), and finally plotting these
in Stata – probably using a combination of 'line' and 'rarea' graphs. For example, in the graph below
I've plotted the original graph (above) with shaded areas to represent 0.5 above and 0.5 below the
predicted lines.

Image

What I need, therefore, is to be able to save the (predicted) standard errors from a model (for a
specified set of parameters) so I can calculate the variables "upper" and lower".
Any thoughts/advice
on how to do this would be gratefully received.

Thanks!

Ewan
--
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Predictions via the runmlwin interface: a clarification

Post by GeorgeLeckie »

Hi Ewan,

This is also an excellent question and follows naturally from your previous question. The answer is similar to before:

After running runmlwin, you can generate all the predictions AND THEIR SAMPLING ERRORS (AND THEREFORE CONFIDENCE INTERVALS) that you would normally make using the "Predictions" or "Customised Predictions" windows in MLwiN. However, we have not programmed up a runmlwin specific prediction post-estimation command, so it takes a little bit more work than usual.

The solution I gave you for generating predictions involved explicitly referencing each parameter involved in the prediction equation. You could equally generate the sampling variance for each prediction by referencing the sampling variances and covariances of the parameters involved in the prediction equation. You can view these by typing matrix list e(V). However, the maths gets a little cumbersome and it is easy to make errors by this approach.

Fortunately the solution I gave you for generating predictions was just one of many available to you. Now that you want confidence intervals as well as the raw predictions, we will explore another, less error prone approach.

You can also generate the predictions that you wanted by simply replacing all the explanatory variables in the model which are not to be used in the prediction equation with the value zero and then use Stata's predict command in the usual way. Replacing these variables with the value zero effectively removes these variables from the prediction equation. Only a small number of variables appear in your prediction equation so you might want to write a foreach loop to do this.

You can do a save and use in Stata to return to the original data. Alternatively, you can use a preserve and restore.

Once done you can calculate the raw predictions by issuing the predict command.

predict yhat

You should get the same same predictions as before!

The next step is to generate the standard errors for these predicted values. Simply use the predict command with the stdp option.

predict yhat_se, stdp

Finally, to get the 95% lower and upper bounds, make use of the generate command

gen yhat_lb = yhat - 1.96*yhat_se
gen yhat_ub = yhat + 1.96*yhat_se


You now have all the information you need to generate your graph.

Another approach you might want to look into is making use of Stata's adjust and potentially the new margins commands.

Let me know how you get on.

Best wishes

George
ewancarr
Posts: 18
Joined: Fri Feb 04, 2011 12:46 pm

Re: Predictions via the runmlwin interface: a clarification

Post by ewancarr »

Great – thanks for your help with this, it's much appreciated.

I've done as you suggested:
  1. Estimate the model in MLwiN (via runmlwin)
  2. Set all variables to 0, except those we need for prediction.
  3. Generate predicted values and standard errors (using predict)
  4. Plot the graph, as before.
This gives the below graph (using a different example to before):

Image

I also checked (using compare) that values from predict and generate
were identical, which they were. I've included the code for all this below.

Thanks again!

Ewan
--

Code: Select all

use "C:\Users\vmware\Desktop\working_data.data", clear

cap drop eaggXvlik
gen eaggXvlik = econagg_cent * vlik

xi: runmlwin satlife cons ///
             incgm agegm female i.genhlth2 socact socmeet ///
			 discuss i.feelinc2 ptnr children i.optim temp ///
			 vlik ///
			 econagg_cent eaggXvlik, ///
   level2(cntry: cons) ///
   level1(id: cons, weightvar(cweight)) ///
   nopause

cap drop yhat_old
gen yhat_old = _b[cons]*cons + _b[vlik]*vlik + _b[econagg_cent]*econagg_cent + _b[eaggXvlik]*eaggXvlik

cap drop yhat yhat_lb yhat_ub yhat_se x_var

local unwanted = "incgm agegm female _Igenhlth2_1 _Igenhlth2_2 _Igenhlth2_3  socact socmeet discuss _Ifeelinc2_2 _Ifeelinc2_3 ptnr children _Ioptim_2 _Ioptim_3 temp"
foreach var in `unwanted' {
    replace `var' = 0
	}
	
predict yhat

compare yhat yhat_old
 
predict yhat_se, stdp
gen yhat_lb = yhat - 1.96 * yhat_se
gen yhat_ub = yhat + 1.96 * yhat_se
	
twoway (rarea yhat_ub yhat_lb econagg_cent if vlik == 0, sort fcolor(blue) fintensity(inten10)) ///
       (rarea yhat_ub yhat_lb econagg_cent if vlik == 1, sort fcolor(red) fintensity(inten10)) ///
	   (line yhat econagg_cent if vlik == 1, lcolor(red) lwidth(inten10)) ///
	   (line yhat econagg_cent if vlik == 0, lcolor(blue)), ///
	   ytitle(Predicted life satisfaction) ///
	   legend(order(3 "vlik == 1"  4 "vlik == 0"  1 "95% CI"  2 "95% CI")) ///
	   xsize(5.3) scale(.9) aspectratio(.8) 
	   
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Predictions via the runmlwin interface: a clarification

Post by GeorgeLeckie »

That's great. I'm glad it worked.

Thank you for putting all the graphs and code in. It makes this post a useful resource for other users

Best wishes

George
Post Reply