Time varying predictors at higher aggregation levels

Welcome to the forum for runmlwin users. Feel free to post your question about runmlwin here. The Centre for Multilevel Modelling take no responsibility for the accuracy of these posts, we are unable to monitor them closely. Do go ahead and post your question and thank you in advance if you find the time to post any answers!

Go to runmlwin: Running MLwiN from within Stata >> http://www.bristol.ac.uk/cmm/software/runmlwin/
Raphael
Posts: 19
Joined: Wed Oct 12, 2011 2:52 am

Post by Raphael »

The case:
I am trying to estimate event history models (also known as survival models) with time-varying predictors at two different levels of (geographical) aggregation. More precisely, I am using a discrete-time event history model (a logit model on stacked data) to predict the odds of outmigration (mig) at the household level. Each household is exposed to the hazard of migration over a certain period (exposure; three years in this example). I have a number of time-varying household-level predictors (e.g., wx = cumulative working experience of the household head) and time-invariant ones (e.g., fem = household head is female) to control for the effect of various socio-demographic characteristics on the decision to migrate. However, the households in my sample are located in different municipalities (MunID). In my research I am interested in how time-varying characteristics of the environment that operate at the municipality level (Env1, e.g., rainfall decline) affect the odds of household-level outmigration. I also need to control for some time-invariant municipality-level characteristics (Env2, e.g., % of land used for agricultural production). A simplified example of the data structure is shown in the table below (sorry for abusing the code feature for the display).

Code:

exposure  HHID  HHIDy  mig  wx  fem  MunID  MunIDy  Env1  Env2
1         A     A_1    0    1   0    M1     M1_1    4     3
2         A     A_2    0    2   0    M1     M1_2    5     3
3         A     A_3    1    3   0    M1     M1_3    6     3
1         B     B_1    0    5   1    M1     M1_1    4     3
2         B     B_2    0    5   1    M1     M1_2    5     3
3         B     B_3    0    6   1    M1     M1_3    6     3
1         C     C_1    0    3   0    M1     M1_1    4     3
2         C     C_2    1    4   0    M1     M1_2    5     3
1         D     D_1    0    7   0    M1     M1_1    4     3
2         D     D_2    0    8   0    M1     M1_2    5     3
3         D     D_3    0    9   0    M1     M1_3    6     3
1         E     E_1    1    2   0    M2     M2_1    2.5   6
1         F     F_1    0    2   0    M2     M2_1    2.5   6
2         F     F_2    0    3   0    M2     M2_2    1     6
3         F     F_3    0    4   0    M2     M2_3    3     6
1         G     G_1    0    8   1    M2     M2_1    2.5   6
2         G     G_2    1    8   1    M2     M2_2    1     6
1         H     H_1    0    5   0    M2     M2_1    2.5   6
2         H     H_2    0    6   0    M2     M2_2    1     6
3         H     H_3    0    6   0    M2     M2_3    3     6

The problem:
Because I have two levels of aggregation (households clustered in municipalities), I was intending to use logistic multilevel models. However, I am not quite sure how to correctly specify my levels so that the aggregate-level nature of my time-varying predictor at level-3 (e.g., Env1) is correctly accounted for.

Possible solutions:
1. Courgeau (2007) describes a multilevel event history model with three levels: Time (level-1) is nested within individuals (level-2), who are nested within states (level-3). However, Courgeau only mentions a time-invariant state-level predictor (which of course has the same values for all person-years/rows within each state-level unit). In my case, I have the problem that a time-varying predictor at the municipality-level (e.g., Env1) would not be recognized by MLwiN as operating at the municipality-level (level-3) because the values within each aggregation unit vary across time. However, the standard errors of the estimate for Env1 will be biased if the model considers this variable as a level-1 predictor because at each time point all households within one municipality will have the same Env1 value.

2. As another option, I could use the combined MunIDy variable to specify my third level. MunIDy combines the municipality ID (MunID) with the exposure year (exposure), which gives n = 2 * 3 = 6 aggregation units at level-3 in the example. However, this solution also seems less than ideal, since each level-3 unit would contain household- and municipality-level values for only one exposure year (i.e., one unit would consist of all observations in a particular exposure year and a particular municipality), and sorting the data on MunIDy breaks up each household's event history.
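
To make the difference between the two candidate level-3 identifiers concrete, here is a small sketch in plain Python (not runmlwin or Stata code). It uses a few rows copied from the example table and checks at which grouping Env1 is constant. The helper name values_by_unit is made up for illustration.

```python
from collections import defaultdict

# A few person-period rows from the example table:
# (exposure, HHID, MunID, Env1)
rows = [
    (1, "A", "M1", 4.0), (2, "A", "M1", 5.0), (3, "A", "M1", 6.0),
    (1, "B", "M1", 4.0), (2, "B", "M1", 5.0), (3, "B", "M1", 6.0),
    (1, "E", "M2", 2.5),
    (1, "F", "M2", 2.5), (2, "F", "M2", 1.0), (3, "F", "M2", 3.0),
]


def values_by_unit(rows, key):
    """Collect the distinct Env1 values observed inside each grouping unit."""
    groups = defaultdict(set)
    for row in rows:
        groups[key(row)].add(row[3])
    return dict(groups)


# Grouping by municipality only (MunID).
by_mun = values_by_unit(rows, key=lambda r: r[2])
# Grouping by municipality-year (MunIDy = MunID + "_" + exposure).
by_mun_year = values_by_unit(rows, key=lambda r: f"{r[2]}_{r[0]}")

# Env1 takes several values inside a municipality ...
varies_within_mun = any(len(v) > 1 for v in by_mun.values())
# ... but exactly one value inside each municipality-year.
constant_within_mun_year = all(len(v) == 1 for v in by_mun_year.values())

print(len(by_mun), "municipalities,", len(by_mun_year), "municipality-years")
```

In this subset Env1 varies within MunID but is constant within MunIDy, so the municipality-year is the unit at which Env1 is actually measured, whichever way the levels are finally declared.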

Does anyone have an idea of how to correctly specify the levels in my analysis so that I can investigate the effect of time-varying predictors at level-3? Or can anyone point me to published work that uses a multi-level event history analysis with time-varying predictors at higher aggregation levels? Thanks a lot for any help!


References:
Courgeau, D. (2007). Multilevel synthesis: From the group to the individual. Dordrecht, The Netherlands: Springer.
GeorgeLeckie
Site Admin
Posts: 432
Joined: Fri Apr 01, 2011 2:14 pm

Re: Time varying predictors at higher aggregation levels

Post by GeorgeLeckie »

Hi Raphael,

Interesting query. Perhaps you should put a municipality-by-year random interaction effect into your model.
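
One way such a model could be written is sketched below. The notation is assumed for illustration and is not taken from the thread or from Courgeau (2007): t indexes exposure years, i households and j municipalities; h_{tij} is the discrete-time hazard of migrating in year t given no earlier migration; u_{ij}, v_{j} and w_{tj} are household, municipality and municipality-by-year random effects.

```latex
\[
\operatorname{logit}(h_{tij}) = \alpha_t
  + \beta_1\,\mathrm{wx}_{tij} + \beta_2\,\mathrm{fem}_{ij}
  + \gamma_1\,\mathrm{Env1}_{tj} + \gamma_2\,\mathrm{Env2}_{j}
  + u_{ij} + v_{j} + w_{tj},
\]
\[
u_{ij} \sim N(0, \sigma_u^2), \qquad
v_{j} \sim N(0, \sigma_v^2), \qquad
w_{tj} \sim N(0, \sigma_w^2).
\]
```

Here w_{tj} is shared by all households observed in municipality j in year t, which is the same unit at which Env1 varies.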

However, I would recommend you post this topic to the multilevel JISC email list, as it is not specific to runmlwin (or for that matter MLwiN). The JISC email list is read very widely so you will have much more chance of getting a good response there.

Best wishes

George

Re: Time varying predictors at higher aggregation levels

Post by Raphael »

Hi George,
thanks so much for the great advice!

Best,
Raphael

Re: Time varying predictors at higher aggregation levels

Post by Raphael »

I think I found a solution. I read two book chapters about multilevel event history models (Courgeau, 2007; Goldstein, 2011), which discuss similar cases and suggest using a three-level structure such as time (level-1) nested within households (level-2), which are in turn nested within municipalities (level-3). Goldstein (2011, p. 221) explicitly states for this structure that “The exploratory variables can be defined at any level. They may also vary over time, allowing so-called time varying covariates.”

So here is a quick explanation of why I think that such a three-level model is able to correctly incorporate time-varying predictors at the municipality level (level-3), such as the environmental variable “Env1”. Because Env1 varies across time, the model automatically treats it as a level-1 variable. It does not know that at each time step (e.g., year 1990) the values of Env1 are the same for all households located in a particular municipality. However, I don’t think that this biases the standard errors for the Env1 variable, because I have household random effects (level-2) included in the model, which estimate a separate intercept for each household. Moreover, I also include an additional variance component at level-2 that allows the slope of Env1 to vary randomly across households. In this way the effect of Env1 is uniquely computed for each household.

References:
Courgeau, D. (2007). Multilevel synthesis: From the group to the individual. Dordrecht, The Netherlands: Springer.
Goldstein, H. (2011). Multilevel statistical models (4th ed.). Chichester, U.K.: John Wiley & Sons.

Re: Time varying predictors at higher aggregation levels

Post by GeorgeLeckie »

Thanks very much for replying.
I suppose one might also consider a simulation study to explore these issues.
Best wishes
George
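
As a starting point for such a simulation study, the data-generating step could look roughly like the sketch below. It is plain Python using only the standard library. All parameter values and the function name simulate_person_periods are made-up assumptions, and fitting the competing model specifications to the simulated data (for example in runmlwin) is not shown.

```python
import math
import random


def simulate_person_periods(n_mun=20, n_hh=30, n_years=3, intercept=-2.0,
                            beta_env=0.5, sd_mun_year=0.5, sd_hh=0.5, seed=1):
    """Simulate stacked discrete-time event history data.

    Env1 and a random effect are drawn once per municipality-year and shared
    by every household in that municipality and year. A household leaves the
    risk set (contributes no further rows) once it migrates.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_mun):
        # Municipality-year quantities: one draw per exposure year.
        env1 = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        w = [rng.gauss(0.0, sd_mun_year) for _ in range(n_years)]
        for i in range(n_hh):
            u = rng.gauss(0.0, sd_hh)  # household random intercept
            for t in range(n_years):
                eta = intercept + beta_env * env1[t] + w[t] + u
                hazard = 1.0 / (1.0 + math.exp(-eta))
                mig = int(rng.random() < hazard)
                rows.append({"exposure": t + 1, "HHID": (j, i), "MunID": j,
                             "Env1": env1[t], "mig": mig})
                if mig:
                    break  # event occurred: stop recording this household
    return rows


data = simulate_person_periods()
print(len(data), "person-period rows,", sum(r["mig"] for r in data), "migrations")
```

Repeating this for many seeds and comparing the estimated standard error of the Env1 coefficient with the empirical spread of its estimates, with and without a municipality-by-year random effect, would show whether the specification matters under these assumptions.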