### Nested models with fixed - sort of - categories

Posted: **Tue Mar 16, 2021 8:03 pm**

I have been asked to run a model in which individuals are nested within divisions of an organisation, and divisions are nested within departments.

I am not sure how to treat the upper levels of the hierarchy. So far I have received data from about 96% of the divisions (about 250 in total), which amounts to a full set of divisional data from over 90% of the departments (about 50 in total).

The remaining data may or may not eventually be supplied.

If I do not receive any more data, I wonder whether I can treat divisions and departments as random effects, even though the population of divisions is barely larger than the sample of divisions, and likewise for departments. That would give a multilevel model with individuals at level 1, divisions at level 2 and departments at level 3. However, I have reason to believe that the 96% of divisions that have supplied data so far are not a random sample of divisions: I could have predicted almost exactly which 4% I was not going to get.
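To make the three-level structure described above concrete, it could be written as follows. This is only a sketch, assuming a continuous outcome and random intercepts only; the symbols ($y$, $\mathbf{x}$, $v$, $u$, $e$ and the variance parameters) are illustrative notation and not something fixed by the data or the brief.

```latex
% i = individual (level 1), j = division (level 2), k = department (level 3)
\begin{aligned}
y_{ijk} &= \beta_0 + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + v_k + u_{jk} + e_{ijk}, \\
v_k &\sim N(0, \sigma_v^2), \qquad
u_{jk} \sim N(0, \sigma_u^2), \qquad
e_{ijk} \sim N(0, \sigma_e^2).
\end{aligned}
```

Here $v_k$ is the department effect and $u_{jk}$ the effect of division $j$ within department $k$. The question is whether treating these two terms as draws from normal distributions is defensible when nearly every division and department is observed and the missing ones are not missing at random.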

If the assumption that divisions and departments are random factors is not justified, or if I do subsequently receive a complete set of data, both divisions and departments will be fixed effects. I don't want to use 249 indicator variables for the divisions and another 49 for the departments, so I would like to know what the implications would be of treating them as levels in the hierarchy even though they would not be random effects, or whether there is a more appropriate alternative treatment.
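For scale, the indicator counts mentioned above can be checked with a short sketch. The layout is hypothetical: it assumes exactly 50 departments with 5 divisions each, which is far more regular than the real organisation, and every name in it is made up for illustration. It also checks, for this layout, that each department indicator is the sum of the indicators of its own divisions, which is what strict nesting implies.

```python
# Hypothetical layout: 50 departments, each with 5 divisions (250 in total).
# The real structure is certainly less regular; this is for illustration only.
N_DEPARTMENTS = 50
DIVISIONS_PER_DEPARTMENT = 5

# Map each division id to the id of the department that contains it.
division_to_department = {
    div: div // DIVISIONS_PER_DEPARTMENT
    for div in range(N_DEPARTMENTS * DIVISIONS_PER_DEPARTMENT)
}


def indicator_count(n_categories):
    """Indicators needed for a factor when one category is the reference."""
    return n_categories - 1


n_divisions = len(division_to_department)
n_departments = len(set(division_to_department.values()))

division_indicators = indicator_count(n_divisions)
department_indicators = indicator_count(n_departments)


def division_row(div):
    """Full one-hot row (no reference dropped) for a division."""
    return [1 if d == div else 0 for d in range(n_divisions)]


def department_row(div):
    """Full one-hot row for the department that contains a division."""
    dept = division_to_department[div]
    return [1 if k == dept else 0 for k in range(n_departments)]


def department_from_divisions(div):
    """Rebuild the department row by summing division columns per department."""
    rebuilt = [0] * n_departments
    for d, value in enumerate(division_row(div)):
        rebuilt[division_to_department[d]] += value
    return rebuilt


# Each division belongs to exactly one department, so the department
# indicators are exact sums of division indicators in this layout.
nested_is_redundant = all(
    department_from_divisions(div) == department_row(div)
    for div in range(n_divisions)
)

print(division_indicators, department_indicators, nested_is_redundant)
# prints: 249 49 True
```

If that redundancy carries over to the real data, a purely fixed-effects specification with division indicators would already absorb the department differences, so the extra 49 department indicators might not be separately estimable.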

Many thanks in advance for any advice that can be provided.

John
