'Realcom'
Developing multilevel models for REAListically COMplex social science data
Problems?
Go to Realcom discussion forum
The ESRC has rated this project as outstanding. The outstanding grade indicates that a project has fully met its objectives and has provided an exceptional research contribution well above average or very high in relation to the level of award. Go to ESRC award details.
Realcom Downloads
- Realcom Imputation A new version of REALCOM-IMPUTE is now available with fixes to known bugs. (03-Nov-09)
- Realcom Installer (updated 03-Nov-09)
Note: During installation you may get a message on your screen: ".Net Framework is not installed - do you want to stop this installation and install .Net first?". Answer: You do not need .Net to run the application.
Further instructions are available in the training manual on page 2 (page 5 of the PDF).
- Matlab runtime installer for Realcom - You will need this if you do not have MATLAB already installed. Please email info-cmm@bristol.ac.uk giving your name, organisation and email address to register for this download.
- Realcom workshop training manual - This version (June 2008) of the training manual has been updated since the printed manual you may have received if you attended any of the workshops.
Workshop Powerpoint presentation
Bug Fixes
- For some models with missing data in categorical variables, incorrect values may have been imputed. Generally this will have given clearly incorrect results. This bug is now fixed (25-Mar-09).
- A bug has now been fixed in the Realcom Factor Module. This affected models with ordered categorical responses with more than 3 categories.
- March 2009 version fixed certain bugs for level 2 responses that could have caused crashes.
Previous Bugs (earlier versions of the software)
To resolve the following bugs please ensure you have the latest version of Realcom (updated 03-Nov-09).
- When using MLwiN in conjunction with Realcom-impute, only the first imputed data set is used when running the imputation analysis. The workaround is to manually specify the data sets to use in the ISTA command.
For example, if you have 10 data sets to analyse, then instead of selecting:
Model->Imputation->Start Analysis
go to the command window and run the command:
ISTA 1 2 3 4 5 6 7 8 9 10
- There was a bug in the mixed response modelling macros that would have affected some models with ordered categorical responses at level 2 (28-Nov-07). This has been corrected.
- The class size data set originally supplied for the missing data example was incorrect. The new one is now part of the installation. The training manual is now supplied with corrections to the data set description.
Revised dataset.
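For the ISTA workaround described above, the command line for any number of imputed data sets can be built mechanically. The following is only a minimal sketch in Python: the helper name `ista_command` is hypothetical and not part of Realcom or MLwiN, and it assumes the data sets are numbered consecutively from 1, as in the 10-data-set example. The resulting string still has to be pasted into the MLwiN command window by hand.

```python
def ista_command(n_datasets: int) -> str:
    """Build the text of an ISTA command listing data sets 1..n_datasets.

    Hypothetical helper: it only formats a string in the same shape as the
    example "ISTA 1 2 3 4 5 6 7 8 9 10"; it does not talk to MLwiN.
    """
    if n_datasets < 1:
        raise ValueError("at least one imputed data set is required")
    # Assumption: imputed data sets are numbered consecutively from 1.
    numbers = " ".join(str(i) for i in range(1, n_datasets + 1))
    return "ISTA " + numbers


if __name__ == "__main__":
    # Prints the command for 10 imputed data sets.
    print(ista_command(10))
```

Run with 10 data sets, this reproduces the command given in the example above.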
The project developed new methodology and associated training materials in the following areas of multilevel modelling: structural equation models, measurement errors and multivariate mixed response types at more than one level of the data hierarchy.
The methodology builds upon that already implemented in MLwiN version 2.02, which is described in the MLwiN manuals. The training materials are written in MATLAB and are available as free-standing programs. They are designed to interface with MLwiN in terms of data transfer but have their own graphical user interfaces for setting up models and displaying results. There is a set of training materials which provides an introduction to the methodology and a guide to using the software.
Applications are to a variety of problems, including flexible prediction models, multiple imputation for missing data in multilevel models, and misclassification errors in social status data.
Three repeated 1-day workshops were held in Bristol, London and Birmingham in June/July 2007.
Papers
Modelling measurement errors and category misclassifications in multilevel models
Harvey Goldstein, Daphne Kounali and Anthony Robinson: Statistical Modelling 2008; 8 (3): 243-261
Models are developed to adjust for measurement errors in normally distributed predictor and response variables and categorical predictors with misclassification errors. The models allow for a hierarchical data structure and for correlations among the errors and misclassifications. Markov Chain Monte Carlo (MCMC) estimation is used.
The models with examples are also described in the REALCOM training manual and users can fit these in the REALCOM software.
Multilevel Structural Equation Models for the Analysis of Comparative Data on Educational Performance
Harvey Goldstein, Gérard Bonnet, Thierry Rocher
Ministère de l’Education Nationale, de l’Enseignement Supérieur et de la Recherche, Direction de l’Évaluation et de la Prospective, Paris
The Programme for International Student Assessment comparative study of reading performance among 15-year-olds is reanalyzed using statistical procedures that allow the full complexity of the data structures to be explored. The article extends existing multilevel factor analysis and structural equation models and shows how this can extract richer information from the data and provide better fits to the data. It shows how these models can be used fully to explore the dimensionality of the data and to provide efficient, single-stage models that avoid the need for multiple imputation procedures. Markov Chain Monte Carlo methodology for parameter estimation is described.
Multilevel Models with multivariate mixed response types
Harvey Goldstein, James Carpenter, Michael G Kenward, Kate A Levin
We build upon the existing literature to formulate a class of models for multivariate mixtures of Gaussian, ordered or unordered categorical responses and continuous distributions that are not Gaussian, each of which can be defined at any level of a multilevel data hierarchy. We describe an MCMC algorithm for fitting such models. We show how this unifies a number of disparate problems, including partially observed data and missing data in generalised linear modelling. The 2-level model is considered in detail with worked examples of applications to a prediction problem and to multiple imputation for missing data. We conclude with a discussion outlining possible extensions and connections in the literature. Software for estimating the models is freely available.
This paper is based upon the REALCOM research project.
The Realcom team
Harvey Goldstein (Project Director), Jon Rasbash, Fiona Steele (Co-Directors), Christopher Charlton (Research Officer), Hilary Browne (Web Developer), Sophie Pollard (Project Assistant)
This three-year ESRC-funded research project developed multilevel modelling techniques, software and training materials in three areas: models with responses at several levels of a data hierarchy, multilevel structural equation models, and measurement error modelling. The models developed under the project were fitted using Markov Chain Monte Carlo (MCMC) estimation.

