Improving MCMC convergence time
Posted: Fri Jul 21, 2017 7:37 pm
Hello,
I am building a three-level logistic model using the runmlwin command. I was able to build the model using PQL2 estimation and I am interested in also fitting the model through MCMC methods. It is a fairly large dataset and model, with the following structure:
- More than 200,000 observations, of which around 2,000 have a 1 for the outcome
- Random intercepts for states (49), county within state (2343), and individuals within county
- 14 fixed effects variables, including quadratic terms, categorical, and continuous variables (examples: year, population, weight)
- A categorical X continuous interaction term
I fit the model using MCMC methods, with a burn-in of 5,000 iterations and a chain of 100,000 iterations, which took around a day. Unfortunately, most of the parameters were not close to converging after this many iterations: the chains for most of the coefficients and for the variance components showed autocorrelations of about 1 at every lag shown in the diagnostic plots. The Raftery-Lewis statistics (for the state variance, for example) suggest that I would need to run the model for 850,000 iterations.
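To illustrate why autocorrelations near 1 translate into such long required run lengths, here is a minimal Python sketch. It is not part of the MLwiN run: the chains are simulated AR(1) series standing in for MCMC output, and the function names (autocorr, effective_sample_size, ar1_chain) are hypothetical helpers chosen for this example. The effective sample size is estimated as n / (1 + 2 * sum of positive-lag autocorrelations), with the sum truncated at the first non-positive autocorrelation, which is one simple convention among several.

```python
import numpy as np

def autocorr(chain, max_lag):
    """Sample autocorrelations of a chain at lags 0..max_lag."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    rho = [1.0]
    for k in range(1, max_lag + 1):
        rho.append(np.dot(x[:-k], x[k:]) / (n * var))
    return np.array(rho)

def effective_sample_size(chain, max_lag=1000):
    """Approximate ESS = n / (1 + 2 * sum of autocorrelations).

    The sum is cut off at the first non-positive autocorrelation,
    a simple truncation rule assumed here for illustration.
    """
    rho = autocorr(chain, max_lag)
    total = 0.0
    for r in rho[1:]:
        if r <= 0:
            break
        total += r
    return len(chain) / (1.0 + 2.0 * total)

def ar1_chain(n, phi, seed=0):
    """Simulated AR(1) series used as a stand-in for an MCMC chain."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n)
    x = np.empty(n)
    x[0] = noise[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

if __name__ == "__main__":
    n = 20000
    for phi in (0.0, 0.9, 0.99):
        ess = effective_sample_size(ar1_chain(n, phi))
        print(f"phi={phi}: about {ess:.0f} effective draws out of {n}")
```

The closer the lag-1 autocorrelation is to 1, the fewer effectively independent draws a chain of fixed length contains, which is consistent with a diagnostic asking for several times more iterations than were run.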
Here is a sample of the code I used:
runmlwin outcome cons var1 var2 ... varj ... varJ , level3(state: cons) level2(county_FIPS: cons) level1(short_id:) discrete(distribution(binomial) link(logit) denominator(cons) pql2) mlwinsettings(optimat) nopause rigls
runmlwin outcome cons var1 var2 ... varj ... varJ , level3(state: cons) level2(county_FIPS: cons) level1(short_id:) discrete(distribution(binomial) link(logit) denominator(cons) pql2) mlwinsettings(optimat) nopause mcmc(burnin(5000) chain(100000)) initsprevious
Could you offer any advice on how to improve the convergence of my model, short of taking variables out or letting the model run for a few days to see whether it improves? Any comments you can provide would be greatly appreciated.
Thank you,
Madeline