
Using caterpillarR with NA values

Posted: Sun Feb 26, 2017 11:42 am
by adeldaoud
Hi

I would like to draw Quantile-Quantile plots to check model assumptions. However, when I have missing observations (which is not uncommon), caterpillarR throws an error:

Code:

> caterpillarR(mymodel['residual'], lev = 2)
Error in tt[1, 1, ] <- var : 
  number of items to replace is not a multiple of replacement length
In addition: Warning message:
In is.na(object) : is.na() applied to non-(list or vector) of type 'NULL'
Can I specify an option so that caterpillarR omits NAs? (Re-running the models with complete data is too tedious, as I am working through many different specifications.)

Thanks

Re: Using caterpillarR with NA values

Posted: Mon Feb 27, 2017 2:58 pm
by ChrisCharlton
If I understand the code that is causing this error correctly:

Code:

  est.names <- names(myresi)[grep(paste("lev_", lev, "_resi_est", sep = ""), names(myresi))]
  if (length(est.names) == 1) {
    est <- as.matrix(na.omit(myresi[[est.names]]))
    colnames(est) <- sub("_resi_est", "", est.names)
    var <- na.omit(myresi[[grep(paste("lev_", lev, "_resi_(var|variance)_", sep = ""), names(myresi))[1]]])
    d1 <- length(est)
    tt <- array(, c(1, 1, d1))
    tt[1, 1, ] <- var
  }
then to trigger what you are seeing the following conditions would need to be true:
  • You have only one parameter set to be random at the chosen level
  • You have at least one unit that is entirely missing at the higher level (otherwise there wouldn't be any missing residuals)
  • The number of missing values in the residual estimates is different to the number of missing values in their variances
Is this the case for your problematic models?
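
The third condition can be reproduced in isolation. The following is a minimal sketch using made-up vectors: est_raw and var_raw are hypothetical stand-ins for the stored residual estimates and variances, not output from any model. It only illustrates how a length mismatch after na.omit() makes the array assignment fail.

```r
# Hypothetical residual estimates and variances with different numbers of NAs
est_raw <- c(0.2, NA, -0.1, 0.4)   # one missing value
var_raw <- c(0.05, NA, NA, 0.07)   # two missing values

est <- as.matrix(na.omit(est_raw)) # 3 values remain
v   <- na.omit(var_raw)            # 2 values remain

# Same construction as in the quoted code: one slot per remaining estimate
tt <- array(NA_real_, c(1, 1, length(est)))

# Assigning 2 values into 3 slots fails; capture the message instead of stopping
res <- tryCatch({
  tt[1, 1, ] <- v
  "ok"
}, error = function(e) conditionMessage(e))

print(res)
```

If the two vectors lose the same number of values, the assignment succeeds and res is "ok".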

Re: Using caterpillarR with NA values

Posted: Tue Mar 07, 2017 11:15 pm
by adeldaoud
Hi Chris,

I am not sure whether this is the code that is causing the error.

The following is my model, with the residuals stored:

> myresi <- result.l_imf.educ.collapsed.urban.int[[1]]@residual
> names(myresi)
[1] "lev_2_resi_est_Intercept" "lev_2_resi_se_Intercept" "lev_2_std_resi_est_Intercept" "lev_2_resi_leverage_Intercept"
[5] "lev_2_resi_deletion_Intercept" "lev_2_resi_influence_Intercept" "lev_2_residualid" "lev_1_resi_est_bcons.1"
[9] "lev_1_resi_se_bcons.1" "lev_1_std_resi_est_bcons.1" "lev_1_resi_leverage_bcons.1" "lev_1_resi_deletion_bcons.1"
[13] "lev_1_resi_influence_bcons.1" "lev_1_residualid"


Any ideas?

Re: Using caterpillarR with NA values

Posted: Wed Mar 08, 2017 11:33 am
by ChrisCharlton
The caterpillarR function looks for data with names starting with lev_2_resi_var; however, no such entries appear to have been output from your model. I suspect that this is the cause of the error message that you are seeing.
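
One way to check this before calling caterpillarR is to test the residual names against the same patterns the quoted code uses. This is a rough sketch: the myresi list below is a hypothetical mock-up that only reproduces the level 2 names posted above (with dummy values), and has_est and has_var are illustrative variable names.

```r
# Mock residual list carrying the level 2 names from the post (values are dummies)
resi_names <- c("lev_2_resi_est_Intercept", "lev_2_resi_se_Intercept",
                "lev_2_std_resi_est_Intercept", "lev_2_resi_leverage_Intercept",
                "lev_2_resi_deletion_Intercept", "lev_2_resi_influence_Intercept",
                "lev_2_residualid")
myresi <- setNames(vector("list", length(resi_names)), resi_names)

lev <- 2

# Same patterns as in the quoted caterpillarR code
est_pattern <- paste("lev_", lev, "_resi_est", sep = "")
var_pattern <- paste("lev_", lev, "_resi_(var|variance)_", sep = "")

has_est <- any(grepl(est_pattern, names(myresi)))
has_var <- any(grepl(var_pattern, names(myresi)))

print(c(estimates = has_est, variances = has_var))
```

With these names the estimates are found but no variance entry matches, which is consistent with the explanation above.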

Re: Using caterpillarR with NA values

Posted: Wed Mar 08, 2017 1:14 pm
by adeldaoud
OK, that is strange, because I specified resi.store = TRUE with the following options:

resi.store = TRUE, resioptions = c("standardised", "leverage", "influence", "deletion"),

Should I specify the "resioptions" differently?

Re: Using caterpillarR with NA values

Posted: Wed Mar 08, 2017 2:39 pm
by ChrisCharlton
The default for resioptions is "variance"; however, if you specify your own options then it will not be included automatically. You will therefore need to add it to your list if you want to make use of it.
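
As a small illustration, the options posted earlier can be checked for "variance", and an options list that does request it can be built. This is only a sketch: the list name opts_variance is hypothetical, passing such a list as the estoptions argument of runMLwiN is an assumption not shown in this thread, and the thread does not say whether "variance" can be combined with the other options in a single fit, so it is kept on its own here.

```r
# Options as posted earlier in the thread: "variance" is absent
posted <- c("standardised", "leverage", "influence", "deletion")
posted_has_variance <- "variance" %in% posted
print(posted_has_variance)

# Hypothetical options list that stores residuals with their variances
opts_variance <- list(
  resi.store  = TRUE,
  resioptions = c("variance")
)

# Assumed usage (not run here): runMLwiN(..., estoptions = opts_variance)
print(opts_variance$resioptions)
```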