Alternatives to model diagnostics for statistical inference?

Consider the problem of making statistical inferences, as opposed to predictions. The product of statistical inference is a probabilistic statement about a population quantity, for example a 100(1-α)% confidence interval for a population median. In this context, the principal reason for diagnostics is to comfort ourselves about the quality of such inferences. For example, we would like to ensure that the stated confidence level of an interval is roughly the population coverage. We can't compute the population coverage directly because we generally don't have access to an entire population. Hence, we often attempt to verify the modeling assumptions that lead us to theoretically correct coverage.

In the prediction framework, we use model diagnostics to verify that the model fits well, which has a direct bearing on the quality of predictions. For example, a straight line is generally a poor approximation to a quadratic curve, so predictions from a linear fit will suffer. However, it is still possible to make accurate inferences about the linear approximation itself, for example about its slope. Hence, model fit is not required to make quality inferences. Rather, the requirement is that the associated probability statements are correct.
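
To make this concrete, here is a minimal simulation sketch in plain Python (standard library only). Everything specific in it is my own assumption for illustration: x is uniform on (0, 1), y is x squared plus Gaussian noise, the target of inference is the slope of the population least-squares line (which works out to exactly 1 in this setup, since Cov(x, x^2) = Var(x) = 1/12), and the interval is a pairs-bootstrap percentile interval. The function names `ols_slope`, `pairs_bootstrap_interval`, and `simulate_coverage` are hypothetical.

```python
import math
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pairs_bootstrap_interval(xs, ys, level, n_boot, rng):
    """Percentile interval for the slope, resampling (x, y) pairs together."""
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    slopes.sort()
    alpha = 1.0 - level
    lo = slopes[math.floor((alpha / 2) * (n_boot - 1))]
    hi = slopes[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


def simulate_coverage(n, level, n_sims, n_boot, noise_sd, rng):
    """Fraction of simulated data sets whose interval contains the true slope."""
    true_slope = 1.0  # slope of the best linear approximation to x^2 on U(0, 1)
    hits = 0
    for _ in range(n_sims):
        xs = [rng.random() for _ in range(n)]
        ys = [x * x + rng.gauss(0.0, noise_sd) for x in xs]  # quadratic truth
        lo, hi = pairs_bootstrap_interval(xs, ys, level, n_boot, rng)
        hits += int(lo <= true_slope <= hi)
    return hits / n_sims


rng = random.Random(1)  # fixed seed so the run is repeatable
slope_coverage = simulate_coverage(
    n=50, level=0.95, n_sims=200, n_boot=200, noise_sd=0.1, rng=rng
)
print(f"nominal 0.95, simulated coverage for the slope: {slope_coverage:.3f}")
```

The linear model is plainly wrong for these data, yet the interval targets a well-defined population quantity, so its coverage can still be checked directly. With only 200 simulated data sets the coverage estimate is itself noisy, so small departures from the nominal level should not be over-read.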

Assessing model diagnostics is an indirect mechanism to comfort ourselves about the quality of inferences. As an alternative, we might attempt a more direct check, for example by constructing an empirical estimate of coverage. We may then go further and adjust, or calibrate, the confidence interval so that it has the correct empirical coverage. These ideas are fundamental parts of the 'double bootstrap' and 'iterated Monte Carlo' methods. For the sake of argument, I will state that this type of empirical check and calibration is sufficient to fully replace model diagnostics for statistical inference. It is also my hypothesis that model diagnostics have historically been favored over iterated Monte Carlo methods (the double bootstrap appeared in the late 1980s) because the latter are more computationally intensive. Current computational tools mitigate, but do not eliminate, this concern.
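
A minimal sketch of the check-and-calibrate idea, in plain Python using only the standard library. It is one simple variant of a double bootstrap, not a definitive implementation: the observed sample stands in for the population (so its median plays the role of the known truth), the outer resamples stand in for new samples, and the inner resamples build a percentile interval for each outer resample. The data are simulated, and the function names (`percentile_interval`, `bootstrap_medians`, `calibrate_level`), the grid of candidate levels, and the resample counts are all my assumptions.

```python
import math
import random
import statistics


def percentile_interval(sorted_values, level):
    """Percentile interval read off an already-sorted list of estimates."""
    m = len(sorted_values)
    alpha = 1.0 - level
    lo = sorted_values[math.floor((alpha / 2) * (m - 1))]
    hi = sorted_values[math.ceil((1 - alpha / 2) * (m - 1))]
    return lo, hi


def bootstrap_medians(sample, n_boot, rng):
    """Sorted medians of n_boot resamples drawn with replacement."""
    n = len(sample)
    return sorted(
        statistics.median(rng.choices(sample, k=n)) for _ in range(n_boot)
    )


def calibrate_level(sample, target, candidate_levels, n_outer, n_inner, rng):
    """Estimate coverage at each nominal level; pick the one closest to target."""
    truth = statistics.median(sample)  # "population" median in the bootstrap world
    n = len(sample)
    hits = {level: 0 for level in candidate_levels}
    for _ in range(n_outer):
        outer = rng.choices(sample, k=n)               # outer resample
        inner = bootstrap_medians(outer, n_inner, rng)  # inner resamples
        for level in candidate_levels:
            lo, hi = percentile_interval(inner, level)
            hits[level] += int(lo <= truth <= hi)
    coverage = {level: hits[level] / n_outer for level in candidate_levels}
    best = min(candidate_levels, key=lambda level: abs(coverage[level] - target))
    return best, coverage


rng = random.Random(2024)  # fixed seed so the run is repeatable
sample = [rng.expovariate(1.0) for _ in range(30)]  # simulated skewed data
candidate_levels = [0.80, 0.85, 0.90, 0.95, 0.99]

best, coverage = calibrate_level(
    sample, target=0.90, candidate_levels=candidate_levels,
    n_outer=200, n_inner=200, rng=rng,
)
for level in candidate_levels:
    print(f"nominal {level:.2f} -> estimated coverage {coverage[level]:.3f}")

# Final interval: an ordinary bootstrap, but at the calibrated nominal level.
final_lo, final_hi = percentile_interval(bootstrap_medians(sample, 1000, rng), best)
print(f"calibrated nominal level {best:.2f}: ({final_lo:.3f}, {final_hi:.3f})")
```

The nested loops are where the computational cost comes from: the number of median computations is the product of the outer and inner resample counts, which is small here only because both counts are kept deliberately low. A finer grid of candidate levels, or interpolation between them, would give a smoother calibration.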

I will present examples with R code in a later post.