Abstract

Missing data are a common challenge facing empirical researchers. This paper presents a general GMM framework and estimator for dealing with missing values of an explanatory variable in linear regression analysis. The GMM estimator is efficient under the assumptions needed for consistency of linear imputation methods. The estimator, which also allows for a specification test of the missingness assumptions, is compared to the existing linear imputation, complete data, and dummy variable methods commonly used in empirical research. The dummy variable method is generally inconsistent even when data are missing completely at random and, when consistent, can be less efficient than the complete data method.
