Missing data are a common challenge facing empirical researchers. This paper presents a general GMM framework and estimator for dealing with missing values of an explanatory variable in linear regression analysis. The GMM estimator is efficient under the assumptions needed for consistency of linear-imputation methods. The estimator, which also allows for a specification test of the missingness assumptions, is compared with the linear-imputation, complete-data, and dummy-variable methods commonly used in empirical research. The dummy-variable method is generally inconsistent even when data are missing completely at random and, when consistent, can be less efficient than the complete-data method.
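The claim about the dummy-variable method can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process (a regressor x correlated with a fully observed covariate z, true coefficients 1, 2, and 1, a 50% missingness rate) and the helper names `ols` and `simulate` are assumptions chosen only for illustration. It compares the complete-data estimator with the dummy-variable method, here taken to mean setting missing x to zero and adding a missingness indicator, when x is missing completely at random.

```python
import numpy as np


def ols(X, y):
    """Return least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=200_000, seed=0):
    """Compare complete-data and dummy-variable estimates under MCAR.

    Assumed (illustrative) design: y = 1 + 2*x + 1*z + e, with x = 0.8*z + u.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                   # always-observed covariate
    x = 0.8 * z + rng.normal(size=n)         # regressor correlated with z
    y = 1.0 + 2.0 * x + 1.0 * z + rng.normal(size=n)
    m = rng.random(n) < 0.5                  # True where x is missing (MCAR)

    # Complete-data method: drop every observation with missing x.
    obs = ~m
    X_cc = np.column_stack([np.ones(obs.sum()), x[obs], z[obs]])
    b_cc = ols(X_cc, y[obs])

    # Dummy-variable method: set missing x to zero, add the indicator m.
    x_filled = np.where(m, 0.0, x)
    X_dv = np.column_stack([np.ones(n), x_filled, z, m.astype(float)])
    b_dv = ols(X_dv, y)
    return b_cc, b_dv


b_cc, b_dv = simulate()
print("complete-data  (const, x, z):   ", np.round(b_cc, 3))
print("dummy-variable (const, x, z, m):", np.round(b_dv, 3))
```

In this assumed design the complete-data estimates should sit close to the true values, while the dummy-variable estimates of the x and z coefficients should stay well away from 2 and 1 even in large samples. The reason is that the single z coefficient has to fit both the observed-x and the missing-x observations, and in the latter z also picks up the effect of the omitted x.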
© 2017 The President and Fellows of Harvard College and the Massachusetts Institute of Technology