Instrumental variables (IVs) and control variables are frequently used to help researchers investigate endogenous treatment effects. When the two are used together, their identities are typically assumed to be known. In many practical situations, however, one faces a large, mixed set of covariates: some can serve as excluded IVs, some can serve as control variables, and others should be discarded from the model. It is often not possible to classify them on the basis of economic theory alone. This paper proposes a data-driven method that classifies a large set of covariates, whose number increases with the sample size, into excluded IVs, controls, and noise to be discarded. The resulting IV estimator is shown to have the oracle property: it has the same first-order asymptotic distribution as the IV estimator that assumes the true classification is known.
