Working papers

“Assessing Sensitivity to Unconfoundedness: Estimation and Inference”, with Alexandre Poirier and Linqi Zhang (December 2020)

  • This paper provides a set of methods for quantifying the robustness of treatment effects estimated using the unconfoundedness assumption (also known as selection on observables or conditional independence). Specifically, we estimate and do inference on bounds on various treatment effect parameters, like the average treatment effect (ATE) and the average effect of treatment on the treated (ATT), under nonparametric relaxations of the unconfoundedness assumption indexed by a scalar sensitivity parameter c. These relaxations allow for limited selection on unobservables, depending on the value of c. For large enough c, these bounds equal the no assumptions bounds. Using a non-standard bootstrap method, we show how to construct confidence bands for these bound functions which are uniform over all values of c. We illustrate these methods with an empirical application to effects of the National Supported Work Demonstration program. We implement these methods in a companion Stata module for easy use in practice.

To install the companion Stata module, type

ssc install tesensitivity

from within Stata. Type

help tesensitivity

for syntax and instructions. Also see our vignette for a walkthrough. All files are also available on our GitHub repo.
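For readers unfamiliar with the "no assumptions bounds" mentioned in the abstract above, the following is a minimal Python sketch of the classical worst-case bounds on the ATE for a bounded outcome and a binary treatment. It is an illustration only: it is not the tesensitivity implementation, and the function name, the arguments, and the assumed known outcome support [y_min, y_max] are hypothetical choices made for this example.

```python
import numpy as np

def no_assumptions_ate_bounds(y, x, y_min=0.0, y_max=1.0):
    """Worst-case bounds on the ATE with no restriction on treatment selection.

    Hypothetical helper for illustration. Assumes the outcome lies in
    [y_min, y_max], the treatment x is binary (0/1), and both treatment
    groups are non-empty.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=int)

    p1 = x.mean()          # share of treated units
    p0 = 1.0 - p1          # share of untreated units
    m1 = y[x == 1].mean()  # mean observed outcome among the treated
    m0 = y[x == 0].mean()  # mean observed outcome among the untreated

    # E[Y1] is observed for treated units; for untreated units the
    # unobserved potential outcome is replaced by its extreme values.
    ey1_lo = m1 * p1 + y_min * p0
    ey1_hi = m1 * p1 + y_max * p0

    # Symmetric argument for E[Y0].
    ey0_lo = m0 * p0 + y_min * p1
    ey0_hi = m0 * p0 + y_max * p1

    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

lo, hi = no_assumptions_ate_bounds([1, 1, 0, 0], [1, 1, 0, 0])
```

The width of these bounds always equals y_max - y_min, which is why they are uninformative about the sign of the ATE on their own and why intermediate values of the sensitivity parameter c are of interest.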

“ivcrc: An Instrumental Variables Estimator for the Correlated Random Coefficients Model”, with David Benson and Alexander Torgovitsky (June 2020), Revision Requested

  • We present the ivcrc command, which implements an instrumental variables (IV) estimator for the linear correlated random coefficients (CRC) model. This model is a natural generalization of the standard linear IV model that allows for endogenous, multivalued treatments and unobserved heterogeneity in treatment effects. The proposed estimator uses recent semiparametric identification results that allow for flexible functional forms and permit instruments that may be binary, discrete, or continuous. The command also allows for the estimation of varying coefficients regressions, which are closely related in structure to the proposed IV estimator. We illustrate this IV estimator and the ivcrc command by estimating the returns to education in the National Longitudinal Survey of Young Men.

“Interpreting Quantile Independence”, with Alexandre Poirier (April 2018)

  • How should one assess the credibility of assumptions weaker than statistical independence, like quantile independence? In the context of identifying causal effects of a treatment variable, we argue that such deviations should be chosen based on the form of selection on unobservables they allow. For quantile independence, we characterize this form of treatment selection. Specifically, we show that quantile independence is equivalent to a constraint on the average value of either a latent propensity score (for a binary treatment) or the cdf of treatment given the unobservables (for a continuous treatment). In both cases, this average value constraint requires a kind of non-monotonic treatment selection. Using these results, we show that several common treatment selection models are incompatible with quantile independence. We introduce a class of assumptions which weakens quantile independence by removing the average value constraint, and therefore allows for monotonic treatment selection. In a potential outcomes model with a binary treatment, we derive identified sets for the ATT and QTT under both classes of assumptions. In a numerical example we show that the average value constraint inherent in quantile independence has substantial identifying power. Our results suggest that researchers should carefully consider the credibility of this non-monotonicity property when using quantile independence to weaken full independence.

“Partial Independence in Nonseparable Models”, with Alexandre Poirier (June 2016); Portions of this paper appear in [4]

  • We analyze identification of nonseparable models under three kinds of exogeneity assumptions weaker than full statistical independence. The first is based on quantile independence. Selection on unobservables drives deviations from full independence. We show that such deviations based on quantile independence require non-monotonic and oscillatory propensity scores. Our second and third approaches are based on a distance-from-independence metric, using either a conditional cdf or a propensity score. Under all three approaches we obtain simple analytical characterizations of identified sets for various parameters of interest. We do this in three models: the exogenous regressor model of Matzkin (2003), the instrumental variable model of Chernozhukov and Hansen (2005), and the binary choice model with nonparametric latent utility of Matzkin (1992).

“Instrumental Variables Estimation of a Generalized Correlated Random Coefficients Model” (2014), with Alexander Torgovitsky; Portions of this paper appear in [2]

Published papers

[7] “Salvaging Falsified Instrumental Variable Models”, with Alexandre Poirier (this version: Jan 2020; first version: Dec 2018), Econometrica (forthcoming)

  • What should researchers do when their baseline model is refuted? We provide four constructive answers. First, researchers can measure the extent of falsification. To do this, we consider continuous relaxations of the baseline assumptions of concern. We then define the falsification frontier: The smallest relaxations of the baseline model which are not refuted. This frontier provides a quantitative measure of the extent of falsification. Second, researchers can present the identified set for the parameter of interest under the assumption that the true model lies somewhere on this frontier. We call this the falsification adaptive set. This set generalizes the standard baseline estimand to account for possible falsification. Third, researchers can present the identified set for a specific point on this frontier. Finally, as a sensitivity analysis, researchers can present identified sets for points beyond the frontier. To illustrate these four ways of salvaging falsified models, we study overidentifying restrictions in two instrumental variable models: a homogeneous effects linear model, and heterogeneous effect models with either binary or continuous outcomes. In the linear model, we consider the classical overidentifying restrictions implied when multiple instruments are observed. We generalize these conditions by considering continuous relaxations of the classical exclusion restrictions. By sufficiently weakening the assumptions, a falsified baseline model becomes non-falsified. We obtain analogous results in the heterogeneous effect models, where we derive identified sets for marginal distributions of potential outcomes, falsification frontiers, and falsification adaptive sets under continuous relaxations of the instrument exogeneity assumptions. We illustrate our results in four different empirical applications.

[6] “Inference on Breakdown Frontiers” (2020), with Alexandre Poirier (Supplemental appendix; Replication files; May 2017 draft; arXiv drafts), Quantitative Economics

  • A breakdown frontier is the boundary between the set of assumptions which lead to a specific conclusion and those which do not. In a potential outcomes model with a binary treatment, we consider two conclusions: First, that ATE is at least a specific value (e.g., nonnegative) and second that the proportion of units who benefit from treatment is at least a specific value (e.g., at least 50%). For these conclusions, we derive the breakdown frontier for two kinds of assumptions: one which indexes deviations from random assignment of treatment, and one which indexes deviations from rank invariance. These classes of assumptions nest both the point identifying assumptions of random assignment and rank invariance and the opposite end of no constraints on treatment selection or the dependence structure between potential outcomes. This frontier provides a quantitative measure of robustness of conclusions to deviations from the point identifying assumptions. We derive root-N-consistent sample analog estimators for these frontiers. We then provide two asymptotically valid bootstrap procedures for constructing lower uniform confidence bands for the breakdown frontier. As a measure of robustness, estimated breakdown frontiers and their corresponding confidence bands can be presented alongside traditional point estimates and confidence intervals obtained under point identifying assumptions. We illustrate this approach in an empirical application to the effect of child soldiering on wages. We find that the conclusions we consider are fairly robust to failure of rank invariance, when random assignment holds, but these conclusions are much more sensitive to both assumptions for small deviations from random assignment.

[5] “A Practical Guide to Compact Infinite Dimensional Parameter Spaces” (2019), with Joachim Freyberger (Supplemental appendix; First version), Econometric Reviews

  • We gather and review general compactness results for many commonly used parameter spaces in nonparametric estimation, and we provide several new results. We consider three kinds of functions: (1) functions with bounded domains which satisfy standard norm bounds, (2) functions with bounded domains which do not satisfy standard norm bounds, and (3) functions with unbounded domains. In all three cases we provide two kinds of results, compact embedding and closedness, which together allow one to show that parameter spaces defined by a strong norm bound are compact under a consistency norm. We illustrate how these results are typically used in econometrics by considering two common settings: nonparametric mean regression and nonparametric instrumental variables estimation.

[4] “Identification of Treatment Effects under Conditional Partial Independence” (2018), with Alexandre Poirier, Econometrica (Journal link)

  • Conditional independence of treatment assignment from potential outcomes is a commonly used but nonrefutable assumption. We derive identified sets for various treatment effect parameters under nonparametric deviations from this conditional independence assumption. These deviations are defined via a conditional treatment assignment probability, which makes them straightforward to interpret. Our results can be used to assess the robustness of empirical conclusions obtained under the baseline conditional independence assumption.

Note: See our working paper Masten, Poirier, and Zhang (2020) for the corresponding estimation and inference theory, as well as a companion Stata module: ssc install tesensitivity. See the installation instructions above.

[3] “Random Coefficients on Endogenous Variables in Simultaneous Equations Models” (2018), The Review of Economic Studies (Supplemental appendix; Matlab and Stata code; 2015 preprint; 2013 preprint)

  • This paper considers a classical linear simultaneous equations model with random coefficients on the endogenous variables. Simultaneous equations models are used to study social interactions, strategic interactions between firms, and market equilibrium. Random coefficient models allow for heterogeneous marginal effects. I show that random coefficient seemingly unrelated regression models with common regressors are not point identified, which implies random coefficient simultaneous equations models are not point identified. Important features of these models, however, can be identified. For two-equation systems, I give two sets of sufficient conditions for point identification of the coefficients’ marginal distributions conditional on exogenous covariates. The first allows for small support continuous instruments under tail restrictions on the distributions of unobservables which are necessary for point identification. The second requires full support instruments, but allows for nearly arbitrary distributions of unobservables. I discuss how to generalize these results to many equation systems, where I focus on linear-in-means models with heterogeneous endogenous social interaction effects. I give sufficient conditions for point identification of the distributions of these endogenous social effects. I propose a consistent nonparametric kernel estimator for these distributions based on the identification arguments. I apply my results to the Add Health data to analyze peer effects in education.

[2] “Identification of Instrumental Variable Correlated Random Coefficients Models” (2016), with Alexander Torgovitsky, The Review of Economics and Statistics (Preprint)

Companion Stata module (GitHub)

  • We study identification and estimation of the average partial effect in an instrumental variable correlated random coefficients model with continuously distributed endogenous regressors. This model allows treatment effects to be correlated with the level of treatment. The main result shows that the average partial effect is identified by averaging coefficients obtained from a collection of ordinary linear regressions that condition on different realizations of a control function. These control functions can be constructed from binary or discrete instruments which may affect the endogenous variables heterogeneously. Our results suggest a simple estimator that can be implemented with a companion Stata module.

[1] “A Specification Test for Discrete Choice Models” (2013), with Mark Chicu, Economics Letters

  • In standard discrete choice models, adding options cannot increase the choice probability of an existing alternative. We use this observation to construct a simple nonparametric specification test by exploiting variation in the choice sets individuals face. We use a multiple testing procedure to determine the particular kind of choice sets that produce violations. We apply these tests to the 1896 US House of Representatives election and reject commonly used discrete choice voting models.
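The observation that adding options cannot increase the choice probability of an existing alternative can be checked directly on a table of choice probabilities. The Python sketch below is an illustration of that population-level condition only; it is not the statistical test or the multiple testing procedure from the paper, and the function name, the data layout, and the numbers in the example are all hypothetical.

```python
def regularity_violations(choice_probs, tol=1e-12):
    """List violations of the 'adding options cannot help' condition.

    Hypothetical helper for illustration. `choice_probs` maps each choice
    set (a frozenset of alternatives) to a dict giving the choice
    probability of every alternative in that set.
    """
    violations = []
    for small, p_small in choice_probs.items():
        for large, p_large in choice_probs.items():
            # Only compare a choice set with a strictly larger one.
            if not small < large:
                continue
            for alt in small:
                # The alternative should not become more likely to be
                # chosen once further options are added.
                if p_large[alt] > p_small[alt] + tol:
                    violations.append((alt, small, large))
    return violations

# Made-up probabilities: "A" gains share when "C" is added, "B" does not.
probs = {
    frozenset({"A", "B"}): {"A": 0.6, "B": 0.4},
    frozenset({"A", "B", "C"}): {"A": 0.7, "B": 0.2, "C": 0.1},
}
found = regularity_violations(probs)
```

With sample data the probabilities are estimated, so an apparent violation may reflect sampling noise; accounting for that is what a formal test adds on top of this check.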

Other published works

“How Should the Graduate Economics Core be Changed?” (2011), with Jose Miguel Abito, Katarina Borovickova, Hays Golden, Jacob Goldin, Miguel Morin, Alexandre Poirier, Vincent Pons, Israel Romem, Tyler Williams, and Chamna Yoon, The Journal of Economic Education

  • The authors present suggestions by graduate students from a range of economics departments for improving the first-year core sequence in economics. The students identified a number of elements that should be added to the core: more training in building microeconomic models, a discussion of the methodological foundations of model-building, more emphasis on institutions to motivate and contextualize macroeconomic models, and greater focus on econometric practice rather than theory. The authors hope that these suggestions will encourage departments to take a fresh look at the content of the first-year core.