Matt Masten | Research

Work in progress

[19] Causality for the Cautious (PhD textbook)

[18] "Assessing Sensitivity to Identifying Assumptions", with Alexandre Poirier, Revision Requested, Journal of Economic Literature

[17] "Assessing IV Exclusion and Exogeneity without First Stage Monotonicity", with Paul Diegert and Alexandre Poirier

Working papers

[16] “An Axiomatic Approach to Comparing Sensitivity Parameters”, with Paul Diegert and Alexandre Poirier (April 2025; latest draft: June 2025). Note: This material first appeared in our retired and superseded drafts arXiv:2206.02303v3 (May 2023) and arXiv:2206.02303v4 (July 2023). Those drafts also contained additional results which are now in [15].

Abstract

Many methods are available for assessing the importance of omitted variables. These methods typically make different, non-falsifiable assumptions. Hence the data alone cannot tell us which method is most appropriate. Since it is unreasonable to expect results to be robust against all possible robustness checks, researchers often use methods deemed “interpretable”, a subjective criterion with no formal definition. In contrast, we develop the first formal, axiomatic framework for comparing and selecting among these methods. Our framework is analogous to the standard approach for comparing estimators based on their sampling distributions. We propose that sensitivity parameters be selected based on their covariate sampling distributions, a design distribution of parameter values induced by an assumption on how covariates are assigned to be observed or unobserved. Using this idea, we define a new concept of parameter consistency, and argue that a reasonable sensitivity parameter should be consistent. We prove that the literature's most popular approach is inconsistent, while several alternatives are consistent.

[15] “Assessing Omitted Variable Bias when the Controls are Endogenous”, with Paul Diegert and Alexandre Poirier (June 2022; latest draft: June 2025). Note: A previous draft contained material that is no longer in the current draft; it is now in [16].

Abstract

Omitted variables are one of the most important threats to the identification of causal effects. Several widely used methods assess the impact of omitted variables on empirical conclusions by comparing measures of selection on observables with measures of selection on unobservables. The recent literature has discussed various limitations of these existing methods, however. This includes a companion paper of ours which explains issues that arise when the omitted variables are endogenous, meaning that they are correlated with the included controls. In the present paper, we develop a new approach to sensitivity analysis that avoids those limitations, while still allowing researchers to calibrate sensitivity parameters by comparing the magnitude of selection on observables with the magnitude of selection on unobservables as in previous methods. We illustrate our results in an empirical study of the effect of historical American frontier life on modern cultural beliefs. Finally, we implement these methods in the companion Stata module regsensitivity for easy use in practice.

See [12] for Stata module installation instructions.

[14] “Finite Population Identification and Design-Based Sensitivity Analysis”, with Brendan Kline (April 2025; latest draft: June 2025)

Abstract

We develop a new approach for quantifying uncertainty in finite populations, by using design distributions to calibrate sensitivity parameters in finite population identified sets. This yields uncertainty intervals that can be interpreted as identified sets, Bayesian credible sets, or frequentist design-based confidence sets. We focus on quantifying uncertainty about the average treatment effect (ATE) due to missing potential outcomes in a randomized experiment, where our approach (1) yields design-based confidence intervals for ATE which allow for heterogeneous treatment effects but do not rely on asymptotics, (2) provides a new motivation for examining covariate balance, and (3) gives a new formal analysis of the role of randomized treatment assignment. We illustrate our approach in three empirical applications.

[13] “A General Approach to Relaxing Unconfoundedness”, with Alexandre Poirier and Muyang Ren (Jan 2025)

Abstract

This paper defines a general class of relaxations of the unconfoundedness assumption. This class includes several previous approaches as special cases, including the marginal sensitivity model of Tan (2006). This class therefore allows us to precisely compare and contrast these previously disparate relaxations. We use this class to derive a variety of new identification results which can be used to assess sensitivity to unconfoundedness. In particular, the prior literature focuses on average parameters, like the average treatment effect (ATE). We move beyond averages by providing sharp bounds for a large class of parameters, including both the quantile treatment effect (QTE) and the distribution of treatment effects (DTE), results which were previously unknown even for the marginal sensitivity model.

[12] “The Effect of Omitted Variables on the Sign of Regression Coefficients”, with Alexandre Poirier (August 2022; latest draft: Dec 2024), Revision Requested, American Economic Review

Abstract

We show that, depending on how the impact of omitted variables is measured, it can be substantially easier for omitted variables to flip coefficient signs than to drive them to zero. This behavior occurs with "Oster's delta" (Oster 2019), a widely reported robustness measure. Consequently, any time this measure is large -- suggesting that omitted variables may be unimportant -- a much smaller value reverses the sign of the parameter of interest. We propose a modified measure of robustness to address this concern. We illustrate our results in four empirical applications and two meta-analyses. We implement our methods in the companion Stata module regsensitivity.

To install the companion Stata module, type the following from within Stata:

    ssc install regsensitivity, all

Type help regsensitivity for syntax and instructions. Also see our vignette for a walkthrough. All files are also available on our GitHub repo.

Published papers

[11] “Assessing Sensitivity to Unconfoundedness: Estimation and Inference”, with Alexandre Poirier and Linqi Zhang (2024), Journal of Business & Economic Statistics

Abstract

This article provides a set of methods for quantifying the robustness of treatment effects estimated using the unconfoundedness assumption. Specifically, we estimate and do inference on bounds for various treatment effect parameters, like the average treatment effect (ATE) and the average effect of treatment on the treated (ATT), under nonparametric relaxations of the unconfoundedness assumption indexed by a scalar sensitivity parameter c. These relaxations allow for limited selection on unobservables, depending on the value of c. For large enough c, these bounds equal the no assumptions bounds. Using a nonstandard bootstrap method, we show how to construct confidence bands for these bound functions which are uniform over all values of c. We illustrate these methods with an empirical application to the National Supported Work Demonstration program. We implement these methods in the companion Stata module tesensitivity for easy use in practice.

To install the companion Stata module, type ssc install tesensitivity from within Stata. Type help tesensitivity for syntax and instructions. Also see our vignette for a walkthrough. All files are also available on our GitHub repo.
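The abstract of [11] notes that, for large enough c, the bounds equal the no assumptions bounds. As a rough illustration of that limiting case only, the sketch below computes the textbook worst-case bounds on the ATE for a binary treatment and an outcome known to lie in a fixed interval. It is not the tesensitivity implementation: the function name, the simulated data, and the outcome range [0, 1] are all assumptions made for this example.

```python
import numpy as np

def no_assumptions_ate_bounds(y, d, y_min=0.0, y_max=1.0):
    """Worst-case bounds on the ATE when outcomes lie in [y_min, y_max].

    Hypothetical helper for illustration; not part of any Stata module.
    y: observed outcomes, d: binary treatment indicator (1 = treated).
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=int)
    p1 = d.mean()            # share treated
    p0 = 1.0 - p1            # share untreated
    m1 = y[d == 1].mean()    # mean outcome among the treated
    m0 = y[d == 0].mean()    # mean outcome among the untreated

    # E[Y(1)] is observed for the treated; for the untreated it can be
    # anywhere in [y_min, y_max]. The same logic applies to E[Y(0)].
    ey1_lo, ey1_hi = m1 * p1 + y_min * p0, m1 * p1 + y_max * p0
    ey0_lo, ey0_hi = m0 * p0 + y_min * p1, m0 * p0 + y_max * p1
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

# Made-up data: four treated and four untreated units with binary outcomes.
y = [1, 1, 0, 1, 0, 0, 1, 0]
d = [1, 1, 1, 1, 0, 0, 0, 0]
lower, upper = no_assumptions_ate_bounds(y, d)
print(lower, upper)
```

Without further assumptions, the width of these bounds always equals y_max minus y_min, which is why a sensitivity parameter that interpolates between unconfoundedness and no assumptions can be informative.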

[10] “Minimax-Regret Treatment Rules with Many Treatments” (2023), The Japanese Economic Review

Abstract

Statistical treatment rules map data into treatment choices. Optimal treatment rules maximize social welfare. Although some finite sample results exist, it is generally difficult to prove that a particular treatment rule is optimal. This paper develops asymptotic and numerical results on minimax-regret treatment rules when there are many treatments. I first extend a result of Hirano and Porter (2009) to show that an empirical success rule is asymptotically optimal under the minimax-regret criterion. The key difference is that I use a permutation invariance argument from Lehmann (1966) to solve the limit experiment instead of applying results from hypothesis testing. I then compare the finite sample performance of several treatment rules. I find that the empirical success rule performs poorly in unbalanced designs, and that when prior information about treatments is symmetric, balanced designs are preferred to unbalanced designs. Finally, I discuss how to compute optimal finite sample rules by applying methods from computational game theory.
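To make the idea of an empirical success rule concrete, here is a minimal sketch under the usual textbook definition: choose the treatment with the highest sample mean outcome. The function name, the tie-breaking behavior (first treatment listed wins), and the data are assumptions for this illustration; the sketch does not reproduce the paper's asymptotic or numerical analysis.

```python
from statistics import mean

def empirical_success_rule(outcomes_by_treatment):
    """Return the treatment with the largest sample mean outcome.

    Hypothetical helper: outcomes_by_treatment maps a treatment label to
    the list of outcomes observed under that treatment. Ties go to the
    treatment that appears first in the mapping.
    """
    means = {t: mean(ys) for t, ys in outcomes_by_treatment.items()}
    chosen = max(means, key=means.get)
    return chosen, means

# Made-up experiment with three treatments and binary outcomes.
data = {
    "A": [1, 0, 1],
    "B": [1, 1, 1],
    "C": [0, 0, 1],
}
chosen, means = empirical_success_rule(data)
print(chosen, means)
```

In an unbalanced design the sample means are estimated with very different precision, which this simple rule ignores.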

[9] “Choosing Exogeneity Assumptions in Potential Outcome Models”, with Alexandre Poirier (2023), Econometrics Journal, “Editor's Choice” selection; Supersedes our previous paper “Interpreting Quantile Independence”

Abstract

There are many kinds of exogeneity assumptions. How should researchers choose among them? When exogeneity is imposed on an unobservable like a potential outcome, we argue that the form of exogeneity should be chosen based on the kind of selection on unobservables it allows. Consequently, researchers can assess the plausibility of any exogeneity assumption by studying the distributions of treatment given the unobservables that are consistent with that assumption. We use this approach to study two common exogeneity assumptions: quantile and mean independence. We show that both assumptions require a kind of non-monotonic relationship between treatment and the potential outcomes. We discuss how to assess the plausibility of this kind of treatment selection. We also show how to define a new and weaker version of quantile independence that allows for monotonic treatment selection. We then show the implications of the choice of exogeneity assumption for identification. We apply these results in an empirical illustration of the effect of child soldiering on wages.

[8] “ivcrc: An Instrumental Variables Estimator for the Correlated Random Coefficients Model” (2022), with David Benson and Alexander Torgovitsky (Preprint), The Stata Journal

Abstract

We present the ivcrc command, which implements an instrumental variables (IV) estimator for the linear correlated random coefficients (CRC) model. This model is a natural generalization of the standard linear IV model that allows for endogenous, multivalued treatments and unobserved heterogeneity in treatment effects. The proposed estimator uses recent semiparametric identification results that allow for flexible functional forms and permit instruments that may be binary, discrete, or continuous. The command also allows for the estimation of varying coefficients regressions, which are closely related in structure to the proposed IV estimator. We illustrate this IV estimator and the ivcrc command by estimating the returns to education in the National Longitudinal Survey of Young Men.

[7] “Salvaging Falsified Instrumental Variable Models” (2021), with Alexandre Poirier (Supplemental appendix; Replication files; Dec 2018 draft; Journal link), Econometrica

Abstract

What should researchers do when their baseline model is falsified? We recommend reporting the set of parameters that are consistent with minimally nonfalsified models. We call this the falsification adaptive set (FAS). This set generalizes the standard baseline estimand to account for possible falsification. Importantly, it does not require the researcher to select or calibrate sensitivity parameters. In the classical linear IV model with multiple instruments, we show that the FAS has a simple closed-form expression that only depends on a few 2SLS coefficients. We apply our results to an empirical study of roads and trade. We show how the FAS complements traditional overidentification tests by summarizing the variation in estimates obtained from alternative nonfalsified models.

[6] “Inference on Breakdown Frontiers” (2020), with Alexandre Poirier (Supplemental appendix; Replication files; May 2017 draft; arXiv drafts), Quantitative Economics

Abstract

A breakdown frontier is the boundary between the set of assumptions which lead to a specific conclusion and those which do not. In a potential outcomes model with a binary treatment, we consider two conclusions: First, that ATE is at least a specific value (e.g., nonnegative) and second that the proportion of units who benefit from treatment is at least a specific value (e.g., at least 50%). For these conclusions, we derive the breakdown frontier for two kinds of assumptions: one which indexes deviations from random assignment of treatment, and one which indexes deviations from rank invariance. These classes of assumptions nest both the point identifying assumptions of random assignment and rank invariance and the opposite end of no constraints on treatment selection or the dependence structure between potential outcomes. This frontier provides a quantitative measure of robustness of conclusions to deviations from the point identifying assumptions. We derive root-N-consistent sample analog estimators for these frontiers. We then provide two asymptotically valid bootstrap procedures for constructing lower uniform confidence bands for the breakdown frontier. As a measure of robustness, estimated breakdown frontiers and their corresponding confidence bands can be presented alongside traditional point estimates and confidence intervals obtained under point identifying assumptions. We illustrate this approach in an empirical application to the effect of child soldiering on wages. We find that the conclusions we consider are fairly robust to failure of rank invariance, when random assignment holds, but these conclusions are much more sensitive to both assumptions for small deviations from random assignment.

[5] “A Practical Guide to Compact Infinite Dimensional Parameter Spaces” (2019), with Joachim Freyberger (Supplemental appendix; First version), Econometric Reviews

Abstract

We gather and review general compactness results for many commonly used parameter spaces in nonparametric estimation, and we provide several new results. We consider three kinds of functions: (1) functions with bounded domains which satisfy standard norm bounds, (2) functions with bounded domains which do not satisfy standard norm bounds, and (3) functions with unbounded domains. In all three cases we provide two kinds of results, compact embedding and closedness, which together allow one to show that parameter spaces defined by a strong norm bound are compact under a consistency norm. We illustrate how these results are typically used in econometrics by considering two common settings: nonparametric mean regression and nonparametric instrumental variables estimation.

[4] “Identification of Treatment Effects under Conditional Partial Independence” (2018), with Alexandre Poirier (Journal link), Econometrica

Abstract

Conditional independence of treatment assignment from potential outcomes is a commonly used but nonrefutable assumption. We derive identified sets for various treatment effect parameters under nonparametric deviations from this conditional independence assumption. These deviations are defined via a conditional treatment assignment probability, which makes it straightforward to interpret. Our results can be used to assess the robustness of empirical conclusions obtained under the baseline conditional independence assumption.

Note 1: See here for an erratum correcting Proposition 5.

Note 2: See our paper [11] Masten, Poirier, and Zhang (2024) for the corresponding estimation and inference theory, as well as a companion Stata module: ssc install tesensitivity. Further installation instructions appear above under [11].

[3] “Random Coefficients on Endogenous Variables in Simultaneous Equations Models” (2018) (Supplemental appendix; Replication files; 2015 preprint; 2013 preprint), The Review of Economic Studies

Abstract

This paper considers a classical linear simultaneous equations model with random coefficients on the endogenous variables. Simultaneous equations models are used to study social interactions, strategic interactions between firms, and market equilibrium. Random coefficient models allow for heterogeneous marginal effects. I show that random coefficient seemingly unrelated regression models with common regressors are not point identified, which implies random coefficient simultaneous equations models are not point identified. Important features of these models, however, can be identified. For two-equation systems, I give two sets of sufficient conditions for point identification of the coefficients’ marginal distributions conditional on exogenous covariates. The first allows for small support continuous instruments under tail restrictions on the distributions of unobservables which are necessary for point identification. The second requires full support instruments, but allows for nearly arbitrary distributions of unobservables. I discuss how to generalize these results to many equation systems, where I focus on linear-in-means models with heterogeneous endogenous social interaction effects. I give sufficient conditions for point identification of the distributions of these endogenous social effects. I propose a consistent nonparametric kernel estimator for these distributions based on the identification arguments. I apply my results to the Add Health data to analyze peer effects in education.

[2] “Identification of Instrumental Variable Correlated Random Coefficients Models” (2016), with Alexander Torgovitsky (Preprint; 2014 draft (longer)), The Review of Economics and Statistics

Companion Stata module ivcrc is available at our GitHub repo, which includes installation instructions.

Abstract

We study identification and estimation of the average partial effect in an instrumental variable correlated random coefficients model with continuously distributed endogenous regressors. This model allows treatment effects to be correlated with the level of treatment. The main result shows that the average partial effect is identified by averaging coefficients obtained from a collection of ordinary linear regressions that condition on different realizations of a control function. These control functions can be constructed from binary or discrete instruments which may affect the endogenous variables heterogeneously. Our results suggest a simple estimator that can be implemented with a companion Stata module.

[1] “A Specification Test for Discrete Choice Models” (2013), with Mark Chicu, Economics Letters

Abstract

In standard discrete choice models, adding options cannot increase the choice probability of an existing alternative. We use this observation to construct a simple nonparametric specification test by exploiting variation in the choice sets individuals face. We use a multiple testing procedure to determine the particular kind of choice sets that produce violations. We apply these tests to the 1896 US House of Representatives election and reject commonly used discrete choice voting models.
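The observation that adding options cannot increase the choice probability of an existing alternative can be checked numerically in one standard model. The sketch below uses a multinomial logit with made-up utilities, chosen here purely for illustration; it is not the nonparametric test developed in the paper, and the function name is hypothetical.

```python
import math

def logit_choice_probabilities(utilities):
    """Multinomial logit choice probabilities for a list of utilities.

    Hypothetical helper for illustration. Subtracting the maximum utility
    before exponentiating avoids overflow and leaves the result unchanged.
    """
    m = max(utilities)
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Choice set with two alternatives, then the same set plus a third option.
small_set = logit_choice_probabilities([1.0, 0.5])
large_set = logit_choice_probabilities([1.0, 0.5, 0.2])

# Each original alternative is chosen less often once the option is added.
for j in range(2):
    print(j, small_set[j], large_set[j], large_set[j] <= small_set[j])
```

Observed choice shares that rise for an existing alternative after the choice set grows would therefore be evidence against this class of models.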

Other papers

“Partial Independence in Nonseparable Models”, with Alexandre Poirier (June 2016); Portions of this paper appear in [4] and [9]

Abstract

We analyze identification of nonseparable models under three kinds of exogeneity assumptions weaker than full statistical independence. The first is based on quantile independence. Selection on unobservables drives deviations from full independence. We show that such deviations based on quantile independence require non-monotonic and oscillatory propensity scores. Our second and third approaches are based on a distance-from-independence metric, using either a conditional cdf or a propensity score. Under all three approaches we obtain simple analytical characterizations of identified sets for various parameters of interest. We do this in three models: the exogenous regressor model of Matzkin (2003), the instrumental variable model of Chernozhukov and Hansen (2005), and the binary choice model with nonparametric latent utility of Matzkin (1992).

“How Should the Graduate Economics Core be Changed?” (2011), with Jose Miguel Abito, Katarina Borovickova, Hays Golden, Jacob Goldin, Miguel Morin, Alexandre Poirier, Vincent Pons, Israel Romem, Tyler Williams, and Chamna Yoon, The Journal of Economic Education

Abstract

The authors present suggestions by graduate students from a range of economics departments for improving the first-year core sequence in economics. The students identified a number of elements that should be added to the core: more training in building microeconomic models, a discussion of the methodological foundations of model-building, more emphasis on institutions to motivate and contextualize macroeconomic models, and greater focus on econometric practice rather than theory. The authors hope that these suggestions will encourage departments to take a fresh look at the content of the first-year core.