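Yuyuan Management Forum 2024, Session 73 (Session 1005 overall)
Topic: Machine Learning for Causal Inference: Is a Nonlinear First Stage Really Forbidden in 2SLS?
Speaker: Jing Peng, Associate Professor, School of Business, University of Connecticut
Host: Jianbin Li, Associate Dean and Professor, School of Management
Time: Friday, July 12, 2024, 9:00 to 11:00
Venue: Room 402, School of Management Building
About the speaker: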
Jing Peng is an Associate Professor of Operations and Information Management at the School of Business, University of Connecticut. He received his Ph.D. from the Wharton School, University of Pennsylvania. His research interests focus on e-commerce, social media, the gig economy, digital health, and artificial intelligence. His work has appeared in Information Systems Research, Journal of Marketing Research, Management Science, MIS Quarterly, and other outlets. He is very active in developing novel econometric methods and has contributed three R packages on methodologies to CRAN. His research has won multiple best paper awards. He is a recipient of the INFORMS Information Systems Society Gordon B. Davis Young Scholar Award.
About the talk:
The application of machine learning (ML) in causal inference has garnered significant attention from researchers. A particular focus lies in the integration of ML into two-stage least squares (2SLS), a cornerstone methodology for causal inference. While ML can significantly reduce the prediction error in the first stage, a major hurdle arises due to the concept of forbidden regression. Specifically, a nonlinear first stage is commonly deemed forbidden because the potential lack of orthogonality between the prediction and prediction error may lead to inconsistent estimates. To investigate the applicability of ML in 2SLS, this paper decomposes the bias of 2SLS into an observable bias and an unobservable bias, without specifying the functional form of the first stage or assuming the validity of the proposed instrument. Analytical results and extensive simulations show that while a linear prediction can ensure a zero observable bias, it may result in a substantial unobservable bias, especially when the instrument is weak or not strictly exogenous. Conversely, by utilizing constrained or orthogonalized ML predictions, it is possible, and even guaranteed under certain conditions, to reduce the unobservable bias without introducing an observable bias. This research establishes crucial theoretical foundations for the integration of ML into 2SLS.
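For readers unfamiliar with the forbidden-regression concern, the orthogonality issue mentioned above can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not the speaker's method or bias decomposition: the data-generating process, the true effect of 2.0, and the fixed function standing in for an ML prediction are all hypothetical. It compares a linear first stage, a plug-in of a non-orthogonal prediction, and a simple orthogonalization that regresses the endogenous variable on that prediction.

```python
import numpy as np

def ols_fit(X, y):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def slope(regressor, y):
    """Slope from a simple regression of y on a constant and one regressor."""
    X = np.column_stack([np.ones_like(regressor), regressor])
    return ols_fit(X, y)[1]

# Hypothetical data-generating process (assumption, not from the talk).
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                          # instrument
u = rng.normal(size=n)                          # unobserved confounder
x = z + 0.5 * z**2 + u + rng.normal(size=n)     # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)            # true causal effect is 2.0

const = np.ones(n)

# (a) Linear first stage: fitted values are orthogonal to residuals by construction.
Z = np.column_stack([const, z])
x_hat_lin = Z @ ols_fit(Z, x)
orth_lin = np.mean(x_hat_lin * (x - x_hat_lin))

# (b) A fixed, deliberately imperfect nonlinear function of z, standing in for
#     an ML prediction (hypothetical). Nothing forces it to be orthogonal to
#     its own prediction error.
x_hat_raw = 0.8 * z + 0.3 * z**2
orth_raw = np.mean(x_hat_raw * (x - x_hat_raw))

# (c) Orthogonalize the prediction by regressing x on it and keeping the fitted
#     values. This is equivalent to using the prediction as an instrument.
P = np.column_stack([const, x_hat_raw])
x_hat_fix = P @ ols_fit(P, x)
orth_fix = np.mean(x_hat_fix * (x - x_hat_fix))

# Second-stage slopes for each choice of first-stage prediction.
beta_ols = slope(x, y)           # naive OLS, confounded by u
beta_lin = slope(x_hat_lin, y)   # standard 2SLS
beta_raw = slope(x_hat_raw, y)   # plug-in of the non-orthogonal prediction
beta_fix = slope(x_hat_fix, y)   # orthogonalized prediction

print("mean(prediction * error):", orth_lin, orth_raw, orth_fix)
print("slopes (OLS, linear, plug-in, orthogonalized):",
      beta_ols, beta_lin, beta_raw, beta_fix)
```

In this toy setup the instrument is valid by construction, so the sketch only shows why a lack of orthogonality can matter and how it can be restored; it says nothing about weak or imperfectly exogenous instruments, which are the focus of the talk.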