Attend an invited talk with Benjamin Kelcey on March 16

Targeted After Generative Learning for Causal Inference with Latent Variables

Date: Monday, March 16
Time: 11:30 a.m.-12:30 p.m.
Location: Aderhold Hall, Room 201
Speaker: Benjamin Kelcey, professor, University of Cincinnati

Benjamin Kelcey is a professor of quantitative methods at the University of Cincinnati. Kelcey’s research focuses on causal inference, machine learning, structural equation modeling (SEM), and (latent) measurement methods in multilevel and multidimensional settings, such as classrooms and schools. His research has been funded by several agencies, including the National Science Foundation, the Institute of Education Sciences, and the Spencer Foundation.

Abstract: Latent variables (e.g., achievement, instruction, efficacy) and causal inference are central to implementing empirical research and advancing substantive theory across the social sciences. Conventional latent-variable frameworks, such as structural equation modeling (SEM), are severely constrained by simplifying assumptions (e.g., linearity, additivity, and correct model specification). In this study, we relax key SEM assumptions by building data-adaptive estimation methods for causal inference with latent variables. We develop a Targeted After Generative (TAG) learning framework that integrates SEM-informed variational autoencoders (SEM-VAEs) with targeted maximum likelihood estimation (TMLE) to estimate, for example, the effect of a treatment on a latent outcome while controlling for latent and observed covariates. On the generative side, the SEM-VAE encodes theory-aligned neural networks that simultaneously learn flexible and unknown measurement and structural models. On the targeted learning side, we construct an efficient influence function for the treatment effect with the SEM-VAE-implied posterior distribution of the latent variables to estimate the average effect. Across simulations, the TAG learning approach uniformly outperforms alternative approaches and demonstrates little to no bias in finite samples. TAG learning offers a principled, flexible, and practical approach to causally interpretable, theory-aligned, data-adaptive analysis with latent variables. More practically, the framework allows nearly unbiased estimation of causal effects involving latent variables, even in the presence of unknown functional relationships in the measurement and structural models (e.g., data-adaptive effect estimation for nonlinear/nonadditive SEMs). We illustrate the method by investigating the effect of mental health induction support on teacher mental health during the first year of teaching.