A goal of almost every scientific field is causal inference: inferring cause-and-effect relationships from data, so that one can quantify how intervening on certain variables affects other variables in a system. Recently, researchers have started to combine causal inference with modern natural language processing (NLP) models in order to analyze text data, which can be a rich source of information about human behavior, thought, and interactions. However, methods in this area have largely focused on estimating average causal effects. The goal of many real-world applications is instead to estimate heterogeneous causal effects by exploring relationships between specific variables, so that the resulting knowledge can be customized for sub-groups. Examples include helping clinicians decide which medications to prescribe to specific patients, informing central bank committees' decisions on interest rates in relation to changes in key variables, and helping platform administrators decide how to optimally manage users. Estimating the effect of interventions from data can help inform decision making, particularly when effect estimates vary based on individuals’ features. A rich, unstructured source of features is written text: notes from electronic health records (EHRs) detail patients’ personal and medical histories, newspaper articles document national and international events, and online platforms host exchanges of users’ written opinions. Yet, there exist few methods that can incorporate importa