7 Defining the inquiry
An inquiry is a question we ask of the world, and in the same way, of our models of the world. If we stipulate a reference model, \(m\), then our inquiry is a summary of \(m\). Suppose in some reference model that \(X\) affects \(Y\). One inquiry might be descriptive: what is the average level of \(Y\) when \(X=1\), under the model? A second might be causal: what is the average treatment effect of \(X\) on \(Y\)? A third is about counterfactuals: for what share of units would \(Y\) have been different if \(X\) were different? If a model involves more variables, many more questions open up, for instance, regarding how the effect of one variable passes through, or is modified by, another.
When designing research, we should have our inquiries front of mind. Surprisingly, many research projects never specify the target of inference, focusing instead on the specification of estimation procedures. At some stages of research it is not possible to specify the inquiry with great precision. In early stages you may need to do model-building research in order to find out what the right question is. But once you are at the stage of thinking through inferential strategies, you need an inquiry in mind in order to select among options. For the same reason, readers need to know your inquiry in order to evaluate your choices.
Formally, an inquiry is a summary function \(I\) that operates on an instance of a model \(m \in M\). When we summarize the model with the inquiry, we obtain an “answer under the model.” We formalize this idea as \(I(m) = a_m\), with the important special case \(a_{m^*}\) representing our estimand. The difference between \(I\) and \(a_m\) is the difference between a question and its answer: \(I\) is the question we ask about the model and \(a_m\) is the answer.
In this book, when we talk about inquiries, we will usually be referring to single-number summaries of models. Some common inquiries are descriptive, such as means, conditional means, correlations, partial correlations, quantiles, and truth statements about variables in the model. Others are causal, such as the average difference in one variable when a second variable is set to two different values. We can think of a single-number inquiry as the atom of a research question.
While most inquiries are “atomic” in this way, some inquiries are more complex than a single-number summary. For example, the best linear predictor of \(Y\) given \(X\) is a two-number summary: it is the pair of numbers (the slope and intercept) that minimizes the total squared distance between the line and each value of \(Y\). No need to stop at two-number summaries though. We could imagine the best quadratic predictor of \(Y\) given \(X\) (a three-number summary), and so on. We could have an inquiry that is the full conditional expectation function of \(Y\) given \(X\), no matter how wiggly, nonlinear, and nuanced the shape of that function. It could in principle be a 1,000-number summary of the model, or something still more complex.
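As a sketch of the two-number best linear predictor inquiry, the slope and intercept can be computed directly from moments of the model's variables. The data here are simulated stand-ins, not from any particular model in the book:

```r
# Sketch: the best linear predictor of Y given X as a two-number summary.
# X and Y are simulated for illustration.
set.seed(42)
X <- rnorm(1000)
Y <- 2 + 3 * X + rnorm(1000)

# BLP slope and intercept, defined from moments of the joint distribution
blp_slope <- cov(X, Y) / var(X)
blp_intercept <- mean(Y) - blp_slope * mean(X)

c(intercept = blp_intercept, slope = blp_slope)
```

These two numbers coincide exactly with the coefficients returned by `coef(lm(Y ~ X))`, which is one way to see that OLS estimates the best linear predictor.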
The inquiry could be constituted by a series of interrelated questions about the model. Indeed, the goal of the research may be to generate or to test a model of the world.^{1} For instance, a researcher might articulate a handful of important questions about the model that all have to come out a certain way or the model itself should be rejected. These complex inquiries are made up of a series of atomic inquiries. We’re interested in the sub-inquiries only insofar as they help us understand the real inquiry—is this model of the world a good one or not?
7.1 Elements of inquiries
Every inquiry operates on the events generated by the model. We can think of the events as the data set that describes the units, treatment conditions, and outcome variables over which inquiries can be defined. This definition is closely connected to the common UTOS (units, treatments, outcomes, and settings) framework (Shadish, Cook, and Campbell 2002). The units are the set of units within the model that the inquiry refers to, either all of them or a subset. The treatment conditions represent the set chosen for study: a descriptive inquiry is a summary of a single condition (reality), whereas a causal inquiry is a summary of multiple conditions. The outcomes are the set of nodes in the model that the inquiry concerns. Finally, the inquiry operates on the model events via a summary function. For example, the “population average” inquiry summarizes the outcome for all units in the population with the mean function. We discuss each element of inquiries in turn.
7.1.1 Units
The units of an inquiry are the set of people, places, or things that we are interested in studying. They may refer to all the units in a study or just a subset of them. For example, we can distinguish between many different average causal effect inquiries on the basis of their units: the average treatment effect (ATE) refers to all units in the study, the average treatment effect on the treated (ATT) refers to the units who actually are treated, the average treatment effect on the untreated (ATU) refers to those who actually are not treated, and the complier average causal effect (CACE) refers to those who would take treatment if assigned to be treated but not otherwise.
The reason we need to specify the units of an inquiry is that inquiry values (estimands) may differ across units. If the units included in the sample live in easier-to-reach areas and people who live in easier-to-reach areas are wealthier than others, the sample average will differ from the population average—and also from the average among those in hard-to-reach places.
The choice of which set of units to focus on depends on theoretical considerations. To whom does the theoretical expectation apply? As a general matter, seeking insights that apply across many individuals is the goal of many social scientists. We are not typically interested in the effect of a treatment or the average outcome in a random sample of 100 units because we care about those units in particular, but because we wish to understand the treatment effect or outcomes in a broader population. Our theories often have so-called scope conditions, which define the types of units for which our theory is operative. A mechanism might operate only for coethnics of a country’s president, small to medium towns, blue collar workers, or the mothers of daughters. The units of an inquiry should be defined by these theoretical expectations, not by which inquiries our data and answer strategies can target easily.
Distinctions among inquiries often arise in debates over instrumental variables designs, which target local average treatment effects (LATEs), meaning average treatment effects among a subset of units. The effect these designs estimate is the average treatment effect among the units that are “compliers”: the subset of units that take treatment if assigned and don’t take treatment if not assigned. The effect among compliers may or may not be like the effect among the whole sample or the population from which the sample was drawn. The debate between Deaton (2010) and Imbens (2010) centers precisely on which inquiry is the appropriate one, the LATE among compliers or the ATE in the whole sample. In many settings, the LATE may be the only inquiry we can reliably estimate, so the question becomes: is the LATE a theoretically relevant inquiry?
If the inquiry is defined with respect to the units sampled by the data strategy, then we do not have to engage in generalization inference—we learn directly about the sample from the sample. But if the inquiry is defined at the population level, then we need to generalize from the sample to the population. We also need to engage in generalization inference when we want to generalize study results to other populations that we did not explicitly sample from. Whether an inquiry requires generalization inference depends on the data strategy in the following way. If the data strategy samples the units that define the inquiry, we do not need to generalize beyond the study. If the data strategy explicitly samples from a well-defined population, we can generalize from sample to population using canonical sampling theory. But if we want to generalize to an inquiry defined over some other set of units (for example, Brazilian citizens ten years in the future), we need to engage in generalization inference (see Egami and Hartman 2022).
7.1.2 Outcomes
Every inquiry is also defined by what outcomes are considered for each of the units. The choice of outcome also draws on theory: what outcomes are to be described, or with which outcomes do we want to measure the effects of a treatment? An inquiry might be about a single outcome or multiple outcomes. The average belief that climate change is real would be a single-outcome inquiry, and the difference between that belief and support for government rebates for purchasing electric vehicles a multiple-outcome inquiry.
In some cases, an inquiry will be about a latent outcome that we cannot directly measure, such as preferences, attitudes, or emotions. We can construct data strategies that proxy for these latent outcomes by asking questions or observing behavior, but we cannot measure the outcomes themselves directly. Even though these constructs may be difficult or impossible to measure well, it is often preferable to define the inquiry in terms of the latent outcome of interest rather than in terms of the measured outcome.
7.1.3 Treatment conditions
The final element of an inquiry is the set of treatment conditions under consideration and, when there is more than one, compared.
Descriptive inquiries are defined with respect to one single treatment condition. That treatment condition is often the “unmanipulated” condition in which the researcher exposes units to no additional causal agents. Here the goal is to learn about summaries of the distributions of outcomes as we observe them. Table 7.1 (top panel) enumerates some common descriptive inquiries. These inquiries have in common that you do not need any counterfactual quantities in order to define them. The covariance (similarly, the correlation) between \(X\) and \(Y\) enters as a descriptive inquiry; so too does the line of best fit for \(Y\) given \(X\). For each descriptive inquiry, we list the units, treatment conditions, and outcomes that define them. We also provide R code snippets for each.
Inquiry | Units | Treatment conditions | Outcomes | Code |
---|---|---|---|---|
Average value of variable Y in a finite population | Units in the population | Unmanipulated | Y | mean(Y) |
Average value of variable Y in a sample | Sampled units | Unmanipulated | Y | mean(Y[S == 1]) |
Conditional average value of Y given X = 1 | Units for whom X = 1 | Unmanipulated | Y | mean(Y[X == 1]) |
The variance of Y | Units in the population | Unmanipulated | Y | pop.var(Y) |
The covariance of X and Y | Units in the population | Unmanipulated | X, Y | pop.cov(X, Y) |
The best linear predictor of Y given X | Units in the population | Unmanipulated | X, Y | cov(Y, X) / var(X) |
Conditional expectation function of Y given X | Units in the population | Unmanipulated | Y | cef(Y, X) |
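The `pop.var` and `pop.cov` functions in the table are not base R functions; base R's `var` and `cov` divide by \(N-1\), which is appropriate for estimating from a sample but not for summarizing a finite population. A minimal sketch of divide-by-\(N\) versions:

```r
# Population variance and covariance: divide by N rather than N - 1,
# since the inquiry summarizes the full finite population, not a sample.
pop.var <- function(x) mean((x - mean(x))^2)
pop.cov <- function(x, y) mean((x - mean(x)) * (y - mean(y)))
```

For a population of size \(N\), `pop.var(Y)` equals `var(Y) * (N - 1) / N`; the distinction matters little for large populations but keeps the inquiry's definition exact.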
Causal inquiries involve a comparison of at least two possible treatment conditions. For example, an inquiry might be the causal effect of \(X\) on \(Y\) for a single unit. In order to infer that causal effect, we would need to know the value of \(Y\) in two worlds: one world in which \(X\) is set to 1 and one in which \(X\) is set to 0. Table 7.2 (middle panel) enumerates some common causal inquiries. These inquiries vary in the units they refer to. For instance, some are questions about samples (SATEs) and others about populations (PATEs). Inquiries can also be defined for units of a particular covariate class (CATEs). Finally, they may be summaries of more than one potential outcome. For instance, the interaction effect is defined here at the individual level as the effect of one treatment on the effect of another treatment.
Inquiry | Units | Treatment conditions | Outcomes | Code |
---|---|---|---|---|
Average treatment effect in a finite population (PATE) | Units in the population | D = 0, D = 1 | Y | mean(Y_D_1 - Y_D_0) |
Conditional average treatment effect (CATE) for X = 1 | Units for whom X = 1 | D = 0, D = 1 | Y | mean(Y_D_1[X == 1] - Y_D_0[X == 1]) |
Complier average causal effect (CACE) | Complier units | D = 0, D = 1 | Y | mean(Y_D_1[D_Z_1 > D_Z_0] - Y_D_0[D_Z_1 > D_Z_0]) |
Causal interactions of \(D_1\) and \(D_2\) | Units in the population | D1 = 1, D1 = 0, D2 = 1, D2 = 0 | Y | mean((Y_D1_1_D2_1 - Y_D1_0_D2_1) - (Y_D1_1_D2_0 - Y_D1_0_D2_0)) |
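With full potential outcomes in hand, which is possible only in simulation, these causal inquiries reduce to simple summaries. A sketch with made-up data (the effect sizes and covariate are assumptions for illustration):

```r
# Sketch: computing the PATE and a CATE from simulated potential outcomes.
# In realized data only one potential outcome per unit is observed.
set.seed(1)
N <- 1000
X <- rbinom(N, 1, 0.5)             # covariate defining a subgroup
Y_D_0 <- rnorm(N)                  # potential outcome under control
Y_D_1 <- Y_D_0 + 0.5 + 0.3 * X     # potential outcome under treatment

PATE <- mean(Y_D_1 - Y_D_0)                      # population average effect
CATE_X1 <- mean(Y_D_1[X == 1] - Y_D_0[X == 1])   # effect among X = 1 units
```

Because the simulation gives us both potential outcomes for every unit, the inquiry values are known exactly; the challenge of research design is to estimate them when only one potential outcome is revealed per unit.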
Generations of students have been told to excise words that connote causality from their empirical writing. “Affects” becomes “is associated with” and “impacts” becomes “moves with.” Being careful about causal language is of course very important (it’s really true that correlation does not imply causation!). But this change in language is not usually accompanied by a change in inquiry. Many times we are faced with drawing causal inferences from less than ideal data, but the deficiencies of the data strategy should not lead us too far away from our inferential targets. If the inquiry is a causal inquiry, then although the move from “causes” to “is correlated with” might involve more defensible claims about the data, we still need a strategy to get to an answer to the inquiry.
7.1.4 Summary functions
With the units, treatments, and outcomes specified, the last element of the inquiry is the summary function that is applied to them. For a great many inquiries, this function is the mean: the ATE, the CATE, the LATE, the SATE, the population mean—these are all averages. These and other inquiries are “decomposable” in the sense that the average effect for a large group is the weighted average of the average effects for a set of smaller groups.
However, not all inquiries are of this form. For example, the slope of the line of best fit is defined as the covariance of \(X\) and \(Y\) divided by the variance of \(X\). This inquiry is a complex summary of all the units in the model.
The inquiry that the regression discontinuity design targets is also non-decomposable. In the RDD model (see Section 16.5), we imagine units with potential outcomes \(Y_i(1)\) and \(Y_i(0)\). Each unit \(i\) also has a value on a “running variable,” \(X_i\), and units receive treatment if and only if \(X_i > 0\). In this case the “effect at the point of discontinuity” might be written:
\[E[Y_i(1) - Y_i(0) \mid X_i = 0]\]
Curiously, however, there may be no units for whom \(X_i\) equals exactly 0 (a candidate winning exactly 50% of the vote happens, but it is rare), so we cannot easily think of the inquiry as a summary of individual potential outcomes. Instead, we construct a conditional expectation function for each potential outcome with respect to \(X_i\) and evaluate the difference between the two functions at \(X_i = 0\). Though not an average of individual effects, this difference is nevertheless a summary of the potential outcomes.
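A sketch of this CEF-based inquiry in simulated data, where we know both potential outcomes for every unit. The quadratic specification and all parameter values are assumptions chosen for illustration:

```r
# Sketch: the RDD inquiry as a difference between conditional expectation
# functions of the two potential outcomes, evaluated at the cutoff X = 0.
set.seed(2)
N <- 5000
X  <- runif(N, -1, 1)                           # running variable
Y0 <- 0.5 * X + 0.25 * X^2 + rnorm(N, sd = 0.1) # potential outcome under control
Y1 <- Y0 + 0.4 + 0.2 * X                        # true effect at X = 0 is 0.4

# Fit a CEF for each potential outcome, then evaluate both at the cutoff
cef_1 <- lm(Y1 ~ poly(X, 2, raw = TRUE))
cef_0 <- lm(Y0 ~ poly(X, 2, raw = TRUE))
at_cutoff <- data.frame(X = 0)
effect_at_cutoff <- predict(cef_1, at_cutoff) - predict(cef_0, at_cutoff)
```

Note that no unit need sit exactly at \(X_i = 0\) for this quantity to be well-defined; it is a property of the fitted functions, not of any individual unit.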
7.2 Types of inquiries
The largest division in the typology of inquiries is between descriptive and causal inquiries. It is for this reason that Part III, the design library, is organized into descriptive and causal chapters, separated by whether the data strategy is observational or experimental. In this section, we describe other important ways inquiries vary and how to think about declaring them.
7.2.1 Data-dependent inquiries
Most of the inquiries we have introduced thus far depend on variables in the model, but not on features of the data and answer strategies. However, some common inquiries do depend on realizations of the research design.
The first type depends on realizations of the data \(d\): inquiries about units within a sample depend on which units enter the sample; inquiries about treated units depend on which are treated. For example, the average treatment effect on the treated (ATT) is a data-dependent inquiry in the sense that it is the average effect of treatment among the particular set of units that happened to be randomly assigned to treatment. The value of that particular ATT doesn’t change depending on the data strategy, of course, but which ATT we end up estimating depends on the realization of the data strategy. Table 7.3 describes three data-dependent inquiries.
Inquiry | Units | Treatment conditions | Outcomes | Code |
---|---|---|---|---|
Average treatment effect in a sample (SATE) | Sampled units | D = 0, D = 1 | Y | mean(Y_D_1[S == 1] - Y_D_0[S == 1]) |
Average treatment effect on the treated (ATT) | Treated units | D = 0, D = 1 | Y | mean(Y_D_1[D == 1] - Y_D_0[D == 1]) |
Average treatment effect on the untreated (ATU) | Untreated units | D = 0, D = 1 | Y | mean(Y_D_1[D == 0] - Y_D_0[D == 0]) |
7.2.2 Causal attribution inquiries
A causal attribution inquiry is a different kind of data-dependent inquiry. A causal effect inquiry focuses on the change in an outcome that would be induced by a change in the causal variable, irrespective of the values that the outcome takes in the realized data. By contrast, causal attribution inquiries condition on realized outcomes, such as “the absence of the outcome in the hypothetical absence of the treatment (\(Y_i(0) = 0\)) given the actual presence of both (\(D_i = Y_i = 1\))” (Yamamoto 2012, 240–41). In other words, had this feature been different, would the outcome have been different? Goertz and Mahoney (2012) and others refer to causal attribution inquiries as cause-of-effects questions because they start with an outcome (an “effect”) and seek to validate a hypothesis about its cause.
The dependence of these inquiries on actual outcomes makes them harder (though not impossible!) to answer with the tools of quantitative science, though they are often of central interest to scientific and policy agendas and have been the focus of a large number of qualitative studies. Questions like “Was economic crisis necessary for democratization in the Southern Cone of Latin America?” or “Were high levels of foreign investment in combination with soft authoritarianism and export-oriented policies sufficient for the economic miracles in South Korea and Taiwan?” are examples of such inquiries (Goertz and Mahoney 2012). Though they bear a resemblance to causal effect inquiries that focus on observed subsets (such as the average treatment effect on the treated, or ATT),^{2} it is important not to confuse the two kinds of inquiries.
While it is increasingly common to explicitly formalize causal effect inquiries, it is less common to formalize causal attribution inquiries. Doing so, however, can provide the specificity required to diagnose a design. Pearl (1999) provides formal definitions for these inquiries using the language of causal necessity and sufficiency, depicted in Table 7.4. To put these inquiries in the context of the democratic peace hypothesis (which states that no two democracies will go to war), for example, in a given country dyad-year, \(Y_i = 1\) and \(D_i = 1\) could represent “Peace” and “Both democracies” and \(Y_i = 0\) and \(D_i = 0\) could represent “War” and “Not both democracies.”^{3} Then \(\Pr(Y_i(D_i = 0) = 0 \mid D_i = Y_i = 1)\) asks, among peaceful, fully democratic dyads, what is the proportion that would have had wars were they not both democracies—that is, in what proportion of dyad-years was democracy a necessary cause of peace? Similarly, \(\Pr(Y_i(D_i = 1)=1 \mid D_i = Y_i = 0)\) asks, among dyads that had a war and at least one non-democracy in a given year, what is the proportion that would have experienced peace if both countries were democracies—in other words, in what proportion of cases would democracy have been sufficient to cause peace? Yamamoto (2012) extends this account to causal attribution inquiries for particular subsets, such as compliers.
Inquiry | Units | Treatment conditions | Outcomes | Code |
---|---|---|---|---|
Probability D necessary for \(Y\) | Units for whom D = 1 and Y = 1 | D = 0 | Y | mean(Y_D_0[D == 1 & Y == 1] == 0) |
Probability D sufficient for \(Y\) | Units for whom D = 0 and Y = 0 | D = 1 | Y | mean(Y_D_1[D == 0 & Y == 0] == 1) |
Complier probability D necessary for \(Y\) | Units for whom D = 1 and Y = 1 who are compliers | D = 0, Z = 1, Z = 0 | Y | mean(Y_D_0[D == 1 & Y == 1 & D_Z_1 == 1 & D_Z_0 == 0] == 0) |
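A sketch of the probability-of-necessity inquiry computed from simulated potential outcomes. The data-generating process (including the assumption that treatment can only help) is invented for illustration:

```r
# Sketch: among treated units with Y = 1, what share would have had
# Y = 0 absent treatment? (The probability D was necessary for Y.)
set.seed(3)
N <- 1000
D <- rbinom(N, 1, 0.5)
Y_D_0 <- rbinom(N, 1, 0.2)               # outcome if untreated
Y_D_1 <- pmax(Y_D_0, rbinom(N, 1, 0.5))  # assumed monotonic: treatment only helps
Y <- ifelse(D == 1, Y_D_1, Y_D_0)        # realized outcome

# Condition on realized data (D = 1 and Y = 1), then ask a counterfactual
prob_necessary <- mean(Y_D_0[D == 1 & Y == 1] == 0)
```

The data-dependence is visible in the subsetting: the inquiry is defined only over units whose realized treatment and outcome both equal 1.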
7.2.3 Complex counterfactual inquiries
The causal inquiries we have considered thus far have involved comparisons of the counterfactual values an outcome could take, depending on the value of one or more treatment variables. These inquiries are mind-bending in that we have to imagine two counterfactual states at the same time. Complex counterfactual inquiries require more mind bending still.
An example of a complex counterfactual inquiry is the “natural direct effect.” Suppose our model contains a treatment \(Z_i\), a mediator \(M_i\), and an outcome \(Y_i\). The natural direct effect of the treatment is defined as: \[\mathrm{NDE} = Y_i(Z_i=1, M_i=M_i(Z_i=1)) - Y_i(Z_i=0, M_i=M_i(Z_i=1))\]
In the second term of this expression, we have to hold in our minds the complex counterfactual: what is the level of \(Y_i\), when \(Z_i\) equals 0, but \(M\) is at the value it would take if \(Z_i\) equaled one?
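A sketch of the natural direct effect computed from simulated nested potential outcomes. The functional form of \(Y\) and the mediator probabilities are assumptions made up for this example:

```r
# Sketch: the NDE holds the mediator at its Z = 1 value while switching
# the treatment itself between 1 and 0.
set.seed(4)
N <- 1000
M_Z_0 <- rbinom(N, 1, 0.3)   # mediator value each unit would take if Z = 0
M_Z_1 <- rbinom(N, 1, 0.7)   # mediator value each unit would take if Z = 1

# Potential outcome Y(z, m), assumed deterministic for simplicity
Y <- function(z, m) 0.5 * z + 0.3 * m

# The complex counterfactual: Z set to 0, M held at its Z = 1 value
NDE <- mean(Y(1, M_Z_1) - Y(0, M_Z_1))
```

The second term, `Y(0, M_Z_1)`, is a state of the world that can never be directly observed: no unit ever has \(Z_i = 0\) while its mediator takes the value it would take under \(Z_i = 1\).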
7.2.4 Inquiries with continuous causal variables
The foregoing causal inquiries have focused on contrasts between discrete levels of a treatment variable. We can also imagine many different types of inquiries defined in continuous treatment spaces. For example, we could think of the effects of any level of salary from 5 dollars an hour to 500 dollars an hour on workplace satisfaction. We could “discretize” these continuous treatments into bins, in which case we are back to defining inquiries with discrete treatment conditions. Another possibility is to describe the inquiry as the average of the slopes from many lines of best fit: for each subject, we find the line of best fit of the outcome with respect to the treatment, and our inquiry is the average of the resulting slopes.
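The average-of-slopes inquiry can be sketched as follows, with each unit assigned its own dose-response function (the linear dose-response and all parameter values are assumptions for illustration):

```r
# Sketch: with a continuous treatment, one inquiry is the average over
# units of each unit's best-linear-predictor slope of Y on the dose.
set.seed(5)
N <- 200
doses <- seq(5, 500, length.out = 25)   # hourly wage levels under study
# Each unit has its own dose-response slope (heterogeneous effects)
unit_slope <- rnorm(N, mean = 0.01, sd = 0.005)

# For each unit, compute the BLP slope of its potential outcomes on dose
slopes <- sapply(unit_slope, function(b) {
  Y_d <- b * doses                      # unit's potential outcome at each dose
  cov(doses, Y_d) / var(doses)          # unit-level line-of-best-fit slope
})
average_slope_inquiry <- mean(slopes)
```

Here each unit's dose-response is exactly linear, so the unit-level slope recovers `b` exactly; with nonlinear dose-response functions, the unit-level BLP slope would be a linear summary of a curved function.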
7.3 How to define inquiries
There are multiple criteria for choosing an inquiry. We want to pick one that is interesting in its own right or one that would facilitate a real-world decision. We want to pick research questions that we can learn the answer to someday, possibly with a lot of effort, and to avoid infeasible research questions. Among feasible research questions, we want to select those for which we are likely to obtain the most informative answers, in the sense of moving our priors the most.
Sometimes, advisers tell students to follow a “theory-first” route to picking a research question. Read the literature, find an unsolved puzzle, then start choosing among the methodological approaches that might answer the problem. Others are more skeptical of starting with questions that might not be answerable and encourage students to first master tools that can answer particular types of questions and then find places to apply them. You don’t have to subscribe to either of these positions but you do have to keep an eye simultaneously on the substantive importance of questions and the scope for generating informative answers.
The first criterion for a good inquiry is then the subjective importance of the question. The answer may be important for science (building a theoretical understanding of the world) or for decision-making (choosing which policies to implement). Even so, the scientific enterprise is designed around the idea that importance is in the eye of the beholder and is not some objective quantity. This is for two reasons. First, the scientific or practical importance of a discovery may not be understood until decades later, when other pieces of the causal model are put together or the world faces new problems. Second, “importance” differs for different segments of society, and scientists must be able to study questions not judged important by groups in power in order to discover new ways to solve problems faced by the left-out groups.
The second important criterion for a good inquiry is that it should be answerable, or at least partially answerable. The main way an inquiry might not be answerable is that we can’t find a feasible data or answer strategy. When, because of ethical, legal, logistical, or financial constraints, we simply can’t conduct the study, the inquiry is not answerable.
There are subtler ways in which an inquiry might not be answerable. For example, it might be undefined. Inquiries are undefined when \(I\) returns \(I(m) = a_m = \mathrm{NA}\). For example, audit studies sometimes consider the effect of treatment on whether a unit responds to an email and on the tone of the response. However, when no response is sent, the response has no tone. As a result, we can’t learn about the average effect of treatment on tone; we can only learn about the effect in a subgroup: those units who always respond to email, regardless of condition. This new inquiry is well-defined, but hard to estimate (see Coppock 2019).
An inquiry is also not answerable if it is not, at least partially, identifiable. A question is at least partly answerable if there are at least two different sets of data you might observe that would lead you to make two different inferences. In the best case, one might imagine that you have lots of data and each possible data pattern you see is consistent with only one possible answer. You might then say that your model, or inquiry, is identified. Failing that, you might imagine that different data patterns at least let you rule out some answers even though you can’t be sure of the right answer. In this case we have “partial identification.” Some inquiries might not even be partially identifiable. For instance, if we have a model that says an outcome \(Y\) is defined by the equation \(Y=(a+b)X\), no amount of data can tell us the exact values of \(a\) and \(b\). Indeed, without limits on the values of \(a\) and \(b\) (such as \(a\geq0\)), no amount of data can even narrow down the ranges of \(a\) and \(b\). The basic problem is that for any value of \(a\) we can choose a \(b\) that keeps the sum \(a+b\) constant. In this setting, even though there is an answer to our inquiry (\(a\)) in theory, it is not one we can ever learn in practice. Many other types of inquiries, such as mediation inquiries, are not identifiable. There are some circumstances in which we can provide a partial answer to the inquiry, such as learning a range of values within which the parameter lives. At a minimum, we urge you to pose inquiries that are at least partially answerable with possible data.
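The identification problem in the \(Y=(a+b)X\) example can be made concrete in a few lines: two different parameterizations with the same sum generate exactly the same data, so no data pattern can distinguish them.

```r
# Sketch: a and b are not identified in Y = (a + b) * X, because any
# pair of values with the same sum is observationally equivalent.
X <- rnorm(100)
Y_model_1 <- (2 + 3) * X   # a = 2, b = 3
Y_model_2 <- (4 + 1) * X   # a = 4, b = 1

# The two parameterizations produce identical data for every unit
all.equal(Y_model_1, Y_model_2)
```

Any answer strategy applied to these data would return the same inference under both models, which is exactly what it means for \(a\) to be unidentified.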
One place in which a tradeoff between substantive importance and answerability can come to a head is in selecting the population for which an inquiry is defined.
One common approach is to define inferences with respect to a “finite population.” For instance, all US states. You might then sample from this finite population in the data strategy in such a way that you can use the sample to draw inferences about the population. The probability distribution over the exogenous variables simply enumerates the values that these variables take on in the population. Any randomness in the design is generated by the sampling and, perhaps, by assignment procedures, not in the values of the exogenous variables.
A second, and closely related, approach is to define inferences for a finite sample. This is like population inference when you sample the whole population. In a sense the sample is the population. Finite sample inference is common in research designs that involve random assignment of treatments. The only source of randomness in the finite sample setting is the random assignment itself.
A third approach is to think in terms of “superpopulations”, in which we imagine that any particular population is just a draw from an infinite superpopulation. In this case, we can conceive of the randomness in the design as being fundamental—every unit is a random draw from the superpopulation.
Implicitly, if you set up a simulation and draw data using some probability density function, you are drawing from a superpopulation. But you get to specify the target of inference when you specify the inquiry, as in Declaration 7.1.
Declaration 7.1 Super-population, population, and finite sample design
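The declaration itself is not reproduced here; below is a sketch of how such a design might be declared in DeclareDesign. The parameter values (population size, mean of 1, sample size) are assumptions chosen to match the inquiry structure, not the book's exact code:

```r
library(DeclareDesign)

# Sketch (not the book's exact declaration): three inquiries at three levels.
declaration_7.1 <-
  declare_model(N = 100, Y = rnorm(N, mean = 1)) +  # population drawn from a superpopulation with mean 1
  declare_inquiry(superpopulation_mean = 1) +       # the superpopulation parameter itself
  declare_inquiry(population_mean = mean(Y)) +      # mean in the realized population
  declare_sampling(S = complete_rs(N, n = 10)) +    # random sample of 10 units
  declare_inquiry(sample_mean = mean(Y))            # mean among sampled units
```

The superpopulation mean is a fixed parameter, while the population and sample means vary from draw to draw, which is why the three estimand values below differ.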
Here is one draw of the estimands:
draw_estimands(declaration_7.1)
inquiry | estimand |
---|---|
superpopulation_mean | 1.00 |
population_mean | 1.19 |
sample_mean | 1.50 |
Which of these to choose? Researchers sometimes prefer superpopulation inquiries on substantive grounds, seeing an understanding of general processes as the primary goal of social science. Some, however, are skeptical of speculating about general, unobservable processes, preferring to make statements about cases that actually exist in the world; they select populations of substantive importance. Others, seeking to avoid engaging in generalization inference, prefer to focus on sample quantities. In some cases the statistics are better suited to finite populations: for instance, randomization inference is based on the randomness induced by assignment, not on sampling populations from superpopulations or samples from populations.^{4} Critics worry that keeping the focus on the sample means having inquiries that are determined by the realizations of your data strategy rather than having data strategies developed to answer prespecified inquiries.
7.4 Summary
Inquiries define our targets of inference, stated in terms of a model of the world. We described how inquiries are defined with respect to specific outcomes expressed by specific units under specific conditions, which are summarized with a chosen function. Inquiries can be descriptive or causal, depending on whether they involve one treatment condition or several. They can be data-dependent or not, and decomposable or not. They should be well-defined and they should be answerable. When we lose track of our inquiries, research studies can end up estimating whatever the answer strategy happens to target. Researchers should choose their inquiries with intention so that they can select appropriate empirical strategies for them.
Here, we are referring to an “inquiry model” not a “reference model,” as discussed in Section 2.1.1. We provide an example of this type of inquiry model in Section 19.2.↩︎
Specifically, as Yamamoto (2012) points out, the causal attribution inquiry for binary variables can be written \(\Pr(Y_i(0) = 0 \mid D_i = Y_i = 1)\), while the average treatment effect among those successfully treated can be written \(E[Y_i(1) - Y_i(0) \mid D_i = Y_i = 1]\). Given binary outcomes and the additive property of expectations, the ATE among those successfully treated can be written \(\Pr(Y_i(1) = 1 \mid D_i = Y_i = 1) - \Pr(Y_i(0) = 1 \mid D_i = Y_i = 1)\). The causal attribution inquiry can be written as 1 minus the second term of the ATE among the successfully treated.↩︎
Note that we can think of the probability in this statement as implying a population level inquiry—the share with a given feature; or as a representation of a unit level Bayesian answer to the question. See e.g. Dawid, Humphreys, and Musio (2022).↩︎
For some purposes the statistics are easier for the superpopulation quantities: for instance, the Neyman variance estimate is exact in this case but conservative in the finite population case (for instance Aronow, Green, and Lee (2014)).↩︎