IV method and Shift-Share instruments

A quick introduction to IV regressions in causal analysis and Shift Share instruments

Instrumental Variable approach for Causal Inference

If we believe that a variable $X$ causes (or affects) another variable $Y$, we are often interested in how much the value of $Y$ changes when we make a small change $\delta x$ in $X$. For example, how much people dislike traffic on their street can be measured by the decrease in house prices in response to an increase in street traffic. The statistical technique most commonly used to answer such causal questions is linear regression [1]:

$$ Y = \beta X + C\gamma + \epsilon $$

where $C$ is the matrix of control variables and we assume that the standard OLS assumptions hold. However, even when $X$ and $Y$ are causally related, the estimate of $\beta$ is unbiased only if $X$ and $\epsilon$ are independent conditional on $C$. This is often not the case in practice. In our example, the construction of a new housing development in a neighborhood may increase traffic while lowering house prices, but this decrease in prices would not be the effect of increased traffic. Similarly, if a shopping complex is built in a neighborhood, it may increase both traffic and house prices without this being a causal effect. Instrumental Variable estimation addresses this endogeneity problem by using an instrument $Z$ that affects $Y$ only through $X$. By restricting attention to the variation in $Y$ due to variation in $X$ that originates in $Z$, it is possible to obtain unbiased estimates of $\beta$.

Figure 1: IV estimate in the presence of an omitted variable. (a) Causal diagram; (b) variation in each variable, drawn as line segments.

Let us build an intuitive understanding of omitted variable bias in OLS and how IV estimates remove the bias. Consider the causal relation described in Figure 1a. $A$ is an omitted variable that confounds $X$ and $Y$. For concreteness, let’s assume that the true data generating process is given by

$$ \begin{align*} X &= A + Z + \eta \\ Y &= 2X + 2A + \zeta \end{align*} $$

Assume that $A$ and $Z$ are independent of each other and of the noise terms, with mean $0$ and variance $1$, and that the noise $\eta$ has negligible variance. The lengths of the line segments in Figure 1b represent the variation in the variables. Without accounting for the omitted variable, the standard OLS regression $Y = \beta X + \epsilon$ gives the incorrect causal estimate

$$ \hat \beta_{OLS} = \frac{Cov(X,Y)}{Var(X)} = 3 $$
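The value of $3$ can be traced step by step. The following is a short derivation sketch; it writes $\sigma_\eta^2$ for the variance of $\eta$ (a symbol introduced here for illustration) and assumes that $A$, $Z$, $\eta$ and $\zeta$ are mutually independent.

$$ \begin{align*} Var(X) &= Var(A) + Var(Z) + Var(\eta) = 2 + \sigma_\eta^2 \\ Cov(X,Y) &= Cov(X,\, 2X + 2A + \zeta) = 2\,Var(X) + 2\,Cov(X,A) = 2\,Var(X) + 2 \\ \hat \beta_{OLS} &= \frac{Cov(X,Y)}{Var(X)} = 2 + \frac{2}{2 + \sigma_\eta^2} \end{align*} $$

The estimate equals $3$ when $\sigma_\eta^2$ is negligible, and it exceeds the true effect of $2$ for any finite $\sigma_\eta^2$, because the term $2\,Cov(X,A)$ attributes part of the effect of $A$ to $X$.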

In the IV framework, we use an instrument $Z$ that affects $Y$ only through $X$ (the exogeneity requirement). The IV regression proceeds in two stages [2]:

$$ \begin{align*} \text{First stage}&:~~~ X = \delta Z + \hat \eta \implies \hat X = \hat \delta_{IV1} Z \\ \text{Second stage}&:~~~ Y = \beta \hat X + \hat \zeta = \beta \hat \delta_{IV1} Z + \hat \zeta \end{align*} $$

In stage 1, we isolate the variation in $X$ caused by the variation in $Z$, given by $\hat X = \hat \delta_{IV1} Z$. This is represented by the blue part of the line segment for $X$ in Figure 1b. This variation is independent of the omitted variable $A$, and we use it to identify the causal effect of $X$ on $Y$ (represented by the blue part of $Y$’s line segment). The estimates are given by

$$ \begin{align*} \hat \delta_{IV1} &= \frac{Cov(X,Z)}{Var(Z)} = 1 \\ \hat \beta_{IV2} &= \frac{Cov(\hat X,Y)}{Var(\hat X)} = \frac{1}{\hat \delta_{IV1}} \frac{Cov(Y,Z)}{Var(Z)} = \frac{Cov(Y,Z)}{Cov(X,Z)} = 2 \end{align*} $$

which is the correct causal effect of $X$ on $Y$. Although we only looked at omitted variable bias, the IV method is also useful for correcting “simultaneous causality bias” and “errors-in-variables bias”. To wrap up our discussion, let us formally state the conditions required for the IV method [3]:

  1. Relevance: The instrumental variable $Z$ should be correlated with the causal variable of interest $X$. This can be checked from the significance level of the first-stage estimate. A higher correlation in the first stage means that the instrument extracts the exogenous variation in the regressor more effectively, and hence the causal estimate has a lower standard error.
  2. Exogeneity: The instrumental variable is uncorrelated with the error term in the regression equation ($Cor(Z, \epsilon \mid C) = 0$, where $C$ is a vector of controls). Since the error term is unobserved, this condition is not statistically testable and needs to be justified theoretically. Sometimes this is specified as the combination of an exclusion restriction and an as-if random condition.
    • Exclusion restriction: Fixing controls $C$, the instrument $Z$ affects the outcome $Y$ only through $X$.
    • As-if random: Fixing controls $C$, the instrument itself must not be endogenous. This rules out any reverse causality from $Y$ to $Z$.
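The two-stage procedure described above can be checked numerically. The following is a minimal simulation sketch in Python with NumPy, assuming standard normal draws for $A$, $Z$ and $\zeta$ and, to match the value $\hat \beta_{OLS} = 3$, no first-stage noise $\eta$ by default. The function name `simulate_ols_vs_iv` and its parameters are illustrative choices, not part of the original text.

```python
import numpy as np


def simulate_ols_vs_iv(n=200_000, eta_sd=0.0, seed=0):
    """Simulate X = A + Z + eta, Y = 2X + 2A + zeta and compare OLS with IV."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal(n)              # omitted confounder
    Z = rng.standard_normal(n)              # instrument, independent of A
    eta = eta_sd * rng.standard_normal(n)   # first-stage noise (assumed off by default)
    zeta = rng.standard_normal(n)           # outcome noise
    X = A + Z + eta
    Y = 2 * X + 2 * A + zeta

    # OLS slope of Y on X without controls: Cov(X, Y) / Var(X)
    beta_ols = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

    # First stage: slope of X on Z, then the fitted values X_hat
    delta_hat = np.cov(X, Z)[0, 1] / np.var(Z, ddof=1)
    X_hat = delta_hat * Z

    # Second stage: slope of Y on X_hat
    beta_iv = np.cov(X_hat, Y)[0, 1] / np.var(X_hat, ddof=1)
    return beta_ols, delta_hat, beta_iv


beta_ols, delta_hat, beta_iv = simulate_ols_vs_iv()
print(f"OLS: {beta_ols:.3f}, first stage: {delta_hat:.3f}, IV: {beta_iv:.3f}")
```

With a large sample, the OLS slope should land near 3 and the two-stage estimate near 2. Setting `eta_sd` above zero pulls the OLS slope closer to 2 but leaves it biased.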

Shift-Share instruments

In many economic settings, the shocks that a researcher would like to use as instruments operate at a different level than the units being studied. For example, a researcher studying regional labor markets may find plausibly exogenous variation at the industry level — such as changes in trade policy, technology, or national demand that hit specific industries. But the outcome equation is specified at the regional level, because that is where workers live, earn wages, and make decisions.

The shock does not map one-to-one onto regions: a single industry shock affects many regions, and a single region is exposed to many industry shocks, each to a different degree depending on its industrial composition. Shift-share instrumental variables (also known as Bartik instruments) provide a principled way to translate these shock-level instruments to the unit level. This section draws on the practical guide by Borusyak, Hull, and Jaravel (2025).

Structure of a shift-share instrument

Consider a model where we wish to estimate the causal effect of treatment $x_i$ on outcome $y_i$ across units $i$:

$$ y_i = \beta x_i + \gamma' \mathbf{w}_i + \varepsilon_i $$

where $\mathbf{w}_i$ is a vector of controls. As before, OLS is biased when $x_i$ is correlated with $\varepsilon_i$. Suppose that a set of shocks $(g_1, \ldots, g_K)$ at some other level $k$ (industries, origin countries, age groups, etc.) is plausibly exogenous. Each unit $i$ has a known vector of exposure shares $(s_{i1}, \ldots, s_{iK})$ that captures how much unit $i$ is exposed to each shock $k$. The shift-share instrument aggregates these into a single unit-level variable:

$$ z_i = \sum_{k=1}^{K} \underbrace{s_{ik}}_{\text{Share}} \cdot \underbrace{g_k}_{\text{Shift}} $$

When the shares sum to one, $z_i$ is a share-weighted average of the shifts. The instrument inherits its exogeneity from the shifts, while the shares give it cross-sectional variation at the unit level: different regions are affected differently by the same set of national shocks because their industrial compositions differ.
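In matrix form, the instrument is the product of the units-by-shocks share matrix and the vector of shifts. The sketch below illustrates this with made-up numbers; the helper name `shift_share_instrument` and the example values are hypothetical.

```python
import numpy as np


def shift_share_instrument(shares, shifts):
    """Return z_i = sum_k s_ik * g_k for every unit i."""
    shares = np.asarray(shares, dtype=float)   # shape (n_units, n_shocks)
    shifts = np.asarray(shifts, dtype=float)   # shape (n_shocks,)
    if shares.ndim != 2 or shifts.ndim != 1 or shares.shape[1] != shifts.shape[0]:
        raise ValueError("shares must be (n_units, K) and shifts must be (K,)")
    return shares @ shifts


# Two regions, three industries; each row of shares sums to one.
shares = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
shifts = np.array([0.10, -0.05, 0.02])   # illustrative industry-level shocks

z = shift_share_instrument(shares, shifts)
print(z)
```

The two regions face the same three shocks but receive different instrument values because their exposure shares differ.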

Example 1: Inverse elasticity of regional labor supply

The canonical example is Bartik’s (1991) instrument for local labor demand. Suppose we want to estimate the inverse elasticity of regional labor supply $\beta$ by relating wage growth $y_i$ to employment growth $x_i$ across regions $i$. Local employment growth can be decomposed across industries:

$$ x_i = \sum_k \frac{X_{ik0}}{X_{i0}} \cdot x_{ik} $$

where $X_{ik0}/X_{i0}$ is the initial employment share of industry $k$ in region $i$, and $x_{ik}$ is the local industry growth rate. To isolate demand-driven variation, the Bartik instrument replaces the local industry shifts $x_{ik}$ with national industry growth rates $g_k$, while keeping the local shares $s_{ik} = X_{ik0}/X_{i0}$. The national growth rates proxy for aggregate demand shifts and should be uncorrelated with local labor supply shocks.
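The decomposition and the substitution of national for local growth rates can be illustrated with a toy example. This is a sketch with hypothetical employment numbers; as a simplification, the national growth rate is computed from all regions, including the region itself.

```python
import numpy as np

# Hypothetical employment levels: rows are regions, columns are industries.
emp_start = np.array([[100.0, 50.0],
                      [20.0, 80.0]])
emp_end = np.array([[110.0, 45.0],
                    [26.0, 88.0]])

# Initial shares s_ik = X_ik0 / X_i0 and local industry growth rates x_ik.
shares = emp_start / emp_start.sum(axis=1, keepdims=True)
local_growth = (emp_end - emp_start) / emp_start

# Regional employment growth x_i, computed directly and via the decomposition.
regional_growth = (emp_end.sum(axis=1) - emp_start.sum(axis=1)) / emp_start.sum(axis=1)
decomposed = (shares * local_growth).sum(axis=1)

# Bartik instrument: same shares, but national industry growth rates g_k.
national_growth = (emp_end.sum(axis=0) - emp_start.sum(axis=0)) / emp_start.sum(axis=0)
bartik = shares @ national_growth

print(regional_growth, decomposed, bartik)
```

The share-weighted sum of local growth rates reproduces regional growth exactly, while the Bartik instrument differs from it because it discards the region-specific part of each industry's growth.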

Example 2: US labor market impact of Chinese import competition

Autor, Dorn, and Hanson (2013) study how Chinese import competition affected US local labor markets. Their treatment is the change in import exposure per worker in commuting zone $i$, measured as a share-weighted sum of industry-level changes in US imports from China. Since realized US imports may reflect US-specific demand shocks, they instrument with Chinese import growth in eight other high-income countries:

$$ \Delta IPW_{oit} = \sum_j \underbrace{\frac{L_{ij,t-1}}{L_{uj,t-1}}}_{\text{Share}} \cdot \underbrace{\frac{\Delta M_{ocjt}}{L_{i,t-1}}}_{\text{Shift}} $$

where $L_{ij,t-1}/L_{uj,t-1}$ is region $i$’s lagged share of national employment in industry $j$, $L_{i,t-1}$ is region $i$’s lagged total employment, and $\Delta M_{ocjt}$ is the change in imports from China to other developed markets in industry $j$. Lagged employment avoids simultaneity. The identification relies on China’s export surge being driven by internal supply-side factors (productivity growth, WTO accession) rather than correlated demand shocks across importing countries.
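One way to read this expression in the $z_i = \sum_k s_{ik} g_k$ form is to regroup each term as $\frac{L_{ij,t-1}}{L_{i,t-1}} \cdot \frac{\Delta M_{ocjt}}{L_{uj,t-1}}$, so that the share is industry $j$'s share of region $i$'s employment and the shift varies only by industry. The sketch below checks numerically that the two groupings give the same instrument. The data are random placeholders, and national industry employment is assumed to be the sum over the simulated regions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, n_industries = 4, 6

# Hypothetical lagged employment L_ij and industry-level import changes dM_j.
L = rng.uniform(10.0, 100.0, size=(n_regions, n_industries))
dM = rng.normal(size=n_industries)

L_region = L.sum(axis=1, keepdims=True)     # L_i, shape (n_regions, 1)
L_industry = L.sum(axis=0, keepdims=True)   # L_uj, shape (1, n_industries)

# Grouping as written: (L_ij / L_uj) * (dM_j / L_i), summed over industries.
as_written = ((L / L_industry) * (dM / L_region)).sum(axis=1)

# Regrouped: shares L_ij / L_i (summing to one) times shifts dM_j / L_uj.
shares = L / L_region
shifts = dM / L_industry.ravel()
regrouped = shares @ shifts

print(np.allclose(as_written, regrouped))
```

In the regrouped form the shares sum to one within each region and the shifts are purely industry-level, which matches the structure defined earlier.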

Two paths to identification

As discussed above, causal identification in the IV method requires the instrument to be relevant and exogenous. The core challenge with any shift-share instrument is arguing that $z_i$ is exogenous, i.e. uncorrelated with $\varepsilon_i$. Since $z_i$ combines two distinct sources of variation (shifts and shares), there are two paths to making this argument.

Path 1: Exogenous shifts

The first strategy places the exogeneity burden on the shifts. If each shift $g_k$ is as-good-as-randomly assigned and only affects the outcome through the treatment, then a weighted average of many such random shifts will also be exogenous. The intuition is that of a “weighted average of lotteries”: if each industry shock is like an independent draw, then the shift-share instrument, which averages across many of them, inherits this randomness by a law of large numbers argument.

What makes this powerful is that the shares need not be exogenous at all. Regions that specialize in high-skill industries may have systematically different unobservables from those that specialize in low-skill industries. But as long as the shocks hitting high-skill and low-skill industries are themselves random, these compositional differences wash out in expectation.

More formally, exogeneity of the instrument requires only a condition that is weaker than full randomization of each $g_k$:

$$ \mathbb{E}\left[ g_k \left( \sum_i s_{ik} \varepsilon_i \right) \right] = 0 $$

This is sufficient for $\mathbb{E}[z_i \varepsilon_i] \approx 0$. In words: each shift must not be systematically correlated with the idiosyncratic unobservables of the units most heavily exposed to it.

Returning to the China shock example: even if regions with more manufacturing employment have systematically different labor market trends, the instrument is valid as long as China’s productivity shocks across industries are unrelated to US regional labor market conditions.
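The link between the shift-level condition and unit-level exogeneity is an exchange of sums: $\frac{1}{N}\sum_i z_i \varepsilon_i = \sum_k g_k \left( \frac{1}{N} \sum_i s_{ik} \varepsilon_i \right)$. The following simulation sketch checks this identity and illustrates that the moment is small when many shifts are drawn at random, even though the shares are deliberately correlated with $\varepsilon_i$. The function name, sample sizes and the way endogeneity is built into the shares are all illustrative assumptions.

```python
import numpy as np


def simulate_moment(n_units=2000, n_shifts=500, seed=0):
    """Return the unit-level and shift-level versions of the sample moment."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_units)                  # unit-level unobservables

    # Endogenous shares: units with high eps are more exposed to the first
    # half of the shocks (an assumption made only for this illustration).
    raw = np.exp(rng.standard_normal((n_units, n_shifts)))
    raw[:, : n_shifts // 2] *= np.exp(0.5 * eps)[:, None]
    shares = raw / raw.sum(axis=1, keepdims=True)       # rows sum to one

    g = rng.standard_normal(n_shifts)                   # random shifts, independent of eps
    z = shares @ g

    unit_level = np.mean(z * eps)                       # (1/N) sum_i z_i eps_i
    eps_bar = shares.T @ eps / n_units                  # (1/N) sum_i s_ik eps_i, per shock k
    shift_level = g @ eps_bar                           # sum_k g_k * eps_bar_k
    return unit_level, shift_level


unit_level, shift_level = simulate_moment()
print(unit_level, shift_level)
```

The two numbers agree up to floating-point error, and both should be close to zero. Rerunning with a very small `n_shifts` typically produces a much larger moment, which is the reason many shifts are needed.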

Requirements:

  • Many shifts $g_1, \ldots, g_K$ are necessary. Otherwise, there might be spurious correlation between the shifts $g_k$ and the share-weighted unobservables $\bar \varepsilon_k = \sum_i s_{ik} \varepsilon_i$.
  • Shares must sum to one for each unit ($\sum_k s_{ik} = 1$). Otherwise, $\mathbb{E}[z_i \mid s_{ik}] = \mu \sum_k s_{ik}$, where $\mu = \mathbb{E}[g_k]$. Units with a larger sum of shares then systematically get higher values of the instrument (for $\mu > 0$), and this sum may be correlated with $\varepsilon_i$, creating bias. Including $S_i = \sum_k s_{ik}$ as a control resolves this.
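The incomplete-shares problem can be illustrated with a small simulation. This is a sketch under stated assumptions: shifts have mean $\mu = 1$, the sum of shares $S_i$ varies across units, and the unobservable $\varepsilon_i$ is constructed to depend on $S_i$. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_shifts, mu = 5000, 200, 1.0

# Incomplete shares: complete weights scaled by a unit-specific total S_i < 1.
weights = rng.dirichlet(np.ones(n_shifts), size=n_units)   # rows sum to one
share_sum = rng.uniform(0.2, 1.0, size=n_units)            # S_i
shares = share_sum[:, None] * weights

g = mu + rng.standard_normal(n_shifts)                     # shifts with mean mu
z = shares @ g

# Unobservable that is correlated with the sum of shares (assumed for illustration).
eps = 2.0 * (share_sum - share_sum.mean()) + rng.standard_normal(n_units)

# Without controlling for S_i, the instrument is correlated with eps.
cov_raw = np.cov(z, eps)[0, 1]

# Controlling for S_i: residualize z on a constant and S_i.
design = np.column_stack([np.ones(n_units), share_sum])
coef, *_ = np.linalg.lstsq(design, z, rcond=None)
z_resid = z - design @ coef
cov_controlled = np.cov(z_resid, eps)[0, 1]

print(cov_raw, cov_controlled)
```

The raw covariance is clearly positive because $z_i$ scales with $S_i$, while the covariance after partialling out $S_i$ is close to zero.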

Practical considerations:

  • Shift-share aggregates of any shift-level confounders $q_k$ should be included as controls (i.e. control for $\sum_k s_{ik} q_k$)
  • Shares should be lagged to the beginning of the natural experiment to avoid the shifts themselves reshaping the shares
  • Standard errors should be “exposure-robust”, obtained from the equivalent shift-level IV regression (available via the ssaggregate package in Stata and R)
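The "equivalent shift-level IV regression" mentioned above rests on an algebraic identity: the unit-level shift-share IV coefficient equals the coefficient from a shift-level IV regression of exposure-weighted average outcomes on exposure-weighted average treatments, instrumented by $g_k$ and weighted by average exposure $s_k = \frac{1}{N}\sum_i s_{ik}$. The sketch below checks only this point-estimate equivalence on simulated data with a constant as the sole control. It does not reproduce the ssaggregate package or compute exposure-robust standard errors, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_shifts = 300, 40

shares = rng.dirichlet(np.ones(n_shifts), size=n_units)   # rows sum to one
g = rng.standard_normal(n_shifts)
z = shares @ g
x = z + rng.standard_normal(n_units)                      # simulated treatment
y = 1.5 * x + rng.standard_normal(n_units)                # simulated outcome

# Residualize outcome and treatment on the controls (here just a constant).
y_perp = y - y.mean()
x_perp = x - x.mean()

# Unit-level shift-share IV coefficient.
beta_unit = (z @ y_perp) / (z @ x_perp)

# Shift-level aggregates: exposure-weighted averages for each shock k.
exposure = shares.sum(axis=0)                             # sum_i s_ik
s_k = exposure / n_units                                  # average exposure weights
y_bar = shares.T @ y_perp / exposure
x_bar = shares.T @ x_perp / exposure

# Shift-level IV coefficient with weights s_k and instrument g_k.
beta_shift = np.sum(s_k * g * y_bar) / np.sum(s_k * g * x_bar)

print(beta_unit, beta_shift)
```

The two coefficients coincide up to floating-point error, which is why inference can be carried out at the level of the shifts.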

Path 2: Exogenous shares

A different strategy assumes that the exposure shares $s_{ik}$ are exogenous. This can be interpreted as each share satisfying a parallel trends condition: outcomes of units with high versus low values of $s_{ik}$ would have trended similarly absent the treatment $g_k$.

Under share exogeneity, the shift-share estimate can be viewed as pooling together $K$ “one-at-a-time” estimates, each using a single share $s_{ik}$ as the instrument. The exogenous shares approach is appropriate when the researcher is comfortable using any of the individual shares as an exogenous instrument, that is, when there are no conceivable unobserved shocks that affect the outcome via the same shares used to construct the instrument. This is bolstered when the shares are “tailored” to the treatment, in the sense of mediating only the shocks to $x_i$ and not a broad set of shocks that might affect $y_i$.

For example, Card (2009) studies the effect of immigration on native wages across US cities. The share $s_{ik}$ is the fraction of immigrants from origin country $k$ living in city $i$ in 1980, and the shift $g_k$ is the national inflow of immigrants from country $k$ in later decades. The share of Cuban immigrants in Miami, say, is a plausible instrument because it is “tailored”: it captures exposure specifically to Cuban immigration shocks, not to labor market shocks in general. The parallel trends assumption is that cities with high versus low Cuban immigrant shares would have seen similar wage trends absent the immigration surge.

Practical considerations:

  • Shares must be “tailored” to the treatment, not “generic” (capturing exposure to many types of shocks)
  • Balance tests should be performed on individual shares, focusing on those with high Rotemberg weights
  • Rotemberg weights (computed via the bartik_weight command) measure the importance of each share instrument and the sensitivity of the estimate to violations of exogeneity for each share
  • Sensitivity to alternative ways of combining share instruments should be checked (e.g. overidentification tests, visual instrumental variable plots)
  • Standard heteroskedasticity- or cluster-robust standard errors are appropriate
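The "pooling" interpretation and the role of Rotemberg weights can be made concrete. Following the decomposition in Goldsmith-Pinkham, Sorkin, and Swift (2020), the shift-share IV coefficient is a weighted sum of just-identified estimates $\hat \beta_k$, each using one share as the instrument, with weights $\alpha_k$ proportional to $g_k \sum_i s_{ik} x_i^\perp$. The sketch below is a minimal NumPy illustration on simulated data with a constant as the only control. It is not the bartik_weight command, and the function name `rotemberg_decomposition` is hypothetical.

```python
import numpy as np


def rotemberg_decomposition(shares, g, x, y):
    """Return Rotemberg weights, per-share IV estimates and the pooled estimate."""
    x_perp = x - x.mean()            # residualize on a constant only
    y_perp = y - y.mean()

    sx = shares.T @ x_perp           # sum_i s_ik * x_i_perp for each k
    sy = shares.T @ y_perp           # sum_i s_ik * y_i_perp for each k

    alpha = g * sx / np.sum(g * sx)  # Rotemberg weights, summing to one
    beta_k = sy / sx                 # just-identified IV estimate using share k

    z = shares @ g
    beta_ss = (z @ y_perp) / (z @ x_perp)   # shift-share IV estimate
    return alpha, beta_k, beta_ss


rng = np.random.default_rng(2)
n_units, n_shifts = 500, 5
shares = rng.dirichlet(np.ones(n_shifts), size=n_units)
g = rng.standard_normal(n_shifts)
x = shares @ g + 0.5 * rng.standard_normal(n_units)   # simulated treatment
y = 1.5 * x + rng.standard_normal(n_units)            # simulated outcome

alpha, beta_k, beta_ss = rotemberg_decomposition(shares, g, x, y)
print(alpha, beta_k, beta_ss)
```

Shares with large weights $\alpha_k$ dominate the pooled estimate, so those are the shares whose exogeneity deserves the closest scrutiny. Individual weights can be negative.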

Choosing between the two approaches

| | Exogenous Shifts | Exogenous Shares |
| --- | --- | --- |
| Identification | Shifts are as-good-as-randomly assigned and only affect the outcome through the treatment | Each share satisfies parallel trends: units with high vs. low shares would have trended similarly absent treatment |
| Estimation | Control for the sum of shares (if not one) and shift-share aggregates of shift-level controls | Check robustness to using share instruments directly (e.g. one share at a time, or pooled via 2SLS/LIML/GMM) |
| Inference | Exposure-robust standard errors from the equivalent shift-level IV regression | Conventional heteroskedasticity- or cluster-robust standard errors |
| Balance tests | For both the shift-share instrument and the shifts | For both the instrument and shares with high Rotemberg weights |
| Do not use when… | Shifts are too few or endogenous to use directly as instruments | Shares are “generic” (capturing exposure to many types of shocks) |

In some settings one approach is clearly more appropriate. For instance, the exogenous shifts approach requires many shifts for the law of large numbers argument to work, while the exogenous shares approach requires shares that are tailored to the specific treatment. In other settings, thinking through the potential bias and efficiency properties under each approach can help the researcher decide.

References

  • Autor, David H., David Dorn, and Gordon H. Hanson. 2013. “The China Syndrome: Local Labor Market Impacts of Import Competition in the United States.” American Economic Review 103(6): 2121–68.
  • Bartik, Timothy J. 1991. Who Benefits from State and Local Economic Development Policies? W. E. Upjohn Institute for Employment Research.
  • Borusyak, Kirill, Peter Hull, and Xavier Jaravel. 2022. “Quasi-Experimental Shift-Share Research Designs.” Review of Economic Studies 89(1): 181–213.
  • Borusyak, Kirill, Peter Hull, and Xavier Jaravel. 2025. “A Practical Guide to Shift-Share Instruments.” Journal of Economic Perspectives 39(1): 181–204.
  • Card, David. 2009. “Immigration and Inequality.” American Economic Review 99(2): 1–21.
  • Goldsmith-Pinkham, Paul, Isaac Sorkin, and Henry Swift. 2020. “Bartik Instruments: What, When, Why, and How.” American Economic Review 110(8): 2586–2624.
  • Adão, Rodrigo, Michal Kolesár, and Eduardo Morales. 2019. “Shift-Share Designs: Theory and Inference.” Quarterly Journal of Economics 134(4): 1949–2010.