Content overview | ECON3500: Econometrics and Applications

Exam 3 - Ch 8-10, 12

Tue, 14 Apr 2026 00:00:00 +0000


Coverage guide

The Exam 3 Coverage Guide summarizes everything you need to know for the exam — calculations, interpretations, and key concepts by chapter.

What should I know?

See the specific chapter guides from Chapter 8, Chapter 9, Chapter 10, and Chapter 12. You should understand all terms and definitions (be able to define and also apply them). You should also be comfortable interpreting regression output either written out in equation form or as Stata output. The level of PS4/PS5 problems is about right in terms of difficulty.

There will not be a Stata component to the exam.

I need more practice!

I recommend the following resources:

Odd-numbered problems from Stock and Watson - solutions are available here
In-class practice
Practice exam

What is the structure of the exam?

The exam will be a lot like a math exam - solving some problems and showing your work. I would expect about 3 multi-part questions. There may be some questions that involve definitions and/or true-false.

The exam is closed book with a calculator allowed. You are allowed to bring your own one-page formula sheet.

What is allowed?

Pen or pencil
One page formula sheet (double sided)
Calculator (basic, scientific, graphing)
Drink and quiet snack

What is not allowed?

~~Phone~~
~~Other notes~~
~~Textbook~~
~~Tablet/Laptop~~
~~Earbuds/headphones~~

How can I prepare?

A few ideas, depending on where you’re feeling least secure:

Work through problem sets and previous in-class exercises (posted in their respective content sections). Do them without looking at your past work!
Make one-page chapter guides. These are handy resources for the exam and a good way to make sure you’ve thought about what’s important
Make yourself a formula sheet. Useful to have, and also the act of creating it will help you
Watch videos/review slides when you have gaps.

In-class practice

In-class practice activities:

CH9 Activity: Activity · Solutions
CH10 Activity: Activity · Solutions
CH12 Activity: Activity · Solutions

Slides

📊 HTML 📥 Download PDF

Practice exam

Use this for additional practice and to get a sense of the format and difficulty of questions. It covers the same topics as this exam (panel data/fixed effects, instrumental variables, difference-in-differences).

(I strongly recommend completing the practice exams before reviewing the solutions)

Week 13 - Instrumental Variables

Sun, 12 Apr 2026 00:00:00 +0000

Overview

Let’s wrap up with a final hurrah - one more empirical tool to help us identify causal impacts.

Reading Guide

Chapter 12: Instrumental Variables Regression

SW 12.1 The IV Estimator with a Single Regressor and a Single Instrument

Make sure you understand the difference between endogenous variables (correlated with the error term $u$) and exogenous variables (not correlated with it). The textbook covers two conditions for a valid instrument: relevance ($corr(Z_i,X_i) \neq 0$) and exogeneity ($corr(Z_i,u_i) = 0$). To make the assumptions of IV explicit, I prefer to include an exclusion assumption: that the instrument $Z$ affects $Y$ only through $X$. If you come away from this section with the right vocabulary and a good intuition for what an instrument “does,” then you’re in good shape!

I find the derivation (in the text and video) helpful, but it’s not something I’d test you on.

SW 12.2 The General IV Regression Model

Now we get a bit technical. The key things here are understanding two-stage least squares (what it is and how it works) and having a good grasp of its assumptions.
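If it helps to see the two stages spelled out, here is a minimal simulation sketch. Everything in it is made up for illustration: the data are simulated, the true coefficient is set to 2, and the helper `ols` is just a thin wrapper around NumPy's least-squares solver. It is meant to show the mechanics, not how you would run IV in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Simulated data (all values hypothetical): z is the instrument,
# u is the error term, and x is endogenous because it depends on u.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 1.0 * z + 0.8 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u  # true beta_1 = 2


def ols(y, X):
    """Return OLS coefficients for y on X (X already includes a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Naive OLS of y on x: biased because corr(x, u) != 0.
beta_ols = ols(y, np.column_stack([const, x]))[1]

# Stage 1: regress x on z and keep the fitted values.
pi_hat = ols(x, np.column_stack([const, z]))
x_hat = pi_hat[0] + pi_hat[1] * z

# Stage 2: regress y on the fitted values from stage 1.
beta_2sls = ols(y, np.column_stack([const, x_hat]))[1]

print(f"OLS slope:  {beta_ols:.3f}")
print(f"2SLS slope: {beta_2sls:.3f}")
```

The OLS slope should land noticeably above 2, while the 2SLS slope should sit close to it. One caution: running the second stage by hand like this gives the right coefficient but the wrong standard errors, which is why in practice you let the software's IV routine do both stages.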

SW 12.3 Checking Instrument Validity

Make sure you know what a weak instrument is, why it’s a problem, and how we identify it. Make sure you understand the logic behind whether an instrument violates its exogeneity assumption.
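To make the weak-instrument check concrete, here is a small sketch that computes the first-stage F-statistic for one strong and one weak instrument. The data are simulated, the coefficients 0.50 and 0.02 are arbitrary choices, and `first_stage_f` is a hypothetical helper that uses the homoskedasticity-only F formula for a single instrument.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500


def first_stage_f(x, z):
    """Homoskedasticity-only F-statistic for the slope in a regression of x on one z."""
    Z = np.column_stack([np.ones(len(z)), z])
    coef = np.linalg.lstsq(Z, x, rcond=None)[0]
    ssr_u = np.sum((x - Z @ coef) ** 2)   # unrestricted: constant + z
    ssr_r = np.sum((x - x.mean()) ** 2)   # restricted: constant only
    q, k = 1, 1                           # one restriction, one regressor
    return ((ssr_r - ssr_u) / q) / (ssr_u / (len(x) - k - 1))


z = rng.normal(size=n)
noise = rng.normal(size=n)

x_strong = 0.50 * z + noise  # instrument moves x a lot
x_weak = 0.02 * z + noise    # instrument barely moves x

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)

print(f"First-stage F, strong instrument: {f_strong:.1f}")
print(f"First-stage F, weak instrument:   {f_weak:.1f}")
```

The textbook rule of thumb is that a first-stage F-statistic below 10 signals a weak instrument. The strong case here should clear that bar easily, while the weak case will typically fall well short of it.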

nobody will remember:

your salary

how “busy you were”

how many hours you worked

people will remember:

when your instrumental variable violated the exclusion restriction

— @dggoldst.bsky.social
https://bsky.app/profile/dggoldst.bsky.social/post/3kzxkv76lds2v

Skip the test of overidentifying restrictions

SW 12.4-12.6 The Other Stuff

Good content and examples, but nothing new.

Other resources

Getting your head around IV can be tricky! Here are a few resources if you want to get another take:

Rebecca Barter: Understanding instrumental variables: Helpful diagrams!
Pierre-Louis Vézina: Instrumental Variables - an intuitive guide: Overview plus a few more examples of IVs in the literature
Nick Huntington-Klein has the best econometrics resources 🔥 🔥.
- Check out his lecture slides on instrumental variables, an accompanying video lecture here.
- Also, some fancy animations to show what IV actually does w/ data

Slides

📊 HTML 📥 Download PDF

Week 14 - Research Week

Sun, 12 Apr 2026 00:00:00 +0000

Writing Papers

Slides

📥 Download this week's slides

Presenting results

Empirical specification template

A helpful, five-step template!¹

(1) First, I state in broad terms what type of model I am using (probably OLS!) and what I am going to do with it. If I’m using a fancy specification like difference-in-differences or panel data, now is a good time to mention this.

I use ordinary least squares to estimate the relationship between ice cream flavor consumed and risk aversion, using state-level fixed effects to control for time-invariant determinants of risk aversion that may vary by state.

(2) Then, I’m going to write the population model I am estimating in equation form. I’m going to use appropriate subscripts:

$outcome_{is} = \beta_0 + \beta_1 VariableName_{is} + \beta_2 AnotherVariableName_{is} + f_s + u_{is}$

Use descriptive names (these don’t actually have to be your variable names)
If you use fixed effects, you can notate these without having to expand them:
- adding state fixed effects could be written as $ … + f_s + …$, for example
- or, you could add them this way: $\sum_{s=1}^{50}D_s$ for a set of 50 state-level fixed effects (with one omitted)
List covariates you include. If you have a ton of these, you could include a vector of individual-specific covariates ($X'\gamma$, for example) and then list them in paragraph form. However, this isn’t likely for most papers.

(3) Then, I’m going to define my independent and dependent variables in a list or paragraph form. I’m going to also describe what the subscripts are

Where $outcome_{is}$ is measured risk aversion for individual $i$ living in state $s$, following the scale described above. $VariableName_{is}$ is a measure of flavor intensity, standardized around Edy’s French Vanilla …

I’ll want to include any controls and fixed effects.

(4) I’ll mention any special things I do when coding (missing value flags, etc), and what types of standard errors I’m using

(5) Now, I write how I will interpret my coefficient of interest (causal, correlational, etc.) - what will the coefficients on my key indicator variables tell me? Are there any key identifying assumptions at play?

If you are using multiple specifications, then you have two options:

If it’s a matter of adding additional controls, then mention that you will also add them in a second model in part (3).
If it’s a fundamentally different population model, then include a second model, and define any terms that were not previously defined. Make sure to discuss how the interpretation of results would differ.

Working with `outreg2`

Download sample code, which uses graduation.dta

Introductions and conclusions

The introduction formula

Listen to Keith Head and follow “The Introduction Formula”

General guidance for paper writing

Four steps to an applied micro paper, Jesse Shapiro
How to write applied papers in economics, Marc Bellemare
Economical Writing, Deirdre McCloskey
Writing papers: a checklist, Michael Kremer

¹ Use the template, but please do not plagiarize the template.

Week 11/12 - Panel data

Sun, 22 Mar 2026 00:00:00 +0000

Overview

We’ve complained a lot about the challenges of obtaining causal estimates. Now, let’s do something about it!

Tuesday covers panel data, first differencing, and fixed effects — the tools that let us eliminate time-invariant omitted variable bias.

Thursday covers difference-in-differences (DiD) — a research design that uses these tools to estimate the causal effects of policies and events. DiD isn’t actually in Chapter 10¹, but it fits here nicely.

Reading Guide

Chapter 10: Regression with Panel Data

SW 10.1 Panel data

Make sure you know the difference between cross-sectional, time-series, and panel data. Also, what does it mean to be balanced?

SW 10.2 Panel data with two time periods: “Before and after” comparisons

We can control for variables that are constant over time by differencing them out. Do you understand how this works?
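Here is a sketch of the differencing logic with two time periods. The notation is an assumption for illustration: $Z_i$ stands for any factor that differs across entities but is constant over time, whether or not we observe it.

$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}, \quad t = 1, 2$$

Subtracting the period 1 equation from the period 2 equation:

$$Y_{i2} - Y_{i1} = \beta_1 (X_{i2} - X_{i1}) + (u_{i2} - u_{i1})$$

Both $\beta_0$ and $\beta_2 Z_i$ appear in each period's equation, so they cancel. We can estimate $\beta_1$ from the changes without ever measuring $Z_i$.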

SW 10.3 Fixed effects regression

Instead of differencing, let’s include a bunch of entity-specific intercepts. What are we doing, and why does it work? When is it like the before-and-after comparisons from the previous section?
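If you want to check your intuition numerically, the sketch below estimates the same slope three ways on a simulated two-period panel: entity demeaning, a full set of entity dummies, and first differences. The data, the true slope of 1.5, and the helper names are all hypothetical. The first-difference regression here has no intercept, which is what makes it match exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
n_entities, n_periods = 50, 2

# Hypothetical balanced panel: alpha is an unobserved, time-invariant
# entity effect that is correlated with x (so pooled OLS would be biased).
entity = np.repeat(np.arange(n_entities), n_periods)
alpha = rng.normal(size=n_entities)
x = alpha[entity] + rng.normal(size=entity.size)
y = 1.5 * x + 2.0 * alpha[entity] + rng.normal(size=entity.size)  # true beta_1 = 1.5


def slope(y, x):
    """OLS slope of y on x with no intercept (inputs are demeaned or differenced)."""
    return (x @ y) / (x @ x)


def demean_by(v, groups):
    """Subtract each group's mean from its own observations."""
    means = np.bincount(groups, weights=v) / np.bincount(groups)
    return v - means[groups]


# (1) Entity-demeaned ("within") estimator.
beta_within = slope(demean_by(y, entity), demean_by(x, entity))

# (2) Entity-specific intercepts: one dummy per entity, no common constant.
dummies = (entity[:, None] == np.arange(n_entities)).astype(float)
X = np.column_stack([x, dummies])
beta_dummies = np.linalg.lstsq(X, y, rcond=None)[0][0]

# (3) First differences, possible here because there are exactly two periods.
dy = y[1::2] - y[0::2]
dx = x[1::2] - x[0::2]
beta_fd = slope(dy, dx)

print(f"within: {beta_within:.6f}")
print(f"dummies: {beta_dummies:.6f}")
print(f"first differences: {beta_fd:.6f}")
```

All three numbers should agree to many decimal places. Demeaning and dummies are always the same estimator, and with exactly two periods the first-difference version coincides with them.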

SW 10.4 Regression with time fixed effects

Just like we can control for unit-specific factors that remain constant over time, we can also control for factors that vary over time but are constant across units with time fixed effects. Neat!

Can we include both entity- and time-specific fixed effects? You bet we can!

SW 10.5 The fixed effects regression assumptions and standard errors for fixed effects regressions

How do our LS assumptions change when we move to panel data?

SW ?? Difference-in-differences estimation

Set up and interpret difference-in-differences estimation. There’s a decent discussion in SW13.4, but I think it’s more appropriate to cover it here, since it’s so closely linked to panel data.

Slides

In-Class Activity

From Tuesday 4/7, here are the fixed effects handout and solution set:

Fixed Effects Showdown activity (PDF)
Fixed Effects Showdown answer key (PDF) (try the activity first if you want the full practice effect)

¹ Check out SW13.4 if you want to be precise.

Week 10 - Causal Diagrams & Assessing Studies

Sun, 15 Mar 2026 00:00:00 +0000

Overview

This week has two connected parts. Tuesday: we introduce causal diagrams (DAGs) — a visual framework for thinking about causality, confounding, and what to control for. Thursday: we apply that framework to assessing regression validity (internal and external validity, omitted variable bias, measurement error, simultaneity).

Reading Guide

Tuesday: Causal Diagrams (DAGs)

Read the following chapters from Huntington-Klein, The Effect (free online):

The Effect, Chapter 6: Causal Diagrams

What are DAGs? Nodes, arrows, and how to represent a data generating process visually.

The Effect, Chapter 7: Drawing Causal Diagrams

How to build a DAG for your own research question. Practical guidance on simplifying and avoiding common mistakes.

The Effect, Chapter 8: Causal Paths and Closing Back Doors

The key chapter: front door vs. back door paths, open vs. closed paths, confounders, colliders, and the backdoor criterion.

Thursday: Assessing Regression Validity (SW Chapter 9)

SW 9.1 Internal and External Validity

Get those definitions, and threats to external validity

SW 9.2 Threats to Internal Validity of Multiple Regression Analysis

Five big threats. And how to handle them.

Bonus: SW 11.1 Linear Probability Models

Optional background if you want a bit more on linear probability models. This is not part of Chapter 9, but it connects directly to Thursday’s class and Lab 6.

SW 9.4 Example: Test Score and Class Size

No new content, but this is a really good walk-through of the 9.1 and 9.2 content.

Slides

Tuesday: Causal Diagrams (DAGs)

📊 HTML 📥 Download PDF

Thursday: Assessing Studies (SW Ch. 9)

📊 HTML 📥 Download PDF

Week 9 - Nonlinear Regression

Sun, 08 Mar 2026 00:00:00 +0000

Overview

Let’s make regressions faaaancy, with three new friends: polynomials, logarithms, and interactions!

Reading Guide

Chapter 8: Nonlinear Regression Functions

SW 8.1 A General Strategy for Modeling Nonlinear Regression Functions

Key takeaway here - think about the economic theory and its application to determine the best approach to modeling!

SW 8.2 Nonlinear Functions of a Single Independent Variable

Polynomials and logs. Note that this section includes only 3 ways of setting up equations: level-level, log-log, log-level. We’ll also cover level-log, because, why not?
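As a quick reference, here are the usual approximate interpretations of $\beta_1$ when logs are involved. These are the standard small-change approximations, written by which variable is logged so the naming convention doesn't matter.

$$Y = \beta_0 + \beta_1 \ln(X) + u: \quad \text{a } 1\% \text{ increase in } X \text{ is associated with a } 0.01\,\beta_1 \text{ unit change in } Y$$

$$\ln(Y) = \beta_0 + \beta_1 X + u: \quad \text{a one-unit increase in } X \text{ is associated with a } 100\,\beta_1\% \text{ change in } Y$$

$$\ln(Y) = \beta_0 + \beta_1 \ln(X) + u: \quad \text{a } 1\% \text{ increase in } X \text{ is associated with a } \beta_1\% \text{ change in } Y$$

In the last case, $\beta_1$ is the elasticity of $Y$ with respect to $X$.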

SW 8.3 Interactions Between Independent Variables

Three types of interactions you should know:

Binary-binary
Continuous-binary
Continuous-continuous
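To see what an interaction buys you, take the continuous-binary case as a sketch, with $X_i$ continuous and $D_i$ a 0/1 indicator (generic placeholder names):

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 D_i + \beta_3 (X_i \times D_i) + u_i$$

Plugging in the two values of $D_i$ gives two lines:

$$D_i = 0: \quad Y_i = \beta_0 + \beta_1 X_i + u_i$$

$$D_i = 1: \quad Y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_i + u_i$$

So $\beta_2$ shifts the intercept and $\beta_3$ is the difference in slopes between the two groups.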

Slides

📊 HTML 📥 Download PDF

Exam 2 - CH 4-7

Fri, 20 Feb 2026 00:00:00 +0000

What should I know?

See the specific chapter guides from Week 4, Week 5, Week 6, and Week 7. You should understand all terms and definitions (be able to define and also apply them). You should also be comfortable interpreting regression output either written out in equation form or as Stata output. The level of PS2/PS3 problems is about right in terms of difficulty.

There will not be a Stata component to the exam.

You can find a more detailed set of guidelines here.

I need more practice!

I recommend the following resources:

Odd-numbered problems from Stock and Watson - solutions are available here
Practice exam
In-class practice

What is the structure of the exam?

The exam is closed book with a calculator allowed. You are allowed a formula sheet that you created!

I will provide the following F-distribution table (recreated from S&W): F-table.

You will not need a normal distribution table. However, you will want to know common critical values for t-tests.

What is allowed?

Pen or pencil
One page formula sheet (double sided)
Calculator (basic, scientific, graphing)
Drink and quiet snack

What is not allowed?

~~Phone~~
~~Other notes~~
~~Textbook~~
~~Tablet/Laptop~~
~~Earbuds/headphones~~

In-class practice

We have several sets of in-class practice activities:

Slides

📥 Download this week's slides

Past exam

Spring 2018 Practice Exam

Spring 2018 Practice Exam Solutions

Spring 2017 Practice Exam - note that this exam asks you to interpret log coefficients, which we have not yet covered!

Spring 2017 Practice Exam Solutions (I strongly recommend doing the practice exam in its entirety before reviewing the solutions)

Week 7 - Hypothesis Tests with Multiple Regressions

Mon, 16 Feb 2026 00:00:00 +0000

Overview

And now, we’re back to hypothesis testing. The big thing we’re going to learn about is testing more complicated hypotheses, like whether two coefficients are equal, and whether a whole bunch of coefficients equal zero.

Reading Guide

Chapter 7: Hypothesis Tests and Confidence Intervals with Multiple Regressions

SW 7.1 Hypothesis Tests and Confidence Intervals for a Single Coefficient

Here, we want to test hypotheses of this form: $H_0: \beta_j = \beta_{j,0}$ vs. $H_a: \beta_j \neq \beta_{j,0}$

If you want another take on hypothesis testing with regression coefficients, go ahead and read this. I’m not going to cover this in class, because we’ve hit this in Chapter 5.

SW 7.2 Tests of Joint Hypotheses

Here, we test hypotheses involving several coefficients at once, of this form: $H_0: \beta_j = \beta_{j,0}, \beta_m = \beta_{m,0}, …$, for $q$ restrictions, vs $H_a$: at least one of those $q$ restrictions does not hold.
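For hand calculations, one common way to build the test statistic is the homoskedasticity-only F-statistic, which compares how well the model fits with and without the restrictions imposed. The subscripts $r$ (restricted) and $u$ (unrestricted) are labels chosen here for illustration:

$$F = \frac{(SSR_{r} - SSR_{u})/q}{SSR_{u}/(n - k_{u} - 1)}$$

Here $q$ is the number of restrictions and $k_{u}$ is the number of regressors in the unrestricted model. This version is only valid when the errors are homoskedastic. With robust standard errors, the software reports a heteroskedasticity-robust F-statistic instead.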

SW 7.3 Testing Single Restrictions Involving Multiple Coefficients

Here, we test one restriction, but with multiple coefficients, like this: $H_0: \beta_j = \beta_m$ vs. $H_a: \beta_j \neq \beta_m$

That’s it!

7.5 is a good review of things we’ve already discussed, and 7.6 walks us through an example. I encourage you to read them, but they’re not mandatory. And we’re skipping 7.4.

Slides

📊 HTML 📥 Download PDF

Week 6 - Multiple Linear Regression

Tue, 10 Feb 2026 00:00:00 +0000

Overview

Let’s model!

Now, we can build powerful models with heaps of independent variables. Want to predict wages? Let’s control for education, for experience, for gender, for age, for age squared (yes!). YES. Only our degrees of freedom can hold us back.

Reading Guide

Chapter 6: Linear Regression with Multiple Regressors

SW 6.1 Omitted Variable Bias

A discussion that connects nicely with our previous discussion of the zero conditional mean assumption and causal inference.

SW 6.2 The Multiple Regression Model

Hooray!

SW 6.3 The OLS Estimator in Multiple Regression

This section doesn’t get into derivation, and neither do we!

SW 6.4 Measures of Fit in Multiple Regression

The only new things here are a revised $SER$ formula and the introduction of the Adjusted $R^2$. Note that the lecture video also discusses the root mean squared error, $RMSE$, which is a lot like the $SER$ except that it uses $n$ rather than degrees of freedom as a denominator.
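Here is a short sketch of these calculations. To keep the numbers real, it borrows the sums of squares from the birth-weight regression output in the Week 4 in-class exercise, which has only one regressor, so $k = 1$. The variable names are just labels chosen for this example.

```python
import math

# Numbers copied from the birth-weight regression output (Week 4 exercise).
ssr = 561551.3   # residual sum of squares
tss = 574611.72  # total sum of squares
n = 1388         # number of observations
k = 1            # number of regressors, not counting the constant

r2 = 1 - ssr / tss
adj_r2 = 1 - (n - 1) / (n - k - 1) * ssr / tss
ser = math.sqrt(ssr / (n - k - 1))  # degrees-of-freedom denominator
rmse_n = math.sqrt(ssr / n)         # n denominator, as described above

print(f"R-squared:          {r2:.4f}")
print(f"Adjusted R-squared: {adj_r2:.4f}")
print(f"SER:                {ser:.3f}")
print(f"RMSE (n version):   {rmse_n:.3f}")
```

The first three values should line up with the R-squared, Adj R-squared, and Root MSE lines in that Stata output. With a sample this large, the two denominators give nearly the same answer.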

SW 6.5 The Least Squares Assumptions in Multiple Regression

Take the three from univariate regression and add … no perfect multicollinearity. Sorted.

SW 6.6 Distribution of the OLS Estimators in Multiple Regression

Just the intuition, don’t worry about the appendix.

SW 6.7 Multicollinearity

Make sure you understand the examples, but remember that in practice, any statistical package will fix perfect multicollinearity on its own. Imperfect multicollinearity, on the other hand, is something to think about when crafting your models.

SW 6.8 Conclusion

Treat yourself.

Slides

📊 HTML 📥 Download PDF

Other resources

As requested, slower graphs! Also added a graph on collider bias, the webpage explanation helps there.

These graphs are intended to show what standard causal inference methods actually *do* to data, and how they work.

This is what controlling for a binary variable looks like: pic.twitter.com/dTZxqY5JxA
— Nick HK (@nickchk) November 29, 2018

Week 5 - Inference with One Regressor

Sun, 01 Feb 2026 00:00:00 +0000

Overview

As we carry on into the wonderful world of statistical inference, expect flashbacks of our statistics review. That’s what the review was for! It’s all coming together 😌.

Reading Guide

Chapter 5: Hypothesis Tests and Confidence Intervals

SW 5.1 Testing Hypotheses About One of the Regression Coefficients

This is very important stuff. Again, don’t worry about calculating the variance of a beta coefficient by hand. However, note how similar this is to our hypothesis testing in Chapter 3!
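As a small worked example, the sketch below builds the t-statistic and a 95% confidence interval from a coefficient and its standard error. The two inputs are taken from the `cigs` row of the birth-weight output in the Week 4 in-class exercise. The 1.96 critical value is the large-sample normal approximation.

```python
beta_hat = -0.5137721  # coefficient on cigs
se = 0.0904909         # its standard error
beta_null = 0.0        # H0: beta_1 = 0

# t-statistic: how many standard errors the estimate is from the null value.
t_stat = (beta_hat - beta_null) / se
reject_5pct = abs(t_stat) > 1.96  # two-sided test at the 5% level

# 95% confidence interval: estimate plus or minus 1.96 standard errors.
ci_low = beta_hat - 1.96 * se
ci_high = beta_hat + 1.96 * se

print(f"t = {t_stat:.2f}, reject at 5%: {reject_5pct}")
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
```

The t-statistic matches the one Stata reports. The interval differs from Stata's in the fourth decimal place because Stata uses a critical value from the t-distribution rather than 1.96.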

SW 5.2 Confidence intervals for a regression coefficient

SW 5.3 Regression when X is a binary variable

We are going to use this ALL THE TIME.

SW 5.4 Heteroskedasticity and homoskedasticity

Drop “heteroskedasticity” into any conversation and you’re sure to delight. Just one more benefit of EC200. The examples and implications of heteroskedasticity are important.

And now you’ll know to add `, robust` to all your Stata regressions.

SW 5.5 Theoretical foundations of OLS

Know the Gauss-Markov theorem and related assumptions. Skip “regression estimators other than OLS.” The appendix contains a proof of the Gauss-Markov theorem, but we will not cover that.

~~SW 5.6~~ Skip this section!

SW 5.7 Conclusion

For the good times.

Slides

📊 HTML 📥 Download PDF

Other resources

EGAP: 10 things to know about hypothesis testing
EGAP: 10 things to know about statistical power - gets a bit deeper than we go, but accessible and handy!

Week 4 - Linear Regression with One Regressor

Mon, 26 Jan 2026 00:00:00 +0000

Overview

Welcome to Week 4! We are proceeding boldly into the world of linear regression. We’re starting by looking into the linear relationship between two variables.

What we won’t be doing is controlling for other factors, nor conducting statistical inference. We won’t be looking at non-linear relationships yet either! Bah.

Rather, we’re going to dive deep into what it means to look at how we can find a good estimate - nay, the best estimate! - of the relationship between some $X$ and some $Y$.

This is where Stock and Watson really start to shine. I highly recommend basking in their expertise and conversational style.

Reading Guide

Chapter 4: Linear Regression with One Regressor

SW 4.1 - The Linear Regression Model

This section is packed w/ good intuition and bolded vocabulary. Make sure you know it!

SW 4.2 - Estimating the Coefficients of the Linear Regression Model

You should be able to estimate linear regression coefficients by hand 😪.
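For practice, here is the by-hand calculation written out step by step, so you can check your arithmetic against it. The five data points are made up. The slope is the sum of cross-deviations divided by the sum of squared deviations of $X$, and the intercept makes the line pass through the point of means.

```python
# Toy data (hypothetical): five observations of x and y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sum of cross-deviations and sum of squared deviations of x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

beta1_hat = s_xy / s_xx                # slope
beta0_hat = y_bar - beta1_hat * x_bar  # intercept

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"slope = {beta1_hat:.2f}, intercept = {beta0_hat:.2f}")
```

With these numbers the sample means are 3 and 4, the sum of cross-deviations is 6, and the sum of squared deviations is 10, so the slope is 0.6 and the intercept is 2.2.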

SW 4.3 Measures of Fit

Know how to use and interpret $R^2$, $ESS$, $TSS$, $SSR$ and $SER$. You will also need to know how to find these in raw Stata output.

SW 4.4 Least Squares Assumptions

Know and understand the three least squares assumptions.

SW 4.5 Sampling Distribution of the OLS

Discuss unbiasedness of estimators and effects of larger vs. smaller sample sizes on standard errors. We won’t calculate standard errors by hand.

Note on causality

At this stage, we are thinking about making a good model of the data, but not necessarily of the data generating process behind that data. When we use the framing of independent and dependent variables, it’s tempting to think that we’re examining whether the independent variable causes the dependent variable.

At this point in the course, we’re looking at associations which could be causal … or they could not be!

If you want to dig deeper, check out this great guide from EGAP: 10 things to know about causal inference.

Slides

📊 HTML 📥 Download PDF

Video: Building intuition around the OLS model

You can play along with the same simulator!

In-class exercise

Link to pdf here

Consider a dataset on births to women in the United States. Two variables of interest are infant birth weight in ounces (bwght), and the average number of cigarettes the mother smoked per day during pregnancy (cigs). The following simple regression was estimated using data on 1,388 births.

      Source |       SS           df       MS      Number of obs   =     1,388
-------------+----------------------------------   F(1, 1386)      =     32.24
       Model |  13060.4194         1  13060.4194   Prob > F        =    0.0000
    Residual |    561551.3     1,386  405.159668   R-squared       =    0.0227
-------------+----------------------------------   Adj R-squared   =    0.0220
       Total |   574611.72     1,387  414.283864   Root MSE        =    20.129
------------------------------------------------------------------------------
       bwght |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cigs |  -.5137721   .0904909    -5.68   0.000    -.6912861   -.3362581
       _cons |   119.7719   .5723407   209.27   0.000     118.6492    120.8946
------------------------------------------------------------------------------

These results can also be written in the following way:

$$\widehat{bwght} = 119.77 - 0.514 cigs$$

What is the dependent variable? What is the independent variable?
Write, in words, what the interpretation of $-0.514$ is.
What is the predicted birth weight among mothers who do not smoke? What about when $cigs=20$ (one pack per day)? Comment on the difference.
Consider Prof. Beam, who was born weighing 9 lb, 15 oz and whose mother “cut back” to 10 cigarettes per day (it was the 80s). What is her residual?
Find $R^2$ in the raw regression output. What does it tell us?
Are any least squares assumptions likely to be violated? Explain.
Does this simple regression necessarily capture a causal relationship between the child’s birth weight and the mother’s smoking habits? Explain.

Week 3 - Stats Exam

Mon, 12 Jan 2026 00:00:00 +0000

What should I know?

Make sure you are comfortable with the material in the chapter guides from Week 1 and Week 2. What does comfortable mean? That you’re familiar with definitions and terminology, and that you can work through problems at the level of PS1 (solutions on Brightspace).

There will not be a Stata component to the exam.

I need more practice!

I recommend the following resources:

Odd-numbered problems from Stock and Watson - solutions are available here
Practice exam
In-class practice

What is the structure of the exam?

The exam is closed book with a calculator allowed. You are allowed to bring a one-page formula sheet.

What is allowed?

Pen or pencil
One page formula sheet (double sided)
Calculator (basic, scientific, graphing)
Drink and quiet snack

What is not allowed?

~~Phone~~
~~Other notes~~
~~Textbook~~
~~Tablet/Laptop~~
~~Earbuds/headphones~~

Helpful materials

Two things that may be useful to you:

Standard Normal Distribution Table
Formula chart: Is this all the knowledge you need to know? Of course not! But, this should cover most of the formulas you might need.

Past exam

Yes, this exam is very old! Fortunately, the statistics don’t change!

Fall 2018 Practice Exam

Fall 2018 Practice Exam Solutions (I strongly recommend doing the practice exam in its entirety before reviewing the solutions)

Week 1 - Probability Review

Tue, 06 Jan 2026 00:00:00 +0000

Overview

Welcome to Week 1! Our goal this week is to (1) help us sort out our varying technologies and workflows and (2) start reviewing statistics and probability. Basically, this is the start of a three-week mini-bootcamp to get us ready for the glorious world of regressions! (Aside: This unit will pull from lots of resources. Things will be simpler once we get to Chapter 4 and onward.)

What you should do for this three-week unit depends on your background:

I took STAT1410 or another course recently and remember some stuff

Read through SW, and dig deeper into anything that doesn’t look familiar.
Watch the class videos and complete video quizzes
Use the Khan Academy to clarify anything you feel a bit fuzzy on.
Make sure you can do the practice problems.

Everyone else

Start by reviewing the Khan Academy videos
Work through all the Khan Academy practice problems.
Watch the class videos and complete video quizzes
Head into the practice problems, and review concepts where you are still stuck.

Reading Guide for Week 1

Chapter 2.1-2.4: Review of Probability

This is the material you should already know, along with supports from Khan Academy. Remember that you don’t need to memorize formulas! Note that most links take you directly to a video, but there are also relevant videos in the accompanying playlist.

SW 2.1 Random Variables and Probability Distributions

Introduction to random variables and probability distributions (entire playlist, starting with “Constructing a probability distribution for a random variable”)
Mean and standard deviation of random variables (entire playlist)
Conditional probability (entire playlist)

SW 2.2 Expected Values, Mean, and Variance

You only need a general knowledge of kurtosis and skew.
Transforming random variables (entire playlist, just 3 videos)

SW 2.3 Two Random Variables

Law of iterated expectations can be skimmed
- Make sure that you’re good with key concept box 2.3!
Combining random variables (entire playlist)

SW 2.4 The Normal, Chi-Squared, Student t, and F distributions

Normal distribution only, ~~but we’ll discuss t-distributions a little bit~~. We won’t use chi-squared, and we’ll come back to F-distributions later. For any work we’ll be doing, our sample sizes will be greater than 100, so $t \rightarrow z$.
Z-scores (entire playlist)
Normal distributions and the empirical rule: not explicitly covered, but extremely useful review for intuition
Normal distribution calculations (entire playlist, walks through all the possible ways of working w/ normal tables)

Slides

📥 Download this week's slides

Week 2 - Statistics Review

Tue, 06 Jan 2026 00:00:00 +0000

Overview

Welcome to Week 2! Statistical review continues apace.

As before, what you should do this final week before the exam depends on your background:

I took STAT1410 recently and remember some stuff

Read through SW, and dig deeper into anything that doesn’t look familiar.
Watch the class videos and complete video quiz.
Use the Khan Academy to clarify anything you feel a bit fuzzy on.
Make sure you can do the practice problems.

Everyone else

Start by reviewing the Khan Academy videos
Work through all the Khan Academy practice problems.
Watch the class videos and complete video quiz.
Head into the practice problems, and review concepts where you are still stuck.

Reading Guide (Chapter 2)

SW 2.5 Random Sampling and the Distribution of the Sample Average

What is a sampling distribution? (just the one video)

SW 2.6 Large-sample approximations to sampling distributions

Get that central limit theorem!
Central Limit Theorem (just the one video)
Sampling distribution of a sample mean, part I and part II (just the two videos, with a practice after)
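If you'd like to watch the central limit theorem happen, here is a small simulation sketch. The population is an exponential distribution with mean 1 and standard deviation 1, chosen only because it is clearly skewed. The sample size of 100 and the number of repetitions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_samples = 100, 20_000

# Draw many samples from a skewed population: exponential, mean 1, sd 1.
samples = rng.exponential(scale=1.0, size=(n_samples, n))
sample_means = samples.mean(axis=1)

# Standardize each sample mean using the population mean and sigma / sqrt(n).
z = (sample_means - 1.0) / (1.0 / np.sqrt(n))

# If the normal approximation is good, about 95% should fall within +/- 1.96.
share_within_196 = np.mean(np.abs(z) < 1.96)

print(f"mean of sample means: {sample_means.mean():.3f}")
print(f"sd of sample means:   {sample_means.std():.3f}")
print(f"share with |z| < 1.96: {share_within_196:.3f}")
```

Even though the population is far from normal, the sample means should center on 1, have a standard deviation near $1/\sqrt{100} = 0.1$, and fall within 1.96 standard deviations of the mean roughly 95% of the time.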

Reading Guide (Chapter 3)

This is the material you should know, along with supports from Khan Academy. Remember that you don’t need to memorize formulas!

SW 3.1 Estimation of the population mean

Don’t worry about BLUE, we’ll come back to it. But make sure you get bias, consistency, and efficiency.

SW 3.2 Hypothesis tests concerning the population mean

KA presents confidence intervals before hypothesis tests, so if following their curriculum, start w/ confidence intervals
The idea behind significance tests
Introduction to confidence intervals
Setting up a test for a population mean (entire playlist)
Carrying out a test for a population mean (entire playlist, except part about TI calculator)

SW 3.3 Confidence intervals for the population mean

Constructing a confidence interval for a population mean

SW 3.4 Comparing means from different populations

SKIP

Only the type of two-sample test we’ve been practicing - look at difference between two means where the standard deviation is unknown and you do not assume that they come from the same underlying population distribution. Make sure you can use formulas 3.19 and 3.20 ~~Testing for the differences of two population means~~

Skip SW 3.5/3.6

SW 3.7 Scatterplots, the sample covariance, and the sample correlation

Correlation coefficients

Slides

📥 Download this week's slides


ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: DiD by Hand

Chapter 10 — Difference-in-Differences
Time: ~20 minutes

Setup

In January 2020, New Jersey increased its state minimum wage from $10.00 to $12.00 per hour. Neighboring Pennsylvania kept its minimum wage at $7.25. A researcher collected data on average employment at fast-food restaurants (measured as full-time equivalent workers per restaurant) in both states, before and after the policy change.

	Before (Nov 2019)	After (Mar 2020)
New Jersey (treatment)	20.4	21.0
Pennsylvania (control)	23.3	21.2

Questions

1. Calculate the simple before-and-after change in employment for New Jersey (the treatment group).

\vspace{1.5cm}

2. A state legislator sees your answer to Question 1 and says, “See! Raising the minimum wage increased employment!” Explain why the simple before-after comparison is not a valid estimate of the causal effect of the minimum wage increase.

\vspace{2.5cm}

3. Calculate the difference-in-differences (DiD) estimate of the effect of the minimum wage increase on fast-food employment.

Show your work. You can calculate this either as (a) the difference in before-after changes between the two groups, or (b) the difference in cross-state gaps between the two time periods.

\vspace{3cm}

4. Interpret your DiD estimate from Question 3 in a complete sentence. Be specific about units and direction.

\vspace{2cm}

5. State the parallel trends assumption in the context of this study. Be precise — what, specifically, must be true for the DiD estimate to be valid?

\vspace{2.5cm}

6. Evaluate whether parallel trends is plausible here. Identify at least one reason it might hold and one reason it might fail.

\vspace{2.5cm}

6b. A researcher suggests adding restaurant-level characteristics — franchise type, seating capacity, and years in operation — as control variables to the DiD regression. Explain two reasons why this might be useful: one about precision and one about the validity of the parallel trends assumption.

\vspace{2.5cm}

7. (Bonus) Write the regression equation that would produce the DiD estimate from Question 3. Define your variables clearly and identify which coefficient gives the DiD estimate.

\vspace{3cm}

\pagebreak

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Answers

1. Simple before-after change for NJ:

$$\Delta_{NJ} = 21.0 - 20.4 = +0.6 \text{ FTE workers}$$

2. The before-after comparison confounds the effect of the minimum wage increase with any other changes happening over the same time period. PA employment fell by 2.1 FTE over the same window; if NJ would have followed the same trend absent the policy, the simple comparison (+0.6) understates the true effect. March 2020 also coincides with the onset of COVID-19, making this a particularly sharp example of why common time trends must be accounted for.

3. DiD estimate (either approach):

Approach (a): Difference in before-after changes

$$\text{DiD} = \Delta_{NJ} - \Delta_{PA} = (21.0 - 20.4) - (21.2 - 23.3) = 0.6 - (-2.1) = +2.7$$

Approach (b): Difference in cross-state gaps

$$\text{DiD} = (21.0 - 21.2) - (20.4 - 23.3) = (-0.2) - (-2.9) = +2.7$$

Both give DiD = +2.7 FTE workers per restaurant.
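
As a quick check on the arithmetic, the two approaches can be sketched in a few lines of Python. The dictionary layout and function names are illustrative assumptions; only the four cell means come from the table in the Setup.

```python
# Cell means from the 2x2 table (FTE workers per restaurant).
means = {
    ("NJ", "before"): 20.4,
    ("NJ", "after"): 21.0,
    ("PA", "before"): 23.3,
    ("PA", "after"): 21.2,
}


def did_by_changes(m):
    """Approach (a): NJ's before-after change minus PA's before-after change."""
    change_nj = m[("NJ", "after")] - m[("NJ", "before")]
    change_pa = m[("PA", "after")] - m[("PA", "before")]
    return change_nj - change_pa


def did_by_gaps(m):
    """Approach (b): NJ-PA gap after the policy minus NJ-PA gap before it."""
    gap_after = m[("NJ", "after")] - m[("PA", "after")]
    gap_before = m[("NJ", "before")] - m[("PA", "before")]
    return gap_after - gap_before


did_a = did_by_changes(means)
did_b = did_by_gaps(means)
print(round(did_a, 1), round(did_b, 1))
```

The two functions only regroup the same four numbers, which is why they must agree.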

4. “The minimum wage increase in New Jersey is estimated to have increased average fast-food restaurant employment by 2.7 full-time equivalent workers per restaurant, relative to what would have occurred absent the policy change.”

5. The parallel trends assumption requires that, in the absence of the minimum wage increase, average fast-food employment in New Jersey would have changed by the same amount as in Pennsylvania between November 2019 and March 2020. Any time trends affecting fast-food employment would have been the same in both states had NJ not raised its minimum wage.
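
One way to make the assumption concrete is to write out the counterfactual it implies for New Jersey. This is a worked restatement of the table values, not an additional result:

$$\underbrace{20.4}_{\text{NJ before}} + \underbrace{(21.2 - 23.3)}_{\text{PA change}} = 18.3 \text{ (counterfactual NJ after)}, \qquad 21.0 - 18.3 = +2.7$$

The DiD estimate is the gap between actual NJ employment after the policy and this assumed counterfactual.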

6. Reasons it might hold:

NJ and PA are neighboring states with similar economies, labor markets, and demographics
Fast-food chains operate similarly across state lines (same companies, similar consumer bases)
Pre-treatment trends in fast-food employment might be similar

Reasons it might fail:

March 2020 is the start of the COVID-19 pandemic — differential pandemic impacts across states could violate parallel trends (this is the key one to discuss!)
Anticipation effects: NJ employers might have adjusted employment before formal implementation
Other state-level policies may have changed at the same time

6b.

Precision: Adding restaurant-level controls (franchise type, seating capacity, years in operation) reduces residual variance → tighter standard errors on the DiD estimate, even if parallel trends holds unconditionally.

Credibility: Parallel trends may only hold conditional on these covariates. If, for example, newer franchises were disproportionately opening in NJ during this period (anticipating higher wages), or if NJ and PA had systematically different franchise compositions, then observable restaurant characteristics could predict differential outcome trends. Controlling for them makes parallel trends more plausible. Omitting relevant covariates when they predict differential trends is OVB — the parallel trends assumption may only hold after conditioning.

7. Regression equation:

$$Employment_{st} = \beta_0 + \beta_1 \cdot NJ_s + \beta_2 \cdot After_t + \beta_3 \cdot (NJ_s \times After_t) + u_{st}$$

Where:

$NJ_s = 1$ if New Jersey, 0 if Pennsylvania
$After_t = 1$ if March 2020 (post-treatment), 0 if November 2019 (pre-treatment)
$\beta_3$ is the DiD estimate (= 2.7)

Interpretation of all coefficients:

$\beta_0 = 23.3$ (PA employment, before)
$\beta_1 = 20.4 - 23.3 = -2.9$ (NJ-PA gap, before)
$\beta_2 = 21.2 - 23.3 = -2.1$ (PA change over time)
$\beta_3 = 2.7$ (DiD: the additional change in NJ beyond the PA trend)
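
The mapping from the 2×2 table to the regression can be verified numerically. The sketch below treats each cell mean as one observation (an assumption made for illustration; real data would have one row per restaurant and period) and solves the least squares problem with NumPy.

```python
import numpy as np

# One "observation" per cell of the 2x2 table: (NJ, After, mean employment).
cells = [
    (0, 0, 23.3),  # PA, before
    (0, 1, 21.2),  # PA, after
    (1, 0, 20.4),  # NJ, before
    (1, 1, 21.0),  # NJ, after
]

nj = np.array([c[0] for c in cells], dtype=float)
after = np.array([c[1] for c in cells], dtype=float)
y = np.array([c[2] for c in cells])

# Design matrix: intercept, NJ, After, NJ x After.
X = np.column_stack([np.ones(4), nj, after, nj * after])

# Four cells and four coefficients: the model is saturated,
# so least squares reproduces the cell means exactly.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = beta
print(np.round(beta, 2))
```

Because the model is saturated, the coefficients are just differences of cell means, and `b3` is the DiD estimate.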

Teaching Notes

This activity is inspired by Card and Krueger (1994). Numbers here are illustrative and close to (but not identical to) the original study.
The COVID timing issue in Q6 is intentional — it creates a natural discussion point about threats to identification and how real-world events can compromise research designs.
Q6b connects to the “Adding Control Variables” section of the slides: stress the distinction between controls for precision vs. controls for identification (conditional parallel trends). This is different from RCTs where controls are only about precision.
For Q7, emphasize that the interaction term is the key — students often struggle to see how a 2×2 table maps to a regression with an interaction.
Common student error: confusing “parallel trends” with “equal levels.” Stress that the assumption is about changes (trends), not levels.


ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Fixed Effects Showdown

Chapter 10 — Panel Data and Fixed Effects
Time: ~20-25 minutes

Setup

A researcher wants to estimate the effect of beer taxes on traffic fatality rates. She has a balanced panel dataset of all 48 contiguous U.S. states observed annually from 2000 to 2009 (T = 10, N = 48, so 480 observations total).

The dependent variable is the traffic fatality rate (deaths per 10,000 people). The key independent variable is the real beer tax (dollars per case, adjusted for inflation). She also observes per capita income (in thousands of dollars).

She estimates four specifications. Study the output below carefully.

Regression Output

	(1) Pooled OLS	(2) Entity FE	(3) Entity + Time FE	(4) First Difference
Beer Tax	-0.655***	-0.640**	-0.485*	-0.072
	(0.188)	(0.254)	(0.261)	(0.117)
Income	0.062***	-0.063*	-0.071**	-0.018
	(0.015)	(0.032)	(0.031)	(0.022)
State FE	No	Yes	Yes	–
Year FE	No	No	Yes	–
Differenced	No	No	No	Yes
SE type	Robust	Clustered (state)	Clustered (state)	Clustered (state)
N	480	480	480	432
R-squared	0.091	0.905	0.918	0.003

Significance: *** p<0.01, ** p<0.05, * p<0.1

Questions

1. For each specification, briefly state what types of omitted variables it controls for and what it does not control for.

(1) Pooled OLS:

\vspace{1.5cm}

(2) Entity FE:

\vspace{1.5cm}

(3) Entity + Time FE:

\vspace{1.5cm}

(4) First Difference:

\vspace{1.5cm}

1b. What is one specific example of a time shock in this context — something that would affect all states' fatality rates in a given year equally? How does adding year fixed effects in specification (3) address it?

\vspace{2.5cm}

2. Look at the coefficient on Income. It flips sign between specification (1) and specification (2). Explain why this happens. What does this tell us about the pooled OLS estimate?

\vspace{3cm}

3. The coefficient on Beer Tax shrinks substantially from specification (1) to specification (4). Does this mean beer taxes have no effect on fatality rates? What should we conclude?

\vspace{3cm}

4. Why does specification (2) use clustered standard errors (clustered by state) rather than the heteroskedasticity-robust standard errors used in specification (1)?

(a) Which panel data least squares assumption is being addressed by clustering?

\vspace{1.5cm}

(b) What goes wrong with standard (unclustered) SEs when that assumption is violated?

\vspace{1.5cm}

5. Even with two-way fixed effects (specification 3), what threats to a causal interpretation remain? Identify at least two specific concerns.

\vspace{3cm}

6. (Optional/Discussion) Suppose states adopted beer tax increases at different times — some in 2002, some in 2005, some in 2008. A researcher uses the two-way FE approach from specification (3) with a single indicator $treated_{it} = 1$ after the state’s tax increase. Why might this estimate be misleading, even if two-way FE was appropriate in specification (3)?

\vspace{3cm}

\pagebreak

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Answers

1. What each specification controls for:

(1) Pooled OLS: Controls for nothing beyond the included regressors (beer tax, income). Does not account for any unobserved differences across states or over time. Treats all 480 observations as independent cross-sectional data.
(2) Entity FE: Controls for all time-invariant state characteristics (drinking culture, geography, road infrastructure, population density, state-level attitudes toward drunk driving). Does NOT control for factors that change over time and affect all states (national trends in vehicle safety, federal highway policy, changes in social norms around drunk driving).
(3) Entity + Time FE: Controls for both time-invariant state characteristics AND common time trends that affect all states equally (national economic cycles, improvements in vehicle safety technology, federal policies, nationwide public health campaigns). Does NOT control for time-varying, state-specific factors (state-level policy changes other than beer tax, state-specific economic shocks).
(4) First Difference: Like entity FE, removes time-invariant state characteristics by looking at year-over-year changes within each state. Does NOT control for common time trends. Note N = 432 because first-differencing loses one year per state (480 − 48 = 432).
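
The contrast between specifications (2) and (4) can be seen by writing out what each transformation does. The notation below is an assumption for illustration (one regressor $X_{it}$, a state effect $\alpha_i$, a year effect $\lambda_t$); it follows the usual textbook setup rather than anything shown in the output table.

$$Y_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}$$

Entity demeaning (specification 2):

$$Y_{it} - \bar{Y}_i = \beta_1 (X_{it} - \bar{X}_i) + (\lambda_t - \bar{\lambda}) + (u_{it} - \bar{u}_i)$$

First differencing (specification 4):

$$Y_{it} - Y_{i,t-1} = \beta_1 (X_{it} - X_{i,t-1}) + (\lambda_t - \lambda_{t-1}) + (u_{it} - u_{i,t-1})$$

In both cases $\alpha_i$ drops out, but a year-effect term remains in the equation unless year dummies are added, as in specification (3).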

1b. Example: the 2008 financial crisis reduced driving, which could lower fatality rates in every state. Or: improvements in airbag/safety technology affecting all cars nationally in a given year. Year fixed effects address this by giving each year its own intercept — the common level in year $t$ is partialled out for all states, leaving only within-year, across-state variation in fatalities and beer taxes. Without year FEs, a national decline in fatalities would be partially attributed to beer taxes if beer taxes happened to be higher in that period.

2. The sign flip on Income:

In pooled OLS, Income has a positive coefficient (+0.062): richer states appear to have higher fatality rates. This is driven by omitted variable bias — states with higher income may also be larger, more rural, have more driving, etc. These time-invariant state characteristics are confounded with income in pooled regression.

Once we add entity FE, we are looking at changes in income within a state over time. Within a given state, when income rises, fatality rates fall slightly (−0.063). Higher within-state income may lead to better vehicles, more safety investment, or less risky behavior.

The sign flip is strong evidence that pooled OLS suffers from OVB due to unobserved state characteristics.
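
A small simulation can illustrate how a sign flip like this arises. Everything in the sketch is an assumption chosen for illustration (the data-generating process, the parameter values, the seed); it is not the data behind the table. An unobserved state trait raises both income and fatalities, while the assumed within-state effect of income is negative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_years = 48, 10
true_within_effect = -0.06  # assumed causal effect of income, for illustration

# Unobserved, time-invariant state trait that raises both income and fatalities.
state_trait = rng.normal(0.0, 1.0, size=n_states)
state = np.repeat(np.arange(n_states), n_years)

income = 30.0 + 5.0 * state_trait[state] + rng.normal(0.0, 1.0, size=state.size)
fatality = (2.0 + 1.0 * state_trait[state] + true_within_effect * income
            + rng.normal(0.0, 0.1, size=state.size))


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


def demean_by_group(v, g, n_groups):
    """Subtract each group's mean (the entity-demeaning transformation)."""
    sums = np.bincount(g, weights=v, minlength=n_groups)
    counts = np.bincount(g, minlength=n_groups)
    return v - (sums / counts)[g]


pooled_slope = slope(income, fatality)
fe_slope = slope(demean_by_group(income, state, n_states),
                 demean_by_group(fatality, state, n_states))
print(f"pooled OLS: {pooled_slope:+.3f}, entity FE: {fe_slope:+.3f}")
```

Pooled OLS picks up the cross-state correlation between income and the omitted trait, so its slope is positive. Demeaning by state removes the trait and recovers a slope close to the assumed negative effect.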

3. The shrinking Beer Tax coefficient:

The decline from −0.655 to −0.072 does not mean beer taxes have no effect. It means:

Much of the pooled OLS association was driven by cross-state differences correlated with both beer taxes and fatality rates (time-invariant OVB)
After controlling for state and year FE, the remaining within-state, within-year variation in beer taxes may be too small to precisely estimate the effect
The FD estimate is small and imprecisely estimated, suggesting year-to-year changes in beer taxes are not strongly associated with year-to-year changes in fatality rates
Possible interpretations: (a) the true causal effect is small, (b) not enough within-state variation in beer taxes to detect it, or (c) effects may materialize with a lag

Key teaching point: A shrinking coefficient is what fixed effects are designed to reveal. We strip away confounding cross-state variation so that the estimate rests on more credible within-state comparisons, and the result may be smaller than the biased OLS estimate.

4. Why clustered standard errors:

(a) Assumption 2 requires observations to be i.i.d. draws across entities. It requires independence across entities only, not within an entity over time. Heteroskedasticity-robust SEs, however, treat the errors as uncorrelated across all observations. Within a state, the error in year $t$ is typically correlated with the error in year $t-1$ (serial correlation). Clustering at the state level allows arbitrary correlation among the errors within a state, so inference remains valid in the presence of this autocorrelation.

(b) With positive serial correlation (typical in panel data), unclustered SEs are usually too small. This means confidence intervals are too narrow and $t$-statistics are too large → we over-reject the null (false positives). The point estimates are still consistent — only inference is wrong. Note: in some designs with negative within-cluster correlation, clustered SEs can be larger than unclustered, but this is less common.
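
The gap between the two kinds of standard errors can be illustrated with a hand-rolled sandwich estimator. This is a stylized sketch under stated assumptions: simulated data, a pooled regression in which both the regressor and the error contain a persistent state-level component, and no small-sample corrections (software such as Stata applies degrees-of-freedom adjustments that are omitted here).

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, n_years = 48, 10
state = np.repeat(np.arange(n_states), n_years)

# Regressor and error both have a persistent state-level component,
# so both are correlated across years within a state.
x = rng.normal(size=n_states)[state] + 0.5 * rng.normal(size=state.size)
u = rng.normal(size=n_states)[state] + 0.5 * rng.normal(size=state.size)
y = 1.0 + 0.0 * x + u  # true slope is zero by construction

X = np.column_stack([np.ones_like(x), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Heteroskedasticity-robust "meat": sum over observations of u_i^2 * x_i x_i'.
scores = X * resid[:, None]
meat_robust = scores.T @ scores

# Cluster-robust "meat": sum the scores within each state first,
# then take outer products across states.
cluster_scores = np.zeros((n_states, X.shape[1]))
np.add.at(cluster_scores, state, scores)
meat_cluster = cluster_scores.T @ cluster_scores

se_robust = np.sqrt(np.diag(XtX_inv @ meat_robust @ XtX_inv))[1]
se_cluster = np.sqrt(np.diag(XtX_inv @ meat_cluster @ XtX_inv))[1]
print(f"robust SE: {se_robust:.3f}, clustered SE: {se_cluster:.3f}")
```

The point estimate is the same in both cases; only the estimated sampling variability changes, which is the sense in which only inference is affected.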

5. Remaining threats even with two-way FE:

Time-varying, state-specific confounders: Other state policies changing at the same time as beer taxes (DUI enforcement, speed limits, seatbelt laws). Two-way FE only handles fixed state traits and common time shocks — not policies varying across both states and time.
Reverse causality: States with rising fatality rates might respond by raising beer taxes. The policy change may be endogenous to the outcome.
Measurement error: Beer taxes are a poor proxy for the actual price of alcohol (substitution to untaxed beverages, cross-border purchases).
Spillovers: A tax increase in one state may push drinking across state lines (SUTVA violation).
Lagged effects: Tax changes may not affect behavior immediately; contemporaneous comparison may miss the effect.

6. With staggered adoption, TWFE implicitly uses already-treated states as “controls” for later-adopting states. If treatment effects change over time (dynamic effects), the already-treated state’s continuing effect looks like a “trend” — contaminating the comparison. TWFE is a weighted average of all possible 2×2 DiD comparisons, and when treatment effects are heterogeneous, some comparisons receive negative weights, potentially flipping the sign of the aggregate estimate even when all individual true effects are positive. Modern estimators (Callaway-Sant’Anna, Sun-Abraham) avoid this by only using not-yet-treated or never-treated units as controls.

Teaching Notes

This scenario is adapted from the Stock & Watson textbook’s running example of U.S. traffic fatalities (originally from Ruhm, 1996). Numbers are constructed for pedagogical clarity.
The R² jump from 0.091 (pooled) to 0.905 (entity FE) is a great discussion point — most variation in fatality rates is between states, not within states over time.
The income sign-flip is one of the most memorable examples of OVB in panel data — exactly what FE is designed to fix.
For Q4, connect explicitly to the “Problem of Serial Correlation” slide: stress that coefficients are still consistent, but inference (SEs, CIs, p-values) is wrong without clustering.
Q6 is optional — use if students are comfortable with basic DiD. It connects to the staggered adoption section of Tuesday’s slides.


ECON3500 Econometrics and Applications
Spring 2026

In-Class Activity: Is This a Good Instrument?

Chapter 12 — Instrumental Variables Regression

For each scenario below, a researcher wants to estimate the causal effect of $X$ on $Y$ and proposes using $Z$ as an instrumental variable. Your job: evaluate whether $Z$ is a valid instrument.

For each scenario, assess:

Relevance: Is $Z$ correlated with $X$? Why or why not?
Exogeneity: Is $Z$ uncorrelated with the error term $u$? What could go wrong?
Exclusion: Does $Z$ affect $Y$ only through $X$? What are the threats?
Overall verdict: Good instrument, bad instrument, or “it depends”?

Scenario 1: Returns to Education

A researcher wants to estimate the effect of years of education on log weekly earnings. She is worried that ability is unobserved and correlated with both education and earnings (classic OVB). She proposes using distance from the student’s childhood home to the nearest four-year college as an instrument for years of education.

(Based on Card, 1993)

	Your assessment
Relevance
Exogeneity
Exclusion
Verdict

Scenario 2: Police and Crime

A city government wants to know whether hiring more police officers reduces violent crime. The problem: cities that experience crime waves hire more police, creating simultaneous causality. A researcher proposes using the number of firefighters hired in the same year as an instrument for police hiring.

	Your assessment
Relevance
Exogeneity
Exclusion
Verdict

Scenario 3: Class Size and Test Scores

A school district wants to estimate the effect of class size on standardized test scores. Parents with more resources may sort into schools with smaller classes (selection bias). A researcher proposes using random fluctuations in cohort size due to enrollment cutoffs (e.g., Maimonides' Rule: when enrollment crosses a multiple of 40, an additional class section is created) as an instrument for class size.

(Based on Angrist and Lavy, 1999)

	Your assessment
Relevance
Exogeneity
Exclusion
Verdict

Scenario 4: Smoking and Health

A health economist wants to estimate the effect of cigarette consumption (packs per day) on infant birth weight. Smoking behavior is likely correlated with other health behaviors, income, and stress, all of which also affect birth weight. She proposes using state cigarette excise taxes as an instrument for cigarette consumption.

	Your assessment
Relevance
Exogeneity
Exclusion
Verdict

Scenario 5: Income and Health

A researcher wants to estimate the causal effect of income on self-reported health status. Income is endogenous because health affects the ability to work (reverse causality) and unobserved factors like motivation affect both. He proposes using lottery winnings (among people who play the lottery) as an instrument for income.

(Based on Lindahl, 2005)

	Your assessment
Relevance
Exogeneity
Exclusion
Verdict

\newpage

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Scenario 1: Distance to College → Education → Earnings

Relevance: Yes. Students who grow up closer to a college face lower costs of attending (commuting, living at home) and are more likely to attend. Empirically strong first stage.
Exogeneity: Potential concern. Families who live near colleges may differ systematically — they may live in more urban areas, have higher-income parents, or have better access to other amenities. If these factors also affect earnings directly, exogeneity fails. Card (1993) argues that after controlling for region, urban/rural status, and family background, proximity is plausibly exogenous.
Exclusion: The worry is that proximity to a college captures something about the local labor market or community characteristics that directly affect earnings. If living near a college means living near a city with better job opportunities, the exclusion restriction is violated. Controls help, but the assumption is not testable.
Verdict: Reasonable but imperfect. The key question is whether controls adequately capture the ways proximity might directly affect earnings. This is a classic “plausible but debatable” instrument — good for discussion.

Scenario 2: Firefighters → Police → Crime

Relevance: Moderate. Cities that expand public safety budgets may hire both more police and more firefighters. Budget shocks could move both.
Exogeneity: Problematic. The same budget pressures, political dynamics, or public safety concerns that drive firefighter hiring could also directly affect crime or crime reporting. A crime wave could trigger expanded hiring across all public safety departments.
Exclusion: Very dubious. Firefighter hiring likely reflects the same municipal budget and public safety environment that directly affects crime through many channels (social services, economic conditions, etc.).
Verdict: Bad instrument. The conditions that drive firefighter hiring are likely correlated with unobserved determinants of crime. This violates both exogeneity and exclusion. (Compare with Levitt’s (1997) use of electoral cycles as an instrument, which is more plausible.)

Scenario 3: Enrollment Cutoffs → Class Size → Test Scores

Relevance: Yes. Maimonides' Rule mechanically determines class size. When enrollment crosses a multiple of 40, an additional section is created, producing a sharp drop in average class size. This generates strong variation.
Exogeneity: Strong. The enrollment cutoffs are determined by administrative rules and population fluctuations that are plausibly unrelated to the characteristics of individual students. Parents would need to precisely manipulate enrollment counts to game the system, which is difficult.
Exclusion: Mostly satisfied. The enrollment cutoff affects students primarily through class size. One concern: crossing a threshold could also affect other school resources (need for additional classrooms, teacher quality of marginal hires). Angrist and Lavy argue these channels are minor.
Verdict: Good instrument — one of the textbook examples of a well-designed IV strategy. The mechanical rule creates quasi-random variation in class size.

Scenario 4: Cigarette Taxes → Smoking → Birth Weight

Relevance: Yes. Higher taxes raise the price of cigarettes, which reduces consumption (basic demand theory). Empirically well-documented.
Exogeneity: Moderate concern. State tax rates could be correlated with state-level characteristics that also affect birth weight — for example, states with higher cigarette taxes may also have better public health infrastructure, higher incomes, or different demographics. Including state-level controls helps.
Exclusion: The main worry: cigarette taxes generate state revenue that could fund health programs, or high-tax states may have other anti-smoking policies (clean air laws, health education) that affect birth weight through channels other than the mother’s smoking. If taxes proxy for a broader public health environment, exclusion is violated.
Verdict: Decent but requires careful controls. The instrument is relevant and widely used (Evans and Ringel, 1999), but exclusion requires that tax variation is not simply proxying for a state’s overall health policy environment. Cross-state and over-time variation in tax changes can help.

Scenario 5: Lottery Winnings → Income → Health

Relevance: Yes. Lottery winnings directly increase income. The magnitude depends on the size of winnings — small prizes may not generate enough variation (potential weak instrument concern for small winners).
Exogeneity: Strong, conditional on playing the lottery. Among lottery players, the amount won is random. The decision to play may itself be endogenous, but because the sample is restricted to lottery players, exogeneity only needs to hold within that group, not in the general population.
Exclusion: Mostly satisfied. The main channel from lottery winnings to health is through increased income/wealth. Possible concerns: large winnings could cause stress, lifestyle changes, or social disruption that affect health independent of the “income” channel. But for moderate winnings, income is the dominant pathway.
Verdict: Good instrument (within the lottery-playing population). The randomization of winnings is a powerful source of exogenous variation. The main limitation is external validity: the LATE applies to lottery players, who may not be representative of the general population. Also, the LATE captures the effect of windfall income, which may differ from the effect of earned income.

General Teaching Notes

Emphasize that exogeneity and exclusion are not testable — they require economic reasoning and domain knowledge.
Relevance is testable via the first-stage F-statistic ($F > 10$ rule of thumb).
Use these scenarios to highlight that most IV debates are about exclusion and exogeneity, not relevance.
Scenario 2 (firefighters) is intentionally a clear “bad instrument” — students should feel confident rejecting it. Scenarios 1 and 4 are “good but debatable,” which models the real-world messiness of IV.
Connect to LATE: In each scenario, ask students who are the compliers? For example, in Scenario 1, compliers are students whose education decision is actually affected by distance to college — not those who would attend regardless or never attend regardless.


ECON3500 Econometrics and Applications
Spring 2026

In-Class Activity: 2SLS in Action

Chapter 12 — Instrumental Variables Regression

A labor economist wants to estimate the causal effect of years of education on log hourly wages. She uses data on 3,010 U.S. men born between 1930 and 1939 from the 1980 Census. She is worried that OLS estimates are biased because ability is unobserved: more able individuals get more education and earn higher wages, even holding education constant.

Her proposed instrument: quarter of birth. The idea is that compulsory schooling laws require students to stay in school until age 16. Students born earlier in the year reach age 16 earlier in the school year, so they can legally drop out with less total education. Quarter of birth thus affects years of education completed but should not directly affect wages.

(Based on Angrist and Krueger, 1991)

Regression Output

Table 1: OLS Regression

Dependent variable: Log hourly wage

Variable	Coefficient	Std. Error	t-statistic
Years of education	0.0700	0.0035	20.00
Black	-0.236	0.018	-13.11
Married	0.121	0.016	7.56
Region (South)	-0.098	0.015	-6.53
Constant	5.02	0.052	96.54

$R^2 = 0.132$, $n = 3{,}010$

Table 2: First-Stage Regression

Dependent variable: Years of education

Variable	Coefficient	Std. Error	t-statistic
Born Q1 (Jan–Mar)	-0.152	0.067	-2.27
Black	-1.47	0.12	-12.25
Married	0.38	0.11	3.45
Region (South)	-0.54	0.10	-5.40
Constant	12.84	0.078	164.62

$R^2 = 0.087$, $n = 3{,}010$

First-stage F-statistic on excluded instrument (Born Q1): $F = 5.15$

Table 3: IV/2SLS Regression

Dependent variable: Log hourly wage
Endogenous variable: Years of education
Instrument: Born Q1 (= 1 if born January–March)

Variable	Coefficient	Std. Error	t-statistic
Years of education	0.142	0.061	2.33
Black	-0.133	0.075	-1.77
Married	0.075	0.043	1.74
Region (South)	-0.060	0.038	-1.58
Constant	4.11	0.78	5.27

$n = 3{,}010$

Questions

Question 1: OLS Interpretation

(a) Interpret the OLS coefficient on years of education. Be precise about units and magnitude.

\vspace{2cm}

(b) Why might this coefficient be a biased estimate of the causal effect of education on wages? What is the likely direction of the bias? Explain using the omitted variable bias formula.

\vspace{3cm}

Question 2: First Stage

(a) Interpret the coefficient on “Born Q1” in the first-stage regression. What does it tell us about the relationship between quarter of birth and years of education?

\vspace{2cm}

(b) The first-stage F-statistic on the excluded instrument is $F = 5.15$. What does this tell us? Should we be concerned? Why or why not?

\vspace{2cm}

Question 3: IV/2SLS Interpretation

(a) Interpret the 2SLS coefficient on years of education. How does it compare to the OLS estimate?

\vspace{2cm}

(b) Using the simple IV formula $\hat{\beta}_1^{IV} = \frac{Cov(\text{log wage}, \text{Born Q1})}{Cov(\text{educ}, \text{Born Q1})}$, explain intuitively why the IV estimate is larger than the OLS estimate.

\vspace{3cm}

Question 4: Bias Direction

Given that the OLS estimate (0.070) is smaller than the 2SLS estimate (0.142), what does this imply about the direction of bias in OLS? Is this surprising? Why or why not?

Hint: Think about what you would expect if ability bias drives OLS upward. What else could be going on?

\vspace{3cm}

Question 5: LATE and Validity

(a) The IV estimate is a Local Average Treatment Effect (LATE). In this context, who are the compliers — the group whose behavior is actually affected by the instrument?

\vspace{2cm}

(b) Name one threat to the validity of quarter of birth as an instrument for education. That is, provide a reason why the exogeneity or exclusion restriction might fail.

\vspace{2cm}

\newpage

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Question 1: OLS Interpretation

(a) One additional year of education is associated with approximately 7.0% higher hourly wages, controlling for race, marital status, and region. (Technically: a one-year increase in education is associated with a 0.070 increase in log wages, or about 7.0% higher wages.)
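
For students who ask how good the "coefficient × 100" approximation is, the exact percentage change implied by a log-level coefficient can be worked out directly (a standard conversion, shown here with the OLS coefficient):

$$100 \times \left(e^{0.070} - 1\right) \approx 7.25\%$$

The approximation of 7.0% is close because the coefficient is small.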

(b) If ability is omitted:

$\text{Bias} = \beta_{\text{ability}} \times \frac{Cov(\text{educ}, \text{ability})}{Var(\text{educ})}$
$\beta_{\text{ability}} > 0$ (ability raises wages)
$Cov(\text{educ}, \text{ability}) > 0$ (more able people get more education)
Therefore bias is positive: OLS overestimates the causal effect of education.
This is the standard prediction. We expect OLS > true causal effect if ability bias dominates.

Question 2: First Stage

(a) Being born in Q1 (January–March) is associated with 0.152 fewer years of education compared to being born in Q2–Q4, controlling for race, marital status, and region. This is consistent with compulsory schooling laws: Q1 students reach the legal dropout age earlier in the school year and can exit with less total schooling.

(b) The F-statistic of 5.15 is below the conventional threshold of 10 for strong instruments. This is a weak instrument concern.

Consequences of weak instruments:

2SLS estimates are biased toward OLS in finite samples
Standard errors are unreliable (confidence intervals have incorrect coverage)
The 2SLS estimate may not be trustworthy

This is a well-known limitation of the Angrist-Krueger design with a single quarter-of-birth indicator. The original paper used multiple QOB indicators interacted with year of birth to generate more variation — but even then, weak instrument concerns have been raised (Bound, Jaeger, and Baker, 1995).
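
With a single excluded instrument, the first-stage $F$-statistic is the square of that instrument's first-stage $t$-statistic (assuming both use the same type of standard error), so the reported value can be checked against Table 2:

$$F = t^2 = \left(\frac{-0.152}{0.067}\right)^2 \approx (-2.27)^2 \approx 5.15$$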

Question 3: IV/2SLS Interpretation

(a) The 2SLS estimate implies that an additional year of education causes approximately a 14.2% increase in hourly wages, roughly twice the OLS estimate of 7.0%.

The standard errors are also much larger (0.061 vs. 0.0035), which is typical of IV — we are using less variation (only the part driven by the instrument), so estimates are noisier.
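
Conventional 95% confidence intervals make the loss of precision concrete. They are computed from the reported coefficients and standard errors using the usual $\pm 1.96 \times SE$ rule:

$$\text{OLS: } 0.070 \pm 1.96 \times 0.0035 = [0.063,\ 0.077]$$

$$\text{2SLS: } 0.142 \pm 1.96 \times 0.061 = [0.022,\ 0.262]$$

The 2SLS interval is wide enough to contain the OLS estimate. With a weak first stage, even this conventional interval may not have correct coverage.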

(b) Intuition for the IV formula: The numerator captures how much wages differ by quarter of birth. The denominator captures how much education differs by quarter of birth. Since the education difference (denominator) is small (only 0.15 years), even a modest wage difference gets “scaled up” substantially when we divide. This is partly why the IV estimate is large — and why weak instruments are dangerous (a small, noisy denominator amplifies noise).
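
The same scaling logic lets us back out the reduced form that the tables imply. With one instrument and the same controls in every equation, the 2SLS coefficient equals the reduced-form coefficient on Born Q1 divided by the first-stage coefficient. The reduced form is not reported in the handout, so the value below is implied rather than estimated, and the symbols $\hat{\pi}_{RF}$ and $\hat{\pi}_{FS}$ are notation assumed here:

$$\hat{\beta}^{2SLS} = \frac{\hat{\pi}_{RF}}{\hat{\pi}_{FS}} \quad\Rightarrow\quad \hat{\pi}_{RF} = 0.142 \times (-0.152) \approx -0.022$$

That is, men born in Q1 earn roughly 2% less on average. Dividing this small wage gap by the small schooling gap of 0.152 years produces the 14.2% estimate.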

Question 4: Bias Direction

The fact that OLS (0.070) < IV (0.142) is surprising if we believe ability bias is the main problem, because ability bias should push OLS up, not down.

Possible explanations:

Measurement error in education: If education is measured with error (self-reported years of schooling), OLS is attenuated toward zero. IV corrects for measurement error because the instrument is correlated with true education, not the measurement error. This attenuation could be larger than the upward ability bias.
LATE vs. ATE: The IV estimate applies to compliers — people whose education was affected by compulsory schooling laws. These are individuals at the margin of dropping out, who likely have lower baseline education. The return to education may be higher for this group than for the average person in the sample (heterogeneous treatment effects). So IV > OLS could reflect a higher return for compliers, not necessarily that OLS is biased downward for the whole population.
Weak instrument bias: With $F = 5.15$, the IV estimate is biased toward OLS, meaning the true IV (if we had a strong instrument) might be even larger — or the current estimate could be unreliable.

The measurement error explanation is particularly compelling and is emphasized in the literature.
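
The measurement error story can be illustrated with a stripped-down simulation. All values are assumptions for illustration: there is no ability bias in this sketch, the instrument is deliberately strong (unlike the $F = 5.15$ case above), and the return to schooling is set to 0.10.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
beta_true = 0.10  # assumed true return to a year of schooling

z = rng.integers(0, 2, size=n).astype(float)          # binary instrument
educ_true = 12.0 + 1.0 * z + rng.normal(0.0, 1.0, n)  # true schooling
educ_obs = educ_true + rng.normal(0.0, 1.0, n)        # reported with classical error
log_wage = 1.0 + beta_true * educ_true + rng.normal(0.0, 0.3, n)


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


ols = cov(log_wage, educ_obs) / cov(educ_obs, educ_obs)  # attenuated toward zero
iv = cov(log_wage, z) / cov(educ_obs, z)                 # simple IV ratio
print(f"OLS: {ols:.3f}, IV: {iv:.3f}, truth: {beta_true:.3f}")
```

OLS is pulled toward zero because part of the variation in reported schooling is noise. The IV ratio uses only the variation driven by the instrument, which is unrelated to the reporting error, so it lands near the assumed true value.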

Question 5: LATE and Validity

(a) The compliers are individuals whose total years of education were affected by compulsory schooling laws — specifically, those who would have dropped out of school earlier if they could have (i.e., if they had reached age 16 before the end of the school year). These are people on the margin of the dropout decision.

The IV estimate does not apply to:

Always-takers: People who would have stayed in school regardless of when they turned 16 (e.g., college-bound students)
Never-takers: People who dropped out before age 16 regardless

(b) Threats to validity:

Season-of-birth and family background: If families with different characteristics have children at different times of year (e.g., higher-SES families are more likely to have spring babies), then quarter of birth is correlated with ability/background, violating exogeneity. There is some evidence of this (Buckles and Hungerman, 2013).
Age-at-test effects: Students born earlier in the year are older when they take any given test or enter the labor market at a given calendar date. If age itself affects productivity, quarter of birth has a direct effect on wages (violating exclusion).
School entry age effects: Quarter of birth affects the age at which a child starts school. If starting school younger/older affects learning or development, this is a channel from Z to Y that does not operate through total years of education.

General Teaching Notes

Use this activity to reinforce that IV estimates are noisy — much larger standard errors than OLS.
The weak instrument finding ($F = 5.15$) is a great discussion point. Ask students: “If you were the referee, would you trust this estimate?”
The “wrong sign” on bias direction (OLS < IV) is one of the most important findings in the returns-to-education literature. It highlights that multiple sources of bias can operate simultaneously (ability bias up, measurement error down), and that LATE may differ from ATE.
Time permitting, discuss how Angrist and Krueger’s original paper used 3 quarter-of-birth dummies interacted with 10 year-of-birth dummies (30 instruments!) — which created its own problems (many weak instruments, overfitting in the first stage).

ECON3500: Econometrics and Applications

In-Class Activity: Threat Detective

Chapter 9 — Threats to Internal Validity

For each scenario below, a researcher estimates a regression and interprets the coefficient on the key independent variable as a causal effect. Your job is to play threat detective: identify what could go wrong.

For each scenario:

(a) Identify which threat(s) to internal validity are most likely present. Choose from:
1. Omitted variable bias
2. Wrong functional form
3. Errors-in-variables bias (measurement error)
4. Sample selection bias
5. Simultaneous causality bias
(b) Explain the likely direction of bias on the coefficient of interest (upward or downward). Be specific about your reasoning.
(c) Propose one concrete solution the researcher could implement.

Scenario 1: Education and Earnings

A researcher estimates the following model using data from employed adults aged 25–65:

$$\widehat{earnings}_i = 15{,}200 + 4{,}800 \cdot educ_i$$

where $earnings_i$ is annual earnings in dollars and $educ_i$ is years of schooling. The researcher concludes that each additional year of education causes earnings to increase by $4,800.

(a) Threat(s):

\vspace{2cm}

(b) Direction of bias and reasoning:

\vspace{2cm}

(c) Proposed solution:

\vspace{2cm}

Scenario 2: Police and Crime

A researcher collects data on 300 U.S. cities and estimates:

$$\widehat{crimerate}_i = 12.4 + 0.83 \cdot policepc_i + controls$$

where $crimerate_i$ is violent crimes per 1,000 residents and $policepc_i$ is police officers per 1,000 residents. The researcher is puzzled: “More police appears to increase crime.”

(a) Threat(s):

\vspace{2cm}

(b) Direction of bias and reasoning:

\vspace{2cm}

(c) Proposed solution:

\vspace{2cm}

Scenario 3: Job Training and Wages

A large firm offers a voluntary job training program. A researcher compares the wages of workers who enrolled in the program to those who did not:

$$\widehat{wage}_i = 22.50 + 3.10 \cdot training_i$$

where $wage_i$ is the hourly wage one year after the program was offered and $training_i = 1$ if the worker enrolled. The researcher concludes the program raised wages by $3.10/hour.

(a) Threat(s):

\vspace{2cm}

(b) Direction of bias and reasoning:

\vspace{2cm}

(c) Proposed solution:

\vspace{2cm}

Scenario 4: Health Insurance and Health

A researcher uses survey data in which respondents self-report both their health insurance coverage and their health status on a 1–10 scale. They estimate:

$$\widehat{health}_i = 5.2 + 0.9 \cdot insured_i + controls$$

where $insured_i = 1$ if the respondent reports having health insurance. However, it is known that roughly 10% of respondents misreport their insurance status (some insured people say they are uninsured, and vice versa).

(a) Threat(s):

\vspace{2cm}

(b) Direction of bias and reasoning:

\vspace{2cm}

(c) Proposed solution:

\vspace{2cm}

Scenario 5: Advertising and Sales

A national retail chain estimates the effect of advertising spending on store revenue using quarterly data from its 150 stores:

$$\widehat{revenue}_i = 320{,}000 + 5.2 \cdot adspend_i$$

where $revenue_i$ is quarterly store revenue in dollars and $adspend_i$ is quarterly advertising spending in dollars. The company allocates more advertising budget to stores that had strong sales the previous quarter.

(a) Threat(s):

\vspace{2cm}

(b) Direction of bias and reasoning:

\vspace{2cm}

(c) Proposed solution:

\vspace{2cm}

\newpage

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Scenario 1: Education and Earnings

(a) Threats:

Omitted variable bias is the primary threat. Ability, family background, and motivation are correlated with both education and earnings but are omitted from the regression.
Sample selection bias is also present: the sample is restricted to employed adults. People with very low education may be disproportionately unemployed and excluded from the sample, which could bias the estimated return to education.

(b) Direction of bias:

OVB: Upward bias. Ability is positively correlated with education (higher-ability people get more schooling) and positively correlated with earnings (higher-ability people earn more). Since both correlations are positive, the omitted variable bias formula gives a positive bias, so 4,800 likely overstates the true causal effect.
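Sample selection: Likely downward, though the net effect is ambiguous. Among people with low education, only those with favorable unobserved characteristics (e.g., motivation, connections) remain employed, so the low-education workers we observe earn more than the full low-education population would. This compresses the apparent earnings gap between low- and high-education groups and pushes the estimated return toward zero, partly offsetting the upward ability bias.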
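
The ability-bias argument can be checked with a small simulation. This is a minimal sketch, not part of the original activity: every parameter value (a true return of 3,000, an ability effect of 4,000, the noise levels) is hypothetical and chosen only so the bias is easy to see. It compares the regression that omits ability with the true return plus the bias predicted by the OVB formula.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate_ability_bias(n=20000, seed=0):
    # All numbers below are hypothetical, chosen only for illustration.
    rng = random.Random(seed)
    true_return = 3000.0      # assumed causal effect of one more year of schooling
    ability_effect = 4000.0   # assumed effect of unobserved ability on earnings
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Higher-ability people get more schooling (positive correlation).
    educ = [12 + 2 * a + rng.gauss(0, 2) for a in ability]
    earnings = [
        15000 + true_return * e + ability_effect * a + rng.gauss(0, 5000)
        for e, a in zip(educ, ability)
    ]
    short_slope = ols_slope(educ, earnings)  # regression that omits ability
    delta = ols_slope(educ, ability)         # slope of ability on education
    predicted_bias = ability_effect * delta  # OVB formula: coefficient x slope
    return true_return, short_slope, predicted_bias


true_return, short_slope, predicted_bias = simulate_ability_bias()
print(f"true return:             {true_return:.0f}")
print(f"short-regression slope:  {short_slope:.0f}")
print(f"true return + OVB bias:  {true_return + predicted_bias:.0f}")
```

In this setup the short-regression slope lands well above the true return, and the gap matches the product of the two positive terms in the OVB formula.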

(c) Solutions:

Add control variables for ability (e.g., test scores), family background (parental education, income).
Use an instrumental variable (e.g., proximity to a college, compulsory schooling laws, quarter of birth).
Include the full adult population (employed and unemployed) to address selection.

Scenario 2: Police and Crime

(a) Threats:

Simultaneous causality bias is the primary threat. Crime rates affect police hiring decisions (cities with more crime hire more police), and police presence may also affect crime rates. Causality runs in both directions.

(b) Direction of bias:

Upward bias on the police coefficient. High crime causes cities to hire more police, creating a positive correlation between police and crime that is not the causal effect of police on crime. The true causal effect of police on crime is likely negative (more police reduces crime), but simultaneous causality pushes the estimated coefficient upward — potentially making it positive, as we see here.
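
A short simulation can show how the sign flips. This is a sketch under assumed values, not the researcher's model: the two structural equations and all of their coefficients are hypothetical. Police reduce crime by construction, yet the OLS regression of crime on police returns a positive slope because cities with more crime hire more police.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate_police_crime(n=20000, seed=1):
    # Hypothetical structural model (all numbers are made up):
    #   crime  = 10 - 0.5 * police + u   (police reduce crime)
    #   police =  2 + 1.0 * crime  + v   (high-crime cities hire more police)
    rng = random.Random(seed)
    b, d = 0.5, 1.0
    police, crime = [], []
    for _ in range(n):
        u = rng.gauss(0, 1.0)
        v = rng.gauss(0, 0.5)
        # Solve the two equations jointly for each city's equilibrium values.
        p = (2 + d * (10 + u) + v) / (1 + b * d)
        c = 10 - b * p + u
        police.append(p)
        crime.append(c)
    return -b, ols_slope(police, crime)


true_effect, ols_estimate = simulate_police_crime()
print(f"true causal effect of police on crime: {true_effect:.2f}")
print(f"OLS slope of crime on police:          {ols_estimate:.2f}")
```

The crime shock u raises both crime and police staffing, so police is positively correlated with the error term and OLS is pushed upward past zero.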

(c) Solutions:

Use an instrumental variable that affects police staffing but does not directly affect crime (e.g., electoral cycles as in Levitt 1997, firefighter staffing as in Levitt 2002, or federal grants for police hiring).
Exploit natural experiments (e.g., terror alert levels that exogenously increase police presence, as in Klick and Tabarrok 2005).

Scenario 3: Job Training and Wages

(a) Threats:

Sample selection bias (self-selection into treatment). Workers who voluntarily enroll in training are likely systematically different from those who do not — they may be more motivated, more career-oriented, or already on an upward trajectory.
Omitted variable bias — motivation and ambition are omitted and correlated with both training enrollment and wages.
Note: Students may identify either or both. Accept OVB as an answer, but emphasize that when the selection is into the treatment itself, “sample selection bias” is the more precise label per Stock & Watson Ch. 9.

(b) Direction of bias:

Upward bias. Workers who self-select into training are likely more motivated and ambitious, traits that independently lead to higher wages. The coefficient of 3.10 likely overstates the true causal effect of the program.

(c) Solutions:

Randomize access to the training program (RCT).
Use the initial offer of training as an instrument (intent-to-treat / IV approach) if the program was offered to a random subset.
Compare wages before and after training for participants vs. non-participants (difference-in-differences), though this still requires a parallel trends assumption.

Scenario 4: Health Insurance and Health

(a) Threats:

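Errors-in-variables bias (measurement error) in the independent variable. If 10% of respondents misreport their insurance status, $insured_i$ is measured with error. Because $insured_i$ is binary, this is misclassification rather than textbook classical measurement error (the error is necessarily negatively correlated with the true value), but misreporting that is unrelated to health still attenuates the coefficient.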
Students may also identify simultaneous causality (healthier people may be more likely to have jobs that provide insurance, and insurance may improve health) and omitted variable bias (income, education, and health behaviors are correlated with both insurance and health outcomes).
All three are defensible, but the scenario is written to highlight measurement error.

(b) Direction of bias:

Measurement error in a binary independent variable causes attenuation bias — the coefficient is biased toward zero. The true effect of insurance on health is likely larger in magnitude than 0.9.
Key point for students: Classical measurement error in X always biases the coefficient toward zero, regardless of the sign of the true effect.
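
The attenuation from misreporting can be illustrated with a simulation. This is a sketch under stated assumptions that are not in the scenario: half of respondents are truly insured, the true effect on the health index is 1.0, and each respondent misreports with probability 0.10 regardless of health. Under those assumptions the expected attenuation factor works out to $1 - 2(0.10) = 0.8$.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate_misreporting(n=50000, misreport_rate=0.10, seed=2):
    rng = random.Random(seed)
    true_effect = 1.0  # hypothetical effect of insurance on the health index
    # Assumption: half of respondents are truly insured.
    insured = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    health = [5 + true_effect * x + rng.gauss(0, 1) for x in insured]
    # Each respondent flips their reported status with probability
    # misreport_rate, independently of health (an assumption).
    reported = [1 - x if rng.random() < misreport_rate else x for x in insured]
    slope_true_status = ols_slope(insured, health)
    slope_reported_status = ols_slope(reported, health)
    return slope_true_status, slope_reported_status


slope_true_status, slope_reported_status = simulate_misreporting()
print(f"slope using true insurance status:     {slope_true_status:.2f}")
print(f"slope using reported insurance status: {slope_reported_status:.2f}")
```

The slope on reported status is smaller in magnitude than the slope on true status, which is the attenuation described above.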

(c) Solutions:

Use administrative records on insurance coverage (e.g., insurer enrollment data) instead of self-reports to eliminate measurement error.
Use an instrumental variable (e.g., Medicaid eligibility cutoffs, employer mandate thresholds).
Reference for discussion: The Oregon Health Insurance Experiment randomly assigned Medicaid access.

Scenario 5: Advertising and Sales

(a) Threats:

Simultaneous causality bias is the primary threat. The company explicitly allocates more advertising to stores with strong prior sales. Revenue drives advertising spending, not just the reverse.
Omitted variable bias may also be present: store location quality, local economic conditions, and management quality affect both revenue and the advertising budget allocated.

(b) Direction of bias:

Upward bias. Stores with high revenue receive more advertising budget, creating a positive feedback loop. The coefficient of 5.2 overstates the true causal return to an additional dollar of advertising.

(c) Solutions:

Randomly assign advertising budgets across stores (an A/B test or field experiment).
Use an instrumental variable for advertising spending that is unrelated to store performance (e.g., random variation in local media costs).
Use lagged advertising spending (from two or more quarters ago) as the independent variable to break the contemporaneous simultaneity, though this does not fully resolve the problem if the allocation rule is persistent.

ECON3500: Econometrics and Applications

In-Class Activity: Regression Audit

Chapter 9 — Threats to Internal Validity

Setup

A school district superintendent wants to know: does reducing class size improve student test scores? She hires an economist to study the question using data from 420 school districts across California.

The key variables are:

testscr — district average test score (combined math and reading, scale 600–720)
str — student-teacher ratio (average class size proxy)
el_pct — percent of students who are English learners
avginc — district average household income (in $1,000s)
meal_pct — percent of students qualifying for free/reduced-price meals (a proxy for poverty)
calworks — percent of students in public assistance programs

The economist estimates three specifications. Review the output below and answer the questions that follow.

Specification (1): Bivariate regression

. regress testscr str
Source | SS df MS Number of obs = 420
-------------+---------------------------------- F(1, 418) = 22.58
Model | 7794.11004 1 7794.11004 Prob > F = 0.0000
Residual | 144315.484 418 345.252354 R-squared = 0.0512
-------------+---------------------------------- Adj R-squared = 0.0490
Total | 152109.594 419 363.029819 Root MSE = 18.580
------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
str | -2.279808 .4798256 -4.75 0.000 -3.223068 -1.336549
_cons | 698.9330 9.467491 73.83 0.000 680.3231 717.5428
------------------------------------------------------------------------------

Specification (2): Adding demographic controls

. regress testscr str el_pct meal_pct
Source | SS df MS Number of obs = 420
-------------+---------------------------------- F(3, 416) = 612.93
Model | 124045.959 3 41348.6529 Prob > F = 0.0000
Residual | 28063.6355 416 67.4606612 R-squared = 0.8156
-------------+---------------------------------- Adj R-squared = 0.8143
Total | 152109.594 419 363.029819 Root MSE = 8.2135
------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
str | -0.998309 .2704117 -3.69 0.000 -1.530032 -0.466587
el_pct | -0.121573 .0332578 -3.66 0.000 -0.186919 -0.056228
meal_pct | -0.547235 .0240462 -22.75 0.000 -0.594531 -0.499940
_cons | 700.3918 5.537407 126.48 0.000 689.5001 711.2835
------------------------------------------------------------------------------

Specification (3): Adding income and income-squared

. regress testscr str el_pct meal_pct avginc avginc_sq
Source | SS df MS Number of obs = 420
-------------+---------------------------------- F(5, 414) = 405.30
Model | 126306.099 5 25261.2198 Prob > F = 0.0000
Residual | 25803.4953 414 62.3272831 R-squared = 0.8305
-------------+---------------------------------- Adj R-squared = 0.8284
Total | 152109.594 419 363.029819 Root MSE = 7.8948
------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
str | -0.734206 .2637618 -2.78 0.006 -1.252840 -0.215571
el_pct | -0.176017 .0341101 -5.16 0.000 -0.243040 -0.108993
meal_pct | -0.420495 .0295224 -14.24 0.000 -0.478520 -0.362470
avginc | 3.850891 0.582873 6.61 0.000 2.704831 4.996951
avginc_sq | -0.043100 0.009935 -4.34 0.000 -0.062620 -0.023580
_cons | 663.7046 7.745928 85.69 0.000 648.4748 678.9345
------------------------------------------------------------------------------

Questions

1. Interpret the coefficient on str in Specification (1). Is it statistically significant?

\vspace{3cm}

2. Compare the coefficient on str across the three specifications. What happens to it as controls are added? What does this pattern suggest about the direction of omitted variable bias in Specification (1)?

\vspace{4cm}

3. Specification (1) omits el_pct and meal_pct. Using the omitted variable bias formula, explain why omitting these variables likely biases the coefficient on str in Specification (1). Be specific about:

The likely correlation between str and the omitted variable
The likely sign of the omitted variable’s coefficient
The resulting direction of bias

\vspace{4cm}

4. Specification (3) adds avginc and avginc_sq (income squared). What threat to internal validity does including avginc_sq address? Why might the relationship between income and test scores be nonlinear?

\vspace{3cm}

5. Even after adding all these controls, what threats to internal validity might remain? Identify at least two specific threats and, for each, explain:

What the specific concern is
What additional data or method could help address it

\vspace{5cm}

6. The superintendent wants to use these results to argue that the state legislature should fund smaller class sizes across California. Briefly assess:

(a) Does this study have internal validity for estimating the causal effect of class size on test scores in California? Why or why not?
(b) Would these results have external validity if applied to schools in a developing country? Why or why not?

\vspace{5cm}

\newpage

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Question 1

The coefficient on str in Specification (1) is -2.28. Interpretation: A one-unit increase in the student-teacher ratio (i.e., one more student per teacher) is associated with a 2.28-point decrease in district average test scores, holding nothing else constant.

It is statistically significant at the 1% level: t = -4.75, p < 0.001. The 95% confidence interval [-3.22, -1.34] does not include zero.
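
As a check on reading the output, the t-statistic and confidence interval can be reproduced by hand from the coefficient and standard error in Specification (1). The critical value of about 1.966 is the 97.5th percentile of the t distribution with 418 degrees of freedom; using the normal value 1.96 gives nearly the same interval.

$$t = \frac{\hat{\beta}_{str}}{SE(\hat{\beta}_{str})} = \frac{-2.2798}{0.4798} \approx -4.75$$

$$-2.2798 \pm 1.966 \times 0.4798 = -2.2798 \pm 0.9433 \;\Rightarrow\; [-3.22,\ -1.34]$$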

Question 2

The coefficient on str changes across specifications:

Spec (1): -2.28
Spec (2): -1.00
Spec (3): -0.73

The coefficient shrinks in magnitude (toward zero) as controls are added. This pattern suggests that Specification (1) overstates the negative effect — i.e., the omitted variable bias in Spec (1) is negative (biasing the coefficient away from zero, making it more negative than the true causal effect).

This makes sense: districts with high student-teacher ratios tend to be poorer districts that also have worse test scores for other reasons. Once we control for those other reasons (poverty, English learners, income), the remaining effect attributable to class size is smaller.

Question 3

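Using the OVB formula: $bias = \delta_1 \times \gamma_{omitted}$, where $\delta_1$ is the slope from regressing the omitted variable on str (it has the same sign as their correlation) and $\gamma_{omitted}$ is the omitted variable's coefficient in the regression that includes it.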

For meal_pct (poverty):

Correlation between str and meal_pct: Likely positive. Poorer districts have less funding and therefore larger class sizes (higher student-teacher ratios).
Coefficient on meal_pct in the regression: Negative (-0.55). Poverty is associated with lower test scores.
Direction of bias: positive $\times$ negative = negative bias. Omitting poverty makes the coefficient on str more negative than its true value. This is consistent with what we observe.

For el_pct (English learners):

Correlation between str and el_pct: Likely positive. Districts with more English learners tend to be urban or under-resourced and may have higher student-teacher ratios.
Coefficient on el_pct: Negative (-0.12). More English learners is associated with lower average test scores.
Direction of bias: positive $\times$ negative = negative bias. Same direction.

Both omitted variables bias the str coefficient in the same direction (more negative), which is why the coefficient shrinks when they are added.
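
With two omitted variables, the same logic can be written as a decomposition linking Specifications (1) and (2). The notation is introduced here for illustration: superscripts mark the specification, $\hat{\gamma}$ denotes a coefficient from Specification (2), and $\hat{\delta}$ denotes the slope from regressing that omitted variable on str. The output does not report the $\hat{\delta}$ values, so only their combined contribution can be computed.

$$\hat{\beta}_{str}^{(1)} = \hat{\beta}_{str}^{(2)} + \hat{\gamma}_{el}\,\hat{\delta}_{el} + \hat{\gamma}_{meal}\,\hat{\delta}_{meal}$$

$$\hat{\gamma}_{el}\,\hat{\delta}_{el} + \hat{\gamma}_{meal}\,\hat{\delta}_{meal} = -2.28 - (-1.00) = -1.28$$

Since $\hat{\gamma}_{el} = -0.12$ and $\hat{\gamma}_{meal} = -0.55$ are both negative, a combined bias of $-1.28$ is what we would expect if str is positively related to both omitted variables.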

Question 4

Including avginc_sq addresses the threat of wrong functional form. The relationship between income and test scores is likely nonlinear — specifically, concave (diminishing returns). An additional $1,000 in average income probably improves test scores more in poor districts than in wealthy districts.

Evidence: The coefficient on avginc_sq is negative (-0.043) and statistically significant (t = -4.34), consistent with a concave relationship. Adding avginc and avginc_sq together raises R-squared from 0.8156 in Specification (2) to 0.8305 in Specification (3).

If we only included avginc linearly, we would be imposing a constant marginal effect of income across all income levels, which is a misspecification.
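
To make the diminishing returns concrete, the sketch below evaluates the marginal effect of income implied by Specification (3), which is the derivative of the quadratic: the avginc coefficient plus twice the avginc_sq coefficient times avginc. The income levels of 10 and 30 (in $1,000s) are hypothetical examples, and the function names are made up for this illustration.

```python
# Coefficients copied from the Specification (3) output.
B_AVGINC = 3.850891
B_AVGINC_SQ = -0.043100


def marginal_effect_of_income(avginc):
    """Approximate change in testscr per extra $1,000 of avginc,
    holding the other regressors fixed (derivative of the quadratic)."""
    return B_AVGINC + 2 * B_AVGINC_SQ * avginc


def income_at_peak():
    """Income level (in $1,000s) where the estimated marginal effect is zero."""
    return -B_AVGINC / (2 * B_AVGINC_SQ)


for level in (10, 30):  # hypothetical district income levels, in $1,000s
    effect = marginal_effect_of_income(level)
    print(f"avginc = {level}: marginal effect = {effect:.2f} points")
print(f"estimated marginal effect reaches zero at avginc = {income_at_peak():.1f}")
```

The estimated effect of an extra $1,000 is about 2.99 points at an income of 10 but only about 1.26 points at 30, and it reaches zero near 44.7. A linear-only specification would force these to be equal.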

Question 5

Remaining threats (students should identify at least two):

Omitted variable bias: Even with these controls, there may be unobserved factors correlated with both str and test scores. Examples:
- Teacher quality: Districts that can afford smaller classes may also attract better teachers. We cannot separate the class-size effect from the teacher-quality effect.
- Parental involvement: Wealthier districts may have more engaged parents, which improves test scores independent of class size.
- School resources (beyond class size): Libraries, technology, facilities.
- Solution: Add controls for teacher qualifications, school spending per pupil, or parental education. Alternatively, use an instrumental variable or a natural experiment (e.g., class size rules that create discontinuities, as in Angrist and Lavy 1999).
Simultaneous causality: Districts that observe low test scores may respond by reducing class sizes (hiring more teachers). If so, low test scores cause small class sizes, creating reverse causality.
- Solution: Use an instrument for class size, or exploit policy rules that exogenously assign class sizes (e.g., maximum class size rules that create sharp cutoffs).
Errors-in-variables bias: The student-teacher ratio is a proxy for class size, not actual class size. Districts may have non-teaching staff counted in the ratio, or some teachers may serve as specialists rather than classroom instructors. This measurement error would attenuate the coefficient toward zero.
- Solution: Use actual class size data rather than student-teacher ratios.
Sample selection bias: The sample includes only California districts. If certain types of districts are systematically excluded (e.g., very small rural districts, charter schools), results may not reflect the true relationship even within California.
- Solution: Verify sample coverage and compare included vs. excluded districts on observables.

Question 6

(a) Internal validity: The study has limited internal validity. While the addition of controls substantially reduces omitted variable bias (as evidenced by the coefficient changing from -2.28 to -0.73), the remaining threats identified in Question 5 — particularly unobserved teacher quality, simultaneous causality, and measurement error — mean we cannot be confident the estimate of -0.73 represents the true causal effect. The study is observational, not experimental, so there is no way to fully rule out confounding.

That said, the study does a reasonable job of addressing the most obvious sources of bias, and the consistent negative sign across specifications is suggestive. The coefficient is “better” than the naive estimate but still potentially biased.

(b) External validity: External validity to a developing country is weak. Several factors differ:

The relationship between class size and learning may differ due to different pedagogical approaches, curriculum, teacher training, and student background.
Class sizes in developing countries are often much larger (40–80 students), well outside the range observed in California (~15–25). Extrapolating beyond the data range is unreliable.
School infrastructure, teacher quality, and resource constraints differ fundamentally.
The outcome measure (standardized test scores) may not be comparable.

The results may generalize to other U.S. states with similar demographics and school systems, but even that requires caution about institutional differences.

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity — Solutions

Chapter 9 — Assessing Studies Based on Multiple Regression
Time: ~15-20 minutes

Your Job

Each example below is a research study. All six are based on real papers, but the descriptions below are simplified for class use.

For each study:

What is the goal?
- Causal inference
- Forecasting
What is the main problem?
- Omitted variable bias
- Wrong functional form
- Errors-in-variables bias
- Sample selection bias
- Simultaneous causality bias
- External validity only / not mainly an internal-validity problem
Why is that the right diagnosis?
What is one concrete fix or improvement?

Quick Diagnosis Guide

If the problem is…	Ask yourself…
OVB	Is there some omitted factor that affects $Y$ and is correlated with $X$?
Wrong functional form	Did we force a straight-line relationship when the true relationship is curved or interactive?
Measurement error	Is $X$ or $Y$ measured noisily, inaccurately, or systematically wrong?
Sample selection	Are some observations missing because of the outcome or some unobserved factor tied to it?
Simultaneous causality	Does $Y$ also affect $X$?
External validity	Even if the study is internally valid, would the result generalize to a different setting?

Example 1: Catholic Schooling and Educational Attainment

A researcher studies whether attending a Catholic high school increases graduation and college attendance. Students who attend Catholic schools may also come from families that are more motivated, more religious, or more education-focused to begin with.

Goal: Causal inference
Diagnosis: Omitted variable bias
Why: Students who attend Catholic schools are selected. Family motivation, religiosity, discipline, and neighborhood context may affect both school choice and later attainment.
Fixes:
- Add better controls
- Use a credible IV or lottery-style design
- Compare similar students more carefully

Example 2: Oregon Medicaid Lottery

The Oregon Health Insurance Experiment used a lottery to study the effects of Medicaid for low-income uninsured adults in Oregon. A policymaker wants to use those estimates to predict what the effects would be in a very different state with different hospitals, demographics, and eligibility rules.

Goal: Usually causal inference in the original study, but the policymaker’s question is about external validity
Diagnosis: External validity only / not mainly an internal-validity problem
Why: The question is whether Oregon lottery estimates transport to a very different setting. Students should talk about hospitals, baseline uninsured rates, take-up, and the local policy environment.
Fixes:
- Replicate in more settings
- Compare institutional context
- Ask whether the treated and target settings are genuinely comparable

Example 3: Survey Earnings vs. Administrative Records

Bound and Krueger compare workers' self-reported earnings in surveys to administrative earnings records. Suppose a researcher estimates the effect of earnings on some outcome using only the self-reported survey measure.

Goal: Causal inference or forecasting; either answer is acceptable if justified
Diagnosis: Errors-in-variables bias
Why: Self-reported earnings differ from administrative records. The observed regressor may contain measurement error, and Bound-Krueger show that it is not purely classical.
Fixes:
- Use administrative records
- Validate survey responses
- Be cautious about assuming classical attenuation only

Example 4: Wages of Married Women

In Heckman’s classic sample-selection setup, wages are only observed for married women who choose to work. A researcher regresses wages on education using only women with observed wages.

Goal: Causal inference
Diagnosis: Sample selection bias
Why: Wages are only observed for women who work. Selection into employment depends on unobservables that may also affect wages.
Fixes:
- Model the selection process
- Use Heckman-style correction methods
- Gather information on nonworkers if possible
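
A stylized simulation can show why the selected sample misleads. This is a sketch, not Heckman's model or estimator: education is standardized, the true slope of 1.0 and the reservation wage of 0 are hypothetical, and a woman is assumed to work only when her latent wage offer exceeds the reservation wage. The code compares the OLS slope in the full population of wage offers with the slope among workers only.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate_selection(n=50000, seed=3):
    rng = random.Random(seed)
    true_slope = 1.0  # hypothetical effect of (standardized) education on wages
    educ, wage_offer = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)                   # education, standardized
        w = true_slope * x + rng.gauss(0, 1)  # latent wage offer
        educ.append(x)
        wage_offer.append(w)
    full_sample_slope = ols_slope(educ, wage_offer)
    # Wages are observed only for women whose offer exceeds the
    # reservation wage (set to 0 here, an assumption).
    workers = [(x, w) for x, w in zip(educ, wage_offer) if w > 0]
    workers_only_slope = ols_slope(
        [x for x, _ in workers], [w for _, w in workers]
    )
    return full_sample_slope, workers_only_slope


full_sample_slope, workers_only_slope = simulate_selection()
print(f"slope using all wage offers:    {full_sample_slope:.2f}")
print(f"slope using workers only:       {workers_only_slope:.2f}")
```

Among workers, low-education women appear only if they drew an unusually good wage offer, so the error term is negatively related to education in the selected sample and the workers-only slope falls well below the true slope.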

Example 5: Children and Mothers' Labor Supply

A researcher regresses a mother’s labor supply on the number of children she has and finds that women with more children work less. He concludes that having another child reduces labor supply by exactly that amount.

Goal: Causal inference
Diagnosis: Simultaneous causality bias
Why: Fertility affects labor supply, but labor supply choices may also affect fertility decisions. Family preferences and timing decisions tie the two together.
Fixes:
- IV
- Natural experiment
- Exogenous variation in family size

Example 6: Earnings and Experience

Following the classic earnings literature, a researcher regresses log earnings on years of schooling and years of labor-market experience. She includes experience only as a linear term, even though the earnings profile appears to rise early in the career and then flatten.

Goal: Prediction or description, though students may argue causal inference if they justify it carefully
Diagnosis: Wrong functional form
Why: A linear term imposes a constant marginal effect of experience, but the classic earnings profile is concave.
Fixes:
- Add experience squared
- Use logs
- Plot the data first

Final Checkup

Choose one of the six studies above and answer:

If you were the journal referee, would you trust the causal claim? Why or why not?

Answers will vary — the key is applying the correct diagnosis and explaining how it undermines or qualifies the study’s conclusions.

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 1: Catholic Schooling and Educational Attainment

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix:

\vspace{2cm}

\newpage

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 2: Oregon Medicaid Lottery

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix or follow-up question:

\vspace{2cm}

\newpage

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 3: Survey Earnings vs. Administrative Records

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix:

\vspace{2cm}

\newpage

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 4: Wages of Married Women

In Heckman’s classic sample-selection setup, wages are only observed for married women who choose to work. A researcher regresses wages on education using only women with observed wages.

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix:

\vspace{2cm}

\newpage

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 5: Children and Mothers' Labor Supply

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix:

\vspace{2cm}

\newpage

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Example 6: Earnings and Experience

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2.5cm}

One fix:

\vspace{2cm}

ECON 3500 Econometrics and Applications
Spring 2026

In-Class Activity: Regression Validity

Chapter 9 — Assessing Studies Based on Multiple Regression
Time: ~15-20 minutes

Your Job

Each example below is a research study. All six are based on real papers, but the descriptions below are simplified for class use.

For each study:

What is the goal?
- Causal inference
- Forecasting
What is the main problem?
- Omitted variable bias
- Wrong functional form
- Errors-in-variables bias
- Sample selection bias
- Simultaneous causality bias
- External validity only / not mainly an internal-validity problem
Why is that the right diagnosis?
What is one concrete fix or improvement?

Quick Diagnosis Guide

If the problem is…	Ask yourself…
OVB	Is there some omitted factor that affects $Y$ and is correlated with $X$?
Wrong functional form	Did we force a straight-line relationship when the true relationship is curved or interactive?
Measurement error	Is $X$ or $Y$ measured noisily, inaccurately, or systematically wrong?
Sample selection	Are some observations missing because of the outcome or some unobserved factor tied to it?
Simultaneous causality	Does $Y$ also affect $X$?
External validity	Even if the study is internally valid, would the result generalize to a different setting?

Example 1: Catholic Schooling and Educational Attainment

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix:

\vspace{1.5cm}

Example 2: Oregon Medicaid Lottery

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix or follow-up question:

\vspace{1.5cm}

Example 3: Survey Earnings vs. Administrative Records

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix:

\vspace{1.5cm}

Example 4: Wages of Married Women

In Heckman’s classic sample-selection setup, wages are only observed for married women who choose to work. A researcher regresses wages on education using only women with observed wages.

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix:

\vspace{1.5cm}

\newpage

Example 5: Children and Mothers' Labor Supply

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix:

\vspace{1.5cm}

Example 6: Earnings and Experience

Goal: __________________________________________

Main diagnosis: __________________________________

Why?

\vspace{2cm}

One fix:

\vspace{1.5cm}

Final Checkup

Choose one of the six studies above and answer:

If you were the journal referee, would you trust the causal claim? Why or why not?

\vspace{4cm}

\newpage

INSTRUCTOR NOTES — DO NOT DISTRIBUTE

Preferred diagnoses

Example 1: Catholic Schooling and Educational Attainment

Goal: Causal inference
Diagnosis: Omitted variable bias
Why: Students who attend Catholic schools are selected. Family motivation, religiosity, discipline, and neighborhood context may affect both school choice and later attainment.
Fixes:
- Add better controls
- Use a credible IV or lottery-style design
- Compare similar students more carefully
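
Optional demo for the direction-of-bias discussion: a minimal simulation sketch, not taken from Altonji, Elder, and Taber. The data-generating process, the variable names (`motivation`, `catholic`, `attainment`), and every number below are assumptions made up for illustration. It shows how leaving out a factor that raises both Catholic-school attendance and attainment pushes the estimated coefficient above the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all values are illustrative assumptions).
motivation = rng.normal(size=n)  # unobserved family motivation
catholic = (0.8 * motivation + rng.normal(size=n) > 0).astype(float)
attainment = 12.0 + 0.5 * catholic + 1.0 * motivation + rng.normal(size=n)


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) computed by least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]


coef_short = ols(attainment, catholic)              # omits motivation
coef_long = ols(attainment, catholic, motivation)   # controls for motivation

print(f"true effect 0.50 | short regression {coef_short[1]:.2f} | "
      f"long regression {coef_long[1]:.2f}")
```

Because motivation raises attainment and is positively correlated with attendance in this setup, the short regression overstates the effect. Flipping the sign of either relationship flips the direction of the bias.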

Example 2: Oregon Medicaid Lottery

Goal: Usually causal inference in the original study, but the policymaker’s question is about external validity
Diagnosis: External validity only / not mainly an internal-validity problem
Why: The question is whether Oregon lottery estimates transport to a very different setting. Students should talk about hospitals, baseline uninsured rates, take-up, and the local policy environment.
Fixes:
- Replicate in more settings
- Compare institutional context
- Ask whether the treated and target settings are genuinely comparable

Example 3: Survey Earnings vs. Administrative Records

Goal: Causal inference or prediction; either answer is acceptable if justified
Diagnosis: Errors-in-variables bias
Why: Self-reported earnings differ from administrative records. The observed regressor may contain measurement error, and Bound and Krueger show that the error is not purely classical.
Fixes:
- Use administrative records
- Validate survey responses
- Be cautious about assuming classical attenuation only
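
Optional demo of the classical benchmark: a minimal sketch, assuming purely classical error (mean zero, uncorrelated with the true value and with the regression error). That assumption is exactly what Bound and Krueger question, so treat this as the textbook baseline rather than a description of their data. All names and numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical setup: true regressor with variance 1, classical noise with variance 1.
true_x = rng.normal(loc=0.0, scale=1.0, size=n)
noise = rng.normal(loc=0.0, scale=1.0, size=n)
reported_x = true_x + noise                     # what the survey records
y = 2.0 * true_x + rng.normal(size=n)           # true slope is 2


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_true = slope(true_x, y)
slope_noisy = slope(reported_x, y)

# Attenuation factor: var(true_x) / (var(true_x) + var(noise)).
reliability = 1.0 / (1.0 + 1.0)

print(f"slope with true x: {slope_true:.2f}")
print(f"slope with noisy x: {slope_noisy:.2f}")
print(f"predicted attenuated slope: {2.0 * reliability:.2f}")
```

With non-classical error (for example, error that is correlated with the true value), the bias need not be a simple shrinkage toward zero, which is the point of teaching note 4.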

Example 4: Wages of Married Women

Goal: Causal inference
Diagnosis: Sample selection bias
Why: Wages are only observed for women who work. Selection into employment depends on unobservables that may also affect wages.
Fixes:
- Model the selection process
- Use Heckman-style correction methods
- Gather information on nonworkers if possible
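
Optional demo: a minimal simulation sketch of selection into employment, not a reproduction of Heckman (1979). The participation rule, the error correlation, and all coefficients are assumptions chosen for illustration. Wages are generated for everyone, but the "researcher" only sees wages for women whose latent participation index is positive.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical population (all values are illustrative assumptions).
educ = rng.normal(loc=12.0, scale=2.0, size=n)
v = rng.normal(size=n)                            # unobservable driving participation
u = 0.8 * v + 0.6 * rng.normal(size=n)            # wage error, correlated with v
wage = 5.0 + 1.0 * educ + u                       # true slope on education is 1

# Participation depends on education and on v, so it is tied to the wage error.
works = 0.5 * (educ - 12.0) + v > 0


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_full = slope(educ, wage)                    # infeasible: uses everyone
slope_selected = slope(educ[works], wage[works])  # feasible: workers only

print(f"full population: {slope_full:.2f} | workers only: {slope_selected:.2f}")
```

In this setup low-education women work only if their unobservable is unusually high, which also raises their wages, so the workers-only slope understates the true return. The direction depends on the assumed signs and is not a general result.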

Example 5: Children and Mothers' Labor Supply

Goal: Causal inference
Diagnosis: Simultaneous causality bias
Why: Fertility affects labor supply, but labor supply choices may also affect fertility decisions. Family preferences and timing decisions tie the two together.
Fixes:
- IV
- Natural experiment
- Exogenous variation in family size
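
Optional demo: a minimal sketch of a two-equation simultaneous system, assuming a generic valid instrument `z` that shifts family size but does not enter the labor-supply equation. This is not the Angrist and Evans data or specification; the variables are continuous for simplicity and every coefficient is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Hypothetical structural model (illustrative assumptions):
#   hours = beta * kids + u
#   kids  = gamma * hours + z + v
beta, gamma = -5.0, -0.05
z = rng.normal(size=n)                 # instrument: shifts kids only
u = rng.normal(scale=10.0, size=n)     # labor-supply shock
v = rng.normal(size=n)                 # fertility shock

# Solve the two equations jointly (reduced form for kids), then build hours.
kids = (z + v + gamma * u) / (1.0 - gamma * beta)
hours = beta * kids + u


def cov(a, b):
    """Sample covariance between two arrays."""
    return np.cov(a, b)[0, 1]


ols_slope = cov(kids, hours) / cov(kids, kids)   # biased: kids is correlated with u
iv_slope = cov(z, hours) / cov(z, kids)          # IV estimator with one instrument

print(f"true beta {beta:.2f} | OLS {ols_slope:.2f} | IV {iv_slope:.2f}")
```

Because hours feed back into fertility here, `kids` is correlated with the labor-supply shock and OLS is biased. The covariance-ratio IV estimator recovers the structural coefficient only under the stated assumption that `z` is relevant and exogenous.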

Example 6: Earnings and Experience

Goal: Prediction or description, though students may argue causal inference if they justify it carefully
Diagnosis: Wrong functional form
Why: A linear term imposes a constant marginal effect of experience, but the classic earnings profile is concave.
Fixes:
- Add experience squared
- Use logs
- Plot the data first
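
Optional demo: a minimal sketch comparing a linear and a quadratic fit on simulated data with a concave profile. The log-earnings equation and its coefficients are assumptions chosen for illustration, not Mincer's estimates.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Hypothetical concave profile (coefficients are illustrative assumptions).
exper = rng.uniform(0.0, 40.0, size=n)
log_earn = 1.5 + 0.08 * exper - 0.0015 * exper**2 + rng.normal(scale=0.3, size=n)

# np.polyfit returns coefficients from the highest power down.
lin_slope, lin_const = np.polyfit(exper, log_earn, 1)
quad_c2, quad_c1, quad_c0 = np.polyfit(exper, log_earn, 2)


def marginal_effect(years):
    """Estimated effect of one more year of experience under the quadratic fit."""
    return quad_c1 + 2.0 * quad_c2 * years


print(f"linear fit: constant marginal effect of {lin_slope:.4f}")
print(f"quadratic fit at 5 years: {marginal_effect(5.0):.4f}")
print(f"quadratic fit at 30 years: {marginal_effect(30.0):.4f}")
```

The linear specification reports a single slope for every worker, while the quadratic one lets the marginal effect fall with experience and eventually turn negative. This is a natural lead-in to teaching note 5 on whether the misspecification matters for prediction, causal interpretation, or both.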

Real-paper anchors

Example 1: Altonji, Elder, and Taber (2005), Journal of Political Economy, “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools”
Example 2: Finkelstein et al. (2012), Quarterly Journal of Economics, “The Oregon Health Insurance Experiment: Evidence from the First Year”
Example 3: Bound and Krueger (1991), Journal of Labor Economics, “The Extent of Measurement Error in Longitudinal Earnings Data”
Example 4: Heckman (1979), Econometrica, “Sample Selection Bias as a Specification Error”
Example 5: Angrist and Evans (1998), American Economic Review, “Children and Their Parents' Labor Supply: Evidence from Exogenous Variation in Family Size”
Example 6: Mincer (1974), Schooling, Experience, and Earnings

Teaching notes

The cleanest way to run this is:
1. Give groups 8 minutes to diagnose all six studies.
2. Cold-call one group per example.
3. For Example 2, push students to distinguish internal from external validity.
4. For Example 3, ask whether the measurement error is likely classical or non-classical.
5. For Example 6, ask whether wrong functional form threatens causal interpretation, prediction, or both.
If you want a faster version, assign only Examples 1, 3, 4, and 5.
If you want a harder version, require students to say something about the direction of bias for Examples 1, 3, 4, and 5.


Version: Fall 2020
EC200 Econometrics and Applications

In-Class Exercise - Multiple Linear Regression

Consider a dataset on earnings in the United States. We are interested in the returns to education: how much an extra year of schooling “buys” you in terms of weekly wages (...as of 1980). You’re also worried that the estimated return to education suffers from omitted variable bias.

You estimate two equations:

$$\begin{aligned} \widehat{wage} &= 146.95 + 60.21\,educ \\ \widehat{educ} &= 5.84 + 0.075\,IQ \end{aligned}$$

Based on these results, is 60.21 an overestimate or underestimate of the returns to education? How do you know?
You estimate another equation: $\widehat{wage} = -128.89 + 42.06\,educ + 5.14\,IQ$

What is the interpretation of the coefficient on $educ$? What is the interpretation of the constant?
Now, you control for experience and age and estimate the following population regression model:

$$wage_i = \beta_0 + \beta_1 educ_i + \beta_2 IQ_i + \beta_3 exper_i + \beta_4 age_i + \beta_5 age_i^2 + u_i$$

A one-year increase in age is associated with what change in wages? (mind the squared term)
Finally, because you are worried about omitted variable bias, you include father’s and mother’s education.
1. Why might parents’ education directly affect wages?
2. Which other independent variables do you think parents’ education might affect? Explain.
3. How did controlling for parents’ education affect the returns to education? The returns to IQ?
