Measuring Energy Efficiency

Problem Set Solutions

Prof. Richard Sweeney

Economics of Energy and the Environment
Econ 3391.01
Boston College

2025-12-04

Overview

Topic: How do we measure the benefits of energy efficiency policies?

Key Concepts:

  • We can only learn by comparing energy bills of adopters to non-adopters
  • Different people benefit differently from energy efficiency and adoption is a “purposeful” decision
  • Different comparisons yield different answers. Some have a causal interpretation, others do not.

Methods Covered:

  1. Average Treatment Effect (ATE)
  2. Voluntary Opt-in & Selection
  3. Cross-sectional Comparison
  4. Difference-in-Differences
  5. Randomized Experiments
  6. Intent-to-Treat (ITT)
  7. Local Average Treatment Effect (LATE)

Background

Home Energy Audits

What is a home energy audit?

  • Energy specialist visits your home
  • Inventories appliances, lightbulbs, etc.
  • Checks insulation, doors, windows
  • Analyzes electricity usage patterns
  • Provides recommendations for efficiency investments

Policy Goal: Reduce information barriers to energy efficiency adoption

A Unique Dataset

We have an unusual dataset with parallel universe observations:

Variables:

  • y00: Pre-period electricity spending (no audit)
  • y10: Post-period spending WITHOUT audit
  • y11: Post-period spending WITH audit
  • ey10, ey11: Expected spending
  • hassle: Cost of undergoing audit

Example: Household 1

  • Pre-period: $1,251
  • Expected without audit: $1,843
  • Expected with audit: $1,486
  • Actual without audit: $1,874
  • Actual with audit: $1,548

Key Point: We observe both potential outcomes for each household!

Challenge: In the real world, we never have parallel universe data!

Only observe:

  • y10 if household didn’t get audit
  • y11 if household did get audit
  • NOT hassle costs or expectations

But because this problem set includes this special data, we can compute the true savings for different groups of households and compare them to what various estimation methods recover.
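This limitation is the standard potential-outcomes switching equation: which outcome we see is determined by the household's audit decision \(D\).

```latex
% Observed post-period spending (D = 1 if the household got an audit)
y^{\text{obs}} = D \, y_{11} + (1 - D) \, y_{10}
```

Every estimator below is an attempt to recover \(y_{10} - y_{11}\) for some group of households despite observing only one term per household.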

Load the Data

Show code
# Load the data

audit_data <- read_csv(file.path(root,"modules", "EnergyEfficiency", "practice_problems",
                                  "old_pset","audit_data_lowercase.csv"))

# Display first few rows
head(audit_data, 4) %>%
  kable(caption = "First 4 rows of audit data") %>%
  kable_styling(font_size = 20)
First 4 rows of audit data
household y00 y10 y11 ey10 ey11 hassle
1 1251.248 1874.361 1548.779 1843.381 1486.276 99.83278
2 1418.414 2129.128 1786.575 2151.938 1830.218 70.99882
3 1549.773 2099.195 1646.176 2165.068 1648.257 111.12599
4 1255.664 1922.626 1548.693 1963.576 1570.199 101.88632

Dataset: 10,000 households

Policy Question

Central Question

By how much do home energy audits reduce household energy bills?

Important question:

  • For which households do we want to compute the causal effect of audits?
  • Some options
    • All households (ATE)
    • Only those who choose audits voluntarily (ATT)
    • Only those induced to get audits by a subsidy (LATE)

Question 1.1: True ATE

Average Treatment Effect

Question: If we could force everyone to get an audit, what would be the average savings?

Calculation: \(\text{Savings} = y_{10} - y_{11}\)

This is the Average Treatment Effect (ATE).

Show code
# Calculate true savings
audit_data <- audit_data %>%
  mutate(savings = y10 - y11)

# Calculate average treatment effect
ATE <- mean(audit_data$savings)
# cat("Average Treatment Effect (ATE): $", round(ATE, 2), sep = "")

Answer

Average savings across all households if audited: $301.96

This is the true population average treatment effect.

Distribution of Savings

Question 1.2: Voluntary Opt-in

Who Chooses Audits?

Setup:

  • Audits cost $250
  • Households have hassle costs (time, effort)
  • Households form expectations about savings

Decision Rule: Opt in if: \[\text{Expected Savings} > \$250 + \text{Hassle Cost}\]

Calculate: Net savings = \((\text{ey}_{10} - \text{ey}_{11}) - \text{hassle} - 250\)

Opt-in Analysis

Show code
# Calculate expected savings and net savings
audit_data <- audit_data %>%
  mutate(
    exp_savings = ey10 - ey11,
    net_savings = exp_savings - hassle - 250,
    D_optin = if_else(net_savings > 0, 1, 0)
  )

# Count opt-ins
optin_count <- sum(audit_data$D_optin)

# Average treatment effect by group
ate_by_group <- audit_data %>%
  group_by(D_optin) %>%
  summarise(
    n = n(),
    avg_savings = mean(savings),
    avg_exp_savings = mean(exp_savings),
    avg_net_savings = mean(net_savings),
    .groups = "drop"
  )
Treatment Effects by Opt-in Status
Group N Avg Savings Avg Expected Savings Avg Net Savings
Don't opt-in 6488 254.56 233.61 -120.35
Opt-in 3512 389.53 429.77 87.58

Key Insight: Selection

Selection into Treatment

  • 3512 households opt in voluntarily
  • Opt-in group has higher average savings ($389.53)
  • Non-opt-in group has lower average savings ($254.56)

Why? People who benefit more are more likely to opt in!

This creates selection bias in simple comparisons.

Question 1.3: Cross-sectional Comparison

Naive Comparison

What regulators typically do: Compare people who got audits to those who didn’t

Show code
# Create observed outcome variable
audit_data <- audit_data %>%
  mutate(y1obs = if_else(D_optin == 1, y11, y10))

# Calculate means by group
cross_section <- audit_data %>%
  group_by(D_optin) %>%
  summarise(
    mean_y00 = mean(y00),
    mean_y10 = mean(y10),
    mean_y11 = mean(y11),
    mean_y1obs = mean(y1obs),
    .groups = "drop"
  )

# Calculate cross-sectional estimate
est_xsection <- cross_section$mean_y1obs[1] - cross_section$mean_y1obs[2]

Cross-sectional Results

Mean Outcomes by Opt-in Status
Group Mean Y10 Mean Y11 Mean Observed Y
Don't opt-in 1979 1724 1979
Opt-in 2034 1645 1645

Simple difference: $334

True ATE: $302

Problem: Selection Bias!

The cross-sectional estimate includes both treatment effect AND pre-existing differences between groups.
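With the parallel-universe data we can make this decomposition explicit. A minimal sketch using the rounded group means from the table above (the exact data give $334 ≈ $390 − $55 up to rounding):

```r
# Naive cross-section = ATT + selection bias (rounded means from the table)
naive_diff <- 1979 - 1645  # E[y10 | no audit] - E[y11 | audit]      = 334
att        <- 2034 - 1645  # adopters' y10 minus their y11           = 389
selection  <- 1979 - 2034  # baseline gap in no-audit spending       = -55

naive_diff == att + selection  # TRUE: the naive estimate bundles both terms
```

Note that here the selection term happens to pull the naive estimate below the ATT yet leave it above the ATE, so the sign of the bias is not obvious a priori.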

Question 1.4: Difference-in-Differences

DiD Intuition

Idea: Use pre-period to control for selection

  1. Calculate change over time for each group
  2. Take the difference of these differences

\[\text{DiD} = [E(Y_{1}|D=0) - E(Y_{0}|D=0)] - [E(Y_{1}|D=1) - E(Y_{0}|D=1)]\]

(written so that a positive DiD means the audited group's spending fell relative to the comparison group, matching the savings convention used throughout)

Assumption: Parallel trends in absence of treatment

DiD Calculation

Show code
# Calculate differences over time
did_data <- audit_data %>%
  group_by(D_optin) %>%
  summarise(
    mean_y00 = mean(y00),
    mean_y1obs = mean(y1obs),
    diff = mean_y1obs - mean_y00,
    .groups = "drop"
  )

# DiD estimate
est_did <- did_data$diff[1] - did_data$diff[2]
Difference-in-Differences Calculation
Group Mean Y00 Mean Y1 Difference
Don't opt-in 1491 1979 488
Opt-in 1509 1645 135

DiD Estimate: $352
True ATT (opt-in group): $390

DiD Assessment

Improvement over Cross-section

  • Cross-section: $334
  • DiD: $352
  • True ATT: $390

DiD gets closer by controlling for time-invariant differences!

Still slightly off because households also select on time-varying characteristics: opt-in households' bills would have grown by more than the control group's even without the audit, so parallel trends fails slightly.
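The parallel-universe data lets us quantify the violation directly. A sketch using the rounded means from the tables above (exact values differ slightly by rounding):

```r
# How far off is parallel trends? (rounded means from the DiD and
# cross-section tables)
trend_control    <- 1979 - 1491  # 488: non-opt-in group's actual change
trend_treated    <- 1645 - 1509  # 136: opt-in group's actual change (with audit)
trend_treated_cf <- 2034 - 1509  # 525: opt-in group's change HAD THEY NOT been audited

did <- trend_control - trend_treated     # 352: the DiD estimate
att <- trend_treated_cf - trend_treated  # 389: the true effect on adopters
att - did                                # 37: the parallel-trends violation
```

Opt-in households' bills would have risen by $37 more than the control group's even without the audit, which is exactly the gap between the DiD estimate and the true ATT.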

Question 1.5: Randomized Experiment

The Gold Standard: RCT

Setup:

  • Randomly select 1,000 households
  • Force them to have audits (treatment group)
  • Remaining 9,000 households = control group
  • Compare outcomes

Key: Randomization eliminates selection bias!

RCT Implementation

Show code
# Randomize treatment assignment
# Note: no seed is set, so re-running draws a different randomization;
# the numbers below reflect one particular draw
audit_data <- audit_data %>%
  mutate(irand = runif(n())) %>%
  arrange(irand) %>%
  mutate(D_treat = if_else(row_number() <= 1000, 1, 0))

# Calculate means for treatment and control
rct_means <- audit_data %>%
  group_by(D_treat) %>%
  summarise(
    mean_y11 = mean(y11),
    mean_y10 = mean(y10),
    .groups = "drop"
  )

# Estimate from experiment: control group's spending without an audit
# minus treatment group's spending with an audit
est_experiment <- rct_means$mean_y10[1] - rct_means$mean_y11[2]

RCT Results

RCT: Mean Outcomes by Treatment Status
Group Mean Y11 Mean Y10
Control (no audit) 1696 1998
Treatment (forced audit) 1699 2003

Result

  • RCT estimate: $299
  • True ATE: $302

Very close! Small difference due to sampling variation.
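In a real experiment we would only see the observed outcomes and would estimate this with a difference in means, e.g. by regression. A self-contained sketch on simulated data (all names and magnitudes are made up, loosely mimicking the problem set's numbers):

```r
set.seed(1)
n <- 10000
D   <- rbinom(n, 1, 0.1)                     # roughly 1,000 treated households
y10 <- rnorm(n, mean = 2000, sd = 300)       # spending without an audit
y11 <- y10 - rnorm(n, mean = 300, sd = 50)   # audits save about $300 on average
y_obs <- ifelse(D == 1, y11, y10)            # only one outcome is ever observed

fit <- lm(y_obs ~ D)
coef(fit)["D"]  # close to -300: audits LOWER spending by about $300
```

With randomization the regression coefficient recovers the true effect up to sampling noise, just as the difference in means above does.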

Question 1.6: Free Audits (No Mandate)

Realistic Policy: Incentives, Not Mandates

Problem: Can’t actually force people to get audits!

New Setup:

  • Same 1,000 randomly selected households get free audits
  • Other 9,000 can still pay $250 for audits
  • Households choose whether to accept

Key Change: Treatment = offer of free audit, not mandatory audit

Who Takes Up Free Audits?

Show code
# Calculate net savings with RCT (free audits for treatment group)
audit_data <- audit_data %>%
  mutate(
    net_savings_rct = if_else(D_treat == 1, 
                              net_savings + 250,  # No cost for treatment
                              net_savings),       # Still costs $250 for control
    D_optin_rct = if_else(net_savings_rct > 0, 1, 0)
  )

# New opt-ins
new_optins <- sum(audit_data$D_optin_rct == 1 & audit_data$D_optin == 0)
total_optin_rct <- sum(audit_data$D_optin_rct)

Takeup

  • Original opt-ins (at $250): 3512
  • New opt-ins (with RCT): 4112
  • Screened in by free audit: 600 households

The subsidy increases participation by about 17% (600 new opt-ins relative to the original 3,512)!

Question 1.7: Intent-to-Treat

Intent-to-Treat (ITT)

Definition: Compare outcomes based on random assignment, not actual behavior

  • Treatment group = offered free audit
  • Control group = not offered free audit

Why? Preserves randomization even with imperfect compliance

Show code
# Create observed outcome under RCT
audit_data <- audit_data %>%
  mutate(y1obs_rct = if_else(D_optin_rct == 1, y11, y10))

# Calculate ITT
itt_means <- audit_data %>%
  group_by(D_treat) %>%
  summarise(mean_y1obs_rct = mean(y1obs_rct), .groups = "drop")

# ITT estimate
est_itt <- itt_means$mean_y1obs_rct[1] - itt_means$mean_y1obs_rct[2]

ITT Results

Intent-to-Treat: Mean Outcomes
Group Mean Electricity Spending
Not offered 1861
Offered free audit 1704

Much Smaller Effect!

  • ITT estimate: $157
  • True ATE: $302

Why so different? Not everyone takes the free audit!

Question 1.8: Local Average Treatment Effect

Defining Compliers

Three types of households:

  1. Always-takers: Would get audit even at $250
  2. Compliers: Only get audit when it’s free
  3. Never-takers: Don’t get audit even when free

Key insight: RCT only changes behavior of compliers!
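Because this dataset includes the latent net savings at the $250 price, the three types can be classified directly, something impossible in real data. A hedged sketch with made-up values (the `net_savings` name follows the chunks above):

```r
library(dplyr)

# Toy net savings at the $250 price (hypothetical values for illustration)
toy <- tibble(household = 1:3, net_savings = c(88, -120, -400))

toy_typed <- toy %>%
  mutate(type = case_when(
    net_savings > 0       ~ "always-taker",  # opts in even at $250
    net_savings + 250 > 0 ~ "complier",      # opts in only when the audit is free
    TRUE                  ~ "never-taker"    # declines even a free audit
  ))
toy_typed
```

The $250 threshold is what the free-audit offer removes, which is why only the middle type changes behavior.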

Identifying Compliers

Show code
# Identify compliers
audit_data <- audit_data %>%
  mutate(compliers = if_else(D_optin_rct == 1 & D_optin == 0, 1, 0))

# Compliance rates by treatment group
compliance_table <- audit_data %>%
  group_by(D_treat) %>%
  summarise(
    n = n(),
    pct_optin_rct = mean(D_optin_rct) * 100,
    pct_optin_original = mean(D_optin) * 100,
    pct_compliers = mean(compliers) * 100,
    .groups = "drop"
  )

# Complier share and LATE
comply_share <- mean(audit_data$compliers[audit_data$D_treat == 1])
est_late <- mean(audit_data$savings[audit_data$compliers == 1])

Compliance Rates

Compliance Rates by Treatment Assignment
Group N % Opt-in (RCT) % Opt-in (Original) % Compliers
Control 9000 35.0 35.0 0
Treatment 1000 95.8 35.8 60

Key Numbers

  • Compliers: 60% of treatment group
  • LATE (savings for compliers): $268
  • Compare to ATE: $302

Why is LATE < ATE?

Compliers (free audit only):

  • Lower expected savings
  • Higher hassle costs
  • On the margin

Average savings: $268

Always-takers (pay $250):

  • Higher expected savings
  • Lower hassle costs
  • Inframarginal

Average savings: $390

Households with highest benefits adopt first!

Question 1.9: The ITT-LATE Connection

The Formula

Relationship: \[\text{ITT} = \text{Compliance Rate} \times \text{LATE}\]

Intuition:

  • ITT averages over everyone (including non-compliers)
  • LATE is effect on compliers only
  • Compliance rate scales LATE down to ITT
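The formula follows directly if the offer \(Z\) changes behavior only for compliers, so always-takers and never-takers have the same spending whether or not they are offered a free audit (keeping the document's sign convention, where savings \(= y_{10} - y_{11}\)):

```latex
\begin{aligned}
\text{ITT} &= E(Y \mid Z = 0) - E(Y \mid Z = 1) \\
           &= \Pr(\text{complier}) \times
              \underbrace{E(y_{10} - y_{11} \mid \text{complier})}_{\text{LATE}}
\end{aligned}
```

Always-takers and never-takers contribute zero to the difference, leaving only the complier share times the complier effect.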

Verification

Show code
# Check ITT = compliance_rate * LATE
predicted_itt <- comply_share * est_late

The Math Checks Out!

  • Compliance rate: 0.6
  • LATE: $267.66
  • Predicted ITT: 0.6 × $267.66 = $160.59
  • Actual ITT: $156.79

Question 1.10: Policy Implications

Which Estimate Should We Use?

ATE ($302):

Use if you can:

  • Mandate participation
  • Achieve universal coverage
  • Don’t care about costs

Problem: Infeasible and ignores heterogeneity

LATE ($268):

Use for:

  • Voluntary programs
  • Subsidies/incentives
  • Marginal participants

Advantage: Estimates effect on those you’ll actually influence

Key Policy Insights

Main Takeaways

  1. Heterogeneous treatment effects: Benefits vary across households

  2. Selection matters: Those who benefit most adopt first

  3. Declining returns: Each additional adopter has lower savings

  4. Cost-benefit analysis: Optimal policy stops before 100% adoption

  5. Target the margin: Focus on compliers, not always-takers

Summary

Comparison of All Estimates

Summary of Treatment Effect Estimates
Method Question Estimate Bias
True ATE 1.1 302 — (benchmark)
Cross-sectional 1.3 334 Selection
Difference-in-Differences 1.4 352 Time-varying selection
RCT - Forced 1.5 299 None (forced)
Intent-to-Treat 1.7 157 Compliance
LATE 1.8 268 External validity

Visual Comparison

Lessons for Policy Evaluation

  1. Cross-sectional comparisons are biased by selection

  2. DiD helps but requires parallel trends assumption

  3. Randomization eliminates selection bias (when compliance is perfect)

  4. ITT preserves randomization with imperfect compliance

  5. LATE identifies effect on marginal participants

  6. Choose estimand based on policy: What question are you trying to answer?

Final Thoughts

The Big Picture

Energy efficiency programs face fundamental challenges:

  • Information asymmetry: Households know their benefits better than policymakers
  • Adverse selection: Programs attract those with highest private benefits
  • Diminishing returns: Expanding participation reduces average benefits
  • Optimal policy ≠ universal adoption

Good policy evaluation requires understanding who is affected by your intervention!

Questions?