Short Rate Models (Part 8: Long-Run Expectations II)

2026-03-17

1 Introduction

The theory notebook introduced the survey-augmented Gaussian state-space system. This notebook turns that structure into a reduced empirical workflow. The key design choice is that the notebook never fetches or reshapes data ad hoc: the data layer lives in alphaforge, the model layer lives in short-rate-models, and the notebook only orchestrates the two.

The goal is not a paper-close replication of Kim and Orphanides or Kim and Wright. The goal is a public-data replication that keeps the same logic: a monthly Treasury yield panel, a survey-based expectation anchor, and a benchmark term-premium series from the Federal Reserve Board.

Code
import os
import sys
from pathlib import Path

import numpy as np
import pandas as pd


def locate_workspace() -> Path:
    cwd = Path.cwd().resolve()
    for candidate in [cwd, *cwd.parents]:
        if (candidate / 'alphaforge').exists() and (candidate / 'short-rate-models').exists():
            return candidate
    raise RuntimeError('Could not locate the steveya workspace from the current working directory.')


WORKSPACE = locate_workspace()
sys.path.insert(0, str(WORKSPACE / 'alphaforge'))
sys.path.insert(0, str(WORKSPACE / 'short-rate-models'))

from alphaforge import (
    DataContext,
    DuckDBParquetStore,
    FREDDataSource,
    FRBTermStructureBenchmarkSource,
    PhiladelphiaSPFMeanLevelSource,
    TradingCalendar,
    build_kim_orphanides_dataset,
)
from short_rate_models import KimOrphanidesSurveyModel

2 Data Context

The alphaforge context wires together three official sources.

  1. FRED for the Treasury yield panel.
  2. The Philadelphia Fed SPF historical workbook for survey anchors.
  3. The Federal Reserve Board three-factor term-structure release for an external benchmark term-premium series.

The resulting dataset object contains a monthly yield panel in decimals, a monthly survey anchor panel, a short-rate proxy, and the benchmark term-premium panel.
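Before fitting anything, it is worth being explicit about the conventions the yield panel is supposed to satisfy. The sketch below is a minimal, hypothetical check (the helper `validate_yield_panel` is not part of alphaforge) illustrating the two conventions stated above: a date-indexed panel with yields quoted in decimals, so a 4.5% ten-year yield is stored as 0.045.

```python
import numpy as np
import pandas as pd


def validate_yield_panel(yields: pd.DataFrame) -> None:
    """Raise if the panel violates the date-indexed, decimal-units conventions."""
    if not isinstance(yields.index, pd.DatetimeIndex):
        raise TypeError('yield panel must be indexed by dates')
    # Nominal Treasury yields in decimals should sit well below 0.25;
    # values above that almost certainly mean the panel is in percent.
    if np.nanmax(np.abs(yields.to_numpy(dtype=float))) > 0.25:
        raise ValueError('yields look like percent; expected decimals')


# Toy panel: two maturities over three month-ends, quoted in decimals.
toy = pd.DataFrame(
    {'y_2y': [0.041, 0.042, 0.040], 'y_10y': [0.045, 0.046, 0.044]},
    index=pd.to_datetime(['2024-01-31', '2024-02-29', '2024-03-31']),
)
validate_yield_panel(toy)  # passes silently
```

The same check applied to `dataset.yields` would catch the most common wiring mistake, a percent-quoted panel slipping through the data layer.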

Code
fred_api_key = os.environ.get('FRED_API_KEY')
if not fred_api_key:
    raise RuntimeError('Set FRED_API_KEY before running this notebook.')

ctx = DataContext(
    sources={
        'fred': FREDDataSource(api_key=fred_api_key),
        'philadelphia_spf': PhiladelphiaSPFMeanLevelSource(),
        'frb_term_structure': FRBTermStructureBenchmarkSource(),
    },
    calendars={'XNYS': TradingCalendar('XNYS', tz='UTC')},
    store=DuckDBParquetStore(root=WORKSPACE / '.alphaforge_store' / 'short_rate_models'),
)

dataset = build_kim_orphanides_dataset(
    ctx,
    start=pd.Timestamp('1990-01-01', tz='UTC'),
    end=pd.Timestamp('2024-12-31', tz='UTC'),
)

if dataset.surveys is None or dataset.surveys.empty:
    raise RuntimeError('The SPF loader did not return usable survey anchors for this date range.')

dataset.metadata
Code
dataset.yields.tail(), dataset.surveys.tail(), None if dataset.benchmark is None else dataset.benchmark.tail()

3 Reduced Fit

The package fit is intentionally lightweight. It extracts three latent factors from the yield panel with principal components, estimates the physical transition with a VAR(1), regresses yields on the inferred states, and then uses the survey panel inside the Kalman filtering and smoothing step.

That is not the full maximum-likelihood estimation problem in the papers, but it gives us a coherent reduced state-space system whose outputs can be inspected and compared to the benchmark data.
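The first two steps of that pipeline can be sketched in a few lines of NumPy on a synthetic panel. This is an illustration of the PCA-VAR technique described above, not the short_rate_models implementation: the factor count, the demeaning, and the least-squares conventions are all assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, K = 240, 6, 3          # months, maturities, latent factors

# Synthetic monthly yield panel (T x M), in decimals.
panel = 0.04 + 0.01 * rng.standard_normal((T, M)).cumsum(axis=0) / np.sqrt(T)

# 1. Principal components: demean, then keep the top-K right singular vectors.
mean = panel.mean(axis=0)
centered = panel - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
loadings = vt[:K].T                     # (M x K) maturity loadings
factors = centered @ loadings           # (T x K) latent states

# 2. Physical transition: VAR(1) on the factors, x_t ~ x_{t-1} @ phi.
X, Y = factors[:-1], factors[1:]
phi, *_ = np.linalg.lstsq(X, Y, rcond=None)       # (K x K) transition matrix

# 3. Measurement equation: regress yields back on the inferred states.
B, *_ = np.linalg.lstsq(factors, centered, rcond=None)  # (K x M) loadings
fitted = factors @ B + mean
```

The survey block then enters as extra measurement equations in the Kalman filtering and smoothing step, which the package call below performs.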

Code
model, fit = KimOrphanidesSurveyModel.fit(
    yields=dataset.yields,
    surveys=dataset.surveys,
)

fit['smoothed_states'].tail()
Code
last_state = fit['smoothed_states'].iloc[-1].to_numpy(dtype=float)
decomposition = model.term_premium_decomposition(
    maturities=model.yield_maturities,
    state=last_state,
)

decomposition_frame = pd.DataFrame(
    {
        'maturity_years': decomposition['maturities'],
        'yield': decomposition['yield'],
        'expectations_component': decomposition['expectations'],
        'term_premium': decomposition['term_premium'],
    }
)
decomposition_frame
Code
comparison = None
if dataset.benchmark is not None and not dataset.benchmark.empty:
    # Pick the first benchmark column whose name mentions the ten-year maturity.
    benchmark_cols = [c for c in dataset.benchmark.columns if '10' in str(c)]
    if benchmark_cols:
        benchmark_series = dataset.benchmark[benchmark_cols[0]].dropna()
        # Model-implied ten-year term premium evaluated at each smoothed state.
        model_series = fit['smoothed_states'].apply(
            lambda row: model.ten_year_term_premium(state=row.to_numpy(dtype=float)),
            axis=1,
        )
        # Align the two series on their common dates.
        comparison = pd.concat(
            [model_series.rename('model_10y_term_premium'), benchmark_series.rename('frb_10y_term_premium')],
            axis=1,
        ).dropna()
comparison.tail() if comparison is not None else None

4 Findings

The empirical object to focus on is the decomposition rather than the raw fit statistic. If the survey anchor matters, the expectations component should move more smoothly than the raw yield series, and the resulting term-premium estimate should not merely mirror every yield change one-for-one. The Federal Reserve benchmark is useful here because it disciplines the magnitude and sign of the long-end premium.
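That smoothness claim can be made operational with a one-line diagnostic: compare the volatility of monthly changes in the expectations component against monthly changes in the raw yield. The sketch below runs it on synthetic stand-ins; in the notebook the inputs would be the fitted expectations series and `dataset.yields`, and the diagnostic itself is an assumption of this section, not a package function.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 240

# Stand-ins: a slow-moving expectations path plus a noisier premium,
# so the raw yield inherits the premium's month-to-month volatility.
expectations = 0.03 + 0.002 * rng.standard_normal(n).cumsum() / np.sqrt(n)
premium = 0.01 + 0.004 * rng.standard_normal(n)
yields = expectations + premium


def change_vol(x: np.ndarray) -> float:
    """Standard deviation of one-period changes."""
    return float(np.std(np.diff(x)))


anchored = change_vol(expectations) < change_vol(yields)
```

If `anchored` fails on the fitted series, the survey block is not actually restraining the expectations component.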

In a reduced public-data replication, the exact level will differ from the official series. What matters is whether the model captures the same broad separation between expectations and premia, and whether the survey block stabilizes the long-run anchor relative to a yield-only Gaussian system.
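The level-versus-shape distinction suggests comparing the two term-premium series after removing their mean gap: a persistent level offset is expected, a low correlation is not. A minimal sketch on synthetic stand-ins (in the notebook, the `comparison` frame built earlier supplies the two columns):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
common = rng.standard_normal(120).cumsum() * 0.001            # shared shape
model_tp = 0.010 + common + 0.0002 * rng.standard_normal(120)
frb_tp = 0.015 + common + 0.0002 * rng.standard_normal(120)   # higher level

pair = pd.DataFrame({'model': model_tp, 'frb': frb_tp})
level_gap = (pair['model'] - pair['frb']).mean()   # persistent level offset
shape_corr = pair['model'].corr(pair['frb'])       # comoving shapes
```

A sizable `level_gap` alongside a high `shape_corr` is exactly the pattern a reduced public-data replication should aim for.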

5 Limitations

This notebook uses a public-data approximation. The survey panel is built from the historical SPF workbook and mapped into a reduced set of horizons. The transition law is estimated with a PCA-VAR calibration rather than the full likelihood problem. The notebook is therefore best read as a faithful implementation of the papers' logic rather than a claim of exact numerical replication.