Short Rate Models (Part 8: Long-Run Expectations II)
2026-03-17
1 Introduction
The theory notebook introduced the survey-augmented Gaussian state-space system. This notebook turns that structure into a reduced empirical workflow. The central design choice is that the notebook never fetches or reshapes data ad hoc: the data layer lives in alphaforge, the model layer lives in short-rate-models, and the notebook only orchestrates the two.
The goal is not a paper-close replication of Kim and Orphanides or Kim and Wright, but a public-data replication that preserves the same logic: a monthly Treasury yield panel, a survey-based expectation anchor, and a benchmark term-premium series from the Federal Reserve Board.
Code
import os
import sys
from pathlib import Path

import numpy as np
import pandas as pd


def locate_workspace() -> Path:
    cwd = Path.cwd().resolve()
    for candidate in [cwd, *cwd.parents]:
        if (candidate / 'alphaforge').exists() and (candidate / 'short-rate-models').exists():
            return candidate
    raise RuntimeError('Could not locate the steveya workspace from the current working directory.')


WORKSPACE = locate_workspace()
sys.path.insert(0, str(WORKSPACE / 'alphaforge'))
sys.path.insert(0, str(WORKSPACE / 'short-rate-models'))

from alphaforge import (
    DataContext,
    DuckDBParquetStore,
    FREDDataSource,
    FRBTermStructureBenchmarkSource,
    PhiladelphiaSPFMeanLevelSource,
    TradingCalendar,
    build_kim_orphanides_dataset,
)
from short_rate_models import KimOrphanidesSurveyModel
2 Data Context
The alphaforge context wires together three official sources.
FRED for the Treasury yield panel.
The Philadelphia Fed SPF historical workbook for survey anchors.
The Federal Reserve Board three-factor term-structure release for an external benchmark term-premium series.
The resulting dataset object contains a monthly yield panel in decimals, a monthly survey anchor panel, a short-rate proxy, and the benchmark term-premium panel.
Code
fred_api_key = os.environ.get('FRED_API_KEY')
if not fred_api_key:
    raise RuntimeError('Set FRED_API_KEY before running this notebook.')

ctx = DataContext(
    sources={
        'fred': FREDDataSource(api_key=fred_api_key),
        'philadelphia_spf': PhiladelphiaSPFMeanLevelSource(),
        'frb_term_structure': FRBTermStructureBenchmarkSource(),
    },
    calendars={'XNYS': TradingCalendar('XNYS', tz='UTC')},
    store=DuckDBParquetStore(root=WORKSPACE / '.alphaforge_store' / 'short_rate_models'),
)
dataset = build_kim_orphanides_dataset(
    ctx,
    start=pd.Timestamp('1990-01-01', tz='UTC'),
    end=pd.Timestamp('2024-12-31', tz='UTC'),
)
if dataset.surveys is None or dataset.surveys.empty:
    raise RuntimeError('The SPF loader did not return usable survey anchors for this date range.')
dataset.metadata
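Because the yield panel is stored in decimals, a cheap guard against a percentage-point panel slipping through is worth having before any fitting. A minimal sketch on toy data; the helper name and bounds are my own, not part of alphaforge:

```python
import pandas as pd

# Toy monthly yield panel in decimals (hypothetical values, not real data).
idx = pd.to_datetime(['2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30'])
toy_yields = pd.DataFrame(
    {'y_2y': [0.014, 0.012, 0.003, 0.002],
     'y_10y': [0.018, 0.015, 0.007, 0.006]},
    index=idx,
)

def assert_decimal_units(panel: pd.DataFrame, lo: float = -0.01, hi: float = 0.25) -> None:
    """Fail fast if a yield panel looks like percentage points rather than decimals."""
    vals = panel.to_numpy()
    if ((vals < lo) | (vals > hi)).any():
        raise ValueError('Yield panel does not look like decimal units.')

assert_decimal_units(toy_yields)  # passes for a plausible decimal panel
```

The same check applied to a panel quoted in percentage points (values near 2.0 rather than 0.02) raises immediately, which is much easier to debug than a term premium that is off by a factor of one hundred.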
3 Model Fit
The packaged fit is intentionally lightweight. It extracts three latent factors from the yield panel with principal components, estimates the physical transition with a VAR(1), regresses yields on the inferred states, and then uses the survey panel inside the Kalman filtering and smoothing step.
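The three calibration steps can be sketched on a simulated panel. This is a schematic of the PCA-plus-VAR(1) logic only, not the short-rate-models implementation; every name and dimension here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the monthly yield panel: T months by N maturities, in decimals.
T, N = 240, 6
level = 0.03 + 0.001 * np.cumsum(rng.normal(size=T))[:, None]
Y = level + 0.002 * rng.normal(size=(T, N))

# Step 1: three principal components of the demeaned panel act as latent states.
Y_c = Y - Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Y_c, full_matrices=False)
X = Y_c @ Vt[:3].T                       # (T, 3) factor scores

# Step 2: physical transition X_t = c + Phi X_{t-1} + e_t, estimated by least squares.
Z = np.column_stack([np.ones(T - 1), X[:-1]])
coef, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
c, Phi = coef[0], coef[1:].T             # intercept (3,), transition matrix (3, 3)

# Step 3: measurement loadings from a regression of yields on the states.
W = np.column_stack([np.ones(T), X])
B, *_ = np.linalg.lstsq(W, Y, rcond=None)
a, b = B[0], B[1:].T                     # y_t is approximately a + b @ X_t
```

Steps 2 and 3 together define the transition and measurement equations of a reduced linear-Gaussian system; the survey panel then enters only at the filtering stage.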
That is not the full maximum-likelihood estimation problem in the papers, but it gives us a coherent reduced state-space system whose outputs can be inspected and compared to the benchmark data.
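The survey block enters through the observation equation: at survey dates the measurement vector stacks the yields and the survey anchor, each with its own noise variance. One predict/update step under that stacked-observation convention is just generic linear-Gaussian algebra; the dimensions and matrices below are hypothetical:

```python
import numpy as np

def kalman_step(x, P, y_obs, Phi, Q, H, R):
    """One Kalman step where y_obs stacks yield rows and a survey-anchor row."""
    # Predict under x_t = Phi x_{t-1} + e_t, e_t ~ N(0, Q).
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q
    # Update against y_obs = H x_t + v_t, v_t ~ N(0, R); survey anchors simply
    # contribute extra rows to H and extra diagonal entries to R.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y_obs - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical dimensions: 3 states, two yield rows plus one survey row.
H = np.vstack([np.eye(3)[:2], np.ones((1, 3)) / 3.0])
x, P = kalman_step(
    x=np.zeros(3), P=np.eye(3), y_obs=np.array([0.02, 0.03, 0.025]),
    Phi=0.9 * np.eye(3), Q=0.01 * np.eye(3), H=H, R=0.001 * np.eye(3),
)
```

A tight survey noise variance in R pulls the filtered state toward the anchor; a loose one lets the yield rows dominate, which is exactly the channel through which the survey block stabilizes long-run expectations.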
Code
model, fit = KimOrphanidesSurveyModel.fit(
    yields=dataset.yields,
    surveys=dataset.surveys,
)
fit['smoothed_states'].tail()
Code
comparison = None
if dataset.benchmark is not None and not dataset.benchmark.empty:
    benchmark_cols = [c for c in dataset.benchmark.columns if '10' in str(c)]
    if benchmark_cols:
        benchmark_series = dataset.benchmark[benchmark_cols[0]].dropna()
        model_series = fit['smoothed_states'].apply(
            lambda row: model.ten_year_term_premium(state=row.to_numpy(dtype=float)),
            axis=1,
        )
        comparison = pd.concat(
            [
                model_series.rename('model_10y_term_premium'),
                benchmark_series.rename('frb_10y_term_premium'),
            ],
            axis=1,
        ).dropna()
comparison.tail() if comparison is not None else None
4 Findings
The empirical object to focus on is the decomposition rather than the raw fit statistic. If the survey anchor matters, the expectations component should move more smoothly than the raw yield series, and the resulting term-premium estimate should not merely mirror every yield change one-for-one. The Federal Reserve benchmark is useful here because it disciplines the magnitude and sign of the long-end premium.
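That smoothness claim can be made operational: compare the standard deviation of month-over-month changes in the expectations component against the raw yield. A toy illustration on simulated series; the decomposition below is fabricated for the check, not model output:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 120

# Simulated decomposition: a slow-moving expectations anchor plus a noisier premium.
expectations = 0.02 + 0.0005 * np.cumsum(rng.normal(size=T))
term_premium = 0.005 + 0.003 * rng.normal(size=T)
yield_10y = expectations + term_premium

def roughness(x: np.ndarray) -> float:
    """Standard deviation of first differences: smaller means smoother."""
    return float(np.diff(x).std())

# If the survey anchor is doing its job, the expectations path is the smoother one.
assert roughness(expectations) < roughness(yield_10y)
```

Running the same two-line diagnostic on the fitted decomposition (and on the FRB benchmark series) gives a quick pass/fail read on whether the survey block is actually binding.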
In a reduced public-data replication, the exact level will differ from the official series. What matters is whether the model captures the same broad separation between expectations and premia, and whether the survey block stabilizes the long-run anchor relative to a yield-only Gaussian system.
5 Limitations
This notebook uses a public-data approximation. The survey panel is built from the historical SPF workbook and mapped into a reduced set of horizons. The transition law is estimated with a PCA-VAR calibration rather than the full likelihood problem. The notebook is therefore best read as a faithful implementation of the paper’s logic rather than a claim of exact numerical replication.