Short Rate Models (Part 11: Local Momentum and Mean Reversion I)

2026-03-17

1 Introduction

The previous theory notebooks stayed mostly inside linear Gaussian affine systems. Duan’s local-momentum paper is harder. The difficulty is not that the model is conceptually opaque. The difficulty is that the paper compresses several nontrivial manipulations into a small number of displayed formulas. This notebook therefore has a different objective from the other theory posts. It reconstructs those formulas slowly enough that a mathematically mature undergraduate can follow the logic without already knowing the term-structure literature.

The derivation naturally falls into three blocks. First, we derive the local-momentum autoregression from the ordinary mean-reverting benchmark and rewrite it as an augmented-state Markov system. Second, we extend the local component to the stochastic-central-tendency system used in the term-structure model. Third, we derive the bond-pricing recursions once the state has been enlarged.

Code
import numpy as np
import pandas as pd

2 Benchmark Mean Reversion

Start from the ordinary autoregression

X_t = a + \phi X_{t-1} + \sigma_x \varepsilon_t

with \varepsilon_t \sim N(0,1). When |\phi| < 1, the process is stable. The form used throughout the short-rate literature is obtained by writing the same equation in mean-reversion notation. Define

\kappa_x = 1 - \phi

and, provided \kappa_x \neq 0, define the long-run mean by

\bar \mu = \frac{a}{1 - \phi} = \frac{a}{\kappa_x}

Then

X_t - X_{t-1} = a - (1-\phi)X_{t-1} + \sigma_x \varepsilon_t = \kappa_x(\bar \mu - X_{t-1}) + \sigma_x \varepsilon_t

so the benchmark law becomes

\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \sigma_x \varepsilon_t

The sign pattern is immediate. If X_{t-1} is above \bar \mu, the deterministic drift is negative. If it is below \bar \mu, the deterministic drift is positive. That is global mean reversion in its simplest form.
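A quick simulation makes the pull toward \bar \mu concrete. The parameter values below are illustrative, not taken from the paper: starting well above the long-run mean, the sample average settles near \bar \mu.

Code
```python
import numpy as np

# Illustrative parameters (not from the paper).
kappa_x, mu_bar, sigma_x = 0.10, 0.03, 0.002
rng = np.random.default_rng(0)

X = np.empty(50_000)
X[0] = 0.10  # start well above the long-run mean
for t in range(1, len(X)):
    X[t] = X[t - 1] + kappa_x * (mu_bar - X[t - 1]) + sigma_x * rng.standard_normal()

# After a burn-in, the sample mean should sit near mu_bar.
print(X[1000:].mean())
```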

3 Local-Momentum Autoregression

Duan augments the benchmark law with a comparison between the current level and a weighted moving average of the recent past. Write

\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \omega(\bar X_{t-1\mid n} - X_{t-1}) + \sigma_x \varepsilon_t

where

\bar X_{t-1\mid n} = \sum_{j=1}^{n} b_j X_{t-j} \qquad b_j \ge 0 \qquad \sum_{j=1}^{n} b_j = 1

The moving-average term is the new object. It compares the current level to a local historical average rather than to the global mean.

To see the implied autoregression, expand the right-hand side:

X_t = X_{t-1} + \kappa_x(\bar \mu - X_{t-1}) + \omega\left(\sum_{j=1}^{n} b_j X_{t-j} - X_{t-1}\right) + \sigma_x \varepsilon_t

Collect the X_{t-1} terms:

X_t = \kappa_x \bar \mu + \bigl(1 - \kappa_x - \omega\bigr)X_{t-1} + \omega \sum_{j=1}^{n} b_j X_{t-j} + \sigma_x \varepsilon_t

This is the first key derivation. The model is not mysterious. It is an autoregression whose coefficients are restricted by the long-run mean-reversion parameter \kappa_x, the local-momentum parameter \omega, and the moving-average weights b_j.
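The algebra is easy to check numerically. With hypothetical weights b_j and hypothetical lag values, the drift form and the collected autoregressive form produce the same X_t (up to the shock, which is omitted here):

Code
```python
import numpy as np

rng = np.random.default_rng(1)
kappa_x, omega, mu_bar = 0.08, 0.20, 0.03   # illustrative values
b = np.array([0.4, 0.3, 0.2, 0.1])          # hypothetical weights summing to one
lags = rng.normal(0.03, 0.01, size=4)       # X_{t-1}, ..., X_{t-n}

# Mean-reversion-plus-momentum form of the deterministic part.
drift_form = lags[0] + kappa_x * (mu_bar - lags[0]) + omega * (b @ lags - lags[0])
# Collected autoregressive form.
ar_form = kappa_x * mu_bar + (1 - kappa_x - omega) * lags[0] + omega * (b @ lags)

print(drift_form, ar_form)  # the two forms should agree
```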

4 Companion-State Representation

When the weighting window is finite, the model can be written as a finite-dimensional Markov system by stacking the lags. Define the companion state

s_t = (X_t, X_{t-1}, \ldots, X_{t-n+1})^\top

Then the first component evolves according to the restricted autoregression above, while the remaining components just shift the lag stack forward:

s_t = c + A s_{t-1} + \Sigma \eta_t

with

c = (\kappa_x \bar \mu, 0, \ldots, 0)^\top

and the first row of A given by the coefficients on the lagged levels. Concretely,

A_{1,1} = 1 - \kappa_x - \omega + \omega b_1 \qquad A_{1,j} = \omega b_j \quad \text{for } j = 2, \ldots, n

and the subdiagonal of A is filled with ones so that the lags shift in the usual companion-matrix way.

This derivation matters because it turns the apparently path-dependent model into an ordinary linear state transition once the state has been enlarged. The history dependence has not disappeared. It has been absorbed into additional state coordinates.
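A sketch of the companion construction, with made-up values for \kappa_x, \omega, and the weights b_j; for this particular choice the spectral radius of A confirms stability.

Code
```python
import numpy as np

kappa_x, omega = 0.08, 0.20             # illustrative values
b = np.array([0.4, 0.3, 0.2, 0.1])      # weights on X_{t-1}, ..., X_{t-n}
n = len(b)

A = np.zeros((n, n))
A[0, :] = omega * b                     # first row: omega * b_j on each lag
A[0, 0] += 1 - kappa_x - omega          # plus the mean-reversion coefficient on X_{t-1}
A[1:, :-1] = np.eye(n - 1)              # subdiagonal of ones shifts the lag stack

print(np.abs(np.linalg.eigvals(A)).max())  # should be below one for stability
```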

5 Exponential Weights and Recursive Momentum

A particularly convenient choice is exponentially decaying weights. Define the local average recursively by

m_t = \lambda X_t + (1-\lambda)m_{t-1}

with 0 < \lambda \le 1. The local-momentum law becomes

\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \omega(m_{t-1} - X_{t-1}) + \sigma_x \varepsilon_t

Equivalently,

X_t = \kappa_x \bar \mu + (1 - \kappa_x - \omega)X_{t-1} + \omega m_{t-1} + \sigma_x \varepsilon_t

Now substitute this into the recursion for m_t:

m_t = \lambda X_t + (1-\lambda)m_{t-1}

= \lambda \kappa_x \bar \mu + \lambda(1-\kappa_x-\omega)X_{t-1} + \bigl(1 - \lambda + \lambda \omega\bigr)m_{t-1} + \lambda \sigma_x \varepsilon_t

So the pair (X_t, m_t) evolves linearly:

\begin{pmatrix} X_t \\ m_t \end{pmatrix} = \begin{pmatrix} \kappa_x \bar \mu \\ \lambda \kappa_x \bar \mu \end{pmatrix} + \begin{pmatrix} 1-\kappa_x-\omega & \omega \\ \lambda(1-\kappa_x-\omega) & 1-\lambda+\lambda\omega \end{pmatrix} \begin{pmatrix} X_{t-1} \\ m_{t-1} \end{pmatrix} + \begin{pmatrix} \sigma_x \\ \lambda\sigma_x \end{pmatrix} \varepsilon_t

This is the second key derivation. Once the weights are recursive, the local-momentum model returns to a low-dimensional Markov form.

Code
# Numerical check: a stable parameter choice should keep the eigenvalues
# of the two-state transition matrix inside the unit circle.
kappa_x, omega, lam = 0.08, 0.20, 0.25
A = np.array([
    [1 - kappa_x - omega, omega],
    [lam * (1 - kappa_x - omega), 1 - lam + lam * omega],
], dtype=float)
np.linalg.eigvals(A)

6 Unconditional Mean and Stability

Write the augmented state as s_t = c + A s_{t-1} + \Sigma \eta_t. If the spectral radius of A is strictly smaller than one, the process has a finite unconditional mean

\mu_s = (I - A)^{-1}c

For the two-state exponential-weight system, solving this explicitly shows why the model preserves global mean reversion. Let \mu_X = \mathbb{E}[X_t] and \mu_m = \mathbb{E}[m_t]. Taking expectations in the two equations gives

\mu_X = \kappa_x \bar \mu + (1-\kappa_x-\omega)\mu_X + \omega \mu_m

and

\mu_m = \lambda \mu_X + (1-\lambda)\mu_m

The second equation implies \mu_m = \mu_X. Substitute this into the first equation to obtain

\mu_X = \kappa_x \bar \mu + (1-\kappa_x)\mu_X

hence

\mu_X = \bar \mu \qquad \mu_m = \bar \mu

So the local term changes the medium-horizon dynamics without shifting the long-run mean, provided the stability condition is respected.
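The same conclusion can be confirmed numerically: for the two-state system, (I - A)^{-1}c returns \bar \mu in both coordinates. Parameter values are illustrative.

Code
```python
import numpy as np

kappa_x, omega, lam, mu_bar = 0.08, 0.20, 0.25, 0.03   # illustrative values
A = np.array([
    [1 - kappa_x - omega,         omega],
    [lam * (1 - kappa_x - omega), 1 - lam + lam * omega],
])
c = np.array([kappa_x * mu_bar, lam * kappa_x * mu_bar])

# Unconditional mean of the augmented state: both components should equal mu_bar.
mu_z = np.linalg.solve(np.eye(2) - A, c)
print(mu_z)
```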

7 Momentum-Preserving and Momentum-Building Cases

The sign of \omega determines how the local term behaves. Suppose the current rate is below its recent average so that m_{t-1} - X_{t-1} > 0.

If \omega > 0, the local term is positive and pulls the process back up toward its recent average. When earlier increases have lifted that local average, the pull toward the elevated average offsets part of the global drift back to \bar \mu, so the recent upward episode tends to persist. This is the momentum-preserving case.

If \omega < 0, the local term reverses sign and pushes the process away from its recent average, so a dip below the local average is amplified into further declines. This is the momentum-building case. The process is still globally mean-reverting because of the \kappa_x(\bar \mu - X_{t-1}) term, but it can display extended directional phases locally.

This is the intuition behind the paper’s central claim: global mean reversion and local directional persistence are not contradictory once the drift is allowed to depend on recent history as well as the global mean.
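A noise-free one-step example (all numbers made up) shows the two cases side by side. The rate has just dipped below a local average that earlier increases left elevated above the global mean:

Code
```python
import numpy as np

kappa_x, mu_bar = 0.08, 0.03
X_prev, m_prev = 0.035, 0.040   # rate just below an elevated local average

def drift(omega):
    # Deterministic part of Delta X_t under the local-momentum law.
    return kappa_x * (mu_bar - X_prev) + omega * (m_prev - X_prev)

print(drift(+0.5))  # positive omega: pulled back up toward the local average
print(drift(-0.5))  # negative omega: pushed further down, extending the dip
```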

8 Stochastic Central Tendency and the Three-Factor Construction

The term-structure model in the paper adds a stochastic central tendency. Denote that slow-moving mean by \mu_t and write

\mu_t = c_\mu + \rho_\mu \mu_{t-1} + \sigma_\mu \eta_t^{(\mu)}

The local rate component then reverts toward the current central tendency rather than a fixed scalar mean:

X_t = c_x + \rho_x X_{t-1} + \beta_m m_{t-1} + \beta_\mu \mu_{t-1} + \sigma_x \eta_t^{(x)}

and the recursive local average evolves as before,

m_t = \lambda X_t + (1-\lambda)m_{t-1}

A local-variation factor v_t can then be added to the short rate,

r_t = X_t + v_t

with its own linear dynamics. At that point the state vector contains the slow central-tendency factor, the local level, the local momentum summary, and the local-variation factor. In companion form the entire system again becomes

s_t = c + A s_{t-1} + \Sigma \eta_t

The notation is heavier, but the idea is the same as before: once the history term is summarized by additional states, the dynamics are linear again.
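As a sketch of the enlarged system, the block below assembles A, c, and the short-rate loading \delta_1 for the state ordering (\mu_t, X_t, m_t, v_t). All parameter values are invented, and the v_t dynamics are assumed to be a simple AR(1), which the text above leaves unspecified:

Code
```python
import numpy as np

# Invented parameters; v_t is assumed AR(1) for illustration.
rho_mu, rho_x, rho_v = 0.99, 0.72, 0.50
beta_m, beta_mu = 0.20, 0.08
lam = 0.25
c_mu, c_x = 0.0003, 0.0024

# State ordering: s_t = (mu_t, X_t, m_t, v_t). The m_t row comes from
# substituting the X_t equation into m_t = lam * X_t + (1 - lam) * m_{t-1}.
A = np.array([
    [rho_mu,        0.0,         0.0,                    0.0],
    [beta_mu,       rho_x,       beta_m,                 0.0],
    [lam * beta_mu, lam * rho_x, 1 - lam + lam * beta_m, 0.0],
    [0.0,           0.0,         0.0,                    rho_v],
])
c = np.array([c_mu, c_x, lam * c_x, 0.0])
delta_1 = np.array([0.0, 1.0, 1.0, 0.0])   # r_t = X_t + v_t

print(np.abs(np.linalg.eigvals(A)).max())  # should be below one
```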

9 Covariance Structure

Given the stacked linear state system

s_t = c + A s_{t-1} + \Sigma \eta_t

with independent standard-normal shocks \eta_t, the one-step conditional covariance is

\operatorname{Var}_{t-1}(s_t) = \Sigma \Sigma^\top

and the h-step conditional mean is

\mathbb{E}_t[s_{t+h}] = A^h s_t + \sum_{j=0}^{h-1} A^j c

while the h-step conditional covariance satisfies the recursion

V_h = A V_{h-1} A^\top + \Sigma \Sigma^\top \qquad V_0 = 0

These are the formulas needed to connect the local-momentum transition law to term-structure pricing and to empirical filtering.
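Both h-step formulas are easy to verify against their closed forms. The matrices below are illustrative:

Code
```python
import numpy as np

A = np.array([[0.72, 0.20], [0.18, 0.80]])       # illustrative transition
Sigma = np.array([[0.002, 0.0], [0.0005, 0.001]])
c = np.array([0.0024, 0.0006])
s_t = np.array([0.035, 0.032])
h = 5

# Iterate the one-step mean and the covariance recursion h times...
mean = s_t.copy()
V = np.zeros((2, 2))
for _ in range(h):
    mean = c + A @ mean
    V = A @ V @ A.T + Sigma @ Sigma.T

# ...and compare with the closed forms A^h s_t + sum_j A^j c
# and sum_j A^j Sigma Sigma' (A^j)'.
mean_closed = np.linalg.matrix_power(A, h) @ s_t + sum(
    np.linalg.matrix_power(A, j) @ c for j in range(h)
)
V_closed = sum(
    np.linalg.matrix_power(A, j) @ Sigma @ Sigma.T @ np.linalg.matrix_power(A, j).T
    for j in range(h)
)
print(np.allclose(mean, mean_closed), np.allclose(V, V_closed))
```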

10 Bond Pricing Recursions

Once the state is written in linear Gaussian form under the pricing measure,

s_{t+1} = c_Q + A_Q s_t + \Sigma_Q \eta_{t+1}^Q

and the short rate is affine in the state,

r_t = \delta_0 + \delta_1^\top s_t

zero-coupon bond prices retain the exponential-affine form

P_t^{(n)} = \exp\bigl(A_n + B_n^\top s_t\bigr)

Substitute this guess into the pricing identity

P_t^{(n+1)} = \mathbb{E}_t^Q\left[e^{-r_t} P_{t+1}^{(n)}\right]

Using the Gaussian moment-generating formula yields the recursions

A_{n+1} = A_n + B_n^\top c_Q + \tfrac12 B_n^\top \Sigma_Q \Sigma_Q^\top B_n - \delta_0

and

B_{n+1} = A_Q^\top B_n - \delta_1

with boundary conditions

A_0 = 0 \qquad B_0 = 0

The corresponding yield formula is then

y_t^{(n)} = -\frac{1}{n}\bigl(A_n + B_n^\top s_t\bigr)

This is the third key derivation. The local-momentum machinery complicates the state construction, but once the pricing state is linear and Gaussian under Q, the bond-pricing algebra returns to the familiar affine recursion.
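The recursion can also be sanity-checked by Monte Carlo: the affine price should match the simulated expectation of \exp(-\sum r) under the Q dynamics. All parameter values below, including the current state s0, are made up for the check:

Code
```python
import numpy as np

rng = np.random.default_rng(2)
A_Q = np.array([[0.95, 0.10], [0.00, 0.85]])
c_Q = np.array([0.001, 0.0005])
Sigma_Q = np.diag([0.01, 0.015])
delta_0, delta_1 = 0.0, np.array([0.0, 1.0])
s0 = np.array([0.01, 0.02])   # hypothetical current state
n = 3

# Price from the affine recursion.
A_n, B_n = 0.0, np.zeros(2)
for _ in range(n):
    A_n = A_n + B_n @ c_Q + 0.5 * B_n @ Sigma_Q @ Sigma_Q.T @ B_n - delta_0
    B_n = A_Q.T @ B_n - delta_1
price_affine = np.exp(A_n + B_n @ s0)

# Monte Carlo price: E^Q[exp(-(r_t + ... + r_{t+n-1}))].
paths = 200_000
s = np.tile(s0, (paths, 1))
discount = np.zeros(paths)
for _ in range(n):
    discount += delta_0 + s @ delta_1
    s = c_Q + s @ A_Q.T + rng.standard_normal((paths, 2)) @ Sigma_Q.T
price_mc = np.exp(-discount).mean()

print(price_affine, price_mc)  # the two prices should agree closely
```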

11 Hump-Shaped Term Structures

Why can the model generate hump-shaped curves? The answer is in the maturity recursion. The slow central-tendency factor loads most strongly at medium and long maturities because it persists. The local-momentum block can strengthen medium-horizon expected-rate movements without dominating the very long horizon, where mean reversion eventually takes over. The local-variation factor mainly changes the short end. The combination naturally allows a hump-shaped response in B_n across maturities.

This is exactly the sort of shape that a one-factor mean-reverting model struggles to produce. The local-momentum structure earns its complexity by making those medium-horizon shapes and directional episodes possible without sacrificing long-run stability.

Code
# Numerical check: the affine bond-pricing recursion is easy to verify once A_Q, c_Q, Sigma_Q, and delta_1 are fixed.
A_Q = np.array([[0.95, 0.10], [0.00, 0.85]], dtype=float)
c_Q = np.array([0.001, 0.0005], dtype=float)
Sigma_Q = np.diag([0.01, 0.015])
delta_0 = 0.0
delta_1 = np.array([0.0, 1.0], dtype=float)

A_n = 0.0
B_n = np.zeros(2, dtype=float)
recursion = []
for n in range(1, 6):
    A_n = A_n + B_n @ c_Q + 0.5 * B_n @ Sigma_Q @ Sigma_Q.T @ B_n - delta_0
    B_n = A_Q.T @ B_n - delta_1
    recursion.append((n, A_n, *B_n))

pd.DataFrame(recursion, columns=['n', 'A_n', 'B_n_1', 'B_n_2'])
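Applying the yield formula to the recursion output gives the curve itself. This sketch reuses the toy parameters above and adds a hypothetical current state s0; as a consistency check, the one-period yield equals the current short rate \delta_0 + \delta_1^\top s_0.

Code
```python
import numpy as np

A_Q = np.array([[0.95, 0.10], [0.00, 0.85]])
c_Q = np.array([0.001, 0.0005])
Sigma_Q = np.diag([0.01, 0.015])
delta_0, delta_1 = 0.0, np.array([0.0, 1.0])
s0 = np.array([0.01, 0.02])   # hypothetical current state

A_n, B_n = 0.0, np.zeros(2)
yields = []
for n in range(1, 11):
    A_n = A_n + B_n @ c_Q + 0.5 * B_n @ Sigma_Q @ Sigma_Q.T @ B_n - delta_0
    B_n = A_Q.T @ B_n - delta_1
    yields.append(-(A_n + B_n @ s0) / n)

print(yields)  # yields[0] should equal the current short rate
```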

12 Difficulties

The derivations above explain both the appeal and the estimation difficulty of the paper. The appeal is obvious: the model can capture medium-horizon directional behavior that classical mean reversion misses. The difficulty is equally obvious: once local history enters the drift, the state vector expands, the likelihood gets harder, and the line between physical dynamics and pricing dynamics needs to be handled carefully.

That is why the next notebook will separate the full paper formulas from the reduced package implementation. The point is not to pretend that the reduced model is exact. The point is to state precisely which formulas survive intact, which are approximated, and which are deferred.