2026-03-17

import numpy as np
import pandas as pd
The previous theory notebooks stayed mostly inside linear Gaussian affine systems. Duan’s local-momentum paper is harder. The difficulty is not that the model is conceptually opaque. The difficulty is that the paper compresses several nontrivial manipulations into a small number of displayed formulas. This notebook therefore has a different objective from the other theory posts. It reconstructs those formulas slowly enough that a mathematically mature undergraduate can follow the logic without already knowing the term-structure literature.
The derivation naturally falls into three blocks. First, we derive the local-momentum autoregression from the ordinary mean-reverting benchmark and rewrite it as an augmented-state Markov system. Second, we extend the local component to the stochastic-central-tendency system used in the term-structure model. Third, we derive the bond-pricing recursions once the state has been enlarged.
Start from the ordinary autoregression
X_t = a + \phi X_{t-1} + \sigma_x \varepsilon_t
with \varepsilon_t \sim N(0,1). When |\phi| < 1, the process is stable. The form used throughout the short-rate literature is obtained by writing the same equation in mean-reversion notation. Define
\kappa_x = 1 - \phi
and, provided \kappa_x \neq 0, define the long-run mean by
\bar \mu = \frac{a}{1 - \phi} = \frac{a}{\kappa_x}
Then
X_t - X_{t-1} = a - (1-\phi)X_{t-1} + \sigma_x \varepsilon_t = \kappa_x(\bar \mu - X_{t-1}) + \sigma_x \varepsilon_t
so the benchmark law becomes
\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \sigma_x \varepsilon_t
The sign pattern is immediate. If X_{t-1} is above \bar \mu, the deterministic drift is negative. If it is below \bar \mu, the deterministic drift is positive. That is global mean reversion in its simplest form.
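As a quick numerical sanity check that the two parameterizations describe the same one-step map (the parameter values below are arbitrary):

```python
import numpy as np

# Illustrative (arbitrary) AR(1) parameters.
a, phi, sigma_x = 0.006, 0.92, 0.01
kappa_x = 1 - phi
mu_bar = a / kappa_x

x_prev, eps = 0.05, 0.3  # any starting level and shock

x_ar = a + phi * x_prev + sigma_x * eps                      # autoregressive form
x_mr = x_prev + kappa_x * (mu_bar - x_prev) + sigma_x * eps  # mean-reversion form
assert np.isclose(x_ar, x_mr)
```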
Duan augments the benchmark law with a comparison between the current level and a weighted moving average of the recent past. Write
\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \omega(\bar X_{t-1\mid n} - X_{t-1}) + \sigma_x \varepsilon_t
where
\bar X_{t-1\mid n} = \sum_{j=1}^{n} b_j X_{t-j} \qquad b_j \ge 0 \qquad \sum_{j=1}^{n} b_j = 1
The moving-average term is the new object. It compares the current level to a local historical average rather than to the global mean.
To see the implied autoregression, expand the right-hand side:
X_t = X_{t-1} + \kappa_x(\bar \mu - X_{t-1}) + \omega\left(\sum_{j=1}^{n} b_j X_{t-j} - X_{t-1}\right) + \sigma_x \varepsilon_t
Collect the X_{t-1} terms:
X_t = \kappa_x \bar \mu + \bigl(1 - \kappa_x - \omega\bigr)X_{t-1} + \omega \sum_{j=1}^{n} b_j X_{t-j} + \sigma_x \varepsilon_t
This is the first key derivation. The model is not mysterious. It is an autoregression whose coefficients are restricted by the long-run mean-reversion parameter \kappa_x, the local-momentum parameter \omega, and the moving-average weights b_j.
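One consequence of the restriction is worth checking numerically. Because the weights b_j sum to one, the lag coefficients sum to (1 - \kappa_x - \omega) + \omega = 1 - \kappa_x regardless of \omega: the local-momentum parameter redistributes persistence across lags without changing the total. A sketch with illustrative values:

```python
import numpy as np

kappa_x, omega = 0.08, 0.20         # illustrative values
b = np.array([0.4, 0.3, 0.2, 0.1])  # any weights with b_j >= 0 summing to 1

# Coefficients c_j of the implied autoregression
# X_t = kappa_x * mu_bar + sum_j c_j X_{t-j} + noise.
coeffs = omega * b
coeffs[0] += 1 - kappa_x - omega

# The total persistence depends on kappa_x only, not on omega or the weights.
assert np.isclose(coeffs.sum(), 1 - kappa_x)
```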
When the weighting window is finite, the model can be written as a finite-dimensional Markov system by stacking the lags. Define the companion state
s_t = (X_t, X_{t-1}, \ldots, X_{t-n+1})^\top
Then the first component evolves according to the restricted autoregression above, while the remaining components just shift the lag stack forward:
s_t = c + A s_{t-1} + \Sigma \eta_t
with
c = (\kappa_x \bar \mu, 0, \ldots, 0)^\top
and the first row of A given by the coefficients on the lagged levels. Concretely,
A_{1,1} = 1 - \kappa_x - \omega + \omega b_1 \qquad A_{1,j} = \omega b_j \quad \text{for } j = 2, \ldots, n
and the subdiagonal of A is filled with ones so that the lags shift in the usual companion-matrix way.
This derivation matters because it turns the apparently path-dependent model into an ordinary linear state transition once the state has been enlarged. The history dependence has not disappeared. It has been absorbed into additional state coordinates.
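The companion construction can be verified by simulating the stacked system and the scalar recursion with the same shocks; both should produce identical paths. A minimal sketch with arbitrary illustrative parameters and weights:

```python
import numpy as np

kappa_x, omega, mu_bar, sigma_x = 0.08, 0.20, 0.03, 0.01  # illustrative
b = np.array([0.4, 0.3, 0.2, 0.1])
n = len(b)

# Companion matrix: first row holds the AR coefficients, subdiagonal shifts lags.
A = np.zeros((n, n))
A[0] = omega * b
A[0, 0] += 1 - kappa_x - omega
A[np.arange(1, n), np.arange(n - 1)] = 1.0
c = np.zeros(n)
c[0] = kappa_x * mu_bar
noise = np.zeros(n)
noise[0] = sigma_x

rng = np.random.default_rng(1)
s = np.full(n, mu_bar)   # companion state (X_t, ..., X_{t-n+1}), started at mu_bar
hist = [mu_bar] * n      # scalar-form lag list: hist[j-1] = X_{t-j}

for _ in range(50):
    eps = rng.standard_normal()
    s = c + A @ s + noise * eps  # companion step
    x_new = (kappa_x * mu_bar + (1 - kappa_x - omega) * hist[0]
             + omega * float(np.dot(b, hist)) + sigma_x * eps)  # scalar step
    hist = [x_new] + hist[:-1]

assert np.allclose(s, hist)  # the two representations agree path by path
```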
A particularly convenient choice is exponentially decaying weights. Define the local average recursively by
m_t = \lambda X_t + (1-\lambda)m_{t-1}
with 0 < \lambda \le 1. The local-momentum law becomes
\Delta X_t = \kappa_x(\bar \mu - X_{t-1}) + \omega(m_{t-1} - X_{t-1}) + \sigma_x \varepsilon_t
Equivalently,
X_t = \kappa_x \bar \mu + (1 - \kappa_x - \omega)X_{t-1} + \omega m_{t-1} + \sigma_x \varepsilon_t
Now substitute this into the recursion for m_t:
m_t = \lambda X_t + (1-\lambda)m_{t-1}
= \lambda \kappa_x \bar \mu + \lambda(1-\kappa_x-\omega)X_{t-1} + \bigl(1 - \lambda + \lambda \omega\bigr)m_{t-1} + \lambda \sigma_x \varepsilon_t
So the pair (X_t, m_t) evolves linearly:
\begin{pmatrix} X_t \\ m_t \end{pmatrix} = \begin{pmatrix} \kappa_x \bar \mu \\ \lambda \kappa_x \bar \mu \end{pmatrix} + \begin{pmatrix} 1-\kappa_x-\omega & \omega \\ \lambda(1-\kappa_x-\omega) & 1-\lambda+\lambda\omega \end{pmatrix} \begin{pmatrix} X_{t-1} \\ m_{t-1} \end{pmatrix} + \begin{pmatrix} \sigma_x \\ \lambda\sigma_x \end{pmatrix} \varepsilon_t
This is the second key derivation. Once the weights are recursive, the local-momentum model returns to a low-dimensional Markov form.
# Numerical check: a stable parameter choice should keep the companion eigenvalues inside the unit circle.
A = np.array([
    [1 - 0.08 - 0.20, 0.20],
    [0.25 * (1 - 0.08 - 0.20), 1 - 0.25 + 0.25 * 0.20],
], dtype=float)
np.linalg.eigvals(A)

Write the augmented state as s_t = c + A s_{t-1} + \Sigma \eta_t. If the spectral radius of A is strictly smaller than one, the process has a finite unconditional mean

\mu_s = (I - A)^{-1}c
For the two-state exponential-weight system, solving this explicitly shows why the model preserves global mean reversion. Let \mu_X = \mathbb{E}[X_t] and \mu_m = \mathbb{E}[m_t]. Taking expectations in the two equations gives
\mu_X = \kappa_x \bar \mu + (1-\kappa_x-\omega)\mu_X + \omega \mu_m
and
\mu_m = \lambda \mu_X + (1-\lambda)\mu_m
The second equation implies \mu_m = \mu_X. Substitute this into the first equation to obtain
\mu_X = \kappa_x \bar \mu + (1-\kappa_x)\mu_X
hence
\mu_X = \bar \mu \qquad \mu_m = \bar \mu
So the local term changes the medium-horizon dynamics without shifting the long-run mean, provided the stability condition is respected.
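This can be confirmed numerically for the two-state system, using parameter values matching the stability check above together with an illustrative long-run mean:

```python
import numpy as np

kappa_x, omega, lam, mu_bar = 0.08, 0.20, 0.25, 0.03  # illustrative values
A = np.array([
    [1 - kappa_x - omega, omega],
    [lam * (1 - kappa_x - omega), 1 - lam + lam * omega],
])
c = np.array([kappa_x * mu_bar, lam * kappa_x * mu_bar])

# Unconditional mean of the augmented state: both components equal mu_bar.
mu_z = np.linalg.solve(np.eye(2) - A, c)
assert np.allclose(mu_z, mu_bar)
```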
The sign of \omega determines how the local term behaves. Suppose the rate has recently been falling, so that the current level sits below its trailing average and m_{t-1} - X_{t-1} > 0.
If \omega > 0, the local term is positive and pulls the process back up toward the recent average. Because a trailing average lags the level, the deviation X_{t-1} - m_{t-1} carries the sign of the recent trend, so a positive \omega opposes the recent direction of movement: it is local mean reversion layered on top of the global kind.
If \omega < 0, the local term reverses sign and pushes the process away from the recent average, reinforcing the recent decline. In that case the local term builds additional directional force. The process is still globally mean-reverting because of the \kappa_x(\bar \mu - X_{t-1}) term, but it can display extended directional phases locally.
This is the intuition behind the paper’s central claim: global mean reversion and local directional persistence are not contradictory once the drift is allowed to depend on recent history as well as the global mean.
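A deterministic impulse response makes this concrete. Shut off the shocks, hit X with a unit impulse, and compare the benchmark to a negative-\omega case; the momentum case initially keeps moving in the direction of the shock before global mean reversion takes over. Parameter values are illustrative:

```python
import numpy as np

def impulse_response(omega, kappa_x=0.08, lam=0.25, horizon=40):
    """Deterministic path after a unit shock to X, with mu_bar = 0 and no noise."""
    x, m = 1.0, lam  # the impulse enters the local average via m_0 = lam * x_0
    path = [x]
    for _ in range(horizon):
        x = (1 - kappa_x - omega) * x + omega * m
        m = lam * x + (1 - lam) * m
        path.append(x)
    return np.array(path)

base = impulse_response(omega=0.0)    # pure global mean reversion
mom = impulse_response(omega=-0.15)   # negative omega: local momentum

assert base[1] < base[0]      # benchmark decays immediately
assert mom[1] > mom[0]        # momentum case keeps rising at first
assert abs(mom[-1]) < 0.2     # but global mean reversion still wins eventually
```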
The term-structure model in the paper adds a stochastic central tendency. Denote that slow-moving mean by \mu_t and write
\mu_t = c_\mu + \rho_\mu \mu_{t-1} + \sigma_\mu \eta_t^{(\mu)}
The local rate component then reverts toward the current central tendency rather than a fixed scalar mean:
X_t = c_x + \rho_x X_{t-1} + \beta_m m_{t-1} + \beta_\mu \mu_{t-1} + \sigma_x \eta_t^{(x)}
and the recursive local average evolves as before,
m_t = \lambda X_t + (1-\lambda)m_{t-1}
A local-variation factor v_t can then be added to the short rate,
r_t = X_t + v_t
with its own linear dynamics. At that point the state vector contains the slow central-tendency factor, the local level, the local momentum summary, and the local-variation factor. In companion form the entire system again becomes
s_t = c + A s_{t-1} + \Sigma \eta_t
The notation is heavier, but the idea is the same as before: once the history term is summarized by additional states, the dynamics are linear again.
Given the matrix state system
s_t = c + A s_{t-1} + \Sigma \eta_t
with independent standard-normal shocks \eta_t, the one-step conditional covariance is
\operatorname{Var}_{t-1}(s_t) = \Sigma \Sigma^\top
and the h-step conditional mean is
\mathbb{E}_t[s_{t+h}] = A^h s_t + \sum_{j=0}^{h-1} A^j c
while the h-step conditional covariance satisfies the recursion
V_h = A V_{h-1} A^\top + \Sigma \Sigma^\top \qquad V_0 = 0
These are the formulas needed to connect the local-momentum transition law to term-structure pricing and to empirical filtering.
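Both recursions are easy to implement directly. A minimal sketch, reusing the transition matrix from the stability check above (the c vector and \Sigma here are arbitrary illustrative values, not the paper's calibration):

```python
import numpy as np

A = np.array([[0.72, 0.20], [0.18, 0.80]])  # illustrative transition matrix
c = np.array([0.0024, 0.0006])
Sigma = np.diag([0.01, 0.0025])

def conditional_moments(s_t, h):
    """h-step conditional mean and covariance of the linear Gaussian state."""
    mean = np.asarray(s_t, dtype=float)
    V = np.zeros((len(c), len(c)))
    for _ in range(h):
        mean = c + A @ mean                # E_t[s_{t+j}] = c + A E_t[s_{t+j-1}]
        V = A @ V @ A.T + Sigma @ Sigma.T  # V_j = A V_{j-1} A^T + Sigma Sigma^T
    return mean, V

s_t = np.array([0.05, 0.04])
mean_h, V_h = conditional_moments(s_t, h=10)

# For large h the conditional mean approaches the unconditional mean (I - A)^{-1} c.
mean_far, _ = conditional_moments(s_t, h=500)
assert np.allclose(mean_far, np.linalg.solve(np.eye(2) - A, c))
```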
Once the state is written in linear Gaussian form under the pricing measure,
s_{t+1} = c_Q + A_Q s_t + \Sigma_Q \eta_{t+1}^Q
and the short rate is affine in the state,
r_t = \delta_0 + \delta_1^\top s_t
zero-coupon bond prices retain the exponential-affine form
P_t^{(n)} = \exp\bigl(A_n + B_n^\top s_t\bigr)
Substitute this guess into the pricing identity
P_t^{(n+1)} = \mathbb{E}_t^Q\left[e^{-r_t} P_{t+1}^{(n)}\right]
Using the Gaussian moment-generating formula yields the recursions
A_{n+1} = A_n + B_n^\top c_Q + \tfrac12 B_n^\top \Sigma_Q \Sigma_Q^\top B_n - \delta_0
and
B_{n+1} = A_Q^\top B_n - \delta_1
with boundary conditions
A_0 = 0 \qquad B_0 = 0
The corresponding yield formula is then
y_t^{(n)} = -\frac{1}{n}\bigl(A_n + B_n^\top s_t\bigr)
This is the third key derivation. The local-momentum machinery complicates the state construction, but once the pricing state is linear and Gaussian under Q, the bond-pricing algebra returns to the familiar affine recursion.
Why can the model generate hump-shaped curves? The answer is in the maturity recursion. The slow central-tendency factor loads most strongly at medium and long maturities because it persists. The local-momentum block can strengthen medium-horizon expected-rate movements without dominating the very long horizon, where mean reversion eventually takes over. The local-variation factor mainly changes the short end. The combination naturally allows a hump-shaped response in B_n across maturities.
This is exactly the sort of shape that a one-factor mean-reverting model struggles to produce. The local-momentum structure earns its complexity by making those medium-horizon shapes and directional episodes possible without sacrificing long-run stability.
# Numerical check: the affine bond-pricing recursion is easy to verify once A_Q, c_Q, Sigma_Q, and delta_1 are fixed.
A_Q = np.array([[0.95, 0.10], [0.00, 0.85]], dtype=float)
c_Q = np.array([0.001, 0.0005], dtype=float)
Sigma_Q = np.diag([0.01, 0.015])
delta_0 = 0.0
delta_1 = np.array([0.0, 1.0], dtype=float)
A_n = 0.0
B_n = np.zeros(2, dtype=float)
recursion = []
for n in range(1, 6):
    A_n = A_n + B_n @ c_Q + 0.5 * B_n @ Sigma_Q @ Sigma_Q.T @ B_n - delta_0
    B_n = A_Q.T @ B_n - delta_1
    recursion.append((n, A_n, *B_n))
pd.DataFrame(recursion, columns=['n', 'A_n', 'B_n_1', 'B_n_2'])

The derivations above explain both the appeal and the estimation difficulty of the paper. The appeal is obvious: the model can capture medium-horizon directional behavior that classical mean reversion misses. The difficulty is equally obvious: once local history enters the drift, the state vector expands, the likelihood gets harder, and the line between physical dynamics and pricing dynamics needs to be handled carefully.
That is why the next notebook will separate the full paper formulas from the reduced package implementation. The point is not to pretend that the reduced model is exact. The point is to state precisely which formulas survive intact, which are approximated, and which are deferred.