Appendix A: Derivation of the model
In this appendix, we derive the model for a riskless investment with a continuous profit at a low risk-free rate. We follow the standard derivation in [1]. It is based on the same principle of constructing a risk-free portfolio as the seminal theory of Black–Scholes.
Let \(B(t)\) be the value of a bank account at time \(t \geq 0\). We assume that the bank account evolves according to the differential equation
$$\begin{aligned} dB(t) = B(t) r(t) \,dt, \end{aligned}$$
with initial value \(B(0)=1\), where \(r(t)\) is the short rate, i.e., the growth rate of the bank account B within a small time interval \((t,t+dt)\). This leads to the formula
$$\begin{aligned} B(t) = \operatorname{exp} \biggl( \int _{0}^{t} r(\tau ) \,d\tau \biggr). \end{aligned}$$
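The formula for \(B(t)\) can be checked numerically. The following minimal sketch approximates the integral of the short rate with the trapezoidal rule; the function name `bank_account` and the two example rate functions are illustrative assumptions, not part of the model.

```python
import math

def bank_account(short_rate, t, steps=1000):
    """Approximate B(t) = exp(integral_0^t r(tau) dtau) with the trapezoidal rule."""
    dt = t / steps
    integral = 0.0
    for k in range(steps):
        left, right = k * dt, (k + 1) * dt
        integral += 0.5 * (short_rate(left) + short_rate(right)) * dt
    return math.exp(integral)

# Constant short rate of 2%: B(t) reduces to exp(0.02 * t).
b_const = bank_account(lambda tau: 0.02, 5.0)

# Hypothetical linearly increasing short rate r(tau) = 0.01 + 0.002 * tau.
b_linear = bank_account(lambda tau: 0.01 + 0.002 * tau, 5.0)
```

For a constant short rate the sketch reproduces the familiar continuous compounding factor, and \(B(0)=1\) holds by construction.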
When working with interest-rate products, the study of the variability of interest rates is essential. Therefore, the short rate is modelled as a stochastic process. Let \(S_{n}\) be the price of a stock at the end of the nth trading day. The daily return from day n to day \(n+1\) is given by \((S_{n+1}-S_{n})/S_{n}\). In general, it is common to work with log returns, since the log return over k days can be computed simply by adding up the daily log returns,
$$\begin{aligned} \operatorname{log}(S_{k}/S_{0}) = \operatorname{log}(S_{1}/S_{0}) + \cdots + \operatorname{log}(S_{k}/S_{k-1}). \end{aligned}$$
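The additivity of log returns is easy to verify on a small example. The prices below are made up purely for illustration.

```python
import math

# Hypothetical closing prices S_0, ..., S_4.
prices = [100.0, 101.5, 99.8, 102.3, 103.0]

# Daily log returns log(S_{n+1} / S_n).
daily_log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# The k-day log return log(S_k / S_0) equals the sum of the daily log returns.
k_day_log_return = math.log(prices[-1] / prices[0])
```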
Based on the assumption that the log returns over disjoint time intervals are stochastically independent and identically distributed, the central limit theorem [47] implies that the log returns are approximately normally distributed [48]. These properties [49] can be realized by a standard Brownian motion \(W(t)\), i.e., a family of random variables \(W(t)\), indexed by \(t\geq 0\), with the properties

\(W(0)=0\).

With probability 1, the function \(W(t)\) is continuous in t.

For \(t\geq 0\), the increment \(W(t+\tau )-W(\tau )\) is normally distributed with mean 0 and variance t, i.e.,
$$\begin{aligned} W(t+\tau ) - W(\tau ) \sim N(0,t). \end{aligned}$$

For all N and times \(t_{0} < t_{1} < \cdots < t_{N-1} < t_{N}\), the increments \(W({t_{j}})-W({t_{j-1}})\) are stochastically independent.
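The four properties above suggest a direct way to sample a Brownian path on a time grid: start at zero and add independent Gaussian increments whose variance equals the step length. The sketch below is one possible illustration; the function name, the seed, and the sample sizes are arbitrary choices.

```python
import math
import random

def brownian_path(t_end, steps, rng):
    """Sample W on an equidistant grid using independent N(0, dt) increments."""
    dt = t_end / steps
    path = [0.0]  # W(0) = 0
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

rng = random.Random(42)  # fixed seed, chosen only for reproducibility

# W(1) should have mean 0 and variance 1; estimate both from 2000 sampled paths.
terminal_values = [brownian_path(1.0, 50, rng)[-1] for _ in range(2000)]
sample_mean = sum(terminal_values) / len(terminal_values)
sample_var = sum((w - sample_mean) ** 2 for w in terminal_values) / (len(terminal_values) - 1)
```

The empirical mean and variance of \(W(1)\) should lie close to 0 and 1, up to sampling error.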
These properties lead to the Langevin equation, see [50], which is a stochastic differential equation
$$\begin{aligned} \frac{dX(t)}{dt} = q(t)X(t), \end{aligned}$$
with initial condition \(X(0) = X_{0}\) with probability 1, where the stochastic parameter \(q(t)\) is given, see [51], by
$$\begin{aligned} q(t) = f\bigl(r(t),t\bigr) + g\bigl(r(t),t\bigr)w (t). \end{aligned}$$
Here \({w} (t)\) is a white noise process, and \(f,g\) are given functions of the interest rate \(r(t)\) and the time t. This leads to
$$\begin{aligned} \frac{dX(t)}{dt} = f\bigl(r(t),t\bigr)X(t) + g\bigl(r(t),t\bigr)X(t)w (t). \end{aligned}$$
(13)
The force \(w(t)=dW(t)/dt\) is a fluctuating quantity with Gaussian distribution. Substituting \(dW(t) = w(t) \,dt\) in (13), we get
$$\begin{aligned} dX(t) = f\bigl(r(t),t\bigr)X(t)\,dt + g\bigl(r(t),t\bigr)X(t)\,dW(t) \end{aligned}$$
and we obtain an SDE for the short rate \(r(t)\) via
$$\begin{aligned} dr(t) = f\bigl(r(t),t\bigr)\,dt + g\bigl(r(t),t\bigr)\,dW(t). \end{aligned}$$
Based on Itô's lemma [3], we can derive a general PDE for any underlying instrument depending on the short rate. Consider a risk-neutral portfolio \(\Pi (t)\) that depends on the short rate \(r(t)\) and consists of two interest rate instruments \(V_{1}\) and \(V_{2}\) with different maturities \(T_{1}\) and \(T_{2}\), respectively. Suppose that the portfolio holds \(\Delta = (\frac{\partial V_{1}}{\partial r(t)} / \frac{\partial V_{2}}{\partial r(t)} )\) units of the instrument \(V_{2}\). Over an infinitesimal time interval, the value of the portfolio changes by \(d\Pi (t) = \Delta \,dV_{2} - dV_{1}\). To exclude arbitrage, we include growth at the risk-free rate [1], which gives
$$\begin{aligned} d\Pi (t) = \Delta\, dV_{2} + (V_{1} - \Delta V_{2})r(t) \,dt - dV_{1}, \end{aligned}$$
and, expanding \(dV_{1}\) and \(dV_{2}\) with Itô's lemma, obtain
$$\begin{aligned} \begin{aligned} d\Pi (t) = {}&(V_{1} - \Delta V_{2})r(t)\,dt \\ &{}- \biggl[ \biggl(\frac{\partial V_{1}}{\partial r(t)}f\bigl(r(t),t\bigr) + \frac{\partial V_{1}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} g^{2}\bigl(r(t),t\bigr) \biggr) \,dt + \frac{\partial V_{1}}{\partial r(t)} g\bigl(r(t),t\bigr)\,dW(t) \biggr] \\ &{}+ \Delta \biggl[ \biggl(\frac{\partial V_{2}}{\partial r(t)}f\bigl(r(t),t\bigr) + \frac{\partial V_{2}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{2}}{\partial r(t)^{2}} g^{2} \bigl(r(t),t\bigr) \biggr) \,dt + \frac{\partial V_{2}}{\partial r(t)} g\bigl(r(t),t\bigr)\,dW(t) \biggr]. \end{aligned} \end{aligned}$$
Assuming a zero net investment requirement, i.e., \(d\Pi (t) = 0\), we obtain
$$\begin{aligned} \begin{aligned} 0 ={}& \biggl[V_{1} - \biggl( \frac{\partial V_{1}}{\partial r(t)} /\frac{\partial V_{2}}{\partial r(t)} \biggr) V_{2} \biggr] r(t)\,dt \\ &{}- \biggl[ \biggl(\frac{\partial V_{1}}{\partial r(t)}f\bigl(r(t),t\bigr) + \frac{\partial V_{1}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} g^{2}\bigl(r(t),t\bigr) \biggr) \,dt + \frac{\partial V_{1}}{\partial r(t)} g\bigl(r(t),t\bigr)\,dW(t) \biggr] \\ &{}+ \biggl(\frac{\partial V_{1}}{\partial r(t)} / \frac{\partial V_{2}}{\partial r(t)} \biggr) \biggl[ \biggl( \frac{\partial V_{2}}{\partial r(t)}f\bigl(r(t),t\bigr) + \frac{\partial V_{2}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{2}}{\partial r(t)^{2}} g^{2}\bigl(r(t),t\bigr) \biggr) \,dt\\ &{} + \frac{\partial V_{2}}{\partial r(t)} g\bigl(r(t),t\bigr)\,dW(t) \biggr]. \end{aligned} \end{aligned}$$
By the choice of Δ, the stochastic terms and the terms containing f cancel, and we obtain
$$\begin{aligned} \begin{aligned} & \biggl[V_{1} - \biggl(\frac{\partial V_{1}}{\partial r(t)} / \frac{\partial V_{2}}{\partial r(t)} \biggr) V_{2} \biggr] r(t)\,dt \\ &\quad= \biggl[ \frac{\partial V_{1}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} g^{2}\bigl(r(t),t\bigr) - \biggl(\frac{\partial V_{1}}{\partial r(t)} / \frac{\partial V_{2}}{\partial r(t)} \biggr) \biggl( \frac{\partial V_{2}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{2}}{\partial r(t)^{2}} g^{2}\bigl(r(t),t\bigr) \biggr) \biggr]\,dt. \end{aligned} \end{aligned}$$
Rearranging the terms, we get
$$\begin{aligned} \frac{\frac{\partial V_{1}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} g^{2}(r(t),t) - r(t)V_{1}}{\frac{\partial V_{1}}{\partial r(t)}} = \frac{\frac{\partial V_{2}}{\partial t} + \frac{1}{2} \frac{\partial ^{2} V_{2}}{\partial r(t)^{2}} g^{2}(r(t),t) - r(t)V_{2}}{\frac{\partial V_{2}}{\partial r(t)}}=:u\bigl(r(t),t\bigr), \end{aligned}$$
and we obtain a PDE for the financial instrument \(V_{1}\) depending on \(r(t)\) given by
$$\begin{aligned} \frac{\partial V_{1}}{\partial t} + \frac{1}{2} g^{2}\bigl(r(t),t\bigr) \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} - u\bigl(r(t),t\bigr) \frac{\partial V_{1}}{\partial r(t)} - r(t)V_{1} = 0. \end{aligned}$$
There exist several well-known one-factor short-rate models, such as the Vasicek model [52], the Cox–Ingersoll–Ross model [53], or the Hull–White model [32, 36], which is an extension of the Vasicek model. The stochastic differential equation in the Hull–White model is given as
$$\begin{aligned} dr(t) = \bigl(a(t) - b(t)r(t)\bigr)\,dt + \sigma (t)\,dW(t), \end{aligned}$$
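with time-dependent parameters \(a(t)\), \(b(t)\), and \(\sigma (t)\). The term \((a(t) - b(t)r(t))\) is the drift term, and \(a(t)\) is known as the deterministic drift. Setting \(g(r(t),t) = \sigma (t)\) and \(u(r(t),t) = -(a(t) - b(t)r(t))\) in the PDE for \(V_{1}\) yields the Hull–White PDE
$$\begin{aligned} \frac{\partial V_{1}}{\partial t} + \frac{1}{2} \sigma ^{2}(t) \frac{\partial ^{2} V_{1}}{\partial r(t)^{2}} + \bigl(a(t) - b(t)r(t)\bigr) \frac{\partial V_{1}}{\partial r(t)} - r(t)V_{1} = 0. \end{aligned}$$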
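To get a feeling for the Hull–White dynamics, the SDE can be simulated with the Euler–Maruyama method, which replaces \(dt\) by a small step and \(dW(t)\) by a Gaussian increment with variance equal to that step. This is a minimal sketch only: the function name `simulate_hull_white`, the constant parameter values, and the seed are assumptions made for illustration and are not calibrated to any market data.

```python
import math
import random

def simulate_hull_white(r0, a, b, sigma, t_end, steps, rng):
    """Euler-Maruyama path of dr = (a(t) - b(t) r) dt + sigma(t) dW."""
    dt = t_end / steps
    rates = [r0]
    for k in range(steps):
        t = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over [t, t + dt]
        drift = a(t) - b(t) * rates[-1]
        rates.append(rates[-1] + drift * dt + sigma(t) * dw)
    return rates

# Hypothetical constant parameters; the drift pulls r towards a / b = 0.03.
path = simulate_hull_white(
    r0=0.01, a=lambda t: 0.003, b=lambda t: 0.1, sigma=lambda t: 0.01,
    t_end=10.0, steps=2500, rng=random.Random(7),
)
```

With \(\sigma = 0\) the scheme reduces to the explicit Euler method for the deterministic equation \(r'(t) = a - b\,r(t)\), which gives a simple sanity check against the exact solution.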
The Hull–White model is calibrated based on today's (\(t_{0}\)) market data for bond prices \(B(t,T)\). To project the parameters \(b(t)\) and \(\sigma (t)\) into the future (\(t_{1}\)), one could use either \(B(t_{1},T+(t_{1} - t_{0}))\) or \(B(t_{1},T)\) [14]. In the first case, the shape of the parameters remains unchanged and does not capture seasonalities or expected changes such as those in money market policy, while in the second case, we lose the information concentrated at the short end. When b and σ are constants, both approaches deliver the same parameters.
Appendix B: Yield curve simulation
The PRIIP regulation requires at least \(10{,}000\) yield curve simulations [6] and a data set that contains at least 2 years of daily interest rates for an underlying instrument, 4 years of weekly interest rates, or 5 years of monthly interest rates. We construct a data matrix \(D \in \mathbb{R}^{n\times m}\) from the collected historical interest rate data, where each row of the matrix forms a yield curve and the columns correspond to the m tenor points, which are the different contract lengths of an underlying instrument. For example, if we have collected daily interest rate data at \(m\approx 20\) tenor points over the past five years, then, since a year has approximately 260 working days, one obtains \(n \approx 1306\) observation periods.
The regulations require taking the natural logarithm of the ratio between the interest rate at each observation period and the interest rate at the preceding period. To ensure that the logarithm is defined, all elements of the data matrix D must be positive, which is achieved by adding a correction term. With \(\mathcal{W}\) the matrix of all ones, we set \(\bar{D} = D + \gamma \mathcal{W}\), where γ is chosen so that all elements of the matrix D̄ are positive. We compensate for the shift γ at the bootstrapping stage by subtracting it from the simulated rates. Then we calculate the log returns over each period and store them in a new matrix \(\hat{D} = [\hat{d}_{ij}] \in \mathbb{R}^{n\times m}\) as
$$\begin{aligned} \hat{d}_{ij} = \operatorname{ln} \biggl( \frac{\bar{d}_{ij}}{\bar{d}_{i-1,j}} \biggr). \end{aligned}$$
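The shift and the log-return step can be sketched as follows, reading the log return as the logarithm of the ratio of consecutive shifted rates, as the description of the regulation states. The function name, the sample rates, and the value of γ are hypothetical. Because the first observation period has no predecessor, this sketch returns one row fewer than the input.

```python
import math

def shifted_log_returns(rates, gamma):
    """Log returns ln(d_bar[i][j] / d_bar[i-1][j]) of the shifted rates d_bar = d + gamma."""
    shifted = [[d + gamma for d in row] for row in rates]
    if any(value <= 0.0 for row in shifted for value in row):
        raise ValueError("gamma is too small: shifted rates must be positive")
    return [
        [math.log(curr / prev) for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(shifted, shifted[1:])
    ]

# Hypothetical rates (in percent) at m = 2 tenor points over 3 observation periods.
D = [[-0.30, 0.10], [-0.25, 0.12], [-0.28, 0.15]]
D_hat = shifted_log_returns(D, gamma=1.0)
```

Summing a column of `D_hat` telescopes to the log ratio of the last and first shifted rates, which is the property the bootstrapping step relies on.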
We calculate the arithmetic mean \(\mu _{j}\) of each column of the matrix D̂,
$$\begin{aligned} \mu _{j} = \frac{1}{n} \sum_{i=1}^{n} \hat{d}_{ij}, \end{aligned}$$
subtract \(\mu _{j}\) from each element of the corresponding jth column of D̂ and store the obtained results in a matrix \(\bar{\bar{D}}\) with entries \(\bar{\bar{d}}_{ij} = \hat{d}_{ij} - \mu _{j}\). We then compute the singular value decomposition (SVD) [41],
$$\begin{aligned} \bar{\bar{D}} = \Phi \Sigma \Psi ^{T}, \end{aligned}$$
where Σ is a diagonal matrix having the singular values \(\Sigma _{i}\) arranged in descending order. The columns of Φ are the normalized left singular vectors and the columns of ΦΣ are known as principal components. The columns of Ψ are the right singular vectors or principal directions of the covariance matrix \(\mathcal{C} = \bar{\bar{D}}^{T}\bar{\bar{D}}\).
The relative importance of the ith singular value is determined by the relative energy
$$\begin{aligned} \Xi _{i} = \frac{\Sigma _{i}}{\sum_{k=1}^{m} \Sigma _{k}}, \end{aligned}$$
where the total energy is given by \(\sum_{i=1}^{m} \Xi _{i} =1\). We then select the p right singular vectors corresponding to the maximal p energies and construct a matrix
$$\begin{aligned} \bar{\Psi } = \begin{bmatrix} \psi _{11} & \cdots & \psi _{1p} \\ \vdots & \vdots & \vdots \\ \psi _{m1} & \cdots & \psi _{mp} \end{bmatrix} , \end{aligned}$$
project the matrix \(\bar{\bar{D}}\) onto the columns of \(\bar{\Psi }\) via
$$\begin{aligned} M_{p} = \bar{\bar{D}} \cdot \bar{\Psi }\in \mathbb{R}^{n\times p}, \end{aligned}$$
and then calculate the matrix of returns \(M_{R} = M_{p}\bar{\Psi }^{T}\in \mathbb{R}^{n\times m}\). The regulations suggest selecting the first three (\(p=3\)) right singular vectors. This process simplifies the statistical data \(\bar{\bar{D}}\) by transforming m correlated tenor points into p uncorrelated principal components, so that the data are approximately reproduced by a model of reduced size.
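The relative energies indicate how much is lost by keeping only p components. The following sketch computes them for a made-up set of singular values; in practice the values would come from the SVD of \(\bar{\bar{D}}\), which is not recomputed here.

```python
def relative_energies(singular_values):
    """Relative energy Xi_i = Sigma_i / sum_k Sigma_k of each singular value."""
    total = sum(singular_values)
    return [s / total for s in singular_values]

# Hypothetical singular values, already sorted in descending order.
sigma = [8.0, 1.5, 0.3, 0.15, 0.05]
xi = relative_energies(sigma)

p = 3  # number of retained components, as suggested by the regulations
captured = sum(xi[:p])  # share of the total energy kept by the first p components
```

In this illustrative example the first three components carry 98% of the total energy.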
We then perform bootstrapping, where a large number of small samples of the same size are drawn repeatedly from the original data set. According to the PRIIP regulations, the bootstrapping procedure for the yield curve simulation has to be performed at least \(10{,}000\) times. The standardized KID also has to include the recommended holding period, i.e., the period between the acquisition of an asset and its sale. The time step in the simulation of yield curves is typically one observation period. If H is the recommended holding period in days, e.g., \(H \approx 2600\) days, then there are H observation periods in the recommended holding period.
For each such observation period, we select a random row from the matrix \(M_{R}\), i.e., altogether H random rows, and construct a matrix \([\mathfrak{\chi }_{ij}] \in \mathbb{R}^{H\times m}\) from these selected rows. Then we sum over the selected rows of the columns corresponding to the tenor point j, i.e.,
$$\begin{aligned} \bar{\chi }_{j} = \sum_{i=1}^{H} \mathfrak{\chi }_{ij},\quad j = 1, \ldots, m. \end{aligned}$$
In this way, we obtain a row vector \(\bar{\chi }=[\bar{\chi }_{1} \bar{\chi }_{2} \cdots \bar{\chi }_{m}]\in \mathbb{R}^{1\times m}\). The final simulated yield rate \(y_{j}\) at tenor point j is then the rate \(\bar{d}_{nj}\) of the last observation period at the corresponding tenor point j, multiplied by the exponential of \(\bar{\chi }_{j}\), adjusted for any shift γ used to ensure positive values for all tenor points, and adjusted for the forward rate so that the expected mean matches current expectations.
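One bootstrap draw of the vector \(\bar{\chi }\) can be sketched as below: H rows of the return matrix are sampled with replacement and summed per tenor point. The function name, the tiny return matrix, and the seed are hypothetical, and the subsequent shift and forward-rate adjustments are not included.

```python
import random

def bootstrap_return_sums(m_r, holding_periods, rng):
    """Draw `holding_periods` rows of m_r with replacement and sum them per tenor point."""
    m = len(m_r[0])
    chi_bar = [0.0] * m
    for _ in range(holding_periods):
        row = rng.choice(m_r)  # one randomly selected observation period
        for j in range(m):
            chi_bar[j] += row[j]
    return chi_bar

# Hypothetical return matrix M_R with n = 3 rows and m = 2 tenor points.
M_R = [[0.01, -0.02], [0.00, 0.01], [-0.01, 0.03]]
chi_bar = bootstrap_return_sums(M_R, holding_periods=250, rng=random.Random(1))
```

Repeating this draw s times with independent random numbers gives one simulated return vector per yield curve.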
The forward rate between time points \(t_{k}\) and \(t_{\ell }\) starting from a time point \(t_{0}\) is given as
$$\begin{aligned} r_{k,\ell } = \frac{R(t_{0},t_{\ell })(t_{\ell } - t_{0}) - R(t_{0},t_{k})(t_{k} - t_{0})}{t_{\ell } - t_{k}}, \end{aligned}$$
where \(t_{k}\) and \(t_{\ell }\) are measured in years and \(R(t_{0},t_{k})\) and \(R(t_{0},t_{\ell })\) are the interest rates available from the data matrix for the time periods \((t_{0},t_{k})\) and \((t_{0},t_{\ell })\), respectively. Thus, the final simulated yield curve between time points \(t_{k}\) and \(t_{\ell }\) is given by
$$\begin{aligned} y(t_{\ell }) = \bar{d}_{k,\ell } \operatorname{exp}({\bar{\chi }_{\ell }}) - \gamma + {r_{k,\ell }}, \quad\ell = 1,\ldots,m, \end{aligned}$$
(14)
and the simulated yield curve from the calculated simulated returns is given by
$$\begin{aligned} y = [y_{1} y_{2} \cdots y_{m}]. \end{aligned}$$
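The forward-rate formula for \(r_{k,\ell }\) used in the adjustment is a one-line computation. The sketch below uses hypothetical spot rates and a hypothetical function name, with times measured in years.

```python
def forward_rate(rate_k, rate_l, t0, t_k, t_l):
    """Forward rate between t_k and t_l implied by the rates R(t0, t_k) and R(t0, t_l)."""
    if not t0 <= t_k < t_l:
        raise ValueError("expected t0 <= t_k < t_l")
    return (rate_l * (t_l - t0) - rate_k * (t_k - t0)) / (t_l - t_k)

# Hypothetical rates: 1% for one year and 1.5% for two years.
r_12 = forward_rate(rate_k=0.01, rate_l=0.015, t0=0.0, t_k=1.0, t_l=2.0)
```

Here the implied rate for the second year is 2%, and for a flat curve the forward rate equals the spot rate.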
We then perform the bootstrapping procedure at least \(s = 10{,}000\) times and construct a simulated yield curve matrix
$$\begin{aligned} Y = \begin{bmatrix} y_{11} & \cdots & y_{1m} \\ \vdots & \vdots & \vdots \\ y_{s1} & \cdots & y_{sm} \end{bmatrix} \in \mathbb{R}^{s\times m}, \end{aligned}$$
(15)
which is then used to calibrate the parameter \(a(t)\).
Appendix C: Parameter calibration
For a zero-coupon bond \(B(t,T)\) maturing at time T, based on the Hull–White model, one obtains a closed-form solution, see [54], as
$$\begin{aligned} B(t,T) = \operatorname{exp}\bigl\{ -r(t)\Gamma (t,T) - \Lambda (t,T)\bigr\} , \end{aligned}$$
(16)
where \(\kappa (t) = \int _{0}^{t} b(s)\,ds=bt\), since b is assumed constant,
$$\begin{aligned} \begin{aligned} &\Gamma (t,T)= \int _{t}^{T} e^{-\kappa (v)} \,dv, \\ &\Lambda (t,T)= \int _{t}^{T} \biggl[ e^{\kappa (v)}a(v) \biggl( \int _{v}^{T} e^{-\kappa (z)}\,dz \biggr) - \frac{1}{2} e^{2\kappa (v)} \sigma ^{2} \biggl( \int _{v}^{T} e^{-\kappa (z)} \,dz \biggr)^{2} \biggr] \,dv. \end{aligned} \end{aligned}$$
Here we have again used that σ is constant.
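As a quick consistency check, and assuming that the exponent in the integrand of Γ carries a minus sign (which agrees with the derivative of \(\Gamma (0,T)\) used in the calibration), a constant b gives \(\kappa (v) = bv\) and hence
$$\begin{aligned} \Gamma (0,T) = \int _{0}^{T} e^{-bv} \,dv = \frac{1 - e^{-bT}}{b}, \end{aligned}$$
which tends to T as \(b \to 0\).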
To perform the calibration, we use as input data i) the initial value \(a(0)\) at \(t=0\), ii) the zero-coupon bond prices, iii) the constant value of the volatility σ of the short rate \(r(t)\), and iv) the constant value b, each for all maturities \(T_{m}\), \(0 \leq T_{m} \leq T\), where \(T_{m}\) is the maturity at the mth tenor point. Then we compute \(\kappa (T)\) from \(\frac{\partial }{\partial T} \kappa (T) = \frac{\partial }{\partial T} \int _{0}^{T} b(s) \,ds = b\) and use
$$\begin{aligned} \frac{\partial }{\partial T} \Gamma (0,T)= e^{-\kappa (T)} \end{aligned}$$
to compute \(\Gamma (0,T)\).
Then, for \(0 \leq T_{m} \leq T\), we get
$$\begin{aligned} \begin{aligned} &\frac{\partial }{\partial T} \Lambda (0,T)= \int _{0}^{T} \biggl[e^{ \kappa (v)} a(v) e^{-\kappa (T)} - e^{2\kappa (v)}\sigma ^{2}e^{-\kappa (T)} \biggl( \int _{v}^{T} e^{ -\kappa (z)} \,dz \biggr) \biggr] \,dv, \\ &e^{\kappa (T)} \frac{\partial }{\partial T} \Lambda (0,T) = \int _{0}^{T} \biggl[e^{\kappa (v)} a(v) - e^{2\kappa (v)} \sigma ^{2} \biggl( \int _{v}^{T} e^{-\kappa (z)} \,dz \biggr) \biggr] \,dv, \\ &\frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \Lambda (0,T) \biggr]= e^{\kappa (T)} a(T) - \int _{0}^{T} e^{2\kappa (v)}\sigma ^{2}e^{-\kappa (T)} \,dv, \\ &e^{\kappa (T)} \frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \Lambda (0,T) \biggr] = e^{2\kappa (T)} a(T) - \int _{0}^{T} e^{2 \kappa (v)}\sigma ^{2} \,dv, \\ &\frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \Lambda (0,T) \biggr] \biggr] = \frac{\partial a(T)}{\partial T} e^{2\kappa (T)} + 2a(T) e^{2\kappa (T)} \frac{\partial }{\partial T}\kappa (T) - e^{2\kappa (T)}\sigma ^{2}, \\ &\frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \biggl[ e^{\kappa (T)} \frac{\partial }{\partial T} \Lambda (0,T) \biggr] \biggr] = \frac{\partial a(T)}{\partial T} e^{2\kappa (T)} + 2a(T)e^{2\kappa (T)}b(T) - e^{2\kappa (T)}\sigma ^{2}. \end{aligned} \end{aligned}$$
The simulated yield \(y(T)\) at the tenor point T is then given by
$$\begin{aligned} y(T) = -\operatorname{ln}B(0,T), \end{aligned}$$
(17)
and from (16) and (17) we obtain \(\Lambda (0,T) = y(T) - r(0)\Gamma (0,T)\). In this way, for \(a(t)\) we obtain the ordinary differential equation (ODE)
$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t}a(t) e^{2\kappa (t)} + 2a(t) \cdot b \cdot e^{2\kappa (t)} - e^{2\kappa (t)}\sigma ^{2} = \frac{\partial }{\partial t} \biggl[ e^{\kappa (t)} \frac{\partial }{\partial t} \biggl[ e^{\kappa (t)} \frac{\partial }{\partial t}\bigl(y(t) - r(0)\Gamma (0,t)\bigr) \biggr] \biggr], \end{aligned} \end{aligned}$$
which we solve numerically with the given initial conditions. If we approximate \(a(t)\) by a piecewise constant function with values \(a(i)\) which change at the tenor point i, then we obtain a linear system
$$\begin{aligned} L\alpha = F, \end{aligned}$$
for the vector \(\alpha =[a(i)]\), where L is lower triangular with nonzero diagonal elements. In [55] it is noted that the integral equation for Λ is of the first kind with an \(L^{2}\) kernel, so that a small perturbation (noise) in the market data used to obtain the yield curves leads to large changes in the model parameter \(a(t)\). This means that computing \(a(t)\) from the data is an ill-posed problem, and for this reason we determine the vector α via Tikhonov regularization as
$$\begin{aligned} \alpha ^{\delta }_{\mu }= \mathop{\operatorname{argmin}} \bigl\Vert L \alpha - F^{\delta } \bigr\Vert ^{2} + \mu \Vert \alpha \Vert ^{2}, \end{aligned}$$
(18)
where \(\alpha ^{\delta }_{\mu }\) is an approximation to α, μ is the regularization parameter, \(\delta = \Vert F - F^{\delta }\Vert \) is the noise level, and \(\mu \Vert \alpha \Vert ^{2}\) is the regularization term. We solve the optimization problem (18) to obtain an approximation to the parameter \(a(t)\) using the commercial software UnRisk PRICING ENGINE [46]. Given a simulated yield curve, the UnRisk pricing function returns the calibrated parameter \(a(t)\) for that yield curve. Based on \(s=10{,}000\) different simulated yield curves, we obtain s different piecewise constant parameters \(a_{\ell }(t)\), which change their values \(\alpha _{\ell,i}\) only at the m tenor points. We incorporate these in a matrix
$$\begin{aligned} {\mathcal{A}}= \begin{bmatrix} \alpha _{11} & \cdots & \alpha _{1m} \\ \vdots & \vdots & \vdots \\ \alpha _{s1} & \cdots & \alpha _{sm} \end{bmatrix}. \end{aligned}$$
(19)
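For readers who want to experiment with the regularized problem (18) on a small scale, the minimizer can be obtained from the regularized normal equations \((L^{T}L + \mu I)\alpha = L^{T}F^{\delta }\). The sketch below solves them by Gaussian elimination with partial pivoting. It is an illustration under stated assumptions only: the function name, the 2×2 matrix, the data vector, and the value of μ are made up, and nothing here reflects how the commercial software performs the calibration.

```python
def tikhonov_solve(L, F, mu):
    """Solve (L^T L + mu I) alpha = L^T F by Gaussian elimination with partial pivoting."""
    rows, n = len(L), len(L[0])
    # Assemble the regularized normal matrix and the right-hand side.
    A = [[sum(L[k][i] * L[k][j] for k in range(rows)) + (mu if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(L[k][i] * F[k] for k in range(rows)) for i in range(n)]
    # Forward elimination.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for j in range(col, n):
                A[row][j] -= factor * A[col][j]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    alpha = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * alpha[j] for j in range(i + 1, n))
        alpha[i] = (rhs[i] - tail) / A[i][i]
    return alpha

# Hypothetical lower triangular system with exact solution alpha = [2, 3].
L_example = [[1.0, 0.0], [1.0, 1.0]]
F_example = [2.0, 5.0]
alpha_reg = tikhonov_solve(L_example, F_example, mu=0.5)
```

Increasing μ shrinks the norm of the solution at the price of a larger residual, which is the trade-off the regularization parameter controls.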
Appendix D: Numerical methods
The Hull–White model (3) is discretized by applying a finite difference method. As the computational domain for the interest rate \(r(t)\) we use an interval \([r_{\mathrm{low}},r_{\mathrm{up}}]\), according to [1] given by
$$\begin{aligned} r_{\mathrm{low}} = r(T) - 7\sigma \sqrt{T},\qquad r_{\mathrm{up}} = r(T) + 7\sigma \sqrt{T}, \end{aligned}$$
where \(r(T)\) is the yield at the maturity T, also known as the spot rate. We divide the spatial domain into M equidistant grid points \(\{r(1),r(2),\dots,r(M) \}\), \(r(i)=r(i-1)+h\), with spatial step size h, and the time interval \([0,T]\) into N subintervals with time points \(t_{0}=0,t_{1},\ldots, t_{N}=T\), \(t_{n}=n \tau \), with time step τ. Using the spatial discretization operator \(\mathcal{L}(n)\) at time point n and the Θ-method in time, we get a system of equations for the vector \(V=[V_{i}]=[V(r(i))]\) of values at the spatial grid points
$$\begin{aligned} \frac{V(t-\tau ) - V(t)}{\tau } = (1-\Theta ) \bigl(\mathcal{L}(t) V(t)\bigr) + \Theta \bigl( \mathcal{L}(t- \tau ) V(t - \tau )\bigr), \end{aligned}$$
which is given componentwise by
$$\begin{aligned} \begin{aligned} &\text{for } \bigl(a(n) - br(i)\bigr) > 0 \\ &\quad {\mathcal{L}}(n)V_{i}^{n} := \frac{1}{2}\sigma ^{2} \frac{V_{i+1}^{n} - 2V_{i}^{n} + V_{i-1}^{n}}{h^{2}} + \bigl(a(n) - br(i)\bigr) \frac{V_{i}^{n} - V_{i-1}^{n}}{h} - r(i)V_{i}^{n}, \\ &\text{for } \bigl(a(n) - br(i)\bigr) < 0 \\ &\quad {\mathcal{L}}(n)V_{i}^{n} := \frac{1}{2}\sigma ^{2} \frac{V_{i+1}^{n} - 2V_{i}^{n} + V_{i-1}^{n}}{h^{2}} + \bigl(a(n) - br(i)\bigr) \frac{V_{i+1}^{n} - V_{i}^{n}}{h} - r(i)V_{i}^{n}, \end{aligned} \end{aligned}$$
so that the Crank–Nicolson scheme in time gives a linear system
$$\begin{aligned} \underbrace{ \biggl( 1 - \frac{1}{2}\tau \mathcal{L}(t-\tau ) \biggr)}_{A( \rho _{\ell }(t)) \in \mathbb{R}^{M\times M}} V(t-\tau ) = \underbrace{ \biggl( 1 + \frac{1}{2} \tau \mathcal{L}(t) \biggr)}_{B( \rho _{\ell }(t)) \in \mathbb{R}^{M\times M}} V(t), \end{aligned}$$
where the matrices \(A(\rho _{\ell }(t))\), and \(B(\rho _{\ell }(t))\) depend on \(\rho _{\ell }(t) = \{a(t),b,\sigma \}\) for the ℓth group of these parameters and are given by
$$\begin{aligned} A\bigl(\rho _{\ell }(t)\bigr) = I - \frac{\sigma ^{2}\tau }{2h^{2}}J - \frac{\tau }{2h}\bigl(H^{+}G + H^{-}G^{T}\bigr) + R_{o}, \end{aligned}$$
and
$$\begin{aligned} B\bigl(\rho _{\ell }(t)\bigr) = I + \frac{\sigma ^{2}\tau }{2h^{2}}J + \frac{\tau }{2h}\bigl(H^{+}G + H^{-}G^{T}\bigr) - R_{o}, \end{aligned}$$
where \(R_{o} = \operatorname{diag}(r(1),\ldots,r(M))\), \(H^{+}= \operatorname{diag} (\operatorname{max}(a(n) - br(1),0),\ldots, \operatorname{max}(a(n) - br(M),0) )\), \(H^{-}= \operatorname{diag} (\operatorname{min}(a(n) - br(1),0),\ldots, \operatorname{min}(a(n) - br(M),0) )\), and
$$\begin{aligned} J = \begin{bmatrix} -2 & 1 & 0 & \cdots & 0 \\ 1 & -2 & 1 & \ddots & \vdots \\ 0 & 1 & \ddots & \ddots & 0 \\ \vdots &\ddots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & 1 & -2 \end{bmatrix},\qquad G = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 0 & \ddots & \vdots \\ 0 & -1 & \ddots & \ddots & 0 \\ \vdots &\ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix}. \end{aligned}$$
This discretization is a parametric high-dimensional model of the form
$$\begin{aligned} A\bigl(\rho _{\ell }(t)\bigr)V^{n-1} = B\bigl(\rho _{\ell }(t)\bigr)V^{n}, \end{aligned}$$
with given terminal vector \(V^{T}\), and matrices \(A(\rho _{\ell }) \in \mathbb{R}^{M\times M}\), and \(B(\rho _{\ell }) \in \mathbb{R}^{M\times M}\). We solve this model by propagating backward in time. Here again \(\ell = 1,\dots,s=10{,}000\), m is the total number of tenor points, and we need to solve this system at each time step n with an appropriate boundary condition and a known terminal value for the underlying instrument. Altogether we have a parameter space \(\mathcal{P}\) of size \(10{,}000 \times m\) to which we now apply model reduction.
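The spatial part of the discretization can be sketched in a few lines: build the equidistant grid on \([r_{\mathrm{low}},r_{\mathrm{up}}]\) and apply the componentwise operator \(\mathcal{L}(n)\) at the interior points, choosing the one-sided difference according to the sign of \(a(n) - br(i)\). The function names and parameter values are hypothetical, the grid is assumed to include both end points so that \(h = (r_{\mathrm{up}} - r_{\mathrm{low}})/(M-1)\), and boundary conditions and the time stepping are left out.

```python
import math

def rate_grid(r_spot, sigma, maturity, num_points):
    """Equidistant grid on [r_spot - 7 sigma sqrt(T), r_spot + 7 sigma sqrt(T)]."""
    half_width = 7.0 * sigma * math.sqrt(maturity)
    h = 2.0 * half_width / (num_points - 1)
    return [r_spot - half_width + i * h for i in range(num_points)], h

def apply_upwind_operator(values, grid, h, a_n, b, sigma):
    """Apply the componentwise operator L(n) at the interior grid points."""
    result = []
    for i in range(1, len(grid) - 1):
        drift = a_n - b * grid[i]
        diffusion = 0.5 * sigma**2 * (values[i + 1] - 2.0 * values[i] + values[i - 1]) / h**2
        if drift > 0:
            first_diff = (values[i] - values[i - 1]) / h  # branch for positive drift
        else:
            first_diff = (values[i + 1] - values[i]) / h  # branch for negative drift
        result.append(diffusion + drift * first_diff - grid[i] * values[i])
    return result

# Hypothetical setup: spot rate 2%, sigma = 1%, maturity 4 years, M = 11 grid points.
grid, h = rate_grid(r_spot=0.02, sigma=0.01, maturity=4.0, num_points=11)
payoff = [1.0] * len(grid)  # a unit payoff, used here only as a simple test vector
lv = apply_upwind_operator(payoff, grid, h, a_n=0.003, b=0.1, sigma=0.01)
```

For a constant vector both difference quotients vanish, so the operator reduces to multiplication by \(-r(i)\), which makes a convenient sanity check.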