Due to the differential-algebraic structure of the generated network equations, standard techniques used for ODE model order reduction, such as a direct reduction through a Galerkin projection scheme, may yield unsolvable reduced order models.

To circumvent this and preserve the solvability of the numerical model, the Gauß–Newton with Approximated Tensors (GNAT) model order reduction approach [6] is utilised. This is combined with a gappy data reconstruction method [14] for hyper-reduction.

These reduced basis model order reduction approaches utilise a basis obtained by the Maximum Entropy Snapshot Sampling (MESS) method, [10], as proposed in [2]. To illustrate this approach, this section contains an overview of the reduction, hyper-reduction and basis-construction methods.

### Gauß–Newton with approximated tensors

As GNAT operates on a discrete level, we assume that Equation (2.1) is solved by a linearly implicit time integrator; see the next section for a more detailed description. Then, for \(n_{t}\) time steps, a sequence of \(n_{t}\) systems of nonlinear equations has to be solved, with the system at each time step of the form

$$ R(x) = 0, $$

(3.1)

where \(x\in \mathbb{R}^{M}\) and the residual mapping \(R:\mathbb{R}^{M} \rightarrow \mathbb{R}^{M}\). For ease of notation, the time dependencies have been omitted as we consider only one time instance and one input vector.

The GNAT approach reduces the system by searching for an approximate solution in the incremental affine trial subspace \(x^{0} + \mathcal{V}\subset \mathbb{R}^{M}\), where \(x^{0} \in \mathbb{R}^{M}\) is the initial condition. The incremental subspace is used for consistent Petrov–Galerkin projections, see Appendix A of [6]. The reduced state vector is then given by

$$ \tilde{x} = x^{0} + V_{x}x_{\mathrm{r}}, $$

(3.2)

where \(V_{x}\in \mathbb{R}^{M\times r}\) is the *r*-dimensional projection basis for \(\mathcal{V}\), to be defined later, and \(x_{\mathrm{r}}\) denotes the reduced incremental state vector. Substituting Equation (3.2) into Equation (3.1) results in an overdetermined system of *M* equations in *r* unknowns. Since \(V_{x}\) is a matrix with full column rank, this system can be solved by a minimisation in the least-squares sense through

$$ \min_{\tilde{x}\in x^{0}+\mathcal{V}} \bigl\Vert R(\tilde{x}) \bigr\Vert _{2}. $$

(3.3)

This residual-minimisation approach is equivalent to performing a Petrov–Galerkin projection in which the test basis is given by \(\frac{\partial R}{\partial x}V_{x}\), [5]. This nonlinear least-squares problem is solved by the Gauß–Newton method, leading to the iterative process, for \(k= 1,\ldots,K\), of solving

$$ s^{k} = \arg \min_{a\in \mathbb{R}^{r}} \bigl\Vert J^{k}V_{x}a+R^{k} \bigr\Vert _{2}, $$

(3.4)

and updating the search value \(w^{k}_{r}\) with

$$ w_{r}^{k+1} = w^{k}_{r} + s^{k}, $$

(3.5)

where *K* is defined through a convergence criterion, \(w^{0}_{r}\) is the initial guess, \(R^{k} \equiv R(x^{0} + V_{x}w_{r}^{k})\) and \(J^{k}\equiv \frac{\partial R}{\partial x}(x^{0} + V_{x} w_{r}^{k})\). Here \(J^{k}\) is the full-order Jacobian of the residual at each iteration step *k*. Since the computation of this Jacobian, which is used in circuit simulation to solve for the next step, scales with the original full dimension of Equation (3.1), it constitutes a computational bottleneck. This bottleneck can be circumvented by the application of hyper-reduction methods, for which this paper utilises a gappy data reconstruction method.
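The projected Gauß–Newton iteration (3.2)–(3.5) can be sketched as follows on a hypothetical three-dimensional residual; the toy system, the basis \(V_{x}\) and the tolerance are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def residual(x):
    # toy nonlinear residual R: R^3 -> R^3 with root (1, 1, 0)
    return np.array([x[0]**2 + x[1] - 2.0,
                     x[0] + x[1] - 2.0,
                     x[2]])

def jacobian(x):
    # full-order Jacobian dR/dx of the toy residual
    return np.array([[2.0*x[0], 1.0, 0.0],
                     [1.0,      1.0, 0.0],
                     [0.0,      0.0, 1.0]])

x0 = np.zeros(3)                                   # initial condition x^0
V = np.array([[1.0/np.sqrt(2.0), 0.0],             # orthonormal basis V_x of a
              [1.0/np.sqrt(2.0), 0.0],             # 2-dimensional trial space
              [0.0,              1.0]])            # containing the root

w = np.zeros(2)                                    # reduced iterate w_r^0
for k in range(50):
    R = residual(x0 + V @ w)                       # R^k
    J = jacobian(x0 + V @ w)                       # full-order Jacobian J^k
    s, *_ = np.linalg.lstsq(J @ V, -R, rcond=None) # least-squares step (3.4)
    w = w + s                                      # update (3.5)
    if np.linalg.norm(s) < 1e-12:                  # convergence criterion
        break

x_tilde = x0 + V @ w                               # reduced approximation (3.2)
```

Because the root of the toy residual lies in the trial subspace, the iteration drives the residual to machine precision; in general GNAT only minimises it over \(x^{0} + \mathcal{V}\).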

### Hyper-reduction by gappy data reconstruction

The evaluation of the nonlinear function \(R(x^{0} + V_{x} w_{r}^{k})\) still has a computational complexity that depends on the size of the full system. To reduce the complexity of this evaluation, a gappy data reconstruction based on [7] is applied. Like the GNAT approach, gappy data reconstruction uses a reduced basis to reconstruct the data. It starts by defining a mask vector *n* for a state vector *u* as

$$\begin{aligned}& n_{j} = 0 \quad \text{if } u_{j} \text{ is missing}, \\& n_{j} = 1 \quad \text{if } u_{j} \text{ is known}, \end{aligned}$$

where *j* denotes the *j*-th element of each vector. The mask vector *n* is applied point-wise to a vector by \((n,u)_{j}=n_{j} u_{j}\), which sets all the unobserved values to 0. Then the gappy inner product can be defined as \((a,b)_{n}=((n,a),(n,b))\), which is the inner product of the two masked vectors. The induced norm is then \(( \Vert u \Vert _{n})^{2}=(u,u)_{n}\). Considering the reduced basis obtained by MESS, \(V_{\mathrm{gap}}=\{v^{i}\}_{i=1}^{r}\), we can construct an intermediate “repaired” full-size vector *ũ* from a gappy vector *u* with only *g* known elements by

$$ \tilde{u} \approx \sum_{i=1}^{r} b_{i}v^{i}, $$

(3.6)

where the coefficients \(b_{i}\) need to minimise an error *E* between the original and repaired vector, which is defined as

$$ E = \Vert u-\tilde{u} \Vert ^{2}_{n}. $$

(3.7)

This minimisation is done by solving the linear system

$$ Mb = f, $$

(3.8)

where

$$ M_{ij} = \bigl(v^{i},v^{j} \bigr)_{n},\quad \text{and}\quad f_{i} = \bigl(u,v^{i}\bigr)_{n}. $$

(3.9)

From this solution, *ũ* is constructed via Equation (3.6). The complete vector is then obtained by mapping the known elements of the gappy vector to their original indices and filling the remaining entries with the reconstructed values.
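The reconstruction steps (3.6)–(3.9) might look as follows on a small hand-made example; the basis `V_gap`, the mask and the signal are hypothetical, whereas in the paper the basis would be obtained from MESS snapshots.

```python
import numpy as np

m = 8
t = np.linspace(0.0, 1.0, m)
v1 = np.ones(m) / np.sqrt(m)                        # constant mode
v2 = (t - t.mean()) / np.linalg.norm(t - t.mean())  # linear mode
V_gap = np.column_stack([v1, v2])                   # basis {v^i}, r = 2

u_full = 3.0 * v1 - 2.0 * v2                        # lies in span(V_gap)
n_mask = np.array([1, 0, 1, 1, 0, 0, 1, 0])         # mask vector n (g = 4)
u = n_mask * u_full                                 # gappy data, gaps set to 0

# gappy Gram matrix M_ij = (v^i, v^j)_n and right-hand side f_i = (u, v^i)_n
Vn = n_mask[:, None] * V_gap
M = Vn.T @ Vn
f = Vn.T @ u

b = np.linalg.solve(M, f)                           # linear system (3.8)
u_tilde = V_gap @ b                                 # repaired vector (3.6)

# known entries keep their values, gaps are filled with reconstructed ones
u_rec = np.where(n_mask == 1, u_full, u_tilde)
```

Since the toy signal lies exactly in the span of the basis, the reconstruction here is exact; with real snapshot data it minimises the gappy error (3.7).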

### Reduced basis generation

As stated previously, both the nonlinear model order reduction method and the hyper-reduction method use a reduced basis. Such a basis is generally generated by snapshot-based reduced basis methods. A prominent method in the literature is proper orthogonal decomposition.

However, as shown in [3, 10], the way the proper orthogonal decomposition framework extracts information from the high-fidelity snapshot matrix is inherently linear. When used for nonlinear problems, it removes high-frequency components that are present in, and relevant to, the dynamical evolution of these systems. Therefore, the MESS method is used for reduced basis generation.

#### Maximum entropy snapshot sampling

Let *m* and *n* be positive integers with \(m \gg n > 1\). Define a finite sequence \(X = (x_{1},x_{2}, \ldots,x_{n})\) of numerically obtained states \(x_{j}\in \mathbb{R}^{m}\) at time instances \(t_{j}\in \mathbb{R}\), with \(j \in \{1,2,\ldots,n\}\), of a dynamical system governed by either ODEs or DAEs. Given a probability distribution *p* of the states of the system, the second-order Rényi entropy of the sample *X* is

$$ H^{(2)}_{p}(X) = -\log \sum _{j=1}^{n} p(x_{j})^{2} = -\log \mathbb{E}\bigl(p(X)\bigr), $$

(3.10)

with \(p_{j} \equiv p(x_{j})\) and where \(\mathbb{E}(p(X))\) is the expected value of the probability distribution *p* with respect to *p* itself. When *n* is large enough, according to the law of large numbers, the average of \(p_{1}, p_{2}, \ldots, p_{n}\) almost surely converges to their expected value,

$$ \frac{1}{n}\sum_{j=1}^{n}p(x_{j}) \rightarrow \mathbb{E}\bigl(p(X)\bigr)\quad \text{as }n\rightarrow \infty , $$

(3.11)

thus each \(p(x_{j})\) can be approximated by the sample’s average sojourn time, or relative frequency of occurrence. To obtain this frequency of occurrence, consider a norm \(\Vert \cdot \Vert \) on \(\mathbb{R}^{m}\); the notion of occurrence can then be translated into a proximity condition. In particular, for each \(x_{j}\in \mathbb{R}^{m}\), define the open ball centred at \(x_{j}\) with radius \(\epsilon >0\),

$$ B_{\epsilon}(x) = \bigl\{ y\in \mathbb{R}^{m}\mid \Vert x - y \Vert < \epsilon \bigr\} , $$

(3.12)

and introduce the characteristic function with values

$$ \chi _{i}(x) = \textstyle\begin{cases} 1, & \text{if }x\in B_{\epsilon}(x_{i}), \\ 0, & \text{if }x\notin B_{\epsilon}(x_{i}). \end{cases} $$

(3.13)

Under the aforementioned considerations, the entropy of *X* can be estimated by

$$ \hat{H}^{(2)}_{p}(X) = -\log \frac{1}{n^{2}}\sum_{i=1}^{n}\sum _{j=1}^{n} \chi _{i}({x}_{j}). $$

(3.14)
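The estimator (3.13)–(3.14) can be sketched as follows on a toy snapshot sequence; the snapshots, the choice of the Euclidean norm and the value of *ϵ* are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 3
X = rng.standard_normal((n, m))   # n states x_j in R^m, one per row
eps = 1.5

# chi_i(x_j) = 1 iff x_j lies in the open ball B_eps(x_i), Equation (3.13)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
chi = (D < eps).astype(float)

# estimator (3.14): the double sum of ball memberships, scaled by 1/n^2,
# approximates sum_j p(x_j)^2
H2 = -np.log(chi.sum() / n**2)
```

The estimate is bounded between 0 (all states within *ϵ* of each other) and \(\log n\) (all states mutually farther than *ϵ* apart).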

Provided that the limit of the evolution of \(\hat{H}^{(2)}_{p}\) exists, it measures the sensitivity of the evolution of the system itself [4, §6.6]. Then, for *n* large enough, a reduced sequence \(X_{\mathrm{r}}=(\bar{x}_{j_{1}},\bar{x}_{j_{2}},\ldots ,\bar{x}_{j_{r}})\), with \(r\leq n\), is sampled from *X*. This is done by requiring that the entropy of \(X_{\mathrm{r}}\) is a strictly increasing function of the index \(k\in \{1,2,\ldots ,r\}\) [11]. The state vector \(\bar{x}_{j_{k}}\) added to the sampled snapshot sequence is the average of all states in the selected *ϵ*-ball. A reduced basis \(V_{x}\) is then generated from \(X_{\mathrm{r}}\) by any orthonormalisation process. It has been shown [10] that any such basis guarantees that the Euclidean reconstruction error of each snapshot is bounded from above by *ϵ*, while a similar bound holds for future snapshots, up to a specific time horizon; see Theorems 3.3 and 3.4 of [10].

To estimate the parameter *ϵ*, which determines the degree of reduction within the MESS framework, an optimisation approach is employed [3]. As no user input is required, this makes MESS a parameter-free reduction method.
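A greedy sketch of the sampling step, under the assumption that the strict entropy increase is enforced by keeping a snapshot only if it lies outside the *ϵ*-balls of all snapshots kept so far; the data and *ϵ* are hypothetical, and the averaging of states within each selected ball is omitted for brevity.

```python
import numpy as np

def mess_sample(X, eps):
    # keep a snapshot only if it is at least eps away from every snapshot
    # kept so far, so the entropy of the sampled sequence strictly grows
    selected = [0]                       # always keep the first state
    for j in range(1, X.shape[0]):
        dists = np.linalg.norm(X[selected] - X[j], axis=1)
        if np.all(dists >= eps):         # outside every kept eps-ball
            selected.append(j)
    return X[selected]

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 5))        # n = 100 snapshots in R^5
X_r = mess_sample(X, eps=2.0)            # reduced sequence X_r

# orthonormal reduced basis V_x from the sampled snapshots via QR
V_x, _ = np.linalg.qr(X_r.T)
```

Any orthonormalisation of the sampled snapshots yields a valid basis; QR is used here for simplicity.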

### Reduced order multirate

Combining the techniques above, we apply model order reduction only in the macro-step of the multirate integration scheme. Due to the partitioned nature of the system, a special type of reduced basis matrix is needed, as only the slow part has to be reduced. To illustrate an implementation of a reduced order multirate scheme, a first-order implicit Euler scheme is described. The new reduction matrix is given by \(\Phi \in \mathbb{R}^{M \times (n_{\mathrm{F}} + r)}\). Considering the split system defined by Equations (2.7a)–(2.7c), the reduction matrix is defined by

$$ \Phi = \begin{pmatrix} I_{n_{\mathrm{F}}} & 0 \\ 0 & V_{r} \end{pmatrix}. $$

(3.15)

Then, in each macro time-step, only the following reduced system needs to be solved. This nonlinear least-squares problem is solved by the Gauß–Newton method, using Φ,

$$ s^{k} = \arg \min_{a\in \mathbb{R}^{n_{\mathrm{F}}+r}} \bigl\Vert J_{r}^{k}\Phi a+R^{k} \bigr\Vert _{2}, $$

(3.16)

where *R* is defined for the whole macro-step system as

$$ R(x) = q\biggl(t_{n+1},\frac{x - x_{n}}{H}\biggr) - j(t_{n+1},x) = 0. $$

(3.17)

Note, however, that \(J_{r}^{k}\) is now the full-order Jacobian approximated using the gappy-MESS function evaluations. Once the solution for the macro time-step is obtained, the slow partition of this solution, \(x_{\mathrm{S}}\), is used for the interpolated values needed in the intermediate fast steps.
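The block structure (3.15) of the reduction matrix can be sketched as follows; all sizes and the slow-part basis are illustrative assumptions.

```python
import numpy as np

n_fast, n_slow, r = 3, 10, 2
M_dim = n_fast + n_slow                  # full state dimension M

rng = np.random.default_rng(3)
# orthonormal basis for the slow partition (stand-in for a MESS basis)
V_r, _ = np.linalg.qr(rng.standard_normal((n_slow, r)))

# Phi = blkdiag(I_{n_F}, V_r) maps the n_F + r reduced unknowns to R^M:
# the fast partition is kept at full order, the slow one is reduced
Phi = np.zeros((M_dim, n_fast + r))
Phi[:n_fast, :n_fast] = np.eye(n_fast)
Phi[n_fast:, n_fast:] = V_r

# lifting a reduced vector (fast states, slow reduced coordinates)
a = np.concatenate([np.ones(n_fast), np.array([0.5, -0.5])])
x_full = Phi @ a
```

Since the identity block and \(V_{r}\) act on disjoint partitions, Φ inherits orthonormal columns from \(V_{r}\), and the fast states pass through unchanged.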