Fast computation of function composition derivatives for flatness-based control of diffusion problems

The chain rule is a standard tool in differential calculus to find derivatives of composite functions. Faà di Bruno's formula is a generalization of the chain rule and states a method to find high-order derivatives. In this contribution, we propose an algorithm based on Faà di Bruno's formula and Bell polynomials (Bell in Ann Math 29:38–46, 1927; Parks and Krantz in A primer of real analytic functions, 2012) to compute the structure of derivatives of function compositions. The application of our method is showcased using trajectory planning for the heat equation (Laroche et al. in Int J Robust Nonlinear Control 10(8):629–643, 2000).


Introduction
The chain rule is a standard tool to differentiate function compositions in differential calculus. The first few derivatives can usually be found manually in the case of simple functions, but the calculation complexity increases with the order of differentiation. Thus, high-order derivatives are hard to find without the use of a computer.
In the research field of flatness-based control, we need a high number of derivatives to design open-loop control algorithms for large-scale systems like partial differential equations. An introduction to motion planning for systems governed by partial differential equations can be found in [3]. The smooth bump function

$$\Phi_{\omega,T}(t) = \begin{cases} \exp\left(-\left[\frac{t}{T}\left(1-\frac{t}{T}\right)\right]^{-\omega}\right) & t \in (0,T), \\ 0 & \text{otherwise}, \end{cases} \tag{1}$$

with T > 0 and ω > 1 is a typical example, which needs to be differentiated several times to compute the flatness-based input signal. Bump function (1) and its derivatives are essential to find open-loop control signals for diffusion-convection-reaction systems because bump function (1) is a so-called Gevrey-class function, which means that it is particularly smooth. Further information about the control design and the Gevrey class can be found in [3], Chap. 6 and Appendix B.1, and in [2, 4, 5]. Function compositions like Eq. (1) can be differentiated numerically, symbolically or with algorithmic differentiation. Numerical approaches like finite differences may not be applicable to high-order derivatives because they depend strongly on the order of approximation. Exact derivatives can be found with symbolic differentiation, which is provided by computer algebra systems like Maxima or Mathematica. However, the computed derivatives need to be transferred to source code for further steps of the control design. The authors of [6] implemented the derivatives of bump function (1) up to an order of 43 in Matlab. Algorithmic differentiation has gained a lot of research interest in recent years because it is mainly used in optimization and machine learning to find gradients and Hessians [7].
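Bump function (1) is straightforward to evaluate pointwise. The following sketch is illustrative only (the authors' implementation is in Julia [9]; the function name and the default parameters here are our own):

```python
import math

def bump(t, T=10.0, omega=2.0):
    """Bump function (1): exp(-[(t/T)(1 - t/T)]**(-omega)) on (0, T), zero elsewhere."""
    if t <= 0.0 or t >= T:
        return 0.0
    s = (t / T) * (1.0 - t / T)
    return math.exp(-s ** (-omega))
```

At the midpoint t = T/2 the function attains its maximum exp(-4^ω); at the boundaries all derivatives vanish, which is what makes (1) suitable for set-point transitions.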
In this article, we describe an alternative way to find the derivatives of arbitrary function compositions φ(γ(t)). Our proposed approach is based on Bell polynomials and Faà di Bruno's formula [1] and calculates the structure of all derivatives up to an order n. The algorithm is implemented in the Julia programming language [8] and is available on GitHub [9]. Related work about the computation of Bell polynomials and Faà di Bruno's formula can be found in [10] and [11]. In Sect. 2, we introduce Faà di Bruno's formula and explain its construction with Bell polynomials. The algorithm to build the structure of derivatives is stated in Sect. 3. The performance of this algorithm and its implementation are examined in Sect. 4. Finally, we demonstrate the application of our method on bump function (1) in Sect. 5.

Bell polynomials and Faà di Bruno's formula
A function composition φ(γ(t)) = (φ ∘ γ)(t) is differentiated using the chain rule as

$$\frac{\mathrm{d}}{\mathrm{d}t}\varphi(\gamma(t)) = \varphi'(\gamma(t))\,\gamma'(t),$$

with the assumption that γ : ℝ → ℝ and φ : ℝ → ℝ are sufficiently continuously differentiable. An increasing order of differentiation leads to more and more complicated and error-prone manual calculations with an increasing number of terms. The chain rule for the n-th order derivative with n ≥ 1 is generalized by Faà di Bruno's formula

$$\frac{\mathrm{d}^n}{\mathrm{d}t^n}\varphi(\gamma(t)) = \sum_{k=1}^{n} \varphi^{(k)}(\gamma(t))\, B_{n,k}\big(\gamma^{(1)}(t), \ldots, \gamma^{(n-k+1)}(t)\big), \tag{2}$$

where B_{n,k} denote the partial Bell polynomials, see [12] and [1, page 17]. The partial Bell polynomials are defined generally by

$$B_{n,k}(x_1, \ldots, x_{n-k+1}) = \sum \frac{n!}{j_1! \cdots j_{n-k+1}!} \prod_{i=1}^{n-k+1} \left(\frac{x_i}{i!}\right)^{j_i}, \tag{3}$$

where all j_i ∈ ℕ_0 and the conditions Σ_{i=1}^{n-k+1} j_i = k and Σ_{i=1}^{n-k+1} i·j_i = n have to be satisfied. Henceforth, we denote the derivatives of γ as x_i = γ^{(i)}(t). Equation (3) is rather a formal description of the Bell polynomials than a way to calculate them. Instead, partial Bell polynomials are calculated recursively by

$$B_{n,k}(x_1, \ldots, x_{n-k+1}) = \sum_{i=1}^{n-k+1} a_{n,i}\, x_i\, B_{n-i,k-1}(x_1, \ldots, x_{n-i-k+2}) \tag{4}$$

with $a_{n,i} = \binom{n-1}{i-1}$ and for n ≥ 1, k ≤ n. Recursion (4) is initialized by B_{0,0}(·) = 1, B_{n,0}(x_1, ..., x_{n+1}) = 0 for n ≥ 1 and B_{n,k}(x_1, ..., x_{n-k+1}) = 0 for n ≥ 0 and k > n. To find B_{n,k} for any useful combination (n, k) we need to know all previous Bell polynomials, e.g.
$$B_{1,1} = x_1, \quad B_{2,1} = x_2, \quad B_{2,2} = x_1^2, \quad B_{3,1} = x_3, \quad B_{3,2} = 3x_1x_2, \quad B_{3,3} = x_1^3,$$

etc. This means that the calculation of an n-th order derivative as in Eq. (2) using B_{n,k} depends on all the derivatives with B_{ν,k} for ν ∈ {1, 2, ..., n-1}. If we consider B_{ν,k} and identify k = 1 and k = ν in (4), then we yield the first and the last monomial

$$B_{\nu,1}(x_1, \ldots, x_\nu) = x_\nu, \tag{5}$$

$$B_{\nu,\nu}(x_1) = x_1^\nu, \tag{6}$$

where $a_{\nu,\nu} = \binom{\nu-1}{\nu-1} = 1$ and $a_{\nu,1} = \binom{\nu-1}{0} = 1$. We are not interested in executing Faà di Bruno's formula (2) with Bell polynomials (4) for each time step over and over again. Instead, we calculate the structure of all partial Bell polynomials B_{ν,k} once with recursion (4). Each partial Bell polynomial B_{ν,k} consists of m_{ν,k} monomials

$$B_{\nu,k} = \sum_{\mu=1}^{m_{\nu,k}} \beta_{\nu,k,\mu} \quad \text{with} \quad \beta_{\nu,k,\mu} = \alpha_{\nu,k,\mu} \prod_{i=1}^{\nu-k+1} x_i^{j_{\mu,i}}, \tag{7}$$

with μ ∈ {1, 2, ..., m_{ν,k}}, ν ∈ {1, 2, ..., n} and n ≥ 1 as the highest order of all derivatives. We compute and store the exponents j_{μ,i} and coefficients α_{ν,k,μ}. In this manner, we have to execute the recursive algorithm (4) only once to build the polynomial structures (the monomial coefficients and the corresponding exponents) and not for every function evaluation of d^ν/dt^ν φ(γ(t)). Finally, the known derivatives d^ν/ds^ν φ(s) and d^ν/dt^ν γ(t) are inserted into the computed structures. This means that the algorithm to build the structure of derivatives is independent of the specific function composition.
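Recursion (4) together with its initial values translates directly into code. The following Python sketch (the helper name `bell_partial` is ours, not from the paper's Julia package) evaluates B_{n,k} numerically:

```python
from math import comb

def bell_partial(n, k, x):
    """Evaluate the partial Bell polynomial B_{n,k}(x_1, ..., x_{n-k+1}) via recursion (4).

    x is a sequence with x[i-1] = x_i; the initial values follow the text:
    B_{0,0} = 1, B_{n,0} = 0 for n >= 1, and B_{n,k} = 0 for k > n.
    """
    if n == 0 and k == 0:
        return 1.0
    if k == 0 or k > n:
        return 0.0
    # a_{n,i} = binomial(n-1, i-1)
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))
```

For example, bell_partial(3, 2, [x1, x2, x3]) yields 3·x1·x2, and the border cases (5) and (6) are reproduced as bell_partial(ν, 1, x) = x[ν-1] and bell_partial(ν, ν, x) = x[0]**ν.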

Algorithmic representation
The procedure to build the polynomial structures is split into two parts: first, calculating the exponents j_{μ,i} and, second, finding the coefficients α_{ν,k,μ}. We begin with the computation of the exponents j_{μ,i} of monomial β_{ν,k,μ} from Equation (7), which are stored in the vector

$$J_{\nu,k,\mu} = (j_{\mu,1}, \ldots, j_{\mu,\nu-k+1})^\top. \tag{8}$$

The notation j_{μ,i} of vector J_{ν,k,μ} is shortened to improve the readability; it refers to j_{ν,k,μ,i} with i ∈ {1, ..., ν-k+1}. In this setting, we interpret the multiplication with x_i for i ∈ {1, ..., ν-k+1} in recursion (4) as a vector addition with the unit vector

$$e_i = (0_{i-1}, 1, 0_{n-i})^\top. \tag{9}$$

Furthermore, all vectors J_{ν,k,μ} are summarized as the matrix

$$J_{\nu,k} = [J_{\nu,k,1}, \ldots, J_{\nu,k,m_{\nu,k}}]. \tag{10}$$

The number of columns m_{ν,k} (or equally m) increases with ν and k. This behavior is discussed in Sect. 4. Next, we present the algorithm to compute the vectors J_{ν,k,μ}. We implement three nested loops to iterate over k ∈ {1, 2, ..., ν} (inner loop) and ν ∈ {1, 2, ..., n} (outer loop), where n is the highest order of differentiation.

Step 1: border cases

The cases k = 1 and k = ν are initialized directly from the monomials (5) and (6): matrix J_{ν,1} consists of the single column (0, ..., 0, 1)^⊤ with ν entries, and J_{ν,ν} consists of the single entry (ν).

Step 2: matrix computation
In the inner loop k ∈ {2, ..., ν-1}, we define the matrix

$$T = J_{\nu-i,k-1} + [e_i, \ldots, e_i] \tag{11}$$

and apply for all i ∈ {1, ..., ν-k+1} the matrix summation as described in Equation (9), i.e., the unit vector e_i is added to every column of J_{ν-i,k-1}. The result of matrix summation (11) is saved for i = 1 as J̃_{ν,k} ← T, and for i > 1 we apply the concatenation J̃_{ν,k} ← [J̃_{ν,k}, T]. We use J̃_{ν,k} = [J̃_{ν,k,1}, ..., J̃_{ν,k,m̃}] as temporary storage because we are only interested in the unique columns of J̃_{ν,k}. Therefore, at the end of each iteration k we filter for unique entries and save the result as J_{ν,k} ← unique(J̃_{ν,k}).

Step 3: coefficients and final formulation
We derive the coefficients α_{ν,k,μ} from the definition of the partial Bell polynomials (3) as

$$\alpha_{\nu,k,\mu} = \nu! \prod_{i=1}^{\nu-k+1} \frac{1}{j_{\mu,i}!\,(i!)^{j_{\mu,i}}}, \tag{12}$$

where j_{μ,i} (or equally j_{ν,k,μ,i}) are the matrix elements of J_{ν,k} as introduced in Equations (8) and (10). Finally, we find all partial Bell polynomials by evaluating

$$B_{\nu,k}(x_1, \ldots, x_{\nu-k+1}) = \sum_{\mu=1}^{m_{\nu,k}} \alpha_{\nu,k,\mu} \prod_{i=1}^{\nu-k+1} x_i^{j_{\mu,i}}. \tag{13}$$

Consequently, we yield the ν-th derivative as

$$\frac{\mathrm{d}^\nu}{\mathrm{d}t^\nu}\varphi(\gamma(t)) = \sum_{k=1}^{\nu} \varphi^{(k)}(\gamma(t))\, B_{\nu,k}\big(\gamma^{(1)}(t), \ldots, \gamma^{(\nu-k+1)}(t)\big).$$
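Steps 1 to 3 can be condensed into a short sketch. Instead of the recursive construction with unique filtering, the Python code below (illustrative names; not the authors' Julia implementation) enumerates the exponent vectors directly from the two conditions in Eq. (3), computes the coefficients with Eq. (12) and assembles the derivative via Eqs. (2) and (13):

```python
import math

def exponent_vectors(nu, k):
    """All exponent vectors (j_1, ..., j_{nu-k+1}) satisfying the conditions of
    Eq. (3): sum_i j_i = k and sum_i i*j_i = nu. This yields the same columns
    as the recursive construction of J_{nu,k} with unique filtering."""
    vectors = []
    def extend(i, rem_k, rem_nu, j):
        if i > nu - k + 1:
            if rem_k == 0 and rem_nu == 0:
                vectors.append(tuple(j))
            return
        for ji in range(min(rem_k, rem_nu // i) + 1):
            extend(i + 1, rem_k - ji, rem_nu - i * ji, j + [ji])
    extend(1, k, nu, [])
    return vectors

def coefficient(nu, j):
    """alpha_{nu,k,mu} = nu! / prod_i (j_i! * (i!)**j_i), cf. Eq. (12)."""
    den = 1
    for i, ji in enumerate(j, start=1):
        den *= math.factorial(ji) * math.factorial(i) ** ji
    return math.factorial(nu) // den

def derivative(nu, phi_derivs, gamma_derivs):
    """d^nu/dt^nu phi(gamma(t)) via Eqs. (2) and (13), given the values
    phi_derivs[k] = phi^{(k)}(gamma(t)) and gamma_derivs[i-1] = gamma^{(i)}(t)."""
    total = 0.0
    for k in range(1, nu + 1):
        bell = sum(coefficient(nu, j)
                   * math.prod(gamma_derivs[i - 1] ** ji
                               for i, ji in enumerate(j, start=1))
                   for j in exponent_vectors(nu, k))
        total += phi_derivs[k] * bell
    return total
```

For φ = exp and γ(t) = t² at t = 1, the sketch reproduces d³/dt³ exp(t²) = (12t + 8t³)e^{t²} = 20e.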

Applicability and performance
Several methods exist to calculate derivatives via numerical, symbolic or algorithmic differentiation, as mentioned in Sect. 1. Our proposed method shall evince an additional path beside symbolic and algorithmic differentiation. The proposed algorithm in Sect. 3 calculates the polynomial structure of all derivatives up to the n-th order of any function composition φ(γ(t)) = (φ ∘ γ)(t). Thus, the algorithm does not depend on any specific functions φ and γ. However, the derivatives d^ν/ds^ν φ(s) and d^ν/dt^ν γ(t) for ν ∈ {1, ..., n} must exist and need to be implemented in source code to finally compute the derivatives. Table 1 contains a list of some example functions and their ν-th order derivatives, which can be used for φ(x), γ(x), d^ν/dx^ν φ(x) and d^ν/dx^ν γ(x), to demonstrate the algorithm. The advantage of these example functions is that the derivatives can be directly implemented in source code. The Gamma function in Table 1 arises, e.g., in the ν-th order derivative of the power function. This means that our approach is not based on numerical, symbolic or algorithmic differentiation. In particular, it does not use symbolic computation to find the derivatives. However, the authors cannot exclude that existing computer algebra systems or symbolic toolboxes use ideas or parts of our presented approach.
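Derivative formulas of the kind collected in Table 1 can be implemented as plain functions of x and ν. The examples below are standard closed forms (we do not claim to reproduce the exact entries of Table 1); the power rule shows where the Gamma function enters:

```python
import math

# nu-th order derivatives d^nu/dx^nu for some typical choices of phi or gamma
def d_exp(x, nu):
    """d^nu/dx^nu exp(x) = exp(x)."""
    return math.exp(x)

def d_sin(x, nu):
    """d^nu/dx^nu sin(x) = sin(x + nu*pi/2)."""
    return math.sin(x + nu * math.pi / 2.0)

def d_pow(x, nu, p):
    """d^nu/dx^nu x**p = Gamma(p+1)/Gamma(p-nu+1) * x**(p-nu); the Gamma
    function also covers non-integer exponents p."""
    return math.gamma(p + 1.0) / math.gamma(p - nu + 1.0) * x ** (p - nu)
```

Such closures can be passed to the structure evaluation without any symbolic machinery.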
One of the main challenges of our approach is its performance because each new derivative depends on all of its previous derivatives. The described algorithm consists of two parts: the computation of the matrices J_{ν,k} and of the coefficients α_{ν,k,μ}. The algorithm to compute J_{ν,k} consists of three nested loops. In each ν-th outer iteration, the number of inner for-loop iterations is found by

$$\mathrm{iter}_J(\nu) = \sum_{k=2}^{\nu-1} (\nu - k + 1) = \frac{\nu^2 - \nu - 2}{2}.$$

This quadratic scaling increases the computing time significantly in the case of high-order derivatives like n > 20. The number of iterations iter_J(ν) is listed in Table 2 and the computing time is portrayed in Fig. 1 (a). We observe from the computations of J_{ν,k} that the number of its columns can be described by

$$m_{\nu,k} = m_{\nu-1,k-1} + m_{\nu-k,k}$$

for k ∈ {1, ..., ν-1} with m_{ν,ν} = 1. This integer sequence is known as A008284 [13]. Thus, for each derivative ν we need

$$\mathrm{iter}_\alpha(\nu) = \sum_{k=1}^{\nu} m_{\nu,k}$$

iterations to calculate the coefficients α_{ν,k,μ}. The resulting number of iterations iter_α(ν) is listed in Table 2 and the computing time to find the coefficients α_{ν,k,μ} is displayed in Fig. 1 (b). We find that the computation of α_{ν,k,μ} performs much slower than that of J_{ν,k}. We already discussed that the computation of the ν-th element of J_{ν,k} depends on all previous ones. In contrast to J_{ν,k}, the ν-th element of α_{ν,k,μ} is computed separately. Thus, the computation of α_{ν,k,μ} might be intensively parallelized to reduce the computing time. The computing times in Fig. 1 are measured for the computation of J_{ν,k} and α_{ν,k,μ} separately.
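The column counts m_{ν,k} follow the partition-number recurrence of A008284 and can be tabulated cheaply ahead of time. A Python sketch (function names are ours):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def m(nu, k):
    """Number of monomials of B_{nu,k}: partitions of nu into exactly k parts,
    OEIS A008284, via m(nu,k) = m(nu-1,k-1) + m(nu-k,k)."""
    if k == 0:
        return 1 if nu == 0 else 0
    if k > nu:
        return 0
    return m(nu - 1, k - 1) + m(nu - k, k)

def iter_alpha(nu):
    """Number of coefficient computations for the nu-th derivative: sum over k of m(nu,k)."""
    return sum(m(nu, k) for k in range(1, nu + 1))
```

The sum over k equals the total number of integer partitions of ν, so the monomial count grows subexponentially with the order of differentiation.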

Application example
In this section, we apply the presented methods on bump function (1) to generate the input signal

$$u(t) = \sum_{\nu=0}^{\infty} \frac{L^{2\nu}}{(2\nu)!}\, y^{(\nu)}(t), \qquad y(t) = \frac{1}{\hat{\Phi}} \int_0^t \Phi_{\omega,T}(\tau)\,\mathrm{d}\tau, \tag{14}$$

with Φ̂ := ∫_0^T Φ_{ω,T}(τ) dτ for a one-dimensional heat conduction model. This control design is introduced and discussed in [2]. A step-by-step derivation of input signal (14) is noted in [5]. We consider the temperature θ : [0, T] × [0, L] → ℝ_{≥0} in a one-dimensional rod with length L > 0. Input function (14) shall steer the measured temperature y(t) = θ(t, L) of the linear heat equation

$$\frac{\partial \theta}{\partial t}(t,x) = \frac{\partial^2 \theta}{\partial x^2}(t,x)$$

from y(0) = 0 to y(T) = y(0) + 1 = 1. The input signal is applied on the left side and a thermal isolation is assumed on the right side via the boundary conditions

$$\theta(t,0) = u(t), \qquad \frac{\partial \theta}{\partial x}(t,L) = 0.$$

We set the parameters as ω = 2, T = 10 seconds, L = 1 meter and the initial value as θ(0, x) = 0. Theoretically, an infinite number of derivatives of bump function (1) is necessary to compute input signal u(t). We assume n = 10 for implementation purposes because the series of input signal (14) converges after a few iterations, see also [2, 5]. Bump function (1) is formulated as exp(f(g(t))) for t ∈ (0, T) where

$$f(s) = -s^{-\omega}, \qquad g(t) = \frac{t}{T}\left(1 - \frac{t}{T}\right).$$

The algorithm builds the polynomial structure once, and the computed structure is applied here two times: in the first step on f(g(t)) =: q(t) and in the second step on Φ_{ω,T}(t) = exp(q(t)). The derivatives of f(s) and g(t) are found as

$$f^{(\nu)}(s) = (-1)^{\nu+1} \frac{\Gamma(\omega+\nu)}{\Gamma(\omega)}\, s^{-(\omega+\nu)}, \qquad g^{(1)}(t) = \frac{1}{T}\left(1 - \frac{2t}{T}\right), \quad g^{(2)}(t) = -\frac{2}{T^2}, \quad g^{(\nu)}(t) = 0 \text{ for } \nu \geq 3.$$

See also the derivatives in Table 1. Our algorithm computes the matrices J_{ν,k} and coefficients α_{ν,k,μ}. We yield the derivatives d^ν/dt^ν q(t) with Eq. (13). Next, we use the same data J_{ν,k} and α_{ν,k,μ} with d^ν/dt^ν q(t) and Eq. (13) to compute the derivatives d^ν/dt^ν Φ_{ω,T}(t) = d^ν/dt^ν exp(q(t)). Bump function Φ_{ω,T}(t) and its computed derivatives Φ̇_{ω,T}, Φ̈_{ω,T} and Φ^{(8)}_{ω,T}, Φ^{(9)}_{ω,T}, Φ^{(10)}_{ω,T} are portrayed in Fig. 2.

Figure 2
Figure 2 Bump function Φ_{ω,T} and derivatives Φ̇_{ω,T} and Φ̈_{ω,T} (left); and derivatives Φ^{(8)}_{ω,T}(t), Φ^{(9)}_{ω,T}(t) and Φ^{(10)}_{ω,T}(t) (right)
One notes that the minima and maxima increase with the order of differentiation. The input function u(t) is computed as in Eq. (14) using the derivatives Φ^{(ν)}_{ω,T} and is depicted in Fig. 3.
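The two-stage computation described above can be sketched end to end. The following self-contained Python code is illustrative (the authors' Julia implementation builds the structures once, whereas this sketch re-evaluates recursion (4) for brevity); it computes Φ_{ω,T} and its first n derivatives via q = f(g(t)) and Φ = exp(q):

```python
import math

def bell(n, k, x):
    """Partial Bell polynomial B_{n,k} via recursion (4); x[i-1] holds x_i."""
    if n == 0 and k == 0:
        return 1.0
    if k == 0 or k > n:
        return 0.0
    return sum(math.comb(n - 1, i - 1) * x[i - 1] * bell(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def compose_derivs(phi_d, gam_d, n):
    """[d^nu/dt^nu phi(gamma(t)) for nu = 1..n] by Faa di Bruno's formula (2),
    given phi_d[k] = phi^{(k)}(gamma(t)) and gam_d[i-1] = gamma^{(i)}(t)."""
    return [sum(phi_d[k] * bell(nu, k, gam_d) for k in range(1, nu + 1))
            for nu in range(1, n + 1)]

def bump_derivs(t, n=10, T=10.0, omega=2.0):
    """[Phi(t), Phi'(t), ..., Phi^{(n)}(t)] for bump function (1) on (0, T),
    computed in two stages: q = f(g(t)), then Phi = exp(q)."""
    g = (t / T) * (1.0 - t / T)
    # derivatives of g(t) = (t/T)(1 - t/T): all orders >= 3 vanish
    g_d = [(1.0 - 2.0 * t / T) / T, -2.0 / T ** 2] + [0.0] * (n - 2)
    # f(s) = -s**(-omega), f^{(k)}(s) = (-1)**(k+1) Gamma(omega+k)/Gamma(omega) s**(-(omega+k))
    f_d = [-g ** (-omega)] + [
        (-1.0) ** (k + 1) * math.gamma(omega + k) / math.gamma(omega) * g ** (-(omega + k))
        for k in range(1, n + 1)]
    q_d = compose_derivs(f_d, g_d, n)      # stage 1: derivatives of q = f(g(t))
    phi = math.exp(f_d[0])                 # Phi(t) = exp(q(t))
    return [phi] + compose_derivs([phi] * (n + 1), q_d, n)  # stage 2: exp^{(k)} = exp
```

With ω = 2 and T = 10, the maximum is Φ(T/2) = exp(-16) and Φ'(T/2) = 0 since g'(T/2) = 0; the truncated series (14) over these derivatives then yields the input signal u(t).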

Conclusion
In this contribution, we presented a procedure to compute the polynomial structure of derivatives of function compositions using Bell polynomials and Faà di Bruno's formula. The presented algorithm is implemented as Julia source code and is applied to an example function from flatness-based control. One of the main challenges of our proposed method is the performance of the computation of the coefficients α_{ν,k,μ}. These costs may be reduced in subsequent development of the proposed method using parallelization, as briefly noted in Sect. 4. In further research, symbolic and algorithmic differentiation methods need to be compared substantially with Faà di Bruno's approach to improve the computational differentiation of function compositions.

Figure 1
Figure 1 Computing time of J_{ν,k} (left) and computing time of α_{ν,k,μ} (right). The red areas describe the upper and lower bounds

Table 1
Example functions and derivatives of φ and γ

Table 2
Number of iterations to calculate J_{ν,k} and α_{ν,k,μ}