
Optimal control of multiphase steel production

Abstract

An optimal control problem for the production of multiphase steel is investigated that takes into account phase transformations in the steel slab. The state equations are a semilinear heat equation coupled with an ordinary differential equation that describes the evolution of the steel microstructure. The time-dependent heat transfer coefficient serves as the control function. Necessary and sufficient optimality conditions for the control problem are derived. For the numerical solution of the control problem, a reduced sequential quadratic programming method with a primal-dual active set strategy is developed. Numerical results are presented for the optimal control of a cooling line in the production of hot-rolled Mo–Mn dual phase steel.

1 Introduction

We consider an optimal control problem that describes the hot rolling process of multiphase steel, in particular dual phase (DP) steel. Dual phase steels have shown high potential for automotive applications due to their combination of high strength and good formability. The microstructure of DP steel typically consists of a soft ferrite phase with dispersed islands of hard martensite as the secondary phase [3]. The essential industrial process route for the production of DP steel consists of hot rolling and subsequent controlled cooling on the run out table (ROT), which is located behind the finishing mill.

The hot rolling process of dual phase steel consists of four steps as shown in Fig. 1: rolling in roughing and finishing stands, which refines the grain size of austenite (the initial phase) through repeated static recrystallization (1), laminar cooling into the two-phase region (2), isothermal holding at temperatures in the ferrite transformation region, where the temperature remains relatively constant (3), and finally, fast continuous cooling to the required coiling temperature, during which the martensite transformation takes place and the bainite transformation can be avoided (4).

Figure 1

A sketch of the processing scheme for hot-rolled dual phase steel

The controlled cooling of stages (2)–(4) happens on the run out table. Here, the most important control parameters are the flow-rate of water and the feed velocity of the strip. Since the process window for the adjustment of the phase composition is very tight, the computation of optimal process parameters is an important task. The goal of this paper is the analysis of a mathematical optimal control problem to compute the desired ferrite fraction and temperature at the end of step 3 of the process.

The controlled cooling of steel is a well-studied topic in engineering science and mathematics, and a variety of control approaches have been used. An algorithm for the computation of optimal strategies for the cooling of steel strips in hot strip mills was proposed by Landl et al. [17]. The authors treated the determination of a suitable cooling strategy as a discrete optimization problem and demonstrated numerical results for a real hot rolling mill. While they considered an integer optimization problem for switching cooling sections on and off, the goal of this study is to optimize the amount of coolant in a single cooling section. Lezius and Tröltzsch [18] considered a simplified numerical approach for the controlled cooling of steel profiles. A method of model predictive control for the temperature evolution of the strip has been proposed by Hashimoto, Yoshioka and Ohtsuka [10]. In Zheng and Li [26], a control strategy based on a Kalman filter and model predictive control is discussed for the laminar cooling process of hot-rolled strip. Wang et al. [25] discussed a method to calculate the convective heat transfer coefficient by combining a mathematical model with a back propagation neural network. While previous optimal control approaches for run out tables focus solely on the evolution of temperature, the main novelty of this paper is that we put a special emphasis on the microstructure, i.e., the composition of steel phases produced upon cooling. As mentioned earlier, from an application point of view this is of high relevance, especially for the production of modern multiphase steels such as dual phase or TRIP steels.

We formulate an optimal control problem which consists in obtaining the cooling strategy such that the desired dual phase microstructure in steel is reached most accurately. This problem is a nonlinear boundary control problem, in which the state system consists of a semilinear heat equation coupled with an ordinary differential equation. The latter describes the evolution of the ferrite phase fraction. The heat transfer coefficient in the Newton type cooling boundary condition acts as the control parameter. In a previous paper [4], we have shown how to relate this coefficient to the flow-rate of coolant in a real cooling process. The scope of this paper is to analyze the resulting boundary coefficient control problem subject to a semilinear heat equation and rate law to describe the evolution of ferrite phase. Due to the nonlinearity in the coupling term on the right-hand side of the heat equation, the state system requires a detailed analysis, especially concerning the regularity of the solutions, which is of crucial importance for the derivation of second-order sufficient optimality conditions.

We investigate the existence of a solution and derive the first-order necessary and second-order sufficient optimality conditions, which form the basis for the convergence of second-order optimization algorithms. Second-order optimality conditions for control problems governed by parabolic equations have been discussed, e.g., in Goldberg and Tröltzsch [7] and Raymond and Tröltzsch [20]. In comparison to the very general and abstract setting of the latter contribution, the main novelty of this paper is twofold: we consider a control-in-coefficient problem, and we add an additional evolution equation to the state system to account for the evolution of the steel microstructure.

To solve the control problem numerically, we use a reduced sequential quadratic programming (rSQP) method. This method has proven to be very effective in many areas of application, such as optimal control. Successful numerical applications of the rSQP method to parabolic control problems have been reported by Hintermüller, Volkwein and Diwoky [12] and by Kupfer and Sachs [16].

In each iteration of the rSQP method, a quadratic optimal control problem \((\mathit{QP}^{k})\) with control constraints has to be solved. To treat the \((\mathit{QP}^{k})\) problems, we apply a primal-dual active set strategy as, for instance, proposed by Bergounioux, Ito and Kunisch [2] for control constrained optimal control problems.

The paper is organized as follows: In Sect. 2, we analyze the optimal control problem and derive optimality conditions. In Sect. 3, we discuss the numerical optimization algorithms, i.e., the reduced SQP method with the active set strategy. The last section is devoted to numerical results.

2 The optimal control problem

2.1 Problem formulation and assumptions

We consider an optimal control problem for the controlled cooling of steel profiles in order to obtain a desired temperature and phase distribution in the steel slab. After the last deformation step, the steel sheet is cooled by water jets on the run out table, where the steel undergoes the austenite-ferrite phase transformation, see, e.g., [3]. The evolution of ferrite can be described in general form by the following initial value problem

$$ \begin{gathered} f_{t} =G(\theta ,f), \\ f(0) =0. \end{gathered} $$

Here, f denotes the volume fraction of ferrite and θ refers to the temperature. Typically, G is a nonlinear function of its arguments θ and f. For an example of a concrete model for the austenite-ferrite phase transformation in the hot rolling process, we refer to [22]. The temperature distribution in the steel slab is described by the heat equation

$$ \rho c_{p}\theta _{t}-\kappa \Delta \theta =\rho Lf_{t}. $$

The density ρ, the heat capacity \(c_{p}\), the heat conductivity κ (written k in the state system (2a)–(2f)) and the latent heat L are assumed to be positive constants. The term \(\rho Lf_{t}\) describes the release of latent heat during the austenite-ferrite phase transformation. The boundary condition for the temperature imposed on the top and the bottom boundary of the domain Ω is given by Newton’s law of cooling

$$ -\kappa \frac{\partial \theta }{\partial n}=u(t)\beta (x) (\theta - \theta _{w}), $$

where \(\theta _{w}\) is the temperature of the coolant. The proportionality factor is the heat transfer coefficient, which is split into two parts, one depending only on time and the other only on the space variable. The function β can describe, for instance, a profile of cooling medium distribution on the surface of the steel slab, see Fig. 2. The function u can be expressed through a coolant flow-rate during the cooling and serves as the control variable in our problem.

Figure 2

The scheme of the cooling of steel profiles

We seek an optimal cooling strategy \(\bar{u}=\bar{u}(t)\) such that a desired final phase distribution \(f_{d}(x)\) is reached. At the same time, we want the temperature \(\theta _{d}(x,t)\) to be realized during the cooling process. Thus, the control problem (P) to obtain an optimal time-dependent heat transfer coefficient \(u(t)\) can be formulated as follows:

$$ \min_{\theta ,f,u} J(\theta ,f,u)=\frac{\alpha _{1}}{2} \int _{\varOmega } \bigl(f(x,T)-f_{d}(x)\bigr)^{2} \,dx+\frac{\alpha _{2}}{2} \iint _{Q}( \theta -\theta _{d})^{2}\,dx \,dt +\frac{\alpha _{3}}{2} \int _{0} ^{T}u^{2}\,dt $$
(1)

subject to

$$\begin{aligned}& f_{t} =G(\theta ,f),\quad \text{in }Q=\varOmega \times (0,T), \end{aligned}$$
(2a)
$$\begin{aligned}& f(0) =0,\quad \text{in }\varOmega, \end{aligned}$$
(2b)
$$\begin{aligned}& \rho c_{p}\theta _{t}-k\Delta \theta =\rho Lf_{t},\quad \text{in }Q, \end{aligned}$$
(2c)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =u(t)\beta (x) (\theta -\theta _{w}),\quad \text{on }\varSigma _{1}=\varGamma _{1}\times (0,T), \end{aligned}$$
(2d)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =0,\quad \text{on }\varSigma _{2}=( \partial \varOmega \setminus \varGamma _{1})\times (0,T), \end{aligned}$$
(2e)
$$\begin{aligned}& \theta (0) =\theta _{0},\quad \text{in }\varOmega \end{aligned}$$
(2f)

and

$$ u\in U_{\mathrm{ad}}=\bigl\{ u \in L^{\infty }(0,T): u_{a}\le u \le u_{b}, u_{a},u _{b}\ge 0\bigr\} , $$

where \(\varGamma _{1}\) denotes the top and the bottom boundary of the domain Ω (see Fig. 2). The factors \(\alpha _{i}\), \(i=1,\ldots,3\), are positive constants. The third term in the cost functional represents a Tikhonov regularization term that can also be interpreted as a measure of the costs of the control. The control is bounded by two nonnegative constants \(u_{a}\) and \(u_{b}\), since we consider only the cooling process and since the maximal amount of coolant is restricted.
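To make the structure of (P) concrete, the following is a minimal sketch of a discretized state system and cost functional for a one-dimensional slab cooled on both faces. It is only an illustration under stated assumptions: the rate law `G`, all material and process values, the weights `alphas`, and the explicit finite difference scheme are hypothetical choices made here and are not taken from the paper. In particular, the clipped `G` is bounded and Lipschitz but not twice continuously differentiable, so it does not satisfy the smoothness requirement of (A2).

```python
import numpy as np

def G(theta, f, theta_eq=1000.0, rate=0.5):
    """Hypothetical bounded rate law: growth driven by undercooling below theta_eq."""
    undercooling = np.clip((theta_eq - theta) / theta_eq, 0.0, 1.0)
    return rate * undercooling * np.clip(1.0 - f, 0.0, 1.0)

def solve_state(u, T=10.0, nx=11, length=0.01, theta0=1100.0, theta_w=300.0,
                rho=7800.0, cp=600.0, kappa=30.0, latent=7.7e4, beta=1.0):
    """Explicit Euler in time, central differences in space (illustrative values).

    u holds one value of the heat transfer coefficient per time step.
    """
    nt = len(u)
    dt, dx = T / nt, length / (nx - 1)
    r = kappa * dt / (rho * cp * dx**2)
    if r > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    theta = np.full(nx, theta0)
    f = np.zeros(nx)
    history = [theta.copy()]
    for n in range(nt):
        f_t = G(theta, f)
        # Ghost values encode -kappa * d(theta)/dn = u * beta * (theta - theta_w).
        flux = 2.0 * dx * u[n] * beta / kappa
        left = theta[1] - flux * (theta[0] - theta_w)
        right = theta[-2] - flux * (theta[-1] - theta_w)
        padded = np.concatenate(([left], theta, [right]))
        laplace = padded[:-2] - 2.0 * padded[1:-1] + padded[2:]
        # Heat equation divided by rho * cp; latent heat enters through f_t.
        theta = theta + r * laplace + dt * (latent / cp) * f_t
        f = f + dt * f_t
        history.append(theta.copy())
    return np.array(history), f, dt, dx

def cost(theta_hist, f_final, u, dt, dx, f_d, theta_d, alphas=(1.0, 1e-6, 1e-8)):
    """Discrete analogue of J in (1): trapezoidal rule in x, rectangle rule in t."""
    w = np.full(f_final.size, dx)
    w[[0, -1]] *= 0.5
    a1, a2, a3 = alphas
    phase = 0.5 * a1 * np.sum(w * (f_final - f_d) ** 2)
    track = 0.5 * a2 * dt * np.sum((theta_hist[1:] - theta_d) ** 2 @ w)
    effort = 0.5 * a3 * dt * np.sum(np.asarray(u) ** 2)
    return phase + track + effort

# Usage: a constant control, evaluated against hypothetical targets.
u = np.full(400, 2000.0)
theta_hist, f_final, dt, dx = solve_state(u)
J = cost(theta_hist, f_final, u, dt, dx, f_d=0.7, theta_d=900.0)
```

The control enters only through the boundary coefficient, which is the feature that makes (P) a control-in-coefficient problem: the map from `u` to the state is nonlinear even if `G` were linear.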

Further, we make some assumptions on the quantities of the optimal control problem that we need for the analysis.

Assumptions

  1. (A1)

    \(\varOmega \subset \mathbb{R}^{3}\) denotes a bounded domain with Lipschitz boundary ∂Ω.

  2. (A2)

    The function \(G=G(\theta ,f)\) is twice continuously differentiable with respect to θ and f. There is a constant \(M>0\), such that

    $$ \bigl\vert G(\theta ,f) \bigr\vert \le M,\quad \forall (\theta ,f)\in \mathbb{R}^{2}. $$

    The second derivative of G w.r.t. \((\theta ,f)\) is uniformly Lipschitz on bounded sets, i.e., for all \(M>0\) there exists \(L_{M}>0\) such that G satisfies

    $$ \bigl\vert G''(\theta _{1},f_{1})-G''( \theta _{2},f_{2}) \bigr\vert \le L_{M}\bigl( \vert \theta _{1}- \theta _{2} \vert + \vert f_{1}-f_{2} \vert \bigr) $$

    for all \(\theta _{i}, f_{i}\in \mathbb{R}\) with \(\vert \theta _{i} \vert , \vert f_{i} \vert \le M\), \(i=1,2\).

  3. (A3)

    \(\beta \in L^{\infty }(\varSigma _{1})\), \(\theta _{w}\in L^{\infty }(\varSigma _{1})\), \(\theta _{0}\in C(\bar{\varOmega })\) and \(\theta _{d}\in L^{\infty }(Q)\).

  4. (A4)

    \(f_{d}\in L^{\infty }(\varOmega )\), \(0\le f_{d}\le 1\) a.e. in Ω.

Remark 1

Assumption (A2) can be relaxed and has been chosen only to avoid technicalities when computing the derivatives. For more realistic phase transformation models we refer to [6].

Remark 2

The choice of the cost functional in (1) is somewhat arbitrary. Mutatis mutandis, also a control of the temperature at end-time and/or a control of the distributed ferrite fraction is possible.

2.2 Analysis of the state system

Let us start with the discussion of the initial value problem (2a)–(2b) in the state system. In view of the assumptions, the following result can be proven by standard arguments. For a detailed proof, we refer to [13] or [14].

Lemma 1

Suppose that (A2) holds true. Then, we have the following:

  1. (a)

    Let \(\theta \in L^{1}(Q)\) be given, then (2a), (2b) has a unique solution \(f\in W^{1,\infty }(0,T;L^{\infty }(\varOmega ))\) and

    $$ \Vert f \Vert _{W^{1,\infty }(0,T;L^{\infty }(\varOmega ))}\le M_{1} $$

    with a constant \(M_{1}>0\) independent of θ.

  2. (b)

    Let \(\theta _{1}, \theta _{2}\in L^{p}(Q)\), \(1\le p <\infty \) and let \(f_{1}\), \(f_{2}\) be the corresponding solutions of (2a), (2b), then there exists a constant \(M_{2}>0\) such that

    $$ \Vert f_{1}-f_{2} \Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))}\le M_{2} \Vert \theta _{1}- \theta _{2} \Vert _{L^{p}(Q)}. $$

Before considering the heat equation, we recall the following results from the theory of linear parabolic equations. We consider the following linear parabolic problem

$$\begin{aligned}& \rho c_{p}\theta _{t}-k\Delta \theta =r, \quad \text{in }Q, \end{aligned}$$
(3a)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =u(t)\beta (x) (\theta -\theta _{w}),\quad \text{on }\varSigma _{1}, \end{aligned}$$
(3b)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =0,\quad \text{on }\varSigma _{2}, \end{aligned}$$
(3c)
$$\begin{aligned}& \theta (0) =\theta _{0},\quad \text{in }\varOmega . \end{aligned}$$
(3d)

It is well known that a suitable function space for the solution of linear parabolic partial differential equations is

$$ W(0,T)=\bigl\{ \theta \in L^{2}\bigl(0,T;H^{1}(\varOmega ) \bigr): \theta _{t}\in L^{2}\bigl(0,T;H ^{1}(\varOmega )^{*}\bigr)\bigr\} . $$

Under additional assumptions on the data r, u, \(\theta _{w}\), \(\theta _{0}\), the following result can be obtained from Theorem 5.5 in Tröltzsch [24]:

Lemma 2

Suppose that (A3) holds true and that \(r\in L^{s_{1}}(Q)\), \(u\in L^{ \infty }(0,T)\), \(u\ge 0\), with \(s_{1}>5/2\) and \(s_{2}>4\). Then the initial-boundary value problem (3a)–(3d) admits a unique solution \(\theta \in W(0,T)\cap C(\bar{Q})\) satisfying, with a constant \(C>0\), the a priori estimate

$$ \Vert \theta \Vert _{W(0,T)}+ \Vert \theta \Vert _{C(\bar{Q})}\le C\bigl( \Vert r \Vert _{L^{s_{1}}(Q)}+ \Vert u \Vert _{L^{s_{2}}(0,T)}+ \Vert \theta _{0} \Vert _{C(\bar{\varOmega })}\bigr). $$
(4)

This result is used in the proof of solvability of the state system (2a)–(2f), which is discussed below.

Theorem 1

Let (A1)–(A4) be satisfied. Then, the state system (2a)–(2f) admits for every control \(u\in U_{\mathrm{ad}}\) a unique solution

$$ (\theta , f)\in W(0,T)\cap C(\bar{Q})\times W^{1,\infty }\bigl(0,T;L^{ \infty }( \varOmega )\bigr) $$

satisfying

$$ \Vert \theta \Vert _{W(0,T)}+ \Vert \theta \Vert _{C(\bar{Q})}+ \Vert f \Vert _{W^{1,\infty }(0,T;L ^{\infty }(\varOmega ))}\le M_{3}. $$

Proof

If not otherwise stated, c denotes a generic constant, not to be confused with the heat capacity \(c_{p}\). To prove the existence of a unique local solution to (2c)–(2f), we apply Banach’s fixed point theorem. For that purpose, we define an operator \(F: K\subset W(0,T)\rightarrow W(0,T)\) that maps \(\hat{\theta }\in W(0,T)\) to the solution θ of

$$\begin{aligned}& \rho c_{p}\theta _{t}-k\Delta \theta =\rho L\hat{f}_{t},\quad \text{in }Q, \end{aligned}$$
(5a)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =u\beta (\theta -\theta _{w}),\quad \text{on } \varSigma _{1}, \end{aligned}$$
(5b)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =0,\quad \text{on }\varSigma _{2}, \end{aligned}$$
(5c)
$$\begin{aligned}& \theta (0) =\theta _{0}\quad \text{in }\varOmega , \end{aligned}$$
(5d)

where \(\hat{f}\) solves (2a)–(2b) with \(\theta =\hat{\theta }\).

From Lemma 1 we find that \(\hat{f}\in W^{1,\infty }(0,T;L ^{\infty }(\varOmega ))\) is uniquely determined. It follows from the theory of the linear parabolic equations that the problem (5a)–(5d) possesses a unique solution in \(W(0,T)\) (see, e.g., [24], Chap. 3.4.4). Hence, we can conclude that F is well-defined. Furthermore, the following a priori estimate with a constant \(C_{1}>0\) is valid

$$ \Vert \theta \Vert _{W(0,T)}\le C_{1}\bigl( \Vert \hat{f}_{t} \Vert _{L^{2}(Q)}+ \Vert u\beta \theta _{w} \Vert _{L^{2}(\varSigma _{1})}+ \Vert \theta _{0} \Vert _{L^{2}(\varOmega )}\bigr)\le C_{2}, $$

where \(C_{2}\) depends only on \(\theta _{0}\) and the constant \(M_{1}\) from Lemma 1. Hence, if M is chosen large enough, F is a self-mapping on

$$ K=\bigl\{ \eta \in W(0,T): \Vert \eta \Vert _{W(0,T)}\le M \bigr\} . $$

Now, we want to show that F is a contraction. Let \(\hat{\theta } _{i}\in K\), \(i=1,2\), \(\theta _{i}=F(\hat{\theta }_{i})\) and \(\hat{\theta }=\hat{\theta }_{1}-\hat{\theta }_{2}\). Then, \(\theta = \theta _{1}-\theta _{2}\) solves

$$\begin{aligned}& \rho c_{p}\theta _{t}-k\Delta \theta =\rho L\bigl(G(\hat{ \theta }_{1},f _{1})-G(\hat{\theta }_{2},f_{2}) \bigr),\quad \text{in }Q, \\& -k\frac{\partial \theta }{\partial n} =u(t)\beta (x)\theta ,\quad \text{on }\varSigma _{1}, \\& -k\frac{\partial \theta }{\partial n} =0,\quad \text{on }\varSigma _{2}, \\& \theta (0) =0\quad \text{in }\varOmega. \end{aligned}$$

Here again, we use the a priori estimate

$$ \Vert \theta \Vert _{W(0,T)}\le c \bigl\Vert G(\hat{ \theta }_{1},f_{1})-G( \hat{\theta }_{2},f_{2}) \bigr\Vert _{L^{2}(Q)}. $$
(6)

Due to the Lipschitz continuity of G in both variables (Assumption (A2)) and Lemma 1(b), we obtain

$$ \Vert \theta \Vert _{W(0,T)}\le c\bigl( \Vert \hat{\theta } \Vert _{L ^{2}(Q)}+ \Vert f_{1}-f_{2} \Vert _{L^{2}(Q)} \bigr)\le c \Vert \hat{\theta } \Vert _{L^{2}(Q)}. $$
(7)

Further, we use the embedding \(W(0,T)\hookrightarrow C([0,T];L^{2}( \varOmega ))\) to obtain

$$ \Vert \theta \Vert _{W(0,T)}\le c \Vert \hat{\theta } \Vert _{L ^{2}(Q)}\le c T^{1/2} \Vert \hat{\theta } \Vert _{L^{\infty }(0,T;L^{2}(\varOmega ))}\le c T^{1/2} \Vert \hat{\theta } \Vert _{W(0,T)}. $$
(8)
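The middle inequality in (8) is elementary. Spelled out for any \(\hat{\theta }\in L^{\infty }(0,T;L^{2}(\varOmega ))\), it reads

$$ \Vert \hat{\theta } \Vert _{L^{2}(Q)}^{2}= \int _{0}^{T} \bigl\Vert \hat{\theta }(t) \bigr\Vert _{L^{2}(\varOmega )}^{2} \,dt\le T \Vert \hat{\theta } \Vert _{L^{\infty }(0,T;L^{2}(\varOmega ))}^{2}, $$

so the factor \(T^{1/2}\) can be made as small as needed by shortening the time interval. This presumes that the generic constant c stays bounded as the interval shrinks, which the text does not track explicitly.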

Hence, choosing \(T^{+}< T\) small enough, we conclude that F is a contraction on \(W(0,T^{+})\). Since F is also a self-mapping on K, we can apply Banach’s fixed point theorem to conclude that F has a unique fixed point θ, which is a local solution to (2c)–(2f). By a bootstrapping argument, the solution can be extended to the time interval \([0,T]\).

Moreover, in view of Lemma 1 we can apply Lemma 2 and obtain the additional regularity for θ. □
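The fixed-point map F used in the proof can be illustrated numerically. The sketch below is a hedged analogue, not the construction of the paper: it replaces the heat equation by a lumped (space-independent) ordinary differential equation, uses a hypothetical clipped rate law `G`, and picks illustrative values for the parameters `cooling` and `latent_over_cp`. One application of `F` freezes \(\hat{\theta }\), integrates the phase equation, and then solves the resulting linear temperature equation; repeating this is the Picard iteration behind Banach’s theorem.

```python
import numpy as np

def G(theta, f, theta_eq=1000.0, rate=0.5):
    """Hypothetical bounded rate law (an assumption, not the paper's model)."""
    undercooling = np.clip((theta_eq - theta) / theta_eq, 0.0, 1.0)
    return rate * undercooling * np.clip(1.0 - f, 0.0, 1.0)

def F(theta_hat, u, dt, theta0=1100.0, theta_w=300.0,
      cooling=5e-5, latent_over_cp=128.0):
    """One application of the fixed-point map for a lumped analogue.

    Step 1: integrate f_t = G(theta_hat, f), f(0) = 0, with theta_hat frozen.
    Step 2: integrate the linear equation
            theta_t = -cooling * u * (theta - theta_w) + latent_over_cp * f_t.
    Both steps use explicit Euler.
    """
    nt = theta_hat.size - 1
    f = np.zeros(nt + 1)
    theta = np.empty(nt + 1)
    theta[0] = theta0
    for n in range(nt):
        f_t = G(theta_hat[n], f[n])
        f[n + 1] = f[n] + dt * f_t
        theta[n + 1] = theta[n] + dt * (
            -cooling * u[n] * (theta[n] - theta_w) + latent_over_cp * f_t
        )
    return theta

def picard(u, T=10.0, iterations=40):
    """Iterate theta <- F(theta) and record the sup-norm of successive differences."""
    nt = len(u)
    dt = T / nt
    theta = np.full(nt + 1, 1100.0)  # initial guess: constant initial temperature
    diffs = []
    for _ in range(iterations):
        new = F(theta, u, dt)
        diffs.append(float(np.max(np.abs(new - theta))))
        theta = new
    return theta, diffs

# Usage: constant control on 200 time steps.
u = np.full(200, 2000.0)
theta_star, diffs = picard(u)
```

In this discrete setting the successive differences in `diffs` shrink towards zero, which mirrors the contraction property (8); the returned `theta_star` then satisfies `F(theta_star) ≈ theta_star` up to rounding.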

In view of the analysis of the state system, we define

$$ Y=W(0,T)\cap C(\bar{Q}) $$

and introduce the control-to-state mapping

$$ S=(S_{\theta }, S_{f}): L^{\infty }(0,T) \rightarrow Y\times {W^{1,p}\bigl(0,T;L ^{p}(\varOmega ) \bigr)},\quad 1\le p < \infty , $$
(9)

which assigns to every control \(u(t)\in L^{\infty }(0,T)\) the solution of the state system (2a)–(2f). Moreover, the mapping is Lipschitz continuous:

Corollary 1

Suppose that (A1)–(A4) hold true and let \((\theta _{1}, f_{1})\), \(( \theta _{2}, f_{2})\) be the solutions of (2a)–(2f) corresponding to \(u_{1}, u_{2}\in L^{\infty }(0,T)\). Then, there exists a constant \(C>0\), such that

$$ \Vert \theta _{1}-\theta _{2} \Vert _{C(\bar{Q})}+ \Vert f_{1}-f_{2} \Vert _{W^{1,p}(0,T;L ^{p}(\varOmega ))}\le C \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}. $$

Proof

Defining \(\theta =\theta _{1}-\theta _{2}\) and \(f=f_{1}-f_{2}\), one finds that \((\theta , f)\) solves

$$\begin{aligned}& f_{t} =G(\theta _{1},f_{1})-G(\theta _{2},f_{2}),\quad \text{in }Q, \end{aligned}$$
(10a)
$$\begin{aligned}& f(0) =0,\quad \text{in }\varOmega, \end{aligned}$$
(10b)
$$\begin{aligned}& \rho c_{p}\theta _{t}-k\Delta \theta =\rho Lf_{t},\quad \text{in }Q, \end{aligned}$$
(10c)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =u_{1}(t)\beta (x)\theta +(u _{1}-u_{2}) (t)\beta (x) (\theta _{2}-\theta _{w}),\quad \text{on } \varSigma _{1}, \end{aligned}$$
(10d)
$$\begin{aligned}& -k\frac{\partial \theta }{\partial n} =0,\quad \text{on }\varSigma _{2}, \end{aligned}$$
(10e)
$$\begin{aligned}& \theta (0) =0,\quad \text{in }\varOmega . \end{aligned}$$
(10f)

Further, we prove the Lipschitz continuity regarding the \(L^{\infty }(Q)\)-norm. The multiplication of (10c) by \(\theta ^{2k-1}\), for an arbitrary \(k\in \mathbb{N}\) and integration over Ω and over \((0,t)\) yields

$$\begin{aligned} \begin{aligned}[b] &\frac{\rho c_{p}}{2k} \int _{\varOmega } \theta ^{2k}(t) \,dx+\kappa (2k-1) \int _{0}^{t} \int _{\varOmega } \theta ^{2k-2} \vert \nabla \theta \vert ^{2} \,dx\,ds+ \int _{0}^{t} \int _{\varGamma _{1}} u_{1}(t)\beta (\sigma )\theta ^{2k} \,d\sigma \,ds \\ &\quad =- \int _{0}^{t} \int _{\varGamma _{1}} (u_{1}-u_{2})\beta (\sigma ) ( \theta _{2}-\theta _{w})\theta ^{2k-1} \,d\sigma \,ds+ \int _{0}^{t} \int _{ \varOmega } f_{t}\theta ^{2k-1} \,dx\,ds. \end{aligned} \end{aligned}$$
(11)

Applying Lemma 1(b) and Hölder’s inequality gives

$$ \int _{0}^{t} \int _{\varOmega } \bigl\vert f_{t}\theta ^{2k-1} \bigr\vert \,dx\,ds\le C_{1} \int _{0} ^{t} \int _{\varOmega } \theta ^{2k} \,dx\,ds. $$
(12)

In order to estimate the first term on the right hand side of (11), we apply Young’s inequality

$$ \vert ab \vert \le \frac{\varepsilon ^{p} \vert a \vert ^{p}}{p}+ \frac{\varepsilon ^{-q} \vert b \vert ^{q}}{q}, \quad \frac{1}{p}+\frac{1}{q}=1 $$

with \(a=\theta ^{2k-1}\), \(b=(\theta _{2}-\theta _{w})(u_{1}-u_{2})\beta \), \(p=\frac{2k}{2k-1}\), \(q=2k \) and \(\varepsilon > 0\)

$$\begin{aligned} & \int _{0}^{t} \int _{\varGamma _{1}} \bigl\vert (u_{1}-u_{2})\beta (\sigma ) (\theta _{2}-\theta _{w})\theta ^{2k-1} \bigr\vert \,d\sigma \,ds \\ &\quad \le \frac{\varepsilon ^{p}}{p} \int _{0}^{t} \int _{\varGamma _{1}} \theta ^{2k} \,d\sigma \,ds \\ &\quad\quad{} +\frac{\varepsilon ^{-q}}{q} \int _{0}^{t} \int _{\varGamma _{1}} (u_{1}-u_{2})^{2k} \beta ^{2k}(\theta _{2}-\theta _{w})^{2k} \,d\sigma \,ds \\ &\quad \le C_{2}\frac{\varepsilon ^{p}}{p} \int _{0}^{t} \int _{\varOmega } \theta ^{2k} \,dx\,ds+C_{2}k \frac{\varepsilon ^{p}}{p} \int _{0}^{t} \int _{\varOmega } \theta ^{2k-2} \vert \nabla \theta \vert ^{2} \,dx\,ds \\ &\quad\quad{} +C_{3} \frac{\varepsilon ^{-q}}{q} \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}^{2k} \bigl\Vert \beta (\theta _{2}-\theta _{w}) \bigr\Vert _{L^{\infty }(\varSigma _{1})}^{2k}. \end{aligned}$$
(13)

The second inequality in (13) is valid due to the trace theorem. Further, we aim at ensuring that \((\kappa (2k-1)-C_{2}k\frac{ \varepsilon ^{p}}{p})\int _{0}^{t}\int _{\varOmega } \theta ^{2k-2} \vert \nabla \theta \vert ^{2} \,dx\,ds\ge 0\) for all \(k\in \mathbb{N}\). For this purpose, we choose \(\varepsilon =(\frac{p\kappa }{2C_{2}})^{1/p}\). The inequality (13) reduces to

$$ \begin{aligned}[b] & \int _{0}^{t} \int _{\varGamma _{1}} \bigl\vert (u_{1}-u_{2})\beta (\theta _{2}-\theta _{w})\theta ^{2k-1} \bigr\vert \,d\sigma \,ds \\ &\quad \le \frac{\kappa }{2} \int _{0}^{t} \int _{\varOmega } \theta ^{2k} \,dx\,ds \\ &\quad\quad{} +\frac{\kappa k}{2} \int _{0}^{t} \int _{\varOmega } \theta ^{2k-2} \vert \nabla \theta \vert ^{2} \,dx\,ds+ \frac{C_{5}}{2k}C_{4}^{2k} \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}^{2k}. \end{aligned} $$
(14)

Inserting (12) and (14) into (11) we conclude

$$ \begin{aligned} & \int _{\varOmega } \theta ^{2k}(t) \,dx\le C_{5}C_{4}^{2k} \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}^{2k}+C_{6}2k \int _{0}^{t} \int _{\varOmega } \theta ^{2k} \,dx\,ds. \end{aligned} $$
(15)

Gronwall’s Lemma yields

$$ \bigl\Vert \theta (t) \bigr\Vert _{L^{2k}}^{2k}\le C_{5} C_{4}^{2k} \Vert u_{1}-u_{2} \Vert _{L ^{\infty }(0,T)}^{2k}\exp (C_{6}2kt), \quad \forall t\in [0,T]. $$

Taking the \((2k)\)-th root,

$$ \sup_{0\le t\le T} \bigl\Vert \theta (t) \bigr\Vert _{L^{2k}}\le C_{7} \Vert u_{1}-u_{2} \Vert _{L ^{\infty }(0,T)}. $$
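The passage from (15) to the last two displays can be spelled out as follows. The abbreviations \(\varphi _{k}\), \(a_{k}\), \(b_{k}\) are introduced here only for illustration, and it is assumed, as the notation suggests, that \(C_{4}\), \(C_{5}\), \(C_{6}\) do not depend on k. With

$$ \varphi _{k}(t)= \int _{\varOmega } \theta ^{2k}(t) \,dx,\quad\quad a_{k}=C_{5}C_{4}^{2k} \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}^{2k},\quad\quad b_{k}=2kC_{6}, $$

inequality (15) has the form required by the integral version of Gronwall’s lemma,

$$ \varphi _{k}(t)\le a_{k}+b_{k} \int _{0}^{t}\varphi _{k}(s) \,ds \quad \Longrightarrow \quad \varphi _{k}(t)\le a_{k}\exp (b_{k}t). $$

Taking the \((2k)\)-th root then gives

$$ \bigl\Vert \theta (t) \bigr\Vert _{L^{2k}}\le C_{5}^{1/(2k)}C_{4}\exp (C_{6}t) \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}\le \max \{1,C_{5}\}C_{4}\exp (C_{6}T) \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)}, $$

so that \(C_{7}=\max \{1,C_{5}\}C_{4}\exp (C_{6}T)\) is one admissible choice that is independent of k.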

Letting \(k\rightarrow \infty \), we obtain the Lipschitz continuity of the solution operator in the \(L^{\infty }(Q)\)-norm. Since θ is continuous on \(\bar{Q}\), its \(L^{\infty }(Q)\)- and \(C(\bar{Q})\)-norms coincide, which implies the Lipschitz stability of the solution operator in \(C(\bar{Q})\). The estimate for \(\Vert f_{1}-f_{2} \Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))}\) is deduced from Lemma 1(b). □
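Two elementary tools carry the proof above: Young’s inequality with a free parameter ε, and the fact that \(L^{2k}\)-norms approach the \(L^{\infty }\)-norm as k grows. The following sketch checks both numerically on sample data. The function names, the sample function \(\sin (\pi x)\), and the grid are illustrative assumptions made here, not part of the paper.

```python
import numpy as np

def young_gap(a, b, p, eps):
    """Right-hand side minus left-hand side of Young's inequality with epsilon.

    |ab| <= eps^p |a|^p / p + eps^(-q) |b|^q / q,  where 1/p + 1/q = 1.
    A nonnegative return value means the inequality holds for this sample.
    """
    q = p / (p - 1.0)  # conjugate exponent
    rhs = (eps * abs(a)) ** p / p + (abs(b) / eps) ** q / q
    return rhs - abs(a * b)

def l2k_norm(values, weights, k):
    """Discrete L^{2k} norm, scaled by the maximum to avoid overflow for large k."""
    top = np.max(np.abs(values))
    if top == 0.0:
        return 0.0
    scaled = (np.abs(values) / top) ** (2 * k)
    return top * np.sum(weights * scaled) ** (1.0 / (2 * k))

# The exponents used in the proof: p = 2k/(2k-1), q = 2k, here with k = 3.
k = 3
gap = young_gap(0.8, 1.5, p=2 * k / (2 * k - 1), eps=0.7)

# L^{2k} norms of sin(pi x) on [0, 1] with trapezoidal weights (total weight 1).
x = np.linspace(0.0, 1.0, 101)
w = np.full(x.size, x[1] - x[0])
w[[0, -1]] *= 0.5
norms = [l2k_norm(np.sin(np.pi * x), w, kk) for kk in (1, 10, 100, 1000)]
```

On a domain of unit measure the computed norms are nondecreasing in k and bounded by the maximum of the sampled function, which is the mechanism behind the limit \(k\rightarrow \infty \).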

Now, let us discuss the differentiability of the solution operator that we need for the derivation of first-order and second-order optimality conditions.

Theorem 2

Let Assumptions (A1)–(A4) be satisfied. Then, the solution operator S is twice Fréchet-differentiable from \(L^{\infty }(0,T)\) to \(Y\times {W^{1,p}(0,T;L^{p}(\varOmega ))}\), \(1\le p <\infty \). The directional derivative \((\theta _{h}, f_{h})=S'(u)h=(S_{\theta }'(u)h,S _{f}'(u)h)\) at a point \(u\in L^{\infty }(0,T)\) in direction \(h\in L^{ \infty }(0,T)\) is given by the solution of

$$\begin{aligned}& (f_{h})_{t} =G_{\theta }(\theta ,f) \theta _{h}+G_{f}(\theta ,f)f_{h}, \quad \textit{in }Q, \end{aligned}$$
(16a)
$$\begin{aligned}& f_{h}(0) =0,\quad \textit{in }\varOmega, \end{aligned}$$
(16b)
$$\begin{aligned}& \rho c_{p}(\theta _{h})_{t}-k\Delta \theta _{h} =\rho L(f_{h})_{t},\quad \textit{in }Q, \end{aligned}$$
(16c)
$$\begin{aligned}& -k\frac{\partial \theta _{h}}{\partial n} =u(t)\beta (x)\theta _{h}+h(t) \beta (x) (\theta -\theta _{w}),\quad \textit{on }\varSigma _{1}, \end{aligned}$$
(16d)
$$\begin{aligned}& -k\frac{\partial \theta _{h}}{\partial n} =0,\quad \textit{on }\varSigma _{2}, \end{aligned}$$
(16e)
$$\begin{aligned}& \theta _{h}(0) =0,\quad \textit{in }\varOmega , \end{aligned}$$
(16f)

with \((\theta , f)=S(u)\). Furthermore, \((\theta _{h_{1}h_{2}},f_{h _{1}h_{2}})=S''(u)[h_{1},h_{2}]\) is the solution of

$$\begin{aligned} &(f_{h_{1}h_{2}})_{t} =G_{\theta }(\theta ,f)\theta _{h_{1}h_{2}}+G_{f}( \theta ,f)f_{h_{1}h_{2}} \\ & \hphantom{ (f_{h_{1}h_{2}})_{t}}\quad{} +G''(\theta ,f)\bigl[(\theta _{h_{1}},f_{h_{1}}),(\theta _{h_{2}},f _{h_{2}}) \bigr],\quad \textit{in }Q, \end{aligned}$$
(17a)
$$\begin{aligned} & f_{h_{1}h_{2}}(0) =0,\quad \textit{in }\varOmega, \end{aligned}$$
(17b)
$$\begin{aligned} & \rho c_{p}(\theta _{h_{1}h_{2}})_{t}-k\Delta \theta _{h_{1}h_{2}} = \rho L(f_{h_{1}h_{2}})_{t},\quad \textit{in }Q, \end{aligned}$$
(17c)
$$\begin{aligned} & -k\frac{\partial \theta _{h_{1}h_{2}}}{\partial n} =u(t)\beta (x) \theta _{h_{1}h_{2}}+h_{1}(t) \beta (x)\theta _{h_{2}} \\ & \hphantom{-k\frac{\partial \theta _{h_{1}h_{2}}}{\partial n}}\quad{} +h_{2}(t)\beta (x)\theta _{h_{1}}, \quad \textit{on }\varSigma _{1}, \end{aligned}$$
(17d)
$$\begin{aligned} & -k\frac{\partial \theta _{h_{1}h_{2}}}{\partial n} =0, \quad \textit{on } \varSigma _{2}, \end{aligned}$$
(17e)
$$\begin{aligned} & \theta _{h_{1}h_{2}}(0) =0, \quad \textit{in }\varOmega , \end{aligned}$$
(17f)

with \((\theta _{h_{i}},f_{h_{i}})=S'(u)h_{i}\), \(i=1,2\).
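The boundary conditions (16d) and (17d) can be motivated by formally differentiating (2d) along a perturbed control. The following is a heuristic sketch only: the notation \(\theta ^{\varepsilon }=S_{\theta }(u+\varepsilon h)\) is introduced here for illustration, and differentiability with respect to ε is assumed rather than proven. The product rule gives

$$ \frac{d}{d\varepsilon } \Bigl[(u+\varepsilon h)\beta \bigl(\theta ^{\varepsilon }-\theta _{w}\bigr) \Bigr]_{\varepsilon =0}=h\beta (\theta -\theta _{w})+u\beta \theta _{h}, $$

which is the right-hand side of (16d). Repeating the computation with two directions, \(\theta ^{\varepsilon _{1},\varepsilon _{2}}=S_{\theta }(u+\varepsilon _{1}h_{1}+\varepsilon _{2}h_{2})\), yields

$$ \frac{\partial ^{2}}{\partial \varepsilon _{1}\partial \varepsilon _{2}} \Bigl[(u+\varepsilon _{1}h_{1}+\varepsilon _{2}h_{2})\beta \bigl(\theta ^{\varepsilon _{1},\varepsilon _{2}}-\theta _{w}\bigr) \Bigr]_{\varepsilon _{1}=\varepsilon _{2}=0}=h_{1}\beta \theta _{h_{2}}+h_{2}\beta \theta _{h_{1}}+u\beta \theta _{h_{1}h_{2}}, $$

which matches (17d). The coolant temperature \(\theta _{w}\) no longer appears at second order because it does not depend on the control.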

Proof

The existence of a unique solution \((\theta _{h},f_{h})\) of the linearized state system (16a)–(16f) in \(W(0,T)\times W^{1,\infty }(0,T;L ^{10/3}(\varOmega ))\) can be shown similarly to the proof of Theorem 1. Moreover, the terms on the right-hand side of (16c), (16d) have enough regularity, namely

$$\begin{aligned}& h(t)\beta (x) (\theta -\theta _{w})\in L^{\infty }(\varSigma _{1}), \quad\quad G_{f}( \theta ,f)f_{h} \in L^{\infty }\bigl(0,T;L^{10/3}(\varOmega )\bigr), \\& G_{\theta }(\theta ,f)\theta _{h} \in L^{10/3}(Q). \end{aligned}$$

The latter is true due to the fact that \(G_{\theta }(\theta ,f)\in L ^{\infty }(Q)\), \(\theta _{h} \in W(0,T)\) and therefore \(\theta _{h} \in L^{10/3}(Q)\) (see Lemma 6.7 in [13]). Then, the continuity of \(\theta _{h}\) follows from Lemma 2.

For a given control \(u\in L^{\infty }(0,T)\) and a direction \(h\in L^{\infty }(0,T)\), we define \((\theta ,f)=S(u)\) and \((\theta ^{h},f^{h})=S(u+h)\), respectively. Furthermore, let \((\theta _{h},f _{h})\) be the unique solution of (16a)–(16f). Considering the remainder terms

$$ r_{\theta }=\theta ^{h}-\theta -\theta _{h}, \quad \quad r_{f}=f^{h}-f-f_{h}, $$

it remains to verify that

$$ \Vert r_{\theta } \Vert _{C(\bar{Q})}+ \Vert r_{f} \Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))}=o\bigl( \Vert h \Vert _{L^{\infty }(0,T)}\bigr). $$

In view of Assumption (A2), this can be proven similarly to the estimates in Corollary 1 using a first-order Taylor expansion of the function G. Furthermore, one can analogously show Lipschitz continuity of the first derivative of the solution operator, i.e., for all \(u_{1}, u_{2}, h\in L^{\infty }(0,T)\), there exists a constant \(C>0\) such that

$$\begin{aligned} \begin{aligned} & \bigl\Vert \bigl(S'_{\theta }(u_{1})-S'_{\theta }(u_{2}) \bigr)h \bigr\Vert _{C(\bar{Q})}+ \bigl\Vert \bigl(S'_{f}(u _{1})-S'_{f}(u_{2})\bigr)h \bigr\Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))} \\ &\quad \le C \Vert u_{1}-u_{2} \Vert _{L^{\infty }(0,T)} \end{aligned} \end{aligned}$$

holds true. By means of this and again Assumption (A2), one can show that the unique solution of the linear system (17a)–(17f) represents the second derivative of the solution operator. To prove this, one has to derive the remainder term of second order and proceed as before, which we omit here for reasons of space. □

2.3 Existence and optimality conditions of optimal solutions

Since the state system is nonlinear, we cannot expect uniqueness of an optimal control and we have to deal with local optimal controls. We have the following result.

Theorem 3

(Existence of optimal controls)

Let Assumptions (A1)–(A4) be satisfied. Then, there exists at least one solution of the optimal control problem (P).

To prove Theorem 3, we need the following auxiliary result:

Lemma 3

Assume \(\{\theta _{k}\}\) is bounded in \(L^{2} (0,T; H^{1} (\varOmega )) \cap L^{\infty }(Q)\) and

$$\begin{aligned} \theta _{k} \to \theta\quad &\textit{strongly in } L^{2} \bigl(0,T; L^{2}( \varOmega )\bigr) \end{aligned}$$
(18)
$$\begin{aligned} &\textit{and weakly in } L^{2} \bigl(0,T;H^{1} (\varOmega )\bigr) . \end{aligned}$$
(19)

Then, it also holds

$$\begin{aligned} \theta _{k} \to \theta \quad \textit{strongly in } L^{2} \bigl(0,T;L^{2}( \partial \varOmega )\bigr) . \end{aligned}$$

Proof

We define the operator \(A: L^{2}(0,T; H^{1} (\varOmega )) \to L^{2} (0,T)\) by

$$ A\theta = \int _{\partial \varOmega } \theta (\sigma ,t)\,d\sigma . $$

A is linear and also continuous, since the application of the trace theorem yields

$$\begin{aligned} \Vert A\theta \Vert ^{2}_{L^{2}(0,T)} = & \int ^{T}_{0} \biggl( \int _{\partial \varOmega } \theta (\sigma ,t)\,d\sigma \biggr)^{2} \,dt \\ \le & \vert \partial \varOmega \vert \int ^{T}_{0} \int _{\partial \varOmega } \theta ^{2} ( \sigma ,t)\,d\sigma \,dt \le c \Vert \theta \Vert ^{2}_{L^{2} (0,T;H^{1}(\varOmega ))} . \end{aligned}$$

In view of (19), we can infer

$$ A\theta _{k} \rightharpoonup A\theta \quad \text{in } L^{2} (0,T) . $$

Utilizing the boundedness of \(\{\theta _{k}\}\) in \(L^{\infty }(Q) \cap L^{2} (0,T; H^{1} (\varOmega ))\), we observe that

$$\begin{aligned} \bigl\Vert \theta ^{2}_{k} \bigr\Vert ^{2}_{L^{2}(0,T;H^{1}(\varOmega ))} = \int ^{T}_{0} \int _{\varOmega }\theta _{k}^{4} \,dx\,dt + 4 \int ^{T}_{0} \int _{\varOmega } \vert \theta _{k} \nabla \theta _{k} \vert ^{2}\,dx\,dt \le c . \end{aligned}$$
(20)

Now we take smooth functions \(\varphi (x)\) and \(\chi (t)\), then

$$\begin{aligned}& \begin{gathered} \int ^{T}_{0} \biggl( \int _{\varOmega }\theta _{k}^{2} \varphi \,dx \biggr) \chi (t)\,dt + \int ^{T}_{0} \biggl( \int _{\varOmega }\nabla \bigl(\theta ^{2}_{k} \bigr) \nabla \varphi \,dx \biggr) \chi (t)\,dt \\ \quad = \int ^{T}_{0} \biggl( \int _{\varOmega }\theta ^{2}_{k} \varphi \,dx \biggr) \chi (t)\,dt + 2 \int ^{T}_{0} \biggl( \int _{\varOmega }\theta _{k} \nabla \theta _{k} \nabla \varphi \,dx \biggr) \chi (t) \,dt . \end{gathered} \end{aligned}$$

Since φ and χ are smooth, using (18) and (19) we deduce that

$$\begin{aligned} \bigl\langle \theta _{k}^{2},\varphi \chi \bigr\rangle _{L^{2}(0,T;H^{1}(\varOmega ))} \to \bigl\langle \theta ^{2},\varphi \chi \bigr\rangle _{L^{2}(0,T;H^{1}(\varOmega ))} . \end{aligned}$$

Together with (20), we have shown that

$$ \theta _{k}^{2} \rightharpoonup \theta ^{2} \quad \text{weakly in } L^{2}\bigl(0,T;H^{1}(\varOmega )\bigr) . $$

Since the limit does not depend on the extracted subsequence, the whole sequence converges. From this, we infer

$$\begin{aligned}& A\theta _{k}^{2} \rightharpoonup A\theta ^{2} \quad \text{which means} \\& \Vert \theta _{k} \Vert _{L^{2}(0,T;L^{2}(\partial \varOmega ))} \to \Vert \theta \Vert _{L^{2}(0,T;L^{2}(\partial \varOmega ))} \end{aligned}$$

and thus \(\theta _{k} \to \theta \) strongly in \(L^{2}(0,T;L^{2}(\partial \varOmega ))\). □
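The last implication combines weak convergence with convergence of the norms. A short sketch of this step, under the assumption (which follows from (19) and the continuity of the trace operator) that \(\theta _{k} \rightharpoonup \theta \) weakly in \(L^{2}(0,T;L^{2}(\partial \varOmega ))\), with \(\Vert \cdot \Vert \) and \((\cdot ,\cdot )\) denoting the norm and scalar product of that space:

$$\begin{aligned} \Vert \theta _{k}-\theta \Vert ^{2} &= \Vert \theta _{k} \Vert ^{2} - 2 (\theta _{k},\theta ) + \Vert \theta \Vert ^{2} \\ &\to \Vert \theta \Vert ^{2} - 2 \Vert \theta \Vert ^{2} + \Vert \theta \Vert ^{2} = 0 . \end{aligned}$$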

With Lemma 3 at hand, we are now able to prove the existence of an optimal solution of the control problem (P).

Proof of Theorem 3

Due to Theorem 1, there exists a unique solution

$$ (\theta ,f)\in W(0,T)\cap C(\bar{Q})\times W^{1,p}\bigl(0,T;L^{p}( \varOmega )\bigr) $$

of the state system (2a)–(2f) for every control \(u\in U_{\mathrm{ad}}\). Since the set of admissible controls is bounded in \(L^{\infty }(0,T)\), the set of respective solutions \((\theta ,f)\) of the state system is bounded in \(W(0,T)\cap C(\bar{Q})\times W^{1,p}(0,T;L^{p}(\varOmega ))\), see Lemma 1 and Theorem 1. Since the cost functional is bounded from below, there exists a minimizing sequence \(\{\theta _{k},f_{k},u_{k}\}\) such that

$$ j=\lim_{k\to \infty } J(\theta _{k},f_{k},u_{k})= \inf J(\theta ,f,u), $$

where \((\theta _{k},f_{k})=S(u_{k})\) is the solution of the state system with respect to the control \(u_{k}\).

Since \(U_{\mathrm{ad}}\) is bounded, closed and convex, there exists a subsequence \(\{u_{k'}\}\) such that

$$ u_{k'} \rightharpoonup \bar{u} \quad \text{weakly in } L^{2}(0,T) . $$

In view of Theorem 1, extracting possibly a further subsequence still indexed by \(k'\), we have

$$\begin{aligned} \theta _{k'} \rightharpoonup \theta \quad & \text{weakly in } W(0,T) \end{aligned}$$
(21)
$$\begin{aligned} & \text{strongly in } L^{2}(Q) . \end{aligned}$$
(22)

Applying Lemma 1 we obtain

$$ f_{k'} \to f \quad \text{strongly in } W^{1,2} \bigl(0,T;L^{2}(\varOmega )\bigr) , $$

where f is the solution corresponding to θ. We use test functions \(\varphi \in H^{1}(\varOmega )\) and \(\chi \in C^{1} [0,T]\) such that \(\chi (T) = 0\) and consider the weak formulation of (2c)–(2f) for \((\theta _{k'}, f_{k'}, u_{k'})\)

$$\begin{aligned}& \rho c_{p} \int ^{T}_{0} \int _{\varOmega }\theta _{k',t} \varphi \chi \,dx\,dt + k \int ^{T}_{0} \int _{\varOmega }\nabla \theta _{k'} \nabla \varphi \chi \,dx\,dt \\& \quad\quad{} + \int ^{T}_{0} \biggl( \int _{\varGamma _{1}} \theta _{k'}\beta (\sigma ) \varphi \,d \sigma \biggr) u_{k'}(t) \chi \,dt \\& \quad = \int ^{T}_{0} \biggl( \int _{\varGamma _{1}} \theta _{w} \beta (\sigma )\varphi \,d \sigma \biggr) u _{k'} (t) \chi \,dt + \int ^{T}_{0} \int _{\varOmega }f_{k'} \varphi \chi \,dx\,dt . \end{aligned}$$
(23)

Except for the third term in (23), we can pass to the limit by standard arguments. To pass to the limit in the remaining term we define

$$ \alpha _{k} (t) = \biggl( \int _{\varGamma _{1}} \theta _{k}\beta (\sigma ) \varphi \,d \sigma \biggr) \chi (t) $$

and estimate

$$ \int ^{T}_{0} (\alpha _{k'} -\alpha )^{2} \,dt = \int ^{T}_{0} \biggl( \int _{\varGamma _{1}} (\theta _{k'}-\theta ) \beta (\sigma ) \varphi \,d\sigma \biggr)^{2} \chi ^{2}(t)\,dt \le c \int ^{T}_{0} \Vert \theta _{k'}- \theta \Vert ^{2}_{L^{2}(\varGamma _{1})}\,dt . $$

Now we apply Lemma 3 and obtain

$$ \alpha _{k'} \to \alpha \quad \text{strongly in } L^{2}(0,T) , $$

which enables us to pass to the limit in the remaining term in (23). Since the solution to the state equation is unique, we can infer

$$ \theta = \theta (\bar{u}) =:\bar{\theta } \quad \text{and} \quad f =f(\bar{ \theta }) = :\bar{f} . $$

The optimality of \((\bar {\theta },\bar {f},\bar {u})\) follows by standard arguments using the lower semicontinuity of the cost functional w.r.t. u. □

In the following theorem first-order necessary optimality conditions are characterized by respective adjoint equations.

Theorem 4

(Necessary optimality conditions)

Let \(\bar {u}\in U_{\mathrm{ad}}\) be an optimal control of problem (P) and \((\bar {\theta }, \bar {f})=S(\bar {u})\) the associated solution of the state system (2a)–(2f). Then there exists a unique solution \((\bar{p}, \bar{q})\in Y\times W^{1,\infty }(0,T;L^{\infty }(\varOmega ))\) such that

$$\begin{aligned}& -\bar{q}_{t} =G_{f}(\bar {\theta },\bar {f}) (\bar{q}+\rho L \bar{p}),\quad \textit{in }Q, \end{aligned}$$
(24a)
$$\begin{aligned}& \bar{q}(T) =\alpha _{1}\bigl(\bar {f}(T)-f_{d}\bigr),\quad \textit{in }\varOmega, \end{aligned}$$
(24b)
$$\begin{aligned}& -\rho c_{p}\bar{p}_{t}-k\Delta \bar{p} =G_{\theta }(\bar {\theta },\bar {f}) ( \rho L\bar{p}+\bar{q})+\alpha _{2}(\bar {\theta }-\theta _{d}),\quad \textit{in }Q, \end{aligned}$$
(24c)
$$\begin{aligned}& -k\frac{\partial \bar{p}}{\partial n} =\bar{u}(t)\beta (x)\bar{p},\quad \textit{on }\varSigma _{1}, \end{aligned}$$
(24d)
$$\begin{aligned}& -k\frac{\partial \bar{p}}{\partial n} =0,\quad \textit{on }\varSigma _{2}, \end{aligned}$$
(24e)
$$\begin{aligned}& \bar{p}(T) =0,\quad \textit{in }\varOmega . \end{aligned}$$
(24f)

Moreover, the following variational inequality is valid

$$ \iint _{\varSigma _{1}}\biggl(-\bar{p}\beta (\sigma ) (\bar {\theta }-\theta _{w})+\frac{ \alpha _{3}}{ \vert \varGamma _{1} \vert }\bar {u}\biggr) (u-\bar {u})\,d\sigma \,dt\ge 0\quad \forall u\in U_{\mathrm{ad}}. $$
(25)

Proof

First observe that (24a)–(24f) is a linear backward-in-time system consisting of a parabolic equation and an ODE. After the time transformation \(t\mapsto T-t\), one can proceed as in the proof of Theorem 2 to prove the existence of a unique solution \((\bar{p},\bar{q}) \in W(0,T)\cap C(\bar{Q})\times W^{1, \infty }(0,T;L^{\infty }(\varOmega ))\) of the system (24a)–(24f).

By means of the control to state mapping (9), the reduced cost functional of problem (P) is given by

$$\begin{aligned} \min_{u\in U_{\mathrm{ad}}} j(u)=J\bigl(S(u),u\bigr)={} &\frac{\alpha _{1}}{2} \int _{\varOmega } \bigl(S_{f}(u) (T)-f_{d} \bigr)^{2} \,dx \\ &{}+\frac{\alpha _{2}}{2} \iint _{Q}\bigl(S_{\theta }(u)-\theta _{d} \bigr)^{2}\,dx \,dt+\frac{\alpha _{3}}{2} \int _{0}^{T}u^{2}\,dt. \end{aligned}$$

Due to Theorem 2, j is differentiable, and the set of admissible controls \(U_{\mathrm{ad}}\) is bounded, closed and convex. Hence, the first-order necessary optimality condition for a (locally) optimal solution \(\bar {u}\in U_{\mathrm{ad}}\) is given by \(j'(\bar {u})(u-\bar {u}) \ge 0\), \(\forall u\in U_{\mathrm{ad}}\). For a given direction \(h\in L^{\infty }(0,T)\) we have

$$ \begin{aligned}[b] j'(\bar {u})h &=\alpha _{1} \int _{\varOmega } \bigl(S_{f}(\bar {u}) (T)-f_{d}\bigr)S'_{f}(\bar {u})h \,dx \\ &\quad {}+\alpha _{2} \iint _{Q}\bigl(S_{\theta }(\bar {u})-\theta _{d}\bigr)S'_{\theta }(\bar {u})h\,dx \,dt+\alpha _{3} \int _{0}^{T}\bar {u}h\,dt. \end{aligned} $$
(26)

We will rewrite the directional derivative with the help of \((\bar{p},\bar{q})\), which solves the adjoint system (24a)–(24f). The existence of a unique solution of (24a)–(24f) can be proven similarly to Theorem 1. For brevity we introduce \(f_{h}=S'_{f}(\bar {u})h\) and \(\theta _{h}=S'_{\theta }(\bar {u})h\) as the solution of the linearized system (16a)–(16f). We start by multiplying (16a) by \(\bar{q}\) and integrating over Q:

$$\begin{aligned} 0 &= \iint _{Q}\bigl((f_{h})_{t}-G_{\theta }( \bar {\theta },\bar {f})\theta _{h}-G_{f}(\bar {\theta },\bar {f}) f_{h}\bigr)\bar{q} \,dx \,dt \\ &= \iint _{Q}-\bar{q}_{t}f_{h}-\bar{q} \bigl(G_{\theta }(\bar {\theta },\bar {f}) \theta _{h}+G_{f}( \bar {\theta },\bar {f})f_{h}\bigr)\,dx \,dt+ \int _{\varOmega }f _{h}(T)\bar{q}(T)\,dx. \end{aligned}$$

Due to the end-time condition (24b) for \(\bar{q}\), one obtains for the first term in (26)

$$\begin{aligned} \alpha _{1} \int _{\varOmega } \bigl(\bar {f}(T)-f_{d} \bigr)f_{h}(T) \,dx &= \iint _{Q}\bar{q}_{t}f_{h}+\bar{q} \bigl(G_{\theta }(\bar {\theta },\bar {f})\theta _{h}+G _{f}(\bar {\theta },\bar {f})f_{h}\bigr)\,dx \,dt \\ &= \iint _{Q}-\rho LG_{f}(\bar {\theta },\bar {f}) \bar{p}f_{h}+ \bar{q} G_{\theta }(\bar {\theta },\bar {f})\theta _{h}\,dx \,dt. \end{aligned}$$

Next, we test (24c) with \(\theta _{h}\) and integrate over Q such that

$$\begin{aligned} \alpha _{2} \iint _{Q}(\bar {\theta }-\theta _{d}) \theta _{h}\,dx \,dt&=- \iint _{Q}\rho c_{p} \bar{p}_{t}\theta _{h}\,dx \,dt-k \iint _{Q} \Delta \bar{p}\theta _{h}\,dx \,dt \\ &\quad{} - \iint _{Q}G_{\theta }(\bar {\theta },\bar {f}) (\rho L\bar{p}+ \bar{q})\theta _{h}\,dx \,dt \\ &= \iint _{Q}\rho c_{p}\bar{p}(\theta _{h})_{t} \,dx \,dt-k \iint _{Q}\Delta \theta _{h} \bar{p}\,dx \,dt \\ &\quad{} - \iint _{Q}G_{\theta }(\bar {\theta },\bar {f}) (\rho L\bar{p}+ \bar{q})\theta _{h}\,dx \,dt - \iint _{\varSigma _{1}}h\beta (\bar {\theta }- \theta _{w})\bar{p} \,d\sigma \,dt \\ &=- \iint _{\varSigma _{1}}h\beta (\bar {\theta }-\theta _{w})\bar{p} \,d \sigma \,dt- \iint _{Q}G_{\theta }(\bar {\theta },\bar {f}) (\rho L \bar{p}+\bar{q})\theta _{h}\,dx \,dt \\ &\quad {}+ \iint _{Q}\rho L\bigl(G_{\theta }(\bar {\theta },\bar {f}) \theta _{h}+G_{f}(\bar {\theta },\bar {f})f_{h}\bigr) \bar{p}\,dx\,dt. \end{aligned}$$

In summary, (26) can be rewritten as

$$ j'(\bar{u})h=- \iint _{\varSigma _{1}}h\beta (\bar {\theta }-\theta _{w}) \bar{p} \,d\sigma \,dt+\alpha _{3} \int _{0}^{T}\bar {u}h\,dt. $$

Thus, the first-order optimality conditions for a (local) optimal solution ū are represented by the variational inequality (25). □
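The adjoint representation of \(j'(\bar{u})h\) can be illustrated on a small discrete analogue. The following is a minimal sketch, not the authors' implementation: it assumes a 1D heat equation without phase transformation, a lumped-mass finite difference discretization, implicit Euler time stepping and made-up data (`nx`, `nt`, `theta_w`, `theta_d` and all helper names are hypothetical). It computes the gradient with one backward adjoint sweep, where only the boundary value of the adjoint enters, and compares it with central finite differences.

```python
import numpy as np

def solve_state(u, theta0, M, K, B, b, theta_w, dt):
    """Implicit Euler for M theta_t + K theta + u(t) B theta = u(t) theta_w b."""
    thetas = [theta0]
    for un in u:
        A = M / dt + K + un * B
        thetas.append(np.linalg.solve(A, M @ thetas[-1] / dt + un * theta_w * b))
    return np.array(thetas)  # shape (nt + 1, nx)

def cost(u, thetas, theta_d, M, alpha, dt):
    """Discrete tracking term plus control cost."""
    diff = thetas[1:] - theta_d
    return 0.5 * dt * np.sum((diff @ M) * diff) + 0.5 * alpha * dt * np.sum(u**2)

def adjoint_gradient(u, thetas, theta_d, M, K, B, theta_w, alpha, dt):
    """Backward adjoint sweep; only the boundary value of p enters the gradient."""
    grad = np.zeros(len(u))
    p_next = np.zeros(M.shape[0])
    for n in range(len(u) - 1, -1, -1):
        A = M / dt + K + u[n] * B
        rhs = dt * (M @ (thetas[n + 1] - theta_d)) + M @ p_next / dt
        p = np.linalg.solve(A.T, rhs)
        grad[n] = alpha * dt * u[n] - p[-1] * (thetas[n + 1][-1] - theta_w)
        p_next = p
    return grad

# Hypothetical 1D data (not taken from the paper)
nx, nt, dt, k, alpha = 21, 20, 0.05, 1.0, 1e-2
h = 1.0 / (nx - 1)
m = np.full(nx, h)
m[[0, -1]] = h / 2
M = np.diag(m)                                   # lumped mass matrix
K = (k / h) * (2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1))
K[0, 0] = K[-1, -1] = k / h                      # natural boundary rows
B = np.zeros((nx, nx))
B[-1, -1] = 1.0                                  # Robin (cooling) term at x = 1
b = np.zeros(nx)
b[-1] = 1.0
theta0, theta_d, theta_w = np.ones(nx), 0.5, 0.2
u = 1.0 + 0.5 * np.sin(np.linspace(0.0, np.pi, nt))

thetas = solve_state(u, theta0, M, K, B, b, theta_w, dt)
grad = adjoint_gradient(u, thetas, theta_d, M, K, B, theta_w, alpha, dt)

# Central finite differences as an independent check of the adjoint gradient
fd = np.zeros(nt)
for n in range(nt):
    e = np.zeros(nt)
    e[n] = 1e-6
    jp = cost(u + e, solve_state(u + e, theta0, M, K, B, b, theta_w, dt), theta_d, M, alpha, dt)
    jm = cost(u - e, solve_state(u - e, theta0, M, K, B, b, theta_w, dt), theta_d, M, alpha, dt)
    fd[n] = (jp - jm) / 2e-6
print(np.max(np.abs(grad - fd)))
```

The entry `grad[n]` mirrors the structure of the formula above: a boundary term of the form adjoint times \((\theta -\theta _{w})\) with a negative sign, plus the regularization term.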

Next, we formulate second-order sufficient optimality conditions for the optimal control problem (P). To this end, we provide the second derivative of the reduced cost functional \(j(u)=J(S(u),u)\). A straightforward computation using the adjoint variables introduced in Theorem 4 yields

$$ \begin{aligned}[b] j''(u)[h_{1},h_{2}]= {}&\alpha _{1} \int _{\varOmega }f_{h_{1}}(T)f_{h _{2}}(T)\,dx+\alpha _{2} \iint _{Q}\theta _{h_{1}}\theta _{h_{2}}\,dx \,dt \\ &{}+\alpha _{3} \int _{0}^{T}h_{1}h_{2}\,dt- \iint _{\varSigma _{1}}( \theta _{h_{1}}h_{2}+\theta _{h_{2}}h_{1})p \,d\sigma \,dt \\ &{}+ \iint _{Q} G''\bigl(\theta (u),f(u) \bigr)\bigl[(\theta _{h_{1}},f_{h_{1}}),( \theta _{h_{2}},f_{h_{2}})\bigr](\rho L p+q)\,dx\,dt, \end{aligned} $$
(27)

with \((\theta _{h_{i}},f_{h_{i}})=S'(u)h_{i}\), \(i=1,2\) and \((p,q)\) is the solution of the adjoint system (24a)–(24f).

In what follows, we denote by ū an admissible control of problem (P) with associated solution \((\bar {\theta },\bar {f})=S(\bar {u})\) of the state system (2a)–(2f). We suppose that the first-order optimality conditions given in Theorem 4 are satisfied with respective adjoint states \((\bar{p},\bar{q})\). Let us define the strongly active set associated with ū. For fixed \(\tau >0\) we set

$$ A_{\tau }(\bar {u})= \biggl\{ t\in (0,T): \biggl\vert \int _{\varGamma _{1}} -\bar{p}(\sigma ,t) \bigl(\bar {\theta }(\sigma ,t)- \theta _{w}(\sigma ,t)\bigr) \,d\sigma +\alpha _{3} \bar {u}(t) \biggr\vert >\tau \biggr\} . $$

Next, we shall assume a coercivity condition on the second derivative of the cost functional for directions associated to the previous strongly active set, henceforth called second-order sufficient optimality conditions:

$$ \left. \begin{aligned} &\text{There exist }\tau >0\text{ and }\delta >0\text{ such that} \\ & \quad j''(\bar {u})h^{2}\ge \delta \Vert h \Vert _{L^{2}(0,T)}^{2} \\ &\text{holds for all }h=\bar {u}-u, u\in U_{\mathrm{ad}}\text{ with }h=0 \text{ on } A_{\tau }(\bar {u}) \end{aligned} \right\}. $$
(SSC)

Theorem 5

Let ū be an admissible control of problem (P) with associated state \((\bar {\theta },\bar {f})=S(\bar {u})\) satisfying the first-order necessary optimality conditions given in Theorem 4 with associated adjoint states \((\bar{p},\bar{q})\). Further, it is assumed that (SSC) holds at ū. Then there exist a \(\tilde{\delta }>0\) and \(\rho >0\) such that

$$ J(\theta ,f,u)\ge J(\bar {\theta },\bar {f},\bar {u})+\tilde{\delta } \Vert u- \bar{u} \Vert _{L^{2}(0,T)}^{2} $$
(28)

holds for all \(u\in U_{\mathrm{ad}}\) with \(\Vert u-\bar{u} \Vert _{L^{\infty }(0,T)} \le \rho \) with associated states \((\theta ,f)=S(u)\).

Proof

The proof closely resembles that of Theorem 5.17 in [24]; we therefore omit some details and refer to [24]. We only indicate the arguments that need further explanation.

The crucial point in the proof is the fact that the quadratic form \(j''(u)[h_{1},h_{2}]\) has to depend continuously on \(h_{i}\), \(i=1,2\) in the \(L^{2}\)-norm, i.e., we have to ensure the following continuity estimate

$$ \bigl\vert j''(u)[h_{1},h_{2}] \bigr\vert \le c \Vert h_{1} \Vert _{L^{2}(0,T)} \Vert h_{2} \Vert _{L^{2}(0,T)}. $$
(29)

The first two terms in \(j''(u)[h_{1},h_{2}]\) (see (27)) can be estimated with respect to the \(L^{2}\)-norm of \(h_{i}\), \(i=1,2\) by applying standard a priori estimates and Lemma 1(b), e.g.

$$ \begin{aligned} & \Vert \theta _{h_{i}} \Vert _{L^{\infty }(Q)}\le c \Vert \bar{ \theta } \Vert _{C(\bar{Q})} \Vert h_{i} \Vert _{L^{2}(0,T)}, \\ & \Vert f_{h_{i}} \Vert _{L^{\infty }(Q)}\le c \Vert \bar{ \theta } \Vert _{C(\bar{Q})} \Vert h_{i} \Vert _{L^{2}(0,T)}. \end{aligned} $$

The other terms are more delicate. Here we take advantage of the regularity of the adjoint state. Using the trace theorem, we estimate

$$ \begin{aligned} \biggl\vert \iint _{\varSigma _{1}}\theta _{h_{i}}h_{j}p \,d\sigma \,dt \biggr\vert &\le c \Vert p \Vert _{C(\bar{Q})} \Vert \theta _{h_{i}} \Vert _{L^{2}(0,T;H^{1}(\varOmega ))} \Vert h _{j} \Vert _{L^{2}(0,T)} \\ &\le c \Vert p \Vert _{C(\bar{Q})} \Vert \theta _{h_{i}} \Vert _{W(0,T)} \Vert h_{j} \Vert _{L^{2}(0,T)}\le c \Vert p \Vert _{C(\bar{Q})} \Vert h_{i} \Vert _{L^{2}(0,T)} \Vert h _{j} \Vert _{L^{2}(0,T)} \end{aligned} $$

for \(i,j=1,2\), \(i\neq j\). For the last term in (27) we need to estimate the second derivative of \(G(\theta ,f)\)

$$ \begin{aligned}& \bigl\vert G''(\theta ,f)\bigl[ (\theta _{h_{1}},f_{h_{1}}),(\theta _{h_{2}},f_{h_{2}})\bigr] \bigr\vert \\ &\quad = \bigl\vert G_{\theta \theta }[\theta _{h_{1}},\theta _{h_{2}}]+G_{\theta f}[ \theta _{h_{1}},f_{h_{2}}]+G_{f\theta }[f_{h_{1}}, \theta _{h_{2}}]+G _{ff}[f_{h_{1}},f_{h_{2}}] \bigr\vert \\ &\quad \le c\bigl( \Vert \theta _{h_{1}} \Vert _{C(\bar{Q})} \Vert \theta _{h_{2}} \Vert _{C(\bar{Q})}+ \Vert \theta _{h_{1}} \Vert _{C(\bar{Q})} \Vert f _{h_{2}} \Vert _{C(\bar{Q})} \\ &\quad\quad{} + \Vert f_{h_{1}} \Vert _{C(\bar{Q})} \Vert \theta _{h_{2}} \Vert _{C(\bar{Q})}+ \Vert f_{h_{1}} \Vert _{C(\bar{Q})} \Vert f_{h_{2}} \Vert _{C(\bar{Q})}\bigr). \end{aligned} $$

The last step of the estimate is valid due to the uniform boundedness of the partial derivatives of \(G(\theta ,f)\) up to order two on bounded sets (this follows from Assumption (A2)).

The next important issue is to estimate the second-order remainder term of the reduced cost functional j. We denote \(h=u-\bar{u}\). It follows from Taylor’s theorem with integral remainder (see, e.g., Theorem 8.14.3, p. 186 in [5]) that

$$ j(u)=j(\bar{u})+j'(\bar{u})h+\frac{1}{2}j''( \bar{u})h^{2}+r_{2}^{j}( \bar{u},h) $$

with the remainder

$$ r_{2}^{j}(\bar{u},h)= \int _{0}^{1}(1-s) \bigl(j''( \bar{u}+sh)-j''(\bar{u})\bigr)h ^{2}\,ds. $$

Let \((\bar{\theta },\bar{f})=S(\bar{u})\), \((\theta ,f)=S(\bar{u}+sh)\) and \((\bar{\theta }_{h},\bar{f}_{h})=S'(\bar{u})h\), \((\theta _{h},f_{h})=S'( \bar{u}+sh)h\). Further, we consider

$$ \begin{aligned}[b] & \bigl(j''( \bar{u} +sh)-j''(\bar{u})\bigr)h^{2} \\ &\quad =\alpha _{1} \int _{\varOmega }f _{h}^{2}(T)- \bar{f}_{h}^{2}(T)\,dx+\alpha _{2} \iint _{Q}\theta _{h}^{2}-\bar{\theta }_{h}^{2}\,dx\,dt \\ &\quad \quad{} -2 \iint _{\varSigma _{1}}( \theta _{h}p-\bar{\theta }_{h}\bar{p})h \,d\sigma \,dt \\ &\quad \quad{} + \iint _{Q} G''(\theta ,f) (\theta _{h},f_{h})^{2}(\rho L p+q)-G''( \bar{\theta },\bar{f}) (\bar{\theta }_{h},\bar{f}_{h})^{2}( \rho L \bar{p}+\bar{q})\,dx\,dt. \end{aligned} $$
(30)

In order to estimate the terms in (30), we need the following estimates

$$ \begin{aligned} & \Vert f_{h}- \bar{f}_{h} \Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))}+ \Vert \theta _{h}- \bar{ \theta }_{h} \Vert _{C(\bar{Q})}\le cs \Vert h \Vert _{L^{\infty }(0,T)} \Vert h \Vert _{L^{2}(0,T)}, \\ & \Vert q-\bar{q} \Vert _{W^{1,p}(0,T;L^{p}(\varOmega ))}+ \Vert p- \bar{p} \Vert _{C(\bar{Q})}\le cs \Vert h \Vert _{L^{\infty }(0,T)}, \end{aligned} $$
(31)

which can be obtained by the standard a priori estimates and Lipschitz continuity of the solution operator. Using (31) and Lipschitz continuity of \(G''(\theta ,f)\), we can estimate the remainder term \(r_{2}^{j}\) as follows

$$ \bigl\vert r_{2}^{j}(\bar{u},h) \bigr\vert \le c \int _{0}^{1}(1-s)s \Vert h \Vert _{L^{\infty }(0,T)} \Vert h \Vert _{L^{2}(0,T)}^{2}\,ds\le c \Vert h \Vert _{L^{\infty }(0,T)} \Vert h \Vert _{L^{2}(0,T)}^{2}. $$

From this point, we can argue along exactly the same lines as on pp. 292–294 in the proof of Theorem 5.17 in [24] to conclude the validity of the assertion. □

Sufficient optimality conditions of this kind are an indispensable basis for the numerical analysis of optimal control problems, e.g., for the convergence analysis of the sequential quadratic programming method used to solve optimal control problems numerically.

3 Numerical implementation

In this section we introduce numerical algorithms for the solution of optimal control problem (P) analyzed in the previous section. This problem belongs to the class of the nonlinear boundary control problems with control constraints. The SQP (Sequential Quadratic Programming) method has turned out to be one of the most successful methods in nonlinear optimization (see, e.g., [1, 19]). The principal idea is to linearize the nonlinear equality constraints and to replace the cost functional by a quadratic approximation of the Lagrangian. It is well known that the SQP algorithm exhibits local quadratic convergence in finite-dimensional spaces. The convergence analysis for nonlinear parabolic boundary control problems was presented in the works of Tröltzsch [8, 23].

In this work we focus on the reduced SQP method (rSQP), where the reduction onto the control space takes place when solving the \((\mathit{QP}^{k})\)-subproblems. We also introduce the primal-dual active set (PDAS) strategy, used for the treatment of the quadratic \((\mathit{QP}^{k})\) problems in each iteration of rSQP method. The conjugate gradient (CG) method has been applied to solve the linear system of equations arising in the (PDAS) algorithm.

3.1 Reduced SQP method

The main idea of reduced SQP methods in contrast to usual SQP methods is to use only an approximation of the projected Hessian of the Lagrangian onto the kernel of the linearized constraint, instead of an approximation of the full Hessian of the Lagrangian.

We introduce the Lagrange functional

$$ \mathcal{L}(\theta ,f,u,p,q): Y\times W^{1,\infty }\bigl(0,T;L^{\infty }(\varOmega )\bigr) \times L^{\infty }(0,T)\times Y\times W^{1,\infty } \bigl(0,T;L^{\infty }(\varOmega )\bigr) \rightarrow \mathbb{R} $$

with \(Y:=W(0,T)\cap C(\bar{Q})\) and

$$ \begin{aligned} \mathcal{L}(\theta ,f,u,p,q)={} &J(\theta ,f,u)-\biggl( \int _{0}^{T} \rho c_{p}\langle \theta _{t},p\rangle _{H^{1}(\varOmega )^{*},H^{1}(\varOmega )}\,dt+a(u)[\theta ,p] \\ &{} -\bigl(u(t)\beta \theta _{w},p\bigr)_{\varSigma _{1}} -\bigl(\rho L G(\theta ,f),p\bigr)_{Q}+\bigl(f _{t}-G(\theta ,f),q \bigr)_{Q}\biggr), \end{aligned} $$

with a bilinear form

$$ a(u)[\theta ,v]:= \iint _{Q}k\nabla \theta \cdot \nabla v\,dx\,dt+ \iint _{\varSigma _{1}}u\beta \theta v \,d\sigma \,dt, $$

and \((\cdot ,\cdot )_{Q}\), \((\cdot ,\cdot )_{\varSigma _{1}}\) denote the scalar products in \(L^{2}(Q)\) and \(L^{2}(\varSigma _{1})\), respectively.

At each iteration of the SQP method a quadratic approximation of the Lagrangian is minimized under linearized constraints, where it is assumed that the current iterate \(x^{k}=(\theta ^{k},f^{k},u^{k})\) is sufficiently close to a local optimal solution \((\bar {\theta },\bar {f},\bar {u})\):

$$\begin{aligned}& \min \frac{1}{2}\mathcal{L}'' \bigl(x^{k},p^{k},q^{k}\bigr)[\delta x,\delta x]+J'\bigl(x ^{k}\bigr)\delta x \\& \quad \text{s.t.} \\& \begin{aligned} \delta f_{t} &=G_{f}\bigl(\theta ^{k},f^{k}\bigr)\delta f+G_{\theta }\bigl(\theta ^{k},f ^{k}\bigr)\delta \theta \\ &\quad {}-f_{t}^{k}+G\bigl(\theta ^{k},f^{k} \bigr),\quad \text{in }Q, \end{aligned} \\& \delta f(0)=-f^{k}(0),\quad \text{in }\varOmega, \\& \rho c_{p}\delta \theta _{t}-k\Delta \delta \theta =\rho L \delta f_{t}-\bigl(\rho c_{p} \theta ^{k}_{t}-k \Delta \theta ^{k}-\rho Lf^{k}_{t}\bigr),\quad \text{in }Q, \hspace{90pt}(\mathrm{QP}^{k}) \\& \begin{aligned} -k\frac{ \partial \delta \theta }{\partial n}-u^{k}(t)\beta (x)\delta \theta &= \delta u(t) \bigl(\theta ^{k}-\theta _{w}\bigr) \\ &\quad {}+k\frac{\partial \theta ^{k}}{\partial n}+u^{k}(t)\beta (x) \bigl(\theta ^{k}-\theta _{w}\bigr),\quad \text{on }\varSigma _{1}, \end{aligned} \\& -k\frac{\partial \delta \theta }{\partial n}=k\frac{ \partial \theta ^{k}}{\partial n},\quad \text{on }\varSigma _{2}, \\& \delta \theta (0)=\theta _{0}-\theta ^{k}(0),\quad \text{in }\varOmega, \\& u_{a}\le \delta u+u^{k}\le u_{b} \quad \text{in }(0,T). \end{aligned}$$

Note that

$$ J'\bigl(x^{k}\bigr)\delta x=\alpha _{1} \bigl(f^{k}(T)-f_{d},\delta f\bigr)_{\varOmega }+\alpha _{2}\bigl(\theta ^{k}-\theta _{d},\delta \theta \bigr)_{Q}+\alpha _{3}\bigl(u^{k}, \delta u \bigr)_{(0,T)} $$

and

$$ \begin{aligned}[b] \mathcal{L}'' \bigl(x^{k},p^{k},q^{k}\bigr)[\delta x,\delta x]={} &\alpha _{1}( \delta f,\delta f)_{\varOmega }+\alpha _{2}( \delta \theta ,\delta \theta )_{Q}+\alpha _{3}(\delta u, \delta u)_{(0,T)} \\ &{}-2\bigl(\delta u\beta \delta \theta ,p^{k}\bigr)_{\varSigma _{1}}+ \bigl(G''\bigl(\theta ^{k},f ^{k} \bigr)[\delta \theta ,\delta f]^{2},\rho Lp^{k}+q^{k} \bigr)_{Q}. \end{aligned} $$
(32)

In order to describe the resulting optimality system as compactly as possible, we introduce an abstract description of the state equation and its linearization. The state system can be written as a mapping

$$ e(\theta ,f,u)= \begin{pmatrix} e_{1}(\theta ,f,u) \\ e_{2}(\theta ,f,u) \end{pmatrix} :Y \times L^{\infty }(0,T)\rightarrow L^{2}\bigl(0,T;H^{1}( \varOmega )^{*}\bigr) \times L^{r}(Q) $$

and

$$ e(\theta ,f,u)=0. $$

Moreover, the mapping is defined by using test functions \(p\in L^{2}(0,T;H ^{1}(\varOmega ))\), \(q\in L^{s}(Q)\):

$$\begin{aligned}& \begin{aligned}[b] e_{1}(\theta ,f,u) (p):={}& \int _{0}^{T}\rho c_{p}\langle \theta _{t},p \rangle _{H^{1}(\varOmega )^{*},H^{1}(\varOmega )}\,dt+a(u)[\theta ,p]-\bigl(\rho LG( \theta ,f),p\bigr)_{Q} \\ &{}-\bigl(u(t)\beta (x)\theta _{w},p\bigr)_{\varSigma _{1}}, \end{aligned} \end{aligned}$$
(33)
$$\begin{aligned}& e_{2}(\theta ,f,u) (q):= \iint _{Q} f_{t}q-G(\theta ,f)q \,dx\,dt. \end{aligned}$$
(34)

By means of this, the linearized state system in problem (\(\mathrm{QP}^{k}\)) is given by

$$ e_{x}\bigl(x^{k}\bigr) (\delta \theta ,\delta f,\delta u)= \begin{pmatrix} e_{1,\theta }(x^{k})\delta \theta +e_{1,f}(x^{k})\delta f+e_{1,u}(x ^{k})\delta u \\ e_{2,\theta }(x^{k})\delta \theta +e_{2,f}(x^{k})\delta f \end{pmatrix} =-e \bigl(x^{k}\bigr). $$

Note that \(e_{2,u}(\cdot )\) is zero. The partial derivatives are defined as follows:

$$\begin{aligned}& \begin{aligned} \bigl(e_{1,\theta }\bigl(\theta ^{k},f^{k},u^{k}\bigr)\delta \theta \bigr) (v)&= \int _{0} ^{T}\rho c_{p}\langle \delta \theta _{t},v \rangle _{H^{1}(\varOmega )^{*},H^{1}(\varOmega )}\,dt+a\bigl(u^{k} \bigr)[\delta \theta ,v] \\ &\quad {}-\bigl(\rho L G_{\theta }\bigl(\theta ^{k},f^{k}\bigr)\delta \theta ,v\bigr)_{Q}, \end{aligned} \\& \bigl(e _{1,f}\bigl(\theta ^{k},f^{k},u^{k} \bigr)\delta f\bigr) (v)= -\bigl(\rho L G_{f}\bigl(\theta ^{k},f ^{k}\bigr)\delta f,v\bigr)_{Q}, \\& \bigl(e_{1,u}\bigl(\theta ^{k},f^{k},u^{k} \bigr)\delta u\bigr) (v)= \bigl( \delta u(t)\beta (x) \bigl(\theta ^{k}- \theta _{w}\bigr),v\bigr)_{\varSigma _{1}}, \\& \bigl(e_{2, \theta }\bigl(\theta ^{k},f^{k},u^{k} \bigr)\delta \theta \bigr) (q)= \bigl(-G_{\theta }\bigl( \theta ^{k},f^{k}\bigr)\delta \theta ,q\bigr)_{Q}, \\& \bigl(e_{2,f}\bigl(\theta ^{k},f^{k},u ^{k}\bigr)\delta f\bigr) (q)= \bigl(\delta f_{t}-G_{f} \bigl(\theta ^{k},f^{k}\bigr)\delta f,q\bigr)_{Q}. \end{aligned}$$
(35)

Hence, problem (\(\mathrm{QP}^{k}\)) can be written as

$$ \begin{aligned} &\min \frac{1}{2}\mathcal{L}''\bigl(x^{k},p^{k},q^{k} \bigr)[\delta x, \delta x]+J'\bigl(x^{k}\bigr)\delta x \\ &\quad \text{s.t.}\quad e_{x}\bigl(x^{k}\bigr) (\delta \theta ,\delta f,\delta u)=-e\bigl(x^{k}\bigr),\hspace{177pt}(\mathrm{QP}^{k}) \\ &\delta u\in U_{\mathrm{ad}}-\bigl\{ u^{k}\bigr\} . \end{aligned} $$

Introducing adjoint variables with respect to the linearized state system and neglecting the inequality constraints for a moment, the optimality system is given in the following compact form

$$ \begin{pmatrix} \mathcal{L}_{\theta \theta }''&\mathcal{L}_{\theta f}''&\mathcal{L} _{\theta u}''& e_{1,\theta }^{*}& e_{2,\theta }^{*} \\ \mathcal{L}_{f\theta }''&\mathcal{L}_{ff}''&\mathcal{L}_{f u}''& e _{1,f}^{*}& e_{2,f}^{*} \\ \mathcal{L}_{u\theta }''&\mathcal{L}_{u f}''&\mathcal{L}_{uu}''& e _{1,u}^{*}& 0 \\ e_{1,\theta }&e_{1,f}&e_{1,u} & 0&0 \\ e_{2,\theta }&e_{2,f}&0 &0 &0 \end{pmatrix} \begin{pmatrix} \delta \theta \\ \delta f \\ \delta u \\ p \\ q \end{pmatrix} = \begin{pmatrix} -J_{\theta } \\ -J_{f} \\ -J_{u} \\ -e \end{pmatrix} . $$
(36)

For simplicity, function arguments are now omitted. Unless otherwise stated, all functions are evaluated at the k-th iterate. Denoting by \(\mathcal{L}_{(\theta ,f)}''\) the second derivative of the Lagrangian \(\mathcal{L}\) with respect to the state pair \((\theta ,f)\), we can rewrite the KKT matrix as a \(3\times 3\) block matrix. Since the linearized state system is uniquely solvable for every right-hand side (this can be shown along the lines of Theorem 1), we can derive the following decomposition of the full KKT matrix in (36) by block Gaussian elimination

$$ \begin{pmatrix} \mathcal{L}_{(\theta ,f)}''&\mathcal{L}_{(\theta ,f)u}''&e_{(\theta ,f)} ^{*} \\ \mathcal{L}_{u(\theta ,f)}''&\mathcal{L}_{uu}''&e_{u}^{*} \\ e_{(\theta ,f)}&e_{u}&0 \end{pmatrix} = \begin{pmatrix} \mathcal{L}_{(\theta ,f)}''e_{(\theta ,f)}^{-1}&0&I \\ \mathcal{L}_{u(\theta ,f)}''e_{(\theta ,f)}^{-1}&I&e_{u}^{*}e_{( \theta ,f)}^{-*} \\ I&0&0 \end{pmatrix} \begin{pmatrix} e_{(\theta ,f)}&e_{u}&0 \\ 0&H&0 \\ 0&W&e_{(\theta ,f)}^{*} \end{pmatrix} . $$

The so-called reduced Hessian H is defined by

$$ H=\mathcal{L}_{uu}''+e_{u}^{*}e_{(\theta ,f)}^{-*} \bigl(\mathcal{L}_{( \theta ,f)}''e_{(\theta ,f)}^{-1}e_{u}- \mathcal{L}_{(\theta ,f)u}''\bigr)- \mathcal{L}_{u(\theta ,f)}''e_{(\theta ,f)}^{-1}e_{u}. $$
(37)

Moreover, we have

$$ W=-\mathcal{L}_{(\theta ,f)}''e_{(\theta ,f)}^{-1}e_{u}+ \mathcal{L} _{(\theta ,f)u}''. $$

By means of this decomposition, (36) can be treated by:

  1. (i)

    Solve the reduced Hessian system:

    $$ H\delta u=\underbrace{-J_{u}+e_{u}^{*}e_{(\theta ,f)}^{-*} \bigl(J_{(\theta ,f)}- \mathcal{L}_{(\theta ,f)}''e_{(\theta ,f)}^{-1}e \bigr)+ \mathcal{L}_{u(\theta ,f)}''e _{(\theta ,f)}^{-1}e}_{:=r}; $$
    (38)
  2. (ii)

    Solve the linearized state system, i.e.

    $$ e_{(\theta ,f)} \begin{pmatrix} \delta \theta \\ \delta f \end{pmatrix} =-e_{u}\delta u-e; $$
  3. (iii)

    Solve the adjoint state system, i.e.

    $$ e^{*}_{(\theta ,f)} \begin{pmatrix} p \\ q \end{pmatrix} =-J_{(\theta ,f)}-\mathcal{L}_{(\theta ,f)}'' \begin{pmatrix} \delta \theta \\ \delta f \end{pmatrix} -\mathcal{L}_{(\theta ,f)u}'' \delta u. $$
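Steps (i)–(iii) can be checked on a finite-dimensional stand-in. The sketch below is illustrative only: the blocks `L_yy`, `L_yu`, `L_uy`, `L_uu`, `e_y`, `e_u` are random matrices playing the roles of \(\mathcal{L}''_{(\theta ,f)}\), \(\mathcal{L}''_{(\theta ,f)u}\), \(\mathcal{L}''_{u(\theta ,f)}\), \(\mathcal{L}''_{uu}\), \(e_{(\theta ,f)}\), \(e_{u}\), and `lam` stands for the pair \((p,q)\); all of these names and sizes are assumptions made for the example. It confirms that the three reduced solves reproduce the solution of the full KKT system (36).

```python
import numpy as np

rng = np.random.default_rng(0)
ny, nu = 6, 3                      # hypothetical sizes of state and control

# Hypothetical data: SPD Hessian of the Lagrangian, invertible state Jacobian
Q = rng.standard_normal((ny + nu, ny + nu))
L = Q @ Q.T + np.eye(ny + nu)
L_yy, L_yu = L[:ny, :ny], L[:ny, ny:]
L_uy, L_uu = L[ny:, :ny], L[ny:, ny:]
e_y = rng.standard_normal((ny, ny)) + 5.0 * ny * np.eye(ny)
e_u = rng.standard_normal((ny, nu))
J_y, J_u = rng.standard_normal(ny), rng.standard_normal(nu)
e = rng.standard_normal(ny)

S = np.linalg.solve(e_y, e_u)      # e_y^{-1} e_u
z = np.linalg.solve(e_y, e)        # e_y^{-1} e

# (i) reduced Hessian system H du = r, cf. (37) and (38)
H = L_uu + S.T @ (L_yy @ S - L_yu) - L_uy @ S
r = -J_u + S.T @ (J_y - L_yy @ z) + L_uy @ z
du = np.linalg.solve(H, r)
# (ii) linearized state system
dy = np.linalg.solve(e_y, -e_u @ du - e)
# (iii) adjoint state system
lam = np.linalg.solve(e_y.T, -J_y - L_yy @ dy - L_yu @ du)

# Reference: solve the full KKT system (36) directly
KKT = np.block([[L_yy, L_yu, e_y.T],
                [L_uy, L_uu, e_u.T],
                [e_y, e_u, np.zeros((ny, ny))]])
ref = np.linalg.solve(KKT, np.concatenate([-J_y, -J_u, -e]))
print(np.allclose(np.concatenate([dy, du, lam]), ref))
```

In a PDE setting the matrices `S` and `H` would not be formed; each application of \(e_{(\theta ,f)}^{-1}\) or \(e_{(\theta ,f)}^{-*}\) corresponds to one linearized state or adjoint solve.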

Based on these arguments and taking the control constraints into account, the reduced optimality conditions of the linear-quadratic problem (\(\mathrm{QP}^{k}\)) are given by

$$ \bigl(H\bigl(x^{k},p^{k},q^{k} \bigr)\delta u-r\bigl(x^{k},p^{k},q^{k}\bigr),\delta v-\delta u\bigr)_{(0,T)} \ge 0 \quad \forall \delta v\in U_{\mathrm{ad}}-\bigl\{ u^{k}\bigr\} , $$
(39)

where H is defined as in (37) and the residual r is given by

$$ r:=-J_{u}+e_{u}^{*}e_{(\theta ,f)}^{-*} \bigl(J_{(\theta ,f)}- \mathcal{L}_{(\theta ,f)}''e _{(\theta ,f)}^{-1}e\bigr)+ \mathcal{L}_{u(\theta ,f)}''e_{(\theta ,f)}^{-1}e. $$

Concluding, we state the rSQP algorithm for tackling the problem (P) in Algorithm 1.

Algorithm 1 (figure a): Reduced SQP method (outer loop)

3.2 Primal-dual active set (PDAS) strategy

In the next step we have to specify how to solve the reduced linear-quadratic optimal control problems arising in the iterations of the above SQP method. To this end, we use a primal-dual active set strategy. Let us assume that the active sets of the optimal solution of problem (\(\mathrm{QP}^{k}\)) are known, i.e. we can define

$$\begin{aligned}& A^{-}=\bigl\{ t\in (0,T)\mid \delta u=u_{a}-u^{k} \bigr\} , \\& A^{+}=\bigl\{ t\in (0,T)\mid \delta u=u_{b}-u^{k} \bigr\} , \\& I=(0,T)\setminus \bigl(A^{-}\cup A^{+}\bigr). \end{aligned}$$

Furthermore, we decompose the control \(\delta u=\delta u_{I}+\delta u _{A}\) in an active part \(\delta u_{A}\) and inactive part \(\delta u _{I}\) according to the previous sets:

$$ \delta u_{A}= \textstyle\begin{cases} u_{a}-u^{k}, &t\in A^{-},\\ u_{b}-u^{k},& t\in A^{+},\\ 0,&\text{else}, \end{cases}\displaystyle \quad \quad \delta u_{I}= \textstyle\begin{cases} 0,& t\in A^{-},\\ 0,& t\in A^{+},\\ \text{unknown},&t \in I. \end{cases} $$

The problem (\(\mathrm{QP}^{k}\)) can then be interpreted as an optimal control problem without control constraints, where \(\delta u_{I}\) serves as the control variable. For a given active part \(\delta u_{A}\), the variational inequality (39) simplifies to:

$$ H\bigl(x^{k},p^{k},q^{k}\bigr)\delta u_{I}=r\bigl(x^{k},p^{k},q^{k}\bigr)-H \bigl(x^{k},p^{k},q ^{k}\bigr)\delta u_{A}. $$

Now, the idea of the active set strategy is to iterate with respect to the active sets based on initial sets \(A^{-}_{0}\), \(A^{+}_{0}\) and \(I_{0}\). Suppose that for given active sets \(A^{-}_{l}\) and \(A^{+}_{l}\) the solution of the respective free optimal control problem is denoted by \(\delta u_{I}^{l}\) and we set \(\delta u^{l}=\delta u _{I}^{l}+\delta u_{A}^{l}\). Based on the variational inequality, an update of the active sets for a fixed constant \(c>0\) can be defined as follows

$$\begin{aligned}& A^{-}_{l+1}:=\bigl\{ t\in (0,T)\mid c\bigl(\delta u_{I}^{l}-u_{a}+u^{k}\bigr)-H \bigl(x^{k},p ^{k},q^{k}\bigr)\delta u_{I}^{l}+r\bigl(x^{k},p^{k},q^{k} \bigr)< 0\bigr\} , \\& A^{+}_{l+1}:=\bigl\{ t\in (0,T)\mid c\bigl(\delta u_{I}^{l}-u_{b}+u^{k}\bigr)-H \bigl(x^{k},p ^{k},q^{k}\bigr)\delta u_{I}^{l}+r\bigl(x^{k},p^{k},q^{k} \bigr)>0\bigr\} , \\& I_{l+1}=(0,T)\setminus \bigl(A_{l+1}^{-}\cup A_{l+1}^{+}\bigr). \end{aligned}$$

A usual stopping criterion is the coincidence of subsequent active sets, \(A^{-}_{l+1}=A^{-}_{l}\) and \(A^{+}_{l+1}=A^{+}_{l}\). One can easily check that, if this condition is satisfied, the optimal active sets have been determined, so that the variational inequality (39) is fulfilled and problem (\(\mathrm{QP}^{k}\)) is solved. In summary, the active set strategy for solving the linear-quadratic subproblems (\(\mathrm{QP}^{k}\)) of the SQP method is given in Algorithm 2.

Algorithm 2 (figure b): Primal-dual active set strategy for solving (\(\mathrm{QP}^{k}\)) (inner loop)
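The active set iteration described above can be sketched for a small box-constrained quadratic problem. This is an illustration under stated assumptions, not a transcription of Algorithm 2: the matrix `H`, the vector `r` and the bounds `lo`, `hi` (standing for \(u_{a}-u^{k}\) and \(u_{b}-u^{k}\)) are made-up data, `H` is stored explicitly, and the function name `pdas` is hypothetical.

```python
import numpy as np

def pdas(H, r, lo, hi, c=1.0, max_iter=50):
    """Primal-dual active set loop for min 0.5 u'Hu - r'u with lo <= u <= hi."""
    n = len(r)
    act_lo = np.zeros(n, dtype=bool)   # A^-_0: start with empty active sets
    act_hi = np.zeros(n, dtype=bool)   # A^+_0
    for _ in range(max_iter):
        act = act_lo | act_hi
        inact = ~act
        u = np.zeros(n)
        u[act_lo] = lo[act_lo]         # active part: pinned to the bounds
        u[act_hi] = hi[act_hi]
        if inact.any():                # inactive part: H_II u_I = r_I - H_IA u_A
            rhs = r[inact] - H[np.ix_(inact, act)] @ u[act]
            u[inact] = np.linalg.solve(H[np.ix_(inact, inact)], rhs)
        mu = r - H @ u                 # multiplier estimate r - H u
        new_lo = mu + c * (u - lo) < 0
        new_hi = mu + c * (u - hi) > 0
        if np.array_equal(new_lo, act_lo) and np.array_equal(new_hi, act_hi):
            return u                   # subsequent active sets coincide: stop
        act_lo, act_hi = new_lo, new_hi
    raise RuntimeError("active sets did not settle within max_iter")

# Hypothetical data: tridiagonal SPD matrix and symmetric bounds
H = 4.0 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)
r = np.array([8.0, 4.0, 0.0, -4.0, -8.0])
lo, hi = -np.ones(5), np.ones(5)
u = pdas(H, r, lo, hi)
print(u)
```

For this data the loop stops after the second pass with the two leftmost components at the upper bound and the two rightmost at the lower bound. The result can be verified through the projection form of the variational inequality, `u == clip(u + r - H u, lo, hi)`.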

In the last step, we have to provide a method for solving the linear system of equations in step 4 of the primal-dual active set strategy. Due to the definition of the reduced Hessian in (37), the system matrix H is not explicitly available after choosing a discretization strategy for the underlying partial differential equations. Hence, an iterative solver has to be used for the reduced Hessian system, e.g. the conjugate gradient (CG) method or the generalized minimal residual method (GMRES). In view of the second-order sufficient optimality conditions for the original problem, we have applied the CG method for solving

$$ \tilde{H}\delta u_{l}=(E_{I_{l}}HE_{I_{l}}+E_{A_{l}}) \delta u_{l,I}=E _{I_{l}}(r-H\delta u_{l,A})=:b. $$
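The system above can be solved without ever forming H. The following is a minimal matrix-free sketch, assuming a hypothetical callback `apply_H(v)` that returns the product of the reduced Hessian with a vector (in the setting of this paper, each call would correspond to linearized state and adjoint solves). The names `cg` and `solve_inactive` and the tolerance are illustrative choices.

```python
import numpy as np

def cg(apply_A, b, tol=1e-10, max_iter=200):
    """Plain conjugate gradient iteration for a symmetric positive definite operator."""
    x = np.zeros_like(b)
    res = b - apply_A(x)
    p = res.copy()
    rs = res @ res
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        res -= alpha * Ap
        rs_new = res @ res
        p = res + (rs_new / rs) * p
        rs = rs_new
    return x

def solve_inactive(apply_H, r, d_active, inactive):
    """Solve (E_I H E_I + E_A) d_I = E_I (r - H d_A) and return d_I + d_A."""
    mask = inactive.astype(float)          # diagonal of E_I; 1 - mask is E_A

    def apply_H_tilde(v):
        return mask * apply_H(mask * v) + (1.0 - mask) * v

    b = mask * (r - apply_H(d_active))
    return cg(apply_H_tilde, b) + d_active
```

The identity block \(E_{A_{l}}\) keeps the operator symmetric positive definite on the whole space whenever H is, so the same CG routine can be reused for every active set.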

4 Numerical results

In this section we discuss the numerical solution of the control problem (P). Firstly, we construct a test control problem in order to check the convergence of the reduced SQP method with a primal-dual active set strategy described above. Then we solve the optimal control problem for the hot rolling of DP steel. Here, for a globalization of the rSQP method, we use a projected gradient algorithm (see e.g. [15]) with a line search according to the Armijo rule to find suitable initial values for the rSQP method.

The numerical algorithms have been implemented in the WIAS-pdelib software. The finite element toolbox pdelib was used for solving the state and adjoint systems.

4.1 A test problem

Let \(\varOmega =(0,1)\times (0,1)\), let Γ denote the boundary of Ω, and let \(T> 0\). We apply the rSQP method discussed above to the semilinear parabolic boundary control problem

$$ \min J(\theta ,u)=\frac{1}{2} \int _{0}^{T} \int _{\varOmega }(\theta - \theta _{d,\varOmega })^{2}\,dx \,dt+\frac{1}{2} \int _{0}^{T} \int _{\varGamma }( \theta -\theta _{d,\varGamma })^{2}\,dx \,dt+\frac{1}{2} \int _{0}^{T}(u-u_{d})^{2}\,dt $$

subject to

$$\begin{aligned} &\theta _{t}-\Delta \theta =-\theta ^{5}+f(x,t), \quad \text{in } \varOmega \times (0,T), \\ &\frac{\partial \theta }{\partial \nu }+\theta =\bigl(\tilde{u}(t)-u(t)\bigr)g(x), \quad \text{on } \varGamma \times (0,T), \\ &\theta (x,0)=\theta _{0}(x), \quad \text{in } \varOmega , \end{aligned}$$

and

$$\begin{aligned} u_{a}\leq u(t)\leq u_{b} \quad \text{a.e. in }[0,T], \end{aligned}$$

where

$$\begin{aligned}& f(x,t)=e^{-5t}\cos ^{5}\pi x_{1}\cdot \cos ^{5}\pi x_{2}+e^{-t}\bigl(2\pi ^{2}-1 \bigr)\cos \pi x_{1}\cdot \cos \pi x_{2}, \\& \tilde{u}(t)=\bar{u}+e^{-t}, \\& g(x)=\cos \pi x_{1}\cdot \cos \pi x_{2}, \\& \theta _{d,\varOmega }=-5e^{-4t}(t-T)\cos ^{5}\pi x_{1}\cdot \cos ^{5} \pi x_{2}-\bigl(2\pi ^{2}(t-T)-e^{-t}-1\bigr)\cos \pi x_{1}\cdot \cos \pi x_{2}, \\& \theta _{d,\varGamma }=\bigl(e^{-t}-t+T\bigr)\cos \pi x_{1} \cdot \cos \pi x_{2}, \\& u_{d}=-e^{-t}-2(t-T), \\& \theta _{0}=\cos \pi x_{1}\cdot \cos \pi x_{2}, \\& T=1, \qquad u_{a}=-0.85,\qquad u_{b}=-0.4. \end{aligned}$$

The optimal control \(\bar{u}\) of this problem, with corresponding state \(\bar{\theta }\) and adjoint state \(\bar{p}\), is given by

$$\begin{aligned}& \bar{u}=\varPi _{[u_{a},u_{b}]}\bigl(-e^{-t}\bigr), \\& \bar{\theta }=e^{-t}\cos \pi x_{1}\cdot \cos \pi x_{2}, \\& \bar{p}=(t-T)\cos \pi x_{1}\cdot \cos \pi x_{2}. \end{aligned}$$

The triple of functions \((\bar {u},\bar {\theta },\bar{p})\) is chosen a priori, such that the first-order necessary optimality conditions are fulfilled.
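As a quick plausibility check of the data above, the following sketch evaluates the projected control and verifies by central finite differences that \(\bar{\theta }\) satisfies the state equation. The helper names, the step size `h` and the sample point are assumptions made for illustration only.

```python
import numpy as np

T, U_A, U_B = 1.0, -0.85, -0.4

def w(x1, x2):
    return np.cos(np.pi * x1) * np.cos(np.pi * x2)

def theta_bar(x1, x2, t):
    return np.exp(-t) * w(x1, x2)

def f_src(x1, x2, t):
    return np.exp(-5 * t) * w(x1, x2) ** 5 + np.exp(-t) * (2 * np.pi ** 2 - 1) * w(x1, x2)

def u_bar(t):
    # Projection of -exp(-t) onto [u_a, u_b]
    return np.clip(-np.exp(-t), U_A, U_B)

def state_residual(x1, x2, t, h=1e-4):
    """Finite-difference residual of theta_t - Laplace(theta) + theta^5 - f."""
    th = theta_bar
    d_t = (th(x1, x2, t + h) - th(x1, x2, t - h)) / (2 * h)
    lap = (th(x1 + h, x2, t) + th(x1 - h, x2, t)
           + th(x1, x2 + h, t) + th(x1, x2 - h, t) - 4 * th(x1, x2, t)) / h ** 2
    return d_t - lap + th(x1, x2, t) ** 5 - f_src(x1, x2, t)

# Times at which the bounds switch: lower bound active for t < -ln(0.85),
# upper bound active for t > -ln(0.4)
t_leave_lower = -np.log(-U_A)
t_reach_upper = -np.log(-U_B)
```

With these data the lower bound is active up to \(t\approx 0.16\) and the upper bound from \(t\approx 0.92\) onwards, so both box constraints are active on part of the time interval.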

To prove local optimality, we show that the second-order sufficient optimality conditions are satisfied. The formal Lagrange function is given by

$$ \begin{aligned} \mathcal{L}(\theta ,u,p) &=J(\theta ,u)- \int _{0}^{T} \int _{\varOmega }\bigl( \theta _{t}-\Delta \theta +\theta ^{5}-f(x,t)\bigr)p\,dx\,dt \\ &\quad{} - \int _{0}^{T} \int _{\varGamma }\biggl(\frac{\partial \theta }{\partial \nu }+\theta -( \tilde{u}-u)g(x) \biggr)p\,ds\,dt- \int _{\varOmega }\bigl(\theta (x,0)-\theta _{0}\bigr)p\,dx \end{aligned} $$

with

$$ \begin{aligned}[b] \mathcal{L}''( \bar {\theta },\bar {u},\bar {p}) (\theta ,u)&= \int _{0}^{T} \int _{\varOmega }\theta ^{2}\,dx\,dt+ \int _{0}^{T} \int _{\varGamma }\theta ^{2}\,ds\,dt+ \int _{0}^{T}u^{2}\,dt \\ &\quad{} -20 \int _{0}^{T} \int _{\varOmega }\bar {\theta }^{3}\theta ^{2}\bar {p}\,dx\,dt. \end{aligned} $$
(41)

The last term in (41) is non-negative due to \(\bar {\theta }^{3}\bar {p}=e^{-3t}(t-T)\cos ^{4} \pi x_{1} \cos ^{4}\pi x_{2}\le 0\) for all \(t\in [0,T]\). Hence,

$$ \mathcal{L}''(\bar {\theta },\bar {u},\bar {p}) (\theta ,u) \ge \Vert u \Vert _{L^{2}(0,T)}^{2}. $$

The sufficient optimality condition holds even in the entire control-state space, i.e. it is satisfied in a strong form. Following the lines of the proof of Theorem 5.19 in Tröltzsch [24], it can be shown that \((\bar{u}, \bar{\theta })\) is in fact a global optimal solution of the control problem formulated above. Hence, we can expect the global convergence of the rSQP method for arbitrary starting points.

We choose an initial point for the rSQP method

$$ u^{0}(t)\equiv -0.8,\quad \quad \theta ^{0}(x,t)=p^{0}(x,t) \equiv 1. $$

The parabolic problem was solved numerically by applying the semi-discretization approach, where the elliptic system in each time increment was solved by the Finite Element Method (FEM). The controls were chosen as piecewise constant functions on the time grid. The spatial domain is discretized with triangular finite elements with a maximal edge length of \(h=0.0125\). The time interval is discretized uniformly with stepsize \(\Delta t=0.001\).

The sequence of controls \(u^{k}\) produced by the rSQP algorithm is depicted in Fig. 3. The corresponding state and adjoint variables are displayed in Fig. 4.

Figure 3
figure 3

Controls \(u^{k}(t)\)

Figure 4
figure 4

State variable θ̄ (left) and adjoint variable \(\bar{p}\) (right) at the end time \(T=1\)

Table 1 illustrates the convergence behavior of the rSQP method. It contains the value of the objective function \(J_{k}\), the rate of convergence \(e_{k}\) and the error \(\tau _{k}\) that was used for the termination criterion,

$$\begin{aligned} &e_{k}=\frac{ \Vert u^{k}-\bar {u}\Vert _{L^{2}(0,T)}+ \Vert \theta ^{k}-\bar {\theta }\Vert _{L^{2}(Q)}+ \Vert p^{k}-\bar{p} \Vert _{L^{2}(Q)}}{ \Vert u^{k-1}-\bar {u}\Vert ^{2}_{L^{2}(0,T)}+ \Vert \theta ^{k-1}-\bar {\theta }\Vert ^{2}_{L^{2}(Q)}+ \Vert p^{k-1}-\bar{p} \Vert ^{2}_{L^{2}(Q)}}, \\ &\tau _{k}= \bigl\Vert u^{k}-u^{k-1} \bigr\Vert _{L^{2}(0,T)}, \end{aligned}$$

and the number of PDAS loops in the \(k^{\text{th}}\) iteration of the rSQP method. The rSQP method shows good convergence to the exact optimal solution ū.
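The two indicators can be evaluated from the discrete iterates as sketched below. The sketch assumes piecewise constant controls on a uniform time grid with step `dt` and takes the three error norms in \(e_{k}\) as precomputed numbers; the function names are hypothetical. If \(e_{k}\) stays bounded over the iterations, this indicates quadratic convergence.

```python
import numpy as np

def l2_time_norm(v, dt):
    """Discrete L2(0,T) norm of a piecewise constant function on a uniform grid."""
    return float(np.sqrt(dt * np.sum(np.asarray(v, dtype=float) ** 2)))

def rate_indicator(err_curr, err_prev):
    """e_k: sum of the current error norms over the sum of squared previous error norms."""
    return float(np.sum(err_curr) / np.sum(np.square(err_prev)))

def step_size(u_curr, u_prev, dt):
    """tau_k: L2(0,T) norm of the control update u^k - u^{k-1}."""
    return l2_time_norm(np.asarray(u_curr) - np.asarray(u_prev), dt)
```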

Table 1 Iterations history of the rSQP method with primal-dual active set strategy

As reported in [8, 9], the quadratic convergence of SQP methods is assured if the quadratic subproblems \((\mathit{QP}^{k})\) are solved with sufficiently high precision. The time-space discretization has to match the current accuracy of the SQP step. In our test example, we observe that after the third iteration the speed of convergence of the rSQP method is limited by the time-space discretization error.

4.2 Optimal control problem for dual phase steel

In this subsection we present a numerical solution of the optimal control problem (P) formulated for the production of Mo–Mn dual phase (DP) steel.

Let us choose a two-dimensional domain \(\varOmega =(0,7.5)\times (0,0.69) \ \text{cm}^{2}\). This corresponds to the vertical cross section of the steel slab moving through the cooling segment with a fixed strip speed. The aim is to compute the optimal cooling strategy for the DP steel with a desired ferrite fraction \(f_{d}(x)=85\%\) and a temperature \(\theta _{d}(x)=660^{\circ }\text{C}\) at the final time \(T=7\ \text{s}\). Thus, the optimal control problem reads as follows

$$ \begin{aligned} &\min J(\theta ,f,u)=\frac{\alpha _{1}}{2} \int _{\varOmega } \bigl(f(x,T)-f _{d}(x) \bigr)^{2} \,dx+\frac{\alpha _{2}}{2} \int _{\varOmega } \bigl(\theta (x,T)- \theta _{d}(x) \bigr)^{2} \,dx \\ &\hphantom{\min J(\theta ,f,u)=}{}+\frac{\alpha _{3}}{2} \int _{0}^{T}u ^{2}\,dt \\ &\quad \text{s.t.}\quad (\theta , f, u)\text{ satisfies (2a)--(2f) and } 0\le u(t)\le 0.3 \text{ a.e. in } [0,T]. \end{aligned} $$

The function \(G(\theta ,f)\), which describes the ferrite growth is given by

$$ G(\theta ,f)=\bigl(f_{\mathrm{eq}}(\theta )-f\bigr)\mathcal{H} \bigl(f_{\mathrm{eq}}(\theta )-f\bigr)g_{1}( \theta )g_{2}, $$
(42)

where \(\mathcal{H}\) is a monotone approximation of the Heaviside function

$$ \mathcal{H}(x)= \textstyle\begin{cases} 1, & \text{for } x\geq \delta , \\ 10(\frac{x}{\delta })^{6}-24(\frac{x}{\delta })^{5}+15(\frac{x}{ \delta })^{4}, &\text{for } \delta >x\geq 0, \\ 0 , &\text{for } x< 0. \end{cases} $$

The arguments of the Heaviside function lie in \((-1,1)\); we found \(\delta =10^{-2}\) to be a realistic choice. The equilibrium volume fraction of ferrite \(f_{\mathrm{eq}}(\theta )\) and the temperature-dependent factor \(g_{1}(\theta )\) are cubic spline functions interpolating the pointwise data as shown in Fig. 5. The factor representing the preconditioning of the initial phase austenite is given by \(g_{2}=10\). The model (42) for the austenite-ferrite phase transformation in the hot rolling process has been discussed in [22]. For further details about the modeling we refer to this article. We note that Assumption (A2) is too strong for the function \(G(\theta ,f)\). Nevertheless, the existence and uniqueness of the solution to the state system can also be shown for this function, and all other theoretical and numerical considerations remain unchanged.
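The regularized Heaviside function and the growth law (42) translate directly into code. In the following sketch the spline functions \(f_{\mathrm{eq}}\) and \(g_{1}\) are passed in as callables, because their interpolation data are only given graphically in Fig. 5; the function names are hypothetical.

```python
import numpy as np

def smoothed_heaviside(x, delta=1e-2):
    """Polynomial approximation of the Heaviside function.

    Clipping s = x/delta to [0, 1] reproduces the three cases of the definition,
    since the polynomial equals 0 at s = 0 and 1 at s = 1.
    """
    s = np.clip(np.asarray(x, dtype=float) / delta, 0.0, 1.0)
    return 10 * s ** 6 - 24 * s ** 5 + 15 * s ** 4

def ferrite_growth(theta, f, f_eq, g1, g2=10.0, delta=1e-2):
    """Growth rate G(theta, f) from (42); f_eq and g1 are user-supplied callables."""
    diff = f_eq(theta) - f
    return diff * smoothed_heaviside(diff, delta) * g1(theta) * g2
```

On the transition interval the derivative with respect to \(s=x/\delta \) is \(60s^{3}(1-s)^{2}\ge 0\), which confirms the monotonicity of \(\mathcal{H}\).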

Figure 5
figure 5

The functions \(f_{\mathrm{eq}}(\theta )\) (left) and \(g_{1}(\theta )\) (right)

We use the following physical parameters for the heat equation (2c). The reference density at \(20^{\circ }\text{C}\) is chosen to be \(\rho =7.85\ \frac{\text{g}}{\text{cm}^{3}} \). The values for the heat conductivity κ and the specific heat \(c_{p}\) are set to

$$ c_{p}=0.5096 \ \frac{\text{J}}{\text{g}\cdot \text{K}}, \quad\quad \kappa =0.5\ \frac{\text{J}}{\text{s}\cdot \text{cm} \cdot \text{K}}. $$

These values are only a rough constant approximation of the temperature-dependent functions \(\kappa (\theta )\), \(c_{p}(\theta )\). More details on the thermal physical parameters can be found, e.g., in [21]. The latent heat L of the austenite-ferrite phase transformation is specified according to [11] as \(L= 77.0\ \frac{\text{J}}{\text{g}}\).
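As a rough plausibility check of these constants, assume that the heat equation (2c) has the usual form in which \(\rho c_{p}\) multiplies \(\theta _{t}\) and κ multiplies the Laplacian. Taking half of the slab thickness of 0.69 cm as a length scale (an illustrative choice), the thermal diffusivity a, a symbol introduced here only for this estimate, and the associated diffusion time are

$$ a=\frac{\kappa }{\rho c_{p}}=\frac{0.5}{7.85\cdot 0.5096}\ \frac{\text{cm}^{2}}{\text{s}}\approx 0.125\ \frac{\text{cm}^{2}}{\text{s}}, \quad\quad \frac{(0.345\ \text{cm})^{2}}{a}\approx 0.95\ \text{s}. $$

This is well below the final time \(T=7\ \text{s}\), so cooling applied at the surface can be expected to influence the core of the slab within the time horizon.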

The initial condition for the temperature is \(\theta _{0}=860 ^{\circ }\text{C} \) and \(\theta _{w}=20^{\circ }\text{C}\). Notice that (A3) is satisfied. The water profile in the cooling segment is given by

$$ \beta (x)=e^{-0.01(x-3.75)^{2}}. $$

It should be mentioned that the choice of the weighting factors \(\alpha _{1}\), \(\alpha _{2}\), \(\alpha _{3}\) in the cost functional of the optimal control problem (P) is of crucial importance for the numerical computations. The volume phase fraction satisfies \(f\in [0,1]\), while the temperature θ is in the range of \(20^{\circ }\text{C}\)–\(1200^{\circ }\text{C}\). Therefore, in order to obtain useful results, these two terms in the cost functional have to be balanced. In the subsequent computations we set \(\alpha _{1}=1\), \(\alpha _{2}=5\cdot 10^{-6}\). The factor \(\alpha _{3}\) is a Tikhonov regularization parameter and is chosen as 0.1.
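The effect of this scaling can be illustrated with a small calculation. The deviations used below (5 percentage points in the ferrite fraction and 20 K in the temperature) are hypothetical values chosen only to show the orders of magnitude of the two tracking integrands.

```python
ALPHA_1 = 1.0
ALPHA_2 = 5e-6

def tracking_densities(f_dev, theta_dev, alpha_1=ALPHA_1, alpha_2=ALPHA_2):
    """Pointwise integrands of the phase and temperature tracking terms."""
    return 0.5 * alpha_1 * f_dev ** 2, 0.5 * alpha_2 * theta_dev ** 2

# Assumed sample deviations: 0.05 in the ferrite fraction, 20 K in the temperature
phase_term, temp_term = tracking_densities(0.05, 20.0)
# With alpha_2 = 1 the temperature term would dominate by several orders of magnitude
_, unscaled_temp_term = tracking_densities(0.05, 20.0, alpha_2=1.0)
```

With the chosen weights both contributions are of order \(10^{-3}\), whereas an unscaled temperature term would be of order \(10^{2}\).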

The nonlinear state system (2a)–(2f) as well as the corresponding adjoint system in each iteration of the projected gradient method can be solved numerically using a semi-implicit Euler scheme. The rSQP method requires the solution of the linearized problems \((\mathit{QP}^{k})\). Here, the linear parabolic equation was discretized in a standard way using the method of lines, and the ODE for the phase transition was treated numerically by the explicit Euler scheme.

The FE triangulation of the computational domain Ω is done by a uniform mesh with \(N=561\) degrees of freedom. For the time step, we take \(\Delta t=0.0125\). We approximate the control function \(u(t)\) with piecewise constant functions on the time grid such that the unknown control function is represented as \(u=(u_{1},\ldots,u_{n-1})^{T}\), \(u_{i}=u(t_{i})\), \(i=1,\ldots,n-1\).

As explained above, we use the gradient projection method for the globalization of the rSQP algorithm. As an initial guess for the gradient projection method we take \(u_{0}\equiv 0\). The algorithm was terminated after 7 iterations, when the relative error \(\Vert u ^{k+1}-u^{k} \Vert _{L^{2}(0,T)}/ \Vert u^{k} \Vert _{L^{2}(0,T)}\) fell below 0.01. The obtained control function û, together with the corresponding state and adjoint variables, serves as the initial iterate of the rSQP method.
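The globalization step can be sketched for a generic box-constrained problem as follows. The sketch assumes callables `J` and `grad` for the reduced cost and its gradient (in the setting of this paper these would involve state and adjoint solves) and uses the Armijo condition in the form commonly stated for projected steps, see e.g. [15]. The parameter values `sigma` and `beta` and the function name are illustrative assumptions.

```python
import numpy as np

def projected_gradient(J, grad, u0, lo, hi, rel_tol=1e-2,
                       sigma=1e-4, beta=0.5, max_iter=100):
    """Projected gradient method with Armijo backtracking on the box [lo, hi]."""
    u = np.clip(np.asarray(u0, dtype=float), lo, hi)
    for k in range(1, max_iter + 1):
        g = grad(u)
        J_u = J(u)
        s = 1.0
        while True:
            u_new = np.clip(u - s * g, lo, hi)        # projected trial step
            decrease = (sigma / s) * np.sum((u_new - u) ** 2)
            if J(u_new) <= J_u - decrease or s < 1e-12:
                break
            s *= beta                                  # backtracking
        step = np.linalg.norm(u_new - u)
        ref = np.linalg.norm(u)
        u = u_new
        # Relative change below rel_tol; the case u^k = 0 is skipped to avoid 0/0
        if step == 0.0 or (ref > 0.0 and step < rel_tol * ref):
            return u, k
    return u, max_iter
```

On a uniform time grid the ratio of Euclidean norms used here coincides with the ratio of the discrete \(L^{2}(0,T)\) norms, because the factor \(\Delta t\) cancels.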

Table 2 shows the convergence history of the rSQP steps. As expected, the rSQP method converges in a few steps to the optimal solution with \(\mathit{tol}=10^{-3}\) in the termination condition. In Fig. 6, some iterations of the gradient projection algorithm and the rSQP method are shown.

Figure 6
figure 6

Some iterations of the optimization procedure

Table 2 Value of objective function \(J_{k}\), relative error \(\tau _{k}\) and number of PDAS loops in \(k^{\text{th}}\)-iteration of the rSQP method

The optimal control \(u(t)\) is depicted in Fig. 7. Closer to the end of the time interval the optimal control decreases to zero, which is the lower bound of the control. This fact also reflects the presence of the box constraints and the functioning of the active set method.

Figure 7
figure 7

Optimal control \(u(t)\)

Figure 8 shows the simulated final temperature (left) and phase distribution (right) in the cross section of the steel slab in selected iterations of the optimization procedure. In each iteration of the rSQP method, the temperature distribution in the steel slab becomes more homogeneous and closer to the desired value \(\theta _{d}=660^{ \circ }\text{C}\). On the other hand, the maximal difference between the ferrite values at the final time is about 17%. However, in each iteration of the rSQP method, the ferrite phase fraction in the largest part of the cross section is close to 85%.

Figure 8
figure 8

The simulated final temperature (left) and phase distribution (right) in the cross section of the steel slab in selected iterations of the optimization procedure. In both pictures the 1st and 7th (final) iterations of the gradient projection method and the 1st and 3rd (final) iterations of the rSQP method are depicted in order from top to bottom

We additionally plot the temperature and ferrite growth during cooling in the middle of the cross section of the steel slab. The simulation results are shown in Fig. 9. The desired temperature of \(660^{\circ }\text{C}\) and ferrite fraction of 85% are reached very accurately in the middle of the cross section.

Figure 9
figure 9

The simulated temperature (left) and ferrite fraction evolution (right) in the middle of the cross section of the steel slab

5 Conclusions

We have studied an optimal control problem that describes the hot rolling process of multiphase steel. The nonlinear boundary control problem was analyzed and the first-order necessary and second-order sufficient optimality conditions were derived. The control problem was solved numerically by a reduced SQP method with a primal-dual active set strategy.

The approach has already been tested in an industrial setting. The results of the optimal control of the cooling line have been verified in hot rolling experiments at the pilot hot rolling mill at the Institute for Metal Forming (IMF), TU Bergakademie Freiberg. For more details we refer to a recent paper [4].

A challenging topic for future research is the real-time control of the hot rolling process, which is an important task for the industrial deployment of this approach. Here, recent developments in model reduction techniques seem to be a promising tool and will be the subject of further work by the authors.

Abbreviations

DP: Dual Phase

ROT: Run Out Table

rSQP: reduced Sequential Quadratic Programming

ODE: Ordinary Differential Equation

SSC: Second-order Sufficient Optimality Conditions

PDAS: Primal-Dual Active Set

KKT: Karush–Kuhn–Tucker

CG: Conjugate Gradient

GMRES: Generalized Minimal Residual

FEM: Finite Element Method

References

  1. Alt W. Nichtlineare Optimierung: Eine Einführung in Theorie, Verfahren und Anwendungen. Vieweg Studium: Aufbaukurs Mathematik. Wiesbaden: Vieweg+Teubner; 2002.

  2. Bergounioux M, Ito K, Kunisch K. Primal-dual strategy for constrained optimal control problems. SIAM J Control Optim. 1999;37:1176–94.

  3. Bhadeshia H, Honeycombe R. Steels: microstructure and properties. Amsterdam: Elsevier; 2011.

  4. Bleck W, Hömberg D, Prahl U, Suwanpinij P, Togobytska N. Optimal control of a cooling line for production of hot rolled dual phase steel. Steel Res Int. 2014;85:1328–33.

  5. Dieudonne J. Foundations of modern analysis. Pure and applied mathematics. Read Books; 1960.

  6. Fasano A, Hömberg D, Panizzi L. A mathematical model for case hardening of steel. Math Models Methods Appl Sci. 2009;19:2101–26.

  7. Goldberg H, Tröltzsch F. Second order sufficient optimality conditions for a class of nonlinear parabolic boundary control problems. SIAM J Control Optim. 1993;31:1007–25.

  8. Goldberg H, Tröltzsch F. On a Lagrange–Newton method for a nonlinear parabolic boundary control problem. Optim Methods Softw. 1998;8:225–47.

  9. Goldberg H, Tröltzsch F. On a SQP-multigrid technique for nonlinear parabolic boundary control problems. Berlin: Springer; 1998.

  10. Hashimoto T, Yoshioka Y, Ohtsuka T. Model predictive control for hot strip mill cooling system. In: Proceedings of the IEEE international conference on control applications. 2010. p. 646–51.

  11. Hengerer F, Strässle B, Bremi P. Berechnung der Abkühlungsvorgänge beim Öl- und Lufthärten zylinder- und plattenförmiger Werkstücke aus legiertem Vergütungsstahl mit Hilfe einer elektronischen Rechenanlage: Calcul, à l’aide d’une calculatrice électronique, des processus de refroidissement se déroulant lors de la trempe à l’huile et à l’air de cylindres et de plaques en acier allié. In: Bericht des Werkstoffausschusses des Vereins deutscher Eisenhüttenleute. Stahleisen; 1969.

  12. Hintermüller M, Volkwein S, Diwoky F. Fast solution techniques in constrained optimal boundary control of the semilinear heat equation. In: Control of coupled partial differential equations. Internat. series numer. math. vol. 155. Basel: Birkhäuser; 2007. p. 119–47.

  13. Hömberg D, Sokolowski J. Optimal shape design of inductor coils for surface hardening. SIAM J Control Optim. 2003;42(3):1087–117.

  14. Hömberg D, Volkwein S. Control of laser surface hardening by a reduced-order approach using proper orthogonal decomposition. Math Comput Model. 2003;37:1003–28.

  15. Kelley CT. Iterative methods for optimization. Frontiers in applied mathematics. vol. 18. Philadelphia: SIAM; 1999.

  16. Kupfer F-S, Sachs EW. Numerical solution of a nonlinear parabolic control problem by a reduced SQP method. Comput Optim Appl. 1992;1:113–35.

  17. Landl G, Engl HW. Optimal strategies for the cooling of steel strips in hot strip mills. Inverse Probl Eng. 1995;2:103–18.

  18. Lezius R, Tröltzsch F. Theoretical and numerical aspects of controlled cooling of steel profiles. In: Progress in industrial mathematics at ECMI 94. Berlin: Springer; 1996. p. 380–8.

  19. Nocedal J, Wright S. Numerical optimization. 2nd ed. Springer series in operations research and financial engineering. New York: Springer; 2006.

  20. Raymond JP, Tröltzsch F. Second order sufficient optimality conditions for nonlinear parabolic control problems with state constraints. Discrete Contin Dyn Syst. 2000;6:431–50.

  21. Spittel M, Spittel T. Metal forming data of ferrous alloys—deformation behaviour. Landolt–Börnstein—group VIII advanced materials and technologies. vol. 2C1. Berlin: Springer; 2009.

  22. Suwanpinij P, Togobytska N, Prahl U, Weiss W, Hömberg D, Bleck W. Numerical cooling strategy design for hot rolled dual phase steel. Steel Res Int. 2010;11:1001–9.

  23. Tröltzsch F. An SQP method for the optimal control of a nonlinear heat equation. Control Cybern. 1994;23:268–88.

  24. Tröltzsch F. Optimal control of partial differential equations: theory, methods, and applications. Graduate studies in mathematics. vol. 112. Providence: Am. Math. Soc.; 2010.

  25. Wang B-X, Zhang D-H, Wang J, Yu M, Zhou N, Cao G-M. Application of neural network to prediction of plate finish cooling temperature. J Cent South Univ Technol. 2008;15:136–40.

  26. Zheng Y, Li N, Li S. Hot-rolled strip laminar cooling process plant-wide temperature monitoring and control. Control Eng Pract. 2013;21(1):23–30.

Acknowledgements

The authors would like to thank Marcel Graf and Piyada Suwanpinij for carrying out the experiments on the hot rolling mill. Special thanks to Wolf Weiss for fruitful discussions about mathematical modeling and the interpretation of measurements. The third author is grateful for the financial support from the DFG.

Availability of data and materials

Please contact author for data requests.

Funding

The work on this paper has been partially supported by Deutsche Forschungsgemeinschaft (DFG) within the priority program 1204 “Algorithms for fast, material specific process-chain design and analysis in metal forming”.

Author information

Contributions

The three authors contributed equally to this paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nataliya Togobytska.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Hömberg, D., Krumbiegel, K. & Togobytska, N. Optimal control of multiphase steel production. J.Math.Industry 9, 6 (2019). https://doi.org/10.1186/s13362-019-0063-x

Keywords