A new nonmonotone adaptive trust region line search method for unconstrained optimization
Journal of Mathematics in Industry volume 10, Article number: 13 (2020)
Abstract
This paper proposes a new nonmonotone adaptive trust region line search method for solving unconstrained optimization problems and presents a modified trust region ratio that achieves a more reasonable agreement between the accurate model and the approximate model. The approximation of the Hessian matrix is updated by a modified BFGS formula. The trust region radius adopts a new adaptive strategy that avoids additional computational cost at each iteration. The global convergence and superlinear convergence of the method are established under suitable conditions. Finally, numerical results show that the proposed method is very efficient.
Introduction
Consider the following unconstrained optimization problem
where \(f : R^{n} \to R\) is a twice continuously differentiable function. The trust region method is one of the prominent classes of iterative methods. The basic idea of trust region methods is as follows: at the current iterate \(x_{k}\), the trial step \(d_{k}\) is obtained by solving the subproblem:
where \(f_{k} = f(x_{k})\), \(g_{k} = \nabla f(x_{k})\), \(G_{k} = \nabla^{2}f(x_{k})\), \(B_{k}\) is a symmetric approximation of \(G_{k}\), \(\Delta_{k}\) is the trust region radius, and \(\Vert \cdot \Vert \) is the Euclidean norm.
To evaluate the agreement between the model and the objective function, the most common ratio is defined as follows:
where the numerator is called the actual reduction and the denominator is called the predicted reduction. The ratio \(\rho_{k}\) is used to determine whether the trial step \(d_{k}\) is accepted. Given \(\mu\in [ 0,1 ]\), if \(\rho_{k} < \mu \), the trial step \(d_{k}\) is unsuccessful and the subproblem (2) is solved again with a smaller radius. Otherwise, \(d_{k}\) is accepted and the radius may be increased.
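As a minimal illustration, the ratio of actual to predicted reduction can be computed as follows; the quadratic objective and the trial step here are hypothetical toy choices, not taken from the paper:

```python
import numpy as np

def rho(f, x, d, g, B):
    """Standard trust region ratio: actual reduction over predicted reduction.

    f: objective function, x: current iterate, d: trial step,
    g: gradient at x, B: symmetric Hessian approximation at x.
    """
    ared = f(x) - f(x + d)             # actual reduction
    pred = -(g @ d + 0.5 * d @ B @ d)  # predicted reduction m_k(0) - m_k(d)
    return ared / pred

# toy example: for a quadratic with exact Hessian the model is exact, so rho = 1
f = lambda v: 0.5 * v @ v
x = np.array([1.0, 1.0])
g = x.copy()        # gradient of 0.5||x||^2 is x
B = np.eye(2)       # exact Hessian
d = -0.5 * g        # some trial step inside the trust region
print(rho(f, x, d, g, B))  # prints 1.0
```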
It is well known that monotone techniques, which require the objective function to decrease at each iteration, may slow down the rate of convergence, especially in the presence of a narrow curved valley. To overcome these disadvantages, Grippo et al. [1] proposed a nonmonotone technique for Newton’s method in 1986. In 1998, Nocedal and Yuan [2] proposed a nonmonotone trust region method with line search techniques, in which the step size \(\alpha_{k}\) satisfies the following inequality:
where \(\sigma\in(0,1)\). The general nonmonotone term \(f_{l(k)}\) is defined by \(f_{l(k)} = \max_{0 \le j \le m(k)}\{ f_{k - j}\}\), in which \(m(0) = 0\), \(0 \le m(k) \le\min\{ m(k - 1) + 1,N\}\) and \(N \ge0\) is an integer constant.
However, the general nonmonotone strategy does not sufficiently employ the current value of the objective function f, and the nonmonotone term appears to perform well only far from the optimum. In order to introduce a more relaxed nonmonotone strategy, Ahookhosh et al. [3] introduced a modified nonmonotone term in 2012. More precisely, for \(\sigma\in(0,1)\), the step size \(\alpha_{k}\) satisfies the following inequality:
where the nonmonotone term \(R_{k}\) is defined by
in which \(\eta_{k} \in[\eta_{\min},\eta_{\max} ]\), with \(\eta_{\min} \in [0,1)\), and \(\eta_{\max} \in[\eta_{\min},1]\).
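The two nonmonotone terms \(f_{l(k)}\) and \(R_{k}\) can be sketched in a few lines; the window length and the sample function values below are hypothetical:

```python
from collections import deque

def nonmonotone_terms(history, f_k, eta_k):
    """Compute f_{l(k)} (max over the stored recent f-values) and the
    relaxed term R_k = eta_k * f_{l(k)} + (1 - eta_k) * f_k."""
    f_lk = max(history)
    return f_lk, eta_k * f_lk + (1.0 - eta_k) * f_k

# keep at most N+1 recent values with a bounded deque (N = 5 here)
history = deque(maxlen=6)
for fv in [10.0, 4.0, 7.0, 3.0]:   # hypothetical values f_0, ..., f_3
    history.append(fv)

f_lk, R_k = nonmonotone_terms(history, f_k=3.0, eta_k=0.5)
print(f_lk, R_k)   # 10.0 6.5
```

Note that \(f_{k} \le R_{k} \le f_{l(k)}\) always holds, so \(R_{k}\) interpolates between the monotone and the fully nonmonotone acceptance rules.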
It is known that an adaptive radius avoids the blindness of choosing the initial trust region radius and may decrease the total number of iterations. In 1997, Sartenaer [4] proposed a new strategy for automatically determining the initial trust region radius. In 2002, Zhang et al. [5] proposed a new scheme to determine the trust region radius as follows: \(\Delta_{k} = c^{p} \Vert \widehat{B}_{k}^{-1} \Vert \Vert g_{k} \Vert \). To avoid calculating the inverse of the matrix \(B_{k}\) and an estimation of \(\widehat{B}_{k}^{-1}\) at each iteration, Li [6] proposed an adaptive trust region radius as follows: \(\Delta_{k} = \frac{ \Vert d_{k - 1} \Vert }{ \Vert y_{k - 1} \Vert } \Vert g_{k} \Vert \), where \(y_{k - 1} = g_{k} - g_{k - 1}\). Inspired by these facts, several modified versions of adaptive trust region methods have been proposed in [7–14].
This paper is organized as follows. In Sect. 2, we describe the new algorithm. The global and superlinear convergence of the algorithm are established in Sect. 3. In Sect. 4, numerical results are reported, which show that the new method is effective. Finally, conclusions are drawn in Sect. 5.
New algorithm
In this section, a new adaptive nonmonotone trust region line search algorithm is proposed. Based on the method of Li [6], we propose an adaptive trust region radius as follows:
where \(c_{k}\) is an adjustment parameter. Prompted by the adaptive technique, the proposed method has the following desirable properties: it is convenient to adjust the radius by using the adjustment parameter \(c_{k}\), and the algorithm also reduces the associated workload and computation time.
On the basis of the above discussion, at each iteration, a trial step \(d_{k}\) is obtained by solving the following trust region subproblem:
where \(y_{k - 1} = g_{k} - g_{k - 1}\). The matrix \(B_{k}\) is updated by a modified BFGS formula [11],
where \(d_{k} = x_{k + 1} - x_{k}\), \(y_{k} = g_{k + 1} - g_{k}\), \(z_{k} = y_{k} + t_{k} \Vert g_{k} \Vert d_{k}\), \(t_{k} = 1 + \max \{ - \frac{y_{k}^{T}d_{k}}{ \Vert g_{k} \Vert \Vert d_{k} \Vert },0 \}\).
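Since the display of formula (9) is not reproduced here, the following sketch assumes the standard BFGS rank-two form with \(y_{k}\) replaced by \(z_{k}\), which is a common reading of such modified BFGS updates (the paper cites [11] for the exact formula); the function name is ours:

```python
import numpy as np

def modified_bfgs(B, d, y, g_norm):
    """BFGS-type update with y_k replaced by z_k = y_k + t_k ||g_k|| d_k,
    where t_k = 1 + max(-y_k^T d_k / (||g_k|| ||d_k||), 0).

    Assumed form:  B+ = B - (B d d^T B)/(d^T B d) + (z z^T)/(z^T d).
    """
    t = 1.0 + max(-(y @ d) / (g_norm * np.linalg.norm(d)), 0.0)
    z = y + t * g_norm * d
    Bd = B @ d
    return B - np.outer(Bd, Bd) / (d @ Bd) + np.outer(z, z) / (z @ d)

# toy check: the updated matrix satisfies the secant-like condition B+ d = z
B = np.eye(2)
d = np.array([1.0, 0.0])
y = np.array([2.0, 0.0])
B_new = modified_bfgs(B, d, y, g_norm=1.0)
```

By construction this rank-two form satisfies \(B_{k+1} d_{k} = z_{k}\), the modified secant condition that motivates replacing \(y_{k}\) with \(z_{k}\).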
Considering the advantage of Ahookhosh’s nonmonotone term, the best convergence behavior can be obtained by adopting a stronger nonmonotone strategy away from the solution and a weaker one close to the solution. We define a modified form of the trust region ratio as follows:
As seen, the effect of nonmonotonicity in (10) can be controlled through both the numerator and the denominator.
Now, we list the new adaptive nonmonotone trust region line search algorithm as follows:
Algorithm 2.1
(New nonmonotone adaptive trust region algorithm)

Step 0.
Given an initial point \(x_{0} \in R^{n}\) and a symmetric matrix \(B_{0} \in R^{n \times n}\). The constants \(0 < \mu_{1} < \mu_{2} < 1\), \(0 < \eta_{\min} \le\eta_{\max} < 1\), \(0 < \beta_{1} < 1 < \beta_{2}\), \(0 < \delta_{1} < 1 < \delta_{2}\), \(N > 0\) and \(\varepsilon> 0\) are also given. Set \(k = 0\), \(c_{0} = 1\).

Step 1.
If \(\Vert g_{k} \Vert \le\varepsilon \), then stop. Otherwise, go to Step 2.

Step 2.
Solve the subproblem (8) to obtain \(d_{k}\).

Step 3.
Compute \(R_{k}\) and \(\widehat{\rho}_{k}\) respectively.

Step 4.
$$c_{k + 1} := \begin{cases} \beta_{1}c_{k}, & \text{if } \widehat{\rho}_{k} < \mu_{1}, \\ c_{k}, & \text{if } \mu_{1} \le\widehat{\rho}_{k} < \mu_{2}, \\ \beta_{2}c_{k}, & \text{if } \widehat{\rho}_{k} \ge\mu_{2}. \end{cases}$$

Step 5.
If \(\widehat{\rho}_{k} \ge\mu_{1}\), set \(x_{k + 1} = x_{k} + d_{k}\) and go to Step 6. Otherwise, find the step size \(\alpha_{k}\) satisfying (5), set \(x_{k + 1} = x_{k} + \alpha_{k}d_{k}\), and go to Step 6.

Step 6.
Update the trust region radius by \(\Delta_{k + 1} = c_{k + 1}\frac{ \Vert x_{k + 1} - x_{k} \Vert }{ \Vert g_{k + 1} - g_{k} \Vert } \Vert g_{k + 1} \Vert \) and go to Step 7.

Step 7.
Compute the new Hessian approximation \(B_{k + 1}\) by the modified BFGS formula (9). Set \(k = k + 1\) and go to Step 1.
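The steps above can be sketched as a single loop. Several pieces are assumptions on our part, not from the paper: the subproblem is solved only approximately via the Cauchy point (the paper does not fix a solver), the ratio \(\widehat{\rho}_{k}\) is taken as \(R_{k} - f(x_{k} + d_{k})\) over the predicted reduction since the display of (10) is not reproduced here, \(\eta_{k}\) is held fixed, and the line search backtracks by halving:

```python
import numpy as np
from collections import deque

def cauchy_step(g, B, delta):
    """Approximate subproblem solution via the Cauchy point (placeholder solver)."""
    gBg = g @ B @ g
    tau = 1.0 if gBg <= 0 else min(np.linalg.norm(g)**3 / (delta * gBg), 1.0)
    return -tau * delta / np.linalg.norm(g) * g

def nls(f, grad, x0, mu1=0.25, mu2=0.75, beta1=0.25, beta2=1.5,
        eta=0.5, N=5, sigma=1e-4, eps=1e-6, max_iter=5000):
    x, c = x0.astype(float), 1.0
    B = np.eye(len(x0))
    g = grad(x)
    delta = np.linalg.norm(g)              # assumed initial radius
    hist = deque([f(x)], maxlen=N + 1)     # recent f-values for f_{l(k)}
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:       # Step 1: stopping test
            break
        d = cauchy_step(g, B, delta)       # Step 2: trial step
        fx, ftrial = f(x), f(x + d)
        Rk = eta * max(hist) + (1 - eta) * fx           # Step 3: nonmonotone term
        pred = -(g @ d + 0.5 * d @ B @ d)
        rho_hat = (Rk - ftrial) / pred                  # assumed form of (10)
        # Step 4: update the adjustment parameter c_k
        c = beta1 * c if rho_hat < mu1 else (beta2 * c if rho_hat >= mu2 else c)
        # Step 5: accept the full step, or backtrack until (5) holds
        if rho_hat >= mu1:
            x_new = x + d
        else:
            a = 1.0
            while f(x + a * d) > Rk + sigma * a * (g @ d):
                a *= 0.5
            x_new = x + a * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        # Step 6: adaptive radius Delta_{k+1} = c ||s|| / ||y|| * ||g_{k+1}||
        if np.linalg.norm(y) > 0:
            delta = c * np.linalg.norm(s) / np.linalg.norm(y) * np.linalg.norm(g_new)
        # Step 7: modified BFGS update (assumed rank-two form with z_k)
        if np.linalg.norm(s) > 0:
            t = 1.0 + max(-(y @ s) / (np.linalg.norm(g) * np.linalg.norm(s)), 0.0)
            z = y + t * np.linalg.norm(g) * s
            Bs = B @ s
            if s @ Bs > 1e-12 and z @ s > 1e-12:
                B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(z, z) / (z @ s)
        x, g = x_new, g_new
        hist.append(f(x))
    return x

# toy usage on a simple quadratic
x_star = nls(lambda v: 0.5 * v @ v, lambda v: v, np.array([3.0, 4.0]))
```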
Assumption 2.1

H1.
The level set \(L(x_{0}) = \{ x \in R^{n} \mid f(x) \le f(x_{0})\} \subset\varOmega \), where \(\varOmega\subset R^{n}\) is bounded, and \(f(x)\) is continuously differentiable on the level set \(L(x_{0})\).

H2.
The matrix \(B_{k}\) is uniformly bounded, i.e., there exists a constant \(M_{1} > 0\) such that \(\Vert B_{k} \Vert \le M_{1}\), \(\forall k \in N \cup\{ 0\}\).
Remark 2.1
If f is a twice continuously differentiable function, then H1 implies that ∇f is continuous and uniformly bounded on Ω. Hence, there exists a constant L such that
Convergence analysis
Lemma 3.1
There is a constant \(\tau\in(0,1)\) such that the trial step \(d_{k}\) satisfies the following inequalities:
Proof
The proof is analogous to the proofs of Lemma 6 and Lemma 7 in [15] and is omitted here. □
Lemma 3.2
Suppose that Assumption 2.1 holds. Then we have
where \(p_{k}\) is the number of times the subproblem is solved from the previous trial step \(d_{k - 1}\) to the currently accepted trial step \(d_{k}\).
Proof
According to Step 4 of Algorithm 2.1, the trust region radius satisfies \(\Delta_{k} \ge c_{k}\frac{ \Vert g_{k} \Vert }{ \Vert B_{k} \Vert } \ge\frac{\beta_{1}^{p_{k}} \Vert g_{k} \Vert }{ \Vert B_{k} \Vert } \ge\frac{\beta_{1}^{p_{k}} \Vert g_{k} \Vert }{M_{1}}\). Thus, since \(\Vert d_{k} \Vert \le\Delta_{k}\), the step \(\overline{d}_{k}\) with \(\Vert \overline{d}_{k} \Vert = \frac{\beta_{1}^{p_{k}} \Vert g_{k} \Vert }{M_{1}}\) is a feasible solution of the trust region subproblem. Therefore, we obtain
□
Lemma 3.3
Suppose that the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. Then we have
Proof
Using \(R_{k} = \eta_{k}f_{l(k)} + (1 - \eta_{k})f_{k}\) and \(f_{k} \le f_{l(k)}\), we have
□
Lemma 3.4
Suppose that Assumption 2.1 holds. Then Step 4 and Step 5 of Algorithm 2.1 are well defined.
Proof
Let \(\overline{d}_{k}\) with \(\Vert \overline{d}_{k} \Vert = \frac{\beta_{1}^{p_{k}} \Vert g_{k} \Vert }{M_{1}}\) be a solution of subproblem (8) corresponding to \(p_{k} = p \).
First, we prove that \(\widehat{\rho}_{k} \ge\mu_{1}\) for sufficiently large p. Using Lemma 3.1, Lemma 3.2 and Taylor’s formula, we have
Therefore, \(\widehat{\rho}_{k} \ge\mu_{1}\) for sufficiently large p, which implies that Steps 4 and 5 of Algorithm 2.1 are well defined. □
Lemma 3.5
Suppose that Assumption 2.1 holds and the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. Then the sequence \(\{ f_{l ( k )} \}\) is nonincreasing and convergent.
Proof
The proof is analogous to the proofs of Lemma 2.1 and Corollary 2.1 in [3] and is omitted here. □
Lemma 3.6
Suppose that the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. Since \(\Vert d_{k} \Vert \le \Delta_{k}\), there exists a constant κ such that \(\Vert d_{k} \Vert \le\kappa \Vert g_{k} \Vert \).
Proof
From (7) and \(\Vert d_{k} \Vert \le\Delta_{k}\), we observe that
Thus, we can take \(\kappa= c_{k}\frac{ \Vert d_{k - 1} \Vert }{ \Vert y_{k - 1} \Vert }\). □
Lemma 3.7
Suppose that Assumption 2.1 holds and the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. For \(\widehat{\rho}_{k} < \mu_{1}\), the step size \(\alpha_{k}\) satisfies the following inequality:
Proof
Set \(\alpha= \frac{\alpha_{k}}{\rho} \), where \(\rho\in(0,1)\). According to Step 5 of Algorithm 2.1 and (5), it is easy to show that
Using the definition of \(R_{k}\) and Taylor expansion, we have
where \(\xi\in(x_{k},x_{k + 1})\). Thus, we get
On the other hand, from \(\Vert d_{k} \Vert \le\kappa \Vert g_{k} \Vert \) and (13), we can write
Hence, combining the above inequality with (20), we have
Thus, we can obtain (18). □
Lemma 3.8
Suppose that Assumption 2.1 holds and the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. Then we have
Proof
From Lemma 3.3, we know that Algorithm 2.1 generates an infinite sequence \(\{ x_{k} \}\) satisfying \(\widehat{\rho}_{k} \ge\mu_{1}\), so we obtain
Then,
Replacing k by \(l(k) - 1\), we can write
Combining Lemma 3.5 with the above inequality, we get
According to Assumption 2.1 and (12), we have
where \(\omega= \frac{\tau}{\kappa} \min \{ 1,\frac{1}{\kappa M_{1}} \}\). It follows from (25) that
The remainder of the proof is similar to that of a theorem in [1] and is omitted here. □
On the basis of the above lemmas and analysis, we can obtain the global convergence result of Algorithm 2.1 as follows:
Theorem 3.1
(Global convergence)
Suppose that Assumption 2.1 holds and the sequence \(\{ x_{k} \}\) is generated by Algorithm 2.1. Then we have
Proof
Let \(\overline{d}_{k}\) be the solution of subproblem (8) corresponding to \(p_{k} = p \). Algorithm 2.1 generates an infinite sequence \(\{ x_{k} \}\) satisfying \(\widehat{\rho}_{k} \ge\mu_{1}\).
According to Lemma 3.2, we have
The above inequality and Lemma 3.8 indicate that (27) holds. □
We now prove the superlinear convergence of Algorithm 2.1 under suitable conditions.
Theorem 3.2
(Superlinear convergence)
Suppose that Assumption 2.1 holds and the sequence \(\{ x_{k} \}\) generated by Algorithm 2.1 converges to \(x^{*}\). Moreover, assume that \(\nabla^{2}f(x^{*})\) is a positive definite matrix and \(\nabla^{2}f(x)\) is Lipschitz continuous in a neighborhood of \(x^{*}\). If \(\Vert d_{k} \Vert \le\Delta_{k}\), where \(d_{k} = - B_{k}^{-1}g_{k}\), and
then the sequence \(\{ x_{k} \}\) converges to \(x^{*}\) superlinearly, that is,
Proof
From (28) and \(\Vert d_{k} \Vert \le\Delta_{k}\), we obtain
Using Taylor expansion, there exists \(t_{k} \in(0,1)\) such that
Thus, we can obtain that
From (28) and the Lipschitz continuity of \(\nabla^{2}f(x)\) in a neighborhood of \(x^{*}\), we get
Note that by Theorem 3.1, it is implied that
and thus \(d_{k} \to0\). We can obtain
then,
Since \(\nabla^{2}f(x^{*})\) is a positive definite matrix, combining this with (33), there exist a constant \(\varsigma> 0\) and an index \(k_{0} \ge0\) such that
Thus, we obtain
Combining the above inequality with (31), we get \(\lim_{k \to\infty} \frac{ \Vert x_{k + 1} - x^{*} \Vert }{ \Vert x_{k} - x^{*} \Vert } = 0\). This completes the proof. □
Preliminary numerical experiments
In this section, we report numerical experiments with Algorithm 2.1. A set of unconstrained test problems is selected from [16]. The experiments were run in MATLAB 9.4 on an Intel(R) Core(TM) processor at 2.00 GHz with 6 GB RAM. The common parameters of the algorithms take exactly the same values: \(\mu_{1} = 0.25\), \(\mu_{2} = 0.75\), \(\beta_{1} = 0.25\), \(\beta_{2} = 1.5\), \(c_{0} = 1\), \(N = 5\). The matrix \(B_{k}\) is updated by (9). The stopping criteria are \(\Vert g_{k} \Vert \le10^{-6}\) or the number of iterations exceeding 5000. We denote the number of gradient evaluations by “\(n_{i}\)” and the number of function evaluations by “\(n_{f}\)”.
For convenience, we use the following notations to represent the algorithms:
SNTR:
Standard nonmonotone trust region method [17].
ATRG:
Nonmonotone Shi’s adaptive trust region method with \(q_{k} = - g_{k}\) [18].
ATRN:
Nonmonotone Shi’s adaptive trust region method with \(q_{k} = - B_{k}^{-1}g_{k}\) [18].
NLS:
The new nonmonotone adaptive trust region line search method (Algorithm 2.1).
For the standard nonmonotone trust region method, we update \(\Delta_{k}\) by the following formula
Table 1 compares NLS with the standard nonmonotone trust region method for different initial radii. It is known that the initial radius has a significant influence on the numerical results of standard trust region methods. Moreover, the total number of iterations and function evaluations of the new algorithm is generally smaller than that of the standard nonmonotone trust region method. NLS also outperforms ATRG and ATRN with respect to the total number of function evaluations and the total number of gradient evaluations. The performance profiles of Dolan and Moré [19] are used to compare the efficiency of the algorithms. Figures 1–2 give the performance profiles for the number of function evaluations and the number of gradient evaluations, respectively. As the figures show, the profile of Algorithm 2.1 rises faster than those of the other algorithms. Therefore, we can deduce that the new algorithm is more efficient and robust than the other considered trust region algorithms for unconstrained optimization.
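A Dolan–Moré performance profile, as used for Figs. 1–2, can be computed as follows; the cost matrix below is hypothetical:

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile.

    T[i, s] is the cost (e.g. n_f) of solver s on problem i (np.inf on failure).
    Returns, for each tau in taus, the fraction of problems each solver
    solves within a factor tau of the best solver on that problem.
    """
    ratios = T / T.min(axis=1, keepdims=True)   # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(T.shape[1])]
                     for tau in taus])

# hypothetical costs: 3 problems x 2 solvers; solver 1 fails on problem 3
T = np.array([[10.0, 20.0],
              [30.0, 15.0],
              [12.0, np.inf]])
prof = performance_profile(T, taus=[1.0, 2.0])
print(prof)   # one row per tau, one column per solver
```

The profile value at \(\tau = 1\) is the fraction of problems on which a solver is the best; the value as τ grows measures robustness, which is exactly how Figs. 1–2 are read.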
Conclusions
In this paper, a new nonmonotone adaptive trust region line search method is presented for unconstrained optimization problems. A new nonmonotone trust region ratio is introduced to enhance the effectiveness of the algorithm, and a new trust region radius is proposed, which relaxes the condition for accepting a trial step in trust region methods. Theorems 3.1 and 3.2 show that the proposed algorithm preserves global convergence and superlinear convergence, respectively. Numerical experiments on a set of unconstrained optimization test problems from [16] demonstrate the practical efficiency of the proposed algorithm.
References
 1.
Grippo L, Lampariello F, Lucidi S. A nonmonotone line search technique for Newton’s method. SIAM J Numer Anal. 1986;23:707–16.
 2.
Nocedal J, Yuan Y. Combining trust region and line search techniques. In: Yuan Y, editor. Advances in nonlinear programming. Dordrecht: Kluwer Academic; 1996. p. 153–75.
 3.
Ahookhosh M, Amini K, Peyghami M. A nonmonotone trust-region line search method for large-scale unconstrained optimization. Appl Math Model. 2012;36(1):478–87.
 4.
Sartenaer A. Automatic determination of an initial trust region in nonlinear programming. SIAM J Sci Comput. 1997;18(6):1788–803.
 5.
Zhang XS, Zhang JL, Liao LZ. An adaptive trust region method and its convergence. Sci China Ser A, Math. 2002;45(1):620–31.
 6.
Li D. A trust region method with automatic determination of the trust region radius. Chin J Eng Math. 2006;23(5):843–8.
 7.
Shi ZJ, Wang HQ. A new self-adaptive trust region method for unconstrained optimization. Technical report. College of Operations Research and Management, Qufu Normal University; 2004.
 8.
Shi ZJ, Guo JH. A new trust region method for unconstrained optimization. J Comput Appl Math. 2008;213(1):509–20.
 9.
Kimiaei M. A new class of nonmonotone adaptive trust-region methods for nonlinear equations with box constraints. Calcolo. 2017;3:769–812.
 10.
Amini K, Shiker MAK, Kimiaei M. A line search trust-region algorithm with nonmonotone adaptive radius for a system of nonlinear equations. 4OR. 2016;4(2):132–52.
 11.
Shanmin P, Lanping C. A new family of nonmonotone trust region algorithm. Math Pract Theory. 2011;2011(10):211–8.
 12.
Reza Peyghami M, Ataee Tarzanagh D. A relaxed nonmonotone adaptive trust region method for solving unconstrained optimization problems. Comput Optim Appl. 2015;61:321–41.
 13.
Zhou Q, Hang D. Nonmonotone adaptive trust region method with line search based on new diagonal updating. Appl Numer Math. 2015;91:75–88.
 14.
Wang XY, Ding XF, Qu Q. A new filter nonmonotone adaptive trust region method for unconstrained optimization. Symmetry. 2020;12(2):208.
 15.
Sang Z, Sun Q. A self-adaptive trust region method with line search based on a simple subproblem model. J Comput Appl Math. 2009;232(2):514–22.
 16.
Andrei N. An unconstrained optimization test functions collection. Adv Model Optim. 2008;10(1):147–61.
 17.
Gu NZ, Mo JT. Incorporating nonmonotone strategies into the trust region for unconstrained optimization. Comput Math Appl. 2008;55:2158–72.
 18.
Ahookhosh M, Amini K. A nonmonotone trust region method with adaptive radius for unconstrained optimization. Comput Math Appl. 2010;60:411–22.
 19.
Dolan ED, Moré JJ. Benchmarking optimization software with performance profiles. Math Program. 2002;91:201–13.
Author information
Contributions
The main idea of this paper was proposed by WXY and DXF. QQ prepared the manuscript initially and performed all the steps of the proofs in this research. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Xianfeng Ding.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, X., Ding, X. & Qu, Q. A new nonmonotone adaptive trust region line search method for unconstrained optimization. J Math Industry 10, 13 (2020). https://doi.org/10.1186/s13362-020-00080-6
Keywords
 Unconstrained optimization
 Trust region method
 Nonmonotone adaptive
 Convergence