Abstract
In this paper, a new nonmonotone adaptive retrospective trust region line search method is presented for the unconstrained optimization problem; it takes advantage of the multidimensional filter technique to increase the acceptance probability of the trial step. A new nonmonotone trust region ratio is presented, based on a convex combination of the nonmonotone trust region ratio and the retrospective ratio. The global convergence and superlinear convergence of the algorithm are established under suitable conditions. Comparative numerical experiments show improved effectiveness and robustness.
Citation: Ding X, Qu Q, Wang X (2021) A modified filter nonmonotone adaptive retrospective trust region method. PLoS ONE 16(6): e0253016. https://doi.org/10.1371/journal.pone.0253016
Editor: Roberto Barrio, University of Zaragoza, SPAIN
Received: February 26, 2021; Accepted: May 27, 2021; Published: June 17, 2021
Copyright: © 2021 Ding et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data for this study are third-party (see reference [20]) and can be accessed here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.665.3152&rep=rep1&type=pdf.
Funding: International Cooperation Program of Chengdu City, Project number: 2020-GH02-00023-HZ.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Consider the following unconstrained optimization problem
min f(x), x∈Rn, (1)
where f:Rn→R is a twice continuously differentiable function.
Trust region methods and line search methods are two effective approaches for solving unconstrained optimization problems. At present, both numerical methods are widely used in many applications: engineering design, automation, transportation, economic analysis, pattern recognition, artificial intelligence, network design, and many other areas of modern high-tech research and development.
Compared with the line search method, the trust region method has a novel underlying idea, strong convergence properties, and stable numerical performance, see [1–4]. It can not only solve well-conditioned problems quickly, but also handle ill-conditioned optimization problems effectively. The basic idea of the trust region method is as follows: at the iteration point xk, the trial step dk is obtained by solving the subproblem
min mk(d) = fk + gkTd + (1/2)dTBkd, s.t. ‖d‖ ≤ Δk, (2)
where gk denotes ∇f(xk), Bk is a symmetric approximation of ∇2f(xk), Δk stands for the trust region radius, and ‖·‖ denotes a vector norm, usually the Euclidean norm.
To evaluate the consistency between the quadratic model and the objective function, the most classical ratio, denoted by rk, is defined as follows,
rk = (f(xk) − f(xk+dk)) / (mk(0) − mk(dk)). (3)
The trial step dk is accepted whenever rk is close to 1, that is to say xk+1 = xk+dk, and Δk is updated suitably. Otherwise, if rk is negative, or positive but not close to 1, Δk is decreased and the subproblem is resolved.
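As an illustration, the classical accept/shrink/expand logic driven by the ratio in (3) can be sketched in Python; the shrink and expansion factors 0.25 and 2 are conventional textbook choices, not parameters taken from this paper:

```python
import numpy as np

def tr_ratio(f, m, xk, dk):
    """Classical agreement ratio r_k of Eq. (3): actual over predicted reduction."""
    actual = f(xk) - f(xk + dk)
    predicted = m(np.zeros_like(dk)) - m(dk)
    return actual / predicted

def update_radius(r, delta, d_norm, mu1=0.25, mu2=0.75, delta_max=100.0):
    """Textbook radius update: shrink when the model fits poorly, expand when it
    fits well and the step reached the boundary, otherwise keep the radius."""
    if r < mu1:
        return 0.25 * delta                     # poor agreement: shrink and re-solve
    if r > mu2 and abs(d_norm - delta) < 1e-12:
        return min(2.0 * delta, delta_max)      # good agreement on the boundary: expand
    return delta
```

The thresholds μ1, μ2 play the same role as in Algorithm 2.1 below.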
It is well known that monotone techniques may not only slow down the rate of convergence, especially in the presence of a narrow curved valley, but also require the objective function to decrease at each iteration. Considering this fact, it is meaningful to study nonmonotone techniques for improving the algorithm. The first nonmonotone technique was the so-called watchdog technique proposed by Chamberlain et al. [5], which was designed to overcome the Maratos effect. Since then, nonmonotone techniques [6–8] have attracted extensive attention from scholars; for example, Deng et al. [9] proposed a nonmonotone trust region method by replacing f(xk) in (3) with fl(k), given by
fl(k) = max{f(xk−j) : 0 ≤ j ≤ m(k)}, (4)
where fi = f(xi), m(0) = 0, 0 ≤ m(k) ≤ min{N, m(k−1)+1}, and N ≥ 0 is an integer constant; the corresponding nonmonotone ratio is
r̂k = (fl(k) − f(xk+dk)) / (mk(0) − mk(dk)). (5)
Motivated by this, Ahookhoosh et al. [6] proposed a nonmonotone trust region method with the following ratio,
ρ̄k = (Rk − f(xk+dk)) / (mk(0) − mk(dk)), (6)
where
Rk = ηkfl(k) + (1−ηk)fk, (7)
in which ηk∈[ηmin,ηmax] with ηmin∈[0,1) and ηmax∈[ηmin,1].
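For illustration, the nonmonotone quantities fl(k) of (4) and Rk of (7) can be computed from a stored history of function values; this Python sketch assumes the history list holds f(x0), …, f(xk) in order:

```python
def f_l(f_hist, N):
    """f_{l(k)} of Eq. (4): max of the last m(k)+1 function values, m(k) <= min(N, k)."""
    mk = min(N, len(f_hist) - 1)
    return max(f_hist[-(mk + 1):])

def R_k(f_hist, eta_k, N=5):
    """Nonmonotone term R_k = eta_k * f_{l(k)} + (1 - eta_k) * f_k of Eq. (7)."""
    fk = f_hist[-1]
    return eta_k * f_l(f_hist, N) + (1.0 - eta_k) * fk
```

Since fl(k) ≥ fk, any ηk ∈ [0,1] gives Rk ≥ fk, the fact used later in Lemma 3.2.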
Earlier, Grippo et al. [10] proposed a nonmonotone line search technique for Newton’s method, that is,
f(xk + αkdk) ≤ max{f(xk−j) : 0 ≤ j ≤ m(k)} + σαkgkTdk.
However, Grippo’s nonmonotone technique has a drawback: its numerical performance is highly dependent on the choice of N. It makes more sense, for given σ∈(0,1), to choose the step size αk so that
f(xk + αkdk) ≤ Rk + σαkgkTdk. (8)
As is well known, an appropriate updating strategy for the trust region radius plays a valuable role, since it may prominently affect computational efficiency. Motivated by this, variants of the trust region radius have attracted considerable attention from many scholars [11–13]. To avoid the gradient or Hessian information not being precisely employed in the standard trust region scheme, Zhang et al. [14] proposed a new scheme using the following trust region radius,
Δk = c^p‖gk‖‖B̂k^(−1)‖, (9)
where c∈(0,1) is a constant, p is a nonnegative integer adjustment parameter, and B̂k = Bk + iI is positive definite, i∈N.
Despite the effectiveness of Zhang’s method, calculating an estimate of the inverse of the Hessian at each iteration incurs additional computational cost. Qu et al. [15] referred to another adaptive strategy for updating the trust region radius as follows,
(10)
The filter technique was introduced by Fletcher and Leyffer [16]; it avoids the difficulty of updating the penalty parameter in penalty functions. The filter is able to reject poor trial iterates and enforce global convergence from arbitrary starting points. It is worth mentioning that Gould et al. [17] proposed an algorithm using the filter technique for unconstrained optimization problems, whose main idea is to accept the new iteration point as often as possible.
The filter is primarily composed of the gradients of a series of iteration points; saying that a trial step is accepted by the filter means that the corresponding gradient is accepted by the filter. Set ∇f(xk) = g(xk) = gk = (gk,1,gk,2,…,gk,n), where gk,i (i = 1,2,…,n) is the i-th component of gk. We say that an iterate x1 dominates x2 whenever |gi(x1)| ≤ |gi(x2)| for all i = 1,2,…,n. The basic concept of the multidimensional filter is a list of n-tuples of the form (gk,1,gk,2,…,gk,n) such that no entry dominates another. In [17], a trial point xk is acceptable for the filter if, for every filter entry gl, there exists j∈{1,2,…,n} such that |gk,j| ≤ |gl,j| − γg‖gl‖, where γg is a small positive constant.
Subsequently, in order to maximize the possibility of acceptance of the trial point compared with [17], we introduce an improved multidimensional filter F, specified as follows: a new trial point xk is acceptable if there exists j∈{1,2,3,…,n} such that
(11)
When an iteration point xk is accepted, we add gk to the filter and meanwhile remove from the filter the points that are dominated by xk.
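A minimal sketch of a multidimensional filter in Python, using the standard acceptance margin of Gould et al. [17]; the paper's relaxed test (11) differs, and the margin constant gamma here is a placeholder value:

```python
import numpy as np

def acceptable(g_new, filter_set, gamma=1e-3):
    """Gould-style multidimensional filter test: g_new is acceptable if, against
    every filter entry, at least one gradient component improves by a margin."""
    for g_l in filter_set:
        if not any(abs(g_new[j]) <= abs(g_l[j]) - gamma * np.linalg.norm(g_l)
                   for j in range(len(g_new))):
            return False
    return True

def add_to_filter(g_new, filter_set):
    """Add g_new and discard entries it dominates (componentwise no smaller)."""
    kept = [g_l for g_l in filter_set
            if not all(abs(g_new[j]) <= abs(g_l[j]) for j in range(len(g_l)))]
    kept.append(g_new)
    return kept
```

Acceptance and dominance are separate tests: a point can be acceptable without dominating any existing entry.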
The rest of the paper is organized as follows. Section 2 is devoted to describing the new filter nonmonotone adaptive retrospective trust region method in detail. The global convergence and superlinear convergence are established in Section 3. Some preliminary numerical results are presented in Section 4. Finally, some conclusions are summarized in Section 5.
2. Materials and methods
In this section, we propose a new filter nonmonotone adaptive retrospective trust region algorithm for solving unconstrained optimization problems. In order to reduce the computational cost, at the iteration point xk, the new nonmonotone ratio and the retrospective ratio are introduced based on [13] as follows:
(12)
(13)
where εk∈[0,ηk]. As can be seen from the motivation of the nonmonotone term Rk, better convergence results can be obtained by freely selecting the parameters ηk and εk. Based on the above discussion, a new nonmonotone ratio is introduced to improve computational efficiency through the convex combination of the nonmonotone ratio ρk in (12) and the retrospective ratio ρ̃k in (13), i.e.,
ρ̂k = γρk + (1−γ)ρ̃k, (14)
where γ∈[γmin,γmax]⊂[0,1]. More exactly, ρ̂k is used to determine whether the trial step is acceptable, while ρ̃k is employed for updating the trust region radius.
In the criteria presented earlier, the Hessian matrix is time-consuming to compute, which makes the adaptive trust region method less efficient. In order to reduce the workload and computational time, simpler information from known iteration points can be used to reconstruct the update formula for the trust region radius.
In this way, the trust region radius adjustment formula takes into account the gradient information of the function and the solution of the trust region subproblem, which ensures that the calculation accuracy of the algorithm is not reduced. The improved adaptive trust region radius is as follows,
(15)
Note that τk is computed by the following formula,
(16)
More formally, the new Algorithm is described as follows.
Algorithm 2.1 (Nonmonotone Adaptive Filter Retrospective Trust Region Method)
Step 0. Give x0∈Rn, Δmax>0, B0∈Rn×n, ε>0, ϑ>0, 0<μ1<μ2<1, τ0>0, 0<β1<1<β2. Set F = ∅, k ≔ 0.
Step 1. If k = 0, set Δk = min{τk‖gk‖,Δmax}, then go to Step 2. If the previous trial step was rejected, set τk = β1τk−1 and Δk = min{τk‖gk‖,Δmax}, then go to Step 2. Otherwise, compute ρ̃k and ρ̂k by (13) and (14), respectively; update τk by (16) and set Δk = min{τk‖gk‖,Δmax}.
Step 2. Compute ‖gk‖. If ‖gk‖≤ε, then stop.
Step 3. Solve the subproblem (2) to find the trial step dk, and set x̄k = xk+dk.
Step 4. Compute Rk and ρ̂k, respectively.
Step 5. If ρ̂k ≥ μ1, set xk+1 = x̄k. Otherwise, compute g(x̄k); if x̄k is accepted by the filter F, then set xk+1 = x̄k and add g(x̄k) into F; otherwise, find the step size αk satisfying (8) and set xk+1 = xk+αkdk.
Step 6. Update the symmetric matrix Bk by (30). Set k = k+1, and go to Step 1.
In this way, it is not necessary to solve the subproblem precisely in Algorithm 2.1; it suffices that an approximate solution dk satisfies
mk(0) − mk(dk) ≥ ϑ‖gk‖min{Δk, ‖gk‖/‖Bk‖}, (17)
‖dk‖ ≤ Δk, (18)
where ϑ∈(0, 1).
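A sufficient-decrease condition of the form (17) is attained, for example, by the Cauchy point of the quadratic model, which achieves the well-known decrease m(0) − m(d) ≥ (1/2)‖g‖min{Δ, ‖g‖/‖B‖}. A Python sketch, assuming g ≠ 0 and B symmetric:

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Cauchy point of m(d) = g^T d + 0.5 d^T B d over ||d|| <= delta:
    the minimizer of the model along the steepest-descent direction -g."""
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    if gBg <= 0:
        tau = 1.0                               # model unbounded along -g: go to boundary
    else:
        tau = min(1.0, gnorm**3 / (delta * gBg))
    return -tau * (delta / gnorm) * g
```

Any dk whose model decrease is at least a fixed fraction of the Cauchy decrease then satisfies (17) for a suitable ϑ, and (18) holds by construction.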
Assumption 2.1.
- The level set L(x0) = {x∈Rn : f(x) ≤ f(x0)} is closed and bounded. f(x) is twice continuously differentiable on L(x0); moreover, in a neighborhood N of the level set, ∇f(x) is Lipschitz continuous, i.e., there exists a positive constant L such that ‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖ for all x, y∈N.
- The matrix Bk is uniformly bounded, i.e., there exists a constant M1>0 such that ‖Bk‖≤M1.
3. Convergence analysis
For ease of presentation, the following index sets are defined: A = {k | xk+1 is accepted by the filter} and S = {k | xk+1 = xk+dk}; then A⊆S. When k∉S, we have xk+1 = xk+αkdk.
Lemma 3.1. For all k, the following estimate holds:
(19)
Proof. Motivated by the Taylor’s expansion and Assumption 2.1, we have
This completes the proof of the Lemma 3.1.
Lemma 3.2. Suppose that Assumption 2.1 holds and the sequence {xk} is generated by Algorithm 2.1. Moreover, assume that there exists a constant 0<ε<1 such that ‖gk‖>ε for all k. Then, for any k, there exists a nonnegative integer p such that xk+p+1 is a successful iteration point.
Proof. By contradiction, assume that there is an iteration k at which xk+p+1 is unsuccessful for every nonnegative integer p; then we have
(20)
This clearly implies that
(21)
Thus, according to 0<β1<1 and Eq (21), we have
(22)
Now, according to Lemma 3.1, and Eq (17), we get
Following the definition of Rk and the fact that fl(k) ≥ fk, we get Rk = ηkfl(k) + (1−ηk)fk ≥ ηkfk + (1−ηk)fk = fk. Thus, for sufficiently large p, we have
(23)
which contradicts (20). This completes the proof of Lemma 3.2.
Lemma 3.3. Suppose that the infinite sequence {xk} is generated by Algorithm 2.1 and the number of successful iterations is infinite, that is, |S| = +∞. Then we have {xk}⊂L(x0); meanwhile, the sequence {fl(k)} is monotonically non-increasing and convergent.
Proof. The proof follows the lines of Lemma 3 and Lemma 4 in [15].
Lemma 3.4. Suppose that Assumption 2.1 holds, and there exists a constant ε such that ‖gk‖≥ε for all k. Therefore, there is a constant υ such that
Proof. The proof is similar to the proof of Theorem 6.4.3 in [18].
Theorem 3.1. Suppose that Assumption 2.1 holds. Then Algorithm 2.1 generates an infinite sequence {xk} which satisfies
(24)
Proof. The proof can be divided into the following two cases.
Case 1. The number of successful iterations is infinite, that is, |S| = +∞; meanwhile, the number of filter iterations is also infinite, i.e., |A| = +∞.
We prove this by contradiction. Assume that Eq (24) does not hold; then there exists a positive constant ε such that ‖gk‖>ε. On account of Assumption 2.1, {‖gk‖} is bounded. Let {ki} denote the indices in the set A. Then there exists a subsequence {kt}⊆{ki} which satisfies
(25)
This fact, along with the definition of kt, implies that gkt is accepted by the filter. Then there exists j∈{1,2,…,n} such that, for all t>1,
(26)
This yields a contradiction when t is sufficiently large.
Case 2. The number of successful iterations is infinite, that is, |S| = +∞; meanwhile, the number of filter iterations is finite, i.e., |A| < +∞.
Contrarily, suppose that the Eq (24) is not true. Then there exists a positive constant 0<ε<1 such that ‖gk‖>ε, for all k.
As a consequence of |A| < +∞, for all sufficiently large k∈S, we have ρ̂k ≥ μ1. Set
Following the definition of Rk, Lemma 3.4 and Eq (17), without loss of generality, we have
(27)
Obviously, when p is fixed and k→∞, we obtain ξk→∞. It follows that the left-hand side of Eq (27) has no lower bound. Now, we have
(28)
This indicates that the left-hand side of Eq (28) is bounded below, which contradicts Eq (27).
Theorem 3.2. Suppose that Assumption 2.1 holds and that the sequence {xk} generated by Algorithm 2.1 converges to x*. Moreover, suppose that dk is a solution of the subproblem (2), ∇2f(x*) is a positive definite matrix, and Bk satisfies the following condition
(29)
Then the sequence {xk} converges to x* superlinearly.
Proof. The proof follows the same path as given in the proof of Theorem 4.1 in [19].
4. Preliminary numerical experiments
In this section, our purpose is to investigate the computational performance of Algorithm 2.1 on some medium-to-large test problems from Andrei [20]. All algorithms are implemented in MATLAB (R2018a) on a PC with an Intel(R) Core(TM) i7-4558U CPU @ 2.80 GHz and 4.00 GB RAM, in double precision. The following notations represent the different algorithms.
ANTRL: the ANTRL method proposed by Ahookhoosh et al. in [6];
NFTR: the NFTR method from [17];
WAFTR: the WAFTR method proposed by Qu et al. in [15];
NAFRTR-1: Algorithm 2.1 with Δ0 = 1;
NAFRTR-2: Algorithm 2.1 with Δ0 = 10;
NAFRTR-3: Algorithm 2.1 with Δ0 = 100;
As is well known, the BFGS correction is one of the most important quasi-Newton methods. Several improved BFGS methods are given in [21, 22], and their convergence theory has been well established. More specifically, Bk+1 is revised by [23]
(30)
where
, and ρk = 2(f(xk)−f(xk+1))+(g(xk+1)+g(xk))Tdk. It is easy to see that this formula uses not only gradient information but also function value information.
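As a hedged illustration, a BFGS-type update that folds the function-value term ρk into the difference vector can be sketched as follows; the exact form of (30) in [23] may differ in detail, so this is only a sketch of the general flavor, with the usual curvature safeguard added:

```python
import numpy as np

def modified_bfgs(B, s, y, rho):
    """BFGS-shaped update with y replaced by y* = y + (max(rho,0)/||s||^2) s,
    where rho carries function-value information (an assumption modeled on [23]).
    The update is skipped when the curvature y*^T s is not safely positive."""
    y_star = y + (max(rho, 0.0) / (s @ s)) * s
    if y_star @ s <= 1e-12:
        return B                                 # safeguard: keep B symmetric positive definite
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y_star, y_star) / (y_star @ s)
```

Here s = xk+1 − xk and y = gk+1 − gk, as in the standard BFGS notation.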
The parameters of these algorithms are chosen identically as follows: μ1 = 0.25, μ2 = 0.75, N = 5, β1 = 0.25, β2 = 1.5, Δmax = 100. The stopping criterion is either ‖gk‖≤10−6 or the number of iterations exceeding 10,000. As mentioned in [6], we set η0 = 0.25, and ηk is updated by the following formula,
(31)
For the sake of simplicity, we compare efficiency in terms of the number of function evaluations (nf), the number of gradient evaluations (ni), and the running time (CPU) using the Dolan-Moré performance profile; the details are given in [24]. For every τ≥1, the performance profile gives the proportion ρ(τ) of test problems on which an algorithm is within a factor τ of the best result.
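The profile ρ(τ) itself is straightforward to compute; a Python sketch, where rows of the cost matrix are problems and columns are solvers (failures encoded as np.inf):

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile: T[p, s] is the cost (e.g. nf, ni, CPU)
    of solver s on problem p. Returns rho(tau) for each solver: the fraction
    of problems solved within tau times the best cost for that problem."""
    best = T.min(axis=1, keepdims=True)          # best cost per problem
    ratios = T / best                            # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(T.shape[1])]
                     for tau in taus])           # shape (len(taus), n_solvers)
```

A solver whose curve reaches 1 at small τ is both efficient (left end) and robust (right end).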
It has to be noted that the selection of the initial trust region radius has a great influence on the efficiency of the algorithm. Tables 1 and 2 and Figs 1–3 imply that NAFRTR-2 solves more than 95% of the problems with the minimum number of failures compared with the other two initial radii. Hence, we choose the initial trust region radius Δ0 = 10 as the given parameter for the following comparisons.
Moreover, as shown in Figs 4–6 and Tables 2 and 3, NAFRTR-2 is the best solver with respect to CPU, nf, and ni, on about 98% of the problems. It is clear that NAFRTR-2 is effective and obtains better performance profiles compared with ANTRL, NFTR, and WAFTR. Based on these observations, the modified trust region method turns out to be fairly effective for unconstrained optimization.
5. Conclusions
In this paper, making use of the filter technique, a new trust region method has been proposed in which the trust region radius takes into account the gradient information of the function and the solution of the trust region subproblem. To some extent, it is more reasonable to adopt the convex combination of the nonmonotone trust region ratio and the retrospective ratio. In addition, the approximation of the Hessian matrix is updated by an improved quasi-Newton formula. From the theoretical viewpoint, the new algorithm retains global convergence and superlinear convergence. Numerical experiments based on the Dolan-Moré performance profile show the effectiveness of the proposed algorithm.
Acknowledgments
We would like to express our sincere thanks to all those who lent a hand in the course of writing this paper. Xinyi Wang proposed the new trust region algorithm for unconstrained optimization. Quan Qu proved the global convergence and superlinear convergence of the algorithm. Xianfeng Ding conceived the study and participated in its design and coordination. All authors read and approved the final manuscript.
References
- 1. Nocedal J, Yuan YX. Combining trust region and line search techniques. In: Yuan YX (ed.) Advances in Nonlinear Programming. 1998;153–175.
- 2. Cui Z, Wu B. A new modified nonmonotone adaptive trust region method for unconstrained optimization. Comput. Optim. Appl.2012;53(3);795–806.
- 3. Zhang, Wang Y. A new trust region method for nonlinear equations. Math. Meth. Oper. Res.2003; 58;283–298.
- 4. Wachter A, Biegler LT. Line search filter methods for nonlinear programming and global convergence. SIAM J. Optim.2005;16;1–31.
- 5. Chamberlain RM, Powell MJD, Lemarechal C, Pedersen HC. The watchdog technique for forcing convergence in algorithm for constrained optimization. Math. Program.Stud.1982;16;1–17.
- 6. Ahookhoosh M, Amini K, Peyghami M. A nonmonotone trust region line search method for large scale unconstrained optimization. Appl. Math. Model.2012;36(1);478–487.
- 7. Zhang HC, Hager WW. A nonmonotone line search technique and its application to unconstrained optimization. SIAM Journal on Optimization.2004;14(4);1043–1056.
- 8. Gu NZ, Mo JT. Incorporating nonmonotone strategies into the trust region for unconstrained optimization. Computers and Mathematics with Applications. Computers and Mathematics with Applications.2008;55(9);2158–2172.
- 9. Deng NY, Xiao Y, Zhou FJ. Nonmonotonic trust region algorithm. J. Optim. Theory Appl.1993;76;259–285.
- 10. Grippo L, Lamparillo F, Lucidi S. A nonmonotone line search technique for Newton’s method. Siam J. Numer. Anal.1986;23;707–716.
- 11. Zhou S, Yuan GL, Cui ZR. A new adaptive trust region algorithm for optimization problems. Acta Mathematica Scientia.2018;38B(2);479–496.
- 12. Kimiaei M. A new class of nonmonotone adaptive trust-region methods for nonlinear equations with box constraints. Calcolo.2017;54;769–812.
- 13. Amini K, Shiker Mushtak AK, Kimiaei M. A line search trust-region algorithm with nonmonotone adaptive radius for a system of nonlinear equations. Q. J. Oper. Res.2016;4;132–152.
- 14. Zhang XS, Zhang JL, Liao LZ. An adaptive trust region method and its Convergence. Sci. China.2002;45;620–631.
- 15. Qu Q, Ding XF, Wang XY. A Filter and Nonmonotone Adaptive Trust Region Line Search Method for Unconstrained Optimization. Symmetry-Basel.2020;12;656.
- 16. Fletcher R, Leyffer S. Nonlinear programming without a penalty function. Math. Program.2002;91;239–269.
- 17. Gould NI, Sainvitu C, Toint PL. A filter-trust-region method for unconstrained optimization. Siam J. Optim.2005;16;341–357.
- 18. Conn AR, Gould NIM, Toint PL. Trust-Region Methods. MPS/SIAM Series on Optimization, SIAM, Philadelphia. 2000.
- 19. Sheng Z, Yuan G. A new adaptive trust region algorithm for optimization problems. Acta Mathematica Scientia.2018;38B(2);479–496.
- 20. Andrei N. An unconstrained optimization test functions collection. Adv. Model. Optim.2008;10(1);147–161.
- 21. Li D, Fukushima M. A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math.2001;129;15–35.
- 22. Wei Z, Li G, Qi L. New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput.2006;175(2);1156–1188.
- 23. Yuan G, Wei Z. Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl.2010;47(2);237–255.
- 24. Dolan ED, Moré JJ. Benchmarking optimization software with performance profiles. Math. Program.2002;91;201–213.