
An active-set algorithm for solving large-scale nonsmooth optimization models with box constraints

  • Yong Li,

    Roles Methodology, Writing – original draft

    Affiliation Department of Mathematics, Baise University, Baise, Guangxi, 533000, China

  • Gonglin Yuan,

    Roles Funding acquisition, Investigation, Methodology

    glyuan@gxu.edu.cn

    Affiliation Guangxi Colleges and Universities Key Laboratory of Mathematics and Its Applications, College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi 530004, China

  • Zhou Sheng

    Roles Methodology, Writing – original draft, Writing – review & editing

    Affiliation Guangxi Colleges and Universities Key Laboratory of Mathematics and Its Applications, College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi 530004, China

Abstract

It is well known that the active-set algorithm is very effective for smooth box-constrained optimization, and many achievements have been obtained in this field. We extend the active-set method to nonsmooth box-constrained optimization problems, using the Moreau-Yosida regularization technique to smooth the objective function. A limited memory BFGS (L-BFGS) update is introduced to reduce the computational workload. The presented algorithm has the following properties: (1) all iterates are feasible and the sequence of objective function values is decreasing; (2) rapid changes in the active set are allowed; (3) the subproblem is a lower-dimensional system of linear equations. Global convergence of the new method is established under suitable conditions, and numerical results show that the method is effective for large-scale nonsmooth problems (with up to 5,000 variables).

Introduction

Consider the box-constrained problem (1) min f(x) subject to x ∈ K, where f: ℜn → ℜ is a possibly nonsmooth convex function, K = {x ∣ l ≤ x ≤ u}, the vectors l and u are lower and upper bounds on the variables, and n is the number of variables. Similar problems are discussed by Fukushima [1, 2], in which equality constraints are considered and a penalty strategy is used. Problem (1) can be viewed as an extension of the linearly constrained convex nonsmooth problem considered in, e.g., [3, 4], from the linear to the possibly nonlinear case. In fact, many problems in finance, engineering, management, biology, and medicine can be converted to the optimization model (1) (see [5–9] for details).

Generally, nonsmooth problems are very difficult to solve even when they are unconstrained. Derivative-free methods, like Powell’s method [10] or genetic algorithms [11], may be unreliable and become inefficient as the dimension of the problem increases. The direct application of smooth gradient-based methods to nonsmooth problems may lead to a failure in the optimality conditions, in convergence, or in the gradient approximation [12]. Wolfe [13] and Lemaréchal [14] initiated a giant stride forward in nonsmooth optimization with the bundle concept. Kiwiel [15] proposed a bundle variant that is close to the bundle trust iteration method [16]. Further good results on the bundle technique can be found in [17–19], etc. At the moment, various versions of bundle methods are regarded as the most effective and reliable methods for nonsmooth optimization. However, bundle methods are efficient mainly for small- and medium-scale problems, because they need relatively large bundles to solve problems efficiently [17]. Therefore, special tools for solving large-scale nonsmooth optimization problems are needed. Haarala et al. (see [20, 21] etc.) introduced limited memory bundle methods for large-scale nonsmooth unconstrained and constrained minimization, which are a hybrid of the variable metric bundle methods and the limited memory variable metric methods, and obtained good results; their test problems have thousands of decision variables. More related literature can be found in [22–26]. Yuan et al. [27–31] carried out studies in which nonsmooth problems with up to 100,000 variables were solved in the unconstrained case [28].

The active-set method can be generalized easily when the objective function is nonsmooth. For example, Sreedharan [32] extends the method developed in [33] to solve nonsmooth problems with a special objective function and inequality constraint. Also, it is quite easy to generalize the ε-active set method to the nondifferentiable case (see, e.g., [34]). In this paper we use the active-set method to solve (1) when the objective function f is convex but not necessarily differentiable. Convexity, which is not essential for our study, is assumed only for simplicity. For the objective function, we first use the Moreau-Yosida regularization technique to make it smooth. Then the active-set limited memory BFGS (L-BFGS) technique is proposed to solve it. Global convergence is established under suitable conditions. The main features of the proposed method are as follows.

  1. The iterates are feasible; large changes are allowed in the active set; the subproblem has lower dimension; and the objective function sequence {fMY (xk, εk)} is decreasing.
  2. The L-BFGS method uses function and gradient values.
  3. Global convergence is established under suitable conditions.
  4. Numerical results show that the method is effective for large-scale problems (up to 5,000 variables).

The paper is organized as follows. In the next section, we briefly review some nonsmooth analysis, a BFGS method and the L-BFGS method for unconstrained optimization, and the motivation for using these techniques. In Section 3, we describe the active-set algorithm with L-BFGS update for (1). In Section 4, global convergence is established under suitable conditions. Numerical results are reported in Section 5, and conclusions are given in the last section.

Nonsmooth analysis and the L-BFGS update

This section states some results on nonsmooth analysis, a modified BFGS formula, and an L-BFGS formula for unconstrained optimization problems.

Some results of convex analysis and nonsmooth analysis

Let fMY: ℜn → ℜ be the so-called Moreau-Yosida regularization of f, defined by (2) fMY(x) = min{ f(z) + ‖z − x‖²/(2λ) : z ∈ ℜn }, where ‖⋅‖ denotes the Euclidean norm of vectors and λ is a positive parameter. Then it is not difficult to see that problem (1) is equivalent to the problem (3) min{ fMY(x) : x ∈ K }. The function fMY is a differentiable convex function and has a Lipschitz continuous gradient even when f is nondifferentiable. Moreover, under some reasonable conditions, the gradient ∇fMY(x) is semismooth (see [35, 36] etc.). By these properties, many algorithms have been given for solving (3) (see [37] etc.) when K = ℜn. Some features of fMY(x) can be found in [38–40] and the references therein. Set θ(z) = f(z) + ‖z − x‖²/(2λ) and denote p(x) = argmin θ(z). Since θ(z) is strongly convex, it is easy to deduce that p(x) is well defined and unique. Then fMY(x) in (2) can be rewritten as fMY(x) = f(p(x)) + ‖p(x) − x‖²/(2λ).
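
To make definition (2) concrete, here is a minimal Python sketch (not from the paper) that evaluates fMY and its gradient for the illustrative choice f(x) = ‖x‖1, for which the minimizer p(x) is the closed-form soft-thresholding operator; the function name and the ℓ1 example are assumptions made purely for illustration.

```python
import numpy as np

def moreau_yosida_l1(x, lam=1.0):
    """Moreau-Yosida regularization of f(x) = ||x||_1 (illustrative example only).

    p(x) = argmin_z ||z||_1 + ||z - x||^2 / (2*lam) is the soft-thresholding
    operator, so f_MY and its gradient g(x) = (x - p(x)) / lam are explicit.
    """
    p = np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)   # prox of lam * ||.||_1
    f_my = np.abs(p).sum() + np.dot(x - p, x - p) / (2.0 * lam)
    g = (x - p) / lam                                   # Lipschitz gradient, constant 1/lam
    return f_my, g

x = np.array([2.0, -0.3, 0.0, 1.5])
print(moreau_yosida_l1(x, lam=1.0))
```

In general p(x) has no closed form, which is exactly why the approximations p(x, ε) discussed below are needed.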

The generalized Jacobian of fMY (x) and the property of BD-regular can be found in [41, 42], respectively. Here some properties are listed without proof.

(i) The function fMY is finite-valued, convex, and everywhere differentiable. If g(x) = ∇fMY(x), then g: ℜn → ℜn is globally Lipschitz continuous, (4) ‖g(x) − g(y)‖ ≤ ‖x − y‖/λ for all x, y ∈ ℜn, where (5) g(x) = (x − p(x))/λ.

(ii) g being BD-regular at x means that all matrices V ∈ ∂Bg(x) are nonsingular. In this case there exist constants μ1 > 0, μ2 > 0 and a neighborhood Ω of x satisfying the corresponding bounds.

It is easy to see that the minimizer p(x) of θ(z) is often difficult or even impossible to compute exactly. Fortunately, for each x ∈ ℜn and any ε > 0, there exists a vector p(x, ε) ∈ ℜn satisfying (6) θ(p(x, ε)) ≤ θ(p(x)) + ε. Thus, we can use p(x, ε) to define approximations of fMY(x) and g(x) by (7) fMY(x, ε) = f(p(x, ε)) + ‖p(x, ε) − x‖²/(2λ) and (8) g(x, ε) = (x − p(x, ε))/λ, respectively. Some implementable algorithms for finding such a p(x, ε) for a nondifferentiable convex function are introduced in [43]. A remarkable feature of fMY(x, ε) and g(x, ε), given in [35], is that, by choosing the parameter ε small enough, we can compute approximations fMY(x, ε) and g(x, ε) that are arbitrarily close to fMY(x) and g(x), respectively.

Proposition 1. Suppose that fMY (x, ε) and g(x, ε) are defined by (7) and (8), respectively. Let p(x, ε) be a vector satisfying (6). Then (9) (10) and (11) hold.
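
When f has no closed-form proximal point, p(x, ε) must be computed by an inner solver. The sketch below is one simple possibility, assuming only a subgradient oracle for f: since θ(z) is strongly convex with modulus 1/λ, a subgradient method with diminishing steps converges, and an iteration budget plus a small-step test stands in (heuristically) for the gap condition (6). The function names and the stopping heuristic are assumptions, not the scheme of [43].

```python
import numpy as np

def approx_prox(subgrad_f, f, x, lam=1.0, eps=1e-6, max_iter=500):
    """Approximately minimize theta(z) = f(z) + ||z - x||^2 / (2*lam).

    theta is strongly convex (modulus 1/lam), so a subgradient method with
    diminishing steps converges; we stop on a small step or an iteration
    budget, which only heuristically stands in for the gap condition (6)."""
    z = x.copy()
    best_z, best_val = z.copy(), f(z)          # theta(x) = f(x)
    for k in range(1, max_iter + 1):
        d = subgrad_f(z) + (z - x) / lam       # a subgradient of theta at z
        step = 2.0 * lam / (k + 1)             # classical strongly convex step size
        z = z - step * d
        val = f(z) + np.dot(z - x, z - x) / (2.0 * lam)
        if val < best_val:
            best_z, best_val = z.copy(), val
        if np.linalg.norm(d) * step < eps:
            break
    return best_z, best_val                    # p(x, eps) and f_MY(x, eps) as in (7)
```

The returned pair then yields fMY(x, ε) as in (7), and g(x, ε) = (x − p(x, ε))/λ as in (8).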

A modified BFGS formula and the L-BFGS formula

The BFGS method is one of the most effective quasi-Newton methods for the unconstrained optimization problem (UNP) minx∈ℜn h(x), where h(x): ℜn → ℜ is continuously differentiable. The famous BFGS quasi-Newton formula is (12) Bk+1 = Bk − (Bk sk sk^T Bk)/(sk^T Bk sk) + (yk yk^T)/(yk^T sk), where sk = xk+1 − xk and yk = ∇h(xk+1) − ∇h(xk), and it is easy to see that the quasi-Newton equation (13) Bk+1 sk = yk holds. If Hk is the inverse of Bk, we get the inverse update formula of (12), (14) Hk+1 = (I − sk yk^T/(yk^T sk)) Hk (I − yk sk^T/(yk^T sk)) + sk sk^T/(yk^T sk), which is the dual form of the DFP update formula in the sense that Hk ↔ Bk, Hk+1 ↔ Bk+1, and sk ↔ yk. The L-BFGS method is an adaptation of the BFGS method to large-scale problems (see [44–46] for details). Instead of storing the matrices Hk, at every iteration xk the method stores a small number, say m, of correction pairs {si, yi}, i = k − 1, …, k − m. Let ρi = 1/(yi^T si) and Vi = I − ρi yi si^T. The L-BFGS update then has the recursive form (15), which can provide a fast rate of linear convergence and requires minimal storage. From the BFGS formula (14) and the L-BFGS update (15), it is not difficult to see that both of these formulas contain only gradient information about the objective function, while the available function values are neglected. Some modified quasi-Newton formulas using both gradient and function information have been presented (e.g. [47, 48]). Wei et al. [49] also gave a new quasi-Newton equation and the corresponding BFGS update formula (16). The quasi-Newton formula (16) contains both gradient and function information; moreover, the modified BFGS update formula possesses a higher-order approximation of ∇²h(x) than the standard BFGS update (see [47, 49] for details).
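
In practice the product of Hk in (15) with a vector is computed without ever forming Hk, via the standard two-loop recursion. The following minimal Python sketch (standard textbook material, not the authors' code) shows the idea, with a scaled identity γI playing the role of the basic matrix.

```python
import numpy as np

def lbfgs_two_loop(grad, s_list, y_list, gamma=1.0):
    """Compute H_k @ grad implicitly from the m stored pairs (s_i, y_i).

    s_list and y_list are ordered from oldest to newest; gamma*I plays the
    role of the basic matrix, and only O(m*n) storage is used."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):    # newest to oldest
        rho = 1.0 / np.dot(y, s)
        a = rho * np.dot(s, q)
        q -= a * y
        alphas.append(a)
    r = gamma * q                                           # H_0 = gamma * I
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):  # oldest to newest
        rho = 1.0 / np.dot(y, s)
        b = rho * np.dot(y, r)
        r += (a - b) * s
    return r                                                # approximates H_k @ grad
```

Only the m most recent pairs (si, yi) are kept, so the cost and storage per iteration are O(mn).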

Global convergence and superlinear convergence of the quasi-Newton method with (16) have been established for uniformly convex functions [47, 49], but they may fail for general convex functions. One of the main reasons is that the required condition may not hold for general convex functions. To overcome this weakness, Yuan and Wei [50] presented a modified choice of the difference vector. The idea of [50] is based on the following two cases:

  1. Case i: If Ak > 0, it follows that (17) holds.
  2. Case ii: On the other hand, if Ak < 0, it is easy to get (18), where the second inequality follows from the convexity of h(x); this means that the desired inequality holds. The modified BFGS formula built on this choice possesses global convergence and superlinear convergence for general convex functions. However, its applications in L-BFGS methods and nonsmooth optimization have not been widely studied.

This article attempts to do so. The following gives the modified L-BFGS formula for (3), with form (19), where δk = g(xk+1, εk+1) − g(xk, εk), sk = xk+1 − xk, and the remaining quantities in (19) are defined analogously to the modification above. It is clear that the modified L-BFGS formula (19) contains both function and gradient information at the current and previous steps whenever the corresponding condition holds. In the following, the matrix Hk is generated by (19). Storing Hk and updating it as a full matrix, and then reducing it to the free subspace, would be very costly for even moderately large nonsmooth problems with box constraints, especially while the set of active constraints is still changing during the first finitely many steps; this is why the limited memory update is used.
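
The precise definition of the modified difference vector in (19) is not reproduced in this text. In the modified BFGS literature the authors cite ([49, 50]), the standard device is to augment the gradient difference with a function-value term Ak = 2[h(xk) − h(xk+1)] + (∇h(xk+1) + ∇h(xk))^T sk, truncated at zero so that the update remains well behaved on general convex functions. Under that assumption (an assumption, not a quotation of (19)), a sketch for the regularized setting, with fMY(·, ε) and g(·, ε) from (7) and (8), might look as follows:

```python
import numpy as np

def modified_difference(s, g_new, g_old, f_new, f_old):
    """Assumed Yuan-Wei-type modification (cf. [49, 50]); a sketch, not formula (19).

    The usual difference delta = g_new - g_old is augmented by a function-value
    term A, truncated at zero, which keeps delta_star^T s >= delta^T s (and the
    latter is nonnegative for convex objectives).  The pair (s, delta_star) can
    then be fed to a limited memory update such as the two-loop recursion above."""
    delta = g_new - g_old
    A = 2.0 * (f_old - f_new) + np.dot(g_new + g_old, s)
    return delta + (max(A, 0.0) / np.dot(s, s)) * s
```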

Inspired by the Moreau-Yosida regularization and the modified method of [50], we combine them with the limited memory technique and use them to solve box-constrained optimization problems with a nonsmooth objective function. This paper can be regarded as an improvement of the method in [51], extended to nonsmooth objective functions. Compared with [51], at each step of our method a lower-dimensional system of nonlinear equations needs to be solved, and the objective function is nonsmooth. The method is also similar to the algorithm in [44], but at each iteration we use an active-set identification technique and solve nonsmooth optimization problems.

L-BFGS active-set algorithm

The following assumptions are needed to obtain convergence.

Assumption A The level set ϕ = {x ∈ ℜn ∣ fMY(x) ≤ fMY(x0)} ∩ K is compact.

Assumption B fMY is bounded from below and the sequence {εk} converges to zero.

We first solve (3) and then adapt its solution to problem (1). With the feasible region K = {x ∈ ℜn: li ≤ xi ≤ ui, i = 1, …, n}, a vector is said to be a stationary point for problem (3) if the relations (20) hold. Consider also the approximate relations (21), in which the tolerance scalar tends to zero. By the definitions of the quantities involved and by (10), it is easy to deduce that (20) holds whenever (21) holds. In the following, unless otherwise noted, we concentrate on relation (21) and regard it as the stationarity condition, and we always assume that the point xk is consistent with εk (and likewise for the corresponding limiting quantities).
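
The relations (20) are not reproduced above. For a box-constrained problem with a differentiable objective, stationarity has the standard componentwise form shown below (with g = ∇fMY), which is presumably what (20) expresses; (21) would then be its ε-approximate analogue with g(x, ε) in place of g and a tolerance that is driven to zero.

```latex
g_i(\bar{x}) \geq 0 \ \text{ if } \bar{x}_i = l_i, \qquad
g_i(\bar{x}) \leq 0 \ \text{ if } \bar{x}_i = u_i, \qquad
g_i(\bar{x}) = 0  \ \text{ if } l_i < \bar{x}_i < u_i, \qquad i = 1,\dots,n.
```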

Similar to standard numerical optimization methods, the iteration formula is (22) xk+1 = xk + αk dk, where {xk} ⊆ K = {x ∈ ℜn: li ≤ xi ≤ ui, i = 1, …, n}, xi is the ith element of x, dk is a descent direction of fMY at xk, and αk is a step length determined by the Armijo line search technique (23), in which αk = 2^{-i} with the smallest integer i ∈ {0, 1, 2, …} satisfying (23), and the sequence {εk} satisfies εk > εk+1 > 0. Before we give the definition of the direction, we introduce the procedure that estimates the active bounds. Suppose that a stationary point of problem (1) is given. Let the associated active constraint set be (24) and the set of free variables be the complementary index set. Then condition (21) can be stated componentwise in the form (25). Let ai and bi be nonnegative continuous functions, bounded from above on K, such that if xi = li or xi = ui then ai(x) > 0 or bi(x) > 0, respectively. Define the following approximations Υ(x), Γ(x), and Λ(x) to the lower-active, free, and upper-active index sets, respectively: (26)
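
The displayed definitions (26) are not reproduced in this text. The following Python sketch is a hedged reconstruction based on the inequalities used in the proof of Theorem 1 below; treat the set definitions and the function name as assumptions, not quotations.

```python
import numpy as np

def estimate_active_sets(x, g, l, u, a, b):
    """Hedged reconstruction of the active-set estimates (26):
      Upsilon(x) = {i : x_i <= l_i + a_i(x) * g_i(x, eps)}   (lower bound active)
      Lambda(x)  = {i : x_i >= u_i + b_i(x) * g_i(x, eps)}   (upper bound active)
      Gamma(x)   = remaining indices                          (free variables)
    a and b hold the nonnegative weights a_i(x) and b_i(x)."""
    lower = x <= l + a * g
    upper = x >= u + b * g
    free = ~(lower | upper)
    return np.where(lower)[0], np.where(free)[0], np.where(upper)[0]
```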

Theorem 1. For any feasible x, Υ(x) ∩ Λ(x) = ∅. Furthermore, if is a stationary point of problem (3) where strict complementarity holds, then there exists a neighborhood Ψ of such that for every feasible point x in this neighborhood we have (27)

Proof. For any feasible x, if k ∈ Υ(x), it is obvious that gk(x, ε) ≥ 0 holds. Suppose that in addition k ∈ Λ(x); then we have uk ≥ xk ≥ uk + bk(x)gk(x, ε) ≥ uk. This implies that lk = xk = uk and gk(x, ε) = 0, which is a contradiction. Thus Υ(x) ∩ Λ(x) = ∅.

Now we prove the second conclusion. If , then by the definition of , . Since ai is nonnegative, then . Since both ai and gi are continuous in , we deduce that i ∈ Υ(x). Thus we have .

Otherwise if i ∈ Υ(x), then by the definition of Υ(x), ai(x)gi(x, ε) ≥ xili ≥ 0. Since ai is nonnegative, gi(x, ε) ≥ 0. Since gi is continuous in , we deduce that . Thus we get .

Therefore, we obtain . Analogously, we can conclude that and . □

This theorem shows that Υ(x), Γ(x), and Λ(x) are “good” estimates of the corresponding index sets at a stationary point. The proof can also be found in [52].

The search direction is chosen as (28), where (29), (30), and (31) give its components; here Υk = Υ(xk), Γk = Γ(xk), Λk = Λ(xk), the reduced matrix appearing there is an approximation to the reduced inverse Hessian matrix, Hk is an approximation of the full-space inverse Hessian matrix, Z is the matrix whose columns are {ei ∣ i ∈ Γk}, and ei is the ith column of the identity matrix in ℜn×n. If the strict complementarity condition holds, is a strict interior point of , and is always positive (see [53] for details).
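
The displayed formulas (28)–(31) are not reproduced here, so the following Python sketch is an assumed reconstruction, consistent with the case analysis in the proof of Lemma 1 below: active components are driven to their estimated bounds, while free components take a reduced quasi-Newton step. The helper name and the argument Hhat_mul (which applies the reduced inverse Hessian approximation to a vector, e.g. via the two-loop recursion restricted to the free variables) are assumptions.

```python
import numpy as np

def search_direction(x, g, l, u, idx_lower, idx_free, idx_upper, Hhat_mul):
    """Hedged sketch of the search direction (28)-(31); not the paper's code."""
    d = np.zeros_like(x)
    d[idx_lower] = l[idx_lower] - x[idx_lower]     # move onto estimated lower bounds
    d[idx_upper] = u[idx_upper] - x[idx_upper]     # move onto estimated upper bounds
    d[idx_free] = -Hhat_mul(g[idx_free])           # reduced quasi-Newton step
    return d
```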

Based on the above discussions, we state our algorithm as follows.

Algorithm 1. (Act-L-BFGS-Alt-Non)

Step 0: Given x0 ∈ Ψ, ε0 ∈ (0, 1), and positive integer m, the “basic matrix” θI, set k = 0.

Step 1: Use (26) to determine Υk = Υ(xk), Λk = Λ(xk), and Γk = Γ(xk).

Step 2: Compute dk by (28).

Step 3: If dk = 0, stop.

Step 4: Choose 0 < εk+1 < εk and αk = 2^{-i}, where i is the smallest integer in {0, 1, 2, …} such that the line search rule (23) holds.

Step 5: Let xk+1 = xk + αkdk and update Hk by (19).

Step 6: Set kk + 1 and go to Step 1.
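
As a way of tying the steps together, here is a hedged Python sketch of the overall loop; it is not the authors' implementation. The right-hand side of (23) is not reproduced in this text, so the Armijo test below uses the generic form fMY(xk + αdk, εk+1) ≤ fMY(xk, εk) + σ α g(xk, εk)^T dk, the H-update of Step 5 via (19) is omitted, and the helper functions refer to the earlier sketches.

```python
import numpy as np

def act_lbfgs_sketch(fmy, x0, l, u, a, b, sigma=0.1, eps0=0.5, max_iter=5000):
    """Hedged sketch of Algorithm 1 (Act-L-BFGS-Alt-Non); assumptions, not the paper's code.

    fmy(x, eps) is assumed to return the pair (f_MY(x, eps), g(x, eps)) of (7)-(8);
    estimate_active_sets and search_direction are the sketches given earlier."""
    x, eps = x0.copy(), eps0
    Hhat_mul = lambda v: v                          # identity "basic matrix"; no (19) update here
    for _ in range(max_iter):
        f_old, g = fmy(x, eps)
        low, free, up = estimate_active_sets(x, g, l, u, a, b)        # Step 1
        d = search_direction(x, g, l, u, low, free, up, Hhat_mul)     # Step 2
        if np.linalg.norm(d) == 0.0:                                  # Step 3
            break
        eps_next = 0.5 * eps                        # any choice with 0 < eps_{k+1} < eps_k
        alpha, gtd, tries = 1.0, np.dot(g, d), 0
        while fmy(x + alpha * d, eps_next)[0] > f_old + sigma * alpha * gtd and tries < 30:
            alpha *= 0.5                            # alpha_k = 2^{-i}, Step 4
            tries += 1
        x = np.clip(x + alpha * d, l, u)            # Step 5; clip keeps the sketch feasible
        eps = eps_next                              # (the paper's own safeguard may differ)
    return x
```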

Global convergence

In order to prove global convergence of Algorithm 1, the following further assumption is needed.

Assumption C. There exist positive scalars ς1, ς2 such that any sequence of matrices satisfy

The following lemma shows that a direction dk ≠ 0 determined by (28) satisfies a sufficient descent property.

Lemma 1. Suppose that dk ≠ 0 is determined by (28) and xk ∈ Ψ. Then the inequality (32) holds with a constant ω > 0.

Proof. We prove this result by three cases.

Case i. i ∈ Υk. By xk ∈ Ψ and we get implying ai(xk) > 0 and where Ai is an upper bound on ai(x) in Ψ.

Case ii. i ∈ Λk. As in Case i, it is easy to get which means that bi(xk) > 0 and where Bi is an upper bound on bi(x) in Ψ.

Case iii. i ∈ Γk. By (29), (30), and with symmetric positive definite, we obtain Then By Assumption C, we have . So we have Letting completes the proof. □

The following lemma is similar to [52], so we state it without proof.

Lemma 2. If the conditions in Lemma 1 hold, then xk is a stationary point of (3) if and only if dk = 0. Moreover, is a stationary point of problem (3) when the subsequences and {dk}K → 0 as k → ∞.

Now we establish the global convergence theorem of Algorithm 1.

Theorem 2. Let the sequence {xk} be generated by Algorithm 1 under Assumptions A, B, and C. Then the sequence {xk} has at least one limit point, and every limit point is a stationary point for problem (3).

Proof. If dk = 0, then by Lemma 2 the theorem obviously holds. Suppose that dk ≠ 0. By Lemma 1, (23), and Assumption B, we obtain an inequality which shows that the sequence {fMY(xk, εk)} is decreasing. So {xk} has at least one limit point. Suppose that such a limit point is given; it is sufficient to prove that it is a stationary point for problem (3). By Lemma 2, we only need to show that the corresponding subsequence of {dk} tends to 0. Without loss of generality, we pass to a convergent subsequence, and by the properties of limits it is clear that the limit point is feasible.

By the feasibility of the limit point, Theorem 1, and the positivity of the functions ai(x) and bi(x) for any admissible choice, the relevant relations hold. It then follows from (27) and (30) that dk → 0 along the subsequence. By Lemma 2, we deduce that the limit point is a stationary point for problem (3). □

Remark. If the condition holds, by (11) it is not difficult to deduce that as . By the convexity of fMY (x), the point is the optimal solution.

Numerical results

In this section, we test the numerical behavior of Algorithm 1. All codes were written in MATLAB 7.6.0 and run on a PC with a Core 2 Duo E7500 CPU @ 2.93 GHz, 2 GB of memory, and the Windows XP operating system.

Initialization

Our experiments are performed on a set of nonlinear box-constrained nonsmooth problems from Karmitsa [54] with the given initial points. We choose σ = 0.1, ai(x) = bi(x) = 10^{-5} in (26), θ = 1 with the “basic matrix” taken as the identity matrix I in the limited memory BFGS method, and m = 5. We set εk = 1/(NF + 1)², where NF is the number of function evaluations. For subproblem (2), we use the PRP conjugate gradient algorithm, whose iteration and function-evaluation counts are added to those of the main program. Since the line search cannot always ensure the descent condition, an uphill search direction may occur in the numerical experiments, and the line search rule may then fail. To avoid this, the current stepsize αk is accepted whenever the number of line search trials exceeds six. The following Himmelblau stopping rule is used: if ∣fMY(xk, εk)∣ > 10^{-4}, let stop1 = ∣fMY(xk, εk) − fMY(xk+1, εk+1)∣/∣fMY(xk, εk)∣; otherwise, let stop1 = ∣fMY(xk, εk) − fMY(xk+1, εk+1)∣. If stop1 < 10^{-4}, the program stops. We also stop the program if the iteration number exceeds 5000, in which case the method is considered to have failed.
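
For concreteness, the stopping test just described can be written as the following small Python sketch; the relative branch reflects the usual Himmelblau rule and should be read as a reconstruction of the (partly elided) description above.

```python
def himmelblau_stop(f_k, f_k1, tol=1e-4):
    """Himmelblau-type stopping test: relative drop when |f| is not tiny,
    absolute drop otherwise; a sketch of the rule described in the text."""
    if abs(f_k) > tol:
        stop1 = abs(f_k - f_k1) / abs(f_k)
    else:
        stop1 = abs(f_k - f_k1)
    return stop1 < tol
```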

Results

In this section, the test results of our algorithm for some box-constrained nonsmooth problems are reported. The columns of Table 1 have the following meaning:

Dim: the dimension of the problem;
NI: the total number of iterations;
NF: the total number of function evaluations;
cpu: the CPU time in seconds.

The last column denotes the function value at the point at which the program stops.

The numerical results indicate that our algorithm is effective for these box-constrained nonsmooth problems. The iteration and function-evaluation counts do not change significantly as the dimension increases. Problems Chained CB3 I and Chained CB3 II, and Chained Crescent I and Chained Crescent II, have many similar properties and share the same optimal values. From Table 1, we see that the final function values are close to the optimal values; in particular, the final function values coincide for Chained CB3 I and Chained CB3 II and for Chained Crescent I and Chained Crescent II, which shows that the presented method is stable. The CPU time is acceptable, although the iteration number is large for some problems. In the experiments, we find that different stopping rules influence the iteration and function-evaluation counts, but not the final function values.

To show the behavior of the sequence of function values, we give line charts (Figs 1–4) for the problems Generalization of MAXQ (Fig 1), Chained LQ (Fig 2), Generalization of Brown function 2 (Fig 3), and Chained Crescent I (Fig 4) with 5,000 variables. We see that the function values are decreasing. The descent in the first two steps is very pronounced, and these two steps bring the function value close to the optimal value; the descent is much less noticeable in the remaining steps. In our opinion, the reason is that the stopping rules are not ideal. Overall, the numerical performance of the proposed algorithm is reasonable for these large-scale nonsmooth problems. We conclude that the method provides a valid approach for solving large-scale box-constrained nonsmooth problems.

Fig 3. Generalization of Brown function with 5,000 variables.

https://doi.org/10.1371/journal.pone.0189290.g003

Conclusion

In this paper, a modified L-BFGS method was presented for solving box constrained nonsmooth optimization problems. This method uses both gradient information and function values in the L-BFGS update formula. The proposed algorithm possesses global convergence.

(i) It is well known that nonsmooth problems are difficult to solve even in the unconstrained case, especially at large scale. To overcome this difficulty, the Moreau-Yosida regularization technique is employed to smooth the objective function. Moreover, the L-BFGS method is introduced to reduce the computational cost and to make the active-set algorithm suitable for solving large-scale nonsmooth problems.

(ii) The bundle method is one of the most effective methods for nonsmooth problems. However, it is efficient mainly for small- and medium-scale problems. In order to obtain more effective methods for large-scale nonsmooth problems, limited memory bundle (bundle L-BFGS) algorithms have been presented by many scholars, which can handle problems with about 1,000 variables. In this paper, the given algorithm successfully solves nonsmooth problems with bound constraints and 1,000 to 5,000 variables.

(iii) In the experiments, we find that different stopping rules influence the iteration and function-evaluation counts but not the final function values. Moreover, from Figs 1–4, we see that the first two iteration steps are the most effective, which shows that the proposed algorithm is effective for large-scale nonsmooth box-constrained problems. In our opinion, the reason lies in the stopping criteria, and better rules should be sought.

(iv) Considering the above discussion, we think there are at least four issues that could lead to improvements. The first is the choice of the parameters in the active-set identification technique, since the parameters used here are not the only possible choice. Another important point that should be further investigated is the adoption of the gradient projection technique. The third is the adjustment of the constant m in the L-BFGS update formula. The last, and judging from the numerical experiments the most important one, is whether there are other optimality and convergence conditions for nonsmooth problems. We will study these aspects in future work.

Although the proposed method does not achieve all of the improvement we had hoped for, we feel that its performance is noteworthy.

Acknowledgments

The authors are very grateful to the anonymous referees for their suggestions, which have helped improve the numerical results and the presentation of the paper. This work is supported by the National Natural Science Foundation of China (Grant Nos. 11661009 and 11261006), the Guangxi Science Fund for Distinguished Young Scholars (Grant No. 2015GXNSFGA139001), the Guangxi Fund of Young and Middle-aged Teachers for the Basic Ability Promotion Project (No. 2017KY0019), and the Guangxi Natural Science Key Fund (No. 2017GXNSFDA198046).

References

  1. Fukushima M. A successive quadratic programming method for a class of constrained nonsmooth optimization problems, Mathematical Programming, 49, 231–251 (1991)
  2. Yuan G., Wei Z., Zhang M. An active-set projected trust region algorithm for box constrained optimization problems, Journal of Systems Science and Complexity, 28, 1128–1147 (2015)
  3. Kiwiel K. C. An algorithm for linearly constrained convex nondifferentiable minimization problems, Journal of Mathematical Analysis and Applications, 105, 452–465 (1985)
  4. Panier E. R. An active set method for solving linearly constrained nonsmooth optimization problems, Mathematical Programming, 37, 269–292 (1987)
  5. Yuan G., Wei Z., Lu X. Global convergence of BFGS and PRP methods under a modified weak Wolfe-Powell line search, Applied Mathematical Modelling, 47, 811–825 (2017)
  6. Yuan G., Sheng Z., Wang B. et al. The global convergence of a modified BFGS method for nonconvex functions, Journal of Computational and Applied Mathematics, 327, 274–294 (2018)
  7. Xu C., Zhang J. A survey of quasi-Newton equations and quasi-Newton methods for optimization, Annals of Operations Research, 103(1–4), 213–234 (2001)
  8. Li G., Tang C., Wei Z. New conjugacy condition and related new conjugate gradient methods for unconstrained optimization, Journal of Computational and Applied Mathematics, 202(2), 523–539 (2007)
  9. Sheng Z., Yuan G., Cui Z. et al. An adaptive trust region algorithm for large residual nonsmooth least squares problems, Journal of Industrial and Management Optimization, (2017)
  10. Fletcher R. Practical Methods of Optimization, 2nd ed., John Wiley and Sons, Chichester, (1987)
  11. Goldberg D. E. Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, (1989)
  12. Lemaréchal C. Nondifferentiable optimization, in: Optimization, Nemhauser G. L., Rinnooy Kan A. H. G., and Todd M. J., Eds., Elsevier North-Holland, Inc., New York, 529–572, (1989)
  13. Wolfe P. A method of conjugate subgradients for minimizing nondifferentiable convex functions, Mathematical Programming Study, 3, 145–173 (1975)
  14. Lemaréchal C. Extensions diverses des méthodes de gradient et applications, Thèse d'Etat, Paris, (1980)
  15. Kiwiel K. C. Proximity control in bundle methods for convex nondifferentiable optimization, Mathematical Programming, 46, 105–122 (1990)
  16. Schramm H., Zowe J. A version of the bundle idea for minimizing a nonsmooth function: conceptual idea, convergence analysis, numerical results, SIAM Journal on Optimization, 2, 121–152 (1992)
  17. Kiwiel K. C. Methods of Descent for Nondifferentiable Optimization, Lecture Notes in Mathematics 1133, Springer-Verlag, Berlin, New York, (1985)
  18. Kiwiel K. C. Proximal level bundle methods for convex nondifferentiable optimization, saddle-point problems and variational inequalities, Mathematical Programming, 69, 89–109 (1995)
  19. Schramm H. Eine Kombination von Bundle- und Trust-Region-Verfahren zur Lösung nichtdifferenzierbarer Optimierungsprobleme, Bayreuther Mathematische Schriften, Heft 30, Bayreuth, Germany, (1989)
  20. Haarala M., Mäkelä M. M. Limited memory bundle algorithm for large bound constrained nonsmooth minimization problems, Reports of the Department of Mathematical Information Technology, Series B. Scientific Computing, No. B. 1/2006, University of Jyväskylä, Finland, (2006)
  21. Haarala M., Miettinen K., Mäkelä M. M. New limited memory bundle method for large-scale nonsmooth optimization, Optimization Methods and Software, 19, 673–692 (2004)
  22. Floudas C. A., Pardalos P. M. Encyclopedia of Optimization, Springer Science & Business Media, (2001)
  23. Karmitsa N. Limited memory bundle method for large bound constrained nonsmooth optimization, Proceedings of the International Conference on Engineering Optimization, Rio de Janeiro, (2008)
  24. Karmitsa N., Mäkelä M. M. Limited memory bundle method for large bound constrained nonsmooth optimization: convergence analysis, Optimization Methods & Software, 25(6), 895–916 (2010)
  25. Karmitsa N., Mäkelä M. M. Adaptive limited memory bundle method for bound constrained large-scale nonsmooth optimization, Optimization, 59(6), 945–962 (2010)
  26. Demyanov V. F. Constructive Nonsmooth Analysis and Related Topics, Springer, New York, (2014)
  27. Yuan G., Meng Z., Li Y. A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations, Journal of Optimization Theory and Applications, 168, 129–152 (2016)
  28. Yuan G., Sheng Z., Liu W. The modified HZ conjugate gradient algorithm for large-scale nonsmooth optimization, PLoS ONE, 11, 1–15 (2016)
  29. Yuan G., Wei Z. The Barzilai and Borwein gradient method with nonmonotone line search for nonsmooth convex optimization problems, Mathematical Modelling and Analysis, 17, 203–216 (2012)
  30. Yuan G., Wei Z., Li G. A modified Polak-Ribière-Polyak conjugate gradient algorithm with nonmonotone line search for nonsmooth convex minimization, Journal of Computational and Applied Mathematics, 255, 86–96 (2014)
  31. Yuan G., Wei Z., Wang Z. Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization, Computational Optimization and Applications, 54, 45–64 (2013)
  32. Sreedharan V. P. A subgradient projection algorithm, Journal of Approximation Theory, 35, 111–126 (1982)
  33. Schultz H. K. A Kuhn-Tucker algorithm, SIAM Journal on Control and Optimization, 11, 438–445 (1973)
  34. Nguyen V. H., Strodiot J. J. A linearly constrained algorithm not requiring derivative continuity, Engineering Structures, 6, 7–11 (1984)
  35. Fukushima M., Qi L. A globally and superlinearly convergent algorithm for nonsmooth convex minimization, SIAM Journal on Optimization, 6, 1106–1120 (1996)
  36. Qi L., Sun J. A nonsmooth version of Newton's method, Mathematical Programming, 58, 353–367 (1993)
  37. Birge J. R., Qi L., Wei Z. A general approach to convergence properties of some methods for nonsmooth convex optimization, Applied Mathematics & Optimization, 38, 141–158 (1998)
  38. Bonnans J. F., Gilbert J. C., Lemaréchal C., Sagastizábal C. A. A family of variable metric proximal methods, Mathematical Programming, 68, 15–47 (1995)
  39. Correa R., Lemaréchal C. Convergence of some algorithms for convex minimization, Mathematical Programming, 62, 261–273 (1993)
  40. Hiriart-Urruty J. B., Lemaréchal C. Convex Analysis and Minimization Algorithms II, Springer-Verlag, Berlin, Heidelberg, (1993)
  41. Calamai P., Moré J. J. Projected gradient methods for linearly constrained problems, Mathematical Programming, 39, 93–116 (1987)
  42. Qi L. Convergence analysis of some algorithms for solving nonsmooth equations, Mathematics of Operations Research, 18, 227–245 (1993)
  43. Fukushima M. A descent algorithm for nonsmooth convex optimization, Mathematical Programming, 30(2), 163–175 (1984)
  44. Byrd R. H., Lu P. H., Nocedal J., Zhu C. Y. A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, 16, 1190–1208 (1995)
  45. Byrd R. H., Nocedal J., Schnabel R. B. Representations of quasi-Newton matrices and their use in limited memory methods, Mathematical Programming, 63, 129–156 (1994)
  46. Powell M. J. D. A fast algorithm for nonlinearly constrained optimization calculations, in: Numerical Analysis, 155–157 (1978)
  47. Wei Z., Li G., Qi L. New quasi-Newton methods for unconstrained optimization problems, Applied Mathematics and Computation, 175, 1156–1188 (2006)
  48. Zhang J. Z., Deng N. Y., Chen L. H. New quasi-Newton equation and related methods for unconstrained optimization, Journal of Optimization Theory and Applications, 102, 147–167 (1999)
  49. Wei Z., Yu G., Yuan G., Lian Z. The superlinear convergence of a modified BFGS-type method for unconstrained optimization, Computational Optimization and Applications, 29, 315–332 (2004)
  50. Yuan G., Wei Z. Convergence analysis of a modified BFGS method on convex minimizations, Computational Optimization and Applications, 47, 237–255 (2010)
  51. Facchinei F., Júdice J. An active set Newton algorithm for large-scale nonlinear programs with box constraints, SIAM Journal on Optimization, 8, 158–186 (1998)
  52. Yuan G., Lu X. An active set limited memory BFGS algorithm for bound constrained optimization, Applied Mathematical Modelling, 35, 3561–3573 (2011)
  53. Xiao Y., Wei Z. A new subspace limited memory BFGS algorithm for large-scale bound constrained optimization, Applied Mathematics and Computation, 185, 350–359 (2007)
  54. Karmitsa N. Test problems for large-scale nonsmooth minimization, Reports of the Department of Mathematical Information Technology, Series B. Scientific Computing, No. B. 4/2007, University of Jyväskylä, Finland, (2007)