Abstract
Limited-memory BFGS (L-BFGS; Liu and Nocedal, 1989) is often considered to be the method of choice for continuous optimization when first- or second-order information is available. However, the use of L-BFGS can be complicated in a black-box scenario where gradient information is not available and must therefore be estimated numerically. The accuracy of this estimation, obtained by finite difference methods, is often problem-dependent and may lead to premature convergence of the algorithm.
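To illustrate why finite-difference accuracy depends on the setup, the following minimal sketch estimates a gradient by forward differences with three step sizes. The test function `f`, the evaluation point, and the helper names `fd_gradient` and `max_error` are assumptions chosen for illustration and do not come from the article.

```python
import math


def fd_gradient(f, x, h):
    """Forward-difference estimate of the gradient of f at x with step h."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        x_step = list(x)
        x_step[i] += h  # perturb one coordinate at a time
        grad.append((f(x_step) - fx) / h)
    return grad


def f(x):
    # Hypothetical smooth test function; its exact gradient is exp(x_i).
    return sum(math.exp(xi) for xi in x)


def max_error(h, x=(1.0, 1.0)):
    """Largest absolute error of the estimated gradient over all coordinates."""
    exact = [math.exp(xi) for xi in x]
    approx = fd_gradient(f, list(x), h)
    return max(abs(a - e) for a, e in zip(approx, exact))


for h in (1e-1, 1e-8, 1e-16):
    print(f"h={h:g}  max gradient error={max_error(h):.3e}")
```

A large step suffers from truncation error, while a step below machine precision leaves the perturbed point unchanged, so the estimated gradient collapses to zero. The best step in between depends on the scaling of the function, which is one way the estimate becomes problem-dependent.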
This article demonstrates an alternative to L-BFGS, the limited memory covariance matrix adaptation evolution strategy (LM-CMA) proposed by Loshchilov (2014). LM-CMA is a stochastic derivative-free algorithm for numerical optimization of nonlinear, nonconvex optimization problems. Inspired by L-BFGS, LM-CMA samples candidate solutions according to a covariance matrix reproduced from m direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors reduces the memory complexity to O(mn), where n is the number of decision variables. The time complexity of sampling one candidate solution is also O(mn) but scales as only about 25 scalar-vector multiplications in practice. The algorithm has the important property of invariance with respect to strictly increasing transformations of the objective function; such transformations do not compromise its ability to approach the optimum. LM-CMA outperforms the original CMA-ES and its large-scale versions on nonseparable ill-conditioned problems by a factor that increases with problem dimension. The invariance properties of the algorithm do not prevent it from demonstrating performance comparable to that of L-BFGS on nontrivial large-scale smooth and nonsmooth optimization problems.
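The invariance property can be illustrated with a small sketch. It assumes that candidates are compared only by the rank of their objective values, as is usual in evolution strategies; it is not an implementation of LM-CMA. The `sphere` objective, the particular transformation, and the helper `ranking` are hypothetical choices made for this example.

```python
import math
import random


def sphere(x):
    # Hypothetical objective used only for illustration.
    return sum(xi * xi for xi in x)


def transformed(x):
    # A strictly increasing transformation g(f(x)) with g(t) = exp(sqrt(t)), t >= 0.
    return math.exp(math.sqrt(sphere(x)))


def ranking(candidates, objective):
    """Indices of candidates ordered from best (lowest value) to worst."""
    return sorted(range(len(candidates)), key=lambda i: objective(candidates[i]))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
candidates = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(10)]

rank_f = ranking(candidates, sphere)
rank_g = ranking(candidates, transformed)
print(rank_f == rank_g)
```

Because the ordering of the candidates is the same under both objectives, any update that uses only this ordering proceeds identically on the original and the transformed problem. A gradient-based method, by contrast, sees different gradient magnitudes after such a transformation.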