Abstract

Support vector machine (SVM) soft margin classifiers are important learning algorithms for classification problems. They can be stated as convex optimization problems and are suitable for a large data setting. Linear programming SVM classifiers are especially efficient for very large size samples. But little is known about their convergence, compared with the well-understood quadratic programming SVM classifier. In this article, we point out the difficulty and provide an error analysis. Our analysis shows that the convergence behavior of the linear programming SVM is almost the same as that of the quadratic programming SVM. This is implemented by setting a stepping-stone between the linear programming SVM and the classical 1-norm soft margin classifier. An upper bound for the misclassification error is presented for general probability distributions. Explicit learning rates are derived for deterministic and weakly separable distributions, and for distributions satisfying some Tsybakov noise condition.

This content is only available as a PDF.