The generalization properties of learning classifiers with a polynomial kernel function are examined. In kernel methods, input vectors are mapped into a high-dimensional feature space where the mapped vectors are linearly separated. It is well-known that a linear dichotomy has an average generalization error or a learning curve proportional to the dimension of the input space and inversely proportional to the number of given examples in the asymptotic limit. However, it does not hold in the case of kernel methods since the feature vectors lie on a submanifold in the feature space, called the input surface. In this letter, we discuss how the asymptotic average generalization error depends on the relationship between the input surface and the true separating hyperplane in the feature space where the essential dimension of the true separating polynomial, named the class, is important. We show its upper bounds in several cases and confirm these using computer simulations.

This content is only available as a PDF.