Abstract

This letter examines whether risk functionals derived from information-theoretic principles, such as Shannon's or Rényi's entropies, can cope with the data classification problem in two senses: attaining the minimum of the risk functional, and implying the minimum probability of error allowed by the family of functions implemented by the classifier, denoted here by min Pe. The analysis of this so-called minimization of error entropy (MEE) principle is carried out for a single perceptron with continuous activation functions, which yield continuous error distributions. Although the analysis is restricted to single perceptrons, it reveals a large spectrum of behaviors that MEE can be expected to exhibit in both theory and practice. Regarding theoretical MEE, our study clarifies the role of the parameters controlling the perceptron activation function (of the squashing type) in often reaching the minimum probability of error. Regarding practical MEE, it clarifies the role of the kernel density estimator of the error density in achieving the minimum probability of error.
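As an illustration of what a practical MEE risk can look like, the sketch below estimates Rényi's quadratic entropy of the perceptron errors with a Gaussian kernel density estimate. It is a minimal sketch under stated assumptions, not the letter's own procedure: the choice of Rényi's quadratic entropy, the Gaussian kernel, the tanh activation, and the names `perceptron_errors`, `renyi_quadratic_error_entropy`, and the bandwidth `h` are all assumptions made here for illustration.

```python
import numpy as np


def perceptron_errors(X, t, w, b):
    """Errors e = t - y of a single perceptron with a tanh (squashing) activation.

    Assumption: targets t are coded in {-1, +1}, so errors lie in (-2, 2).
    """
    y = np.tanh(X @ w + b)
    return t - y


def renyi_quadratic_error_entropy(errors, h):
    """Empirical Renyi quadratic entropy of the errors.

    Assumption: the error density is estimated with a Gaussian kernel of
    bandwidth h. The integral of the squared estimate then reduces to the
    mean of a Gaussian kernel of bandwidth h * sqrt(2) over all error pairs.
    """
    e = np.asarray(errors, dtype=float)
    diffs = e[:, None] - e[None, :]          # all pairwise differences e_i - e_j
    s = h * np.sqrt(2.0)                     # bandwidth of the convolved kernel
    kernel = np.exp(-diffs**2 / (2.0 * s**2)) / (np.sqrt(2.0 * np.pi) * s)
    return -np.log(kernel.mean())            # H_R2 = -log(mean kernel value)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical two-class data: one Gaussian cloud per class in 2-D.
    X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(1.0, 1.0, (50, 2))])
    t = np.concatenate([-np.ones(50), np.ones(50)])
    e = perceptron_errors(X, t, w=np.array([1.0, 1.0]), b=0.0)
    print(renyi_quadratic_error_entropy(e, h=0.5))
```

Minimizing this quantity over `w` and `b` concentrates the estimated error density. Whether the minimizer also attains min Pe is the question the letter studies, and the bandwidth `h` is the kernel density estimator parameter whose role the abstract refers to.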
