Crammer and Singer's method is one of the most popular multiclass support vector machines (SVMs). It considers L1 loss (hinge loss) in a complicated optimization problem. In SVM, squared hinge loss (L2 loss) is a common alternative to L1 loss, but surprisingly we have not seen any paper studying the details of Crammer and Singer's method using L2 loss. In this letter, we conduct a thorough investigation. We show that the derivation is not trivial and has some subtle differences from the L1 case. The details provided in this work can serve as a useful reference for those who intend to use Crammer and Singer's method with L2 loss, sparing them the tedious process of deriving everything themselves. Furthermore, we present some new results on, and a discussion of, both L1- and L2-loss formulations.
