We propose a general information-theoretic approach to semi-supervised metric learning called SERAPH (SEmi-supervised metRic leArning Paradigm with Hypersparsity) that does not rely on the manifold assumption. Given the probability parameterized by a Mahalanobis distance, we maximize its entropy on labeled data and minimize its entropy on unlabeled data following entropy regularization. For metric learning, entropy regularization improves manifold regularization by considering the dissimilarity information of unlabeled data in the unsupervised part, and hence it allows the supervised and unsupervised parts to be integrated in a natural and meaningful way. Moreover, we regularize SERAPH by trace-norm regularization to encourage low-dimensional projections associated with the distance metric. The nonconvex optimization problem of SERAPH could be solved efficiently and stably by either a gradient projection algorithm or an EM-like iterative algorithm whose M-step is convex. Experiments demonstrate that SERAPH compares favorably with many well-known metric learning methods, and the learned Mahalanobis distance possesses high discriminability even under noisy environments.