Nonnegative matrix factorization (NMF) is well known to be an effective tool for dimensionality reduction in problems involving big data. For this reason, it frequently appears in many areas of scientific and engineering literature. This letter proposes a novel semisupervised NMF algorithm for overcoming a variety of problems associated with NMF algorithms, including poor use of prior information, negative impact on manifold structure of the sparse constraint, and inaccurate graph construction. Our proposed algorithm, nonnegative matrix factorization with rank regularization and hard constraint (NMFRC), incorporates label information into data representation as a hard constraint, which makes full use of prior information. NMFRC also measures pairwise similarity according to geodesic distance rather than Euclidean distance. This results in more accurate measurement of pairwise relationships, resulting in more effective manifold information. Furthermore, NMFRC adopts rank constraint instead of norm constraints for regularization to balance the sparseness and smoothness of data. In this way, the new data representation is more representative and has better interpretability. Experiments on real data sets suggest that NMFRC outperforms four other state-of-the-art algorithms in terms of clustering accuracy.

You do not currently have access to this content.