Modeling videos and image sets as linear subspaces has achieved great success in various visual recognition tasks. However, subspaces constructed from visual data are typically embedded in a high-dimensional ambient space, which limits the applicability of existing techniques. This letter proposes a geometry-aware framework for constructing lower-dimensional subspaces with maximum discriminative power from high-dimensional subspaces in the supervised scenario. In particular, we use Riemannian geometry and optimization techniques on matrix manifolds to learn an orthogonal projection, and we show that the learning process can be formulated as an unconstrained optimization problem on a Grassmann manifold. Because of this natural geometry, any metric on the Grassmann manifold can, in theory, be used in our model. Experimental evaluations on several data sets show that our approach achieves significantly higher accuracy than other state-of-the-art algorithms.
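
To make the subspace-to-subspace mapping concrete, the following is a minimal NumPy sketch of how an orthogonal projection could carry a high-dimensional subspace to a lower-dimensional one. It is an illustration under stated assumptions, not the method of this letter: the projection `W` is drawn at random rather than learned, the projected basis is re-orthonormalized with a thin QR factorization, and the projection metric stands in for whichever Grassmann metric is chosen. All names and dimensions (`project_subspace`, `projection_distance`, `D`, `d`, `p`) are hypothetical.

```python
import numpy as np

def orthonormal_basis(A):
    """Return an orthonormal basis for the column span of A (thin QR)."""
    Q, _ = np.linalg.qr(A)
    return Q

def project_subspace(W, X):
    """Map the subspace spanned by X (D x p) into R^d using W (D x d).

    W.T @ X is generally not orthonormal, so it is re-orthonormalized
    (an assumption of this sketch) to obtain a valid Grassmann point.
    """
    return orthonormal_basis(W.T @ X)

def projection_distance(Y1, Y2):
    """Projection metric: ||Y1 Y1^T - Y2 Y2^T||_F / sqrt(2)."""
    return np.linalg.norm(Y1 @ Y1.T - Y2 @ Y2.T, "fro") / np.sqrt(2)

rng = np.random.default_rng(0)
D, d, p = 100, 10, 3  # ambient dim, reduced dim, subspace dim (illustrative)

# Random stand-in for the learned projection; its columns are orthonormal.
W = orthonormal_basis(rng.standard_normal((D, d)))

# Two high-dimensional subspaces, e.g. one per image set.
X1 = orthonormal_basis(rng.standard_normal((D, p)))
X2 = orthonormal_basis(rng.standard_normal((D, p)))

# Their lower-dimensional counterparts and the distance between them.
Y1 = project_subspace(W, X1)
Y2 = project_subspace(W, X2)
dist = projection_distance(Y1, Y2)
```

Because the distance depends only on the spans of `Y1` and `Y2`, the result is unchanged if either input basis is rotated within its subspace. In a supervised setting, `W` would be optimized so that such distances are small within a class and large between classes.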