Given a constraint set with k constraints in the framework of Optimality Theory (OT), what is its capacity as a classification scheme for linguistic data? One useful measure of this capacity is the size of the largest data set of which each subset is consistent with a different grammar hypothesis. This measure is known as the Vapnik-Chervonenkis dimension (VCD) and is a standard complexity measure for concept classes in computational learnability theory. In this work, I use the three-valued logic of Elementary Ranking Conditions to show that the VCD of Optimality Theory with k constraints is k-1. Analysis of OT in terms of the VCD establishes that the complexity of OT is a well-behaved function of k and that the ‘hardness’ of learning in OT is linear in k for a variety of frameworks that employ probabilistic definitions of learnability.