Data clustering is a complex optimization problem with applications ranging from vision and speech processing to data transmission and data storage in technical as well as biological systems. We discuss a clustering strategy that explicitly reflects the tradeoff between the simplicity and the precision of a data representation. The resulting clustering algorithm jointly optimizes distortion errors and complexity costs. Maximum entropy estimation of the clustering cost function yields the optimal number of clusters, their positions, and their cluster probabilities. Our approach establishes a unifying framework for different clustering methods such as K-means clustering, fuzzy clustering, entropy-constrained vector quantization, topological feature maps, and competitive neural networks.
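
To make the joint optimization of distortion errors and complexity costs concrete, the following is a minimal sketch and not the algorithm as specified in the paper. It assumes a squared Euclidean distortion, a complexity cost of -log p for a cluster with probability p (the choice usually associated with entropy-constrained vector quantization), a fixed inverse temperature `beta` with no annealing schedule, and a pruning threshold `p_min` for clusters whose probability collapses. The function name `complexity_clustering` and all parameter names are hypothetical.

```python
import numpy as np

def complexity_clustering(x, k_max=8, lam=1.0, beta=5.0, n_iter=100,
                          p_min=1e-4, seed=0):
    """Soft clustering that trades distortion against complexity (sketch).

    Assumed cost per assignment: ||x_i - y_a||^2 - lam * log(p_a).
    Assignments follow the maximum entropy (Gibbs) form at fixed beta.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Start from k_max distinct data points as cluster centers.
    centers = x[rng.choice(n, size=k_max, replace=False)].copy()
    probs = np.full(k_max, 1.0 / k_max)

    for _ in range(n_iter):
        # Squared Euclidean distortion between every point and every center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Total cost = distortion + lam * complexity (assumed -log p).
        cost = dist - lam * np.log(probs)[None, :]
        # Gibbs assignment probabilities, stabilized against overflow.
        logits = -beta * cost
        logits -= logits.max(axis=1, keepdims=True)
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)

        # Re-estimate cluster probabilities and drop negligible clusters,
        # so the number of clusters is an outcome rather than an input.
        probs = resp.mean(axis=0)
        keep = probs > p_min
        resp = resp[:, keep]
        probs = probs[keep] / probs[keep].sum()
        # Centers are responsibility-weighted means of the data.
        centers = (resp.T @ x) / resp.sum(axis=0)[:, None]

    return centers, probs
```

In this sketch, setting `lam=0` removes the complexity term and leaves a soft (fuzzy) K-means update, and letting `beta` grow large drives the assignments toward hard K-means, which is one way to read the unifying claim above. Larger values of `lam` penalize low-probability clusters more strongly and so tend to leave fewer clusters after pruning.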