In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete codes have also been proposed as a model of some of the response properties of neurons in primary visual cortex. Previous work has focused on finding the best representation of a signal using a fixed overcomplete basis (or dictionary). We present an algorithm for learning an overcomplete basis by viewing it as probabilistic model of the observed data. We show that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency. This can be viewed as a generalization of the technique of independent component analysis and provides a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.