We propose a theory of the early processing in the mammalian visual pathway. The theory is formulated in the language of information theory and hypothesizes that the goal of this processing is to recode in order to reduce a “generalized redundancy” subject to a constraint that specifies the amount of average information preserved. In the limit of no noise, this theory becomes equivalent to Barlow's redundancy reduction hypothesis, but it leads to very different computational strategies when noise is present. A tractable approach for finding the optimal encoding is to solve the problem in successive stages where at each stage the optimization is performed within a restricted class of transfer functions. We explicitly find the solution for the class of encodings to which the parvocellular retinal processing belongs, namely linear and nondivergent transformations. The solution shows agreement with the experimentally observed transfer functions at all levels of signal to noise.