Abstract
Numerous studies suggest that the visual system uses both phase- and position-shift receptive field (RF) mechanisms for the processing of binocular disparity. Although the difference between these two mechanisms has been analyzed before, previous work mainly focused on disparity tuning curves instead of population responses. However, tuning curves and population responses can exhibit different characteristics, and it is the latter that determine disparity estimation. Here we demonstrate, in the framework of the disparity energy model, that for relatively small disparities, the population response generated by the phase-shift mechanism is more reliable than that generated by the position-shift mechanism. This is true over a wide range of parameters, including the RF orientation. Since the phase model has its own drawbacks of underestimating large stimulus disparities and covering only a restricted range of disparity at a given scale, we propose a coarse-to-fine algorithm for disparity computation with a hybrid of phase-shift and position-shift components. In this algorithm, disparity at each scale is always estimated by the phase-shift mechanism to take advantage of its higher reliability. Since the phase-based estimation is most accurate at the smallest scale, when the disparity is correspondingly small, the algorithm iteratively reduces the input disparity from coarse to fine scales by introducing a constant position-shift component to all cells for a given location in order to offset the stimulus disparity at that location. The model also incorporates orientation pooling and spatial pooling to further enhance reliability. We have tested the algorithm on both synthetic and natural stereo images and found that it often performs better than a simple scale-averaging procedure.
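To make the coarse-to-fine idea concrete, the following is a minimal one-dimensional sketch and not the model's actual implementation. It assumes complex Gabor RFs with the illustrative choice omega = pi / sigma, a right image that is a shifted copy of the left, and spatial pooling done by summing the binocular cross term over a set of RF centers; orientation pooling is omitted. The function names (`gabor_response`, `coarse_to_fine_disparity`) and all parameter values are hypothetical. At each scale the residual disparity is read from the interocular phase difference, which is where a cosine-shaped phase-shift population response would peak, and the result is added to a position shift applied to the right-eye RFs at the next, finer scale.

```python
import numpy as np

def gabor_response(signal, centers, sigma, omega):
    """Complex Gabor responses of a 1-D signal for RFs at the given centers."""
    x = np.arange(signal.size)
    u = x[None, :] - centers[:, None]          # offset of each sample from each RF center
    kernel = np.exp(-u**2 / (2 * sigma**2)) * np.exp(1j * omega * u)
    return kernel @ signal

def coarse_to_fine_disparity(left, right, centers, sigmas=(16.0, 8.0, 4.0, 2.0)):
    """Hypothetical sketch: phase-based estimate per scale, position shift across scales."""
    shift = 0.0                                 # position-shift component, shared by all cells
    for sigma in sigmas:                        # coarse to fine
        omega = np.pi / sigma                   # assumed frequency/size relation
        c_left = gabor_response(left, centers, sigma, omega)
        # Right-eye RFs are displaced by the current estimate to offset the stimulus disparity.
        c_right = gabor_response(right, centers + shift, sigma, omega)
        # Spatial pooling of the binocular cross term over all RF centers.
        pooled = np.sum(c_right * np.conj(c_left))
        # Residual disparity from the pooled interocular phase difference.
        shift += np.angle(pooled) / omega
    return shift

rng = np.random.default_rng(0)
left = rng.standard_normal(512)                 # synthetic 1-D random-dot-like pattern
right = np.roll(left, 6)                        # right(x) = left(x - 6)
centers = np.arange(200.0, 264.0)               # RF centers used for pooling
estimate = coarse_to_fine_disparity(left, right, centers)
print(f"estimated disparity: {estimate:.3f}")
```

In this sketch the phase-based step alone can only represent residuals within plus or minus sigma at each scale, which is why the running position shift is needed before moving to a finer scale.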