Skip Nav Destination
Close Modal
Update search
NARROW
Format
Journal
TocHeadingTitle
Date
Availability
1-2 of 2
Zhanyi Hu
Close
Follow your search
Access your saved searches in your account
Would you like to receive an alert when new items match your search?
Sort by
Journal Articles
Publisher: Journals Gateway
Neural Computation (2018) 30 (2): 447–476.
Published: 01 February 2018
FIGURES
| View All (7)
Abstract
View article
PDF
Under the goal-driven paradigm, Yamins et al. ( 2014 ; Yamins & DiCarlo, 2016 ) have shown that by optimizing only the final eight-way categorization performance of a four-layer hierarchical network, not only can its top output layer quantitatively predict IT neuron responses but its penultimate layer can also automatically predict V4 neuron responses. Currently, deep neural networks (DNNs) in the field of computer vision have reached image object categorization performance comparable to that of human beings on ImageNet, a data set that contains 1.3 million training images of 1000 categories. We explore whether the DNN neurons (units in DNNs) possess image object representational statistics similar to monkey IT neurons, particularly when the network becomes deeper and the number of image categories becomes larger, using VGG19, a typical and widely used deep network of 19 layers in the computer vision field. Following Lehky, Kiani, Esteky, and Tanaka ( 2011 , 2014 ), where the response statistics of 674 IT neurons to 806 image stimuli are analyzed using three measures (kurtosis, Pareto tail index, and intrinsic dimensionality), we investigate the three issues in this letter using the same three measures: (1) the similarities and differences of the neural response statistics between VGG19 and primate IT cortex, (2) the variation trends of the response statistics of VGG19 neurons at different layers from low to high, and (3) the variation trends of the response statistics of VGG19 neurons when the numbers of stimuli and neurons increase. We find that the response statistics on both single-neuron selectivity and population sparseness of VGG19 neurons are fundamentally different from those of IT neurons in most cases; by increasing the number of neurons in different layers and the number of stimuli, the response statistics of neurons at different layers from low to high do not substantially change; and the estimated intrinsic dimensionality values at the low convolutional layers of VGG19 are considerably larger than the value of approximately 100 reported for IT neurons in Lehky et al. ( 2014 ), whereas those at the high fully connected layers are close to or lower than 100. To the best of our knowledge, this work is the first attempt to analyze the response statistics of DNN neurons with respect to primate IT neurons in image object representation.
Journal Articles
Publisher: Journals Gateway
Neural Computation (2010) 22 (8): 2161–2191.
Published: 01 August 2010
FIGURES
| View All (17)
Abstract
View article
PDF
Markov random field (MRF) and belief propagation have given birth to stereo vision algorithms with top performance. This article explores their biological plausibility. First, an MRF model guided by physiological and psychophysical facts was designed. Typically an MRF-based stereo vision algorithm employs a likelihood function that reflects the local similarity of two regions and a potential function that models the continuity constraint. In our model, the likelihood function is constructed on the basis of the disparity energy model because complex cells are considered as front-end disparity encoders in the visual pathway. Our likelihood function is also relevant to several psychological findings. The potential function in our model is constrained by the psychological finding that the strength of the cooperative interaction minimizing relative disparity decreases as the separation between stimuli increases. Our model is tested on three kinds of stereo images. In simulations on images with repetitive patterns, we demonstrate that our model could account for the human depth percepts that were previously explained by the second-order mechanism. In simulations on random dot stereograms and natural scene images, we demonstrate that false matches introduced by the disparity energy model can be reliably removed using our model. A comparison with the coarse-to-fine model shows that our model is able to compute the absolute disparity of small objects with larger relative disparity. We also relate our model to several physiological findings. The hypothesized neurons of the model are selective for absolute disparity and have facilitative extra receptive field. There are plenty of such neurons in the visual cortex. In conclusion, we think that stereopsis can be implemented by neural networks resembling MRF.