Abstract
Recurrent processing is a crucial feature of human visual processing, supporting perceptual grouping, figure-ground segmentation, and recognition under challenging conditions. There is a clear need to incorporate recurrent processing in deep convolutional neural networks, but the computations underlying recurrent processing remain unclear. In this article, we tested whether a form of recurrence in deep residual networks (ResNets) captures recurrent processing signals in the human brain. Although ResNets are feedforward networks, they approximate an excitatory additive form of recurrence. Essentially, this form of recurrence consists of repeating excitatory activations in response to a static stimulus. Here, we used ResNets of varying depths (reflecting varying levels of recurrent processing) to explain EEG activity within a visual masking paradigm. Sixty-two humans and 50 artificial agents (10 ResNet models at each of five depths: 4, 6, 10, 18, and 34) completed an object categorization task. We show that deeper networks explained more variance in brain activity than shallower networks. Furthermore, all ResNets captured differences in brain activity between unmasked and masked trials, with differences starting at ∼98 msec after stimulus onset. These early differences indicate that EEG activity reflected “pure” feedforward signals only briefly (up to ∼98 msec). After ∼98 msec, deeper networks showed a significant increase in explained variance, peaking at ∼200 msec, but only within unmasked trials, not masked trials. In summary, we provide clear evidence that excitatory additive recurrent processing in ResNets captures some of the recurrent processing in humans.
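The correspondence between residual depth and additive recurrence can be illustrated with a toy sketch. It is not the models used in this study. It assumes a residual block of the form h + ReLU(W h) and ties the weights across blocks, whereas trained ResNets use different weights in each block, which is why they only approximate recurrence. All function names, the dimensionality, and the random weights below are hypothetical and chosen purely for illustration.

```python
import random


def relu(v):
    """Element-wise rectification: the branch output is never negative."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(h, W):
    """One residual block: identity shortcut plus a non-negative branch."""
    branch = relu(matvec(W, h))
    return [a + b for a, b in zip(h, branch)]


def feedforward_resnet(x, weights):
    """Feedforward stack of residual blocks, one weight matrix per block."""
    h = x
    for W in weights:
        h = residual_block(h, W)
    return h


def additive_recurrence(x, W, steps):
    """Recurrent update h_{t+1} = h_t + relu(W h_t) on a static input."""
    h = x
    for _ in range(steps):
        h = [a + b for a, b in zip(h, relu(matvec(W, h)))]
    return h


# Illustrative setup (assumed values, not taken from the study).
rng = random.Random(0)
dim, depth = 4, 5
W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
x = [rng.uniform(0.0, 1.0) for _ in range(dim)]

# A depth-5 residual stack with tied weights versus 5 recurrent steps.
deep = feedforward_resnet(x, [W] * depth)
recurrent = additive_recurrence(x, W, depth)
```

Under these assumptions, `deep` and `recurrent` are identical, so adding residual blocks behaves like running the additive recurrent update for more time steps on the same static input. Because the ReLU branch is non-negative, each step can only add to the existing activation, which is one way to read "excitatory additive" in this simplified setting.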