Abstract
Historically, reproducibility has been the sine qua non of experimental findings considered scientifically useful. Typically, findings from functional magnetic resonance imaging (fMRI) studies are assessed with statistical parametric maps (SPMs) using a p value threshold. However, a smaller p value does not imply that the observed result will be reproducible. In this study, we suggest interpreting SPMs in conjunction with reproducibility evidence. Reproducibility is defined as the extent to which the active status of a voxel remains the same across replicates conducted under the same conditions. We propose a methodology for assessing reproducibility in functional MR images without conducting separate experiments. Our procedures include the empirical Bayes method for estimating effects due to experimental stimuli, the threshold optimization procedure for assigning voxels to the active status, and the construction of reproducibility maps. In an empirical example, we implemented the proposed methodology to construct reproducibility maps based on data from the study by Ishai et al. (2000). The original experiments involved 12 human subjects and investigated brain regions most responsive to visual presentation of 3 categories of objects: faces, houses, and chairs. The brain regions identified included occipital, temporal, and fusiform gyri. Using our reproducibility analysis, we found that subjects in one of the experiments exercised at least 2 mechanisms in responding to visual objects when alternately performing matching and passive tasks. One mechanism gave activation maps closer to those reported by Ishai et al., and the other involved related regions in the precuneus and posterior cingulate. The patterns of activated regions were reproducible for at least 4 of the 6 subjects involved in the experiment. Empirical application of the proposed methodology suggests that human brains exhibit different strategies to accomplish experimental tasks when responding to stimuli.
It is important to correlate activations with subjects' behavior, such as reaction time and response accuracy. Also, the latency between the stimulus presentation and the peak of the hemodynamic response function varies considerably among individual subjects according to types of stimuli and experimental tasks. These variations themselves also deserve scientific inquiry. We conclude by discussing research directions relevant to reproducibility evidence in fMRI.
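As a purely illustrative sketch of the definition of reproducibility given above (the extent to which a voxel's active status stays the same across replicates), the following Python fragment scores each voxel by the fraction of replicates that agree with its majority status. This is an assumption made for illustration only and is not the empirical Bayes and threshold optimization procedure proposed in this study; the function name `reproducibility_map` and the toy data are hypothetical.

```python
import numpy as np

def reproducibility_map(active):
    """Per-voxel agreement across replicates (illustrative definition only).

    active: array of shape (n_replicates, n_voxels) holding active/inactive
            calls (True = active) for replicates run under the same conditions.
    Returns the fraction of replicates that share each voxel's majority status,
    a value between 0.5 (no agreement) and 1.0 (same status in every replicate).
    """
    active = np.asarray(active, dtype=bool)
    frac_active = active.mean(axis=0)            # share of replicates calling the voxel active
    return np.maximum(frac_active, 1.0 - frac_active)

# Hypothetical toy data: 4 replicates, 3 voxels.
calls = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
], dtype=bool)

rmap = reproducibility_map(calls)
print(rmap)
```

In this toy case the first voxel is active in every replicate and is therefore fully reproducible, whereas the third voxel is active in only half of the replicates and receives the lowest possible score.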