The discovery of mirror neurons—neurons that code specific actions both when executed and observed—in area F5 of the macaque provides a potential neural mechanism underlying action understanding. To date, neuroimaging evidence for similar coding of specific actions across the visual and motor modalities in human ventral premotor cortex (PMv)—the putative homologue of macaque F5—is limited to the case of actions observed from a first-person perspective. However, it is the third-person perspective that figures centrally in our understanding of the actions and intentions of others. To address this gap in the literature, we scanned participants with fMRI while they viewed two actions from either a first- or third-person perspective during some trials and executed the same actions during other trials. Using multivoxel pattern analysis, we found action-specific cross-modal visual–motor representations in PMv for the first-person but not for the third-person perspective. Additional analyses showed no evidence for spatial or attentional differences across the two perspective conditions. In contrast, more posterior areas in the parietal and occipitotemporal cortex did show cross-modal coding regardless of perspective. These findings point to a stronger role for these latter regions, relative to PMv, in supporting the understanding of others' actions with reference to one's own actions.
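The logic of the cross-modal multivoxel pattern analysis can be illustrated with a minimal sketch: a classifier is trained to discriminate the two actions from patterns in one modality (e.g., observation) and tested on patterns from the other (execution), so that above-chance accuracy implies an action-specific code shared across modalities. The sketch below is not the analysis pipeline used in this study. The data are synthetic, the correlation-based nearest-centroid classifier is one common choice among several, and the function name `cross_modal_accuracy`, the trial and voxel counts, and the noise levels are all assumptions made for illustration.

```python
import numpy as np


def cross_modal_accuracy(train_X, train_y, test_X, test_y):
    """Train a correlation-based nearest-centroid classifier on one
    modality and test it on the other (hypothetical helper)."""
    # Remove each modality's mean pattern so that overall modality
    # effects do not drive classification, only action differences.
    train_X = train_X - train_X.mean(axis=0)
    test_X = test_X - test_X.mean(axis=0)

    labels = np.unique(train_y)
    centroids = np.stack(
        [train_X[train_y == lab].mean(axis=0) for lab in labels]
    )

    # Correlate every test trial with every training centroid and
    # assign the label of the best-matching centroid.
    n_test = test_X.shape[0]
    corr = np.corrcoef(test_X, centroids)[:n_test, n_test:]
    predicted = labels[np.argmax(corr, axis=1)]
    return float(np.mean(predicted == test_y))


# Synthetic region of interest (assumed sizes, not taken from the study).
rng = np.random.default_rng(0)
n_trials, n_voxels = 20, 50
y = np.repeat([0, 1], n_trials)  # two actions

# Assumption: each action evokes a pattern shared by both modalities.
action_patterns = rng.normal(size=(2, n_voxels))


def simulate(modality_offset):
    """Simulated trials: shared action pattern + modality offset + noise."""
    noise = rng.normal(size=(2 * n_trials, n_voxels))
    return action_patterns[y] + modality_offset + noise


observed = simulate(modality_offset=rng.normal(size=n_voxels))
executed = simulate(modality_offset=rng.normal(size=n_voxels))

# Average the two train/test directions, as is common in cross-modal decoding.
acc_obs_to_exec = cross_modal_accuracy(observed, y, executed, y)
acc_exec_to_obs = cross_modal_accuracy(executed, y, observed, y)
mean_acc = (acc_obs_to_exec + acc_exec_to_obs) / 2
print(f"cross-modal accuracy: {mean_acc:.2f} (chance = 0.50)")
```

Because the simulated region carries a shared action code by construction, accuracy here should lie well above chance. In real data, the comparison of interest is whether accuracy exceeds chance in a given region and perspective condition.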