Although visual input arrives continuously, sensory information is segmented into (quasi-)discrete events. Here, we investigated the neural correlates of spatiotemporal binding in humans with magnetoencephalography using two tasks in which two separate flashes were presented on each trial but were perceived, in a bistable way, as either a single event or two separate events. The first task (two-flash fusion) involved judging one versus two flashes, whereas the second task (apparent motion: AM) involved judging coherent motion versus two stationary flashes. Results indicate two different functional networks underlying two distinct aspects of temporal binding. In two-flash fusion trials, involving an integration window of ∼50 msec, evoked responses differed as a function of perceptual interpretation by ∼25 msec after stimulus offset. Multivariate decoding of subjective perception based on prestimulus oscillatory phase was significant for alpha-band activity in the right middle temporal (V5/MT) area, with the strength of prestimulus connectivity between early visual areas and V5/MT being predictive of performance. In contrast, the longer integration window (∼130 msec) for AM showed evoked field differences only ∼250 msec after stimulus offset. Phase decoding of the perceptual outcome in AM trials was significant for theta-band activity in the right intraparietal sulcus. Prestimulus theta-band connectivity between V5/MT and intraparietal sulcus best predicted AM perceptual outcome. For both tasks, the phase effects could not be accounted for by concomitant variations in power. These results show a strong relationship between specific spatiotemporal binding windows and specific oscillations, linked to the information flow between different areas of the “where” and “when” visual pathways.