Abstract
The most common form of visual feedback in teleoperation is the monoscopic video display. As robotic autonomy increases and the human operator takes on the role of a supervisor, three-dimensional information is effectively presented by multiple televised two-dimensional (2-D) projections showing the same scene from different angles. To analyze how people use such segmented information to make estimates about three-dimensional (3-D) space, 18 subjects were asked to determine the position of a stationary pointer in space; eye movements and reaction times (RTs) were recorded while either two or three 2-D views were presented simultaneously, each showing the same scene from a different angle. The results revealed that subjects estimated 3-D space using a simple feature-search algorithm. Eye movement analysis supported the conclusion that people can efficiently use multiple 2-D projections to make estimates about 3-D space without mentally reconstructing the scene in three dimensions. The major limiting factor on RT in such situations is the subjects' visual search performance, which in this experiment yielded a mean RT of 2270 msec (SD = 468; N = 18). This conclusion was supported by the Model Human Processor (Card, Moran, & Newell, 1983), which predicted a mean RT of 1820 msec given the general eye movement patterns observed. Single-subject analysis of the experimental data further suggested that in some cases people may base their judgments on a more elaborate 3-D mental model reconstructed from the available 2-D views. In such situations, RTs and visual search patterns closely resemble those found in the mental rotation paradigm (Just & Carpenter, 1976), with RTs in the range of 5-10 sec.
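To make the feature-search claim concrete, the following is a minimal sketch of one way a 3-D position can be read off directly from two orthogonal 2-D projections without any 3-D mental reconstruction. The view names, axis conventions, and function name are illustrative assumptions, not the paper's procedure; the geometric fact it demonstrates is simply that each projection fixes two of the three coordinates.

```python
# Illustrative sketch (assumed axis conventions, not from the paper):
# a front view projects the pointer to (x, y) and a side view to (z, y),
# so the 3-D position follows by coordinate lookup alone, with the
# shared y-coordinate available as a cross-check between views.

def locate_from_views(front_xy, side_zy):
    """Combine a front-view reading (x, y) and a side-view reading (z, y)
    into a 3-D position (x, y, z)."""
    x, y_front = front_xy
    z, y_side = side_zy
    if abs(y_front - y_side) > 1e-6:
        raise ValueError("views disagree on the shared y-coordinate")
    return (x, y_front, z)

# Example: pointer seen at (3, 5) in the front view and (2, 5) in the
# side view lies at (3, 5, 2) in the workspace.
print(locate_from_views((3.0, 5.0), (2.0, 5.0)))  # (3.0, 5.0, 2.0)
```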
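The Model Human Processor prediction is an additive sum of serial operator times. The sketch below uses the nominal cycle times published by Card, Moran, and Newell (1983) (eye movement about 230 msec, cognitive cycle about 70 msec, motor cycle about 70 msec); the particular operator counts are an assumption chosen only to show how an estimate near 1820 msec can arise, and are not the paper's actual derivation.

```python
# Hedged sketch of an additive Model Human Processor (MHP) RT estimate.
# Cycle times are the MHP's nominal published values; the operator
# counts below are assumed for illustration, not taken from the paper.

EYE_MOVEMENT_MS = 230  # one saccade plus fixation, nominal MHP value
COGNITIVE_MS = 70      # one cognitive-processor cycle
MOTOR_MS = 70          # one motor-processor cycle

def mhp_rt(n_eye_movements, n_cognitive_cycles, n_motor_cycles=1):
    """Additive MHP estimate: total RT is the sum of serial operator times."""
    return (n_eye_movements * EYE_MOVEMENT_MS
            + n_cognitive_cycles * COGNITIVE_MS
            + n_motor_cycles * MOTOR_MS)

# Assumed scan path: 7 fixations across the 2-D views, two decision
# cycles, and one response -> 7*230 + 2*70 + 1*70 = 1820 msec.
print(mhp_rt(7, 2))  # 1820
```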