Existing video teleconferencing techniques suffer from a limited field of view, low resolution, and a fixed viewpoint. We present a set of novel techniques to overcome these limitations. Building on the light field rendering concept, we use an array of cameras to capture the participant(s) from multiple view angles. Using either assumed or estimated scene geometry, these images are assembled into a high-resolution, seamless image from a user-controllable viewpoint. Two camera configurations, one dense and one sparse, are presented: the dense configuration is optimized for high-fidelity view synthesis, while the sparse configuration provides an expanded viewing volume. For the dense configuration, we provide a detailed analysis of the number of camera images required. For the sparse configuration, we present a robust technique for estimating an approximation of the scene geometry, which provides smooth transitions as the viewpoint changes.
The novelty of our approach is that it allows users to electronically steer the viewpoint of a live scene in real time. It can therefore be used in 3D teleconferencing systems to generate stereoscopic views, or in group teleconferencing to provide virtual camera views that minimize the overall perspective distortion. We demonstrate the effectiveness of our approach with a point-to-point teleconferencing system deployed at several locations across the continental United States.
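To make the view-assembly idea concrete, the following is a minimal sketch of how a pixel of a virtual view might be gathered from a camera array under an assumed scene geometry. It is not the system described here. It assumes a simplified setup that the text does not specify: identical, parallel cameras placed along a single horizontal axis, a shared focal length in pixels, and a single fronto-parallel plane standing in for the scene. The names `source_column`, `blend_weights`, `focal_px`, and `plane_depth` are hypothetical and chosen for illustration only.

```python
from bisect import bisect_right


def source_column(u_view, view_x, cam_x, focal_px, plane_depth):
    """Column in a real camera that sees the same plane point as column
    u_view of the virtual camera.

    Assumption (illustrative): cameras are parallel, differ only by a
    horizontal offset, and the scene is a plane at depth plane_depth.
    Columns are measured relative to each camera's principal point.
    """
    return u_view + focal_px * (view_x - cam_x) / plane_depth


def blend_weights(camera_xs, view_x):
    """Linear blending weights over a 1D camera array.

    camera_xs must be sorted in ascending order. Only the two cameras
    that bracket view_x get non-zero weight; viewpoints outside the
    array are clamped to the nearest end camera.
    """
    weights = [0.0] * len(camera_xs)
    if view_x <= camera_xs[0]:
        weights[0] = 1.0
        return weights
    if view_x >= camera_xs[-1]:
        weights[-1] = 1.0
        return weights
    right = bisect_right(camera_xs, view_x)  # first camera beyond view_x
    left = right - 1
    t = (view_x - camera_xs[left]) / (camera_xs[right] - camera_xs[left])
    weights[left] = 1.0 - t
    weights[right] = t
    return weights


# Example: three cameras 1 unit apart, virtual viewpoint at x = 0.25.
cams = [0.0, 1.0, 2.0]
w = blend_weights(cams, 0.25)
cols = [source_column(100.0, 0.25, cx, 800.0, 2.0) for cx in cams]
```

In this sketch, each virtual pixel is looked up in the bracketing cameras at the columns given by `source_column` and the samples are mixed with the weights from `blend_weights`. Moving `view_x` changes both the lookup positions and the weights continuously, which is the sense in which the viewpoint can be steered; a real system would also need calibration, two-dimensional arrays, and resampling, none of which are shown.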