Augmented reality (AR) aims to enhance a person's perception of the surrounding world, unlike virtual reality (VR), which aims to replace that perception with an artificial one. An important issue in AR is making the virtual world sensitive to the current state of the surrounding real world as the user interacts with it. To provide the appropriate augmentation stimulus at the right position and time, the system needs sensory input for interpreting the surrounding scene. Computer vision holds great potential for providing the necessary interpretation of the scene. While general computer vision-based interpretation of a scene is extremely difficult, constraints from the assembly domain and a specific marker-based coding scheme are used to develop an efficient and practical solution. We consider the problem of scene augmentation in the context of a human assembling a mechanical object from its components. Concepts from robot assembly planning are used to develop a systematic framework for presenting augmentation stimuli in this assembly domain. An experimental prototype system, VEGAS (Visual Enhancement for Guiding Assembly Sequences), which implements some of these AR concepts for guiding assembly using computer vision, is described.
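The idea of driving augmentation from assembly-planning concepts can be sketched minimally: model the assembly as a precedence graph and, given the parts the vision system has recognized as already assembled, compute which parts are feasible next and should therefore receive augmentation cues. The part names and constraints below are hypothetical, purely for illustration; this is not the actual VEGAS algorithm.

```python
# Illustrative sketch (not the VEGAS implementation): an assembly
# sequence modeled as a precedence graph. The guidance system would
# highlight whichever parts are feasible to attach next, given the
# parts the vision system has recognized as already assembled.

# Hypothetical precedence constraints: part -> parts that must precede it.
PRECEDENCE = {
    "base": set(),
    "axle": {"base"},
    "wheel_left": {"axle"},
    "wheel_right": {"axle"},
    "cover": {"wheel_left", "wheel_right"},
}

def feasible_next_parts(assembled):
    """Return parts not yet assembled whose prerequisites are all met."""
    done = set(assembled)
    return sorted(
        part for part, prereqs in PRECEDENCE.items()
        if part not in done and prereqs <= done
    )

# As each completed step is recognized, the set of suggested
# augmentation targets is updated accordingly.
print(feasible_next_parts([]))                # → ['base']
print(feasible_next_parts(["base", "axle"]))  # → ['wheel_left', 'wheel_right']
```

In a full system, the `assembled` set would be updated from marker detections in the camera image, and the returned parts would determine where and when augmentation stimuli are rendered.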