Scientists working in the Mars Exploration Rover (MER) mission (2004–2018) reported having a sense of presence on Mars. How is this possible, given that many of the factors underlying presence in mundane situations were absent? We use Riva and Waterworth's (2014) Three-Level model to elucidate how presence was achieved. It distinguishes among proto-presence, core-presence, and extended-presence. We argue that scientists did not experience proto-presence because it requires a tight sensorimotor coupling not available due to the way the rovers were controlled and due to the lengthy delays in getting feedback. Instead, the design of the sociotechnical system made core-presence and extended-presence possible. Extended-presence involved successfully establishing long-term conceptual goals during strategic planning meetings. Core-presence involved enacting short-term tactical goals by carrying out specific actions on particular targets, abstracting away from sensorimotor details. The shift of perspective to the Martian surface was facilitated by team members “becoming the rover,” which allowed them to identify relevant affordances evident in images. We argue, however, that because Mars exploration is a collective activity involving shared agency by a distributed cognitive system, the experience of core- and extended-presence was a collective sense of presence through the rovers.
Among the greatest achievements of modern science is the exploration of Mars. Much of this has been done using rovers such as Spirit and Opportunity in the Mars Exploration Rover mission (MER; 2004–2018; see Figure 1). Ethnographies of the teams exploring Mars have shown that a key to their success was that team members had a technologically mediated sense of presence on the Red Planet (Clancey, 2012; Vertesi, 2015). Reports are full of statements like “we are 4 m from the outcrop we want to image” (Vertesi, 2015), “we have arrived at Endurance Crater,” and “it's below our feet,” referring to the lake beds at Gusev Crater (Clancey, 2012). In another typical statement, an astrogeologist said, “I put myself out there in the scene, [as] the rover . . . trying to figure out where to go and what to do” (Clancey, 2012, p. 100). The PI Squyres said, “If you had told me that we were going to climb to the summit of Husband Hill, and get to Home Plate, then get down into Endurance Crater, then Victoria Crater. . . I wouldn't have believed it” (Clancey, 2012, p. 101).
The central question driving our research, a question that should be of interest to the readers of Presence, is “how is this sense of presence on Mars possible?” Prima facie there are important factors working against experiencing telepresence in this context. First, due to the great distances involved, the actions carried out on Mars take place after significant delays, with feedback not available until the next sol. Second, scientists do not directly control the rovers; engineers control them through batch programming, sending up commands once per sol. Thus, factors that normally establish virtual presence are unavailable to scientists working remotely on Mars (Carlson et al., 2017; Khenak, Vézien, Théry, & Bourdot, 2018; Sheridan, 1992).
Figure 1. MER rover and tools.
Although further research is required to externally validate the account, we propose a conceptual scheme that we believe can illuminate telepresence on Mars, one that draws on the excellent ethnographies by Vertesi (2012, 2015) and Clancey (2012), as well as on the first-hand account by Squyres (2005). Our account connects presence to agency and the ability to enact intentions (Zahorik & Jenison, 1998). We make use of Riva and Waterworth's (2014) Three-Level model, which distinguishes “proto-presence,” “core-presence,” and “extended-presence.” Presence at each of these levels depends on being able to respond effectively to affordances at different levels of abstraction and at different timescales. We argue that Mars team members’ experiences are best described as involving core-presence and extended-presence, facilitated by the imaginative process of “becoming the rover.” However, we modify the Three-Level model because the timescales are longer than in mundane situations and because the intentions involved are shared intentions. Mars exploration is a collective activity that involves the shared agency of a distributed cognitive system, as no one acts alone on Mars (Chiappe & Vervaeke, 2020).
2 The Three-Level Model
According to Riva and Waterworth's (2014; Triberti & Riva, 2016) Three-Level model, presence is the feeling of inhabiting an environment that arises when one has an optimal grip on relevant affordances in that environment such that one can successfully enact one's intentions. We sense deviations in optimal grip in tasks we are engaged in and take steps to regain control and reestablish presence. Although presence is usually experienced as a unitary phenomenon, it is made up of different layers, such that some can occur in some contexts while the others are absent. The experience of presence is strongest, however, when all three levels are focused on the same object or situation.
Proto-presence reflects the degree of perception-action coupling as we are engaged in interactions with objects (Riva & Waterworth, 2014). Micro-movements of the body have to be tightly coupled to relevant features of objects to be effective. For example, the hand has to adjust its precise orientation and grip to match the specific features of an object. The intentions that underlie proto-presence are “motor intentions” (M-intentions). These occur along with an action, providing fine-grained guidance and control, operating over elemental timescales (i.e., 10–300 ms). Proto-presence results when motor responses correctly predict consequences at the elemental timescale.
Core-presence accompanies actions specified at a higher level, abstracting from micro-movements. It makes use of our ability to integrate fluctuating sensations into a relatively stable perception of the environment. For example, perceiving that one needs to get out of the way of an object moving in one's direction and doing so successfully without focusing on the specific movements involved is core-presence. The intentions that underlie it are “present-directed intentions” (P-intentions). They initiate an action in a particular context and sustain it until it is complete. They track the action as a whole and monitor it for collateral effects. If the action has unwanted consequences, a high-level command is issued to eliminate them by influencing the formation of M-intentions. P-intentions are formulated in the situation in which they are enacted, that is, while perceiving objects, and are anchored to particular situations. They have indexical content, such that they involve doing this action on this particular object, and typically operate over integrative timescales (i.e., 300 ms to 3 s).
Extended-presence arises when one successfully formulates conceptually articulated long-term goals. Extended-presence is greatest when those goals are attainable and consistent with stories that define our identity. According to Riva and Waterworth (2014), extended-presence involves setting goals not related to the here and now. It allows us to plan activities and imagine possible future situations. The intentions that underlie extended-presence are “distal-intentions” (D-intentions). They terminate practical reasoning about goals and prompt practical reasoning to generate a plan for achieving them. They consist of a general description of a type of action, and are typically formed in situations different from those in which a concrete action will unfold. They operate over narrative timescales (i.e., > 3 s).
According to the Three-Level model, the intentions that underlie the three types of presence form an action–cascade. To reduce the discrepancy between the current state of the world and abstract intentions, D-intentions are transformed into P-intentions. These are consistent with the D-intentions insofar as they help to achieve them, but they are more situated and specific with respect to environmental affordances; the target objects are perceptually present. As objects are approached, motor affordances reveal themselves and help to determine the best M-intentions for the situation.
2.1 Extended-Presence in the MER Mission
Extended-presence involves the formation of D-intentions, prompting reasoning about a course of action that can be undertaken in the future. An example of a D-intention in MER was for Spirit to explore the region around Home Plate, a 90-m wide plateau in the Columbia Hills. The intention was to approach this plateau and study the role of water in its formation (Wang et al., 2008). This prompted deliberation about the means for carrying out this exploration, including determining what route to take around Home Plate, such that a safe shelter could be reached in time for winter hibernation.
The D-intentions guiding the actions of the MER rovers, however, were not individual intentions; they were shared intentions, ones belonging to the team. As a result, extended-presence on Mars was a collective extended-presence, a feeling that “we” are on Mars, which is revealed by the ethnographic reports cited above. Shared intentions are intentions about what the “we” plans to do; they reflect the will of the group (Tollefsen, 2014). Intentions can be attributed to a group if they are arrived at through free and open discussion among all members, and if they integrate everyone's input. If this condition is met, it produces an intention that transcends any single individual.
In MER, shared D-intentions were arrived at during strategic planning meetings, the weekly “End of sol” meetings led by a Long Term Planning Lead. Strategic meetings also included each of the four Science Theme groups (i.e., Geology, Mineralogy and Geochemistry, Rocks and Soil, and Atmospherics; Squyres, 2005), as well as engineers who provided input on the health and status of the rover and its instruments. During these meetings, the team had to arrive at a consensus regarding the activities of the rover in the coming weeks or months. The intentions described goals in general terms, as specific targets were selected for investigation during tactical planning meetings. Consistent with the Three-Level model, the process of arriving at D-intentions involved narrative timescales and they were formed prior to the joint action.
The Three-Level model holds that the intentions involved in extended-presence are ones that are consistent with the narratives people tell about themselves, that is, who they are and what they are about (Velleman, 2007). Narratives provide stability and coherence to D-intentions, thereby increasing extended-presence. In the case of group agency, the narratives involved in constituting an identity are what Gallagher and Tollefsen (2019) call “we-narratives.” These are narratives that define the shared identity of the group, specifying its structure, goals, and values. In MER, we-narratives included narratives about what the group was trying to accomplish on Mars. Shared D-intentions had to be relevant to addressing the goals of the mission.
Another essential feature of the we-narratives in MER was a commitment to a flattened social hierarchy. Individuals were obligated to engage in reasoned discussion with others. During strategic planning meetings, MER team members collectively examined each other's views prior to arriving at a decision. True to the flattened hierarchy, discussions did not appeal to authority, seniority, or other factors pertaining to the identity of the team members (Vertesi, 2015). Instead, the factors that were considered included matters of scientific relevance and engineering considerations (e.g., the health of the rover and its instruments, and the basic need to keep the rover alive so that the mission could continue).
2.2 Core-Presence in the MER Mission
Core-presence is experienced through the successful enactment of P-intentions. In MER, a P-intention would be to use the Mössbauer spectrometer to search for iron-bearing minerals in a particular rock, or to analyze the elements in a specific sample by applying the Alpha-Particle X-Ray Spectrometer. Core-presence involves tracking actions to make sure they are proceeding as intended. In mundane situations, this happens within an integrative timescale of seconds. In MER, however, because feedback was not available until data was downloaded and processed by experts, monitoring the side-effects of actions took place over narrative timescales. Complex actions were broken down into steps that unfolded over several sols, each step being verified before proceeding to the next (Squyres, 2005).
P-intentions on MER, like D-intentions, were shared intentions; that is, they belonged to the group. This is because they were formulated during Science Operations Working Group (SOWG) meetings, which employed consensus-based operations. Any team member could request an observation, and all could challenge any proposed observation. Every team member also had to indicate approval of the plan before it was finalized. The SOWG Chair went around the room asking each member if they were “happy” with the plan, which reminded them that “they are all complicit in the day's activity plan” (Vertesi, 2015, p. 45). Shared P-intentions were constrained by strategic plans, that is, shared D-intentions, as well as by the current resources available to the rovers. Engineers provided scientists with a time frame within which observations could take place, taking into account, for example, when a rover had to “nap” to recharge its batteries, and when it had to pause activities to communicate with Earth through satellites. SOWG meetings determined a plan for the rover's actions the next sol, enacting a one-sol turnaround cycle of operation.
Core-presence was established by interacting with images, which functioned as surrogates for the objects of their concern, as scientists were not in direct perceptual contact with them. Images were projected onto large screens during SOWG meetings so all team members could comment on a proposed observation (Clancey, 2012). For example, in one such meeting circles were drawn around three candidate rock targets in the Home Plate region to help focus discussion on which one should be analyzed. In the course of developing P-intentions, scientists gave names to targets to ensure that the correct one was analyzed (Vertesi, 2015). Naming thus helped to convert a general plan into an action on a specific object, converting a D-intention to a P-intention.
Figure 3. Pancam image taken by Opportunity with Endurance Crater in the distance.
A key type of image used to establish core-presence on MER was the panorama. Panoramas were often stitched together from multiple images taken by the two Pancams. Some were laid out on tables so that scientists could walk around them and jointly discuss features in the landscape around the rover (Clancey, 2012). Phenomenologically, these images helped to establish Mars as a place that could be inhabited and explored, instead of an abstract space (Messeri, 2016). They established a “seeing from” perspective, a point of view, centered on the rover, from which the viewing was taking place (Ihde, 1990). This perspective changed as the rover moved across the Martian surface. For example, Squyres (2005) recounts the following as they were approaching Endurance Crater (see Figure 3):
“We can see a little bit of the interior, enough to tantalize, but it's still impossible to tell what we're dealing with. We are coming in from the east, and the crater wall is higher on the far side than it is on the near side. So we can see the top part of the western wall, facing toward us and glowing gold in the morning sunlight in Pancam pictures. From this distance it looks almost cliff like, imposing and actually a little frightening. But it's impossible to tell yet how much of that is real and how much is just foreshortening. We should find out soon enough” (p. 330).
Scientists and engineers used images to plan rover movements and to deploy instruments to answer scientific questions, an experience that differs markedly from a merely aesthetic appreciation of images. Images extended the affordance space so that opportunities for action by the rovers on Mars would present themselves to team members on Earth. When looking at images, what they saw was what the MER team could enact through the rover, and they anticipated results that would be evident in subsequent imagery. Indeed, team members sought an optimal grip on the landscape of affordances revealed by these images. They could sense when the rover was not suitably placed in the Martian terrain to achieve their goals, and they took steps to remedy the situation. For example, Vertesi (2015) recounts the case of scientists seeing different-colored bands on the rim of Victoria Crater. They could also sense, however, that they did not have an optimal grip to determine whether the bands provided evidence of “layering” of depositional materials, because the image was taken from too far away. This led to the formation of further P-intentions to determine the best locations from which to take Pancam images to answer the stratigraphic questions.
The ability to identify relevant affordances and establish core-presence on Mars was facilitated by what the ethnographic reports describe as “becoming the rover” (Clancey, 2012; Vertesi, 2012). “Becoming the rover” involved team members developing an embodied sense of the rover's capacities, ones that were acted out overtly or covertly. Vertesi (2012, p. 400) describes the identification with the rover as a “technomorphism” that involves “developing a sensibility to what the rover might see, think, or feel, in relation to specific activities that must be planned.” This was clearly the case with engineers on the team. For example, Vertesi (2012) observed a Pancam operator pretending to be a rover by using her hands held up to her head to represent the left and right Pancams, and then rotating at the waist. This was done to figure out how the rover would have to move the Instrument Deployment Device (IDD) to take an image of a particular object.
Scientists working with the rovers also displayed technomorphism. This was achieved as a result of their experience requesting images, drives, and measurements, giving them a sense of what the rover body could accomplish. Their technomorphism was evident in their use of gestures. While looking at images during SOWG meetings, they pretended to be the rover, moving the chairs they were sitting on to work out potential movements (Vertesi, 2015). They also used an arm to plan movements of the rover's IDD. As Vertesi says, “this involves lifting the right upper arm to shoulder height, dropping the forearm to 90° with the fist pointed at the ground, and articulating the arm in a limited fashion first side-to-side from the shoulder, then swinging forward from the elbow” (2015, p. 172). When scientists worked out a potential sequence of maneuvers using the IDD, they often used these gestures. As a result, team members had common ground for understanding each other, which facilitated the process of arriving at a consensus regarding shared P-intentions.
Scientists’ becoming the rover also influenced their perceptions of which features of the Martian terrain could be interacted with using the specific tools onboard the rovers. As a result, they came to virtually inhabit the environment in the way that the rover does. What was perceived on images as near versus far, reachable versus not reachable, RAT-able versus not RAT-able, for example, was determined by the constraints and opportunities offered by the rover, and scientists developed a skillful know-how of these factors. They also acquired a sense of the conditions that could endanger the rover, which had to be kept within narrow limits of temperature and battery levels to remain operational. It also had to be kept away from very loose soil and from steep cliffs. Any proposed movement that would endanger it led to an appropriate affective reaction in the team. Interestingly, the affective identification with the rovers at times yielded some strange associations. For example, a scientist interviewed by Vertesi (2012, p. 402) said, “I was working in the garden one day and all of a sudden, I don't know what's going on with my right wrist, I cannot move it—out of nowhere! I get here [to the planning meeting], and Spirit has, its right front wheel is stuck! Things like that, you know? . . . I am totally connected to [Spirit]!” According to Witmer and Singer (1998), emotional involvement is crucial for the experience of presence, so it is likely that emotional reactions by MER team members helped to foster a sense of core-presence on the landscape.
As this discussion demonstrates, the way core-presence was achieved on MER is very different from the way telepresence is usually achieved. In typical cases, a teleoperation mode of control is employed (Draper, Kaber, & Usher, 1998; Lester & Thronson, 2011). In teleoperations, the operator controls a remotely located robot by issuing commands through a manual control unit. These commands are then executed through actuators and the operator receives near real-time, continuous feedback from sensors (Cui, Tosunoglu, Roberts, Moore, & Repperger, 2003; Sheridan, 1992). When combined with displays providing an immersive interface, the operator can feel present in the distal environment (Cummings & Bailenson, 2016). In this type of mediated action, the proximal tool is incorporated into the body schema, creating a shift in peripersonal space (Bourgeois, Farnè, & Coello, 2014; Riva & Mantovani, 2012). When attention shifts to the distal tool, near-space and far-space come to be centered on the distal tool. This leads to the experience of spatial presence in the distal environment. It locates the self in a distal place from which it can assess its actions (Lee, 2004).
In MER, although scientists were instrumental in arriving at P-intentions, they did not themselves manipulate proximal tools. It was engineers who controlled the rover tools, and they used batch programming, not teleoperations (Mishkin, Limonadi, Laubach, & Bass, 2006). The shift in spatial presence instead arose through the exercise of the embodied imagination. According to Gallagher's (2015) enactive account, imagination is a type of pretense, a kind of overt or covert simulated activity, one that can expand our ability to perceive affordances. Team members thus imagined that they were the rover acting on the surface of Mars and used this ability to grasp affordances evident in the images they interacted with. Their ability to do so improved as they were able to successfully anticipate the results of certain actions. As Clancey (2012, p. 113) says:
“[The] scientists were adapting to the rover's sensors and effectors; they think in terms of what they can do on Mars with these ways of looking and manipulating. Insofar as using these interfaces becomes automatic . . . the scientists might find themselves adapting to the machine's capabilities in the kinds of sensing and manipulations they can imagine. . . Once you begin to think in terms of seeing with Mini-TES eyes, you are effectively working as a cyborg.”
Although the ability to “become the rover” through imaginative processes seems mysterious, merely imagining using a tool has been found to lead to changes in the body schema consistent with the incorporation of the tool. For example, Baccarini et al. (2014) examined whether imagining using a mechanical grabber that extends the arm's reach by 40 cm can alter the subsequent characteristics of a person's movements. They compared people's arm movements in a reaching task before and after doing imagination training. This training required that they practice imagining reaching and grabbing a target with a mechanical grabber. They found that participants' wrist movements reached a lower peak velocity compared to the pre-imagery condition, and deceleration peak velocity was also smaller. This is consistent with the claim that the tool, even though just imagined in use, has become incorporated into the body schema as an elongated arm.
2.3 Proto-Presence in the MER Mission
Proto-presence requires a tight coupling between motor responses and sensory feedback provided over elemental timescales. It involves the enactment of M-intentions that provide fine-grained guidance and control. In the case of the MER mission, although scientists requested specific observations and drives, it was engineers who actually programmed the M-intentions sent to the rovers using the Rover Sequencing and Visualization Program (RSVP). The software on the rovers interpreted these instructions and executed them. As a result, scientists did not enact M-intentions and thus did not experience proto-presence.
Moreover, M-intentions were not the sole purview of engineers. The rovers had the capacity to enact some M-intentions on their own and to autonomously monitor their actions for unwanted side-effects. For example, the MER rovers could halt their movement if sensors detected that pitch, roll, or tilt exceeded a particular range (Leger et al., 2005). In addition, the AutoNav software enabled them to plan their own route by using their onboard cameras to autonomously detect obstacles and to select a path around them, formulating M-intentions on the fly. Engineers alternated between using and not using this software depending on the situation because although it significantly increased drive times, it freed them to focus on other tasks (Clancey, 2012). In short, the fact that many M-intentions were formulated and executed by rovers themselves means that the conditions for proto-presence were not generally available for engineers either.
Although OnSight has come to be widely used by team members on the Mars Science Laboratory (MSL) mission, it is doubtful that it is able to provide full proto-presence. This is because the surface that team members walk on is different from the Martian ground displayed through the VR headset, producing sensorimotor inconsistencies that undercut proto-presence (Riva & Waterworth, 2014). Scientists also cannot interact with the objects that they see through the headset in ways that they would be able to if they were physically present on Mars. They cannot kick over rocks or pick them up and feel or taste them, as field geologists do on Earth. Thus, to the extent that OnSight produces proto-presence, it is a significantly curtailed experience.
To conclude, although the reports of scientists and engineers claiming to have a sense of presence on Mars at first appear paradoxical, the conceptual work of researchers in the field of presence can help us make sense of their claims, clarifying in what ways presence is possible for a team working remotely on Mars. In particular, using Riva and Waterworth's (2014) Three-Level model, we argue that the sense of presence in MER is best described as extended-presence and core-presence, not proto-presence. However, because the D-intentions and P-intentions involved were shared intentions, the sense of extended- and core-presence was that of a collective presence through the rover. Our work has drawn heavily on excellent ethnographies of these teams. However, future studies will have to further validate our claims using the various metrics researchers use to assess presence (e.g., Grassini & Laumann, 2020). We hope our discussion can provide the basis for such an investigation.
The first author would like to thank Dr. Alexandra Holloway (JPL) for hosting him and other CSULB Human Factors researchers and students at JPL and demonstrating the OnSight visualization tool used in the MSL mission. He would also like to thank Alice Winter (JPL) and Parker Abercrombie (JPL) for sharing their research on this tool.