Abstract

Anthropomimetic robotics differs from conventional approaches by capitalizing on the replication of the inner structures of the human body, such as muscles, tendons, bones, and joints. Here we present our results of more than three years of research in constructing, simulating, and, most importantly, controlling anthropomimetic robots. We manufactured four physical torsos, each more complex than its predecessor, and developed the tools required to simulate their behavior. Furthermore, six different control approaches, inspired by classical control theory, machine learning, and neuroscience, were developed and evaluated via these simulations or in small-scale setups. While the obtained results are encouraging, we are aware that we have barely exploited the potential of the anthropomimetic design so far. But, with the tools developed, we are confident that this novel approach will contribute to our understanding of morphological computation and human motor control in the future.

1 Introduction

In the last 20 years it has become clear that a full understanding of what it is to be human must be rooted in the interplay between two products of coevolution: our human embodiment, and our human brain [45]. One of the most important manifestations of this interplay is our ability to move our body: We do so with accuracy and effectiveness, both in navigating through the world, and in interacting with objects. However, in order to understand how the body is controlled by the brain, we would need to understand both the nature of the body as a physical and mechanical system, and the nature of the brain as a controller [25]. We believe that the current state of knowledge of the brain as a controller is progressing rapidly, but does not yet provide an adequate foundation for such a quest (see, e.g., [13] and [18]). In the work described here, we have therefore mainly focused on the nature of the human body, and how that nature influences and facilitates the ways in which its movements can, and perhaps must, be controlled [44]. Out of necessity, we have employed the synthetic methodology “understanding by building” [59]; as well as providing insights into the essential character of the human body, this also opens a route in the long term to the construction of artifacts that will be capable of moving and acting in ways that are truly and distinctively human.

The study of artificial systems has already significantly contributed to our understanding of human motor control. For instance, industrial manipulators have been used to study the problems associated with the planning and execution of human limb trajectories [26]. However, robotic manipulators are very different from humans. First, the use of high-precision torque actuators located within joints is an oversimplification of the human muscular system, where even single joint movements require the phased excitation and relaxation of multiple muscles. Second, robotic manipulators are traditionally composed of rigid links attached to single-degree-of-freedom revolute or prismatic joints [11], a characteristic that facilitates control and improves manipulation performance. However, the human body is neither stiff nor made exclusively of revolute joints. These missing properties pose severe limitations on the aspects of motor control that can be investigated by using industrial robots and on the conclusions that can be drawn from the experimental results. Therefore, more detailed replicas of the human motor apparatus are a requisite for the effective investigation of human motor control issues in artificial systems—in accordance with the concept of embodiment [45].

Some progress toward this goal has been made by the many humanoid robots that have been developed around the world, such as Asimo [50] or iCub [55], which can deliver impressive performance on the tasks that they are designed for (e.g., [8]). So far, however, they have failed to generate a significant stream of new findings in the area of human motor control. We believe that this is due to the fact that these anthropomorphic robots focus on replicating the appearance of humans, while they still rely on traditional methods of sensing, actuation, and control. It is therefore our strong conviction that the understanding of human motor control can only advance by a radically different approach, namely anthropomimetic design.

Anthropomimetic designs differ from conventional humanoid robots not only in imitating the morphological appearance of humans, but also in capitalizing on the replication of the inner structures of the human body, such as equivalents of compliant muscles, tendons, bones, and joints. In principle, this approach is not entirely new, as demonstrated by the Japanese musculoskeletal robots Kenta, Kotaro, and Kojiro—all developed by the University of Tokyo [27, 39, 40]. These legged robots, which are ≈140 cm in size, are highly complex structures that provide many features that are relevant for the investigation of human motor control strategies and for aspects of morphological computation, such as compliant, tendon-driven actuators and complex joint types. However, due to this complexity, significant results on motor control have yet to appear from research on these robots.

Therefore, the first problem we had to solve was “How could the complex musculoskeletal structure of the human body be replicated and manufactured?” Ideally, we would have aimed at a full-body walking robot, similar to Kojiro, Kotaro, and Kenta. But we tempered ambition with realism, and confined ourselves to anthropomimetic torsos. In addition, we chose an iterative building approach to evaluate different design choices and to gradually increase the complexity of the musculoskeletal structure. The second question that immediately followed was “Is it possible to control such a nonstandard compliant multi-degree-of-freedom structure, and if so, by what means?” This question is a topic of ongoing discussion in biology and neuroscience, where two opposing theories have emerged: (i) the equilibrium-point hypothesis [20] and (ii) the internal-model hypothesis [33]. However, limiting ourselves to these theories would be too restrictive, and so multiple approaches, inspired by both classical control theory and neuroscience, were implemented and tested on the anthropomimetic robot platforms. Furthermore, to support the efficient evaluation of these approaches, a physics-based simulation toolbox was developed that is capable of simulating the complex dynamics of this class of tendon-driven robots.

This article summarizes the results of more than three years of robot and simulation model development as well as of research into the control issues of anthropomimetic robots. The problems associated with robot construction and manufacturing as well as the development of a human-inspired control architecture that can cope with the large number of actuators and sensors in these systems, in both hardware and software, are summarized in Section 2 and Section 3, respectively. Section 4 describes the physics-based simulation models that were used for the offline testing of the motor control approaches presented in Section 5. Finally, conclusions and prospects for future work are presented in Section 6.

2 The Robots

To date, we have developed and manufactured four anthropomimetic torsos of increasing complexity. Ecce-1, which is essentially a refurbished, sensorized Cronos (originally manufactured in 2006 [25]), features 44 tendon-driven actuators and two different types of shoulder designs (see Figure 1a). Its successor, Ecce-2, completed in 2010, introduced a more complex neck and head structure (see Figure 1b), whereas Ecce-3, finished in 2011, served as a test bed for a detailed replica of the spine (see Figure 1c). All improvements from these iterations are combined in the latest prototype, the Ecce-4, which is the robot described in the following sections. It is mounted on a mobile base and features the most detailed replicas of the human spine and shoulder girdle (see Figure 1d). In addition to these four torsos, two small-scale test rigs were designed and manufactured to support the investigation of specific control aspects in a less complex setup (cf., Figures 5a and 9c, discussed later in this article).

Figure 1. 

Overview of the developed and manufactured anthropomimetic torsos. (a) Ecce-1, which is essentially a refurbished, sensorized Cronos from 2006, features 44 muscles and two shoulder design variants. (b) Ecce-2 features a more complex head and neck design but is not sensorized. (c) Ecce-3 introduced a more complex spine structure and tendon force sensors for each actuator. (d) Ecce-4 is the most complex prototype so far. It is fully sensorized and mounted to a mobile base.

2.1 Skeletons

The adult human skeleton consists of ≈206 bones, and replicating this complex skeletal structure by conventional engineering techniques would be extremely challenging and perhaps impossible. Our solution was to employ a hand-crafted manufacturing process and to use a new type of engineering thermoplastic for the fabrication of the bones. This thermoplastic, which is known in the UK as Polymorph and in the U.S. as Friendly Plastic, is technically a caprolactone polymer. While it is like polythene in many ways, it fuses when heated to only 60°C (e.g., by plunging it into hot water) and can be freely hand-molded until it restiffens at ≈30°C. It has a distinctly bonelike appearance when cold, and since it is a true thermoplastic, it can be reheated and remolded as many times as necessary. In practical engineering terms, Polymorph is tough and springy. Its tensile strength is good—Polymorph has the highest tensile strength of all the caprolactones, at 580 kg/cm². It can be further strengthened and stiffened, if necessary, by adding other materials, such as wires, metal rods, or bars.

One anatomical structure that we put particular emphasis on is the shoulder girdle, which extends the range of motions possible at the shoulder joint. In humans, the abduction of the humerus (the upper arm bone) away from the body is restricted to just 90° by contact with the arch of bone that provides the attachment for the deltoid, and all movements beyond this are achieved by a combination of the shoulder joint and the articulation of the scapula on the back of the thorax. The movement of the scapula itself is only restricted by the joint attaching the clavicle to the sternum, and so the scapula has great freedom of movement (e.g., upward as in shrugging the shoulders, or rolling as in raising the arm). The combined actions of the scapula and upper arm serve to keep the shallow cup of the shoulder joint beneath the ball of the humerus and thus stabilize the joint. More complex dynamic interactions also occur, as in the lifting of a heavy weight, where the powerful muscles that roll the scapula are employed to maintain the momentum of the weight once maximal force has been achieved in the shoulder muscles themselves. This complex shoulder girdle structure, with its six degrees of freedom, has been replicated for the Ecce-4.

2.2 Actuators

Human skeletal muscles are endowed with mechanical properties, such as an intrinsic compliance, that contribute significantly to the grace, speed, efficiency, and robustness of human movement dynamics [32]. Today, several technologies exist to mimic these properties for artificial muscles, most of them based on electromagnetic, pneumatic, or hydraulic actuators. However, even though each of these actuation technologies outperforms the human muscle in some aspects, no actuator exists that matches the overall performance of human muscles [43]. For example, pneumatic actuators, such as the McKibben muscle, provide a high force-to-mass ratio and are inherently compliant, but require large compressors, which makes them unsuitable for mobile robots.

For our robots, we used electromagnetic, tendon-driven actuators, usually incorporating a series elastic element in the tendons. Each actuator consists of a DC motor equipped with a gearbox as the active element (corresponding to the muscle) and a length of kite line in series with an optional elastic element that functions as the tendon (see Figure 2). By coiling the kite line around the gearbox shaft, the artificial muscle can be either activated or relaxed, depending on the direction of rotation of the shaft. The motors and elastic elements were selected to match the required power and compliance properties of each individual muscle. Moreover, the attachment points of the actuators were chosen to reflect humanlike movement dynamics.

Figure 2. 

Tendon-driven actuator. Electromagnetic, compliant tendon-driven actuators were used as artificial muscles. Each actuator consists of an elastic element in series with a kite line that, when coiled on the gearbox shaft, activates or relaxes the muscle. For proprioception, each actuator unit was equipped with four types of sensors: (i) an absolute potentiometer mounted to the gearbox shaft, (ii) an incremental encoder on the motor shaft, (iii) a current sensor within the control electronics, and (iv) a force sensor attached to the distal attachment point of the muscle.

2.3 Sensors

In humans, two types of mechanoreceptors provide feedback for proprioception—the sense of static positions and movements of the limbs of the body [32]. The first type, the muscle mechanoreceptors, provide feedback for muscle properties, such as muscle length, speed, or stretch. These were imitated by equipping each tendon-driven actuator with: (i) a force sensor mounted to the distal attachment point of the muscle for stretch sensing, (ii) a current sensor within the electronic control units (ECUs) for motor control, (iii) an incremental encoder to provide a measure for the length of the kite line, and (iv) an absolute potentiometer that was used to avoid a calibration routine for the incremental encoders (see Figure 2).

The second type of receptors, the skeletal mechanoreceptors, provide joint angle feedback. Though these sensors are easily imitated for the revolute joints of industrial manipulators, they are difficult to replicate for the complex joints of anthropomimetic robot systems [56]. Therefore, these sensors were omitted, but an approximation of the body pose can be backcomputed by fusing the sensory information from the various actuator sensors.

3 Control and Software Architecture

To cope with the large number of actuators and sensors in the torsos, a novel control and software architecture was developed. Here, inspiration was again drawn from biology, where the neuronal circuits of the spinal cord, brain stem, and forebrain form a hierarchical control architecture [32]. This hierarchy was emulated using distributed ECUs for the low-level actuator feedback loops (see Section 5.1); these are controlled by a central computer via a controller area network (CAN; see Figure 3 and [31]). Each ECU is capable of controlling two tendon-driven actuators and is equipped with a microcontroller, a CAN interface, motor drivers for two brushed DC motors, several A/D converters for analog sensor connection, and two integrated Hall-effect-based measurement devices in the motor loop for motor current feedback. To achieve a high degree of robustness, each ECU was implemented as a fail-silent unit [48] with a firmware based on a finite state machine (FSM).

Figure 3. 

Control architecture. A human-inspired control architecture (see left panel; taken from [32]) was developed to cope with the large number of actuators and sensors of the anthropomimetic torsos. In this architecture, multiple distributed electronic control units (ECUs) are mounted in the torsos for low-level actuator control.

This control architecture was complemented by a comprehensive and modular software framework, termed the Eccerobot Research Platform (ERP; see Figure 4). The ERP was implemented in C++ and built on the robotic middleware package OpenRTM, which provides state-driven robot-technology components (RTCs) and uses the common object request broker architecture (CORBA) for inter-RTC communication [2]. To form the core of the framework, multiple RTCs have been implemented to provide a plant abstraction layer for the robot-independent execution of actuator commands and sensor data acquisition. This component kernel is dynamically configurable through Extensible Markup Language (XML) files to support the various prototypes with different sensor and actuator configurations. Furthermore, high-level components based on Qt are built on this kernel to provide tools for real-time data analysis, robot debugging, or robot control.

Figure 4. 

Eccerobot research platform (ERP). The ERP was implemented in C++ and built on the robot middleware OpenRTM [2]. The core of the ERP is the plant abstraction layer, which provides standardized interfaces for executing actuator commands and acquiring sensor readings. Furthermore, the simulation environment Caliper was developed for simulating the dynamics of tendon-driven robots, and this was integrated into the ERP.

In addition, a physics-based simulator was developed and integrated into the ERP for the simulation of the dynamics of tendon-driven robots. This simulator, called Caliper [63], is capable of importing and exporting simulation models using the open-standard XML schema Collaborative Design Activity (Collada) [3]. It relies on the physics-based simulation engine Bullet Physics [10], originally developed for computer games, to simulate the dynamics of the robots (see Section 4).

4 Physics-Based Modeling

Simulating anthropomimetic robots using analytical modeling techniques is not feasible, due to the sheer complexity of the musculoskeletal structure that features spherical joints, muscles with pose-dependent lever arms, muscle-skeleton collisions, and even multi-articular muscles (muscles spanning more than one joint). Therefore, we used Bullet Physics [10] for the simulation of the skeleton (see Section 4.1) and implemented custom extensions to model the tendon-driven, compliant actuators (see Section 4.2). Bullet Physics, which is particularly optimized for performance, also offers the advantage of simulating the robot models in real time—a property that also enables entirely new prospective uses for the simulation, for example, as an internal model for prediction and control. Moreover, basic sensor models were implemented to provide the sensor modalities for the evaluation of the control approaches on a virtual robot, similar to the physical setups (see Section 4.3). Finally, an automated, steady-state pose calibration algorithm was developed, based on a (μ, λ) evolution strategy, to reduce the simulation-reality gap of the models (see Section 4.4).

4.1 Skeleton Modeling

Due to the hand-crafted manufacturing of the torsos, the impossibility of dismantling and reassembling individual components, and the high intrinsic complexity of the skeletons, the derivation of accurate skeleton models is an inherently difficult task. Hence, we tackled this problem by a three-step approach (see Figure 5a and [62]). In the first step we used true-to-scale photographs to define a preliminary 3D approximation of the skeleton structure. This step, however, suffered from two major drawbacks: (i) attachments, such as DC motors and potentiometers, were difficult to accurately position and orient in 3D, and (ii) there were size inaccuracies due to parallax error. In the second step, we employed laser scans of the robots to refine the 3D model of the first step, and a satisfactory approximation of the hand-crafted skeletons was eventually obtained by merging the rough model of the first step with the scan data.

Figure 5. 

Physics-based robot model and model calibration results. (a) Registered laser scans, static 3D, and dynamic simulation model of the test rig shown in the top left corner. (b) Convergence of the fitness of the calibration of the robot model shown in (a) using evolution strategies. (c) Pose errors for each calibration and validation pose of the calibration run shown in (b).

In the last step, the initial static 3D model of the skeleton was converted into a dynamic model for the Caliper simulation framework (see Section 3). First, collision shapes were assigned to each bone. To ensure good simulation performance, complex meshed shapes were avoided, and shape primitives, such as cylinders, were used whenever possible. Second, masses were estimated based on the volume of the bone and the density of Polymorph, and the corresponding inertia tensors were computed from the assigned shapes and masses. Finally, holonomic constraints with custom viscous and Coulomb friction extensions were defined and integrated into the models to simulate the joints of the robot.
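
As an illustration of this last step, the mass and inertia assignment for a cylindrical collision primitive might look as follows. This is a sketch only: the density used here is a rough literature value for polycaprolactone, not a measured property of the robots' Polymorph parts.

```cpp
#include <cassert>
#include <cmath>

// Rough literature density for polycaprolactone (Polymorph), ~1150 kg/m^3.
// Illustrative only; not a measured value.
constexpr double kPolymorphDensity = 1150.0;
constexpr double kPi = 3.14159265358979323846;

struct BodyInertia {
    double mass;          // [kg]
    double ixx, iyy, izz; // principal moments of inertia [kg m^2]
};

// Mass and inertia tensor of a solid cylinder primitive (symmetry axis = z),
// estimated from its volume and an assumed uniform density.
BodyInertia cylinderInertia(double radius, double height) {
    const double volume = kPi * radius * radius * height;
    const double m = kPolymorphDensity * volume;
    return {
        m,
        m * (3.0 * radius * radius + height * height) / 12.0, // ixx (transverse)
        m * (3.0 * radius * radius + height * height) / 12.0, // iyy (transverse)
        0.5 * m * radius * radius                             // izz (symmetry axis)
    };
}
```

For non-primitive shapes, Bullet Physics can compute the local inertia from the collision shape directly; the density-based mass estimate stays the same.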

4.2 Actuator Modeling

Models of the tendon-driven actuators (see Section 2.2) were derived by standard system identification techniques and implemented as custom extensions for the Bullet Physics engine [10]. The simulation of an actuator in one simulation step (from time t to t + Δt) proceeds as follows [62]: First, the force F exerted by the tendon is computed using a linear-spring damper model given by
F = k·Δx + D·Δẋ

Δx = LT − LK − LR
where k and D are the spring and damping coefficients, respectively, and Δx is the spring extension computed from the length of the kite line (LK), the resting length of the spring (LR), and the total length of the tendon (LT, from attachment point to attachment point—see Figure 2). As tendon-driven actuators can only pull and not push, F is set to 0 if negative. Subsequently, F is applied to the rigid bodies at the two tendon attachment points and set as the load input for the combined state-space model of the DC motor and gearbox, along with the input voltage V and the Coulomb friction torque τCtot. As long as the input torque produced by the tendon is less than the output torque of the motor, this model is given by
L·(di/dt) = V − R·i − KE·N·ωG

J·(dωG/dt) = η·N·KT·i − μ·ωG − r·F − τCtot

with

J = N²·JM + JG,  μ = N²·μM + μG

and

τCtot = N·τCM + τCG
where i is the motor current, R and L are the armature resistance and inductance, respectively, KE is the back-EMF constant, KT is the torque constant, r is the gearbox output shaft radius, ωG is the angular velocity of the gearbox shaft, η is the gearbox efficiency, N is the gearbox ratio, JM and JG are the motor and gearbox inertias, μM and μG are the motor and gearbox viscous friction constants, and τCM and τCG are the constant motor and gearbox Coulomb friction torques. In the last step, this state-space model is numerically integrated to compute ωG and to change LK for the next time step in accordance with
LK(t + Δt) = LK(t) − r·ωG·Δt

For the tendon length LT, we evaluated two simulation models: (i) a virtual tendon, where LT is equal to the Euclidean distance between the two attachment points, and (ii) a physical tendon simulated using soft-body dynamics, where LT is the sum of the Euclidean distances between consecutive soft-body nodes. The virtual tendon implementation suffered from two main problems. First, no collisions between the skeleton and the tendon were considered, resulting in erroneous force vectors for some configurations. Second, stiff actuators—actuators that are not equipped with the optional elastic element—can only be approximated using high spring constants. The physical tendon implementation can cope with these problems, but introduces new ones due to the high computational load and occasional simulation instabilities resulting from the collisions between the tendon and the skeleton. We therefore used the virtual tendon implementation and extended this model with spherical pulleys to compensate for the lack of tendon-skeleton collisions. These pulleys approximate the effects of a tendon colliding with parts of the skeleton by introducing spherical wrapping surfaces for the tendons.
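
A minimal sketch of one virtual-tendon simulation step follows, assuming explicit Euler integration. All numerical parameters are illustrative placeholders, not the identified values of the physical actuators, and the pulley wrapping, gearbox Coulomb friction, and the tendon-versus-motor torque case distinction are omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Linear spring-damper tendon model. Tendons can only pull, so negative
// forces are clamped to zero (slack tendon).
double tendonForce(double LT, double LK, double LR,
                   double dLT, double dLK, double k, double D) {
    const double dx    = LT - LK - LR;  // spring extension
    const double dxDot = dLT - dLK;     // extension rate
    return std::max(0.0, k * dx + D * dxDot);
}

// States of the lumped DC-motor/gearbox model: armature current i, gearbox
// angular velocity wG, and free kite-line length LK.
struct ActuatorState { double i, wG, LK; };

// One explicit-Euler simulation step from t to t + dt.
void actuatorStep(ActuatorState& s, double V, double F, double dt) {
    // Placeholder parameters (illustrative only).
    const double R = 1.0, L = 1e-3;    // armature resistance [Ohm], inductance [H]
    const double KE = 0.01, KT = 0.01; // back-EMF and torque constants
    const double N = 100.0, eta = 0.8; // gearbox ratio and efficiency
    const double J = 1e-4, mu = 1e-4;  // lumped inertia and viscous friction
    const double r = 0.005;            // spindle radius [m]

    const double diDt = (V - R * s.i - KE * N * s.wG) / L;
    const double dwDt = (eta * N * KT * s.i - mu * s.wG - r * F) / J;
    s.i  += diDt * dt;
    s.wG += dwDt * dt;
    s.LK -= r * s.wG * dt;  // coiling line onto the spindle shortens LK
}
```

In the actual simulator, F is additionally applied as an external force to the two rigid bodies holding the tendon attachment points before the motor states are advanced.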

4.3 Sensor Modeling

As listed in Section 2.3, each actuator is equipped with four types of sensors for proprioception. These sensors were modeled for the Caliper simulator to provide the required sensory feedback for the development and testing of controllers (see Figure 6a and [62]).

Figure 6. 

Kite line length and tendon force control. (a) Example of the kite line length control mode during an elbow joint movement (in simulation). (b) Step-response example of the force control mode (on the physical robot), showing impressive control performance, with a rise time of 318 ms for a step from 2 to 40 N, and a steady-state RMS deviation of 0.236 N at 40 N.

The absolute potentiometers and incremental encoders were merged and modeled by a “perfect” potentiometer model. Provided there is an initial mapping between LK and the potentiometer value, the current potentiometer value SP in degrees is equal to
SP = SP0 + (LK0 − LK)/r · (180/π)
where SP0 is the initial potentiometer position corresponding to the initial kite line length LK0. The force sensors were modeled by providing access to the tendon force F (see Equation 1), whereas the motor current i is extracted from the motor model (see Equation 3) to emulate the Hall-effect-based current sensors included in the motor loop circuits. In addition to these three proprioceptive sensor modalities, we also implemented sensors that are not part of the physical robots, such as joint angle sensors, to investigate their relevance for control.
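
The potentiometer model can be sketched as follows. The sign convention assumed here (winding line in, i.e., decreasing LK, increases the shaft angle) is an assumption of this sketch, as is the idealized constant spindle radius.

```cpp
#include <cassert>
#include <cmath>

// "Perfect" potentiometer model: maps the current kite-line length LK to an
// absolute shaft angle in degrees, given one calibration pair (LK0, SP0).
// Assumes line length maps linearly to shaft rotation via a constant
// spindle radius r; the sign convention is an assumption of this sketch.
double potentiometerDeg(double LK, double LK0, double SP0, double r) {
    const double kRadToDeg = 180.0 / 3.14159265358979323846;
    return SP0 + (LK0 - LK) / r * kRadToDeg;
}
```

Given this mapping, the simulated encoder and potentiometer readings stay mutually consistent by construction, which is exactly why the two sensors could be merged into one model.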

4.4 Model Calibration

Even though the derived models of the individual subcomponents of the robot, such as motors or springs, were identified and validated against their real counterparts, preliminary simulations of the full robot models exhibited a significant simulation-reality gap. We hypothesized that this is not due to the physics-based simulation approach per se, but rather due to: (i) the hand-crafted nature of the robot, which complicated the derivation of an accurate skeleton model, and (ii) the modeling simplifications that were necessary to simulate the complex dynamics of the tendon-driven muscles (as outlined in the previous sections). To evaluate this conjecture, we developed an automated, steady-state pose calibration algorithm based on a (μ, λ) evolution strategy (ES) [61]. For the acquisition of the poses of the physical robot, a stereo-vision, infrared-marker-based motion capture system with real-time capabilities was developed, and a Gaussian-based, non-isotropic, self-adapting mutation operator was used by the ES to explore the search space [6].

For the calibration, we first recorded synchronized sensor data of various steady-state poses of the robot (kite line length of all muscles and corresponding joint angles) and evaluated the individuals of the ES by setting the recorded kite line length as the control reference for each muscle (see Section 5.1) and using the resulting joint angle deviations as the fitness measure. After the calibration, we assessed the quality of the calibration by analyzing the joint errors of the calibrated model for a second, disjoint set of validation poses created by linear interpolation of the kite line length reference values for two neighboring calibration poses.

The results of the calibration and validation of the anthropomimetic robot arm shown in Figure 5a with 11 muscles, a spherical shoulder, and a revolute elbow joint are presented in Figure 5b and Figure 5c, respectively. For these results, the following model parameters were selected as object parameters subject to the mutation operator of the ES: (i) the two muscle attachment points (both ∈ ℝ³), (ii) the initial kite line length LK0, (iii) the spring constant k of the linear spring-damper model, and (iv) the gearbox shaft spindle radius r, which is considered to be constant in the simulation but varies stochastically in reality due to the coiled kite line. These nine parameters were initialized for each actuator, based on laser scan data and manual measurements, resulting in a total of 99 object parameters. The parent and offspring population sizes were set to μ = 6 and λ = 36, respectively, to achieve a ratio of 1/6, which has been shown to provide a maximum rate of convergence [23]. At generation 0, the average error between the physical and the simulated robot for all 15 calibration poses was equal to 32.9° for the elbow and 35.7° for the shoulder joint. However, the ES converged rapidly, and final average errors of 1.9° and 2.5° for the elbow and shoulder joints were achieved for the calibration poses after the full run of 250 generations. Similar deviations were observed for the best individual of the calibration phase and the disjoint set of validation poses. Here, final average errors of 1.9° and 2.0° were achieved for the elbow and shoulder joints, respectively.
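
The calibration loop can be sketched as a generic (μ, λ)-ES with per-dimension log-normal step-size self-adaptation. This is a simplified stand-in for the non-isotropic operator of [6]; in the actual setup, the fitness function runs the simulator with the recorded kite-line references and returns the joint-angle deviation, and the initial parameter vector comes from the laser scans.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <random>
#include <vector>

struct Individual {
    std::vector<double> x;      // object parameters
    std::vector<double> sigma;  // per-dimension mutation step sizes
    double fitness = 0.0;       // lower is better (pose deviation)
};

Individual evolve(const std::function<double(const std::vector<double>&)>& fitness,
                  const std::vector<double>& x0, int mu, int lambda, int generations) {
    std::mt19937 rng(42);
    std::normal_distribution<double> gauss(0.0, 1.0);
    const std::size_t n = x0.size();
    // Standard self-adaptation learning rates.
    const double tau  = 1.0 / std::sqrt(2.0 * std::sqrt(static_cast<double>(n)));
    const double tau0 = 1.0 / std::sqrt(2.0 * static_cast<double>(n));

    std::vector<Individual> parents(mu, Individual{x0, std::vector<double>(n, 0.1), 0.0});
    for (int g = 0; g < generations; ++g) {
        std::vector<Individual> offspring;
        offspring.reserve(lambda);
        for (int j = 0; j < lambda; ++j) {
            Individual c = parents[j % mu];
            const double shared = gauss(rng);  // shared log-normal factor
            for (std::size_t d = 0; d < n; ++d) {
                // Non-isotropic Gaussian mutation with self-adapted step sizes.
                c.sigma[d] *= std::exp(tau0 * shared + tau * gauss(rng));
                c.x[d] += c.sigma[d] * gauss(rng);
            }
            c.fitness = fitness(c.x);
            offspring.push_back(std::move(c));
        }
        // (mu, lambda) selection: new parents are drawn from offspring only.
        std::sort(offspring.begin(), offspring.end(),
                  [](const Individual& a, const Individual& b) {
                      return a.fitness < b.fitness;
                  });
        parents.assign(offspring.begin(), offspring.begin() + mu);
    }
    return parents.front();
}
```

Note that comma selection discards the parents each generation, which matches the calibration setting: a simulated pose evaluation is noisy, and re-evaluating survivors avoids keeping lucky individuals.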

5 Motor Control

In this section, we describe approaches to the control of anthropomimetic robots. First, a set of controllers for single tendon-driven actuators is presented in Section 5.1; these are used by the higher-level controllers. Two different approaches to joint space control are described in Sections 5.2 and 5.3. While the former relies exclusively on concepts from classical control theory, the latter introduces machine learning techniques to capture the effects of multi-articular and colliding muscles. The remaining sections deal with different approaches to operational space control, namely the positioning of the end-effector. Section 5.4 employs a developmental framework to learn reaching from sensorimotor interactions. Section 5.5 presents preliminary results on the application of reinforcement learning techniques to this problem in a full-fledged simulation of Ecce-2. Finally, Section 5.6 provides an integrated feedforward-feedback approach based on a black box model.

5.1 Actuator Control

For the control of a single tendon-driven actuator, four control modes were implemented on the ECU microcontrollers and emulated for the physics-based simulation: (i) a voltage control mode, (ii) a proportional (P) kite line length control mode that relies on feedback from the motor encoders (see Figure 6a), (iii) a proportional-integral-derivative (PID) current control mode that exploits the Hall-effect-based current sensors integrated in the ECUs, and (iv) a tendon force controller based on the force sensors integrated into the distal tendon attachment points (see Figure 6b). All these controllers can be parametrized and modulated at run time through higher control layers—reminiscent of the spinal cord and brain-stem circuits in humans [32]. For the control cycles, a fixed frequency of 500 Hz was used. Even though this frequency is high compared to those in biological systems, it is low by engineering standards. Hence, particular emphasis had to be put on the control laws to achieve stable and responsive control, especially for the tendon force control mode. Here, a state space controller was implemented that proved to be superior to conventional proportional-derivative (PD) approaches [30]. It can be designed directly in the discrete domain, taking the control frequency into account at design time: Using the state space actuator model (see Section 4.2), the closed-loop poles are moved to the desired values via Ackermann's formula [1]. Furthermore, to compensate for the steady-state offset of the state space controller due to the missing integrator element, a prefilter was incorporated.
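The pole-placement design described above can be sketched as follows. The two-state actuator model, its discretization, and the chosen pole locations are hypothetical placeholders; only the design procedure (Ackermann's formula plus a steady-state prefilter) follows the text.

```python
import numpy as np

def ackermann_gain(A, B, poles):
    """Discrete-time state feedback gain K via Ackermann's formula:
    K = [0 ... 0 1] * inv(ctrb(A, B)) * phi(A), where phi is the desired
    closed-loop characteristic polynomial evaluated at the system matrix A."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    coeffs = np.poly(poles)  # z^n + a1*z^(n-1) + ... + an
    phi = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
    e_n = np.zeros((1, n)); e_n[0, -1] = 1.0
    return e_n @ np.linalg.inv(ctrb) @ phi

# Hypothetical two-state actuator model, discretized at 500 Hz (T = 2 ms).
A = np.array([[1.0, 0.002], [0.0, 0.98]])
B = np.array([[0.0], [0.002]])
C = np.array([[1.0, 0.0]])
K = ackermann_gain(A, B, poles=[0.7, 0.75])
A_cl = A - B @ K
# Prefilter: scales the reference so the steady-state output matches it,
# compensating for the offset of pure state feedback.
N = 1.0 / float(C @ np.linalg.inv(np.eye(2) - A_cl) @ B)
```

Because the poles are placed directly in the z-plane, the 500-Hz sampling is accounted for at design time rather than approximated afterward.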

5.2 Puller-Follower Control

In this section we investigate the application of classical control methods to the control of hinge joints, each actuated by a pair of compliant agonist-antagonist actuators. The advantages of humanlike joint structures have been identified in a number of investigations [41, 57, 58], but definite answers to the control problem are yet to be found. Very few results have been presented on the control of antagonistically coupled compliant drives [28, 37]. In this section we focus on energy efficiency and suggest a new control approach inspired by patterns of electromyography signals measured in antagonistically coupled human muscles [21].

The suggested puller-follower approach is inspired by the asymmetric conditioning of human muscles: The agonist actuator, the puller, takes responsibility for the joint position as the main control requirement, whereas the antagonistic actuator, the follower, keeps its tendon stretched (thus preventing slackening) by maintaining the tension at an appropriately low level. This prescribed value of the antagonistic tendon force is called the reference tension force. For the system to be energy efficient, the reference tension force has to be low, yet sufficiently high to prevent slackening of the tendons. Motions with variable speeds require the actuators to exchange roles (the puller becomes the follower and vice versa). This switching is accompanied by oscillations in the tendon forces (switching shock), which can cause slackening. It has been proved that there exists an optimal (relatively high) level of the follower tension that minimizes this shock [47]. However, this high reference tension force would reduce the energy efficiency of the system. Thus, we suggest an adaptive reference tension, which is low during regular motion and is increased during the periods of switching (see Figure 7a). To control the joint position as well as the antagonist tension force, we use input-output feedback linearization [42] through a two-step compensator. Satisfactory results in controlling a single joint were achieved.
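A minimal sketch of the role assignment and adaptive reference tension might look as follows. The force levels, decay rate, and spindle radius are invented for illustration and do not correspond to the optimal values derived in [47].

```python
import math

def puller_follower_refs(tau_des, t_since_switch, f_low=2.0, f_switch=10.0,
                         decay=5.0, radius=0.01):
    """Reference tendon forces (N) for one antagonistic pair.

    tau_des: desired joint torque (N m); its sign selects the puller.
    t_since_switch: time (s) since the last role switch. The follower's
    reference tension decays from f_switch back to f_low, damping the
    switching shock while keeping the steady state energy efficient.
    """
    f_follow = f_low + (f_switch - f_low) * math.exp(-decay * t_since_switch)
    f_pull = max(abs(tau_des) / radius, f_follow)  # puller realizes the torque
    if tau_des >= 0:
        return f_pull, f_follow  # (agonist, antagonist)
    return f_follow, f_pull
```

The exponential decay captures the essential compromise: a high follower tension right after a role switch to suppress oscillations, and a low one in steady state to save energy.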

Figure 7. 

Examples of the puller-follower control concept. Two motors always have opposing roles: The agonist (puller) controls the joint position, while the antagonist (follower) controls the tension in the opposing tendon to prevent it from going slack. (a) At the moment of switching, an adaptive reference tension is applied to the follower to minimize oscillations in the transient phase and to optimize the energy efficiency in the steady state. (b) Trajectory tracking of the puller-follower concept applied to a multi-joint system, comprising two revolute joints. At ≈1 s an external disturbance, grasping a 0.2-kg object, occurs.

We have also evaluated the puller-follower concept on a multi-joint system with seven DOF, each actuated by nonlinear compliant antagonistically coupled drives [46]. The results revealed problems due to the dynamic coupling between joints, the variable effective joint inertia (depending on the entire mechanism position), and the influence of gravity. To overcome these difficulties we applied the H-infinity loop shaping method—a combination of loop shaping and robust stabilization [36]—and online gravity compensation based on a static model [15]. After these modifications, satisfactory results were obtained (see Figure 7b). However, one should be aware of serious limitations of the puller-follower approach, such as the lack of support of spherical joints or multi-articular muscles.

5.3 Computed Force Control

Classical control theory provides a unified and flexible framework for the control of industrial robots. Computed force control aims to exploit this framework by extending the concept of computed torque control to tendon-driven actuators.

In computed torque control, a dynamic model of the robot is used to compute the joint torques τ from reference joint accelerations q̈ref in accordance with [11]:
τ = H(q)q̈ref + C(q, q̇) + τG(q),    (6)
where H is the mass matrix representing the inertial effects, C is a vector of centrifugal and Coriolis terms, and τG is a vector of gravity terms. If we presume that the skeletal structure of anthropomimetic robots is stiff, this model can be used as an internal dynamics model for predicting and compensating for dynamic effects—reminiscent of the concept of internal forward models in human motor control [33]. However, as our robots are equipped with tendon-driven actuators that apply forces to the limbs at the attachment points, a transformation from the joint torque to the tendon force domain is required to utilize this approach for the class of anthropomimetic robots.
This transformation is provided by the muscle Jacobian L that can be derived by partially differentiating the geometric mapping between tendon lengths l and joint angles q with respect to q [65]:
L(q) = ∂l(q)/∂q.    (7)
Essentially, the muscle Jacobian describes the change of the muscle lengths with respect to the joint angles. The principle of virtual work can then be used to derive the missing transformation between the muscle forces f and the joint torques τ [17]:
τ = −L(q)ᵀ f.    (8)

However, the analytical derivation of the muscle Jacobian for our anthropomimetic robots is not possible, because of the absence of CAD data and because of the presence of tendon-skeleton collisions. Therefore, we decided to approximate it by (i) utilizing machine learning techniques to learn the geometric mapping between the tendon lengths l and the joint angles q, and (ii) differentiating this geometric mapping using the difference quotient [29]. As samples can be drawn from the entire joint space, the function approximator need only support interpolation between samples, without the need for extrapolation. Well suited for this kind of problem are artificial neural networks (ANNs), which can be trained in a supervised fashion using the well-known backpropagation algorithm [60]. The main advantage of this machine learning approach is that the muscle Jacobian can be learned directly on the physical robot (e.g., by supervised movement of the limbs and backcomputation of the tendon lengths) and that complex musculoskeletal interactions, such as multi-articular actuators, are automatically captured. The only limitation is that the geometric mapping of the actuator lengths has to be unique and continuously differentiable with respect to the joint angles.
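The difference-quotient approximation of the muscle Jacobian can be sketched as follows. A smooth toy mapping of two tendon lengths over two joint angles stands in for the trained ANN, and the sign convention of Equation 8 (muscles pull, so lengthening gradients enter with a minus sign) is assumed.

```python
import numpy as np

# Stand-in for the learned geometric mapping l(q): in the text this is an ANN
# trained on (q, l) samples; a toy analytic model keeps the sketch self-contained.
def tendon_lengths(q):
    q1, q2 = q
    return np.array([0.30 - 0.02 * q1 - 0.01 * q2,  # biarticular tendon
                     0.25 + 0.02 * q1])

def muscle_jacobian(l_of_q, q, eps=1e-5):
    """Approximate L = dl/dq column by column with central difference quotients."""
    q = np.asarray(q, dtype=float)
    m, n = l_of_q(q).size, q.size
    L = np.zeros((m, n))
    for j in range(n):
        dq = np.zeros(n); dq[j] = eps
        L[:, j] = (l_of_q(q + dq) - l_of_q(q - dq)) / (2 * eps)
    return L

L = muscle_jacobian(tendon_lengths, [0.3, 0.5])
f = np.array([8.0, 5.0])  # tendon forces
tau = -L.T @ f            # joint torques via the principle of virtual work
```

Because the Jacobian is evaluated numerically on the learned mapping, multi-articular couplings (here, the first tendon spanning both joints) are captured without any analytical model.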

This computed force control approach was implemented and tested for the simulation model of our shoulder prototype (see Figure 5a). Samples for training the ANN were obtained by (i) moving the simulated joints to different joint angles and (ii) performing random movements through motor babbling. In summary, a control cycle proceeds as follows (see Figure 8a): First, reference joint accelerations q̈ref are computed from the reference joint angles qref, using a standard PID controller. Subsequently, Equation 6 is used to compute the joint torques τ required to reach these accelerations. In the next step, the machine-learned muscle Jacobian is utilized to transform the joint torques τ into the corresponding actuator forces f. However, as Equation 8 is underdetermined if solved for f, a quadratic optimization method was used to resolve this redundancy [16]. As objective function, we chose to use the sum of squares of the Euclidean distances between all actuator forces, which minimizes the sum of internal forces. However, different objective functions like minimum energy or minimum muscle activation could be defined, which would result in different system dynamics. This is one of the advantages of the computed force control approach: As it relies on explicitly formulated system models, these models can be exploited in different ways to obtain a behavior that possibly matches human movements. The optimization routine can also be used to prevent tendons from going slack, simply by defining a minimum actuator tension as a constraint. Finally, the "optimized" forces f are set as new references for the force controllers of the 11 actuators (see Section 5.1). An example of joint trajectory tracking with this approach is shown in Figure 8b. The achieved tracking of the reference angles is accurate, both for single-joint movements (see markers 1, 2, and 3) and for multi-joint movements (markers 4 and 5).
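The redundancy resolution step can be sketched with a generic quadratic program. The three-tendon, one-joint geometry and the SciPy solver are illustrative choices, and the objective (sum of squared forces) is a simplified stand-in for the objective used in the text.

```python
import numpy as np
from scipy.optimize import minimize

def distribute_forces(L, tau, f_min=1.0):
    """Resolve the tendon-force redundancy of Equation 8: minimize the sum of
    squared forces subject to -L^T f = tau and f >= f_min (tendons stay taut)."""
    m = L.shape[0]  # number of tendons
    res = minimize(lambda f: f @ f,
                   x0=np.full(m, f_min),
                   constraints={'type': 'eq', 'fun': lambda f: -L.T @ f - tau},
                   bounds=[(f_min, None)] * m,
                   method='SLSQP')
    return res.x

# Hypothetical 1-DOF joint with three tendons (two flexors, one extensor);
# the entries of L are the moment arms dl/dq.
L = np.array([[-0.02], [-0.015], [0.02]])
tau = np.array([0.1])
f = distribute_forces(L, tau)
```

The bound constraint implements the anti-slackening measure mentioned above: the extensor is held at its minimum tension while the two flexors share the torque in proportion to their moment arms.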

Figure 8. 

Computed force control. (a) The control scheme uses an analytical model of the skeleton, along with the muscle Jacobian, to calculate reference forces for the individual actuator force controller. (b) The resulting joint angles for trajectory control are given as follows. First only the revolute elbow joint is moved (1–2), and subsequently the spherical shoulder joint (3). Finally, all joints are moved simultaneously (4–5). We observe accurate tracking in all joint angles, even though the joint coordinates are tightly coupled by the biarticular actuator as well as by the system dynamics of the skeleton.

We are currently preparing to test this computed force control approach on a physical robot. However, as the approach requires joint angle and velocity feedback, we first have to solve the problem of measuring these sensor modalities for the spherical shoulder joint. For this, approaches based on extrinsic sensors, such as the motion capture system used in Section 4.4, or on accurate system models to backcompute the joint angles from the proprioceptive sensors of the actuators, are promising alternatives.

5.4 Sensorimotor Control

The sensorimotor approach to control aims to include principles from developmental biology in the design of control architectures for artificial systems. In this section we describe the sensorimotor framework developed for learning to reach in a reduced motor space (for details, see [35]). Our framework includes two learning processes: an unsupervised learning process (ULP) and a supervised learning process (SLP). The former exploits the passive dynamics of our compliant robots to build a basic set of motor synergies based on Hebbian learning principles; the latter uses the motor synergies to learn reaching in a reduced motor space (see also [51, 54]).

The schematic diagram of the sensorimotor framework with the ULP and SLP is shown in Figure 9a. To describe the sequence of events in each process, we will use the numbering system in Figure 9a. The ULP works as follows. In U1, spontaneous motor twitches (SMTs) produce independent contractions in individual actuators. This type of actuator activity has been inspired by observations of human behavior during sleep, from early stages of development to adulthood [7, 49]. In U2, the SMTs produce forces, which are propagated through the musculoskeletal system (as well as through the environment where it is embedded). In U3, the changes produced in the musculoskeletal system are captured by the different body sensors, which (in U4) convert them into sensor activity. In U5, the correlation between sensor and motor activity is used to learn sensorimotor mappings, which form the basic motor synergies. A variant of this scheme has also been used to self-organize spinal reflexes [34].
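Steps U1–U5 can be condensed into a few lines. Here a toy linear plant replaces the pendulum, the mixing matrix, twitch statistics, and learning rate are invented, and the correlation of random twitches with sensor responses self-organizes the synergy matrix Q.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear plant standing in for the pendulum: four motor twitches
# map to 2D sensor velocities through an unknown mixing matrix W_true.
W_true = np.array([[1.0,  0.8, -1.0, -0.9],
                   [0.9, -1.0,  0.8, -1.0]])

Q = np.zeros((2, 4))   # sensor x motor synergy matrix, learned from scratch
eta = 0.01             # Hebbian learning rate
for _ in range(2000):
    m = np.zeros(4)
    m[rng.integers(4)] = rng.random()               # spontaneous motor twitch (U1)
    s = W_true @ m + 0.01 * rng.standard_normal(2)  # propagated response (U2-U4)
    Q += eta * np.outer(s, m)                       # correlation-based update (U5)
# The sign pattern of Q now encodes which muscles drive which sensor dimension.
```

No supervision is involved: the structure of Q emerges purely from the co-occurrence of motor and sensor activity, mirroring how the ULP exploits the passive dynamics of the body.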

Figure 9. 

Sensorimotor control framework. (a) Ui and Sj are steps i and j of the ULP and SLP, respectively; M is a column vector with the motor signals; S is a column vector with the derivatives of the x and y positions taken at every time step; PM and PS are physical processes involved in producing muscle forces and extracting sensor information, respectively; Q is a 2 × 4 matrix (each sensor against each motor signal) containing the self-organized motor synergies; S∗ is a column vector with the goal information (i.e., the difference between the target position and the current position of the pendulum in 2D); α is a column vector containing the modulations of the goal parameters; ϕ contains the modulated goal parameters; ⊗ means elementwise multiplication; S′ is a column vector containing the difference between the final and the initial position of the pendulum at every iteration of the SLP; and e is the error signal between S′ and S∗, which is used to modify the α parameters at every iteration. (b) The connectivity matrix Q obtained (the matrix has been transposed for visualization purposes). (c) A schematic of the pendulum platform with four muscles M1–4 and the x and y axes from a camera placed under the pendulum. (d) The reaching error obtained as a function of the SLP iteration number (see text).

Once the motor synergies are obtained, they are used by the SLP. The SLP is an iterative process that aims to identify the goal modulation parameters required to achieve a given goal; it works as follows. In S1, a goal is set in the system; in our case this is a 2D reaching goal. In S2, the parameters of the goal signals are modulated (i.e., scaled). These are the only parameters being learned during the SLP. In S3, the modulated goal parameters activate the motor synergies, which combine the activations of the individual actuators. In S4, the activation of the different motors causes the musculoskeleton to move, which (in S5) induces sensory stimulation. In S6, the novel sensor values are evaluated with respect to the goal. And in S7, the modulation gains are changed and the process is iterated.

For these experiments, we used a purpose-built physical pendulum platform, consisting of a pendulum actuated by four tendon-driven actuators as shown in Figure 9c. Each actuator is equipped with a force sensor placed in series with the elastic element, and a potentiometer placed on the gearbox shaft (similar to Figure 2); the two resulting measurements are used to estimate the actual tendon length (i.e., the distance between the motor and the attachment point of the muscle on the pendulum rod). Additionally, a camera is placed at the base of the pendulum to track the position of the endpoint of the rod (projected onto the image plane of the camera). For these experiments, only the information on the x and y axes from the camera is used as sensor input.

The results of the ULP are shown in Figure 9b. The diagram represents the connectivity between each sensor element and each motor element in the matrix Q. An intuitive way of interpreting Q is to consider the filled squares as the muscles that will be recruited when the goal entails a positive sign in the respective dimension, and the empty squares as the muscles that will be recruited when the goal entails a negative sign in the respective dimension. For example, if the goal entails a positive projection on the x axis, then muscles M1 and M2 will be recruited; if it entails a negative projection, then muscles M3 and M4 will be recruited. In a way, this matrix is related to the muscle Jacobian introduced in Section 5.3.

During the SLP, the calculation of the individual motor signals from the motor synergies is given by M = Q(α ⊗ S∗) (see caption of Figure 9). The results of the SLP are shown in Figure 9d. For each iteration, the plot shows the average and standard deviation achieved for 25 different reaching targets, equally spaced on a grid covering the workspace of the pendulum. For each target, we allow a total of 20 iterations. As can be seen, the error generally decreases with the number of iterations. The small increases observed in some of the iterations (e.g., iteration 17) occur mainly because the resting position of the pendulum is slightly different at every iteration step. To investigate the scalability of our algorithm, we are currently building a scaled-up version of this model in simulation, with more muscles and more degrees of freedom.
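The SLP loop (S1–S7) might be sketched as below. The plant matrix, the synergy matrix, and the gain-update rule are hypothetical; note that the synergy matrix is written here motor-by-sensor so that the product with the modulated goal signal is well defined.

```python
import numpy as np

# Toy linear pendulum: a motor command M (four muscles) shifts the 2D endpoint.
P = np.array([[0.05,  0.04, -0.05, -0.045],
              [0.045, -0.05, 0.04, -0.05]])
# Synergy matrix from the ULP, written motor-by-sensor (Figure 9b shows the
# corresponding sign pattern transposed for visualization).
Q = np.sign(P).T

def slp_reach(target, iterations=20, lr=0.5):
    alpha = np.ones(2)  # goal modulation gains, the only learned parameters
    pos = np.zeros(2)
    for _ in range(iterations):
        s_star = target - pos            # goal signal (S1)
        M = Q @ (alpha * s_star)         # modulate and activate synergies (S2-S3)
        pos = pos + P @ M                # movement and sensing (S4-S5)
        e = target - pos                 # evaluation (S6)
        alpha = alpha + lr * np.abs(e)   # gain adaptation (S7)
    return pos

pos = slp_reach(np.array([0.3, -0.2]))
```

Only the two modulation gains are adapted; the synergies themselves stay fixed, which is what reduces the dimensionality of the learning problem.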

5.5 Control via Muscle Synergies Emergent Under Reinforcement Learning

In this section we investigate the application of reinforcement learning (RL) techniques to the learning of control strategies in anthropomimetic robots. Evidence for aspects of reinforcement or reward-based learning abounds in motor cortex research, yet classical RL approaches (TD learning, etc.) suffer heavily from the curse of dimensionality when used as a generic solution in which every actuator can take any input level at every time step. Although generic RL algorithms optimized for high-dimensional systems are now emerging [53], we instead pursue the idea that the biomimetic nature of the Ecce structure offers scope for significantly reducing dimensionality via a muscle synergy approach, employing relatively simple control signals and leveraging the natural dynamics and redundancy of the structure.

In support we cite both anatomical evidence [24, 38] and the decomposition analysis of muscle EMG signals [9] suggesting that signals from the brain are projected not to individual muscles, but to groups of related muscles. Distinct time-invariant muscle weighting profiles (synergies) can be identified, which appear to be driven by shared time-varying activation signals [9]. Individual muscle EMGs can be shown to match a linear combination of these synergies [9, 12–14]. Applying these principles, synergy-based low-dimensional controllers of modeled frog legs [5] and RL-based learning of synergy weightings for modeled musculoskeletal human arms [19] have been successfully demonstrated.

Here we report early results of applying this approach (manuscript in preparation) to the full-torso, physics-based model of the Ecce-2 anthropomimetic robot (Figure 10a). It is a challenging control subject, considered of a similar order of difficulty to the physical robot itself.

Figure 10. 

Experiment for acquiring reaching control through reinforcement learning. (a) Side and top view of RL experimental setup. The target location (white cross) is generated at random within the zone denoted by the red lines. In the electronic version, the green sphere indicates the extended zone for obtaining some reward by proximity of the hand to the target. For the experiment, the Ecce-2 physics-based model was used, as it was considered of a similar order of difficulty to the robot itself. It contains more than 50 DOF, powered by some 36 separate elastic “muscles” (tendon-driven actuators) in the arms, shoulders, scapulae, spine, and torso. The shoulder blade is fully floating, supported by wrapping muscles with the rib cage mounted on a flexible spine of separate vertebrae. (b) Example screenshots showing how, after learning, the model is able to reach the target object in a majority of randomly generated target locations.

For our approach, we define a motor plan simply as a parametrized control signal shared in weighted proportion between a set of muscles. New motor plans are generated for trialing as weighted combinations of existing plans, employing a basis function of the target location but also favoring plans that have proven rewarding as contributors to past plans. In this way, we specifically seek to encourage the emergence of particular muscle combinations (synergies) that have proven most effective as modular contributors to an action. The RL acts to optimize the control signal parameters and the muscle weighting combination within the plans. Of particular interest is the extent to which the controller can still perform on just a subset of plans acting as key muscle synergies, or primitives.
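One possible reading of this plan-combination scheme is sketched below. The field names, the radial basis function for target similarity, and the noise model are invented; only the general mechanism (reward- and basis-weighted mixing plus signal-dependent exploration noise) follows the text.

```python
import numpy as np

rng = np.random.default_rng(3)

def rbf(a, b, width=0.3):
    """Basis function: similarity of the current target to a plan's target."""
    return float(np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2) / width ** 2))

def propose_plan(plans, target, noise=0.05):
    """Combine stored motor plans into a new candidate, favoring plans that
    earned reward near the current target. Each plan is a dict with per-muscle
    signal 'weights', the 'target' it was trialed on, and its 'reward'."""
    scores = np.array([p['reward'] * rbf(target, p['target']) for p in plans])
    mix = np.exp(scores) / np.exp(scores).sum()  # softmax preference
    weights = sum(c * p['weights'] for c, p in zip(mix, plans))
    # Signal-dependent exploration noise, proportional to signal amplitude.
    weights = weights + noise * np.abs(weights) * rng.standard_normal(weights.size)
    return {'weights': weights, 'target': np.asarray(target, float), 'reward': 0.0}
```

Muscles that repeatedly receive large weights across rewarded plans would, under this scheme, constitute an emergent synergy.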

Initial work has focused on open loop motor control of a reaching task, namely, moving a hand to a target object located randomly per trial. This task was selected because it called for precision control of a complex, compliant structure, using multiple cooperating muscles.

Initial experiments considered a simple scenario with the robot torso model placed on a static base (Figure 10a). All trials commence with the model balanced at rest with arms at the sides. The input to the learning algorithm comprises solely the random [x, y, z] target position (Figure 10a, red zone). The target object is balanced on a minimal static base. Motor plans are allocated a reward for striking or touching the target object and also for passing the hand into the larger reward zone (the green sphere) and holding it there, thus encouraging plans that slow or stop the hand near the target and supporting future extensions incorporating grasping.

Exploitation, exploration, reliability, and optimization are targeted through the use of different forms of added noise. The random placement of targets naturally generates weighting noise, as the contributions of plans are never the same. Signal-dependent noise proportional to amplitude is added to encourage the emergence of increasingly optimal velocity profiles as predicted by increasing reliability under noise [22].

We do not attempt as yet to bootstrap control emerging from a zero starting state; instead, the natural amenability to control offered by the structure's biomechanical constraints means that a starting set of rough motor plans can be relatively easily handcrafted. The RL acts to combine and optimize these plans to generate accuracy and reliability for any target in the zone. Future collaborative work may investigate the emergence of the initial set from reflexes and sensorimotor dynamics (see Section 5.4).

The initial results are generally encouraging. After a sufficient learning period, the robot model can at least strike the target on 76% of random trials (Figure 10b), although higher target locations are less reliably struck. Slowing the hand sufficiently for reliable grasping is less well developed and currently appears to be restricted to a subset of amenable locations. A video is also available to view (http://youtu.be/0D_pb8qjNKE).

Current analysis is investigating whether the motor plans that do slow the hand are doing so via antagonist stiffening, whether velocity and jerk profiles show any approach to optimality, and whether there is evidence that learning is exploiting compliance and natural dynamics. The emergence of key primitives is under investigation by looking at the effect of reducing the bank of stored plans. Future work will consider reaching from any starting posture, reaching in the presence of obstacles, and the applicability of this learning approach to one of our physical robot torsos. In conclusion, this appears to be a promising approach to controlling highly biomimetic structures, subject to the caveat that it is currently untested in higher-dimensional problem spaces.

5.6 Experience-Based Integrated Feedforward-Feedback Control

The control approaches presented in the previous sections were either exclusively feedforward or exclusively feedback control schemes. Voluntary movement control in humans, however, is proposed to rely on both of these concepts. For instance, whereas feedforward control is considered to be important for adjusting the grip force during an arm movement [33], feedback is hypothesized to play a role in feedback-error learning of inverse models [64]. Inspired by these ideas, we developed an integrated feedforward-feedback control approach that aims to solve the point-to-point control problem (end-effector position without orientation) in the operational space [52].

The feedforward stage of this approach consists of two phases. In the first phase, the experience acquisition, a series of input-to-output experiments is performed to form a black-box forward model of the robot, which we called the experience base. Here, different initial positions Xs of the end effector are specified, and linearly scaled input voltage patterns V(t)—derived from EMG recordings—are applied for each initial position. The resulting end-effector position Xf as well as the initial condition and the applied voltage pattern are recorded and stored in the experience base. In the second phase, which we called system exploitation, the acquired forward model of the first phase is employed as an inverse model to derive the required control input patterns Vd(t) for a desired end-effector position Xd, which was not part of the previous experience acquisition phase. Here, a linear interpolation scheme was used to compute the inputs from a set of closest neighbors of the target position.
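The system exploitation phase can be sketched as a nearest-neighbor interpolator over the experience base. The storage format and the inverse-distance weighting are assumptions, as the text only specifies linear interpolation over a set of closest neighbors, and a scalar input scale stands in for the full voltage pattern V(t).

```python
import numpy as np

class ExperienceBase:
    """Black-box forward model: stores (final position, input scale) pairs from
    the experience acquisition phase and is exploited as an inverse model."""
    def __init__(self):
        self.entries = []

    def record(self, x_final, v_scale):
        self.entries.append((np.asarray(x_final, float), v_scale))

    def inputs_for(self, x_desired, k=3):
        """Interpolate the input scales of the k experiences whose final
        positions lie closest to the desired one (inverse-distance weights)."""
        x_desired = np.asarray(x_desired, float)
        d = np.array([np.linalg.norm(x - x_desired) for x, _ in self.entries])
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-9)
        w = w / w.sum()
        return float(sum(wi * self.entries[i][1] for wi, i in zip(w, idx)))

# Toy experiments: input scale s drives the end effector to [s, 2s].
eb = ExperienceBase()
for s in (0.0, 0.5, 1.0):
    eb.record([s, 2 * s], s)
v = eb.inputs_for([0.25, 0.5], k=2)  # desired position between two experiences
```

Querying a position halfway between two recorded experiences yields the halfway input scale, which is the behavior the linear interpolation scheme relies on.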

To compensate for the steady-state offsets of the feedforward control stage, we subsequently superimposed a feedback control signal that depended on the final position error ΔX—similarly to eye-hand coordination [4]. Here, we tested two approaches: (i) fuzzy logic feedback and (ii) experience-based linear feedback gain. For the fuzzy logic feedback approach, the superimposed control voltage ΔV(t) was equal to
ΔV(t) = KJac Vfuzzy,
where KJac is the Jacobi coefficient that relates joint shifts to end-effector shifts, and Vfuzzy is the fuzzy voltage. KJac was estimated using the experience base acquired for the feedforward stage, and Vfuzzy was computed by a fuzzy inference engine with the inputs ΔX and its time derivative ΔẊ.
The second feedback approach relies on the linear feedback gain matrix K to compensate for the steady-state offset ΔX. As the gain matrix varies with the end-effector position X, K was experimentally determined for each Xf of the previous experience acquisition phase by evaluating the influence H of each motor on the end-effector position and computing the weighted pseudo-inverse of H. The superimposed feedback voltage ΔV(t) was then equal to
ΔV(t) = K ΔX.
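The experience-based gain computation can be illustrated as follows. The influence matrix H and the equal weighting are invented for illustration; only the weighted pseudo-inverse construction follows the text.

```python
import numpy as np

# Hypothetical influence matrix H: end-effector shift (3D) per unit voltage on
# each of four motors, estimated around one final position X_f.
H = np.array([[0.010, -0.008, 0.002,  0.001],
              [0.001,  0.002, 0.009, -0.010],
              [0.004,  0.003, 0.003,  0.004]])

def weighted_pinv(H, W):
    """Weighted right pseudo-inverse: among all voltage corrections producing
    the required end-effector shift, pick the one minimizing dV^T W dV."""
    Wi = np.linalg.inv(W)
    return Wi @ H.T @ np.linalg.inv(H @ Wi @ H.T)

W = np.eye(4)            # equal weighting of the four motors
K = weighted_pinv(H, W)  # gain matrix of the feedback law dV = K * dX
dV = K @ np.array([0.001, 0.0, -0.002])  # correction for a residual offset dX
```

Since K is a right inverse of H, applying the resulting voltage correction dV moves the end effector by exactly the residual offset under the locally linear model.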

This control approach was tested on a simulation model, comprising seven revolute joints and two tendon-driven actuators per joint. The results are shown in Figure 11. For the feedforward stage, the end-effector position offset was equal to 6 mm along the z axis and 2 mm along the x and y axes. This error was further decreased by both superimposed feedback controllers, and a final error of less than 1 mm was achieved for all position coordinates. However, the experience-based linear feedback gain approach has proven to be slightly superior to the fuzzy logic feedback, as can be seen in Figure 11.

Figure 11. 

Experience-based integrated feedforward-feedback control—simulation comparison of position tracking of the end effector for two superimposed feedback approaches: (i) fuzzy logic feedback and (ii) experience-based linear feedback gain. For both approaches, the final position error is less than 1 mm for all position coordinates, with slightly better results for the experience-based linear feedback gain.

As this control approach is based on input-to-output experience, the methods and algorithms presented should, in principle, transfer to a wider range of robots. However, the approach was tested only on a relatively simple simulation model, and further tests are required to assess its performance in more complex setups that pose challenges characteristic of the anthropomimetic design, such as spherical joints or multi-articular actuators.

6 Conclusion

In this article we have presented the results of our approach to anthropomimetic robotics. We have built four torso prototypes, each mechanically more complex than its predecessor. The latest prototype, the Ecce-4, is mounted on a mobile base and features 54 actuators as well as detailed replicas of the human spine and shoulder girdle. Furthermore, we developed a flexible, distributed control and software framework, tailored to this particular class of robots, that can cope with their large numbers of actuators and sensors. We also implemented all the tools required to simulate the dynamics of this class of robots.

From the scientific point of view, we have presented control approaches in the joint-space and operational-space domains. We showed results of controlling antagonistically actuated joints using methods from traditional control theory, as well as two approaches based on unsupervised learning whose results suggest that machine learning is a promising route to operational-space control. Finally, we tested a neuro-inspired approach relying on a developmental framework that has the capacity to elucidate the development of human motor control.

Although we have made progress in the design, simulation, and control of anthropomimetic robots, we recognize that we have barely tapped the real potential of this novel approach, particularly with respect to morphological computation. In general, the advantages of tendon-driven robots are increasingly acknowledged in the morphological computation community, but a number of technical problems still prevent the field from becoming fully established. One such problem is the lack of required sensor modalities (e.g., joint angles) for this type of robot, which also prevented us from evaluating our control approaches on the full-scale robot. Another, very important issue is that the body and the controller have to coevolve because of their morphological computation interplay. Once these problems are fully understood and solved, we are confident that the anthropomimetic approach will shed new light on the role of morphological computation in human motor control.

Acknowledgments

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013), Challenge 2 (Cognitive Systems, Interaction, Robotics), under grant agreement no. 231864 (ECCEROBOT).

References

1. Ackermann, J. (1985). Sampled-data control systems. Berlin: Springer.
2. Ando, N., Suehiro, T., & Kotoku, T. (2008). A software platform for component based RT-system development: OpenRTM-Aist. In Proceedings of the 1st International Conference on Simulation, Modeling, and Programming for Autonomous Robots (pp. 87–98). Berlin: Springer.
3. Arnaud, R., & Barnes, M. C. (2006). Collada: Sailing the gulf of 3D digital content creation. Natick, MA: A.K. Peters.
4. Batista, A. P., Buneo, C. A., Snyder, L. H., & Andersen, R. A. (1999). Reach plans in eye-centered coordinates. Science, 285(5425), 257–260.
5. Berniker, M., Jarc, A., Bizzi, E., & Tresch, M. C. (2009). Simplified and effective motor control based on muscle synergies to exploit musculoskeletal dynamics. Proceedings of the National Academy of Sciences of the United States of America, 106(18), 7601–7606.
6. Beyer, H.-G., & Schwefel, H.-P. (2002). Evolution strategies: A comprehensive introduction. Natural Computing, 1, 3–52.
7. Blumberg, M. S., & Lucas, D. E. (1996). A developmental and component analysis of active sleep. Developmental Psychobiology, 1, 1–22.
8. Chestnutt, J., Michel, P., Kuffner, J., & Kanade, T. (2007). Locomotion among dynamic obstacles for the Honda ASIMO. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 2572–2573).
9. Cheung, V. C. K., Piron, L., Agostini, M., Silvoni, S., Turolla, A., & Bizzi, E. (2009). Stability of muscle synergies for voluntary actions after cortical stroke in humans. Proceedings of the National Academy of Sciences of the United States of America, 106(46), 19563–19568.
10. Coumans, E. (2012). Bullet Physics.
11. Craig, J. J. (2005). Introduction to robotics, 3rd ed. Englewood Cliffs, NJ: Pearson.
12. D'Avella, A., & Bizzi, E. (2005). Shared and specific muscle synergies in natural motor behaviors. Proceedings of the National Academy of Sciences of the United States of America, 102(8), 3076–3081.
13. D'Avella, A., Saltiel, P., & Bizzi, E. (2003). Combinations of muscle synergies in the construction of a natural motor behavior. Nature Neuroscience, 6(3), 300–308.
14. D'Avella, A., & Tresch, M. C. (2002). Modularity in the motor system: Decomposition of muscle patterns as combinations of time-varying synergies. Advances in Neural Information Processing Systems, 14, 141–148.
15. De Luca, A., Siciliano, B., & Zollo, L. (2005). PD control with on-line gravity compensation for robots with elastic joints: Theory and experiments. Automatica, 41(10), 1809–1819.
16. De Sapio, V., Warren, J., & Khatib, O. (2006). Predicting reaching postures using a kinematically constrained shoulder model. In Advances in robot kinematics (pp. 209–218). Berlin: Springer.
17. De Sapio, V., Warren, J., Khatib, O., & Delp, S. (2005). Simulating the task-level control of human motion: A methodology and framework for implementation. The Visual Computer, 21(5), 289–302.
18. Dominici, N., Ivanenko, Y. P., Cappellini, G., d'Avella, A., Mondì, V., Cicchese, M., Fabiano, A., Silei, T., Di Paolo, A., Giannini, C., Poppele, R. E., & Lacquaniti, F. (2011). Locomotor primitives in newborn babies and their development. Science, 334(6058), 997–999.
19. Fagg, A. H., Sitkoff, N., Barto, A. G., & Houk, J. C. (1997). A model of cerebellar learning for control of arm movements using muscle synergies. In Proceedings of the 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA (pp. 6–12).
20. Feldman, A. G. (1966). Functional tuning of the nervous system with control of movement or maintenance of a steady posture. II. Controllable parameters of the muscles. Biophysics, 11(3), 498–508.
21. Hannaford, B., & Stark, L. (1985). Roles of the elements of the triphasic control signal. Experimental Neurology, 90(3), 619–634.
22. Harris, C. M., & Wolpert, D. M. (1998). Signal-dependent noise determines motor planning. Nature, 394(6695), 780–784.
23. Hoffmeister, F., & Bäck, T. (1991). Genetic algorithms and evolution strategies—Similarities and differences. In Proceedings of the 1st Workshop on Parallel Problem Solving from Nature (pp. 455–469). Berlin: Springer-Verlag.
24. Holdefer, R. N., & Miller, L. E. (2002). Primary motor cortical neurons encode functional muscle synergies. Experimental Brain Research, 146(2), 233–243.
25. Holland, O., & Knight, R. (2006). The anthropomimetic principle. In Proceedings of the AISB06 Symposium on Biologically Inspired Robotics.
26. Hollerbach, J. M. (1982). Computers, brains and the control of movement. Trends in Neurosciences, 5, 189–192.
27. Inaba, M., Mizuuchi, I., Tajima, R., Yoshikai, T., Sato, D., Nagashima, K., & Inoue, H. (2003). Building spined muscle-tendon humanoid. In R. Jarvis & A. Zelinsky (Eds.), Robotics Research (pp. 113–127). Berlin: Springer.
28. Jacobsen, S., Ko, H., Iversen, E., & Davis, C. (1990). Control strategies for tendon-driven manipulators. IEEE Control Systems Magazine, 10(2), 23–28.
29. Jäntsch, M., Schmaler, C., Wittmeier, S., Dalamagkidis, K., & Knoll, A. (2011). A scalable joint-space controller for musculoskeletal robots with spherical joints. In Proceedings of the IEEE International Conference on Robotics and Biomimetics, ROBIO (pp. 2211–2216).
30. Jäntsch, M., Wittmeier, S., Dalamagkidis, K., & Knoll, A. (2012). Computed muscle control for an anthropomimetic elbow joint. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 2192–2197).
31. Jäntsch, M., Wittmeier, S., & Knoll, A. (2010). Distributed control for an anthropomimetic robot. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 5466–5471).
32. Kandel, E. R., Schwartz, J. H., & Jessel, T. M. (2000). Principles of neural science, 4th ed. New York: McGraw-Hill.
33. Kawato, M. (1999). Internal models for motor control and trajectory planning. Current Opinion in Neurobiology, 9(6), 718–727.
34. Marques, H. G., Imtiaz, F., Iida, F., & Pfeifer, R. (2012). Self-organisation of reflexive behaviour from spontaneous motor activity. Biological Cybernetics, 1–13. DOI 10.1007/s00422-012-0521-7.
35. Marques, H. G., Schaffner, P., & Kuppuswamy, N. (2012). Unsupervised learning of a reduced dimensional controller for a tendon driven robot platform. Simulation of Adaptive Behaviour, SAB2012, 351–360.
36. McFarlane, D., & Glover, K. (1992). A loop-shaping design procedure using H∞ synthesis. IEEE Transactions on Automatic Control, 37(6), 759–769.
37. Migliore, S., Brown, E., & DeWeerth, S. (2005). Biologically inspired joint stiffness control. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (pp. 4508–4513).
38. Miller, L. E., & Houk, J. C. (1995). Motor co-ordinates in primate red nucleus: Preferential relation to muscle activation versus kinematic variables. The Journal of Physiology, 488(Pt 2), 533–548.
39. Mizuuchi, I., Nakanishi, Y., Sodeyama, Y., Namiki, Y., Nishino, T., Muramatsu, N., Urata, J., Hongo, K., Yoshikai, T., & Inaba, M. (2007). An advanced musculoskeletal humanoid Kojiro. In Proceedings of the IEEE/RAS International Conference on Humanoid Robots (pp. 294–299).
40. Mizuuchi, I., Yoshikai, T., Sodeyama, Y., Nakanishi, Y., Miyadera, A., Yamamoto, T., Niemela, T., Hayashi, M., Urata, J., Namiki, Y., Nishino, T., & Inaba, M. (2006). Development of musculoskeletal humanoid Kotaro. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (pp. 82–87).
41. Osada, M., Ito, N., Nakanishi, Y., & Inaba, M. (2010). Realization of flexible motion by musculoskeletal humanoid Kojiro with add-on nonlinear spring units. In Proceedings of the IEEE/RAS International Conference on Humanoid Robots (pp. 174–179).
42. Palli, G., Melchiorri, C., & De Luca, A. (2008). On the feedback linearization of robots with variable joint stiffness. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (pp. 1753–1759).
43. Pelrine, R., Kornbluh, R., Pei, Q., Stanford, S., Oh, S., Eckerle, J., & Meijer, K. (2002). Dielectric elastomer artificial muscle actuators: Toward biomimetic motion. Proceedings of SPIE, 4695, 126–137.
44. Pfeifer, R., & Iida, F. (2003). Embodied artificial intelligence: Trends and challenges. In Embodied artificial intelligence (pp. 1–26). Berlin: Springer.
45. Pfeifer, R., Lungarella, M., & Iida, F. (2007). Self-organization, embodiment, and biologically inspired robotics. Science, 318(5853), 1088–1093.
46. Potkonjak, V., Svetozarevic, B., Jovanovic, K., & Holland, O. (2011). Anthropomimetic robot with passive compliance—Contact dynamics and control. In Mediterranean Conference on Control Automation (MED) (pp. 1059–1064).
47. Potkonjak, V., Svetozarevic, B., Jovanovic, K., & Holland, O. (2012). The puller-follower control of compliant and noncompliant antagonistic tendon drives in robotic systems. International Journal of Advanced Robotic Systems, 8(5), 143–155.
48. Reisinger, J., & Steininger, A. (1993). The design of a fail-silent processing node for the predictable hard real-time system MARS. Distributed Systems Engineering, 1(2), 104–111.
49. Robinson, S., & Brumley, M. (2005). Prenatal behavior. In The behaviour of the laboratory rat: A handbook with tests (pp. 257–265). Oxford, UK: Oxford University Press.
50. Sakagami, Y., Watanabe, R., Aoyama, C., Matsunaga, S., Higaki, N., & Fujimura, K. (2002). The intelligent ASIMO: System overview and integration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 2478–2483).
51. Sanger, T. D. (1994). Optimal unsupervised motor learning for dimensionality reduction of nonlinear control systems. IEEE Transactions on Neural Networks, 5(6), 965–973.
52. Shadmehr, R., & Mussa-Ivaldi, F. A. (1994). Adaptive representation of dynamics during learning of a motor task. Journal of Neuroscience, 14(5 Pt 2), 3208–3224.
53. Stulp, F., Buchli, J., Theodorou, E., & Schaal, S. (2010). Reinforcement learning of full-body humanoid motor skills. In Proceedings of the 10th IEEE/RAS International Conference on Humanoid Robots (pp. 405–410).
54. Todorov, E., & Ghahramani, Z. (2003). Unsupervised learning of sensory-motor primitives. In Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 1750–1753).
55. Tsagarakis, N., Sinclair, M., Becchi, F., Metta, G., Sandini, G., & Caldwell, D. (2006). Lower body design of the “iCub”: A human-baby like crawling robot. In Proceedings of the IEEE/RAS International Conference on Humanoid Robots (pp. 450–455).
56. Urata, J., Nakanishi, Y., Miyadera, A., Mizuuchi, I., Yoshikai, T., & Inaba, M. (2006). A three-dimensional angle sensor for a spherical joint using a micro camera. In Proceedings of the IEEE International Conference on Robotics and Automation (pp. 4428–4430).
57. Vanderborght, B., Tsagarakis, N., Semini, C., Van Ham, R., & Caldwell, D. (2009). MACCEPA 2.0: Adjustable compliant actuator with stiffening characteristic for energy efficient hopping. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (pp. 544–549).
58. Vanderborght, B., Van Ham, R., Lefeber, D., Sugar, T. G., & Hollander, K. W. (2009). Comparison of mechanical design and energy consumption of adaptable, passive-compliant actuators. The International Journal of Robotics Research, 28(1), 90–103.
59. Webb, B. (2009). Animals versus animats: Or why not model the real iguana? Adaptive Behavior, 17(4), 269–286.
60. Werbos, P. J. (1990). Backpropagation through time: What it does and how to do it. Proceedings of the IEEE, 78(10), 1550–1560.
61. Wittmeier, S., Gaschler, A., Jäntsch, M., Dalamagkidis, K., & Knoll, A. (2012). Calibration of a physics-based model of an anthropomimetic robot using evolution strategies. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 445–450).
62. Wittmeier, S., Jäntsch, M., Dalamagkidis, K., & Knoll, A. (2011). Physics-based modeling of an anthropomimetic robot. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 4148–4153).
63. Wittmeier, S., Jäntsch, M., Dalamagkidis, K., Rickert, M., Marques, H. G., & Knoll, A. (2011). Caliper: A universal robot simulation framework for tendon-driven robots. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 1063–1068).
64. Wolpert, D. M., & Ghahramani, Z. (2000). Computational principles of movement neuroscience. Nature Neuroscience, 3, 1212–1217.
65. Zatsiorsky, V. M. (2002). Kinetics of human motion. Champaign, IL: Human Kinetics Publishers.

Author notes

Contact author.

∗∗ Chair of Robotics and Embedded Systems, Faculty of Informatics, Technische Universität München, Boltzmannstr. 3, 85748 Garching, Germany. E-mail: steffen.wittmeier@in.tum.de (S.W.); dalamagkidis@tum.de (K.D.); michael.jaentsch@in.tum.de (M.J.); knoll@in.tum.de (A.K.)

Artificial Intelligence Laboratory, Department of Informatics, University of Zurich, Andreasstr. 15, 8050 Zurich, Switzerland. E-mail: alessandro@ifi.uzh.ch (C.A.); hgmarques@ifi.uzh.ch (H.G.M.); pfeifer@ifi.uzh.ch (R.P.)

Faculty of Electrical Engineering, University of Belgrade, Bulevar kralja Aleksandra 73, 11000 Belgrade, Serbia. E-mail: nenadb@etf.rs (N.B.); kostaj@etf.rs (K.J.); predrag.m@etf.rs (P.M.); potkonjak@yahoo.com (V.P.)

§ Department of Informatics, University of Sussex, Falmer, East Sussex BN1 9QJ, United Kingdom. E-mail: djcdevereux@gmail.com (D.D.); a.diamond@sussex.ac.uk (A.D.); mitra.bhargav@gmail.com (B.M.); o.e.holland@sussex.ac.uk (O.H.)

The Robot Studio SARL, 122 rue de la Versoix, 01220 Divonne les bains, France. E-mail: rob.knight@therobotstudio.com

# Automatic Control Lab, ETH Zurich, Switzerland. E-mail: bratislavs@control.ee.ethz.ch