In this paper, a human–robot interaction system based on a novel combination of sensors is proposed. It allows one person to interact with a humanoid social robot using natural body language. The robot understands the meaning of human upper body gestures and expresses itself by using a combination of body movements, facial expressions, and verbal language. A set of 12 upper body gestures is involved for communication. This set also includes gestures with human–object interactions. The gestures are characterized by head, arm, and hand posture information. The wearable Immersion CyberGlove II is employed to capture the hand posture. This information is combined with the head and arm posture captured from Microsoft Kinect. This is a new sensor solution for human-gesture capture. Based on the posture data from the CyberGlove II and Kinect, an effective and real-time human gesture recognition method is proposed. The gesture understanding approach based on an innovative combination of sensors is the main contribution of this paper. To verify the effectiveness of the proposed gesture recognition method, a human body gesture data set is built. The experimental results demonstrate that our approach can recognize the upper body gestures with high accuracy in real time. In addition, for robot motion generation and control, a novel online motion planning method is proposed. In order to generate appropriate dynamic motion, a quadratic programming (QP)-based dual-arms kinematic motion generation scheme is proposed, and a simplified recurrent neural network is employed to solve the QP problem. The integration of a handshake within the HRI system illustrates the effectiveness of the proposed online generation method.