Abstract

We present research on the automatic assessment of pianist hand posture, intended to help beginning piano students improve their technique during practice sessions. To assess a student's hand posture automatically, we propose a system that recognizes three categories of posture from a single depth map containing a pianist's hands during performance. This is achieved through a computer vision pipeline that applies machine learning to the depth maps for both hand segmentation and posture detection. First, we segment the left and right hands from the scene captured in the depth map using per-pixel classification. To train the hand-segmentation models, we experiment with two feature descriptors, depth image features and depth context features, each of which describes the neighborhood of an individual pixel. After the hands have been segmented from the depth map, a posture-detection model classifies each hand into one of three posture categories: correct posture, low wrists, or flat hands. We test two methods for extracting descriptors from the segmented hands: histograms of oriented gradients and histograms of normal vectors. To account for variation in hand size and practice space, detection models are built individually for each student using support vector machines trained on the extracted descriptors. We validate this approach on a data set collected by recording four beginning piano students as they performed standard practice exercises. The results presented in this article show the effectiveness of this approach, with depth context features and histograms of normal vectors performing best.
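To make the histogram-of-normal-vectors idea concrete, the following is a minimal sketch of one way such a descriptor could be computed from a segmented depth map. It is an illustration under stated assumptions, not the formulation used in the article: the function name `histogram_of_normals`, the bin counts, and the choice to estimate normals from finite-difference depth gradients are all hypothetical.

```python
import numpy as np


def histogram_of_normals(depth, mask, n_azimuth=8, n_elevation=4):
    """Hypothetical descriptor: normalized 2D histogram of surface-normal
    directions over the pixels selected by `mask` (e.g. one segmented hand).

    depth : 2D array of depth values
    mask  : 2D boolean array of the same shape
    """
    # Finite-difference depth gradients along rows (y) and columns (x).
    dz_dy, dz_dx = np.gradient(depth.astype(float))

    # A surface z = f(x, y) has the unnormalized normal (-dz/dx, -dz/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

    # Keep only the normals that belong to the segmented region.
    n = normals[mask]

    # Express each unit normal as two angles.
    azimuth = np.arctan2(n[:, 1], n[:, 0])              # in [-pi, pi]
    elevation = np.arccos(np.clip(n[:, 2], -1.0, 1.0))  # in [0, pi/2)

    hist, _, _ = np.histogram2d(
        azimuth,
        elevation,
        bins=[n_azimuth, n_elevation],
        range=[[-np.pi, np.pi], [0.0, np.pi / 2]],
    )

    # Flatten and normalize so the descriptor does not depend on region size.
    hist = hist.ravel()
    total = hist.sum()
    return hist / total if total > 0 else hist
```

A fixed-length vector of this kind could then be passed, one per segmented hand, to a per-student support vector machine that separates the three posture categories.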
