Gesture spotting with body-worn inertial sensors to detect user activities
Holger Junker, Oliver Amft, Paul Lukowicz, and Gerhard Tröster
Pattern Recognition, vol. 41, no. 6, pp. 2010-2024, 2008
2010. 04. 08, Jongwon Yoon
Contents
• Introduction
  – Related works
  – Contributions
  – Terminology
• Spotting approach
• Case studies
• Spotting implementation
  – Preselection stage
  – Classification stage
• Experiments
• Results
• Discussion
• Conclusion
Introduction
• Activity recognition
  – Motivated by a variety of mobile and ubiquitous computing applications
• Body-mounted motion sensors for activity recognition
  – Advantage: influenced only by the user's activity
  – Difficulty: extracting relevant features
    • Information is often ambiguous and incomplete
    • Sensors do not provide exact trajectories because of gravity and changes in arm speed
• Solution
  – Spotting of sporadically occurring activities
Related works (Introduction)
• Wearable instrumentation for gesture recognition
  – Kung Fu moves (Chambers et al., 2002)
  – "Atomic" gesture recognition (Benbasat, 2000)
  – Household activity recognition (Bao, 2003)
  – Workshop activity recognition (Lukowicz et al., 2004)
• Spotting task
  – HMM-based endpoint detection in continuous data (Deng and Tsui, 2000)
    • Uses an HMM-based accumulated score
    • Searches for the start point using the Viterbi algorithm
  – HMM-based threshold model (Lee and Kim, 1999)
    • Calculates a likelihood threshold for an input pattern
  – Partitioning of the incoming data using an intensity analysis (Lukowicz, 2004)
Contributions (Introduction)
• Two-stage gesture spotting method
  – Novel method based on body-worn motion sensors
  – Specifically designed for the needs and constraints of activity recognition in wearable and pervasive systems
    • Large null class
    • Lack of appropriate models for the null class
    • Large variability in how gestures are performed
    • Variable gesture length
• Verification of the proposed method in two scenarios
  – Comprising nearly a thousand relevant gestures
  – Scenario 1: Interaction with different everyday objects
    • Part of a wide range of wearable-system applications
  – Scenario 2: Nutrition intake
    • Highly specialized application motivated by the needs of a large industry-dominated health-monitoring project
Terminology (Introduction)
• Motion segment
  – An atomic, non-overlapping unit of human motion
  – Characterized by its spatio-temporal trajectory
• Motion event
  – Spans a sequence of motion segments
• Activity
  – Describes a situation that may consist of various motion events
• Signal segment
  – A slice of sensor data that corresponds to a motion segment
• Candidate section
  – A slice of sensor data that may contain a gesture
Spotting approach
• Naïve approach
  – Classifies all possible sections in the data stream
  – Prohibitive computational effort
• Two-stage gesture spotting method
  – Preselection stage
    • Localizes and preselects sections in the continuous signal stream
  – Classification stage
    • Classifies the candidate sections
Case studies
• Case study 1
  – Spotting of diverse object-interaction gestures
    • Key component in a context-recognition system
    • May facilitate more natural human-computer interfaces
• Case study 2
  – Dietary intake gestures
    • One sensing domain of an automated dietary monitoring system
Spotting implementation
• Framework
• Relevant gestures
Motion segment partitioning (Preselection stage)
• Preselection stage
  – 1) Initial partitioning of the signal stream
  – 2) Identification of potential sections
  – 3) Candidate selection
• Partitions a motion parameter into non-overlapping, meaningful segments
  – Motion parameters used: pitch and roll of the lower arm
• Uses the sliding-window and bottom-up (SWAB) algorithm
  – Example: partitioning of a buffer of length n
    • Step 1) Start from an arbitrary segmentation of the signal into n/2 segments
    • Step 2) Calculate the cost of merging each pair of adjacent segments
      – Cost: the error of approximating the signal with its linear regression
    • Step 3) Merge the lowest-cost pair
Motion segment partitioning (cont.) (Preselection stage)
• Sliding-window and bottom-up (SWAB) algorithm (cont.)
• Extension of the segmentation algorithm
  – Ensures that the algorithm provides a good approximation
  – Merges adjacent segments if their linear regressions have similar slopes
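The bottom-up pass behind Steps 1-3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sliding-window buffering of SWAB is omitted, and `max_error` is an assumed stopping threshold for the merge cost.

```python
import numpy as np

def regression_error(signal):
    """Cost of a segment: squared error of approximating the
    signal slice by its linear regression."""
    n = len(signal)
    if n < 2:
        return 0.0
    x = np.arange(n)
    slope, intercept = np.polyfit(x, signal, 1)
    return float(np.sum((signal - (slope * x + intercept)) ** 2))

def bottom_up_segment(signal, max_error):
    """Bottom-up pass of SWAB: start from a fine segmentation into
    n/2 two-sample segments (Step 1) and repeatedly merge the
    adjacent pair with the lowest regression cost (Steps 2-3)."""
    # Step 1: initial fine segmentation (segment boundaries as indices)
    bounds = list(range(0, len(signal), 2)) + [len(signal)]
    while len(bounds) > 2:
        # Step 2: cost of merging each pair of adjacent segments
        costs = [regression_error(signal[bounds[i]:bounds[i + 2]])
                 for i in range(len(bounds) - 2)]
        i = int(np.argmin(costs))
        # stop once even the cheapest merge approximates poorly
        if costs[i] > max_error:
            break
        # Step 3: merge the lowest-cost pair
        del bounds[i + 1]
    return bounds
```

On a signal that rises linearly and then stays flat, the surviving boundary lands at the slope change, which is the behavior the partitioning relies on.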
Section similarity search (Preselection stage)
• Each motion-segment endpoint is considered a potential end of a gesture
  – For each endpoint, potential start points are derived from the preceding motion-segment boundaries
• Confining the search space
  – 1) For the actual section length T: Tmin ≤ T ≤ Tmax
  – 2) For the number of motion segments nMS in the section: NMS,min ≤ nMS ≤ NMS,max
Section similarity search (cont.) (Preselection stage)
• Searching
  – Uses simple single-value features
    • Min/max signal values, sum of signal samples, duration of the gesture, …
  – If the distance d(fPS, Gk) is smaller than a gesture-specific threshold ▶ the section may contain gesture Gk
• Selection of candidate sections
  – Two sections can collide (overlap)
  – The colliding section with the smallest distance is selected
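The endpoint-driven search with its length and segment-count constraints might look like the sketch below. It is illustrative only: the concrete feature set, the Euclidean distance, and the overlap resolution are simplified stand-ins for the paper's gesture-specific distance d(fPS, Gk) and thresholds.

```python
import numpy as np

def section_features(section):
    """Simple single-value features of a section (illustrative subset:
    min, max, sum of samples, and duration in samples)."""
    return np.array([section.min(), section.max(), section.sum(),
                     float(len(section))])

def find_candidates(signal, boundaries, prototype, threshold,
                    t_min, t_max, n_min, n_max):
    """Treat every motion-segment endpoint as a potential gesture end,
    derive start points from preceding boundaries, and keep sections
    whose feature distance to the gesture prototype is below a
    gesture-specific threshold."""
    candidates = []
    for j, end in enumerate(boundaries[1:], start=1):
        for i in range(j - 1, -1, -1):
            start = boundaries[i]
            n_ms = j - i            # motion segments inside the section
            length = end - start    # section length T in samples
            if length > t_max or n_ms > n_max:
                break               # earlier starts only make it longer
            if length < t_min or n_ms < n_min:
                continue
            d = np.linalg.norm(section_features(signal[start:end]) - prototype)
            if d < threshold:
                candidates.append((start, end, d))
    # resolve collisions: greedily keep the smallest-distance sections
    candidates.sort(key=lambda c: c[2])
    selected = []
    for start, end, d in candidates:
        if all(end <= s or start >= e for s, e, _ in selected):
            selected.append((start, end, d))
    return selected
```

The `break` on the length bound is what keeps the search cheap: once a section exceeds Tmax or NMS,max, no earlier start point needs to be examined for that endpoint.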
Classification stage (Spotting implementation)
• HMM-based classification
• Features
  – Pitch and roll angles from the lower- and upper-arm sensors
  – Derivative of the acceleration signal from the lower arm
  – Cumulative sum of the acceleration from the lower arm
  – Derivative of the rate-of-turn signal from the lower-arm sensor
  – Cumulative sum of the rate of turn from the lower arm
• Model
  – Single-Gaussian observation models
  – 4-10 states
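A minimal sketch of likelihood-based classification with Gaussian HMMs: each candidate section is scored against every gesture model with the (scaled) forward algorithm and assigned to the highest-scoring one. The parameters here are assumed toy values; the paper trains 4-10 state models with single-Gaussian emissions per gesture.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log density of a single diagonal-Gaussian emission."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var,
                         axis=-1)

def forward_loglik(obs, pi, A, means, variances):
    """Log-likelihood of an observation sequence under a Gaussian HMM,
    computed with the scaled forward algorithm."""
    T, n = len(obs), len(pi)
    log_b = np.array([[gaussian_logpdf(obs[t], means[s], variances[s])
                       for s in range(n)] for t in range(T)])
    alpha = pi * np.exp(log_b[0])
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()            # rescale to avoid underflow
    for t in range(1, T):
        alpha = (alpha @ A) * np.exp(log_b[t])
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

def classify(section, models):
    """Assign the candidate section to the gesture model with the
    highest likelihood (a zero-class can be added as one more model)."""
    scores = {name: forward_loglik(section, *params)
              for name, params in models.items()}
    return max(scores, key=scores.get)
```

In practice one would train such models with Baum-Welch (e.g. via a toolkit such as hmmlearn) rather than fix the parameters by hand.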
Experiments
• Experimental setting
  – Five inertial sensors
  – One female and three male subjects
    • Right-handed
    • Aged 25-35 years
• Data sets
  – No constraints on the subjects' movements
    • To obtain data sets with a realistic zero-class
  – Eight additional similar gestures
    • To enrich the diversity of movements
Evaluation metrics (Results)
• Recall and precision
• Other evaluation metrics
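Recall and precision for spotting follow the standard definitions below; the event counts themselves come from matching the returned sections against the annotated gestures.

```python
def recall_precision(true_positives, false_positives, false_negatives):
    """Spotting metrics: recall is the fraction of relevant gestures
    that were found; precision is the fraction of returned sections
    that were actually relevant."""
    recall = true_positives / (true_positives + false_negatives)
    precision = true_positives / (true_positives + false_positives)
    return recall, precision
```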
Preselection stage (Results)
• Precision-recall curves
• Evaluation results
Classification stage (Results)
• Initial testing
  – Case 1: 98.4% / Case 2: 97.4%
• Classification of the candidate sections
Extensions of the classification (Results)
• Including a zero-class model
  – Case 1: extracted from all relevant gesture models
  – Case 2: constructed on the basis of additional gestures carried out by the subjects
• Summary of the total spotting results
Conclusion
• Similarity-based search
  – A way to avoid explicit modeling of a zero-class
    • An explicit zero-class model can still be added to improve recognition
  – Permits different feature sets for individual gestures
• Future work
  – Additional challenges
    • Differences in the size and consistency of food pieces
    • Additional degrees of freedom
    • Temporal aspects
  – The presented spotting approach can be applied to other types of motion events