A Regression Approach to Music Emotion Recognition Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen, Fellow, IEEE IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 2, FEBRUARY 2008 Sung Eun Park 2009-11-20 Intelligent Database Systems Lab School of Computer Science & Engineering Seoul National University, Seoul, Korea

A Regression Approach to Music Emotion Recognition

Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen, Fellow, IEEE

IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 2, FEBRUARY 2008

Sung Eun Park

2009-11-20

Intelligent Database Systems Lab
School of Computer Science & Engineering
Seoul National University, Seoul, Korea

Copyright 2008 by CEBT

Contents

Introduction

Simple concept of the model

Body

Regression approach

Model Explanation

Evaluation

Conclusion

Discussion

Contribution

Q&A

Brief Concept of the Model

Thayer’s arousal-valence emotion plane.

An application using this concept

Musicovery is based on the same concept as this model.

Click a point on the emotion plane to find music relevant to that point.

Regression Approach

Many good regressors (regression algorithms) are readily available.

Given N inputs (x_i, y_i), 1 ≤ i ≤ N, where x_i is the feature vector of the i-th input sample and y_i ∈ R is the real value to be predicted for the i-th sample, the regression system trains a regressor R(·) such that the mean squared error ε is minimized:

$$\epsilon = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - R(x_i) \right)^2$$

Here x_i is the feature vector, y_i is the real value, R(x_i) is the predicted value, and R(·) is what training has to find.
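As a minimal sketch of the objective on this slide, the snippet below computes the mean squared error of an arbitrary regressor over a small set of samples. The toy data and the two stand-in regressors (`perfect`, `constant`) are hypothetical and only illustrate that a better regressor yields a smaller ε.

```python
from typing import Callable, Sequence


def mean_squared_error(
    regressor: Callable[[Sequence[float]], float],
    xs: Sequence[Sequence[float]],
    ys: Sequence[float],
) -> float:
    """Return (1/N) * sum over i of (y_i - R(x_i))^2."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum((y - regressor(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: one feature per sample, target equal to 2 * feature.
xs = [[0.0], [1.0], [2.0]]
ys = [0.0, 2.0, 4.0]


def perfect(x: Sequence[float]) -> float:
    """Stand-in regressor that matches the toy data exactly."""
    return 2.0 * x[0]


def constant(x: Sequence[float]) -> float:
    """Stand-in regressor that always predicts the mean of ys."""
    return 2.0


print(mean_squared_error(perfect, xs, ys))
print(mean_squared_error(constant, xs, ys))
```

Training a regressor amounts to searching for the R(·) that makes this quantity as small as possible on the training samples.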

The model

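[Diagram: music samples (♬) go through a subjective test, which yields the ground truth, and through feature extraction, which yields the musical features. Regression on the two produces the regressors Reg.A and Reg.V, which feed the emotion visualization.]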

The model in detail

Training: Training Data → Preprocessing → Subjective Test and Feature Extraction → Regressor Training → Reg.A, Reg.V

Testing: Test Data → Preprocessing → Feature Extraction → Reg.A, Reg.V → Emotion Visualization

An Issue of the Continuous Perspective

The two dimensions, arousal and valence, depend on each other.

What is positive music?

Then what is energetic music?

Principal Component Analysis (PCA) is a common way of reducing the correlation between variables.

[Figure: original data on the energetic/calm axis, with the principal components computed by PCA.]

Reducing Correlation Between Variables

AV plane: some dependency exists.

PC plane (axes p and q): no dependency exists.

Train regressors R_p and R_q.

Test in the PC plane and compare with the AV plane. Details follow later in the presentation.
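The decorrelation step can be sketched for the two-dimensional case as a rotation of the centered (arousal, valence) labels onto their principal axes. This is a minimal illustration under assumptions: the sample labels are made up, the function names are hypothetical, and the closed-form rotation angle is used instead of a general PCA routine.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def covariance(pairs: Sequence[Point]) -> float:
    """Population covariance between the two coordinates."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


def pca_rotate_2d(points: Sequence[Point]) -> List[Point]:
    """Map (arousal, valence) points to uncorrelated (p, q) coordinates."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    centered = [(a - mean_a, v - mean_v) for a, v in points]
    var_a = sum(a * a for a, _ in centered) / n
    var_v = sum(v * v for _, v in centered) / n
    cov_av = sum(a * v for a, v in centered) / n
    # Angle of the first principal axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cov_av, var_a - var_v)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * a + s * v, -s * a + c * v) for a, v in centered]


# Hypothetical AV labels in which arousal and valence rise together.
labels = [(0.8, 0.6), (0.4, 0.5), (-0.2, -0.1), (-0.6, -0.4), (0.2, 0.0)]
rotated = pca_rotate_2d(labels)
print(covariance(labels))   # clearly non-zero in the AV plane
print(covariance(rotated))  # approximately zero in the PC plane
```

Regressors for p and q would then be trained on the rotated labels, and predictions rotated back if AV coordinates are needed for display.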

Dataset

195 popular songs selected from a number of Western, Chinese, and Japanese albums.

1) These songs should be distributed uniformly in each quadrant of the emotion plane.

2) Each music sample should express a certain dominant emotion.

Subjective Test

253 volunteers from the campus took part.

Each volunteer is asked to listen to ten music samples randomly drawn from the music database and to label the AV values from -1.0 to 1.0 in 11 ordinal levels.

Volunteers label the evoked emotion rather than the perceived one.

The standard deviation of the ratings given to the same song is about 0.3 (which is acceptable).

The same person tends to give the same label to the same music.
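The labeling scale and the per-song spread can be illustrated with a short sketch. It assumes that the ground truth of a song is the mean of the labels it received, which the slide does not spell out; the sample ratings and the function name `summarize_ratings` are hypothetical.

```python
import statistics
from typing import Dict, Sequence, Tuple

# The 11 ordinal levels from -1.0 to 1.0 in steps of 0.2.
LEVELS = [round(-1.0 + 0.2 * k, 1) for k in range(11)]


def summarize_ratings(ratings: Sequence[Tuple[float, float]]) -> Dict[str, float]:
    """Summarize the (arousal, valence) labels that one song received.

    Assumption: the ground truth is taken as the mean label; the standard
    deviation measures how much the subjects disagree.
    """
    arousal = [a for a, _ in ratings]
    valence = [v for _, v in ratings]
    return {
        "arousal_mean": statistics.mean(arousal),
        "valence_mean": statistics.mean(valence),
        "arousal_std": statistics.pstdev(arousal),
        "valence_std": statistics.pstdev(valence),
    }


# Hypothetical labels from three subjects for a single song.
song_ratings = [(0.2, 0.4), (0.6, 0.0), (0.4, 0.2)]
summary = summarize_ratings(song_ratings)
print(LEVELS)
print(summary)
```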

Feature Extraction

• PsySound aims to model parameters of auditory sensation based on some psychoacoustic models.
• Earlier research found that 15 of the features are more closely related to emotion perception.

From all the extracted features, select those that are related to emotion.

RReliefF is used as the feature selection algorithm (FSA).

RRF_{m,n} is the feature space made up of the top-m and top-n selected features.
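The top-m and top-n selection can be sketched as a ranking by feature weight. The sketch assumes that the weights have already been produced by a feature selection algorithm such as RReliefF (which is not implemented here) and that m applies to arousal and n to valence. The feature names and weight values are hypothetical.

```python
from typing import Dict, List


def top_k_features(weights: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest weights, best first."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:k]


# Hypothetical per-feature weights, standing in for RReliefF output.
arousal_weights = {
    "loudness": 0.31,
    "tempo": 0.27,
    "spectral_centroid": 0.12,
    "dissonance": 0.05,
}
valence_weights = {
    "loudness": 0.04,
    "tempo": 0.09,
    "spectral_centroid": 0.10,
    "dissonance": 0.22,
}

# Feature space with m = 2 features for arousal and n = 3 for valence.
rrf_2_3 = {
    "arousal": top_k_features(arousal_weights, 2),
    "valence": top_k_features(valence_weights, 3),
}
print(rrf_2_3)
```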

Regression Algorithms

Three regression algorithms:

1. Multiple linear regression (MLR)
• Assumes a linear relationship
• Simple method

2. Support vector regression (SVR)
• Nonlinearly maps input features into a higher-dimensional feature space
• In many cases superior to existing machine learning methods

3. AdaBoost.RT (BoostR)
• Nonlinear regression algorithm
• A number of regression trees are trained iteratively and weighted according to their prediction accuracy
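Of the three algorithms, multiple linear regression is the easiest to show in full. The following is a minimal sketch that fits MLR by solving the normal equations with Gaussian elimination, using only the standard library. The function names and the toy data are hypothetical, and this is not the implementation used in the paper.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_mlr(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Return [intercept, w_1, ..., w_d] minimizing the squared error."""
    design = [[1.0] + list(row) for row in features]
    d = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coeffs: Sequence[float], row: Sequence[float]) -> float:
    """Apply the fitted linear model to one feature vector."""
    return coeffs[0] + sum(w * x for w, x in zip(coeffs[1:], row))


# Hypothetical data generated from y = 0.5 + 0.2 * x1 - 0.3 * x2.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
arousal = [0.5, 0.7, 0.2, 0.4]
coeffs = fit_mlr(features, arousal)
print(coeffs)
```

In the setting of the slides, one such model would be fitted for arousal and another for valence, giving Reg.A and Reg.V. SVR and AdaBoost.RT replace the linear model with nonlinear ones but are trained on the same feature and label pairs.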

Evaluation

Method

R² statistics: show how close the predictions are to the real values.

AV and PC plane comparison: the effect of the dependency between the two dimensions.

[Results table annotations: "The best combination"; "No significant difference".]
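The R² statistic can be sketched with its standard definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return 1 - SS_res / SS_tot for paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# A perfect prediction scores 1.0; always predicting the mean scores 0.0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

A value of 1.0 means the regressor explains all of the variation in the ground truth, and a value of 0.0 means it does no better than always predicting the mean.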

Evaluation

Regressor Comparison

A plane with no correlation

Selected feature space

Evaluation – The Prediction Accuracy

[Figure legend: ground truth (+) and prediction result.]

The best performance of the regression approach, measured by the R² statistics, reaches 58.3% for arousal and 28.1% for valence, using PC + RRF with SVR.

Performance Evaluation

Using the same ground truth data and feature data.

[Comparison figure: reported values 100.3 and 117.7; their labels are not recoverable.]

Discussion

Subjectivity issue

Individual differences: emotion perception is influenced by many factors, such as cultural background, generation, sex, and personality.

GWMER (group-wise MER scheme)

Personalization can be an alternative.

[Diagram: users are divided into groups G1, G2, G3, G4, ...; regressor choosing maps each group to its own regressor R1, R2, R3, R4, ...]
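The regressor-choosing step of the group-wise scheme can be sketched as a lookup from a user's group to the regressor trained for that group. Everything here is an assumption for illustration: the group keys, the stand-in regressors, and the fallback to a general regressor for users outside the known groups.

```python
from typing import Callable, Dict, Sequence

Regressor = Callable[[Sequence[float]], float]


def choose_regressor(
    group: str,
    group_regressors: Dict[str, Regressor],
    fallback: Regressor,
) -> Regressor:
    """Pick the regressor for the user's group, or the general one."""
    return group_regressors.get(group, fallback)


def general(features: Sequence[float]) -> float:
    """Hypothetical stand-in for a regressor trained on all users."""
    return 0.0


# Hypothetical stand-ins for the regressors R1 and R2 of groups G1 and G2.
group_regressors: Dict[str, Regressor] = {
    "G1": lambda features: 0.5 * features[0],
    "G2": lambda features: -0.5 * features[0],
}

predict = choose_regressor("G2", group_regressors, general)
print(predict([0.8]))
```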

Contribution

One of the first attempts to develop an MER system from a continuous perspective (each song maps to a point in the emotion plane).

A sound theoretical foundation is proposed.

Regression theory.

Extensive performance study.

Several algorithms are tested

Dealing with the subjectivity issues of Music Emotion Recognition (MER).

Emotion differs from person to person.

The dependency between the two dimensions of the emotion plane is reduced (PC plane).

Thank you…

Q&A
