Video summarization by graph optimization Lu Shi Oct. 7, 2003




Outline

  • Introduction
  • Goals
  • Stage I: Candidate video shot selection
      • Video segmentation
      • Video feature detection
      • Candidate video shots
  • Stage II: Graph based video summary generation
      • Dissimilarity function
      • Spatial-temporal relation graph
      • Optimization
  • Experiments and Results
  • Conclusion & Future Work


Introduction

Motivation
  • A huge volume of video data is distributed over the Web.
  • How can we help the user grasp the content of a video quickly?
  • When bandwidth is narrow, how should the video be presented to the user?

Applications
  • Video skimming (dynamic)
  • Static story board (static)


Goals

Criteria for a video summary
  • Conciseness: the video skimming should not exceed the given target length.
  • Comprehensive coverage: both the visual diversity and the temporal distribution of the original video should be covered.
  • Visual coherence: the video skimming should not be too jumpy.


Stage I: Candidate shot selection

Video segmentation
  • A video shot is an unbroken sequence of images recorded continuously by a camera.
  • The content of a video shot can be represented by key frames (e.g. the first and last frames).
  • A video sequence is formed by a series of video shots.
  • Video shots can be detected by various video segmentation methods.


Stage I: Candidate shot selection

Video segmentation
  • Build a middle slice image by concatenating the center lines of the video frames.
  • Calculate the minimal pixel difference between rows.
  • Apply filtering and thresholding.
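The three segmentation steps above can be sketched in Python with NumPy. This is only an illustrative reading of the slide, not the author's implementation. It assumes grayscale frames given as 2-D arrays and takes the "center line" to be the horizontal middle row. It reads "minimal pixel difference" as the smallest mean absolute difference over a few horizontal shifts, and "filtering" as subtracting a running median. The function names and the `max_shift`, `window` and `threshold` values are all hypothetical.

```python
import numpy as np

def middle_slice_image(frames):
    """Stack the center row of every frame; row t comes from frame t."""
    return np.stack([f[f.shape[0] // 2] for f in frames]).astype(np.float64)

def row_differences(slice_img, max_shift=2):
    """Minimal mean absolute difference between consecutive rows.

    The minimum is taken over small horizontal shifts (an assumption),
    so that slow camera pans do not look like shot boundaries.
    """
    n, w = slice_img.shape
    diffs = np.zeros(n - 1)
    for t in range(n - 1):
        a, b = slice_img[t], slice_img[t + 1]
        best = np.inf
        for s in range(-max_shift, max_shift + 1):
            # Compare only the overlapping part of the two rows after shifting.
            if s >= 0:
                d = np.abs(a[s:] - b[:w - s]).mean()
            else:
                d = np.abs(a[:w + s] - b[-s:]).mean()
            best = min(best, d)
        diffs[t] = best
    return diffs

def detect_cuts(diffs, window=5, threshold=30.0):
    """Subtract a running median, then threshold.

    Returns the index of the first frame of each newly detected shot.
    """
    pad = window // 2
    padded = np.pad(diffs, pad, mode="edge")
    baseline = np.array([np.median(padded[i:i + window]) for i in range(len(diffs))])
    return [int(t) + 1 for t in np.nonzero(diffs - baseline > threshold)[0]]
```

For example, ten dark frames followed by ten bright frames produce a single large row difference between frames 9 and 10, so `detect_cuts` reports a new shot starting at frame 10.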


Stage I: Candidate shot selection

Video feature detection
  • Face detection
  • Voice, noise detection
  • Audio volume
  • Specific color (fire, etc.)
  • Text caption

Features indicate interesting content that should be considered for inclusion in the summary.


Stage I: Candidate shot selection

Select candidate shots
  • With interesting features extracted
  • Any combination of extracted features
  • Adjacent candidate shots can be merged into video shot clusters to increase the visual coherence
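A minimal Python sketch of candidate selection and merging follows. It assumes each shot is a dictionary with a hypothetical `features` set of detected feature names, and that "adjacent" means consecutive shot indices. The slides do not give the exact selection rule. Here a shot qualifies if it shows at least one of the wanted features, which is one possible reading of "any combination".

```python
def select_candidates(shots, wanted):
    """Return indices of shots that show at least one wanted feature."""
    return [i for i, shot in enumerate(shots) if shot["features"] & wanted]

def merge_adjacent(candidate_indices):
    """Group runs of consecutive shot indices into shot clusters."""
    clusters = []
    for i in candidate_indices:
        if clusters and i == clusters[-1][-1] + 1:
            clusters[-1].append(i)   # extends the current run
        else:
            clusters.append([i])     # starts a new cluster
    return clusters
```

With five shots whose features are `{"face"}`, `set()`, `{"text"}`, `{"face", "voice"}` and `{"fire"}`, asking for `{"face", "text"}` selects shots 0, 2 and 3, and merging yields the clusters `[[0], [2, 3]]`.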


Stage II: Graph modeling

Video shot pairwise dissimilarity function
  • Visual (spatial) similarity: histogram correlation between key frames
  • Temporal distance: the distance between shot center points
  • Definition:

$$\mathrm{Dis}(sh_i, sh_j) = 1 - \mathrm{VisualSim}(sh_i, sh_j)\, e^{-k \cdot \mathrm{TemporalDis}(sh_i, sh_j)}$$


Stage II: Graph modeling

Video shot pairwise dissimilarity function
  • Linear with visual dissimilarity
  • Exponential with temporal distance, to approximate the user’s memory (k = 400 in the experiment)
  • Definition:

$$\mathrm{Dis}(sh_i, sh_j) = 1 - \mathrm{VisualSim}(sh_i, sh_j)\, e^{-k \cdot \mathrm{TemporalDis}(sh_i, sh_j)}$$

  • A similar definition applies to video clusters
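The dissimilarity function can be sketched in Python as follows. Several details are assumptions, not taken from the slides. Histogram correlation is computed as a Pearson correlation clipped to [0, 1]. The temporal distance is normalized by a hypothetical `video_length` parameter, since the slides do not state the unit of the distance. The exponent is taken to be negative so that similarity fades as shots move apart in time.

```python
import numpy as np

def histogram_correlation(h1, h2):
    """Pearson correlation of two histograms, clipped to [0, 1]."""
    h1 = np.asarray(h1, dtype=np.float64)
    h2 = np.asarray(h2, dtype=np.float64)
    d1, d2 = h1 - h1.mean(), h2 - h2.mean()
    denom = np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
    if denom == 0.0:
        return 0.0  # a flat histogram carries no correlation information
    return float(np.clip((d1 * d2).sum() / denom, 0.0, 1.0))

def shot_dissimilarity(hist_i, hist_j, center_i, center_j, video_length, k=400.0):
    """Dis = 1 - VisualSim * exp(-k * TemporalDis).

    Normalizing the temporal distance by video_length is an assumption.
    """
    visual_sim = histogram_correlation(hist_i, hist_j)
    temporal_dis = abs(center_i - center_j) / video_length
    return float(1.0 - visual_sim * np.exp(-k * temporal_dis))
```

Two shots with identical key-frame histograms and the same center point get a dissimilarity of 0. The same two histograms placed far apart in time approach 1, which matches the idea that the viewer has already forgotten the earlier shot.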


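Stage II: Graph modeling

Video shot cluster pairwise dissimilarity function
  • Between one video shot and one video shot cluster:

$$\mathrm{Dis}(sh_i, sc_j) = \sum_{sh_x \in sc_j} \mathrm{Dis}(sh_i, sh_x)\, \frac{\mathrm{length}(sh_x)}{\mathrm{length}(sc_j)}$$

  • Between two shot clusters:

$$\mathrm{Dis}(sc_i, sc_j) = \sum_{sh_y \in sc_i} \mathrm{Dis}(sh_y, sc_j)\, \frac{\mathrm{length}(sh_y)}{\mathrm{length}(sc_i)}$$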
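Both cluster formulas are length-weighted averages of shot-level dissimilarities. A short Python sketch makes this explicit. It assumes a cluster is a list of shot indices, `length` maps a shot index to its length in frames, and `dis` is any shot-to-shot dissimilarity function. All of these names are hypothetical.

```python
def shot_to_cluster_dis(i, cluster, dis, length):
    """Length-weighted average of Dis(sh_i, sh_x) over the shots x in the cluster."""
    total = sum(length[x] for x in cluster)
    return sum(dis(i, x) * length[x] for x in cluster) / total

def cluster_to_cluster_dis(cluster_i, cluster_j, dis, length):
    """Length-weighted average of Dis(sh_y, sc_j) over the shots y in cluster_i."""
    total = sum(length[y] for y in cluster_i)
    return sum(
        shot_to_cluster_dis(y, cluster_j, dis, length) * length[y]
        for y in cluster_i
    ) / total
```

A single-shot cluster reduces to the plain shot-level value, so shots and clusters can be mixed in the same graph.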


Stage II: Graph modeling

  • Model the candidate shot set as a directed graph G(V, E) that conveys both the spatial and the temporal properties of the video.
  • A vertex v_i corresponds to a video shot; the weight on the vertex is the shot’s length.
  • An edge e_ij corresponds to the dissimilarity between video shot i and shot j.
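One way to hold this graph in Python is sketched below. It assumes edges run only from an earlier shot to a later one, which the slides imply but do not state, and that `dis_fn` is any pairwise dissimilarity function. The dictionary layout is illustrative only.

```python
def build_graph(lengths, dis_fn):
    """Build a directed graph: vertex weights are shot lengths,
    edge (i, j) with i < j carries the dissimilarity of shots i and j."""
    n = len(lengths)
    edges = {(i, j): dis_fn(i, j) for i in range(n) for j in range(i + 1, n)}
    return {"vertex_weight": list(lengths), "edge_weight": edges}
```

Because every edge points forward in time, the graph is acyclic, which is what makes a longest-path search by dynamic programming feasible.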


Stage II: Graph modeling

The real shot/cluster pairwise dissimilarity function


Stage II: Graph based video summary generation

Video skimming generation
  • Given a target video skimming length SummaryLength
  • A path in the spatial-temporal relation graph corresponds to a set of video shots
  • The objective function is the length of the path
  • Find the longest path, with the constraint that the vertex weight summation of the path is within [SummaryLength − threshold, SummaryLength]


Stage II: Graph based video summary generation

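Optimal substructure
  • We denote the state as (ThisShot, LeftSize).
  • The optimal substructure is:

$$\mathrm{opt}(\mathrm{ThisShot}, \mathrm{LeftSize}) = \max_{\mathrm{ThisShot}+1 \le \mathrm{NextShot} \le \mathrm{ShotNum}} \Big( \mathrm{opt}\big(\mathrm{NextShot},\ \mathrm{LeftSize} - \mathrm{length}(\mathrm{NextShot})\big) + \mathrm{Dis}(sh_{\mathrm{ThisShot}}, sh_{\mathrm{NextShot}}) \Big)$$

  • If LeftSize is too small, then opt(ThisShot, LeftSize) = 0.
  • We can then use dynamic programming to find the best solution.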


Stage II: Graph based video summary generation

Dynamic programming
  • Set opt(LastShot, 0..threshold) to 0.
  • Set opt(LastShot, threshold+1..SummaryLength) to -X.
  • Calculate opt(ThisShot, LeftSize) with the optimal substructure equation, for ThisShot from LastShot-1 down to 0.
  • Get opt(0, SummaryLength), which is the longest path’s length. Then trace back to find the path.
  • Time complexity: $O(n^2 \cdot \mathrm{SummaryLength})$
  • Space complexity: $O(n \cdot \mathrm{SummaryLength})$
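The dynamic programme can be sketched in Python as below. It follows the recurrence but departs from the slide in two places. The slide evaluates opt(0, SummaryLength) and initialises the table only at LastShot. This sketch instead lets any shot start the path, and lets the path stop at any shot once the unused budget is at most `threshold`. It also assumes integer shot lengths in frames and a precomputed matrix `dis[i][j]`. The function name and return format are hypothetical.

```python
def longest_path_summary(lengths, dis, summary_length, threshold):
    """Pick an increasing sequence of shots whose total length lies in
    [summary_length - threshold, summary_length] and whose summed
    pairwise dissimilarity along the path is maximal."""
    n = len(lengths)
    NEG = float("-inf")  # plays the role of -X: marks infeasible states
    # opt[t][left]: best path value continuing from shot t with `left` frames unused.
    opt = [[NEG] * (summary_length + 1) for _ in range(n)]
    choice = [[None] * (summary_length + 1) for _ in range(n)]
    for t in range(n - 1, -1, -1):
        for left in range(summary_length + 1):
            # Stopping at t is feasible only if the leftover budget is small enough.
            best, arg = (0.0 if left <= threshold else NEG), None
            for nxt in range(t + 1, n):
                rest = left - lengths[nxt]
                if rest >= 0 and opt[nxt][rest] != NEG:
                    cand = opt[nxt][rest] + dis[t][nxt]
                    if cand > best:
                        best, arg = cand, nxt
            opt[t][left], choice[t][left] = best, arg
    # Try every shot as the first shot of the summary.
    best_start, best_value = None, NEG
    for s in range(n):
        left = summary_length - lengths[s]
        if left >= 0 and opt[s][left] > best_value:
            best_start, best_value = s, opt[s][left]
    if best_start is None:
        return [], NEG
    # Trace back through the recorded choices to recover the path.
    path, t, left = [], best_start, summary_length - lengths[best_start]
    while t is not None:
        path.append(t)
        nxt = choice[t][left]
        if nxt is not None:
            left -= lengths[nxt]
        t = nxt
    return path, best_value
```

The three nested loops over the shot, the remaining budget and the next shot give the n² × SummaryLength running time, and the two tables account for the n × SummaryLength memory.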


Stage II: Graph based video summary generation

Video skimming generation
  • The generated video skimming based on video shots and video shot clusters is shown below (SummaryLength = 1500, Video Length = 11479).


Stage II: Graph based video summary generation

Static video story board generation
  • The static video story board is generated with the key frames of the skimming video shots.


Stage II: Graph based video summary generation

Evaluation
  • The generated video skimming captures both the visual diversity and the temporal coverage.
  • A large-scale subjective test has not been carried out yet. (Does it make sense?)
  • Quantitative objective evaluation is a big problem.


Future work

  • Combine with video structure
      • V-Toc (Video table of contents)
      • Video shot groups
      • Video scenes


Future work

Video structure
  • Video shot group and video scene


Q & A

Thank you!