64
新新新新 - 新新新新新新新新新 新新新新新新新新新新新新新新 新新新新新新 新新新新新新新 新新新新新新新新 新新新新

新的挑战 - 基于内容的信息处理

  • Upload
    lorna

  • View
    148

  • Download
    0

Embed Size (px)

DESCRIPTION

新的挑战 - 基于内容的信息处理. 张 钹 清华信息科学与技术国家实验室 智能技术与系统国家重点实验室 清华大学信息学院、计算机系. 一、经典信息论 Classical Information Theory. Semantics vs. Information Processing Frequently the messages have meaning: that is they refer to or are correlated according to some - PowerPoint PPT Presentation

Citation preview

Page 1: 新的挑战     - 基于内容的信息处理

新的挑战 - 基于内容的信息处理

张 钹清华信息科学与技术国家实验室智能技术与系统国家重点实验室

清华大学信息学院、计算机系

Page 2: 新的挑战     - 基于内容的信息处理

一、经典信息论Classical Information

Theory

Semantics vs. Information Processing

• Frequently the messages have meaning: that is

they refer to or are correlated according to some

system with certain physical or conceptual entities.

• These semantic aspects of communication

irrelevant to the engineering problem.

C. E. Shannon, A mathematical theory of communication, 1948

Page 3: 新的挑战     - 基于内容的信息处理

经典通讯模型

• Communication Model:

Sender

Receiver

(Markov) Stochastic Process

-C. E. Shannon

( ) ( ) ( )P x P y x P x y

Page 4: 新的挑战     - 基于内容的信息处理

图灵计算理论Turing Computation Theory

Deterministic Turing Machine (1937)

Computability

Algorithms

Computational Complexity

P Read-write Head

tape -4 -3 -2 -1 0 +1

QFinite State Controller

Page 5: 新的挑战     - 基于内容的信息处理

经典信息论下的信息处理Information Processing

The conversion of latent information intomanifest information _C. E. Shannon

X -p(x) Y -

Dissipation Equivocation

Transformation=equivocation-dissipation

Uncertainty (error, noisy, deformation,…)

( )p x y

Page 6: 新的挑战     - 基于内容的信息处理

经典信息论在文本、图像处理中的应用

The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel

• Communication: coding, data

compression, noisy-channel coding,..• Text: editor, compression, spell and syntax correction,…• Image: editor, lossless compression, noise suppression, enhancement, …

Page 7: 新的挑战     - 基于内容的信息处理

二、基于内容(语义)的信息处理

Information (Web) Age

• Complex (variety of ) information:

text, image, speech, video,…

• A huge amount of data

• Man-Machine Interaction

Content (Semantics) based Information Processing

Information Retrieval, Classification, Summary,

Recognition, Understanding,..

Page 8: 新的挑战     - 基于内容的信息处理

Computer

Input

(Encoding)

Output

(Decoding)

Users

Codes Form

Content

从处理形式( Form )到与内容( Semantics)相关的处理

Natural Man-Machine Interaction

Computer Network

Page 9: 新的挑战     - 基于内容的信息处理

三、句法分析( Syntact Analysis ) Holistic Coding (Gestalt)

The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.

数字视频编码技术发展至今已有半个世纪的历史,已取得很大的进展。从五十年代的差分预测编码,到七十年代的变换编码、基于块的运动预测编码,直到如今兴起的分布式编码、立体视编码、多视编码、视觉编码等等

11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100

00001001010100111110101010000011101010110100010001110101000110010110111010010011001000100111100100000111

Semantics S Code (Form) X

?

Page 10: 新的挑战     - 基于内容的信息处理

Rule-based, Knowledge-based, Top-down, Parsing (text)

Advantages: Deliberative behaviors (AI expert systems: decision making, diagnosis, design,…) Specific domain, Small-size

Disadvantages: Perception, Common sense, Nature language,..

Uncertainty (exception, ambiguity, vague,..)

Page 11: 新的挑战     - 基于内容的信息处理

Detector (检测子) Semantically meaningful primitives

Text: word-sentence-chapter-

Image: subpart-part-object-

There is no clear boundary among parts

Segmentation Problem

Descriptor (描述子) Structural uncertaintyK. S. Fu, Syntactic pattern recognition, New York: Prentice-Hall, 1974

D. Marr, Vision, New York: Freeman, 1982

Page 12: 新的挑战     - 基于内容的信息处理

图像分割(分词)

Where is the object ?

What is the object ?

Chicken orEgg ?

Page 13: 新的挑战     - 基于内容的信息处理

结构分析Structural analysis, Rule-based,

Syntax

11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100

Image: Part, Object, … Text: Words, Sentences, …

• Uncertainty problem

• Scalability

Syntactic Analysis Faced Difficulty !

The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.

Page 14: 新的挑战     - 基于内容的信息处理

四、概率统计模型

Image (text) Classification:

Categories

Low level and local features (words)

Computer easily detectable but less

semantically meaningful

Colors, Textures, Bag of words

Page 15: 新的挑战     - 基于内容的信息处理

不确定性处理 Uncertainty Management

• How to deal with uncertainty ?

• A probabilistic information processing model

Regularity:

Probabilistic distribution

Examples:

Caltech 101, 25 objects

R. O. Duda & P. E. Hart, Pattern Classification and Scene Analysis, New York: John Wiley & Sons, 1973

* * * * * * * * * * * * ** * * ** * · · · · ·

··· · ·· ·· ·· · ······· ·· · · · · ·

Page 16: 新的挑战     - 基于内容的信息处理

词袋法 Bag of (Visual) Words

• Defined in image patches (2005-06)

• Descriptors extracted around interest points

(2002-2004)

• Edge contours (2005-06)

• Regions (2005-06)

Page 17: 新的挑战     - 基于内容的信息处理

检测子与描述子 Detector & Descriptor

Kadir salience

region (points)

Histograms of Oriented Gradients (HOG)

-72 dimensionZuo Yuanyuan (2010-)

Page 18: 新的挑战     - 基于内容的信息处理

1

1 arg min( ( , ))1( )

0

n iv V

i

if w D v rCB w

n otherwise

CB-Codebook (Histograms)

n-the number of regions in an image

ri-an image region i

D(w,ri)-the distance between a codeword w

and region ri

V-vocabulary

Page 19: 新的挑战     - 基于内容的信息处理

数据空间的稀疏结构 - Sparsity

Page 20: 新的挑战     - 基于内容的信息处理

高维空间中的低维结构 -low-dimensional structure of high-dimensional space

Objects Precision

Airplane 0.947

Car-side 0.987

Dalmatian 0.733

Faces 0.760

Leopard 0.827

Pagoda 0.760

Stop-sign 0.800

Windsor-chair 0.787 Data Structure

Page 21: 新的挑战     - 基于内容的信息处理

优化yAxtosubjectxx

11 minargˆ

211 minargˆ yAxtosubjectxx

Sparse representation in sample space

L1 norm

Page 22: 新的挑战     - 基于内容的信息处理

图像库

Page 23: 新的挑战     - 基于内容的信息处理

实验结果

Page 24: 新的挑战     - 基于内容的信息处理

数据的空间结构(非稀疏性)

Page 25: 新的挑战     - 基于内容的信息处理

数据驱动法( Data-driven )

-learning from web data

Germany

Speech Recognition

Speech Synthesis

Text1Translation

Service

Text2

English

Learning from Data

(probabilistic models)

Annotation

Microsoft Research Asia

Page 26: 新的挑战     - 基于内容的信息处理

概率方法的基本缺陷• The semantic gap between low-level local features and high-level global concepts

Less semantically meaningful features: colors or their distribution (histogram), gray-values or their distribution, visual words (descriptors from interest points), image patches, image regions, edge, …

• Lack of structural knowledge

Generalization capacity

Information processing without understanding

Page 27: 新的挑战     - 基于内容的信息处理

五、新的研究方向

Sender

Reader

X X

S Uncertainty F (W,D)

Page 28: 新的挑战     - 基于内容的信息处理

句法分析的复苏Holistic (Gestalt), Probability, Inference

Information Structure Analysis

• Syntactic + Probabilistic

• Data-driven + Knowledge-driven

• Bottom-up + Top-down

• Part-based (Shape-based)

Page 29: 新的挑战     - 基于内容的信息处理

信息结构 Information Structure

The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.

数字视频编码技术发展至今已有半个世纪的历史,已取得很大的进展。从五十年代的差分预测编码,到七十年代的变换编码、基于块的运动预测编码,直到如今兴起的分布式编码、立体视编码、多视编码、视觉编码等等

11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100

00001001010100111110101010000011101010110100010001110101000110010110111010010011001000100111100100000111

Semantics Code (Holistic)

?

Page 30: 新的挑战     - 基于内容的信息处理

信息结构分析

History (Structural Analysis)(1) LinguisticsInformation Structure-Syntactic representationInformation packaging, Building the semanticsFocus, Topic, background, comment,…

• M. A. K. Halliday (1967) Notes on transitivity and theme in English, Part II, Journal of Linguistics 3:199-244• W. L. Chafe, Language and consciousness, Language 50:111-133

Page 31: 新的挑战     - 基于内容的信息处理

(2) Psychology Structural Information Theory• Coding: descriptive complexities_ frequencies of occurrence (Shannon)• Information Structure: Formalization of visual regularity: iteration, symmetry, and alternation

E. L. J. Leeuwenberg (1968) Structural information of visual patterns: an efficient coding system in perception, The Hague: MoutonE.L. J. Leeuwenberg (1969), Quantitative specification of information in sequential patterns, Psychological Review, 76, 216-220

Page 32: 新的挑战     - 基于内容的信息处理

(3) Information Science Algorithmic Information Theory Kolmogorov Complexity 16 bits 1111111111111111 0101101001011010 1001110111010111 1100010001001011

R. J. Solomonoff (1964), A formal theory of inductive inference,

Information & Control, V.7, No.1, pp1-22, No.2, pp. 224-254

A. N. Kolmogorov (1965), Three approaches to the definition of

the quantity of information, Problems of Information Transmission, No. 1, pp. 3-11

Page 33: 新的挑战     - 基于内容的信息处理

信息结构的挖掘与利用• Kolmogorov complexity

• Related data structures

low-dimensional structure in high-dimensional

data space

• Under the probabilistic (structure) framework

• Learning from human being

Page 34: 新的挑战     - 基于内容的信息处理

(1) Kolmogorov Complexity

min : ( ) ,( )

p f p x if x ran fK x

otherwise

The absolute measure of information content

in an individual finite object

Algorithmic information theory

The minimum description length

Page 35: 新的挑战     - 基于内容的信息处理

信息距离 Information Distance

max ( ), ( )( , )

max ( ), ( )

K x y K y xd x y

K x K y

K(.) is non-computable

C. Bennett , et al, Information distance, IEEE Trans. on Information Theory, vol.44, no.4, 1998, pp.1407-1423

min ( ), ( )( , )

min ( ), ( )

K x y K y xd x y

K x K y

Page 36: 新的挑战     - 基于内容的信息处理

A statistical approach to Kolgomorov complexity

(information distance)

f(x)-frequency

N-total number

( , ) ( , )

log ( , ) max log ( ),log ( )

log min log ( ),log( )

d x y NSD x y

f x y f x f y

N f x y

近似计算方法

Page 37: 新的挑战     - 基于内容的信息处理

语义距离与形式距离 Semantic & Formal Distance

max ( ), ( )( , )

max ( ), ( )

C x y C y xCDM x y

C x C y

CDM-Compressed Distance Measure (ASCII)

NSD-Normalized Statistical Distance

Xian Zhang, Yu Hao, et al, New information distance measure and its application in question answering system, J. Comput. Sci. & Tech. 23(4), 557-572, 2008

Page 38: 新的挑战     - 基于内容的信息处理

两篇文本之间的信息距离

1 1 11 12 1

2 2 21 22 2

, ,...,

, ,...,

n

m

T W w w w

T W w w w

Text representation: a bag of words

max 1 2 1 2 2 1

1 2 1 2

2 1 2 1

( , ) max ( ), ( )

( ) ( \ )

( ) ( \ )

( ) ( ), ( ) log ( )

i ji j

i ji j

w W

D T T K T T K T T

K T T K w w

K T T K w w

K W K w k w f w

Page 39: 新的挑战     - 基于内容的信息处理

文本(图像)检索

1

2

3

...

n

t

t

t

t

Image-Text

information distance

Web. . . .

. . .

.

. . .

.

. . .

.

. . .

.

. . .

.

. . . .

. . .

.

. . .

.

. . .

.

. . .

.

. . .

.

1

2

.

n

i

i

i

Page 40: 新的挑战     - 基于内容的信息处理

(2) 数据结构 -low-dimensional structure of high-dimensional space

Sparse Representation1 1

arg minx x subject to

Ax y or

BAx z

A: mn matrixx: sample space, n-dimensiony: pixel, m-dimensionz: feature space, d-dimension Yi Ma, UIUC, USA

Page 41: 新的挑战     - 基于内容的信息处理

(3) 扁平的结构模型 -Image Region Annotation: horse, sky, mountain, grass, tree

Yuan Jinhui (2008-)

Page 42: 新的挑战     - 基于内容的信息处理

区域自适应的网格划分 Region-adaptive Grid Partition

,,

* arg max ( ) arg max ( ) ( )

1( , ) exp ( ( ), ) ( , )

( , ) ( ) ( )

z z

i i i i j i ji i j

z p z I p I z p z

p I z f I x z g z zZ

p I z p I z p z

I-dataxi-image position

z-state (feature)g(zi,zj)-the probabilistic region-basedconstraints (co-occurrence)

Page 43: 新的挑战     - 基于内容的信息处理

不同的扁平结构模型

Page 44: 新的挑战     - 基于内容的信息处理

模型学习

n images, each image has mi=HV grids

( , ) ( , ), 1,2...,

( , ) ( , ), 1,2,...,

i i

j ji i i i i

x y x y i n

x y x y j m

(a) i.i.d generative model

(b) i.i.d. discriminative model

(c) 2-dimensional hidden Markov (2D HMM)

(d) Markov Random Field (MRF)

(e) Conditional Random Field (CRF)

Page 45: 新的挑战     - 基于内容的信息处理

标注配置

( , ), 1,2,...,i ix y i N

Given a training data,

MAP (maximal a posterior) : label configuration

1: 1:* arg max ( )m my P y x

For 2D HMM, MRF, CRF

using path limited Viterbi algorithm

Page 46: 新的挑战     - 基于内容的信息处理

Probabilistic distribution P Cs: labeling clique, C0: labeling and feature cliquey* the optimal label configuration

0

1: 1:

( , ) ( , )

1( , ) ( , ) ( , )

i j s k k

m m i j k k

y y C y x C

P x y y y y xZ

1:0( , ) ( , )

* arg max ( , ) ( , )m

i j s k k

i j k k

yy y C y x C

y y y y x

马尔科夫随机场模型

- MRF model

Page 47: 新的挑战     - 基于内容的信息处理

结构预测学习算法Learning Rules

Classification

Structural Prediction

Maximal Joint Likelihood Estimation

Naïve Bayesian Network

Hidden Markov Model (1966)1

Maximal Conditional Likelihood Estimation

Logistic Regression

Conditional Random Field (2001)2

Maximal Margin Learning

SVM Maximal Margin Markov Net (2003)3

Maximal Entropy Discrimination Learning

Maximal Entropy Discrimination Model

Maximal Entropy Discrimination Markov Net (2008)4

Page 48: 新的挑战     - 基于内容的信息处理

相关文献[1] L. E. Baum and T. Petrie. Statistical Inference for Probabilistic Functions of Finte State Markov Chains. The Annals of Mathematical Statistics, Vol. 37, No. 6, pp.1554-1563, 1966

[2] J. Lafferty et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. of International Conference on Machine Learning (ICML), 2001

[3] B. Taskar et al., Max-Margin Markov Networks. Advances in Neural Information Processing Systems (NIPS), 2003

[4] J. Zhu et al., Laplace Maximum Margin Markov Networks, In Proc. of International Conference on Machine Learning (ICML), 2008

Page 49: 新的挑战     - 基于内容的信息处理

实验设置• 4002 Corel images (384256 or 256384)

• 11 basic (region) concepts

• Features: color moment + wavelet

• 5 models: 2 without structural knowledge

(GMM, SVM)

3 with structural knowledge

(HMM*, RMF*, CRF*)

Page 50: 新的挑战     - 基于内容的信息处理

图像区域标注类别

Page 51: 新的挑战     - 基于内容的信息处理

不同模型的比较

GMM: Gaussian Mixture Model (30 components)

SVM: Support Vector Machine

Gaussian kennel, one-against-one

HMM: Hidden Markov Model

RMF: Random Markov Field

CRF: Conditional Random Field

Limited Path Viterbi Algorithm

Page 52: 新的挑战     - 基于内容的信息处理

实验结果

Page 53: 新的挑战     - 基于内容的信息处理

(4) Latent Hierarchical Model

Web Page Extraction Name, Image, Price, Description, etc.

Web Page

Data Record

Image

Name Desc Price Note

Desc

Note

Data Record

Image

Desc Desc Name Desc Price Note Note

Given Data Record

Hierarchy Computational

efficiency Long-range

dependency Joint extraction

{image}

{name, price}

{name}

{price}

{name}

{price}

{image}

{name, price}

{desc}

{Head}

{Tail}{Info Block}

{Repeat block}

{Note}

{Note}

Zhu Jun (2008- )

Page 54: 新的挑战     - 基于内容的信息处理

Experiment Web page extraction Name, Photo, Price ,

Description Models Multi-lager CRFs, Multi-layer

M^3N, PoMEN, Partially observed HCRFs

Data set: 37 TemplateTraining: 185 (5/per template)

pages, or 1585 data records

Testing: 370 (10/per template) pages, or 3391 data records

Web Page

Data Record

Image

Name Desc Price Note

Desc

Note

Data Record

Image

Desc Desc Name Desc Price Note Note

Page 55: 新的挑战     - 基于内容的信息处理

Results

23/4/22 博士论文 预答辩 55

Performances: Avg F1:

o avg F1 over all attributes

Block instance accuracy:

o % of records whose Name, Image, and Price are correct

每个属性的性能

Page 56: 新的挑战     - 基于内容的信息处理

(5) 人脑视觉机制

Page 57: 新的挑战     - 基于内容的信息处理

Top-down feedback

Top-down feedback

Local connection

Data-driven V1 V2 … IT

High-level

Prior-knowledge

自底向上与自顶向下

Page 58: 新的挑战     - 基于内容的信息处理

• 114 basic functions

Images 1212 pixels

• HOG

视觉基元 Visual Primitive Elements

Page 59: 新的挑战     - 基于内容的信息处理

稀疏编码( Spare Coding )

( , ) ( , )i ii

I x y a x y2

( , ) ( , ) ii i

xy i i

aE I x y a x y S

( , )I x y

( , )i x y

ia

Images

Basic functions

Coefficients

A non-linear function2log(1 )S x or x

Page 60: 新的挑战     - 基于内容的信息处理

Object Recognitionwith sparse, localizedfeatures

MIT-CSAIL-TR-2006-028 T. Serre

Page 61: 新的挑战     - 基于内容的信息处理
Page 62: 新的挑战     - 基于内容的信息处理

HMAX-sum + max

Page 63: 新的挑战     - 基于内容的信息处理

多学科交叉研究Center for Cognitive and Neural Computation

Tsinghua University, Beijing

• Computational Neuroscience

• System Neuroscience

• Intelligent Technology and Systems

• Neural Information and Brain-computer interface

• Learning and Memory

• Cognitive Psychology

Page 64: 新的挑战     - 基于内容的信息处理

谢谢!