Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology. A Study of Learning a Merge Model for Multilingual Information Retrieval. Presenter: Cheng-Hui Chen. Authors: Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen. SIGIR 2008.

A Study of Learning a Merge Model for Multilingual Information Retrieval


Page 1: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

A Study of Learning a Merge Model for Multilingual Information Retrieval

Presenter: Cheng-Hui Chen
Authors: Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen

SIGIR 2008

Page 2: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Outlines
· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Page 3: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Motivation
· In multilingual information retrieval (MLIR), the merged result list usually includes more irrelevant results.
· Traditional merging methods for MLIR assume that relevant documents are homogeneously distributed over the monolingual result lists.

Page 4: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Objectives
· Merge the result lists of different collections, which vary in translation and retrieval quality, into a single result list.
· Propose a merge method that does not assume relevant documents are homogeneously distributed over the monolingual result lists.
· Enhance the quality of the merge model.

Page 5: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Methodology
· Traditional MLIR framework
─ Raw-score
─ Round-robin
─ Normalized-by-top1
─ Normalized-by-top-k
· The proposed learning method
─ FRank
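
The traditional merging strategies named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each monolingual result list is a list of (doc_id, score) pairs already sorted by descending positive score, and the function names are hypothetical. Normalized-by-top-k is read here as dividing each score by the mean of the top k scores in its own list, which reduces to normalized-by-top1 when k = 1.

```python
from itertools import zip_longest


def raw_score_merge(lists):
    """Pool all lists and sort by the original retrieval scores."""
    pooled = [item for lst in lists for item in lst]
    return sorted(pooled, key=lambda item: item[1], reverse=True)


def round_robin_merge(lists):
    """Take the 1st item of every list, then the 2nd of every list, and so on."""
    merged = []
    for tier in zip_longest(*lists):
        merged.extend(item for item in tier if item is not None)
    return merged


def normalized_by_top_k_merge(lists, k=1):
    """Divide each score by the mean of the top-k scores of its own list
    (assumed interpretation), then pool and sort. k=1 is normalized-by-top1."""
    pooled = []
    for lst in lists:
        if not lst:
            continue
        top = lst[:k]
        norm = sum(score for _, score in top) / len(top)
        pooled.extend((doc, score / norm) for doc, score in lst)
    return sorted(pooled, key=lambda item: item[1], reverse=True)
```

With two toy lists whose score ranges differ, raw-score merging lets the higher-scoring collection dominate, while normalization puts the two lists on a comparable scale before sorting.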

Page 6: A Study of Learning a Merge Model  for Multilingual Information Retrieval

MLIR merge process

Feature Set
1. Query levels
2. Document levels
3. Translation levels

The Construction of a Merge Model
1. FRank ranking algorithm
2. BM25
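
For reference, the BM25 retrieval score that feeds the merge process can be sketched as below. The slides do not say which BM25 variant or parameter values were used, so the formula (a common Okapi BM25 form with a smoothed IDF), the defaults k1 = 1.2 and b = 0.75, and the function name are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` (a non-empty list of token
    lists) against `query_terms` with a common Okapi BM25 formulation."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = 1.0 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores
```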

Page 7: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Feature set
· Query levels
─ The terms within a query are manually classified into several pre-defined categories: location/country names (Loc), organization names (Org), event names (EN), and technical terms (TT).
· Document levels
─ The document length (Dlength) and title length (Tlength) are extracted.
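
A rough sketch of how the query-level and document-level features on this slide might be computed is shown below. The slides do not specify the encoding, so two assumptions are made: each category becomes a count of the query terms manually labeled with it, and lengths are measured in whitespace-separated tokens. The function names are hypothetical.

```python
from collections import Counter

# Category labels taken from the slide.
CATEGORIES = ("Loc", "Org", "EN", "TT")


def query_level_features(term_labels):
    """`term_labels` maps each query term to its manually assigned category,
    or to None when the term belongs to no category."""
    counts = Counter(label for label in term_labels.values() if label in CATEGORIES)
    return {category: counts[category] for category in CATEGORIES}


def document_level_features(title, body):
    """Document length and title length, counted in whitespace tokens
    (an assumed unit; the slides do not state one)."""
    return {"Dlength": len(body.split()), "Tlength": len(title.split())}
```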

Page 8: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Feature set
· Translation levels
─ The size of the bilingual dictionary used for each language (i.e., DictSize).
─ The average number of translation equivalents per query term (i.e., AvgTAD). For example, if a query has two query terms, each with three translation equivalents, the AvgTAD of the query is (3 + 3)/2 = 3.

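[Figure: worked AvgTAD example for English → Chinese (英 → 中) translation. Query terms (QT): Order, Park. Chinese translations: Order → 訂單 (purchase order), 順序 (sequence), 命令 (command), 隊形 (formation); Park → 公園 (public park), 停車 (parking). AvgTAD = number of Chinese translations / number of query terms = (4 + 2)/2 = 3. The figure also shows a DictSize column and the query-term category labels EN and Loc, with 斗六 (Douliu) and 食べる (Japanese, "to eat") as example terms.]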
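
The AvgTAD definition can be checked with a few lines of code. This is a minimal sketch: the function name and the dictionary-of-lists input format are assumptions, and the sample data is the Order/Park example from this slide.

```python
def avg_tad(translations):
    """Average number of translation equivalents per query term.
    `translations` maps each query term to its list of equivalents."""
    if not translations:
        return 0.0
    total = sum(len(equivalents) for equivalents in translations.values())
    return total / len(translations)


# Order has four Chinese equivalents and Park has two.
example = {
    "Order": ["訂單", "順序", "命令", "隊形"],
    "Park": ["公園", "停車"],
}
```

Here `avg_tad(example)` averages 4 and 2 over the two query terms.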

Page 9: A Study of Learning a Merge Model  for Multilingual Information Retrieval

The Construction of Merge model
· Following FRank's generalized additive model, a merge model can be represented as a weighted sum of weak learners, where:
─ m_t(x) is a weak learner
─ α_t is the learned weight
─ t is the number of selected weak learners
· The merge model is combined with a retrieval model (BM25) by using a linear combination.
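
The two ideas on this slide, an additive model of weighted weak learners and a linear combination with BM25, can be sketched as follows. Several details are assumptions rather than facts from the slides: the weak learners are modeled as binary threshold functions on a single feature, the weights are given rather than learned, and the coefficient λ (`lam`) is placed on the merge model with 1 − λ on BM25. All names are hypothetical.

```python
def make_stump(feature, threshold):
    """Assumed weak learner: outputs 1.0 if the feature exceeds the threshold."""
    def stump(x):
        return 1.0 if x[feature] > threshold else 0.0
    return stump


def merge_model_score(x, weak_learners, alphas):
    """Additive model: sum of alpha_t * m_t(x) over the selected weak learners."""
    return sum(alpha * m(x) for alpha, m in zip(alphas, weak_learners))


def combined_score(x, bm25, weak_learners, alphas, lam=0.5):
    """Linear combination of the merge model and BM25 (assumed weighting:
    lam on the merge model, 1 - lam on the BM25 score)."""
    return lam * merge_model_score(x, weak_learners, alphas) + (1.0 - lam) * bm25
```

Setting `lam` to 0 falls back to the BM25 score alone, and setting it to 1 ranks by the merge model alone, which makes the coefficient easy to sweep in an experiment.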

Page 10: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Experiments
· Data set
─ The Details of Experimental Collections
─ The Percentage of Retrieved Documents

Page 11: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Experiments
· Mean Average Precision (MAP)
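
As a reminder of how the evaluation metric is computed, here is a minimal sketch of average precision and MAP under the standard definition. It assumes binary relevance judgments and ranked lists without duplicate document IDs; the function names are illustrative.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant document
    appears, divided by the total number of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """`runs` is a list of (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)
```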

Page 12: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Experiments
· The Experimental Results of Our Method Using Different Combination Coefficients λ

Page 13: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Experiments
· Feature Analysis

Page 14: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Conclusions
· The proposed merge model can significantly improve merging quality.
· The merge model indicates that the key factors are the number of translatable terms and compound words.

Page 15: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Conclusions
· Future work
─ Use other learning-based ranking algorithms, such as RankSVM and RankNet.
─ Extract more representative features, such as linguistic features, to construct a merge model.
─ Discover more relations within query terms, such as query term association and substitution.

Page 16: A Study of Learning a Merge Model  for Multilingual Information Retrieval

Comments
· Advantage
─ Improves merging quality.
· Drawback
· Application
─ Multilingual information retrieval.