What is happening at NTCIR

Noriko Kando
National Institute of Informatics, Japan
http://research.nii.ac.jp/ntcir/
kando@nii.ac.jp

NTCIR at CLEF, 2010-09-21

Thanks to Teruko Mitamura, Tetsuya Sakai, Fred Gey, Yohei Seki, Daisuke Ishikawa, Atsushi Fujii, Hidetsugu Nanba, Terumasa Ehara for preparing slides


Page 2

NTCIR: NII Test Collection for Information Retrieval

Research Infrastructure for Evaluating IA

A series of evaluation workshops designed to enhance research in information-access technologies by providing an infrastructure for large-scale evaluations.
■ Data sets, evaluation methodologies, and forum

Project started in late 1997
Once every 18 months

Data sets (Test collections or TCs)
- Scientific, news, patents, and web
- Chinese, Korean, Japanese, and English

Tasks (Research Areas)
- IR: Cross-lingual tasks, patents, web, Geo
- QA: Monolingual tasks, cross-lingual tasks
- Summarization, trend info., patent maps
- Opinion analysis, text mining

Community-based Research Activities

Page 3

Tasks at Past NTCIRs

NTCIR 1 2 3 4 5 6 7 8
'99 '01 '02 '04 '05 '07 '08 '09-
The years the meetings were held. The tasks started 18 months before.

■ Community QA
■ ■ ■ Opinion Analysis
Module-Based ■ ■ Cross-Lingual QA + IR
■ Geo Temporal
■ ■ ■ ■ □ Patent
■ ■ ■ Complex / Any Types
■ ■ Dialog
■ ■ ■ ■ Cross-Lingual
■ ■ ■ ■ Factoid, List
■ ■ ■ ■ ■ Text Mining / Classification
■ ■ ■ Trend Info Visualization
■ ■ ■ Text Summarization
Web ■ ■ ■ Web
■ ■ Statistical MT
■ ■ ■ ■ ■ ■ ■ ■ Cross-Lingual IR
■ ■ ■ ■ ■ ■ ■ ■ Non-English Search
Text Retrieval ■ ■ ■ ■ ■ ■ ■ ■ Ad Hoc IR, IR for QA

Crosslingual Retrieval
User Generated Contents
IR for Focused Domain
Summarization / Consolidation
Question Answering

Page 4

NTCIR-8 Tasks (2008.07-2009.06)

1. Advanced CL Info Access
- QA (CsCtJE->CsCtJ) ※ Any types of questions
- IR for QA (CsCtJE->CsCtJ)
- GeoTime (E, J): Geo Temporal Information

3. Focused Domain: Patent
- Patent Translation; English -> Japanese ※ Statistical MT, the world's largest training data (J-E sentence alignment), Summer School, extrinsic eval by CLIR
- Patent Mining: papers -> IPC
- Evaluation of SMT

The 3rd Int'l Workshop on Evaluating Information Access (EVIA), refereed

Page 5

NTCIR-7 & -8 Program Committee

Mark Sanderson, Doug Oard, Atsushi Fujii, Tatsunori Mori, Fred Gey, Noriko Kando (and Ellen Voorhees, Sung Hyun Myaeng, Hsin-Hsi Chen, Tetsuya Sakai)


Page 6

NTCIR-8 Coordination

• NTCIR-8 is coordinated by the NTCIR Project at NII, Japan.

The following organizations contribute to the organization of NTCIR-8 as Task Organizers:

- Academia Sinica
- Carnegie Mellon Univ
- Chinese Academy of Sciences
- Hiroshima City University
- Hitachi, Ltd.
- Hokkai Gakuen University
- IBM
- Microsoft Research Asia
- National Institute of Information and Communication Technology
- National Institute of Informatics
- National Taiwan Univ
- National Taiwan Ocean Univ
- Oki Electric Industry Co.
- Tokyo Institute of Technology
- Tokyo Univ
- Toyohashi Univ of Technology and Science
- Univ of California Berkeley
- Univ of Tsukuba
- Yamanashi Eiwa College
- Yokohama National University

Page 7

NTCIR-8 Active Participants

[CCLQA]
Carnegie Mellon Univ
Dalian Univ of Technology
National Taiwan Ocean Univ
Shenyang Institute of Aeronautical Engineering
Univ of Tokushima
Wuhan Univ

[IR4QA]
Carnegie Mellon Univ
Chaoyang Univ of Technology
Dalian Univ of Technology
Dublin City Univ
Inner Mongolia Univ
Queensland Univ of Technology
Shenyang Inst of Aeronautical Engineering
Trinity College Dublin
Univ California, Berkeley
Wuhan Univ
Wuhan Univ (Computer School)
Wuhan Univ of Science and Technology

[GeoTime]
Dublin City Univ
Hokkaido Univ
INESC-ID, Portugal
International Inst of Technology, Hyderabad
Keio Univ
National Inst of Materials Science
Osaka Kyoiku Univ
Univ California, Berkeley
Univ of Iowa
Univ of Lisbon
Yokohama City Univ

[MOAT]
Beijing Univ of Posts and Telecommunications
Chaoyang Univ of Technology
Chinese Univ of HK + Tsinghua Univ
City Univ of Hong Kong (2 groups)
Hong Kong Polytechnic Univ
KAIST
National Taiwan Univ
NEC Laboratories China
Peking Univ
Pohang Univ of Sci and Tech
SICS
Toyohashi Univ of Technology
Univ of Alicante
Univ of Neuchatel
Yuan Ze Univ

[Patent Mining]
Hiroshima City Univ
Hitachi, Ltd.
IBM Japan, Ltd.
Institute of Scientific and Technical Information of China
KAIST
National Univ of Singapore
NEC
Shanghai Jiao Tong Univ
Shenyang Institute of Aeronautical Engineering
Toyohashi Univ of Technology
Univ of Applied Sciences - UNIGE

[Patent Translation]
Dublin City University, CNGL
Hiroshima City University
Kyoto University
NiCT
Pohang Univ of Sci and Tech
Tottori University
Toyohashi University of Technology
Yamanashi Eiwa College

[Community QA]
Microsoft Research Asia
National Institute of Informatics
Shirayuri College

Page 8

Complex Cross-lingual Question Answering (CCLQA) Task

• IR and QA communities can collaborate
• Different teams can exchange and create a "dream-team" QA system
• Small teams that do not possess an entire QA system can contribute

http://aclia.lti.cs.cmu.edu/ntcir8/

Page 9

Evaluation Topics - any types of questions -

Type | # | Example Question | Related Past NTCIR Task
DEFINITION | 10 | What is the Human Genome Project? | ACLIA
BIOGRAPHY | 10 | Who is Howard Dean? | ACLIA
RELATIONSHIP | 20 | What is the relationship between Saddam Hussein and Jacques Chirac? | ACLIA
EVENT | 20 | What are the major conflicts between India and China on border issues? | ACLIA
WHY | 20 | Why doesn't U.S. ratify the Kyoto Protocol? | QAC-4
PERSON | 5 | Who is the Finland's first woman president? | QAC 1-3, CLQA 1,2
ORGANIZATION | 5 | What is the name of the company that produced the first Fairtrade coffee? | QAC 1-3, CLQA 1,2
LOCATION | 5 | What is the name of the river that separates North Korea from China? | QAC 1-3, CLQA 1,2
DATE | 5 | When did Queen Victoria die? | QAC 1-3, CLQA 1,2

Page 10

CT/JA-T IR4QA run rankings

[Figure: run rankings by Mean Q and by Mean nDCG]

Page 11

CCLQA Human Evaluation Preliminary Results: JA-JA

JA-JA Runs | ALL
LTI-JA-JA-01-T | 0.1069
LTI-JA-JA-02-T | 0.1443
LTI-JA-JA-03-T | 0.1438

JA-JA automatic evaluation

JA-JA Runs | ALL
LTI-JA-JA-01-T | 0.2024
LTI-JA-JA-02-T | 0.2259
LTI-JA-JA-03-T | 0.2252

Page 12

Effect of Combination: IR4QA+CCLQA
JA-JA Collaboration Track: F3 score based on automatic evaluation

IR4QA run | CCLQA: LTI
BRKLY-JA-JA-01-DN | 0.2934
BRKLY-JA-JA-02-T | 0.2686
BRKLY-JA-JA-03-DN | 0.2074
BRKLY-JA-JA-04-DN | 0.3000
BRKLY-JA-JA-05-T | 0.2746
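The slide does not define the F3 score. A minimal sketch, assuming it is the usual F-measure with β = 3 (recall weighted more heavily than precision); the function name `f_beta` is hypothetical, and the task's own nugget-based definitions of precision and recall are not reproduced here.

```python
def f_beta(precision: float, recall: float, beta: float = 3.0) -> float:
    """F-measure: (1 + beta^2) * P * R / (beta^2 * P + R).

    beta = 3 is an assumption about what "F3" means on the slide.
    """
    denom = beta * beta * precision + recall
    if denom == 0.0:
        # No precision and no recall: define the score as 0.
        return 0.0
    return (1.0 + beta * beta) * precision * recall / denom


# With beta = 3, a recall-heavy system scores higher than a
# precision-heavy one with the same two values swapped.
recall_heavy = f_beta(precision=0.2, recall=0.8)
precision_heavy = f_beta(precision=0.8, recall=0.2)
```

With these toy inputs, `recall_heavy` is larger than `precision_heavy`, which is the point of choosing β > 1.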

Page 13

NTCIR-GEOTIME: GEOTEMPORAL INFORMATION RETRIEVAL
(New Track in NTCIR Workshop 8)

Fredric Gey, Ray Larson and Noriko Kando
Relevance judgments system by Jorge Machado and Hideki Shima
Evaluation: Tetsuya Sakai
Judgments: U Iowa, U Lisbon, U California Berkeley, NII

Search with a specific focus on geography. To distinguish from past GIR evaluations, we introduced a temporal component.

Asian language geographic search has not previously been evaluated, even though about 50 percent of the NTCIR-6 Cross-Language topics had a geographic component (usually a restriction to a particular country).

Page 14

NTCIR-GeoTime PARTICIPANTS

• Japanese runs submitted by eight groups (two anonymous pooling supporters)

Team Name | Organization
Anonymous | Anonymous
BRKLY | University of California, Berkeley, USA
FORST | Yokohama National University, JAPAN
HU-KB | Hokkaido University, JAPAN
KOLIS | Keio University, JAPAN
ANON2 | Anonymous submission, group 2
M | National Institute of Materials Science, JAPAN
OKSAT | Osaka Kyoiku University, JAPAN

Page 15

NTCIR-GeoTime PARTICIPANTS

• English runs submitted by six groups

Team Name | Organization
BRKLY | University of California, Berkeley, USA
DCU | Dublin City University, IRELAND
IITH | International Institute of Technology, Hyderabad, INDIA
INESC | National Institute of Electroniques and Computer Systems, Lisbon, PORTUGAL
UIOWA | University of Iowa, USA
XLDB | University of Lisbon, PORTUGAL

Page 16

NTCIR-GeoTime Approaches

• BRKLY: baseline approach, probabilistic + pseudo-relevance feedback
• DCU, IITH, XLDB (U Lisbon): geographic enhancements
• KOLIS (Keio U): count the number of geographic and temporal expressions in the top-ranked docs of the initial search, then re-rank
• FORST (Yokohama Nat U): apply factoid QA techniques to question decomposition
• HU-KB (Hokkaido U), U Iowa: hybrid approach combining a probabilistic model and weighted boolean query formulation

Page 17

NTCIR-GeoTime: ENGLISH TOPIC DIFFICULTY by Average Precision

Most Difficult English topic (21): When and where were the 2010 Winter Olympics host city location announced?

[Figure: per-topic AP, Q and nDCG (scale 0 to 1) averaged over 25 English runs for 25 topics, sorted by topic difficulty (AP ascending). Topic order: 21 18 22 15 10 12 20 5 11 25 17 4 24 2 14 9 16 23 13 6 8 19 3 7 1]
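The chart ranks topics by AP averaged over runs. A minimal sketch of that computation, not the official evaluation tool: all function names are hypothetical, AP uses binary relevance, and the nDCG discount (log2 of rank + 1) and the gain values are assumptions, since the slides do not specify them. Q-measure is left out.

```python
import math


def average_precision(rels, num_relevant):
    """AP for one run on one topic; rels holds 0/1 flags in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant rank
    return total / num_relevant if num_relevant else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, all_gains):
    """gains: gains of the ranked docs; all_gains: gains of every judged doc."""
    ideal = dcg(sorted(all_gains, reverse=True)[:len(gains)])
    return dcg(gains) / ideal if ideal else 0.0


def topics_by_difficulty(ap_by_topic):
    """ap_by_topic: {topic_id: [AP of each run]}. Hardest topic (lowest mean AP) first."""
    mean_ap = {t: sum(v) / len(v) for t, v in ap_by_topic.items()}
    return sorted(mean_ap, key=mean_ap.get)


# Toy values for illustration only, not taken from the GeoTime results.
order = topics_by_difficulty({"A": [0.9, 0.7], "B": [0.1, 0.0], "C": [0.4, 0.5]})
```

Here `order` lists the toy topics from hardest to easiest, which mirrors the left-to-right order of the chart.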

Page 18

NTCIR-GeoTime: JAPANESE TOPIC DIFFICULTY by Average Precision

Most Difficult Japanese topic (18): What date was a country invaded by the United States in 2002?

[Figure: per-topic AP, Q and nDCG (scale 0 to 0.9) averaged over 34 Japanese runs for 24 topics, sorted by topic difficulty (AP ascending). Topic order: 18 14 25 22 20 12 15 23 13 11 24 4 2 8 19 10 16 5 21 6 1 9 3 7]

Page 19

NTCIR-GeoTime: TOPIC per TOPIC Analysis on Japanese

[Figure: nDCG (0.00 to 1.00) by topic ID 1-25 (#17 is missing) for runs HU-KB-JA-JA-03-D, KOLIS-JA-JA-04-D, FORST-JA-JA-04-D and BRKLY-JA-JA-02-D]

Page 20

NTCIR-GeoTime CHALLENGES

• Geographic reference resolution is difficult enough, but
• More difficult to process temporal expression ("last Wednesday") references
• Can indefinite answers be accepted ("a few hours")?
• Need Japanese Gazetteers
• Need NE annotated corpus for further refinement

Page 21

Multilingual Opinion Analysis (MOAT)

CO2 Reduction? Lehman shock?

Page 22

Opinion Question List

ID | Opinion Question
N01 | What negative prospects were discussed about the Euro when it was introduced in January of 2002?
(N03) | (What reasons were discussed about Bomb Terror in Bali Island in October, 2002?)
N04 | What reasons have been given for the Space Shuttle Columbia accident in February, 2002?
N05 | What negative comments were discussed about Bush's decision to start Iraq war in March, 2003?
N06 | What negative prospects and opinions were discussed about SARS which started spreading in March, 2003?
N07 | What reasons are given for the blackout around North America in August, 2003?
N08 | What reasons and background information was discussed about the terrorist train bombing that happened in Madrid in March, 2004?
N11 | Why did supporters want to elect George W. Bush in the November 2004 American Presidential Election?
N13 | What positive comments were discussed to help the victims from earthquake and tsunami in Sumatera, Indonesia in December, 2004?
N14 | What objections are given for the US opposition to the Kyoto Protocol that was enacted in February 2005?
N16 | What reasons have been given for the anti-Japanese demonstrations that took place in April, 2005 in Peking and Shanghai in China?
N17 | In July 2005 there were terrorist bombings in London. What reasons and background were given, and what controversies were discussed?
N18 | What actions by President George Bush were criticized in response to Hurricane Katrina's August 2005 landing?
N20 | What negative opinions and discussion happened about the Bird Flu that started spreading in October, 2005?
N24 | Identify opinions that indicate that Arnold Schwarzenegger is a bad choice to be elected the new governor of California in the October 2003 election.
N26 | Find positive opinions about the reaction of Nuclear and Industrial Safety Agency officials to the Mihama nuclear power plant accident in August 2004.
N27 | What were the advantages and disadvantages of the direct flight between Taiwan and Mainland China commercially?
N32 | What are good and bad approaches to losing weight?
N36 | What are complaints about XIX Olympic Winter Games that were held in and around Salt Lake City, Utah, United States in 2002?
N39 | What are the comments about China's first manned space flight which happened successfully in October 2003?
N41 | What negative comments were discussed when in April 2004 CBS made public pictures showing cruel U.S. military abuse of Iraqi prisoners of war?

Page 23

Effective approaches

• The following teams attained good results with accomplished feature filtering, hot machine learning, and rich lexicon resources.

TeamID | Lang | Feature Filtering | Machine Learning | Lexicon Resource
UNINE | EN | Z score | logistic regression | SentiWordNet
PKUTM | SC | Iterative classifier | SVM (better than NB, ME, DT) | In House, NTU, and Jun LI's lexicon
CityUHK | TC | Supervised Lexicon Ensemble | NTUSD, LCPW, LCNW, CPWP, SKPI

Page 24

Community QA Pilot Task

Training: Yahoo Chiebukuro data version 1.0, 3 million questions
Test: test collection for CQA, 1500 questions selected at random

Best Answer: selected by the questioner
Good Answers: 4 assessors (4 university students) individually assessed

Task: rank all posted answers by answer quality (as estimated by system) for every question.

http://research.nii.ac.jp/ntcir/ntcir-ws8/yahoo/index-en.html

Page 25

• GA-Hit@1 not useful
• GA-nDCG and Q similar
• 5th run from MSRA used the BA info for test Qs directly, so does not represent practical performance

Page 26

LOVE is HARD according to BA!
Systems can't find the asker's BAs!

Page 27

LOVE is EASY according to GA!
Systems can find many good answers that are not BA!
(BA-evaluation not good enough for social questions)

[Figure: #questions and GA-nG@1]

Page 28

Goal: Automatic Creation of technical trend maps from a set of research papers and patents.

Research papers and patents are classified in terms of elemental technologies and their effects.

Page 29

Evaluation

Subtask 1 (Research Paper Classification)
Metrics: Mean Average Precision (MAP)
• k-NN based approach is superior to machine learning approach.
• Re-ranking of IPC codes is effective.

Subtask 2: Technical Trend Map Creation
Metrics: Recall, Precision, and F-measure
• Top systems employed CRF, and the following features are effective.
- Dependency structure
- Document structure
- Domain adaptation

Page 30

History of Patent IR at NTCIR

• NTCIR-3 (2001-2002)
- Technology survey
- Applied conventional IR problems to patent data
- 2 years of JPO patent applications
• NTCIR-4 (2003-2004)
- Invalidity search
- Addressed patent-specific IR problems
- 5 years of JPO patent applications
• NTCIR-5 (2004-2005)
- Enlarged invalidity search
- 10 years of JPO patent applications
• NTCIR-6 (2006-2007)
- Added English patents
- 10 years of USPTO patents granted

Both document sets were published in 1993-2002

* JPO = Japan Patent Office
* USPTO = US Patent & Trademark Office

Page 31

Extrinsic E-J: BLEU & IR measures (MAP & Recall@N)

[Figure: MAP and Recall@N (0.1 to 0.9) plotted against BLEU (12.00 to 26.00). Legend: Recall@1000 (R = 0.77), Recall@500 (R = 0.84), Recall@200 (R = 0.86), Recall@100 (R = 0.86), MAP (R = 0.77)]

Recall@100,200 are highly correlated with BLEU (R = 0.86)
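The slide reports R values between BLEU and the IR measures without naming the statistic. A minimal sketch, assuming R is Pearson's correlation coefficient computed over systems; `pearson_r` is a hypothetical helper and the numbers below are toy values, not the slide's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Toy per-system scores for illustration only: one BLEU value and one
# Recall@100 value per (hypothetical) MT system.
toy_bleu = [14.0, 18.0, 21.0, 25.0]
toy_recall_at_100 = [0.35, 0.50, 0.48, 0.66]
r = pearson_r(toy_bleu, toy_recall_at_100)
```

Each MT system contributes one (BLEU, IR score) pair, so a high `r` means that systems with better BLEU also tend to give better cross-lingual retrieval.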

Page 32

Overall structure

Page 33

New NTCIR Structure

NTCIR general chairs: Noriko Kando (NII), Eiichiro Sumita (NICT), Tsuneaki Kato (U of Tokyo)
NTCIR evaluation chairs: Hideo Joho (U of Tsukuba), Tetsuya Sakai (MSRA)
EVIA chairs: Mark Sanderson (RMIT), William Webber (U of Melbourne) + 1

Grand challenges
Core challenge tasks
Pilot tasks
Task Selection Committee
EVIA Program Committee: refereed papers on Evaluating Information Access

Page 34

Tasks accepted for NTCIR-9

CORE TASKS
• Intent (with One-Click Access subtask)
• Recognizing Natural Language Inference
• Geotemporal information retrieval (GeoTime2)
• IR for Spoken Documents

PILOT TASKS
• Cross-lingual Link Discovery
• Evaluation Framework of Interactive Information Access
• Patent Machine Translation (PATMT)

Page 35

INTENT (w 1CLICK) task (CS, J)

Organisers: Min Zhang, Yiqun Liu (Tsinghua U), Ruihua Song, Tetsuya Sakai, Youngin Song (MSRA), Makoto Kato (Kyoto U)

• Subtopic mining: find different intents given an ambiguous/underspecified query (e.g. "Harry Potter": book, movie)
• Ranking: selectively diversify Web search results
• One click access (1CLICK): satisfy information need with the system's first result page (X-byte text, not document list). No need to click after clicking on SEARCH.

Page 36

Recognizing Natural Language Inference (CS, CT, J)

Organisers: Hideki Shima, Teruko Mitamura (CMU), Chuan-Jie Lin (NTOU), Cheng-Wei Lee (Academia Sinica)

• YES/NO task: Does text 1 entail text 2? (text 1 ⇒ text 2)
• Main task: Given text 1 and text 2, choose from
(1) forward entailment (2) backward entailment (3) equivalent (4) contradict (5) independent
• Extrinsic evaluation via CLIA will be conducted
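One way to read the five main-task labels is as combinations of directional entailment judgements. A minimal sketch under two assumptions that the slide does not state: "equivalent" means entailment in both directions, and a detected contradiction takes priority over everything else. `main_task_label` and its boolean inputs are hypothetical, not part of the task definition.

```python
def main_task_label(forward: bool, backward: bool, contradiction: bool = False) -> str:
    """Map pairwise judgements to one of the five main-task labels.

    forward:       text 1 entails text 2
    backward:      text 2 entails text 1
    contradiction: the two texts contradict each other (assumed to win)
    """
    if contradiction:
        return "contradict"
    if forward and backward:
        return "equivalent"  # assumption: bidirectional entailment
    if forward:
        return "forward entailment"
    if backward:
        return "backward entailment"
    return "independent"


def yes_no_label(forward: bool) -> str:
    """YES/NO task: only the forward direction (text 1 => text 2) matters."""
    return "YES" if forward else "NO"
```

Under this reading the YES/NO task is the main task restricted to the forward direction.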

Page 37

GeoTime2 (E, J, K?)

Organisers: Fred Gey, Ray Larson (UCB), Noriko Kando (NII)

• Second round of ad hoc IR for WHEN and WHERE
• GeoTime1 topic: How old was Max Schmeling when he died, and where did he die?

At GeoTime1, docs that contain both WHEN and WHERE were treated as relevant; those that contain only WHEN or only WHERE info were treated as partially relevant.

Can we do better?
Can we create more realistic GeoTime topics?
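The GeoTime1 relevance rule above can be written as a small function. A sketch only: `geotime_relevance` is a hypothetical name, the two flags stand for an assessor's judgement of whether a document gives the WHEN and the WHERE answer, and the "not relevant" grade for documents with neither is an assumption the slide does not state.

```python
def geotime_relevance(has_when: bool, has_where: bool) -> str:
    """Grade one document for a WHEN-and-WHERE topic (GeoTime1 rule)."""
    if has_when and has_where:
        return "relevant"
    if has_when or has_where:
        return "partially relevant"
    return "not relevant"  # assumption: neither part answered


# Example: a document that gives the place but not the date.
grade = geotime_relevance(has_when=False, has_where=True)
```

The two relevant grades can then feed graded measures such as nDCG, with gain values chosen by the evaluator.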

Page 38

IR for Spoken Documents (J)

Organisers: Tomoyosi Akiba, Seiichi Nakagawa, Kiyoaki Aikawa (Toyohashi U of Technology), Yoshiaki Itoh (Iwate Prefectural U), Tatsuya Kawahara (Kyoto U), Xinhui Hu (NICT), Hiroaki Nanjo (Ryukoku University), Hiromitsu Nishizaki (U of Yamanashi), Tomoko Matsui (Institute of Statistical Mathematics), Yoichi Yamashita (Ritsumeikan U)

• Handling spontaneous speech data (spoken lectures)
• Spoken term detection: find the position of a given query term within SD
• Spoken document retrieval: find passages in SD data that match a given query

Reference speech recognition results provided (non-speech people can easily participate)

Page 39

Cross-lingual Link Discovery (E->C,K,J)

Organisers: Ling-Xiang Tang, Shlomo Geva, Yue Xu, Darren Huang (Queensland U of Technology), Andrew Trotman (U of Otago)

• Given an English Wikipedia page,
(1) Identify anchors; and
(2) For each anchor, provide a list of C/K/J documents to be linked

[Figure: (1) anchors in the English page: Solar eclipse, sun, Lunar eclipse; (2) corresponding Wikipedia entries: 日食 (solar eclipse), 太陽 (sun), 月食 (lunar eclipse)]

Linking INEX and NTCIR!

Page 40

Evaluation Framework of Interactive Information Access (J)

Organisers: Tsuneaki Kato (U Tokyo) and Mitsunori Matsushita (Kansai U)

• Explore evaluation of interactive and exploratory information access

• Shared modules, shared interfaces between modules

• Interactive complex question answering subtask (visual/textual output)

• Trend information summarization subtask (e.g. cabinet support rate change)

Page 41

PATMT (C-E, J-E, E-J)

Organisers: Benjamin Tsou, Bin Lu (CUHK), Isao Goto (NICT)

• C-E MT (new at NTCIR-9), J-E MT, E-J MT
Manual and automatic evaluation (BLEU, NIST)
• PATMT@NTCIR-10 will include extrinsic evaluation by CLIR
• Lessons from PATMT@NTCIR-7,8:

[Diagram: BLEU, MAP and human rating, with links labelled High, Low, and High with multiple reference translations (RTs)]

Page 42

TALK OUTLINE

• About myself
• Looking Back
- NTCIR
- NTCIR-5,6 CLIR (as a participant)
- NTCIR-7,8 ACLIA
- NTCIR-8 GeoTime
• Looking Forward
- NTCIR Grand Challenges
- NTCIR-9 Tasks
• Summary

Page 43

TALK SUMMARY

• Looking back (1999-2010)
- More teams are now using straight MT for CLIR. Often over 90% of monolingual performance.
- What are CLIR researchers doing now? CLIA?
• Looking forward (2010-)
- Grand challenges [tentative] = NN2S + E2E = No Need To Search + Easy To Explore.
- Multilinguality will still be important, especially in the system2user direction.

Please participate in NTCIR-9 tasks!

Page 44

NTCIR-9 Important Dates (tentative)

1st Oct 2010: first call for participation
Winter-Autumn 2011: tasks run
November 2011: camera-ready papers due
December 2011: NTCIR-9@NII, Tokyo

Watch http://research.nii.ac.jp/ntcir/

Page 45

Thanks Merci Danke schön Grazie Gracias Ta! Tack Köszönöm Kiitos Terima Kasih Khap Khun Ahsante Tak 謝謝 ありがとう

http://research.nii.ac.jp/ntcir/
Will be moved to: http://ntcir.nii.ac.jp/