The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and African-Americans are not LOCATIONS: An Analysis of the Performance of Named-Entity Recognition
Robert Krovetz (Lexicalresearch.com), Paul Deane, Nitin Madnani (ETS)
A Review by Richard Littauer (UdS)

Page 1: Named Entity Recognition - ACL 2011 Presentation

The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and African-Americans are not LOCATIONS: An Analysis of the Performance of Named-Entity Recognition
Robert Krovetz (Lexicalresearch.com), Paul Deane, Nitin Madnani (ETS)

A Review by Richard Littauer (UdS)

Page 2

The Background
Named-Entity Recognition (NER) is normally judged in the context of Information Extraction (IE).

Page 3

The Background
Named-Entity Recognition (NER) is normally judged in the context of Information Extraction (IE).
Various competitions

Page 4

The Background
Named-Entity Recognition (NER) is normally judged in the context of Information Extraction (IE).
Various competitions. Recently:
◦ non-English languages
◦ improving unsupervised learning methods

Page 5

The Background
“There are no well-established standards for evaluation of NER.”

Page 6

The Background
“There are no well-established standards for evaluation of NER.”
◦ Criteria for NER systems change between competitions
◦ Proprietary software

Page 7

The Background
KDM wanted to identify multi-word expressions (MWEs)…

Page 8

The Background
KDM wanted to identify MWEs…
… but false positives and tagging inconsistencies stopped this.

Page 9

The Background
KDM wanted to identify MWEs…
… but false positives and tagging inconsistencies stopped this.
IE derives Recall and Precision from Information Retrieval.
NER is just a small part of this, so it is rarely evaluated independently.

Page 10

The Background
So, they want to test NER systems, and provide a unit test based on the problems encountered.

Page 11

Evaluation
Compared three NER taggers:
Stanford:
◦ CRF, 100m training corpus
University of Illinois (LBJ):
◦ Regularized average perceptron, Reuters 1996 News Corpus
BBN IdentiFinder (IdentiFinder):
◦ HMMs, commercial

Page 12

Evaluation
Agreement on Classification

Page 13

Evaluation
Agreement on Classification
Ambiguity in Discourse

Page 14

Evaluation
Agreement on Classification
Ambiguity in Discourse
Stanford vs. LBJ on internal ETS 425m corpus
All three on the American National Corpus
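The agreement comparison described here can be sketched as a simple token-level check between two taggers' outputs. This is a minimal illustration with invented token/tag pairs, not the paper's actual scoring procedure:

```python
# Minimal sketch: token-level agreement between two NER taggers.
# The token/tag pairs below are invented for illustration only.
stanford = [("Tim", "PERSON"), ("Berners-Lee", "PERSON"), ("founded", "O"),
            ("the", "O"), ("Web", "O")]
lbj = [("Tim", "PERSON"), ("Berners-Lee", "ORGANIZATION"), ("founded", "O"),
       ("the", "O"), ("Web", "PERSON")]

def agreement(tags_a, tags_b):
    """Fraction of tokens on which the two taggers assign the same label."""
    assert len(tags_a) == len(tags_b), "taggers must share a tokenization"
    same = sum(1 for (_, a), (_, b) in zip(tags_a, tags_b) if a == b)
    return same / len(tags_a)

print(agreement(stanford, lbj))  # 3 of 5 tokens agree -> 0.6
```

Note the precondition: token-level agreement only makes sense when both taggers share a tokenization, which foreshadows the tokenization differences discussed later.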

Page 15

Stanford vs. LBJ
NER reported as 85-95% accurate.

Page 16

Stanford vs. LBJ
NER reported as 85-95% accurate.
Similar counts for both: 1.95m entities for Stanford, 1.8m for LBJ (a 7.6% difference).
However, errors:

Page 17

Stanford vs. LBJ
Agreement:

Page 18

Stanford vs. LBJ
Ambiguity:

Page 19

Stanford vs. LBJ vs. IdentiFinder
Agreement:

Page 20

Stanford vs. LBJ vs. IdentiFinder
Agreement:

Page 21

Stanford vs. LBJ vs. IdentiFinder
Differences:
◦ How they are tokenized
◦ Number of entities recognized overall

Page 22

Stanford vs. LBJ vs. IdentiFinder
Ambiguity:

Page 23

Unit Test
Created two documents that can be used as tests:
◦ Different cases for true positives of PERSON, LOCATION, ORGANIZATION
◦ Entirely upper-case terms that are not NEs (e.g. AAARGH)
◦ Punctuated terms that are not NEs
◦ Terms with initials
◦ Acronyms (some expanded, some not)
◦ Last names in close proximity to first names

Page 24

Unit Test
Created two documents that can be used as tests:
◦ Terms with prepositions (Mass. Inst. of Tech.)
◦ Terms containing a location inside an organization (Amherst College)
Provided freely online.
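The unit-test idea could be wired up as a tiny harness like the following. The cases and the `dummy_tagger` stand-in are hypothetical; the real test documents are the ones KDM provide online, and a real run would call Stanford, LBJ, or IdentiFinder:

```python
# Sketch of a unit-test harness for an NER tagger.
CASES = [
    # (text, surface form, expected label or None for "not an entity")
    ("AAARGH is not a company.", "AAARGH", None),  # all upper case, not an NE
    ("He works at Amherst College.", "Amherst College", "ORGANIZATION"),
]

def dummy_tagger(text):
    """Hypothetical stand-in tagger: returns {surface form: label}."""
    return {"AAARGH": "ORGANIZATION", "Amherst College": "ORGANIZATION"}

def run_unit_test(tagger, cases):
    """Return (surface, expected, got) for every case the tagger fails."""
    failures = []
    for text, surface, expected in cases:
        got = tagger(text).get(surface)
        if got != expected:
            failures.append((surface, expected, got))
    return failures

print(run_unit_test(dummy_tagger, CASES))
# -> [('AAARGH', None, 'ORGANIZATION')]: the dummy tagger wrongly tags AAARGH
```

The point of the harness is that each failure names the phenomenon being probed, rather than folding everything into one aggregate accuracy figure.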

Page 25

One NE Tag per Discourse
It is unusual for multiple occurrences of a token in a document to be different entities.
True for homonyms.
An exception: location + sports team.
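The one-tag-per-discourse observation suggests a simple consistency check, sketched below with invented tags: flag any token that receives more than one label within a single document.

```python
from collections import defaultdict

# Sketch: flag tokens tagged inconsistently within one document.
# The tagged tokens are invented for illustration.
tagged_doc = [
    ("Washington", "LOCATION"),
    ("beat", "O"),
    ("Dallas", "ORGANIZATION"),      # sports-team reading
    ("in", "O"),
    ("Washington", "ORGANIZATION"),  # same token, different label
]

def inconsistent_tokens(tagged):
    """Map each token to its set of labels, keeping only conflicts."""
    labels = defaultdict(set)
    for token, label in tagged:
        if label != "O":
            labels[token].add(label)
    return {tok: labs for tok, labs in labels.items() if len(labs) > 1}

print(inconsistent_tokens(tagged_doc))
# -> {'Washington': {'LOCATION', 'ORGANIZATION'}}
```

A flagged token is either a tagger error or a genuine exception like the location/sports-team case, which is exactly the distinction the slides draw.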

Page 26

One NE Tag per Discourse
Stanford and LBJ have features for non-local dependencies (NLD) to help with this.
KDM: two other uses for NLD:
◦ A source of error in evaluation
◦ A way to identify semantically related entities
These should be treated as exceptions.

Page 27

Discussion
There are guidelines for NER – but we need standards.
The community should focus on PERSON, ORGANIZATION, LOCATION, and MISC:
◦ Harder to deal with than Dates, Times
◦ Disagreement between taggers
◦ MISC is necessary
◦ These have important value elsewhere

Page 28

Discussion
To improve intrinsic evaluation for NER:
1. Create test sets for diverse domains.
2. Use standardized sets for different phenomena.
3. Report accuracy for PERSON, ORGANIZATION, and LOCATION (POL) separately.
4. Establish uncertainty in the tagging system.
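Recommendation 3, reporting accuracy per category, amounts to computing precision and recall separately for each label. A minimal sketch over invented gold and predicted (entity, label) sets:

```python
# Sketch: per-label precision and recall over (entity, label) sets.
# Gold and predicted annotations are invented for illustration.
gold = {("Berners-Lee", "PERSON"), ("ETS", "ORGANIZATION"), ("Amherst", "LOCATION")}
pred = {("Berners-Lee", "ORGANIZATION"), ("ETS", "ORGANIZATION"), ("Amherst", "LOCATION")}

def per_label_scores(gold, pred):
    """Return {label: (precision, recall)} computed per category."""
    scores = {}
    for label in {lab for _, lab in gold | pred}:
        g = {e for e in gold if e[1] == label}
        p = {e for e in pred if e[1] == label}
        tp = len(g & p)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        scores[label] = (precision, recall)
    return scores

print(per_label_scores(gold, pred))
```

In this toy example the single PERSON/ORGANIZATION confusion is invisible to an aggregate entity count (three entities found, three gold) but shows up immediately in the per-label scores.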

Page 29

Conclusion
90% accuracy is not real.
We need to use only entities that are agreed on by multiple taggers.
Even in cases where the taggers disagree (hint: future work).
Unit test downloadable.
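Keeping only entities that multiple taggers agree on, as the conclusion suggests, can be sketched as an intersection over each tagger's (entity, label) set (the outputs below are invented):

```python
# Sketch: keep only entities on which all taggers agree.
# Tagger outputs are invented for illustration.
outputs = [
    {("Berners-Lee", "PERSON"), ("ETS", "ORGANIZATION")},  # tagger 1
    {("Berners-Lee", "PERSON"), ("Web", "PERSON")},        # tagger 2
    {("Berners-Lee", "PERSON"), ("ETS", "ORGANIZATION")},  # tagger 3
]

# Unanimous agreement: intersect all output sets.
agreed = set.intersection(*outputs)
print(agreed)  # -> {('Berners-Lee', 'PERSON')}
```

Requiring unanimity trades recall for precision; a majority-vote variant would be a looser filter, and handling the cases where taggers disagree is, per the slides, future work.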

Page 30

Cheers/PERSON

Richard/ORGANISATION thanks the Mword Class/LOCATION for listening to his talk about Berners-Lee/MISC