Intro to search engines with Lucene and ElasticSearch

A talk given at the JJTV Tools Night #3 on September 5, 2012: http://www.meetup.com/jjtv-il/events/77834332/

Introduction to Search Engines with Lucene and ElasticSearch

Tomer Gabel, newBrandAnalytics

Overture

A search engine is…

a document store

Step 1: Tokenization
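As a rough illustration of this step, the sketch below splits raw text into tokens with a regular expression. The `tokenize` name and the token pattern are assumptions made for this example; Lucene's own analyzers are considerably more sophisticated.

```python
import re

# Hypothetical helper: a token is any run of ASCII letters, digits or apostrophes.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+")

def tokenize(text):
    """Split raw text into a list of word tokens, preserving their order."""
    return TOKEN_PATTERN.findall(text)

print(tokenize("It's demo time: Lucene, meet ElasticSearch!"))
```

Punctuation and whitespace are dropped here; only the ordered list of tokens is passed on to the next step.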

Step 2: Filtering
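A minimal sketch of what filtering might look like, assuming two common filters: lowercasing and stop-word removal. The `filter_tokens` name and the tiny stop-word list are illustrative assumptions, not what any particular Lucene analyzer uses.

```python
# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = frozenset({"a", "an", "and", "is", "of", "the", "to", "with"})

def filter_tokens(tokens, stop_words=STOP_WORDS):
    """Lowercase every token, then drop those that are stop words."""
    lowered = (token.lower() for token in tokens)
    return [token for token in lowered if token not in stop_words]

print(filter_tokens(["Introduction", "to", "Search", "Engines", "with", "Lucene"]))
```

Other filters (stemming, synonym expansion and so on) would slot into the same place in the pipeline.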

Step 3: Build reverse index
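The reverse index (commonly called an inverted index) maps each term to the documents that contain it. The sketch below is one possible in-memory layout, assuming documents arrive as already tokenized and filtered lists; `build_index` and the `term -> {doc_id: [positions]}` shape are choices made for this example, not Lucene's on-disk format.

```python
from collections import defaultdict

def build_index(docs):
    """Build a reverse index from {doc_id: [token, ...]}.

    Returns {term: {doc_id: [position, ...]}}, so each term records
    which documents it occurs in and where.
    """
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for position, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)

docs = {
    1: ["search", "engine"],
    2: ["search", "index", "search"],
}
index = build_index(docs)
print(index["search"])
```

Keeping positions alongside document IDs costs extra space, but it is what makes position-sensitive queries possible later on.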

• Term query
  – Maps a term to its documents
  – Scoring is based on:
    • Number of hits per document (TF)
    • How “strong” a match is (IDF)
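To make the TF and IDF ideas concrete, here is a hedged sketch of a term query over a small hand-built index of the form `term -> {doc_id: [positions]}`. The `term_query` name and the formula `tf * (log(N / df) + 1)` are textbook simplifications assumed for this example; Lucene's real scoring formula is more involved.

```python
import math

def term_query(index, term, num_docs):
    """Return [(doc_id, score)] for one term, best match first.

    TF  = number of hits for the term in the document.
    IDF = log(num_docs / docs_containing_term) + 1, so rarer terms weigh more.
    """
    postings = index.get(term, {})
    if not postings:
        return []
    idf = math.log(num_docs / len(postings)) + 1.0
    scores = {doc_id: len(positions) * idf for doc_id, positions in postings.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hand-built index for illustration only.
index = {
    "search": {1: [0], 2: [0, 2]},
    "engine": {1: [1]},
}
print(term_query(index, "search", num_docs=2))
print(term_query(index, "engine", num_docs=2))
```

Document 2 ranks first for "search" because the term occurs there twice, while "engine" gets a higher per-hit weight because only one of the two documents contains it.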

• Boolean query
  – Multiple clauses
  – Each match can:
    • Include document (MUST)
    • Affect score (SHOULD)
    • Exclude document (MUST_NOT)
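The three clause types can be sketched with plain set operations over an index of the form `term -> {doc_id: [positions]}`. Everything here is an assumption made for illustration: the `boolean_query` name, the rule that SHOULD clauses select documents when there are no MUST clauses, and the score (a simple count of matching MUST and SHOULD clauses).

```python
def boolean_query(index, must=(), should=(), must_not=()):
    """Return [(doc_id, score)] for a boolean query, best match first."""
    def docs_for(term):
        return set(index.get(term, {}))

    if must:
        # MUST: a document has to match every required term.
        candidates = set.intersection(*(docs_for(term) for term in must))
    else:
        # Simplification: with no MUST clause, any SHOULD match qualifies.
        candidates = set().union(*(docs_for(term) for term in should))

    # MUST_NOT: drop every document matching an excluded term.
    for term in must_not:
        candidates -= docs_for(term)

    # SHOULD only affects the score: one point per matching clause.
    scores = {
        doc_id: sum(1 for term in (*must, *should) if doc_id in docs_for(term))
        for doc_id in candidates
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hand-built index for illustration only.
index = {
    "lucene": {1: [0], 2: [0], 3: [0]},
    "scala": {1: [1]},
    "java": {3: [1]},
}
print(boolean_query(index, must=["lucene"], should=["scala"], must_not=["java"]))
```

In this example document 3 is excluded by the MUST_NOT clause, and document 1 outranks document 2 because it also matches the SHOULD clause.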

• Phrase query
  – All terms must appear near each other
  – Slop is the maximum token “edit distance”
  – Closer match = higher score
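A phrase query needs token positions, so the sketch below works on an index of the form `term -> {doc_id: [positions]}`. It is a deliberately simplified approximation, not Lucene's sloppy-phrase algorithm: for each occurrence of the first term it sums how far every later term sits from its expected position, accepts the document if the smallest such total is within `slop`, and scores it as `1 / (1 + distance)`. The `phrase_query` name and the scoring formula are assumptions made for this example.

```python
def phrase_query(index, terms, slop=0):
    """Return [(doc_id, score)] for documents where the terms appear close together."""
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Only documents containing every term can match.
    common = set(postings[0]).intersection(*postings[1:])

    results = []
    for doc_id in common:
        best = None
        for start in postings[0][doc_id]:
            # Total displacement of each later term from its expected position.
            distance = sum(
                min(abs(pos - (start + offset)) for pos in postings[offset][doc_id])
                for offset in range(1, len(terms))
            )
            if best is None or distance < best:
                best = distance
        if best <= slop:
            results.append((doc_id, 1.0 / (1 + best)))  # closer match = higher score
    return sorted(results, key=lambda item: (-item[1], item[0]))

# Hand-built index for illustration only.
index = {
    "search": {1: [0], 2: [0]},
    "engine": {1: [1], 2: [3]},
}
print(phrase_query(index, ["search", "engine"], slop=0))
print(phrase_query(index, ["search", "engine"], slop=2))
```

With `slop=0` only the exact phrase in document 1 matches; raising the slop to 2 also admits document 2, but with a lower score.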

It’s demo time.

Recitativo

It’s demo time.

Crescendo

Content:
• Apache Lucene: http://lucene.apache.org/
• ElasticSearch: http://www.elasticsearch.org/
• Code samples: https://github.com/holograph/examples/tree/master/lucene-demo

Thank you for your time! tomer@tomergabel.com http://www.tomergabel.com

Afterword