ElasticSearch/Elastica Nicolas Badey

ElasticSearch & Elastica in Symfony2 - SfLive 2015



About me

Yesterday CTO of Yoopies

Tomorrow CTO of Expertissim

SfLive is magic!

What is it?

● “Distributed, RESTful, Search Engine built on top of Apache Lucene”

● Easy to install : aptitude install elasticsearch

● Easy to use, you will love JSON

● Denormalizing your data

Features

- Scoring: calculate relevance, boost, score scripting

- Analyzers: a Tokenizer with TokenFilters and CharFilters

- GeoLocation

- Facets => Aggregations

- Highlighting

- Scripting

- Percolator: prospective search

- 3 layers of cache

- Plugins (attachment type, River …)

- Suggester: autocompletion and more

Why ElasticSearch

● For our search engine: we had reached the performance and functional limits of SQL

● An easy solution for a first approach to search engines

● Denormalize our data for search

● Used in: search forms, cron jobs, SEO pages, business metrics...

Elastica / ElasticaBundle

● Automatic persistence provider: Doctrine/Propel/MongoDB

● Pagination: PagerFanta/KNPpaginator

● Persistence listener callback (Doctrine only)

● Populate

In the end we no longer use it; we only keep it for index config and services

Index, Type, Finder, Client

Search

curl -XGET http://localhost:9200/[INDEX]/[TYPE]/_search -d '{
  "query": {
    "query_string": {
      "query": "foobar"
    }
  },
  "filter": {
    "numeric_range": {
      "price": {
        "lte": 42
      }
    }
  },
  "sort": {
    "created_at": {
      "order": "desc"
    }
  }
}'

Query:

- Relevance

- Scoring

Filter:

- Discriminate

- Cached

- Fast
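The request body from the curl example can also be built programmatically. The following is a minimal sketch in Python, used here only for illustration (the talk itself relies on PHP and Elastica); `build_search_body` is a hypothetical helper name, and the field names are the ones from the curl example.

```python
import json


def build_search_body(text, max_price):
    """Build the same request body as the curl example (hypothetical helper)."""
    return {
        # Query part: scored, drives relevance.
        "query": {"query_string": {"query": text}},
        # Filter part: a yes/no test, cacheable, does not affect the score.
        "filter": {"numeric_range": {"price": {"lte": max_price}}},
        # Newest documents first.
        "sort": {"created_at": {"order": "desc"}},
    }


body = build_search_body("foobar", 42)
print(json.dumps(body, indent=2))
```

Keeping the scored part under "query" and the yes/no constraint under "filter" mirrors the split between the two lists above.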

Search

ETL

● Extract all ads from SQL, Transform them, then Load them into ElasticSearch

● Don’t use “Populate” for large projects

● Still in PHP and Symfony2 so that we can use our Model layer (or not...)

● DoctrineListener as AMQP publisher for live indexing

● Needs to be fast: PDO & cURL: 10 types, 500,000 ads, 5 min

● Next: decoupling from Symfony with the Console component
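The Transform and Load steps can be sketched as follows. This is a minimal illustration in Python rather than the PHP used in the talk, and it assumes the load goes through Elasticsearch's `_bulk` endpoint, which the slides do not state. The helper `to_bulk_payload`, the sample rows, and the `ads` / `sitter` index and type names are all hypothetical.

```python
import json


def to_bulk_payload(rows, index, doc_type):
    """Turn SQL rows (dicts with an "id" key) into a _bulk request body."""
    lines = []
    for row in rows:
        # Transform: everything except the primary key becomes the document.
        doc = {key: value for key, value in row.items() if key != "id"}
        # Each document is preceded by an action line naming its target.
        action = {"index": {"_index": index, "_type": doc_type, "_id": row["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, standing in for the result of a PDO query.
rows = [
    {"id": 1, "title": "Babysitter in Paris", "price": 12},
    {"id": 2, "title": "Dog sitter in Lyon", "price": 9},
]
payload = to_bulk_payload(rows, "ads", "sitter")
print(payload)
```

Sending many documents per HTTP request, instead of one request per ad, is one common way to reach the kind of throughput quoted above.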

Usage

SitterForm

SitterSearch

SitterQuery extends ElasticaQuery

QueryFactory

ResultSet

PagerFantaElasticaAdapter

SearchManager

A Good FullText Search

● MultiMatch Query : Search text in multiple fields

● Highlighting : Highlight words in documents

● Suggester : Do autocompletion

● Find a compromise between relevance and quantity

Multi Match Query

Subfields for full-text search: my_field.fr and my_field.en

“regular” field “my_field”

Multi Match Query

A boost of 3 on content’s subfields

All of title’s subfields, but not title itself
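The two ideas above, a field with language subfields and a multi match query over subfields with a boost, can be sketched as request bodies. This is a reconstruction in Python under assumptions: the `fr` / `en` subfields are assumed to use the built-in `french` and `english` analyzers, and the field patterns `title.*` and `content.*^3` are inferred from the annotations, not copied from the original slides.

```python
# Mapping sketch (Elasticsearch 1.x style): a "regular" field plus
# one analyzed subfield per language. Analyzer choice is an assumption.
mapping = {
    "my_field": {
        "type": "string",
        "fields": {
            "fr": {"type": "string", "analyzer": "french"},
            "en": {"type": "string", "analyzer": "english"},
        },
    }
}


def build_multi_match(text):
    """Search all title subfields and all content subfields (boosted by 3)."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "title.*" matches title.fr, title.en, ... but not "title".
                # "^3" multiplies the score contribution of content subfields.
                "fields": ["title.*", "content.*^3"],
            }
        }
    }


query = build_multi_match("babysitter")
print(query)
```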

Highlight with MultiMatch
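A minimal sketch of attaching highlighting to a multi match search body, written in Python for illustration. The helper name `with_highlight`, the field patterns, and the explicit `<em>` tags are assumptions, not taken from the slides.

```python
def with_highlight(search_body, fields):
    """Return a copy of search_body with a highlight section added."""
    body = dict(search_body)  # shallow copy, the input is left untouched
    body["highlight"] = {
        # Tags wrapped around each matched term in the returned fragments.
        "pre_tags": ["<em>"],
        "post_tags": ["</em>"],
        # One entry per field (or field pattern) to highlight.
        "fields": {name: {} for name in fields},
    }
    return body


search = {
    "query": {
        "multi_match": {"query": "babysitter", "fields": ["title.*", "content.*"]}
    }
}
highlighted = with_highlight(search, ["title.*", "content.*"])
print(highlighted)
```

Highlighting the same fields that the query searches keeps the returned fragments consistent with what actually matched.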

Suggester

Percolator

● Index the user’s search query in a “percolator index”

● When an ad is registered, send it to both the regular index and the percolator

● The names of the matching percolator queries are returned

● You can then alert the user that an ad matching their alert has just been registered
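The percolator flow above is a search in reverse: the queries are stored, and each new document is matched against them. The following toy model in plain Python illustrates only that idea; it is not the Elasticsearch API, and `ToyPercolator`, the alert names, and the sample ad are all hypothetical.

```python
class ToyPercolator:
    """In-memory stand-in for a percolator index (illustration only)."""

    def __init__(self):
        self._queries = {}

    def register(self, name, predicate):
        # Store a user's saved search under a name.
        self._queries[name] = predicate

    def percolate(self, doc):
        # Return the names of every stored query that matches the document.
        return sorted(name for name, matches in self._queries.items() if matches(doc))


percolator = ToyPercolator()
percolator.register(
    "alice-paris-under-15",
    lambda ad: ad["city"] == "Paris" and ad["price"] <= 15,
)
percolator.register("bob-lyon", lambda ad: ad["city"] == "Lyon")

new_ad = {"city": "Paris", "price": 12}
matches = percolator.percolate(new_ad)
print(matches)  # ['alice-paris-under-15']
```

Each returned name identifies an alert whose owner can be notified about the new ad.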

Aggregator

Score Scripting

In /etc/elasticsearch/scripts/grade.groovy:

doc['average_grade'].value > 3.5 ? _score * doc['average_grade'].value : _score

In /etc/elasticsearch/scripts/login.groovy:

doc['lastLogin'].value < minLastLogin ? _score * 0.5 : _score
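To make the effect of the two scripts concrete, here is the same logic mirrored in plain Python. It is an illustration only: the function names are hypothetical, and the timestamps in the usage lines are arbitrary integers.

```python
def grade_score(score, average_grade):
    """Mirror of grade.groovy: multiply the score by the grade above 3.5."""
    return score * average_grade if average_grade > 3.5 else score


def login_score(score, last_login, min_last_login):
    """Mirror of login.groovy: halve the score for users inactive too long."""
    return score * 0.5 if last_login < min_last_login else score


print(grade_score(2.0, 4.0))       # 8.0 (well rated, boosted)
print(grade_score(2.0, 3.0))       # 2.0 (unchanged)
print(login_score(2.0, 100, 200))  # 1.0 (last login too old, penalized)
print(login_score(2.0, 300, 200))  # 2.0 (unchanged)
```

In both cases the script only rescales `_score`, so text relevance still drives the ordering among documents that get the same multiplier.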

Errors: Easy To Understand :)

● Most of the time they are due to strong typing (a string instead of an int)

● Be careful about the disk space left when indexing