Building Search Engines - Lucene, SolR and Elasticsearch

Page 1: Building Search Engines - Lucene, SolR and Elasticsearch

www.anant.us | [email protected] | 202.905.2818 | 1010 Wisconsin Ave, NW | Suite 250 | Washington, DC 20007

Research & Development – Comparing Lucene / SolR / Elastic & Cloud Search Providers

Building Search Engines

Page 2: Building Search Engines - Lucene, SolR and Elasticsearch

What do we do?

Streamline, Organize & Unify Business Information

Page 3: Building Search Engines - Lucene, SolR and Elasticsearch

Agenda

• Challenge - Why does this matter?

• Info Retrieval - Retrieval / Routing

• Lucene - More than meets the eye ...

• Search Engine - 30k Foot View

• On Premise - Lucene / SolR / Elastic

• Cloud Providers - Amazon / Azure

Page 4: Building Search Engines - Lucene, SolR and Elasticsearch

Challenge – Why does this matter?

Knowledge

Project Information

Client Service Information

Corporate Guides

Collaborative Documents

Assets & Files

Corporate Resources

Appleseed Framework (Portal, Base, Search)

G Drive Delta

DropBox

G Drive Delta

Nutshell, Dropbox

Freshbooks

G Drive, G Sites (KB)

G Drive, Workflowy

Evernote

G Drive, DropBox

OwnCloud

Pocket, Leaves

AIC (WP), Anant (WP)

Page 5: Building Search Engines - Lucene, SolR and Elasticsearch

Information Retrieval

Document Retrieval

• Google Search

• Amazon Search

• LinkedIn Search

• CMS Search *

• Portal Search *

• CRM Search *

• Search *

Document Routing

• Google Alerts

• Amazon Recommendations

• Netflix Recommendations

• LinkedIn Recommendations
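
The difference between document retrieval and document routing on this slide can be sketched in a few lines. This is only an illustration under simple assumptions: the whitespace tokenizer, the function names `retrieve` and `route`, and the sample data are all hypothetical and are not taken from the deck. Retrieval runs one ad hoc query against many stored documents; routing runs one incoming document against many stored (standing) queries, which is roughly how alerts work.

```python
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace; a deliberately naive analyzer."""
    return set(text.lower().split())


def retrieve(query: str, documents: Dict[str, str]) -> List[str]:
    """Retrieval: one ad hoc query is matched against many stored documents."""
    terms = tokenize(query)
    return sorted(
        doc_id for doc_id, text in documents.items() if terms <= tokenize(text)
    )


def route(document: str, standing_queries: Dict[str, str]) -> List[str]:
    """Routing: one incoming document is matched against many stored queries."""
    doc_terms = tokenize(document)
    return sorted(
        name for name, query in standing_queries.items()
        if tokenize(query) <= doc_terms
    )


# Hypothetical sample data, for illustration only.
documents = {
    "d1": "Lucene builds an inverted index",
    "d2": "Elasticsearch is great for logs",
}
standing_queries = {
    "alert-lucene": "lucene index",
    "alert-logs": "logs",
}

print(retrieve("inverted index", documents))
print(route("New Lucene index format released", standing_queries))
```

Both functions share the same matching rule (every query term must appear); only the direction of the match changes.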

Page 6: Building Search Engines - Lucene, SolR and Elasticsearch

Lucene – Inverted Index
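
As a rough illustration of the inverted index idea, the sketch below maps each term to the set of documents that contain it, then answers an AND query by intersecting those sets. It is a toy model, not Lucene's actual on-disk format or API; the function names and sample documents are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return ids of documents containing every term in the query."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return []
    # Intersect the posting sets: a document must appear in all of them.
    return sorted(set.intersection(*postings))


# Hypothetical sample documents.
docs = {
    1: "lucene is a search library",
    2: "solr is a search server",
    3: "elasticsearch is a search server too",
}
index = build_inverted_index(docs)

print(search_all(index, "search server"))
print(search_all(index, "lucene"))
```

The lookup cost depends on the number of query terms and the size of their posting sets, not on scanning every document, which is the point of inverting the index.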

Page 7: Building Search Engines - Lucene, SolR and Elasticsearch

Lucene – More than meets the eye

WhoNext?

Think of it as a “NoSQL” database that has great indexing, everywhere.
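
One way to picture the “NoSQL database with indexing everywhere” analogy is a schemaless document store that indexes every field of every document as it is added. The sketch below is an assumption-laden toy: `TinyDocumentStore`, its methods, and the sample records are hypothetical and do not reflect Lucene's real classes.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set, Tuple


class TinyDocumentStore:
    """Schemaless store that indexes every field of every document."""

    def __init__(self) -> None:
        self.docs: Dict[int, Dict[str, Any]] = {}
        # (field, term) -> ids of documents whose field contains the term
        self.index: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, doc: Dict[str, Any]) -> int:
        doc_id = len(self.docs)
        self.docs[doc_id] = doc
        for field, value in doc.items():
            for term in str(value).lower().split():
                self.index[(field, term)].add(doc_id)
        return doc_id

    def find(self, field: str, term: str) -> List[Dict[str, Any]]:
        ids = sorted(self.index.get((field, term.lower()), set()))
        return [self.docs[i] for i in ids]


# Hypothetical records with different shapes; no schema is declared.
store = TinyDocumentStore()
store.add({"title": "Quarterly report", "owner": "finance"})
store.add({"title": "Onboarding guide", "pages": 12})

print(store.find("title", "report"))
print(store.find("pages", "12"))
```

No field needs to be declared in advance, yet any field can be searched, which is the sense in which the analogy holds.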

Page 8: Building Search Engines - Lucene, SolR and Elasticsearch

Search Engine – 30 Thousand Foot View

The search index is only as good as your processed data. If you put everything you find in your index, you are going to spend a lot of time telling people how to search.
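
The point about processed data can be made concrete with a small pre-indexing step. The sketch below assumes documents arrive as dictionaries with hypothetical `id` and `body` keys; it normalizes whitespace, drops empty documents, and skips exact duplicates before anything reaches the index. Real pipelines usually do much more (format conversion, metadata extraction, access control), so treat this as one possible shape rather than a recommended design.

```python
import hashlib
from typing import Dict, Iterable, List


def prepare_for_indexing(raw_docs: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Clean documents and drop empties and duplicates before indexing."""
    seen = set()
    cleaned = []
    for doc in raw_docs:
        # Collapse runs of whitespace and trim the ends.
        body = " ".join(doc.get("body", "").split())
        if not body:
            continue  # nothing searchable in this document
        # Case-insensitive content hash to detect exact duplicates.
        digest = hashlib.sha256(body.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        cleaned.append({"id": doc["id"], "body": body})
    return cleaned


# Hypothetical input: one good document, one empty, one duplicate.
raw = [
    {"id": "a", "body": "  Project   plan "},
    {"id": "b", "body": ""},
    {"id": "c", "body": "project plan"},
]

print(prepare_for_indexing(raw))
```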

Page 9: Building Search Engines - Lucene, SolR and Elasticsearch

On Premise – Lucene / ES / SolR

Lucene

• Library

• File System

• Format

• Fast

• Embeddable*

• Indexing Anywhere

• Need to really know Lucene

• No Interface

• No server

• Lots of housekeeping

SolR

• Server

• Admin / REST Interface

• Configurable

• Scalable

• Great at Text*

• Truly Open

• 10+ Years

• Good ecosystem

• Too customizable

• Schemas*

• Zookeeper Needed

Elasticsearch

• Server

• Configurable

• Scalable

• Good ecosystem

• Built in Clustering

• Grouping / Filtering

• Great for Logs

• Started as a Cloud Tool

• No great OTS Interface

• Only REST Interface
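
Since both servers listed above are queried over HTTP, the sketch below shows what a basic search request might look like for each. It only builds the requests and sends nothing over the network. The base URLs, the collection and index name `docs`, and the field `title` are hypothetical; the Solr `/select` handler with `q`, `rows` and `wt` parameters and the Elasticsearch `_search` endpoint with a `match` query are the standard entry points, but check the reference for the version you run.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def solr_request(base_url: str, collection: str, field: str, term: str,
                 rows: int = 10) -> Request:
    """Build a GET request for Solr's /select handler."""
    params = urlencode({"q": f"{field}:{term}", "rows": rows, "wt": "json"})
    return Request(f"{base_url}/solr/{collection}/select?{params}")


def elasticsearch_request(base_url: str, index: str, field: str, term: str,
                          size: int = 10) -> Request:
    """Build a POST request for Elasticsearch's _search endpoint."""
    body = json.dumps({"query": {"match": {field: term}}, "size": size})
    return Request(
        f"{base_url}/{index}/_search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local servers on their default ports.
solr = solr_request("http://localhost:8983", "docs", "title", "lucene")
es = elasticsearch_request("http://localhost:9200", "docs", "title", "lucene")

print(solr.get_method(), solr.full_url)
print(es.get_method(), es.full_url, es.data.decode("utf-8"))
```

Passing either object to `urllib.request.urlopen` would execute the search against a running server.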

Page 10: Building Search Engines - Lucene, SolR and Elasticsearch

Cloud Search – Amazon / Azure

Amazon

• SolRCloud*

• AWS* Ecosystem

• 5 QParsers

• Dynamic Fields

• 100% Completely Managed

• Been Around for a While

• Data / Read Writes

• No nested Objects

Azure

• Elasticsearch*

• Azure* Ecosystem

• 2 QParsers

• 100% Completely Managed

• Good SDK

• Few Years Old

• Data / Read Writes

• No nested Objects

• Not so Dynamic Fields
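
Because both managed services are listed with “No nested Objects”, nested records typically have to be flattened before upload. The sketch below shows one possible approach; the underscore separator, the `flatten` helper, and the sample record are assumptions for illustration, and each service has its own rules for valid field names and types.

```python
from typing import Any, Dict


def flatten(doc: Dict[str, Any], prefix: str = "", sep: str = "_") -> Dict[str, Any]:
    """Turn nested dictionaries into a single level of prefixed field names."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the parent path into the field name.
            flat.update(flatten(value, name, sep))
        else:
            # Scalars and lists are kept as-is.
            flat[name] = value
    return flat


# Hypothetical nested record.
record = {
    "id": "42",
    "client": {"name": "Acme", "contact": {"city": "Washington"}},
    "tags": ["search", "solr"],
}

print(flatten(record))
```

Lists of nested objects are not handled here; they usually need a modeling decision (separate documents or parallel multi-valued fields) rather than a mechanical transform.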

Page 11: Building Search Engines - Lucene, SolR and Elasticsearch

Questions & Contact

@anantcorp

facebook.com/anantCorp

linkedin.com/company/anant

[email protected]/in/xingh

Rahul Singh, CEO & Founder

• Modern Enterprise

• Mastering Services in the Service of Others

• Hybrid Agile Project Management

• Building Search Engines

• CICD / DevOps

• Connecting Internet Software

Page 12: Building Search Engines - Lucene, SolR and Elasticsearch

Streamlined Data: Integration / Data Pipelines

Organized Knowledge: Search / Data Warehouses

Unified Interfaces: Portals / Dashboards / Mobile