lamper 2012/02/18 anyshare
ElasticSearch: A search engine “ready to fly”
Medcl/2012/2/18
About me
• Medcl
• medcl@sina
• medcl@github
• log.medcl.net
Why am I here?
• Good things should be shared with everyone!
What’s elasticsearch?
• “Distributed, (Near) Real Time, Search Engine”
• Open source (Apache 2.0)
• RESTful
• Schema-free (dynamic)
• Multi-tenant
• Scalable
• High availability
• Rich search features
• Good extensibility
• … …
first impression
Let’s start the trip
Debug Tools
Index a document
curl -XPOST http://localhost:9200/myindex/share/1 -d '
{
"url" : "http://www.lamper.cn/",
"date" : "2012-02-18 13:00:00",
"location" : "beijing,北京"
}'
RESTful URL address
Document content to index, in JSON format
Field name
Field value
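The same indexing call can be sketched from a script. This is a minimal sketch using only the Python standard library; the helper name `build_index_request` is hypothetical, and the `Content-Type` header is an assumption that the curl command on the slide does not send. The request is built but not sent, so it runs without a node.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the HTTP request that indexes one document."""
    # URL layout from the slide: /<index>/<type>/<id>
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    # The document travels as a JSON body, encoded as UTF-8.
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
    )


request = build_index_request(
    "http://localhost:9200", "myindex", "share", 1,
    {
        "url": "http://www.lamper.cn/",
        "date": "2012-02-18 13:00:00",
        "location": "beijing,北京",
    },
)
print(request.get_method(), request.full_url)

# To actually send it (assumes a node is listening on localhost:9200):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```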
Index Response
{
"ok": true,
"_index": "myindex",
"_type": "share",
"_id": "1",
"_version": 1
}
Explaining the URL
http://localhost:9200/myindex/share/1
• localhost: server IP address
• 9200: HTTP port
• myindex: index name
• share: type name
• 1: unique document ID
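To make the URL anatomy concrete, the sketch below splits a document URL back into the five parts named on the slide. The function name `explain_document_url` is hypothetical, and it assumes the path has exactly the three segments index/type/id.

```python
from urllib.parse import urlsplit


def explain_document_url(url):
    """Split a document URL into host, port, index, type and document ID."""
    parts = urlsplit(url)
    # Assumes the path is exactly /<index>/<type>/<id>.
    index, doc_type, doc_id = parts.path.strip("/").split("/")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "index": index,
        "type": doc_type,
        "id": doc_id,
    }


explained = explain_document_url("http://localhost:9200/myindex/share/1")
for name, value in explained.items():
    print(f"{name}: {value}")
```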
Query the document
curl -XGET 'http://localhost:9200/myindex/share/_search?q=location:beijing'
ES server address
Index name
Type name
RESTful search endpoint
Query parameter
Query condition, in the form field:value
Search Response
{ "took": 12, "timed_out": false,
"_shards": { "total": 5, "successful": 5, "failed": 0 },
"hits": {
"total": 1, "max_score": 0.5,
"hits": [ {
"_index": "myindex",
"_type": "share",
"_id": "1",
"_score": 0.5,
"_source": {
"url": "http://www.lamper.cn/",
"date": "2012-02-18 13:00:00",
"location": "beijing,北京"
} } ] }}
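A client usually only needs the hit count and each hit's `_id`, `_score` and `_source`. The sketch below parses a copy of the response shown above; `summarize_hits` is a hypothetical helper, and the field layout is taken from this slide only.

```python
import json

# A copy of the search response from the slide.
response_text = """
{ "took": 12, "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1, "max_score": 0.5,
    "hits": [ {
      "_index": "myindex", "_type": "share", "_id": "1", "_score": 0.5,
      "_source": {
        "url": "http://www.lamper.cn/",
        "date": "2012-02-18 13:00:00",
        "location": "beijing,北京"
      } } ] } }
"""


def summarize_hits(response):
    """Return (total, [(id, score, source), ...]) from a parsed search response."""
    hits = response["hits"]
    rows = [(hit["_id"], hit["_score"], hit["_source"]) for hit in hits["hits"]]
    return hits["total"], rows


total, rows = summarize_hits(json.loads(response_text))
print("total hits:", total)
for doc_id, score, source in rows:
    print(doc_id, score, source["url"])
```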
Queries
http://localhost:9200/myindex/share/_search?q=beijing
http://localhost:9200/myindex/share,conf/_search?q=beijing
http://localhost:9200/myindex/_search?q=beijing
http://localhost:9200/myindex,myindex2/_search?q=beijing
http://localhost:9200/_search?q=beijing
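The URLs above differ only in how many indices and types they name. A small sketch of that pattern follows; `search_url` is a hypothetical helper, and refusing types without an index is a simplification made here, not a rule stated on the slide.

```python
from urllib.parse import urlencode


def search_url(base_url, query, indices=(), types=()):
    """Build a query-string search URL scoped to optional indices and types."""
    if types and not indices:
        # Simplification for this sketch: only scope by type inside named indices.
        raise ValueError("types require at least one index in this sketch")
    segments = []
    if indices:
        segments.append(",".join(indices))  # several indices: comma-separated
    if types:
        segments.append(",".join(types))    # several types: comma-separated
    segments.append("_search")
    # urlencode percent-encodes reserved characters such as ':' in the query.
    return f"{base_url}/{'/'.join(segments)}?{urlencode({'q': query})}"


base = "http://localhost:9200"
print(search_url(base, "beijing", ["myindex"], ["share", "conf"]))
print(search_url(base, "beijing", ["myindex", "myindex2"]))
print(search_url(base, "beijing"))
```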
Query DSL
curl -XPOST http://localhost:9200/myindex/_search -d '
{
"query": {
"term": {
"location": "beijing"
}
}
}'
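Because the query is plain JSON, it can be assembled as a data structure instead of a string. A minimal sketch, where `term_query` and `build_search_request` are hypothetical helpers and the `Content-Type` header is an assumption; the request is built but not sent.

```python
import json
import urllib.request


def term_query(field, value):
    """Build the body of a term query like the one on the slide."""
    return {"query": {"term": {field: value}}}


def build_search_request(base_url, index, body):
    """Build (but do not send) a POST to the index's _search endpoint."""
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
    )


body = term_query("location", "beijing")
search_request = build_search_request("http://localhost:9200", "myindex", body)
print(search_request.get_method(), search_request.full_url)
print(json.dumps(body, indent=2))
```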
Why Query DSL?
Filters, caching, highlighting, facets, complex queries… …
Scalability & HA
Distributed Lucene Directory
• Each index is fully sharded with a configurable number of shards.
• Each shard can have zero or more replicas.
• Read / search operations can be served by any copy of a shard (primary or replica).
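A quick way to reason about these settings is to count shard copies: every primary shard plus each of its replicas. The sketch below is plain arithmetic; the function name is hypothetical, and the value 5 is only borrowed from the `_shards` total in the search response shown earlier.

```python
def shard_copies(number_of_shards, number_of_replicas):
    """Total shard copies in an index: each primary plus its replicas."""
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least one shard and a non-negative replica count")
    return number_of_shards * (1 + number_of_replicas)


# With 5 primary shards, each extra replica adds 5 more copies
# that can serve read / search operations.
for replicas in (0, 1, 2):
    print(f"5 shards, {replicas} replica(s): {shard_copies(5, replicas)} copies")
```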
Automatic shard allocation
From: http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010#
Scalability
• There are nodes that hold data, and nodes that do not.
• There is no need for a load balancer in elasticsearch: each node can receive a request, and if it can’t handle it, it automatically delegates it to the appropriate node(s).
• To scale out search, simply add more replicas per shard.
Transaction log
• Indexed / deleted docs are fully persistent
• No need for a Lucene IndexWriter#commit
• Managed using a transaction log / WAL
• Full single node durability (kill -9)
• Utilized when doing hot relocation of shards
• Periodically “flushed” (calling IW#commit)
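The idea behind a transaction log can be shown with a toy write-ahead log: each operation is appended and synced to disk before it is acknowledged, replayed after a crash, and discarded once a flush has committed the index. This is an illustration of the general technique only, not elasticsearch's implementation; `ToyTranslog`, `replay` and the JSON-lines file format are all assumptions made for the sketch.

```python
import json
import os
import tempfile


class ToyTranslog:
    """Toy write-ahead log: one JSON operation per line."""

    def __init__(self, path):
        self.path = path
        self.file = open(path, "a", encoding="utf-8")

    def append(self, operation):
        """Record an operation durably before acknowledging it."""
        self.file.write(json.dumps(operation) + "\n")
        self.file.flush()
        os.fsync(self.file.fileno())  # force the bytes to disk

    def flush(self):
        """Stand-in for the periodic flush: once the index is committed,
        the logged operations are no longer needed, so truncate the log."""
        self.file.close()
        self.file = open(self.path, "w", encoding="utf-8")

    def close(self):
        self.file.close()


def replay(path):
    """Read back every logged operation, as recovery after a crash would."""
    with open(path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]


log_path = os.path.join(tempfile.mkdtemp(), "translog.jsonl")
log = ToyTranslog(log_path)
log.append({"op": "index", "id": "1", "source": {"location": "beijing"}})
log.append({"op": "delete", "id": "1"})

recovered = replay(log_path)    # what a restart would see before any flush
log.flush()
after_flush = replay(log_path)  # nothing left to replay after the flush
log.close()
print(len(recovered), "operations recovered;", len(after_flush), "after flush")
```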
BASE
• Each document you index is there once the index operation is done.
• No need to commit or do anything similar to get everything persisted.
• A shard can have 1 or more replicas for HA.
• Gateway persistence is done in the background, asynchronously.
Not Mentioned Here…
• Versioning
• Template
• River
• Percolator
• Partial update
• Routing
• Parent-child type
• Scripting
• … …
That’s too much; discover it yourself
Community & Support
• http://github.com/elasticsearch
• http://groups.google.com/group/elasticsearch
• IRC: #elasticsearch@freenode
• QQ group: 190605846
• http://doc.elasticsearch.cn
• http://s.medcl.net/
BTW
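• We’re hiring:
– Distributed systems
– High performance
– Massive data processing
– Personalized recommendation
– Search engines
• If you are interested in any of the above:
– You are welcome to join our gang!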
My Company!
Thank you!