Peggy: Elasticsearch Applications

Elasticsearch Peggy

Field datatypes

Each field has a datatype, which can be:

  • a simple type like string, date, long, double, boolean or ip.
  • a type which supports the hierarchical nature of JSON, such as object or nested.
  • or a specialised type like geo_point, geo_shape, or completion.

Search

get

  http://localhost:9200/_index/_type/_id
  http://localhost:9200/_index/_type/_id?pretty

get search

  http://localhost:9200/_index/_type/_search
  http://localhost:9200/_index/_type/_search?q=xxx&pretty
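The URI search form above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `search_url` and the index and type names `website` and `blog` are illustrative assumptions, and `pretty=true` is written out explicitly instead of the bare `pretty` flag.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # host and port as shown on the slide


def search_url(index, doc_type, q=None, pretty=False):
    """Build a URL of the form /{index}/{type}/_search?q=...&pretty=true.

    Hypothetical helper for illustration; it only builds the string and
    does not contact Elasticsearch.
    """
    params = {}
    if q is not None:
        params["q"] = q  # urlencode escapes characters such as ':' and spaces
    if pretty:
        params["pretty"] = "true"
    url = f"{BASE_URL}/{index}/{doc_type}/_search"
    return f"{url}?{urlencode(params)}" if params else url


print(search_url("website", "blog", q="xxx", pretty=True))
```

Passing the query through `urlencode` keeps characters such as `:` or whitespace from breaking the URL when `q` holds a full query string expression.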

post search & query_string

  POST http://localhost:9200/_index/_type/_search

  {
    "query": {
      "query_string": { "query": "*" }
    }
  }

query_string - fields

  {
    "query_string": {
      "fields": ["content", "name"],
      "query": "this AND that"
    }
  }

is equivalent to

  {
    "query_string": {
      "query": "(content:this OR name:this) AND (content:that OR name:that)"
    }
  }
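To make the equivalence between the `fields` form and the explicit form concrete, here is a small sketch that expands a list of fields and terms into the explicit query string. The function `expand_fields` is a hypothetical illustration that handles only single-word terms joined by one operator; it is not how Elasticsearch parses queries internally.

```python
def expand_fields(fields, terms, operator="AND"):
    """Rewrite a multi-field query as an explicit per-field query string.

    Each term becomes a group that ORs the term across every field, and
    the groups are joined with the given operator.
    """
    groups = []
    for term in terms:
        alternatives = " OR ".join(f"{field}:{term}" for field in fields)
        groups.append(f"({alternatives})")
    return f" {operator} ".join(groups)


expanded = expand_fields(["content", "name"], ["this", "that"])
print(expanded)
```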

query_string - query string syntax

  • Terms separated by whitespace are combined with OR: apple phone = apple OR phone
  • title: OR title: = title: ( )
  • ! = (title: )

boolean

  isPCT: true

date & range

  dateName: [2012-01-01 TO 2012-12-31]
  dateName: [2012-01-01 TO *]
  dateName: {2011-12-31 TO *]
  range: [ 1 TO 5 ]

query_string - query

object

  inventorsRaw.name: Nicky

_missing_ & _exists_

  _missing_: title
  _exists_: title

query_string - nested

  {
    "query": {
      "nested": {
        "path": "relatedDocumentsRaw",
        "query": {
          "query_string": { "query": "relatedDocumentsRaw.type:*" }
        }
      }
    }
  }
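The nested request body has a fixed shape, so it can be produced by a small builder. This is a sketch only: the function name `nested_query_string` is an assumption made for illustration, while the path and query values are the ones shown on the slide.

```python
import json


def nested_query_string(path, query):
    """Wrap a query_string query in a nested query for the given path."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"query_string": {"query": query}},
            }
        }
    }


body = nested_query_string("relatedDocumentsRaw", "relatedDocumentsRaw.type:*")
print(json.dumps(body, indent=2))
```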

query size & from

  • size (default: 10): The size parameter allows you to configure the maximum amount of hits to be returned.
  • from (default: 0): The from parameter defines the offset from the first result you want to fetch.

[query_phase_execution_exception] Result window is too large, from + size must be less than or equal to: [10000]. See the scroll api for a more efficient way to request large data sets.
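The arithmetic behind `from`, `size`, and the result-window limit can be sketched as follows. The helper `page_params` and its 1-based `page` argument are assumptions made for illustration; the limit of 10000 is taken from the error message above.

```python
MAX_RESULT_WINDOW = 10000  # limit quoted in the error message


def page_params(page, size=10, max_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into from/size parameters.

    Raises ValueError when from + size would exceed the result window,
    which is the condition that triggers the exception shown above.
    """
    if page < 1 or size < 0:
        raise ValueError("page must be >= 1 and size must be >= 0")
    offset = (page - 1) * size
    if offset + size > max_window:
        raise ValueError(
            f"from + size = {offset + size} exceeds {max_window}; "
            "use the scroll API for deep result sets"
        )
    return {"from": offset, "size": size}


print(page_params(3, size=5))
```

Checking the window on the client side avoids a round trip that is certain to fail, and makes the switch to scan and scroll an explicit decision.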

query sort & _source

  • sort: Allows you to add one or more sorts on specific fields.
  • _source: Allows you to control how the _source field is returned with every hit.

  {
    "query": "",
    "size": 5,
    "from": 10,
    "sort": [{ "pubDate": "desc" }],
    "_source": ["pubDate"]
  }

query - filter

  {
    "query": {
      "query_string": { "query": "*" }
    },
    "filter": {
      "script": {
        "script": {
          "lang": "groovy",
          "file": "fileName",
          "params": {
            "params1": "date1",
            "params2": "date2"
          }
        }
      }
    }
  }

query - aggregations (aggs)

The aggregations framework helps provide aggregated data based on a search query.

  • size: default: 10
  • size: 0
  • min_doc_count:
  • order:
  • date_histogram:
  • terms:
  • doc_count

  {
    "query": "",
    "aggs": {
      "date_agg": {
        "date_histogram": {
          "field": "pubDate",
          "interval": "day",
          "format": "yyyy-MM-dd",
          "order": { "_count": "desc" },
          "min_doc_count": 1
        }
      },
      "kindCode_agg": {
        "terms": {
          "field": "kindCode",
          "size": 20,
          "shard_size": 20
        }
      }
    }
  }

query - aggregations (aggs): response

  {
    "aggregations": {
      "kindCode_agg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "U", "doc_count": 75879 },
          { "key": "A", "doc_count": 73732 },
          { "key": "B", "doc_count": 44115 },
          { "key": "S", "doc_count": 38981 }
        ]
      },
      "appDocs": {
        "buckets": [
          { "key_as_string": "2016-01-06", "key": 1452038400000, "doc_count": 56079 },
          { "key_as_string": "2016-01-13", "key": 1452643200000, "doc_count": 54256 },
          { "key_as_string": "2016-01-20", "key": 1453248000000, "doc_count": 80021 },
          { "key_as_string": "2016-01-27", "key": 1453852800000, "doc_count": 42351 }
        ]
      }
    }
  }
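Reading bucket counts out of an aggregation response like the one above is a matter of walking the `buckets` arrays. The sketch below assumes the response has already been parsed into a Python dict; `bucket_counts` is a hypothetical helper, and the sample is a shortened copy of the response shown on the slide.

```python
def bucket_counts(response, agg_name):
    """Map each bucket of one aggregation to its doc_count.

    Date histogram buckets carry a readable key_as_string, which is
    preferred over the epoch-millisecond key when present.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b.get("key_as_string", b["key"]): b["doc_count"] for b in buckets}


# Shortened sample taken from the response on the slide.
sample = {
    "aggregations": {
        "kindCode_agg": {
            "buckets": [
                {"key": "U", "doc_count": 75879},
                {"key": "A", "doc_count": 73732},
            ]
        },
        "appDocs": {
            "buckets": [
                {"key_as_string": "2016-01-06", "key": 1452038400000, "doc_count": 56079},
            ]
        },
    }
}

print(bucket_counts(sample, "kindCode_agg"))
print(bucket_counts(sample, "appDocs"))
```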

_timestamp field

Mapping

  {
    "mappings": {
      "my_type": {
        "_timestamp": { "enabled": true }
      }
    }
  }

query result

  {
    "_index": "test2",
    "_type": "type",
    "_id": "2",
    "_score": 1,
    "_timestamp": 1454051014319,
    "_source": {
      "name": "Tony",
      "day": "1990-03-21"
    }
  }

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html

search_type

  • count: The count search type has only a query phase. It can be used when you don't need search results, just a document count or aggregations on documents matching the query.
  • query_and_fetch: The query_and_fetch search type combines the query and fetch phases into a single step. This is an internal optimization that is used when a search request targets a single shard only.
  • dfs_query_then_fetch and dfs_query_and_fetch: The dfs search types have a prequery phase that fetches the term frequencies from all involved shards in order to calculate global term frequencies. We discuss this further in "Relevance Is Broken!".
  • scan: The scan search type is used in conjunction with the scroll API to retrieve large numbers of results efficiently. It does this by disabling sorting.

https://www.elastic.co/guide/en/elasticsearch/guide/current/_search_options.html#search-type?q=sear

Scan & Scroll

scan & scroll

  POST http://localhost:9200/{{_index}}/({{type}}/)_search?search_type=scan&scroll=1m

  {
    "query": {
      "query_string": { "query": "*" }
    }
  }

Keeping the search context alive

The scroll parameter (passed to the search request and to every scroll request) tells Elasticsearch how long it should keep the search context alive. Its value (e.g. 1m, see the section called "Time units") does not need to be long enough to process all data; it just needs to be long enough to process the previous batch of results. Each scroll request (with the scroll parameter) sets a new expiry time.

post: response

  {
    "_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT..",
    "took": 487,
    "timed_out": false,
    "_shards": { "total": 5, "successful": 5, "failed": 0 },
    "hits": {
      "total": 1041712,
      "max_score": 0,
      "hits": []
    }
  }

scroll

  GET http://localhost:9200/_search/scroll/{{_scroll_id}}?scroll=1m

http://stackoverflow.com/questions/25453872/why-does-this-elasticsearch-scan-and-scroll-keep-returning-the-same-scroll-id

get: response

  {
    "_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT..",
    "took": 487,
    "timed_out": false,
    "_shards": { "total": 5, "successful": 5, "failed": 0 },
    "hits": {
      "total": 1041712,
      "max_score": 0,
      "hits": [ {.}, {.}, {.}, {.}, {.}, {.}, {.}, {.} ]
    }
  }
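The scan and scroll exchange above amounts to a loop: take the `_scroll_id` from the latest response, request the next batch, and stop when a batch comes back with no hits. The sketch below keeps the HTTP call out of the picture by accepting a `fetch_next` callable, which is an assumption made so the loop can be shown and exercised without a running cluster; in real use it would issue the scroll GET request shown earlier.

```python
def iter_scroll_hits(initial_response, fetch_next):
    """Yield every hit from a scroll, batch by batch.

    initial_response: parsed response of the first search request
        (with search_type=scan it carries a _scroll_id but no hits).
    fetch_next: callable taking a scroll id and returning the parsed
        response of the next scroll request (hypothetical hook).
    """
    yield from initial_response["hits"]["hits"]
    scroll_id = initial_response["_scroll_id"]
    while True:
        page = fetch_next(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scroll is exhausted
        yield from hits
        # Always continue with the id from the most recent response.
        scroll_id = page["_scroll_id"]


# Stand-in for the cluster: two batches, then an empty one.
fake_pages = {
    "id-0": {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    "id-1": {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "3"}]}},
    "id-2": {"_scroll_id": "id-3", "hits": {"hits": []}},
}
first = {"_scroll_id": "id-0", "hits": {"hits": []}}

collected = [hit["_id"] for hit in iter_scroll_hits(first, fake_pages.__getitem__)]
print(collected)
```

The scroll ids here are made-up strings; real ids are opaque values returned by Elasticsearch and should be passed back unchanged.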

Ref.

  • Bulk: https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html
  • Scan & Scroll: https://www.elastic.co/guide/en/elasticsearch/guide/current/scan-scroll.html
  • http://stackoverflow.com/questions/25453872/why-does-this-elasticsearch-scan-and-scroll-keep-returning-the-same-scroll-id

Bulk

Cheaper in Bulk

  { action: { metadata }}\n
  { request body }\n
  { action: { metadata }}\n
  { request body }\n
  ..

action

delete

  { "delete": { "_index": "website", "_type": "blog", "_id": "123" }}\n

create

  { "create": { "_index": "website", "_type": "blog", "_id": "123" }}\n
  { "title": "My first blog post" }\n

index

  { "index": { "_index": "website", "_type": "blog" }}\n
  { "title": "My second blog post" }\n

update

  { "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict": 3 }}\n
  { "doc": { "title": "My updated blog post" }}\n
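The newline-delimited layout of a bulk request is easy to get wrong by hand, so the following sketch serializes a list of operations into that format. The helper `bulk_body` and its `(action, metadata, source)` tuple convention are assumptions for illustration; the documents are the ones from the slide. A `delete` action has no request body, so its source is `None`.

```python
import json


def bulk_body(operations):
    """Serialize operations into the newline-delimited bulk format.

    Each operation is a tuple (action, metadata, source). The action
    line is always written; the source line is skipped when source is
    None. Every line, including the last, ends with a newline.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    return "".join(line + "\n" for line in lines)


operations = [
    ("delete", {"_index": "website", "_type": "blog", "_id": "123"}, None),
    ("create", {"_index": "website", "_type": "blog", "_id": "123"},
     {"title": "My first blog post"}),
    ("index", {"_index": "website", "_type": "blog"},
     {"title": "My second blog post"}),
    ("update", {"_index": "website", "_type": "blog", "_id": "123"},
     {"doc": {"title": "My updated blog post"}}),
]

payload = bulk_body(operations)
print(payload, end="")
```

Because `json.dumps` emits each object on a single line by default, no document can accidentally contain the newline that separates bulk entries.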

status

  '200': 'OK', '201': 'Created'

  {
    "took": 4,
    "errors": false,
    "items": [
      { "delete": { "_index": "website", "_type": "blog", "_id": "123", "_version": 2, "status": 200, "found": true }},
      { "create": { "_index": "website", "_type": "blog", "_id": "123", "_version": 3, "status": 201 }},
      { "create": { "_index": "website", "_type": "blog", "_id": "EiwfApScQiiy7TIKFxRCTw", "_version": 1, "status": 201 }},
      { "update": { "_index": "website", "_type": "blog", "_id": "123", "_version": 4, "status": 200 }}
    ]
  }

Error Example: request

  { "create": { "_index": "website", "_type": "blog", "_id": "123" }}
  { "title": "Cannot create - it already exists" }
  { "index": { "_index": "website", "_type": "blog", "_id": "123" }}
  { "title": "But we can update it" }

Error Example: response

  {
    "took": 3,
    "errors": true,
    "items": [
      { "create": {
          "_index": "website",
          "_type": "blog",
          "_id": "123",
          "status": 409,
          "error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]"
      }},
      { "index": {
          "_index": "website",
          "_type": "blog",
          "_id": "123",
          "_version": 5,
          "status": 200
      }}
    ]
  }
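A bulk response reports success or failure per item, so a client has to inspect each entry when `errors` is true. The sketch below picks out the failed items from a parsed response; `failed_items` is a hypothetical helper, treating any status of 400 or above as a failure is an assumption, and the sample is the error response from the slide.

```python
def failed_items(response):
    """Return (position, action, status, error) for each failed item.

    The top-level errors flag is checked first so that fully
    successful responses are not scanned item by item.
    """
    if not response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(response["items"]):
        # Each item is a single-key dict: {action: result}.
        (action, result), = item.items()
        if result.get("status", 0) >= 400:
            failures.append((position, action, result["status"], result.get("error")))
    return failures


error_response = {
    "took": 3,
    "errors": True,
    "items": [
        {"create": {"_index": "website", "_type": "blog", "_id": "123",
                    "status": 409,
                    "error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]"}},
        {"index": {"_index": "website", "_type": "blog", "_id": "123",
                   "_version": 5, "status": 200}},
    ],
}

for failure in failed_items(error_response):
    print(failure)
```

The position in `items` matches the order of actions in the request, which is what lets a client map a failure back to the operation that caused it.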