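When you query massive data sets in a traditional relational DBMS, fuzzy matching in particular, you typically write LIKE '%value%'. The leading wildcard prevents the index from being used, so the query degrades to an inefficient full table scan. Even an exact-value lookup on an indexed column is comparatively slow at very large data volumes. For that reason, internet companies and large enterprises usually hand large-scale queries to a search engine, and the most mainstream search engine framework today is Elasticsearch. This post is the first installment of my Elasticsearch must-know notes: CRUD on ES index documents. More installments will follow, so stay tuned.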
-
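CRUD on ES index documents (6.x and 7.x differ: an index created in 6.x holds a single type with a name of your choosing, and only indices carried over from 5.x can hold several types, while 7.x and later support only the one fixed type, _doc; the API usage also differs slightly):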
-
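Create: create a document [POST index/type/id, where id is optional. If a document with the same index, type and id already exists, it is re-indexed (the old document is deleted and a new one is created, the same as PUT index/type/id). To prevent this silent overwrite when an id is supplied, use index/type/id?op_type=create or index/type/id/_create to declare the operation as a create; the request then fails with an error if the document already exists.]
POST demo_users/_doc or POST demo_users/_doc/2vJKsm8BriJODA6s9GbQ/_create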
Request Body:
{
"userId":1,
"username":"張三",
"role":"administrator",
"enabled":true,
"createdDate":"2020-01-01T12:00:00"
}
Response Body:
{
"_index": "demo_users",
"_type": "_doc",
"_id": "2vJKsm8BriJODA6s9GbQ",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
-
Get: retrieve a document [GET index/type/id]
GET demo_users/_doc/123
Response Body:
{
"_index": "demo_users",
"_type": "_doc",
"_id": "123",
"_version": 1,
"found": true,
"_source": {
"userId": 1,
"username": "張三",
"role": "administrator",
"enabled": true,
"createdDate": "2020-01-01T12:00:00"
}
}
-
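Index (PUT): re-index a document [PUT index/type/id or index/type/id?op_type=index. The id is required. If no document has that id, one is created; otherwise the existing document is deleted and created anew, and _version is incremented by 1.]
PUT/POST demo_users/_doc/123 or demo_users/_doc/123?op_type=index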
Request Body:
{
"userId":1,
"username":"張三",
"role":"administrator",
"enabled":true,
"createdDate":"2020-01-01T12:00:00",
"remark":"僅演示"
}
Response Body:
{
"_index": "demo_users",
"_type": "_doc",
"_id": "123",
"_version": 4,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 10,
"_primary_term": 1
}
-
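Update: partially update a document [POST index/type/id/_update (in 7.x the typeless form is POST index/_update/id). The request body must be {"doc":{the partial document JSON}}. Keys that already exist are updated, keys that do not exist are added, and nested objects are merged at every level. When the same request is repeated, _version is incremented by 1 only if some field value actually changed; otherwise the response reports result "noop" and nothing is written.]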
POST demo_users/_doc/123/_update
Request Body:
{
"doc": {
"userId": 1,
"username": "張三",
"role": "administrator",
"enabled": true,
"createdDate": "2020-01-01T12:00:00",
"remark": "僅演示POST更新5",
"updatedDate": "2020-01-17T15:30:00"
}
}
Response Body:
{
"_index": "demo_users",
"_type": "_doc",
"_id": "123",
"_version": 26,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 35,
"_primary_term": 1
}
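The difference between Create, Index (PUT) and Update is easy to mix up, so here is a minimal in-memory sketch of the semantics described above. It is not an Elasticsearch client: ToyIndex, merge and VersionConflict are hypothetical names made up for this illustration, and details such as shards, _seq_no and the 404 for a missing document are deliberately left out.

```python
class VersionConflict(Exception):
    """Stands in for the 409 returned when op_type=create hits an existing id."""


def merge(base, partial):
    """Recursively merge `partial` into a copy of `base` (nested dicts are merged)."""
    merged = dict(base)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


class ToyIndex:
    """Hypothetical in-memory model of one index; not a real client."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, source)

    def index(self, doc_id, source):
        """Like PUT index/type/id: replace the whole source, version + 1."""
        exists = doc_id in self.docs
        version = self.docs[doc_id][0] + 1 if exists else 1
        self.docs[doc_id] = (version, dict(source))
        return {"_version": version, "result": "updated" if exists else "created"}

    def create(self, doc_id, source):
        """Like op_type=create: refuse to overwrite an existing id."""
        if doc_id in self.docs:
            raise VersionConflict(doc_id)
        return self.index(doc_id, source)

    def update(self, doc_id, partial):
        """Like _update with {"doc": ...}: merge, and skip the write if nothing changed."""
        version, source = self.docs[doc_id]  # KeyError here models a missing document
        merged = merge(source, partial)
        if merged == source:
            return {"_version": version, "result": "noop"}
        self.docs[doc_id] = (version + 1, merged)
        return {"_version": version + 1, "result": "updated"}


idx = ToyIndex()
idx.create("123", {"userId": 1, "role": "administrator"})
idx.index("123", {"userId": 1})        # whole-document replace: role is gone
idx.update("123", {"remark": "demo"})  # partial update: userId kept, remark added
```

The point to take away is that Index replaces the stored source wholesale, while Update only touches the keys you send.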
-
Delete: delete a document [DELETE index/type/id]
DELETE demo_users/_doc/123
Response Body:
{
"_index": "demo_users",
"_type": "_doc",
"_id": "123",
"_version": 2,
"result": "deleted",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 39,
"_primary_term": 1
}
-
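Bulk: batch document operations [POST _bulk, index/_bulk or index/type/_bulk. A single request can mix different CRUD operations across multiple indices and types; a failure in one operation does not affect the others.]
POST _bulk
Request Body: (The body must end with one extra newline: ES splits the commands on newline characters, and the request is rejected if the final newline is missing. The body as a whole is not standard JSON; each line is a separate JSON object, so at most it can be seen as a newline-delimited sequence of JSON objects.)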
{ "index" : { "_index" : "demo_users_test", "_type" : "_doc", "_id" : "1" } }
{ "bulk_field1" : "test create index" }
{ "delete" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "123" } }
{ "create" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "2" } }
{ "bulk_field2" : "test create index2" }
{ "update" : { "_index" : "demo_users_test","_type" : "_doc","_id" : "1" } }
{ "doc": {"bulk_field1" : "test create index1","bulk_field2" : "test create index2"} }
Response Body:
{
"took": 162,
"errors": true,
"items": [
{
"index": {
"_index": "demo_users_test",
"_type": "_doc",
"_id": "1",
"_version": 8,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 7,
"_primary_term": 1,
"status": 200
}
},
{
"delete": {
"_index": "demo_users",
"_type": "_doc",
"_id": "123",
"_version": 2,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 44,
"_primary_term": 1,
"status": 404
}
},
{
"create": {
"_index": "demo_users",
"_type": "_doc",
"_id": "2",
"status": 409,
"error": {
"type": "version_conflict_engine_exception",
"reason": "[_doc][2]: version conflict, document already exists (current version [1])",
"index_uuid": "u7WE286CQnGqhHeuwW7oyw",
"shard": "2",
"index": "demo_users"
}
}
},
{
"update": {
"_index": "demo_users_test",
"_type": "_doc",
"_id": "1",
"_version": 9,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 8,
"_primary_term": 1,
"status": 200
}
}
]
}
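Two things in the bulk example tend to trip people up: the newline-delimited body with its mandatory trailing newline, and the fact that a response with "errors": true still contains successful items. The Python sketch below shows one way to handle both. The helper names build_bulk_body and failed_items are hypothetical, and only the standard json module is used; sending the body over HTTP is left to whichever client you prefer.

```python
import json


def build_bulk_body(actions):
    """Serialize (action_line, source_or_None) pairs as newline-delimited JSON.

    delete actions carry no source line, so pass None for them.
    """
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # The final "\n" is required; without it the request is rejected.
    return "\n".join(lines) + "\n"


def failed_items(response):
    """Return (action, _id, status) for every item that carries an "error" object."""
    failures = []
    for item in response.get("items", []):
        for action, detail in item.items():
            if "error" in detail:
                failures.append((action, detail.get("_id"), detail.get("status")))
    return failures
```

Applied to the response above, failed_items would report only the create on id 2 (status 409). The delete that returned not_found has status 404 but no "error" object, so this sketch does not count it as a failure.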
-
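mGet: fetch several documents in one request [POST _mget, index/_mget or index/type/_mget. If the index or type is given in the URL, the request body does not need to repeat it. Use _source to name the fields to include and the fields to exclude.]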
POST _mget
Request Body:
{
"docs": [
{
"_index": "demo_users",
"_type": "_doc",
"_id": "12345"
},
{
"_index": "demo_users",
"_type": "_doc",
"_id": "1234567",
"_source": [
"userId",
"username",
"role"
]
},
{
"_index": "demo_users",
"_type": "_doc",
"_id": "1234",
"_source": {
"include": [
"userId",
"username"
],
"exclude": [
"role"
]
}
}
]
}
Response Body:
{
"docs":[
{
"_index":"demo_users",
"_type":"_doc",
"_id":"12345",
"_version":1,
"found":true,
"_source":{
"userId":1,
"username":"張三",
"role":"administrator",
"enabled":true,
"createdDate":"2020-01-01T12:00:00"
}
},
{
"_index":"demo_users",
"_type":"_doc",
"_id":"1234567",
"_version":7,
"found":true,
"_source":{
"role":"administrator",
"userId":1,
"username":"張三"
}
},
{
"_index":"demo_users",
"_type":"_doc",
"_id":"1234",
"_version":1,
"found":true,
"_source":{
"userId":1,
"username":"張三"
}
}
]
}
POST demo_users/_doc/_mget
Request Body:
{
"ids": [
"1234",
"12345",
"123457"
]
}
Response Body:
{
"docs":[
{
"_index":"demo_users",
"_type":"_doc",
"_id":"1234",
"_version":1,
"found":true,
"_source":{
"userId":1,
"username":"張三",
"role":"administrator",
"enabled":true,
"createdDate":"2020-01-01T12:00:00",
"remark":"僅演示"
}
},
{
"_index":"demo_users",
"_type":"_doc",
"_id":"12345",
"_version":1,
"found":true,
"_source":{
"userId":1,
"username":"張三",
"role":"administrator",
"enabled":true,
"createdDate":"2020-01-01T12:00:00"
}
},
{
"_index":"demo_users",
"_type":"_doc",
"_id":"123457",
"found":false
}
]
}
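Because an _mget response mixes found and missing documents in one docs array (note "found": false for id 123457 above), client code usually has to separate the two. A minimal Python sketch, assuming the index is given in the URL as in POST demo_users/_doc/_mget; mget_body and split_mget are hypothetical helper names.

```python
def mget_body(ids, includes=None):
    """Body for an _mget request whose index is already in the URL.

    Without `includes` the short {"ids": [...]} form is enough; with it,
    each doc entry carries its own _source field list.
    """
    if includes is None:
        return {"ids": list(ids)}
    return {"docs": [{"_id": doc_id, "_source": list(includes)} for doc_id in ids]}


def split_mget(response):
    """Split an _mget response into ({id: _source}, [missing ids])."""
    found, missing = {}, []
    for doc in response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc["_source"]
        else:
            missing.append(doc["_id"])
    return found, missing
```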
-
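_update_by_query: update the given fields of every document matched by a query [POST index/_update_by_query. The request body carries the query and the fields to update; the update here is done with a Painless script for flexibility.]
POST demo_users/_update_by_query
Request Body: (Meaning: find documents with role=administrator, set role to poweruser, and append the value of params.remark to remark. The query uses role.keyword because role is a text field, which is analyzed and so cannot be matched exactly; the keyword sub-field role.keyword holds the unanalyzed value. A brief explanation follows later for anyone unsure about this.)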
{
"script":{ "source":"ctx._source.role=params.role;ctx._source.remark=ctx._source.remark+params.remark",
"lang":"painless",
"params":{
"role":"poweruser",
"remark":"采用_update_by_query更新"
}
},
"query":{
"term":{
"role.keyword":"administrator"
}
}
}
For Painless syntax details, see: Painless syntax tutorial
Response Body:
{
"took": 114,
"timed_out": false,
"total": 6,
"updated": 6,
"deleted": 0,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": [ ]
}
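When such requests are generated from code, it is generally safer to pass values through params than to paste them into the script text, since the script source then stays constant and the values need no escaping. The Python sketch below builds a body of the same shape. Note that it is a simplification: it only assigns fields (ctx._source.x=params.x) and does not reproduce the string append on remark used above. build_update_by_query is a hypothetical helper, and it assumes plain field names without dots.

```python
import json


def build_update_by_query(match_field, match_value, assignments):
    """Build an _update_by_query body that sets each field in `assignments`.

    Values travel in "params"; the script source only references them.
    Assumes simple top-level field names (no dots or special characters).
    """
    source = ";".join(f"ctx._source.{name}=params.{name}" for name in assignments)
    return {
        "script": {
            "source": source,
            "lang": "painless",
            "params": dict(assignments),
        },
        "query": {"term": {match_field: match_value}},
    }


body = build_update_by_query(
    "role.keyword", "administrator", {"role": "poweruser", "remark": "bulk-updated"}
)
payload = json.dumps(body, ensure_ascii=False)  # ready to send as the request body
```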
-
_delete_by_query: delete every document matched by a query [POST index/_delete_by_query. The request body carries the query.]
POST demo_users/_delete_by_query
Request Body: (Meaning: match documents with enabled=false)
{
"query": {
"match": {
"enabled": false
}
}
}
Response Body:
{
"took":29,
"timed_out":false,
"total":3,
"deleted":3,
"batches":1,
"version_conflicts":0,
"noops":0,
"retries":{
"bulk":0,
"search":0
},
"throttled_millis":0,
"requests_per_second":-1,
"throttled_until_millis":0,
"failures":[
]
}
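Both _update_by_query and _delete_by_query return the same kind of summary, and a request can come back with HTTP success while still reporting version conflicts or per-document failures. A small Python sketch of a sanity check on that summary follows; by_query_ok is a hypothetical helper and the field names are the ones shown in the responses above.

```python
def by_query_ok(response):
    """True if a by-query run did not time out and reported no conflicts or failures."""
    return (
        not response.get("timed_out", False)
        and response.get("version_conflicts", 0) == 0
        and not response.get("failures")
    )


def by_query_summary(response):
    """One-line human-readable summary of a by-query response."""
    return (
        f"total={response.get('total', 0)} "
        f"updated={response.get('updated', 0)} "
        f"deleted={response.get('deleted', 0)} "
        f"conflicts={response.get('version_conflicts', 0)}"
    )
```

As far as I know, both APIs stop at the first version conflict by default, and the conflicts=proceed URL parameter makes them count conflicts and continue instead, which is when a check like this becomes useful.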
-
Search
-
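URL GET search (GET index/_search?q=<query_string syntax>; note that the default analyzer splits Chinese text into one term per character)
A. Term Query [matches analyzed terms; "contains" here means matching one of the field's terms]
GET /demo_users/_search?q=role:poweruser //query a specific field: the field contains the given term
GET /demo_users/_search?q=poweruser //unqualified query (no field given): all fields are searched for poweruser, and a document is returned as soon as one field matches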
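B. Grouping and Phrase Query
Operators: AND / OR / NOT, also written && / || / !
+ means must and - means must_not; for example field:(+a -b) means field must contain a and must not contain b
GET /demo_users/_search?q=remark:(POST test) //grouping; terms are ORed by default, so remark contains POST or test
GET /demo_users/_search?q=remark:(POST OR test) //the same query with the operator written out
GET /demo_users/_search?q=remark:"POST test" //phrase query: remark contains POST immediately followed by test
GET /demo_users/_search?q=remark:(test AND POST) //remark contains both test and POST
GET /demo_users/_search?q=remark:(test NOT POST) //remark contains test but not POST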
C. Range Query
Interval notation: [] marks an inclusive bound, {} an exclusive bound
For example: year:[2019 TO 2020], {2019 TO 2020}, {2019 TO 2020] or [* TO 2020]
Comparison operators
year:>2019, (>2012 && <=2020) or (+>=2012 +<=2020)
GET /demo_users/_search?q=userId:>123 //documents whose userId is greater than 123
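D. Wildcard Query
? matches exactly one character and * matches zero or more characters, for example role:power* , role:use?
GET /demo_users/_search?q=role:power* //role starts with power, followed by zero or more other characters
Regular expressions can be used when wrapped in forward slashes, for example username:/張三[0-9]+/
Approximate matching uses ~N to widen the results: after a single term, N is the fuzziness (maximum edit distance); after a quoted phrase, N is the slop (how many positions the terms may be moved)
GET /demo_users/_search?q=remark:tett~1 //meant to find test but typed tett; ~1 allows one edit, so documents containing test are still returned
GET /demo_users/_search?q=remark:"i like shenzhen"~2 //the actual remark value is: i like hubei and shenzhen, which has the two extra terms hubei and; ~2 lets the phrase terms sit up to 2 terms (here, two words) apart, so the document is still returned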
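The URL examples above are written the way you would type them into a console. When they are sent from code, characters such as spaces, quotes, + and non-ASCII text must be percent-encoded; otherwise an unencoded + (the must operator), for instance, reaches the server as a space. Here is a small Python sketch using only the standard library. The search_url helper is hypothetical, and http://localhost:9200 in the usage is an assumed address.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base, index, query, **params):
    """Build a URI search URL with the query_string safely encoded in `q`."""
    query_string = urlencode({"q": query, **params})
    return f"{base}/{index}/_search?{query_string}"


# Usage: the phrase-with-slop example, plus a query using the + / - operators.
phrase = 'remark:"i like shenzhen"~2'
url = search_url("http://localhost:9200", "demo_users", phrase, size=5)

must = "remark:(+test -POST)"
must_url = search_url("http://localhost:9200", "demo_users", must)

# Decoding the URL again gives back exactly what was passed in.
round_trip = parse_qs(urlsplit(must_url).query)["q"][0]
```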
-
DSL POST search (POST index/_search)
POST demo_users/_search
Request Body:
{
"query":{
"bool":{
"must":[
{
"term":{
"enabled":"true" #match enabled=true
}
},
{
"term":{
"role.keyword":"poweruser" #and role=poweruser
}
},
{
"query_string":{
"default_field":"username.keyword",
"query":"張三" #and username matches 張三 (username.keyword is compared as a whole value)
}
}
],
"must_not":[
],
"should":[
]
}
},
"from":0,
"size":1000,
"sort":[
{
"createdDate":"desc" #sort by createdDate, descending
}
],
"_source":{ #fields to return: includes lists the fields to return, excludes the fields to leave out
"includes":[
"role",
"username",
"userId",
"remark"
],
"excludes":[
]
}
}
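The # comments in the request above are a console convenience; a body sent from application code has to be strict JSON. One way to keep the annotations is to build the body as a Python dict, where the comments live in the code instead. This is a minimal sketch: build_user_search is a hypothetical helper, and it keeps only the two term clauses from the example.

```python
import json


def build_user_search(role, enabled=True, size=1000):
    """Build a bool query: enabled AND role, newest first, selected fields only."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"enabled": enabled}},     # enabled = true/false
                    {"term": {"role.keyword": role}},   # exact match on the keyword sub-field
                ]
            }
        },
        "from": 0,
        "size": size,
        "sort": [{"createdDate": "desc"}],              # newest first
        "_source": {"includes": ["role", "username", "userId", "remark"]},
    }


body = build_user_search("poweruser")
payload = json.dumps(body, ensure_ascii=False)  # strict JSON, no comments
```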
For detailed usage, see:
[Elasticsearch] The various uses of query_string
The differences between match, match_phrase, query_string and term in Elasticsearch
Elasticsearch Query DSL: a summary
Bool Query
Finally, a guide to the official ES API references:
Indices APIs: create, delete, get and check the existence of an index.
Document APIs: index (create), delete and get documents.
Search APIs: search documents. The Document APIs look documents up by doc_id, whereas the Search APIs query by condition.
Aggregations: aggregate the documents of an index along various dimensions.
cat APIs: query assorted information about indices.
Cluster APIs: query assorted information about the cluster.