Using Elasticsearch (basic queries and aggregation queries)
Index operations
1. Create an index
curl -XPUT localhost:9200/index01
2. View index settings
curl -XGET http://192.168.168.101:9200/index01/_settings
curl -XGET http://192.168.168.101:9200/index01,blog/_settings
3. Delete an index
curl -XDELETE http://192.168.168.101:9200/index02
4. Close and open an index
curl -XPOST http://192.168.168.101:9200/index01/_close
curl -XPOST http://192.168.168.101:9200/index01/_open
Document management
1. Create a document
curl -XPUT -d '{"id":1,"title":"es简介"}' http://localhost:9200/index01/article/1
2. Get a document
curl -XGET http://192.168.168.101:9200/index01/article/1
3. Delete a document
curl -XDELETE http://192.168.168.101:9200/index01/article/1
Query operations
Lucene-style query
_exists_:execution_completed_time __type:company_extended_business weibo_type:18 OR weibo_type:24 OR weibo_type:25 NOT company_id:442966 first_consume_time:{'2019-01-03 00:00:00' TO '2019-01-03 00:00:00'}
Basic queries
Specify the request header
--header "Content-Type: application/json"
Prepare data
curl -XPUT -H "Content-Type: application/json" -d '{"id":1,"title":"es简介","content":"es好用好用真好用"}' http://192.168.168.101:9200/index01/article/1
curl -XPUT -H "Content-Type: application/json" -d '{"id":1,"title":"java编程思想","content":"这就是个工具书"}' http://192.168.168.101:9200/index01/article/2
curl -XPUT -H "Content-Type: application/json" -d '{"id":1,"title":"大数据简介","content":"你知道什么是大数据吗,就是大数据"}' http://192.168.168.101:9200/index01/article/3
term query
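curl -XGET -H "Content-Type: application/json" http://192.168.168.101:9200/index01/_search -d '{"query":{"term":{"title":"你好"}}}'
When you are matching a field against a single value, use term rather than terms. Use terms only when matching against several values; with the terms syntax, the JSON must contain an array.
match analyzes the search keyword first and then matches on the resulting tokens, whereas term looks the keyword up directly without analysis. In general, use match for full-text (fuzzy) search and term for exact lookups.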
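The difference between term and match can be sketched with a toy inverted index. This is only an illustration: the `analyze` function below is a hypothetical lowercase-and-split analyzer, not the analyzer Elasticsearch actually applies, and the sample titles are made up.

```python
import re

def analyze(text):
    """Toy analyzer (an assumption): lowercase and split on non-word characters."""
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]

def build_index(docs, field):
    """Map each analyzed token to the set of document ids that contain it."""
    index = {}
    for doc_id, doc in docs.items():
        for token in analyze(doc[field]):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, value):
    """term: look the value up as-is, with no analysis."""
    return index.get(value, set())

def match_query(index, text):
    """match: analyze the text first, then OR together the per-token hits."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits

docs = {1: {"title": "Quick Brown Fox"}, 2: {"title": "quick start guide"}}
index = build_index(docs, "title")
print(term_query(index, "Quick Brown"))   # no indexed token equals "Quick Brown"
print(match_query(index, "Quick Brown"))  # analyzed into "quick" and "brown" first
```

Because the index only holds analyzed tokens, an unanalyzed term lookup for a multi-word or mixed-case value finds nothing, which is why term suits exact-value fields.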
terms query
{
  "query": {
    "terms": {
      "tag": ["search", "nosql", "hello"]
    }
  }
}
match query
{"query": {"match": {"title": "你好"}}}

{
  "query": {
    "match": {
      "__type": "info"
    }
  },
  "sort": [
    {
      "campaign_end_time": {
        "order": "desc"
      }
    }
  ]
}
match_all
match_all matches every document, so it takes no field condition.
{"query": {"match_all": {}}}
multi match
Runs a match query across several fields; the ^N suffix boosts a field's weight.
{
  "query": {
    "multi_match": {
      "query": "运动 上衣",
      "fields": [
        "brandName^100",
        "brandName.brandName_pinyin^100",
        "brandName.brandName_keyword^100",
        "sortName^80",
        "sortName.sortName_pinyin^80",
        "productName^60",
        "productKeyword^20"
      ],
      "type": ,
      "operator": "AND"
    }
  }
}
Bool query
A bool query has four clause types: must, filter, should, and must_not.
{
  "query": {
    "bool": {
      "must": [
        {"term": {"_type": {"value": "age"}}},
        {"term": {"account_grade": {"value": "23"}}}
      ]
    }
  }
}

{
  "query": {
    "bool": {
      "must": {
        "term": {"user": "lucy"}
      },
      "filter": {
        "term": {"tag": "teach"}
      },
      "should": [
        {"term": {"tag": "wow"}},
        {"term": {"tag": "elasticsearch"}}
      ],
      "minimum_should_match": 1,
      "boost": 1.0
    }
  }
}
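When bool queries are assembled in application code, a small builder keeps the four clause types straight. The following is a minimal sketch; `bool_query` and `term` are hypothetical helper names, not part of any Elasticsearch client library, and the result is just a dictionary to serialize as the request body.

```python
import json

def term(field, value):
    """Build a single term clause."""
    return {"term": {field: value}}

def bool_query(must=None, filter=None, should=None, must_not=None,
               minimum_should_match=None):
    """Assemble a bool query body, leaving out clauses that were not given."""
    clauses = {"must": must, "filter": filter, "should": should, "must_not": must_not}
    body = {name: value for name, value in clauses.items() if value}
    if minimum_should_match is not None:
        body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": body}}

request_body = bool_query(
    must=[term("user", "lucy")],
    filter=[term("tag", "teach")],
    should=[term("tag", "wow"), term("tag", "elasticsearch")],
    minimum_should_match=1,
)
print(json.dumps(request_body, indent=2))
```

The printed JSON can be passed to curl with -d against the _search endpoint.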
Filter query
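Difference between query and filter: a query evaluates the conditions, computes a relevance score, and then returns the matching documents. A filter only decides whether each document satisfies the condition, and the outcome is cached either way (a non-matching document is recorded as not matching, a matching one as matching).
Filters are faster because their results are cached and no score is computed.
{
  "query": {
    "bool": {
      "must": [
        {"match_all": {}}
      ],
      "filter": {
        "range": {
          "create_admin_id": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}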
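The two reasons filters are cheap (a cached yes/no result and no scoring) can be illustrated with a toy model. Everything here is an assumption made for illustration: `cached_filter`, `score`, and the cache key are hypothetical, and the real caching policy inside Elasticsearch is more involved than a plain dictionary.

```python
def score(doc, keyword):
    """Toy relevance (an assumption): how often the keyword occurs in the content."""
    return doc["content"].count(keyword)

filter_cache = {}

def cached_filter(cache_key, docs, predicate):
    """Evaluate a yes/no condition once and reuse the matching id set afterwards."""
    if cache_key not in filter_cache:
        filter_cache[cache_key] = {
            doc_id for doc_id, doc in docs.items() if predicate(doc)
        }
    return filter_cache[cache_key]

docs = {
    1: {"content": "es es es", "create_admin_id": 12},
    2: {"content": "es", "create_admin_id": 15},
    3: {"content": "es es", "create_admin_id": 42},
}

# Filter context: membership only, no score involved.
allowed = cached_filter(
    "create_admin_id:10-20", docs, lambda d: 10 <= d["create_admin_id"] <= 20
)
# Query context: only the surviving documents need to be scored and ranked.
ranked = sorted(allowed, key=lambda doc_id: score(docs[doc_id], "es"), reverse=True)
print(ranked)
```

A second call with the same cache key returns the stored id set without re-evaluating the predicate.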
range query
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}
Wildcard query
{
  "query": {
    "wildcard": {
      "title": "cr?me"
    }
  }
}
Regular expression query
{
  "query": {
    "regexp": {
      "title": {
        "value": "cr.m[ae]",
        "boost": 10.0
      }
    }
  }
}
Prefix query
{
  "query": {
    "match_phrase_prefix": {
      "title": {
        "query": "crime punish",
        "slop": 1
      }
    }
  }
}
query_string
{
  "query": {
    "query_string": {
      "query": "title:crime^10 +title:punishment -otitle:cat +author:(+Fyodor +dostoevsky)"
    }
  }
}
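Aggregation queries
Aggregations let you group data and compute statistics over it; think of them as SQL's GROUP BY together with aggregate functions.
Metric aggregations / bucket aggregations
Metrics: apply a simple computation such as avg or max to the filtered data set and produce a single numeric value.
Bucket: split the filtered data set into several smaller data sets by some condition; metrics are then applied to each of those smaller sets.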
max/min/avg/sum/stats
{
  "aggs": {
    "group_sum": {
      "sum": {
        "field": "money"
      }
    }
  }
}

{
  "aggs": {
    "avg_fees": {
      "avg": {
        "field": "fees"
      }
    }
  }
}
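What a metric aggregation computes can be sketched in a few lines over in-memory documents. This is only an illustration of the arithmetic; the `stats` helper and the sample `fees` values are hypothetical, and documents that lack the field are simply skipped.

```python
def stats(docs, field):
    """Compute count, min, max, avg and sum over one numeric field."""
    values = [doc[field] for doc in docs if field in doc]
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }

docs = [{"fees": 10}, {"fees": 30}, {"fees": 20}, {"other": 1}]
result = stats(docs, "fees")
print(result)
```

Each of max, min, avg and sum on its own corresponds to picking one entry out of this result.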
terms aggregation
terms groups documents by the values of a field. field names the field to group by, size sets how many groups to return, shard_size sets how many groups each shard returns, and order sets the sort order. include and exclude accept regular expressions that filter which values are bucketed, and missing supplies a default value for documents that lack the field.
{
  "aggs": {
    "group_by_type": {
      "terms": {
        "field": "_type"
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "terms": {
      "terms": {
        "field": "__type",
        "size": 10
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "terms": {
      "terms": {
        "field": "__type",
        "size": 10,
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "agg_terms": {
      "terms": {
        "field": "cost",
        "order": {
          "_count": "asc"
        }
      },
      "aggs": {
        "max_balance": {
          "max": {
            "field": "cost"
          }
        }
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "agg_terms": {
      "terms": {
        "field": "cost",
        "include": ".*",
        "exclude": ".*"
      }
    }
  }
}
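The grouping that a terms aggregation performs, including size, ordering by count, and a nested max metric, can be sketched as follows. The `terms_agg` function is a hypothetical single-process illustration; it ignores sharding, so shard_size has no counterpart here.

```python
from collections import defaultdict

def terms_agg(docs, field, size=10, ascending=False, metric_field=None):
    """Group docs by field value, count each group, optionally add a max metric."""
    groups = defaultdict(list)
    for doc in docs:
        if field in doc:
            groups[doc[field]].append(doc)
    buckets = []
    for key, members in groups.items():
        bucket = {"key": key, "doc_count": len(members)}
        if metric_field is not None:
            # Nested metric: applied separately inside each bucket.
            bucket["max"] = max(
                (m[metric_field] for m in members if metric_field in m), default=None
            )
        buckets.append(bucket)
    buckets.sort(key=lambda b: b["doc_count"], reverse=not ascending)
    return buckets[:size]

docs = [
    {"__type": "info", "cost": 5},
    {"__type": "info", "cost": 9},
    {"__type": "error", "cost": 7},
]
by_count_desc = terms_agg(docs, "__type", metric_field="cost")
by_count_asc = terms_agg(docs, "__type", size=1, ascending=True)
print(by_count_desc)
print(by_count_asc)
```

The `ascending=True` call mirrors `"order": {"_count": "asc"}`, and `size=1` keeps only the first bucket after sorting.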
cardinality (distinct count)
{
  "size": 0,
  "aggs": {
    "count_type": {
      "cardinality": {
        "field": "__type"
      }
    }
  }
}
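Conceptually, cardinality is a distinct count of a field's values. The sketch below computes it exactly with a set. Note the difference from the real aggregation: Elasticsearch returns an approximate count (based on the HyperLogLog++ algorithm), so results on large data sets may deviate slightly from this exact version. The `cardinality` helper and sample documents are hypothetical.

```python
def cardinality(docs, field):
    """Exact distinct count of a field's values; docs without the field are ignored."""
    return len({doc[field] for doc in docs if field in doc})

docs = [
    {"__type": "info"},
    {"__type": "error"},
    {"__type": "info"},
    {"message": "no type here"},
]
distinct_types = cardinality(docs, "__type")
print(distinct_types)
```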
percentiles
percentiles sorts the values of the given field (or script) in ascending order, accumulates the share of documents at each value (as a percentage of all matching documents), and returns the value found at each requested percentage. By default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles.
{
  "size": 0,
  "aggs": {
    "age_percents": {
      "percentiles": {
        "field": "age",
        "percents": [ 1, 5, 25, 50, 75, 95, 99 ]
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "states": {
      "terms": {
        "field": "gender"
      },
      "aggs": {
        "balances": {
          "percentile_ranks": {
            "field": "balance",
            "values": [ 20000, 40000 ]
          }
        }
      }
    }
  }
}
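To make the definition concrete, here is one common exact way to compute a percentile: linear interpolation between the two closest ranks. This is an assumption chosen for illustration; Elasticsearch computes percentiles approximately, so its numbers need not match this sketch exactly. The `percentile` helper and the sample ages are hypothetical.

```python
def percentile(sorted_values, percent):
    """Value at the given percentage, interpolating linearly between closest ranks."""
    if not sorted_values:
        raise ValueError("percentile of an empty list is undefined")
    rank = (percent / 100) * (len(sorted_values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

ages = sorted([20, 22, 25, 30, 31, 35, 40, 41, 50, 60, 70])
median_age = percentile(ages, 50)
lower_quartile = percentile(ages, 25)
print(median_age, lower_quartile)
```

With 11 sorted values, the 50th percentile lands exactly on the middle element, while the 25th falls halfway between the third and fourth values.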
percentile ranks
Computes the percentage of documents whose value is less than or equal to each specified value.
{
  "size": 0,
  "aggs": {
    "tests": {
      "percentile_ranks": {
        "field": "age",
        "values": [ 10, 15 ]
      }
    }
  }
}
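The percentile rank is the inverse question of a percentile: given a value, what share of documents is at or below it. A minimal exact sketch follows; `percentile_rank` and the sample ages are hypothetical, and the real aggregation may return approximate figures.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to the threshold."""
    if not values:
        raise ValueError("percentile rank of an empty list is undefined")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)

ages = [8, 10, 12, 15, 20]
ranks = {value: percentile_rank(ages, value) for value in (10, 15)}
print(ranks)
```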
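filter aggregation
filter aggregates only the documents that satisfy a filter: from the documents matched by the query, it selects those meeting the filter condition and aggregates them. In other words, filter first, then aggregate.
{
  "size": 0,
  "aggs": {
    "agg_filter": {
      "filter": {
        "match": {"gender": "F"}
      },
      "aggs": {
        "avgs": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}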
filters aggregation
Aggregates over several named filter groups; each named filter produces its own bucket.
{
  "size": 0,
  "aggs": {
    "message": {
      "filters": {
        "filters": {
          "errors": {
            "exists": {
              "field": "__type"
            }
          },
          "warnings": {
            "term": {
              "__type": "info"
            }
          }
        }
      }
    }
  }
}
range aggregation
{
  "aggs": {
    "agg_range": {
      "range": {
        "field": "cost",
        "ranges": [
          {
            "from": 50,
            "to": 70
          },
          {
            "from": 100
          }
        ]
      },
      "aggs": {
        "bmax": {
          "max": {
            "field": "cost"
          }
        }
      }
    }
  }
}
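One detail worth seeing in code is the bucket boundary rule: in a range aggregation, from is inclusive and to is exclusive, and a missing bound leaves that side open. The sketch below mimics the request above, including the nested max; `range_agg` and the sample costs are hypothetical.

```python
def range_agg(docs, field, ranges):
    """Bucket values into [from, to) ranges and report count and max per bucket."""
    buckets = []
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        # from is inclusive, to is exclusive
        members = [d[field] for d in docs if field in d and low <= d[field] < high]
        buckets.append({
            "from": r.get("from"),
            "to": r.get("to"),
            "doc_count": len(members),
            "max": max(members, default=None),
        })
    return buckets

docs = [{"cost": 50}, {"cost": 65}, {"cost": 70}, {"cost": 120}]
buckets = range_agg(docs, "cost", [{"from": 50, "to": 70}, {"from": 100}])
print(buckets)
```

The document with cost 70 falls into neither bucket: it is excluded from the first by the exclusive upper bound and is below the second bucket's lower bound.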
date_range aggregation
{
  "aggs": {
    "date_aggrs": {
      "date_range": {
        "field": "accepted_time",
        "format": "MM-yyy",
        "ranges": [
          {
            "from": "now-10d/d",
            "to": "now"
          }
        ]
      }
    }
  }
}
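date_histogram
A date histogram aggregation buckets documents by day, month, year, and so on. You can aggregate by year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), second (1s), or by a custom time interval.
In the request below, setting min_doc_count to 0 also returns buckets that contain no documents, and extended_bounds forces the histogram to cover the given range.
{
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "accepted_time",
        "interval": "quarter",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2014-01-01",
          "max": "2014-12-31"
        }
      }
    }
  }
}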
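The combined effect of a quarterly interval, min_doc_count of 0, and extended_bounds can be sketched in plain Python: every quarter between the bounds appears in the output, even when no document falls into it. The helper names and sample dates are hypothetical, and time zones are ignored.

```python
from collections import Counter
from datetime import date

def quarter_start(day):
    """First day of the calendar quarter containing `day`."""
    return date(day.year, 3 * ((day.month - 1) // 3) + 1, 1)

def next_quarter(start):
    """First day of the quarter following the one that begins at `start`."""
    if start.month == 10:
        return date(start.year + 1, 1, 1)
    return date(start.year, start.month + 3, 1)

def quarterly_histogram(days, bounds_min, bounds_max):
    """Count days per quarter, emitting empty buckets across the extended bounds."""
    counts = Counter(quarter_start(d) for d in days)
    # The bounds extend the covered range; they never cut off buckets that hold data.
    first = min([quarter_start(bounds_min)] + list(counts))
    last = max([quarter_start(bounds_max)] + list(counts))
    buckets, current = [], first
    while current <= last:
        buckets.append((current.isoformat(), counts.get(current, 0)))
        current = next_quarter(current)
    return buckets

days = [date(2014, 2, 3), date(2014, 2, 20), date(2014, 8, 1)]
histogram = quarterly_histogram(days, date(2014, 1, 1), date(2014, 12, 31))
print(histogram)
```

Without the bounds and the zero-count buckets, only the first and third quarters would be reported for this sample.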
missing aggregation
{
  "aggs": {
    "account_missing": {
      "missing": {
        "field": "__type"
      }
    }
  }
}
Logstash operations
Start Logstash
logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug}}'
IK analyzer
curl -XPOST http://192.168.168.101:9200/_analyze -d '{"analyzer":"ik","text":"JAVA编程思想"}'
http://192.168.168.101:9200/index01/_analyze?analyzer=ik&text=%E4%B8%AD%E5%8D%8E%E4%BA%BA%E6%B0%91%E5%85%B1%E5%92%8C%E5%9B%BD
Index a test document for the IK analyzer
curl -XPUT -d '{"id":1,"kw":"我们都爱中华人民共和国"}' http://192.168.168.101:9200/haha1/haha/1
Mapping
View the mapping
curl -XGET http://192.168.168.101:9200/jtdb_item/tb_item/_mapping