
Configuring the IK Chinese Analyzer and the Pinyin Analyzer in Elasticsearch 7.5

Date: 2022-01-15 03:16:14


1. Install the plugins

1.1 Install the plugins

Pinyin analyzer: /medcl/elasticsearch-analysis-pinyin

Chinese analyzer: /medcl/elasticsearch-analysis-ik

Pick the plugin release that matches your Elasticsearch version exactly. The versions used here are:

Elasticsearch 7.5.1
elasticsearch-analysis-ik 7.5.1
elasticsearch-analysis-pinyin 7.5.1

Change into the Elasticsearch installation directory and install the two plugins online, one after the other:

./bin/elasticsearch-plugin install /medcl/elasticsearch-analysis-ik/releases/download/v7.5.1/elasticsearch-analysis-ik-7.5.1.zip
./bin/elasticsearch-plugin install /medcl/elasticsearch-analysis-pinyin/releases/download/v7.5.1/elasticsearch-analysis-pinyin-7.5.1.zip

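After installation, restart Elasticsearch, then test each analyzer. If everything works, the _analyze requests in the next two subsections return a list of tokens.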
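Since the plugin version has to match the Elasticsearch version, it can help to check this after the restart. The following is only a minimal sketch: it assumes you have already fetched the plugin list as JSON (for example from GET /_cat/plugins?format=json, whose rows carry "component" and "version" fields) and that the two plugins are reported as analysis-ik and analysis-pinyin. The function name and the expected plugin names are assumptions made for this illustration, and the check is written with a single-node setup in mind.

```python
from typing import Dict, List, Sequence

# Plugin names as I expect Elasticsearch to report them (an assumption;
# adjust them if your cluster lists the plugins under different names).
EXPECTED_PLUGINS = ("analysis-ik", "analysis-pinyin")


def find_version_mismatches(
    es_version: str,
    plugins: List[Dict[str, str]],
    expected: Sequence[str] = EXPECTED_PLUGINS,
) -> List[str]:
    """Return a list of problems; an empty list means everything matches.

    `plugins` is the parsed JSON plugin listing: a list of objects that
    each have a "component" (plugin name) and a "version" key.
    """
    installed = {row["component"]: row["version"] for row in plugins}
    problems = []
    for name in expected:
        if name not in installed:
            problems.append(f"{name}: not installed")
        elif installed[name] != es_version:
            problems.append(
                f"{name}: version {installed[name]} != Elasticsearch {es_version}"
            )
    return problems
```

An empty result means both plugins are present and match the version passed in.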

1.2 Test the Chinese analyzer

POST http://data:9200/_analyze
{"analyzer": "ik_smart", "text": "新型冠状病毒"}

Tokenization result

{
  "tokens": [
    {"token": "新型", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 0},
    {"token": "冠状病毒", "start_offset": 2, "end_offset": 6, "type": "CN_WORD", "position": 1}
  ]
}
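The same test can be scripted. The sketch below uses only the Python standard library and the host from this article (http://data:9200); the helper names are made up for illustration. Building the request and reading the response are kept separate so that the parsing part can be tried without a running cluster.

```python
import json
import urllib.request
from typing import Any, Dict, List

ES_URL = "http://data:9200"  # host used in this article; change to your own


def build_analyze_request(
    analyzer: str, text: str, base_url: str = ES_URL
) -> urllib.request.Request:
    """Build the POST /_analyze request without sending it."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def token_texts(response: Dict[str, Any]) -> List[str]:
    """Pull just the token strings out of an _analyze response."""
    return [tok["token"] for tok in response["tokens"]]


def analyze(analyzer: str, text: str, base_url: str = ES_URL) -> List[str]:
    """Send the request and return the tokens (needs a reachable cluster)."""
    request = build_analyze_request(analyzer, text, base_url)
    with urllib.request.urlopen(request) as resp:
        return token_texts(json.load(resp))
```

With the cluster from this article running, analyze("ik_smart", "新型冠状病毒") should give the two tokens shown in the result above.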

1.3 Test the pinyin analyzer

POST http://data:9200/_analyze
{"analyzer": "pinyin", "text": "新型冠状病毒"}

Tokenization result

{
  "tokens": [
    {"token": "xin", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0},
    {"token": "xxgzbd", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0},
    {"token": "xing", "start_offset": 0, "end_offset": 0, "type": "word", "position": 1},
    {"token": "guan", "start_offset": 0, "end_offset": 0, "type": "word", "position": 2},
    {"token": "zhuang", "start_offset": 0, "end_offset": 0, "type": "word", "position": 3},
    {"token": "bing", "start_offset": 0, "end_offset": 0, "type": "word", "position": 4},
    {"token": "du", "start_offset": 0, "end_offset": 0, "type": "word", "position": 5}
  ]
}
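In this result the first-letter abbreviation xxgzbd shares position 0 with the first syllable xin, while every other syllable gets its own position. Grouping the tokens by position makes that easier to see. The helper below is a small illustrative sketch (the function name is made up) that works on an already parsed _analyze response.

```python
from collections import defaultdict
from typing import Any, Dict, List


def tokens_by_position(response: Dict[str, Any]) -> Dict[int, List[str]]:
    """Group the token strings of an _analyze response by their position."""
    grouped = defaultdict(list)
    for tok in response["tokens"]:
        grouped[tok["position"]].append(tok["token"])
    # Return a plain dict ordered by position for readable output.
    return dict(sorted(grouped.items()))
```

For the response above, position 0 holds both xin and xxgzbd.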

2. Modify the analyzer

All of the following operations change the analyzer of the song index.

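2.1 Close the index

Close the index first. Analysis settings are static and can only be changed on a closed index, so updating them on an open index returns an error.

POST http://data:9200/song/_close
{}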

2.2 Configure IK + pinyin analysis

Next, define a custom analyzer. Here I use the ik_smart tokenizer followed by a pinyin token filter:

PUT http://data:9200/song/_settings
{
  "index": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": "pinyin_filter"
        }
      },
      "filter": {
        "pinyin_filter": {
          "type": "pinyin",
          "keep_first_letter": false
        }
      }
    }
  }
}

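Alternatively, use the ik_max_word tokenizer with the pinyin filter. ik_smart produces the coarsest split, while ik_max_word produces the most fine-grained one:

PUT http://data:9200/song/_settings
{
  "index": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": "pinyin_filter"
        }
      },
      "filter": {
        "pinyin_filter": {
          "type": "pinyin",
          "keep_first_letter": false
        }
      }
    }
  }
}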
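The two request bodies differ only in the tokenizer, so they can be generated from one function. This is a sketch rather than an official client API: the function name and the ALLOWED_TOKENIZERS constant are made up for illustration, while the analyzer name, filter name and options are the ones used in this article.

```python
from typing import Any, Dict

# The two IK tokenizers discussed in this article.
ALLOWED_TOKENIZERS = ("ik_smart", "ik_max_word")


def build_ik_pinyin_settings(
    tokenizer: str = "ik_smart", keep_first_letter: bool = False
) -> Dict[str, Any]:
    """Build the body for PUT /<index>/_settings as shown in section 2.2."""
    if tokenizer not in ALLOWED_TOKENIZERS:
        raise ValueError(f"tokenizer must be one of {ALLOWED_TOKENIZERS}")
    return {
        "index": {
            "analysis": {
                "analyzer": {
                    "ik_pinyin_analyzer": {
                        "type": "custom",
                        "tokenizer": tokenizer,
                        "filter": "pinyin_filter",
                    }
                },
                "filter": {
                    "pinyin_filter": {
                        "type": "pinyin",
                        "keep_first_letter": keep_first_letter,
                    }
                },
            }
        }
    }
```

Passing the result through json.dumps gives the request body; build_ik_pinyin_settings("ik_max_word") yields the second variant.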

2.3 Reopen the index

POST http://data:9200/song/_open
{}
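Putting section 2 together, the sequence is close, update settings, reopen. The sketch below is one possible way to script it, not the only one: the function names are made up, the sender is passed in so the order of calls can be checked without a cluster, and the try/finally is my own addition so that the index is reopened even if the settings update fails.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# A sender takes (method, path, body) and returns the parsed JSON response.
Sender = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def http_sender(base_url: str) -> Sender:
    """Create a sender that talks to a real cluster, e.g. http://data:9200."""

    def send(method: str, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
        request = urllib.request.Request(
            url=base_url + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method=method,
        )
        with urllib.request.urlopen(request) as resp:
            return json.load(resp)

    return send


def reconfigure(index: str, settings: Dict[str, Any], sender: Sender) -> Dict[str, Any]:
    """Close the index, apply the settings, and always reopen the index."""
    sender("POST", f"/{index}/_close", {})
    try:
        return sender("PUT", f"/{index}/_settings", settings)
    finally:
        # Reopen even when the settings update raised an error.
        sender("POST", f"/{index}/_open", {})
```

Against the cluster from this article the call would be reconfigure("song", settings, http_sender("http://data:9200")), where settings is one of the request bodies from section 2.2. Afterwards, a POST to http://data:9200/song/_analyze with "analyzer": "ik_pinyin_analyzer" is a quick way to confirm that the new analyzer is in place.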
