Modifying the IK Analyzer Source for Elasticsearch: Dynamic Hot-Word Updates from MySQL (Three Fixes for One Error)

Date: 2020-07-12 12:46:44

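I recently ran into a problem at work: configuring the IK analyzer in Elasticsearch so that it hot-reloads its dictionary from MySQL.

I followed this blog post: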

/xiaoxiaoliu/p/11218109.html

But following it as written produces the error below:

[-08-11T11:27:53,515][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [DESKTOP-0PKSCKK] fatal error in thread [elasticsearch[DESKTOP-0PKSCKK][clusterApplierService#updateTask][T#1]], exiting
java.lang.ExceptionInInitializerError: null
    at java.lang.Class.forName0(Native Method) ~[?:1.8.0_202]
    at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_202]
    at com.mysql.cj.jdbc.NonRegisteringDriver.<clinit>(NonRegisteringDriver.java:106) ~[?:?]
    at java.lang.Class.forName0(Native Method) ~[?:1.8.0_202]
    at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_202]
    at org.wltea.analyzer.dic.Dictionary.<clinit>(Dictionary.java:108) ~[?:?]
    at org.wltea.analyzer.cfg.Configuration.<init>(Configuration.java:40) ~[?:?]
    at org.elasticsearch.index.analysis.IkTokenizerFactory.<init>(IkTokenizerFactory.java:15) ~[?:?]
    at org.elasticsearch.index.analysis.IkTokenizerFactory.getIkSmartTokenizerFactory(IkTokenizerFactory.java:23) ~[?:?]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:445) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:286) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:214) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:421) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:603) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:542) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:173) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:484) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:246) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517) ~[elasticsearch-7.6.2.jar:7.6.2]
    at java.lang.Iterable.forEach(Iterable.java:75) ~[?:1.8.0_202]
    at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73) ~[elasticsearch-7.6.2.jar:7.6.2]
    at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176) ~[elasticsearch-7.6.2.jar:7.6.2]
    at mon.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) ~[elasticsearch-7.6.2.jar:7.6.2]
    at mon.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) ~[elasticsearch-7.6.2.jar:7.6.2]
    at mon.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) ~[elasticsearch-7.6.2.jar:7.6.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "setContextClassLoader")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:1.8.0_202]
    at java.security.AccessController.checkPermission(AccessController.java:884) ~[?:1.8.0_202]
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549) ~[?:1.8.0_202]
    at java.lang.Thread.setContextClassLoader(Thread.java:1474) ~[?:1.8.0_202]
    at com.mysql.cj.jdbc.AbandonedConnectionCleanupThread$1.newThread(AbandonedConnectionCleanupThread.java:56) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:619) ~[?:1.8.0_202]
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932) ~[?:1.8.0_202]
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367) ~[?:1.8.0_202]
    at java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668) ~[?:1.8.0_202]
    at com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.<clinit>(AbandonedConnectionCleanupThread.java:60) ~[?:?]
    ... 31 more
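
The part of this trace that matters is the "Caused by" section at the bottom. The ExceptionInInitializerError at the top only says that a static initializer failed (Dictionary's, while loading the MySQL driver class). The real reason is the AccessControlException underneath: the driver's AbandonedConnectionCleanupThread creates a thread and calls Thread.setContextClassLoader, and the security manager refuses. As a minimal sketch of that reading, the helper below walks a cause chain down to its innermost exception. RootCause is a hypothetical name used only for illustration, and the exceptions in main are stand-ins for the two in the log.

```java
final class RootCause {
    /** Follows getCause() links until the innermost throwable is reached. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain, or at a throwable that names itself as its cause
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the wrapper error and the permission failure in the log
        Throwable inner = new SecurityException(
                "access denied (\"java.lang.RuntimePermission\" \"setContextClassLoader\")");
        Throwable outer = new ExceptionInInitializerError(inner);
        System.out.println(RootCause.of(outer).getMessage());
    }
}
```

All three solutions below target that one missing permission, just at different levels: the JDK, the Elasticsearch JVM, or the plugin itself.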

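Solution 1

The first solution is to grant the permission in the JDK folder. This is the fix most posts online suggest, and it is easy to find. In the JDK folder, under jdk1.8.0_161\jre\lib\security, open java.policy and add the following as the last line of the grant block: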

permission java.security.AllPermission;

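Then restart ES and the problem is solved.

The catch: at my company the JDK is managed by the ops team, and they are not going to add that just because I asked >.<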

Solution 2

Fine, let's try something else. Surely we are allowed to change Elasticsearch itself 0.0

Add the policy setting to elasticsearch-7.6.2\config\jvm.options:

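-Djava.security.policy={path to your .policy file}

The .policy file is the one under resources in the IK source. Configure the permissions you need in it, for example permission java.security.AllPermission.

Restart ES and this solves the problem too.

...

But then the person who runs the ES cluster said, "This is your IK problem, why should I deal with it? Not my job, I'm not adding it" >.<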

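Solution 3

All right, let's look again. It turns out the problem can be solved by changing only the IK source.

Here is the whole process again, from the start.

1. MySQL script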

# Create the database and the tables (I'll skip the database; the JDBC URL in step 3 uses extra_dic)
CREATE TABLE hot_words (
  id bigint(20) NOT NULL AUTO_INCREMENT,
  word varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'word',
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE hot_stopwords (
  id bigint(20) NOT NULL AUTO_INCREMENT,
  stopword varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'stop word',
  PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

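2. Download the IK source and import it into IDEA

In the pom, change the Elasticsearch version to the one you are running.

Add the MySQL driver dependency: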

<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.11</version>
</dependency>

Add the following to src/main/assemblies/plugin.xml so that the driver jar is packaged with the plugin:

<dependencySet>
    <outputDirectory>/</outputDirectory>
    <useProjectArtifact>true</useProjectArtifact>
    <useTransitiveFiltering>true</useTransitiveFiltering>
    <includes>
        <include>mysql:mysql-connector-java</include>
    </includes>
</dependencySet>

3. In the config folder, add jdbc-reload.properties, which holds the MySQL connection settings and the SQL statements

jdbc.url=jdbc:mysql://127.0.0.1:3306/extra_dic?characterEncoding=UTF-8&serverTimezone=GMT&useSSL=false&nullCatalogMeansCurrent=true
jdbc.user=root
jdbc.password=123456
# SQL that loads the hot words
jdbc.reload.sql=select word from hot_words
# SQL that loads the stop words
jdbc.reload.stopword.sql=select stopword as word from hot_stopwords
# Interval between updates, in milliseconds (the code passes it to Thread.sleep)
jdbc.reload.interval=10000
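
A typo in a key name here only shows up later as a null inside the JDBC call, so it can help to check the file up front. The sketch below is one way to do that with java.util.Properties: it reads the same keys as the file above and fails fast when one is missing. ReloadConfig and its field names are hypothetical and not part of IK. Values are trimmed because Properties keeps trailing spaces, which would otherwise break parsing of the interval.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ReloadConfig {
    final String url;
    final String wordSql;
    final String stopwordSql;
    final long intervalMillis;

    private ReloadConfig(String url, String wordSql, String stopwordSql, long intervalMillis) {
        this.url = url;
        this.wordSql = wordSql;
        this.stopwordSql = stopwordSql;
        this.intervalMillis = intervalMillis;
    }

    /** Parses the text of a jdbc-reload.properties file and validates the keys the reload code reads. */
    static ReloadConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ReloadConfig(
                require(p, "jdbc.url"),
                require(p, "jdbc.reload.sql"),
                require(p, "jdbc.reload.stopword.sql"),
                Long.parseLong(require(p, "jdbc.reload.interval")));
    }

    /** Returns the trimmed value for key, or throws if the key is absent or blank. */
    private static String require(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        String text = "jdbc.url=jdbc:mysql://127.0.0.1:3306/extra_dic?useSSL=false\n"
                + "jdbc.reload.sql=select word from hot_words\n"
                + "jdbc.reload.stopword.sql=select stopword as word from hot_stopwords\n"
                + "jdbc.reload.interval=10000\n";
        ReloadConfig cfg = ReloadConfig.parse(text);
        System.out.println(cfg.url + " every " + cfg.intervalMillis + " ms");
    }
}
```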

4. Add a thread class, HotDicReloadThread.java, to the org.wltea.analyzer.dic package in the source

package org.wltea.analyzer.dic;

import org.apache.logging.log4j.Logger;
import org.wltea.analyzer.help.ESPluginLoggerFactory;

/**
 * Reloads the dictionaries by calling Dictionary.getSingleton().reLoadMainDict().
 * The version I started from looped forever here. I moved the repetition into the
 * scheduler (step 6), so this class no longer loops.
 */
public class HotDicReloadThread implements Runnable {

    private static final Logger logger = ESPluginLoggerFactory.getLogger(HotDicReloadThread.class.getName());

    public void run() {
        logger.info("[==========]reload hot dict from mysql......");
        Dictionary.getSingleton().reLoadMainDict();
    }
}

5. Add the following four methods to Dictionary in org.wltea.analyzer.dic

// These methods assume Dictionary declares a Properties field, for example
//   private static final Properties prop = new Properties();
// and imports java.io.FileInputStream, java.sql.*, java.util.Properties,
// java.security.AccessController, java.security.PrivilegedAction
// and org.elasticsearch.SpecialPermission.

private void loadMySQLExtDict() {
    logger.info("--------mysql hotword add---------------");
    // Run the JDBC work inside doPrivileged so the plugin's own policy applies
    SpecialPermission.check();
    AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
        this.loadMySQLExtDictrun();
        return null;
    });
}

/**
 * Loads the hot-update dictionary from MySQL.
 */
private void loadMySQLExtDictrun() {
    try {
        // Class.forName("com.mysql.jdbc.Driver");
        Class.forName("com.mysql.cj.jdbc.Driver");
    } catch (ClassNotFoundException e) {
        logger.error("error", e);
    }
    Connection conn = null;
    Statement stmt = null;
    ResultSet rs = null;
    try {
        Path file = PathUtils.get(getDictRoot(), "jdbc-reload.properties");
        // try-with-resources closes the properties file after loading
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            prop.load(in);
        }
        conn = DriverManager.getConnection(
                prop.getProperty("jdbc.url"),
                prop.getProperty("jdbc.user"),
                prop.getProperty("jdbc.password"));
        stmt = conn.createStatement();
        rs = stmt.executeQuery(prop.getProperty("jdbc.reload.sql"));
        while (rs.next()) {
            String theWord = rs.getString("word");
            // The word column allows NULL, so skip null and blank rows
            if (theWord == null || theWord.trim().isEmpty()) {
                continue;
            }
            logger.info("[==========]hot word from mysql: " + theWord);
            _MainDict.fillSegment(theWord.trim().toCharArray());
        }
        // Left over from the loop-based version: pause for jdbc.reload.interval ms before returning
        Thread.sleep(Integer.parseInt(prop.getProperty("jdbc.reload.interval")));
    } catch (Exception e) {
        logger.error("error", e);
    } finally {
        if (rs != null) {
            try {
                rs.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
        if (stmt != null) {
            try {
                stmt.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
        if (conn != null) {
            try {
                conn.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
    }
}

private void loadMySQLStopwordDict() {
    logger.info("--------mysql stop_word add---------------");
    SpecialPermission.check();
    AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
        this.loadMySQLStopwordDictrun();
        return null;
    });
}

/**
 * Loads the stop words from MySQL.
 * by blad
 */
private void loadMySQLStopwordDictrun() {
    try {
        // Class.forName("com.mysql.jdbc.Driver");
        Class.forName("com.mysql.cj.jdbc.Driver");
    } catch (ClassNotFoundException e) {
        logger.error("error", e);
    }
    Connection conn = null;
    Statement stmt = null;
    ResultSet rs = null;
    try {
        Path file = PathUtils.get(getDictRoot(), "jdbc-reload.properties");
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            prop.load(in);
        }
        // Note: this logs every property, including jdbc.password
        for (Object key : prop.keySet()) {
            logger.info("[==========]" + key + "=" + prop.getProperty(String.valueOf(key)));
        }
        logger.info("[==========]query hot stopword dict from mysql, " + prop.getProperty("jdbc.reload.stopword.sql") + "......");
        conn = DriverManager.getConnection(
                prop.getProperty("jdbc.url"),
                prop.getProperty("jdbc.user"),
                prop.getProperty("jdbc.password"));
        stmt = conn.createStatement();
        rs = stmt.executeQuery(prop.getProperty("jdbc.reload.stopword.sql"));
        while (rs.next()) {
            String theWord = rs.getString("word");
            // The stopword column allows NULL, so skip null and blank rows
            if (theWord == null || theWord.trim().isEmpty()) {
                continue;
            }
            logger.info("[==========]hot stopword from mysql: " + theWord);
            _StopWords.fillSegment(theWord.trim().toCharArray());
        }
        // Left over from the loop-based version: pause for jdbc.reload.interval ms before returning
        Thread.sleep(Integer.parseInt(prop.getProperty("jdbc.reload.interval")));
    } catch (Exception e) {
        logger.error("error", e);
    } finally {
        if (rs != null) {
            try {
                rs.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
        if (stmt != null) {
            try {
                stmt.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
        if (conn != null) {
            try {
                conn.close();
            } catch (SQLException e) {
                logger.error("error", e);
            }
        }
    }
}

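Both tables declare their word column as DEFAULT NULL, so a row can come back from rs.getString as null, and calling trim() on that value throws a NullPointerException that ends the load loop early. Any loop that feeds words into fillSegment therefore has to skip null and blank rows before trimming. A minimal sketch of that filtering as a standalone helper follows. WordFilter and toSegments are hypothetical names; in the plugin the resulting char arrays would be the arguments to fillSegment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WordFilter {
    /** Drops null and blank entries, trims the rest, and converts each word to a char array. */
    static List<char[]> toSegments(List<String> rawWords) {
        List<char[]> segments = new ArrayList<>();
        for (String raw : rawWords) {
            if (raw == null) {
                continue; // a NULL column value
            }
            String word = raw.trim();
            if (word.isEmpty()) {
                continue; // an empty or whitespace-only value
            }
            segments.add(word.toCharArray());
        }
        return segments;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("  elasticsearch ", null, "   ", "ik");
        for (char[] segment : toSegments(rows)) {
            System.out.println(new String(segment));
        }
    }
}
```
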
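6. In the Dictionary class, modify three methods

They are:

initial: I added pool.scheduleAtFixedRate(new HotDicReloadThread(), 10, 60, TimeUnit.SECONDS);

This uses the pool thread pool that already exists in the source instead of starting a new thread.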

public static synchronized void initial(Configuration cfg) {
    if (singleton == null) {
        synchronized (Dictionary.class) {
            if (singleton == null) {
                singleton = new Dictionary(cfg);
                singleton.loadMainDict();
                singleton.loadSurnameDict();
                singleton.loadQuantifierDict();
                singleton.loadSuffixDict();
                singleton.loadPrepDict();
                singleton.loadStopWordDict();

                // Step 1: schedule the periodic reload of the dictionaries
                // new Thread(new HotDicReloadThread()).start();
                pool.scheduleAtFixedRate(new HotDicReloadThread(), 10, 60, TimeUnit.SECONDS);

                if (cfg.isEnableRemoteDict()) {
                    // Start the monitor threads
                    for (String location : singleton.getRemoteExtDictionarys()) {
                        // 10 s is the initial delay (adjustable), 60 s is the interval
                        pool.scheduleAtFixedRate(new Monitor(location), 10, 60, TimeUnit.SECONDS);
                    }
                    for (String location : singleton.getRemoteExtStopWordDictionarys()) {
                        pool.scheduleAtFixedRate(new Monitor(location), 10, 60, TimeUnit.SECONDS);
                    }
                }
            }
        }
    }
}
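
One property of scheduleAtFixedRate is worth keeping in mind with this setup: if a run of the task throws, the executor suppresses all later runs, so a single uncaught failure would silently stop the hot reload. The MySQL loaders above catch Exception themselves, but the rest of the reload path may not. A minimal sketch of a guard follows, assuming a wrapper around the task. GuardedReload and guarded are hypothetical names and not part of IK, and the flaky task only simulates an outage.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedReload {
    /** Wraps a task so that a failure is reported but never escapes to the scheduler. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("reload failed, will retry on the next tick: " + e);
            }
        };
    }

    /** Schedules a task that fails on its first run and returns how many runs were observed. */
    static int runDemo(int wantedRuns) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(wantedRuns);
        Runnable flaky = () -> {
            int n = calls.incrementAndGet();
            done.countDown();
            if (n == 1) {
                throw new IllegalStateException("simulated MySQL outage");
            }
        };
        pool.scheduleAtFixedRate(guarded(flaky), 0, 20, TimeUnit.MILLISECONDS);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("runs observed: " + runDemo(3));
    }
}
```

In the plugin, the same idea would mean passing guarded(new HotDicReloadThread()) to pool.scheduleAtFixedRate, or putting the try/catch directly in HotDicReloadThread.run.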

loadMainDict: I added this.loadMySQLExtDict();

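/**
 * Loads the main dictionary and the extension dictionaries.
 */
private void loadMainDict() {
    // Create the main dictionary instance
    _MainDict = new DictSegment((char) 0);

    // Read the main dictionary file
    Path file = PathUtils.get(getDictRoot(), Dictionary.PATH_DIC_MAIN);
    loadDictFile(_MainDict, file, false, "Main Dict");

    // Step 2: load the hot words from MySQL
    this.loadMySQLExtDict();

    // Load the extension dictionaries
    this.loadExtDict();
    // Load the remote custom dictionaries
    this.loadRemoteExtDict();
}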

loadStopWordDict: at the end I added one line, this.loadMySQLStopwordDict();

/**
 * Loads the user's extension stop-word dictionaries.
 */
private void loadStopWordDict() {
    // Create the stop-word dictionary instance
    _StopWords = new DictSegment((char) 0);

    // Read the main stop-word file
    Path file = PathUtils.get(getDictRoot(), Dictionary.PATH_DIC_STOP);
    loadDictFile(_StopWords, file, false, "Main Stopwords");

    // Load the extension stop-word dictionaries
    List<String> extStopWordDictFiles = getExtStopWordDictionarys();
    if (extStopWordDictFiles != null) {
        for (String extStopWordDictName : extStopWordDictFiles) {
            logger.info("[Dict Loading] " + extStopWordDictName);
            // Read the extension dictionary file
            file = PathUtils.get(extStopWordDictName);
            loadDictFile(_StopWords, file, false, "Extra Stopwords");
        }
    }

    // Load the remote stop-word dictionaries
    List<String> remoteExtStopWordDictFiles = getRemoteExtStopWordDictionarys();
    for (String location : remoteExtStopWordDictFiles) {
        logger.info("[Dict Loading] " + location);
        List<String> lists = getRemoteWords(location);
        // Skip this location if the remote dictionary could not be loaded
        if (lists == null) {
            logger.error("[Dict Loading] " + location + " load failed");
            continue;
        }
        for (String theWord : lists) {
            if (theWord != null && !"".equals(theWord.trim())) {
                // Add the remote words to the in-memory dictionary
                logger.info(theWord);
                _StopWords.fillSegment(theWord.trim().toLowerCase().toCharArray());
            }
        }
    }

    // Step 3: load the stop words from MySQL
    this.loadMySQLStopwordDict();
}

Finally, add the permission to the policy file under resources:

permission java.lang.RuntimePermission "setContextClassLoader";
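
The policy entry and the doPrivileged calls in step 5 work as a pair. A permission check normally inspects every caller on the stack, and the Elasticsearch frames that triggered the dictionary load do not hold this permission. AccessController.doPrivileged cuts the check off at the code that called it, so only the plugin's own grant (the line above) is consulted. The sketch below shows that wrapping pattern in isolation. Privileged and call are hypothetical names, and it leaves out the SpecialPermission.check() call used in the plugin because that class belongs to Elasticsearch. AccessController is deprecated for removal from Java 17 onward, which is why the warnings are suppressed; on the Java 8 runtime shown in the log it is the normal API.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.function.Supplier;

final class Privileged {
    /** Runs work with the permissions of this class's own code source and returns its result. */
    @SuppressWarnings({"removal", "deprecation"})
    static <T> T call(Supplier<T> work) {
        // The cast selects the PrivilegedAction overload rather than PrivilegedExceptionAction
        return AccessController.doPrivileged((PrivilegedAction<T>) work::get);
    }

    public static void main(String[] args) {
        // Without a security manager installed this simply runs the supplier
        String loader = Privileged.call(
                () -> String.valueOf(Thread.currentThread().getContextClassLoader()));
        System.out.println("context class loader: " + loader);
    }
}
```

Granting only setContextClassLoader, as done here, is much narrower than the AllPermission used in Solutions 1 and 2.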

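After packaging, unzip the zip file from the release folder, upload it, and restart ES. That resolves everything.

It took a long time, but I finally got it working. When you hit a problem, read the source >0<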
