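This post describes how to configure Elasticsearch so that an index is not overwritten.
I used the official documentation page "URL-based access control" as a reference (www.elastic.co).
Test environment: Elasticsearch 6.0.0-rc1
Elasticsearch basically accesses indices through URLs.
By default, an index can be overwritten. However, as I understood the documentation, adding rest.action.multi.allow_explicit_index with the value false to the elasticsearch.yml file should make it possible to prevent this.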
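As a minimal sketch of that edit, the following appends the setting to a configuration file. The temporary file is only a stand-in for config/elasticsearch.yml (an assumption made so the sketch is safe to run anywhere). Note the space after the colon, which YAML needs in order to read the line as a key and a value.

```shell
#!/bin/sh
# Hypothetical stand-in for config/elasticsearch.yml so nothing real is modified.
ES_YML="$(mktemp)"

# Append the setting. The space after the colon is required by YAML.
cat >> "$ES_YML" <<'EOF'
rest.action.multi.allow_explicit_index: false
EOF

# Show only the active (uncommented) lines of the file.
grep -v '^[[:space:]]*#' "$ES_YML"
```

To apply this to a real installation, ES_YML would point at the actual elasticsearch.yml, and the node would have to be restarted for the file to be read again.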
Below is the default elasticsearch.yml file (before adding the rest.action.multi.allow_explicit_index setting).
    bash-3.2$ pwd
    /Users/sakura818uuu/elasticsearch-6.0.0-rc1
    bash-3.2$ ls
    LICENSE.txt README.textile bin data logs plugins
    NOTICE.txt accounts.json config lib modules
    bash-3.2$ cd config/
    bash-3.2$ ls
    elasticsearch.yml jvm.options log4j2.properties
    bash-3.2$ cat elasticsearch.yml
    # ======================== Elasticsearch Configuration =========================
    #
    # NOTE: Elasticsearch comes with reasonable defaults for most settings.
    #       Before you set out to tweak and tune the configuration, make sure you
    #       understand what are you trying to accomplish and the consequences.
    #
    # The primary way of configuring a node is via this file. This template lists
    # the most important settings you may want to configure for a production cluster.
    #
    # Please consult the documentation for further information on configuration options:
    # https://www.elastic.co/guide/en/elasticsearch/reference/index.html
    #
    # ---------------------------------- Cluster -----------------------------------
    #
    # Use a descriptive name for your cluster:
    #
    #cluster.name: my-application
    #
    # ------------------------------------ Node ------------------------------------
    #
    # Use a descriptive name for the node:
    #
    #node.name: node-1
    #
    # Add custom attributes to the node:
    #
    #node.attr.rack: r1
    #
    # ----------------------------------- Paths ------------------------------------
    #
    # Path to directory where to store the data (separate multiple locations by comma):
    #
    #path.data: /path/to/data
    #
    # Path to log files:
    #
    #path.logs: /path/to/logs
    #
    # ----------------------------------- Memory -----------------------------------
    #
    # Lock the memory on startup:
    #
    #bootstrap.memory_lock: true
    #
    # Make sure that the heap size is set to about half the memory available
    # on the system and that the owner of the process is allowed to use this
    # limit.
    #
    # Elasticsearch performs poorly when the system is swapping the memory.
    #
    # ---------------------------------- Network -----------------------------------
    #
    # Set the bind address to a specific IP (IPv4 or IPv6):
    #
    #network.host: 192.168.0.1
    #
    # Set a custom port for HTTP:
    #
    #http.port: 9200
    #
    # For more information, consult the network module documentation.
    #
    # --------------------------------- Discovery ----------------------------------
    #
    # Pass an initial list of hosts to perform discovery when new node is started:
    # The default list of hosts is ["127.0.0.1", "[::1]"]
    #
    #discovery.zen.ping.unicast.hosts: ["host1", "host2"]
    #
    # Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
    #
    #discovery.zen.minimum_master_nodes: 3
    #
    # For more information, consult the zen discovery module documentation.
    #
    # ---------------------------------- Gateway -----------------------------------
    #
    # Block initial recovery after a full cluster restart until N nodes are started:
    #
    #gateway.recover_after_nodes: 3
    #
    # For more information, consult the gateway module documentation.
    #
    # ---------------------------------- Various -----------------------------------
    #
    # Require explicit names when deleting indices:
    #
    #action.destructive_requires_name: true
Let's actually try it.
    bash-3.2$ vim elasticsearch.yml
    bash-3.2$ cat elasticsearch.yml
    # ======================== Elasticsearch Configuration =========================
    (omitted: same as the default contents shown above)
    # ---------------------------------- Various -----------------------------------
    #
    # Require explicit names when deleting indices:
    #
    #action.destructive_requires_name: true
    # Add
    rest.action.multi.allow_explicit_index:false
    bash-3.2$ curl -XPUT 'localhost:9200/indexusertest?pretty&pretty'
    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "indexusertest"
    }
    bash-3.2$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
    health status index         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   customer      tl6qvdROTfuL380eLOxH0Q   5   1          2            0      8.3kb          8.3kb
    yellow open   indexusertest EO9eYTNoT-i7_L3PBIb7dQ   5   1          0            0      1.1kb          1.1kb
    yellow open   bank          yaVaZLiLT2G0RyA-vBn5nw   5   1       1000            0    488.3kb        488.3kb
    bash-3.2$ curl -XPUT 'localhost:9200/customer/doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
    > > {
    > > "name": "John Doe"
    > > }
    >
    bash-3.2$ curl -XPUT 'localhost:9200/indexusertest/doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
    > {
    >   "name": "John Doe"
    > }
    > '
    {
      "_index" : "indexusertest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    bash-3.2$ curl -XGET 'localhost:9200/indexusertest/doc/1?pretty&pretty'
    {
      "_index" : "indexusertest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "John Doe"
      }
    }
    bash-3.2$ curl -XPUT 'localhost:9200/indexusertest/doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
    > {
    >   "name": "Jane Doe"
    > }
    > '
    {
      "_index" : "indexusertest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 2,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 1,
      "_primary_term" : 1
    }
    bash-3.2$ curl -XGET 'localhost:9200/indexusertest/doc/1?pretty&pretty'
    {
      "_index" : "indexusertest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 2,
      "found" : true,
      "_source" : {
        "name" : "Jane Doe"
      }
    }
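Contrary to my expectation, the index could still be overwritten as usual even after adding the setting to elasticsearch.yml. Elasticsearch had been running the whole time during the steps above, so I thought elasticsearch.yml might be applied properly after a restart, and tried the same thing again.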
    ^C[2017-10-29T18:35:29,817][INFO ][o.e.n.Node ] [Xj840__] stopping ...
    [2017-10-29T18:35:29,881][INFO ][o.e.n.Node ] [Xj840__] stopped
    [2017-10-29T18:35:29,882][INFO ][o.e.n.Node ] [Xj840__] closing ...
    [2017-10-29T18:35:29,900][INFO ][o.e.n.Node ] [Xj840__] closed
    bash-3.2$ ./bin/elasticsearch
    Exception in thread "main" 2017-10-29 18:35:36,902 main ERROR No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
    ElasticsearchParseException[malformed, expected settings to start with 'object', instead was [VALUE_STRING]]
        at org.elasticsearch.common.settings.loader.XContentSettingsLoader.load(XContentSettingsLoader.java:73)
        at org.elasticsearch.common.settings.loader.XContentSettingsLoader.load(XContentSettingsLoader.java:52)
        at org.elasticsearch.common.settings.loader.YamlSettingsLoader.load(YamlSettingsLoader.java:50)
        at org.elasticsearch.common.settings.Settings$Builder.loadFromStream(Settings.java:1069)
        at org.elasticsearch.common.settings.Settings$Builder.loadFromPath(Settings.java:1058)
        at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:99)
        at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:78)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:69)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
        at org.elasticsearch.cli.Command.main(Command.java:90)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85)
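Elasticsearch could no longer start. A log4j2 message is printed, but the fatal error is the ElasticsearchParseException raised while loading elasticsearch.yml. The only thing I had changed was the line rest.action.multi.allow_explicit_index:false in elasticsearch.yml. That line has no space after the colon, so YAML most likely read it as a single plain string rather than a key and a value, which matches the "instead was [VALUE_STRING]" part of the message. It defeats the purpose, but after deleting that line Elasticsearch started properly again.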
    bash-3.2$ ./bin/elasticsearch
    [2017-10-29T18:38:29,327][INFO ][o.e.n.Node ] [] initializing ...
    (omitted)
    [2017-10-29T18:38:36,999][INFO ][o.e.n.Node ] [Xj840__] started
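The "instead was [VALUE_STRING]" failure seen earlier is what one would expect when a settings line is written as key:value with no space after the colon. The sketch below is a rough heuristic, not a YAML parser, for spotting such lines before restarting a node. The function name and the sample file are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Print uncommented lines whose first colon is not followed by whitespace,
# e.g. "key:value", which YAML reads as one plain string instead of a mapping.
find_missing_colon_space() {
  grep -nE '^[[:space:]]*[^#[:space:]][^:]*:[^[:space:]]' "$1"
}

# Hypothetical sample file reproducing a commented line, a valid line,
# and the spelling without a space after the colon.
SAMPLE="$(mktemp)"
cat > "$SAMPLE" <<'EOF'
#action.destructive_requires_name: true
http.port: 9200
rest.action.multi.allow_explicit_index:false
EOF

find_missing_colon_space "$SAMPLE"
```

Only the third line of the sample is reported. Because this is a pattern match, keys that legitimately contain a colon (for example inside quotes) would be flagged as well.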
Perhaps the way I had written it in elasticsearch.yml was wrong. I added rest.action.multi.allow_explicit_index:false again, this time in a form closer to the default contents of elasticsearch.yml, and restarted Elasticsearch.
    bash-3.2$ vim elasticsearch.yml
    bash-3.2$ cat elasticsearch.yml
    # ======================== Elasticsearch Configuration =========================
    (omitted: same as the default contents shown above)
    # ---------------------------------- Various -----------------------------------
    #
    # Require explicit names when deleting indices:
    #
    #action.destructive_requires_name: true
    #
    # --------------------------------- Add ----------------------------------------
    #
    # rest.action.multi.allow_explicit_index:false
    #
    bash-3.2$ ./bin/elasticsearch
    [2017-10-29T18:44:56,179][INFO ][o.e.n.Node ] [] initializing ...
    (omitted)
    [2017-10-29T18:45:04,328][INFO ][o.e.n.Node ] [Xj840__] started
Elasticsearch now started properly, so I checked again whether the original goal, not overwriting an index, had been achieved.
    bash-3.2$ curl -XPUT 'localhost:9200/noindextest/doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
    > {
    >   "name": "John Doe"
    > }
    > '
    {
      "_index" : "noindextest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    bash-3.2$ curl -XGET 'localhost:9200/noindextest/doc/1?pretty&pretty'
    {
      "_index" : "noindextest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "John Doe"
      }
    }
    bash-3.2$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
    health status index         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   customer      tl6qvdROTfuL380eLOxH0Q   5   1          2            0      8.3kb          8.3kb
    yellow open   bank          yaVaZLiLT2G0RyA-vBn5nw   5   1       1000            0    488.3kb        488.3kb
    yellow open   indexusertest EO9eYTNoT-i7_L3PBIb7dQ   5   1          1            0      4.7kb          4.7kb
    yellow open   noindextest   LseoEEkrSiGuMeWXJNEMrA   5   1          1            0      4.5kb          4.5kb
    bash-3.2$ curl -XPUT 'localhost:9200/noindextest/doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
    > {
    >   "name": "Jane Doe"
    > }
    > '
    {
      "_index" : "noindextest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 2,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 1,
      "_primary_term" : 1
    }
    bash-3.2$ curl -XGET 'localhost:9200/noindextest/doc/1?pretty&pretty'
    {
      "_index" : "noindextest",
      "_type" : "doc",
      "_id" : "1",
      "_version" : 2,
      "found" : true,
      "_source" : {
        "name" : "Jane Doe"
      }
    }
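As before, the document was indexed and overwritten. Note that in the elasticsearch.yml shown above the added line begins with "#", so it appears to have been read as a comment, and the setting was probably not in effect during this check.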
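One way to check whether a running node actually loaded a setting from elasticsearch.yml is to look at the node settings reported by the nodes info API. The following is a minimal sketch: the filter function name is hypothetical, the host and port are assumed to be the local defaults used throughout this post, and the JSON fed in at the end is a made-up fragment so that the sketch runs without a cluster.

```shell
#!/bin/sh
# Keep only the allow_explicit_index entry from a flat settings dump on stdin.
show_explicit_index_setting() {
  grep -o '"rest\.action\.multi\.allow_explicit_index"[^,}]*'
}

# Against a live node (local defaults assumed) the check would look like:
#   curl -s 'localhost:9200/_nodes/settings?flat_settings=true&pretty' \
#     | show_explicit_index_setting

# Hypothetical response fragment, used here instead of a real node.
printf '%s\n' '{ "settings" : { "rest.action.multi.allow_explicit_index" : "false" } }' \
  | show_explicit_index_setting
```

If the filter prints nothing for a real node, the setting was not picked up, for example because the line is commented out or the node was not restarted.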
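As a result, although I followed the official documentation, I could not prevent an index from being overwritten in Elasticsearch 6.0.0-rc1.
If I find a way to do it, or it becomes possible, I will write about it on this blog again.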
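Addendum, 2017/10/30
I asked about this on the official Elasticsearch Japanese discussion forum.
Jun Ohtani then kindly replied as follows.
As stated in the documentation for the setting described in the blog post, this feature rejects a request when an index name is given in the request body. With Bulk and similar APIs, it causes an error if anything in the request body specifies an index name. I could not quite tell from the blog post what you meant by "overwriting", but the "overriding" referred to here is "overriding" of the index name.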
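To make the reply concrete, here is a minimal sketch of the two kinds of Bulk request it distinguishes. The helper function names are hypothetical. The index name, type, and document reuse the values from the tests above. The curl commands are left as comments because they need a running node, and the behavior described for them follows the reply and the documentation rather than something verified here.

```shell
#!/bin/sh
# Bulk body whose action line names the index inside the request body.
# With rest.action.multi.allow_explicit_index set to false, this is the
# form that should be rejected.
bulk_body_explicit() {
  printf '%s\n' '{"index":{"_index":"noindextest","_type":"doc","_id":"1"}}' '{"name":"John Doe"}'
}

# Bulk body without an index name; the index is taken from the URL instead.
bulk_body_url_only() {
  printf '%s\n' '{"index":{"_id":"1"}}' '{"name":"John Doe"}'
}

# Against a live node (local defaults assumed) the two requests would be:
#   bulk_body_explicit | curl -s -XPOST 'localhost:9200/_bulk?pretty' \
#     -H 'Content-Type: application/x-ndjson' --data-binary @-
#   bulk_body_url_only | curl -s -XPOST 'localhost:9200/noindextest/doc/_bulk?pretty' \
#     -H 'Content-Type: application/x-ndjson' --data-binary @-

# Print both bodies so the difference in the action line is visible.
bulk_body_explicit
bulk_body_url_only
```

In other words, the setting restricts where the index name may come from. It does not stop a second PUT to the same document ID from replacing the document, which is consistent with the results above.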