elasticsearch nodes cluster-computing data-loss

Rebuild a compromised cluster on Elasticsearch


I'm running an Elasticsearch cluster with 3 nodes (one of them is the master and the other two are master-eligible). Unfortunately, all three were stopped at the same time, and after restarting them I'm running into 2 different problems:

  1. The cluster is no longer able to elect a master node (logs from my Linux machine below):
[2024-01-08T12:07:34,186][WARN ][o.e.c.c.ClusterFormationFailureHelper] [masterNodeES1] master not discovered or elected yet, an election requires at least 2 nodes with ids from [Wp2ThiNCT-xpIskc0FTITg, 7PZ-SP5usdoKrL4tjfSMgA, nLFU0ydhTgsVItQNFL3T2n], have discovered [{masterNodeES1}{7PZ-SP5usdoKrL4tjfSMgA}{Z2B4Rja5TneDDm7N6fGYjQ}{<ip_master_node>}{<ip_master_node>:9300}{dilm}{ml.machine_memory=4046721024, xpack.installed=true, ml.max_open_jobs=20}] which is not a quorum; discovery will continue using [<ip_node_2>:9300, <ip_node_3>:9300] from hosts providers and [{masterNodeES1}{7PZ-SP5usdoKrL4tjfSMgA}{Z4B4Rji5TzeDDe7N5fBYjQ}{<ip_master_node>}{<ip_master_node>:9300}{dilm}{ml.machine_memory=4046721024, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 102018, last-accepted version 13415 in term 11968
  2. I can no longer query my cluster because authentication fails. The output below, from my Linux machine, is returned by curl -u user:password -XGET '<master_node_ip>:9200/_cat/indices?pretty':
{ "error" : { "root_cause" : [ { "type" : "security_exception", "reason" : "failed to authenticate user [elastic]", "header" : { "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\"" } } ], "type" : "security_exception", "reason" : "failed to authenticate user [elastic]", "header" : { "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\"" } }, "status" : 401 }

Since I can't query the cluster, I can't tell the exact Elasticsearch version (it could be 5.x.x), but I noticed that the elasticsearch-reset-password tool is not present in my /bin folder. Is there a way to restore my cluster without losing the data stored on its nodes? Thank you in advance.


Solution

  • You can add a local user with the bin/x-pack/users command in ES v5, or with the bin/elasticsearch-users command in ES v6 and onwards.

    Elasticsearch version 5

    bin/x-pack/users useradd test -p test -r superuser
    

    Elasticsearch version 6 and onwards

    bin/elasticsearch-users useradd test -p test -r superuser
    

    Test it (use the http or https variant, depending on your setup):

    curl -k 'http://localhost:9200/_cluster/health?pretty' -u test:test
    curl -k 'https://localhost:9200/_cluster/health?pretty' -u test:test
    

    After creating the user, you can send curl requests only to localhost on the node where you ran the command, because the user is stored in that node's local file realm. However, if the .security index shards are not available, you will get the same security_exception error. If it does not work, check your Elasticsearch logs, make sure all nodes are up and running, and look for the specific log entries that point to the cause.
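As a minimal sketch of the log check, the helper below prints the most recent WARN and ERROR lines from an Elasticsearch log file. The function name recent_problems and the log path in the usage comment are assumptions, not something from the question; package installs usually log under /var/log/elasticsearch/, but your path.logs setting may differ.

```shell
# Hypothetical helper: show the last N WARN/ERROR lines of an Elasticsearch log.
# $1 = log file path, $2 = number of lines to show (defaults to 20).
# Matches the level field of lines such as:
#   [2024-01-08T12:07:34,186][WARN ][o.e.c.c.ClusterFormationFailureHelper] ...
recent_problems() {
  grep -E '\]\[(WARN|ERROR) *\]' "$1" | tail -n "${2:-20}"
}

# Usage (assumed log location, adjust to your install):
#   recent_problems /var/log/elasticsearch/<cluster_name>.log 50
```

Run it on every node; the "master not discovered or elected yet" warning from the question would show up here along with any errors about the .security index.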

    To diagnose the issue:

    1. Check the network: try to reach master2 from master1 over the transport port, e.g. telnet ip_node_2 9300
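If telnet is not installed, the same check can be sketched with bash alone. This is a rough example under stated assumptions: it needs bash (for /dev/tcp) and the coreutils timeout command, and the function names check_port and report_transport, as well as the ES_TRANSPORT_PORT variable, are made up for illustration.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are installed.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Print one status line per host given as an argument.
# The port defaults to 9300 (the transport port seen in the question's logs);
# ES_TRANSPORT_PORT is an illustrative override, not an Elasticsearch setting.
report_transport() {
  port="${ES_TRANSPORT_PORT:-9300}"
  for host in "$@"; do
    if check_port "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port unreachable"
    fi
  done
}

# Usage (placeholders from the question):
#   report_transport ip_node_2 ip_node_3
```

A reachable port only shows that the network path and the listening socket are fine; it does not prove that the node has joined the cluster, so the logs still need to be checked.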

    Question: Did you lose any master node disks?