
Dockerized elasticsearch and fscrawler: failed to create elasticsearch client, disabling crawler… Connection refused


I received the following error when attempting to connect Dockerized fscrawler to Dockerized elasticsearch:

[f.p.e.c.f.c.ElasticsearchClientManager] failed to create elasticsearch client, disabling crawler…
[f.p.e.c.f.FsCrawler] Fatal error received while running the crawler: [Connection refused]


Solution

  • When fscrawler is run for the first time (i.e., docker-compose run fscrawler), it creates /config/{fscrawler_job}/_settings.yml with the following default setting:

    elasticsearch:
      nodes:
      - url: "http://127.0.0.1:9200"
    

    This makes fscrawler try to connect to localhost (i.e., 127.0.0.1). When fscrawler runs inside a Docker container, that fails because 127.0.0.1 refers to the localhost of the CONTAINER, and elasticsearch is not listening there. This was particularly confusing in my case because elasticsearch WAS accessible as localhost, but on the localhost of my physical computer (through the published port 9200), NOT on the localhost of the container. Changing the url to the elasticsearch service name, which Docker resolves on the shared network (esnet), allowed fscrawler to connect to the network address where elasticsearch actually resides.

    elasticsearch:
      nodes:
      - url: "http://elasticsearch:9200"
    

    I used the following docker image: https://hub.docker.com/r/toto1310/fscrawler

    # FILE: docker-compose.yml
    
    version: '2.2'
    services:
      # FSCrawler 
      fscrawler:
        image: toto1310/fscrawler
        container_name: fscrawler
        volumes:
          - ${PWD}/config:/root/.fscrawler
          - ${PWD}/data:/tmp/es
        networks: 
          - esnet
        command: fscrawler job_name
    
      # Elasticsearch Cluster
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.3.2
        container_name: elasticsearch
        environment:
          - node.name=elasticsearch
          - discovery.seed_hosts=elasticsearch2
          - cluster.initial_master_nodes=elasticsearch,elasticsearch2
          - cluster.name=docker-cluster
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - esdata01:/usr/share/elasticsearch/data
        ports:
          - 9200:9200
        networks:
          - esnet
      elasticsearch2:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.3.2
        container_name: elasticsearch2
        environment:
          - node.name=elasticsearch2
          - discovery.seed_hosts=elasticsearch
          - cluster.initial_master_nodes=elasticsearch,elasticsearch2
          - cluster.name=docker-cluster
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - esdata02:/usr/share/elasticsearch/data
        networks:
          - esnet
    
    volumes:
      esdata01:
        driver: local
      esdata02:
        driver: local
    
    networks:
      esnet:
    

    Ran docker-compose up elasticsearch elasticsearch2 to bring up the elasticsearch nodes.
    Ran docker-compose run fscrawler to create _settings.yml.
    Edited _settings.yml to:

    elasticsearch:
      nodes:
      - url: "http://elasticsearch:9200"
    

    Started fscrawler with docker-compose up fscrawler.
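
The manual edit to _settings.yml can also be scripted. The following is only a minimal sketch, not part of fscrawler: the function name set_es_host is hypothetical, and the example path assumes the ${PWD}/config volume mapping and the job name job_name from the compose file above. Because the file is created from inside the container, it may be owned by root on the host, in which case the script would need matching permissions.

```shell
#!/usr/bin/env bash
# Sketch: point fscrawler's _settings.yml at the elasticsearch service
# instead of 127.0.0.1. set_es_host is a hypothetical helper name.

# set_es_host FILE [HOST]
#   FILE: path to _settings.yml on the host
#   HOST: hostname reachable from the fscrawler container (default: elasticsearch)
set_es_host() {
  local file="$1"
  local host="${2:-elasticsearch}"

  if [ ! -f "$file" ]; then
    echo "settings file not found: $file" >&2
    return 1
  fi

  # -i.bak edits in place and keeps a backup copy next to the file;
  # this form is accepted by both GNU sed and BSD sed.
  sed -i.bak "s#http://127\.0\.0\.1:9200#http://${host}:9200#" "$file"
}

# Example (assumed path, based on the volume mapping and job name above):
#   set_es_host "${PWD}/config/job_name/_settings.yml" elasticsearch
#   docker-compose up fscrawler
```

The default host is the compose service name, so the same helper would work for any other service name on the shared network by passing it as the second argument.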