python docker docker-compose chromadb

Cannot connect to Chroma running in Docker from another Docker service


I am trying to talk to the chroma service (in a Docker container) from service-a (also in a Docker container). However, when I call ChromaConnector.get_instance(), I get the error below:

Could not connect to a Chroma server. Are you sure it is running?

I tried the following from inside the service-a container:

curl http://chroma:8000
curl http://chroma:8000/api/v1/collections
curl http://chroma:8000/api/v1

curl http://chroma:8000 returns an error:

{"detail":"Not Found"}

But curl http://chroma:8000/api/v1/collections returns

[]

and curl http://chroma:8000/api/v1 returns

{"nanosecond heartbeat":1731106743803134838}

This suggests that the server is up and reachable from service-a.

Below is the code:

class ChromaConnector:
    _instance = None

    @classmethod
    def get_instance(cls):
        
        if cls._instance is None:
            cls._instance = chromadb.HttpClient(host='http://chroma', port=8000)
        return cls._instance

Below is my docker-compose.yaml file:

version: '3.8'

services:
  service-a:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "18001:18001"
    environment:
      - ENVIRONMENT=development
    depends_on:
      - kafka
      - chroma

  kafka:
    image: bitnami/kafka:latest
    ports:
      - "9094:9094"
    environment:
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_BROKER_ID=1
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@:9093
      - ALLOW_PLAINTEXT_LISTENER=yes    

  chroma:
    image: chromadb/chroma:latest  
    ports:
      - "18000:8000"
    volumes:
      - chroma-data:/chroma/chroma 
    restart: unless-stopped
    command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
    environment:
      - IS_PERSISTENT=TRUE   

  redis:
    image: redis:latest
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data

volumes:
  redis-data:
  chroma-data:

Please advise. Thanks.


Solution

  • Try initializing your client in service-a this way:

    class ChromaConnector:
        _instance = None
    
        @classmethod
        def get_instance(cls):
            
            if cls._instance is None:
                cls._instance = chromadb.HttpClient(host='http://chroma:8000')
            return cls._instance
    

    I've tested this in a setup similar to yours:

    version: '3.8'
    
    services:
      service-a:
        image: python:3.12-bookworm
        command: sleep 3600
        ports:
          - "18001:18001"
        environment:
          - ENVIRONMENT=development
        depends_on:
          - chroma
    
      chroma:
        image: chromadb/chroma:latest  
        ports:
          - "18000:8000"
        volumes:
          - chroma-data:/chroma/chroma 
        restart: unless-stopped
        command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
        environment:
          - IS_PERSISTENT=TRUE   
    
    volumes:
      chroma-data:
    

    Shell into the Python 3.12 container (docker exec -it <containerid> /bin/bash), install Chroma (pip install chromadb), then use the Python REPL:

    >>> import chromadb
    >>> instance = chromadb.HttpClient(host='http://chroma:8000')
    >>> instance.list_collections()
    []
    >>> 
    

    I think the core of the issue is how the URL is handled. About a year ago, support was added for passing a complete URL in the host param (e.g. http://my-chroma-ip:port/my_lb_service/), which allows running Chroma behind custom paths. The downside is the confusion you've run into: if you supply a full URL as host together with a separate port, the port param is ignored. With host='http://chroma' and port=8000, the client presumably ends up targeting http://chroma on the default HTTP port rather than 8000, which would explain the connection error.
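
An alternative that avoids the ambiguity is to never mix the two styles: pass a bare hostname plus a port, as in the documented chromadb.HttpClient(host=..., port=..., ssl=...) signature. The sketch below is only an illustration of that idea; chroma_endpoint and get_client are hypothetical helper names, not part of chromadb, and the default port of 8000 is taken from the compose file above.

```python
from urllib.parse import urlsplit


def chroma_endpoint(value, default_port=8000):
    """Split 'http://chroma:8000', 'chroma:8000' or 'chroma' into (host, port, ssl).

    Hypothetical helper: it only normalises the string, it does not talk to Chroma.
    """
    # urlsplit only recognises the host part when a '//' separator is present.
    parts = urlsplit(value if "://" in value else f"//{value}")
    if not parts.hostname:
        raise ValueError(f"no hostname found in {value!r}")
    port = parts.port or default_port
    return parts.hostname, port, parts.scheme == "https"


def get_client(value="http://chroma:8000"):
    """Build a client from a bare hostname and port (hypothetical wrapper)."""
    # Imported lazily so chroma_endpoint can be used without chromadb installed.
    import chromadb

    host, port, ssl = chroma_endpoint(value)
    return chromadb.HttpClient(host=host, port=port, ssl=ssl)
```

Inside service-a, get_client("chroma") or get_client("http://chroma:8000") would then both resolve to host "chroma" and port 8000, assuming the compose service name and port shown above.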