I am trying to talk to a Chroma service (in a Docker container) from service-a (also in a Docker container). However, when I call ChromaConnector.get_instance(), I get the error below:
Could not connect to a Chroma server. Are you sure it is running?
I tried the following from the service-a container:
curl http://chroma:8000
curl http://chroma:8000/api/v1/collections
curl http://chroma:8000/api/v1
The first, curl http://chroma:8000, returns an error:
{"detail":"Not Found"}
But curl http://chroma:8000/api/v1/collections returns
[]
and curl http://chroma:8000/api/v1 returns
{"nanosecond heartbeat":1731106743803134838}
This suggests that the server is up.
Below is the code:
class ChromaConnector:
    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = chromadb.HttpClient(host='http://chroma', port=8000)
        return cls._instance
Below is my docker-compose.yaml file:
version: '3.8'
services:
  service-a:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "18001:18001"
    environment:
      - ENVIRONMENT=development
    depends_on:
      - kafka
      - chroma
  kafka:
    image: bitnami/kafka:latest
    ports:
      - "9094:9094"
    environment:
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_BROKER_ID=1
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@:9093
      - ALLOW_PLAINTEXT_LISTENER=yes
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "18000:8000"
    volumes:
      - chroma-data:/chroma/chroma
    restart: unless-stopped
    command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
    environment:
      - IS_PERSISTENT=TRUE
  redis:
    image: redis:latest
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
volumes:
  redis-data:
  chroma-data:
Please advise. Thanks.
Try initializing your client in service-a this way, with the full URL (including the port) in host:
import chromadb


class ChromaConnector:
    _instance = None

    @classmethod
    def get_instance(cls):
        # Create the client once and reuse it on later calls
        if cls._instance is None:
            cls._instance = chromadb.HttpClient(host='http://chroma:8000')
        return cls._instance
I've tested this in a setup similar to yours:
version: '3.8'
services:
  service-a:
    image: python:3.12-bookworm
    command: sleep 3600
    ports:
      - "18001:18001"
    environment:
      - ENVIRONMENT=development
    depends_on:
      - chroma
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "18000:8000"
    volumes:
      - chroma-data:/chroma/chroma
    restart: unless-stopped
    command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
    environment:
      - IS_PERSISTENT=TRUE
volumes:
  chroma-data:
Shell into the Python 3.12 container with docker exec -it <containerid> /bin/bash, install Chroma with pip install chromadb, then run the following in the Python REPL:
>>> import chromadb
>>> instance = chromadb.HttpClient(host='http://chroma:8000')
>>> instance.list_collections()
[]
>>>
I think the core of the issue is how the URL is handled. About a year ago, support was added for passing a complete URL in the host param (e.g. http://my-chroma-ip:port/my_lb_service/). This allowed Chroma to be used behind custom paths. The downside is the confusion you've run into: if you supply a URL in host plus a separate port, you hit a weird defect where the port param is ignored.
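If you would rather keep host and port as separate values, one option is to split the URL yourself before building the client. The sketch below uses only the standard library. split_chroma_url is a hypothetical helper name and the default port of 8000 is an assumption taken from the compose file above, so treat it as an illustration rather than anything Chroma provides.

```python
from urllib.parse import urlparse


def split_chroma_url(url, default_port=8000):
    """Split a URL-ish string into (hostname, port, use_ssl).

    Hypothetical helper, not part of chromadb. default_port=8000 is an
    assumption based on the compose file in the question.
    """
    # urlparse only fills in hostname/port when a scheme is present,
    # so prepend one for inputs like "chroma:8000".
    parsed = urlparse(url if "://" in url else f"http://{url}")
    host = parsed.hostname
    port = parsed.port or default_port
    use_ssl = parsed.scheme == "https"
    return host, port, use_ssl


host, port, use_ssl = split_chroma_url("http://chroma:8000")

# The pieces can then be passed to the client separately, e.g.:
#   client = chromadb.HttpClient(host=host, port=port, ssl=use_ssl)
```

Passing a bare hostname in host together with port avoids mixing the two styles, which seems to be what triggers the ignored port.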