I'm trying to set up a Docker stack for a data science project, and I want to use Redis so that services can exchange data.
I followed the documentation provided by Label Studio, but a lot of details are missing and my implementation doesn't work.
Specifically: Label Studio is able to register Redis as a data source but not as a data target, and as a source it doesn't retrieve my task data.
I removed every service unrelated to Label Studio, and the variables come from a .env file.
The Postgres part works fine, but I kept it in the example because it's part of the overall config.
services:
  postgres:
    image: postgres:16-alpine
    container_name: postgres
    ports:
      - ${POSTGRES_PORT}:5432
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - PGDATA=/var/lib/postgresql/data/pgdata
      - POSTGRES_PORT=${POSTGRES_PORT}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready", "-d", "postgres"]
      interval: 10s
      timeout: 10s
      retries: 120
    volumes:
      - pgdata:/var/lib/postgresql/data:Z
  redis:
    image: redis:5-alpine
    container_name: redis
    ports:
      - 6379:6379
    volumes:
      - redisdata:/data
    healthcheck:
      test: [ "CMD", "redis-cli", "--raw", "incr", "ping" ]
      interval: 10s
      timeout: 10s
      retries: 120
    command: [ "redis-server",
               "--save", "60", "1",
               "--loglevel", "debug",
               "--requirepass", "${REDIS_PASSWORD}"]
  label-studio:
    image: heartexlabs/label-studio:latest
    container_name: label-studio
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    ports:
      - 8081:8080
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=postgres
      - POSTGRE_PORT=${POSTGRE_PORT}
      - POSTGRE_NAME=${POSTGRE_NAME}
      - POSTGRE_USER=${POSTGRE_USER}
      - POSTGRE_PASSWORD=${POSTGRE_PASSWORD}
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_LOCATION=redis:6379
      - REDIS_DB=0
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    volumes:
      - lsdata:/label-studio/data
    command: ["label-studio",
              "--log-level", "DEBUG"]
volumes:
  pgdata:
    driver: local
  redisdata:
    driver: local
  lsdata:
    driver: local
Redis is running and contains task data. I tested the following key/value formats:
labelstudio:ls-task-1 '{"text":"some text"}'
labelstudio:ls-task-2 '{"id":0, "data": {"texte": "some text"}}'
ls-task-1 '{"text":"some text"}'
ls-task-2 '{"id":0, "data": {"texte": "some text"}}'
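For reference, keys like the ones above could be written from Python roughly as follows. This is only a sketch: `build_task_entries` and `push_entries` are hypothetical helper names, the `labelstudio:ls-task-` prefix simply mirrors the keys listed above, and it assumes Label Studio reads each key's value as a JSON task whose fields match the `$variables` of the labeling config. The write step uses the third-party `redis` package (redis-py), imported lazily so the key-building part can be checked without a server.

```python
import json


def build_task_entries(texts, prefix="labelstudio:ls-task-"):
    """Return a {key: json_string} mapping, one Redis string value per task.

    The prefix is an assumption that mirrors the keys shown in the question.
    """
    entries = {}
    for index, text in enumerate(texts, start=1):
        # Bare task form: the JSON key ("text") is meant to match the
        # $text variable referenced by the labeling config.
        entries[f"{prefix}{index}"] = json.dumps({"text": text})
    return entries


def push_entries(entries, host="redis", port=6379, db=0, password=None):
    """Write the entries to Redis as plain string values and return the count."""
    import redis  # third-party dependency: pip install redis

    client = redis.Redis(host=host, port=port, db=db, password=password)
    for key, value in entries.items():
        client.set(key, value)
    return len(entries)


if __name__ == "__main__":
    # Only prints the entries; call push_entries(...) to actually write them.
    for key, value in build_task_entries(["some text", "another text"]).items():
        print(key, value)
```

Passing `db` explicitly makes it visible which Redis database the keys end up in, which is worth comparing with the database the storage connection reads from.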
<View>
  <Text name="text" value="$text"/>
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="Some themes"/>
    <Choices name="theme" toName="text" choice="multiple" showInLine="true">
      <Choice value="somevalue">Some Choice</Choice>
      <Choice value="othervalue">Other Choice</Choice>
    </Choices>
  </View>
</View>
<!-- {
  "data": {"text": "Some Text"}
} -->
Storage Type : Redis
Path : labelstudio
Password :
Host : redis
Port : 6379
In the logs I can see Label Studio connecting to Redis, but the sync always reports 0 tasks:
[2024-07-11 08:46:24,606] [urllib3.connectionpool::_make_request::474] [DEBUG] https://o227124.ingest.sentry.io:443 "POST /api/5820521/envelope/ HTTP/1.1" 200 2
[2024-07-11 08:46:43,954] [io_storages.base_models::sync::454] [INFO] Start syncing storage RedisImportStorage object (1)
[2024-07-11 08:46:43,964] [projects.models::_update_tasks_states::422] [INFO] Starting _update_tasks_states with params: Project risque-juridique (id=1) maximum_annotations 1 and percentage 100
[2024-07-11 08:46:43,971] [urllib3.connectionpool::_new_conn::1019] [DEBUG] Starting new HTTPS connection (1): tele.labelstud.io:443
[2024-07-11 08:46:43,971] [django.server::log_message::161] [INFO] "POST /api/storages/redis/1/sync HTTP/1.1" 200 618
[2024-07-11 08:46:43,971] [django.server::log_message::161] [INFO] "POST /api/storages/redis/1/sync HTTP/1.1" 200 618
[2024-07-11 08:46:44,454] [urllib3.connectionpool::_make_request::474] [DEBUG] https://tele.labelstud.io:443 "POST / HTTP/1.1" 200 0
[2024-07-11 08:47:24,609] [urllib3.connectionpool::_make_request::474] [DEBUG] https://o227124.ingest.sentry.io:443 "POST /api/5820521/envelope/ HTTP/1.1" 200 2
11 Jul 2024 08:46:43.953 - Accepted 192.168.48.6:42558
11 Jul 2024 08:46:43.954 - Client closed connection
11 Jul 2024 08:46:43.963 - Accepted 192.168.48.6:42564
11 Jul 2024 08:46:43.964 - Client closed connection
11 Jul 2024 08:46:45.526 - Accepted 127.0.0.1:57774
11 Jul 2024 08:46:45.526 - Client closed connection
When I try to add Redis as a target storage, Label Studio shows:
Runtime error
Validation error
validate_connection is not implemented
Version: 1.12.1
[2024-07-11 08:51:29,742] [core.utils.common::custom_exception_handler::89] [ERROR] c9a7909a-c865-4f8a-813b-3a4e7918d9a5 [ErrorDetail(string='validate_connection is not implemented', code='invalid')]
Traceback (most recent call last):
File "/label-studio/label_studio/io_storages/api.py", line 82, in perform_create
instance.validate_connection()
File "/label-studio/label_studio/io_storages/base_models.py", line 218, in validate_connection
raise NotImplementedError('validate_connection is not implemented')
NotImplementedError: validate_connection is not implemented
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/rest_framework/views.py", line 506, in dispatch
response = handler(request, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/django/utils/decorators.py", line 43, in _wrapper
return bound_method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/rest_framework/generics.py", line 242, in post
return self.create(request, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/rest_framework/mixins.py", line 19, in create
self.perform_create(serializer)
File "/label-studio/label_studio/io_storages/api.py", line 84, in perform_create
raise ValidationError(exc)
rest_framework.exceptions.ValidationError: [ErrorDetail(string='validate_connection is not implemented', code='invalid')]
[2024-07-11 08:51:29,748] [django.request::log_response::224] [WARNING] Bad Request: /api/storages/export/redis
[2024-07-11 08:51:29,748] [django.request::log_response::224] [WARNING] Bad Request: /api/storages/export/redis
[2024-07-11 08:51:29,748] [urllib3.connectionpool::_new_conn::1019] [DEBUG] Starting new HTTPS connection (1): tele.labelstud.io:443
[2024-07-11 08:51:29,749] [django.server::log_message::161] [WARNING] "POST /api/storages/export/redis?project=1 HTTP/1.1" 400 210
[2024-07-11 08:51:29,749] [django.server::log_message::161] [WARNING] "POST /api/storages/export/redis?project=1 HTTP/1.1" 400 210
I'm going to answer my own question. Long story short, it was indeed a bug: not only was the validation function for the Redis target storage not implemented, but the Redis source storage in Label Studio wasn't properly implemented either. The Label Studio team recently added the missing parameters to the Redis connection config (the Redis database ID, for one), which now allows tasks to be retrieved from Redis.
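To illustrate what setting the database ID explicitly might look like when creating the source storage through the API instead of the UI, here is a minimal sketch. The `/api/storages/redis` path is taken from the logs above; the payload field names (in particular `db`), the `Token` authorization scheme, and the helper names `redis_storage_payload` and `create_import_storage` are assumptions for illustration, so check them against the API reference of your Label Studio version.

```python
import json
import urllib.request


def redis_storage_payload(project_id, path, host="redis", port="6379",
                          db=0, password=""):
    """Build the request body for a Redis source storage.

    Field names are assumptions; "db" stands for the Redis database ID
    mentioned in the answer.
    """
    return {
        "project": project_id,
        "path": path,
        "host": host,
        "port": port,
        "db": db,
        "password": password,
    }


def create_import_storage(base_url, token, payload):
    """POST the payload to the Redis storage endpoint seen in the logs."""
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/storages/redis",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed scheme: a user access token sent as "Token <value>".
            "Authorization": f"Token {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the body only; no request is sent here.
    print(json.dumps(redis_storage_payload(1, "labelstudio", db=0), indent=2))
```

With the compose file from the question, the base URL would be `http://localhost:8081` from the host, and `db=0` matches the `REDIS_DB=0` setting used there.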