Tags: docker, pip, apache-superset, apache-drill

How to connect apache-superset with apache-drill?


I'm trying to connect Superset (v2021.10.0, running in Docker) with Drill (in embedded mode).

This tutorial mentions that, when Drill is in embedded mode, the connection string is drill+sadrill://localhost:8047/dfs?use_ssl=False. However, when I test the connection I get this error:

[screenshot: Superset error message]

The logs show this:

superset_app             | DEBUG:superset.stats_logger:[stats_logger] (incr) test_connection_error.NoSuchModuleError
superset_app             | DEBUG:superset.stats_logger:[stats_logger] (incr) DatabaseRestApi.test_connection.error
superset_app             | DEBUG:superset.stats_logger:[stats_logger] (timing) DatabaseRestApi.test_connection.time | 46.48779600029229

Based on this question and the error I'm getting, I assumed that the sqlalchemy-drill dependency is missing, so I tried to install it by adding sqlalchemy-drill==0.1.dev to the base.txt file that Docker uses to install pip dependencies. I'm still getting the same error.

Is my assumption right that the sqlalchemy-drill dependency is missing? If so, how can it be added? If not, what's the right way of running Superset (on Docker) with Drill?
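One way to test that assumption is to ask pip's metadata, from inside the Superset container, whether the distribution is actually present. The following is only a minimal sketch, assuming Python 3.8 or newer in the container and that it is run with the container's own interpreter (for example through docker exec on superset_app); installed_version is a hypothetical helper name, not part of Superset. As far as I know, NoSuchModuleError is the exception SQLAlchemy raises when it cannot load a dialect plugin, so a missing package is a plausible cause, but an installed package does not by itself guarantee that the driver loads.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # sqlalchemy-drill is the package that should provide the drill+sadrill dialect.
    for name in ("sqlalchemy", "sqlalchemy-drill"):
        print(name, "->", installed_version(name) or "NOT INSTALLED")
```

If sqlalchemy-drill is reported as not installed in superset_app, the requirement was probably never picked up by the image that is actually running.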

Update

After following the instructions from @ʈᵃᵢ's link I see this in the docker-compose's output:

superset_worker         | Successfully built sqlalchemy-drill
superset_worker_beat    | Successfully built sqlalchemy-drill
superset_app            | logging was configured successfully
superset_app            | INFO:superset.utils.logging_configurator:logging was configured successfully
superset_init           | Installing collected packages: sqlalchemy-drill
superset_init           | Successfully installed sqlalchemy-drill-0.1.dev0
superset_worker         | Installing collected packages: sqlalchemy-drill
superset_worker_beat    | Installing collected packages: sqlalchemy-drill
superset_worker         | Successfully installed sqlalchemy-drill-0.1.dev0
superset_worker_beat    | Successfully installed sqlalchemy-drill-0.1.dev0

But the "Could not load database driver: DrillEngineSpec" error still occurs (also tested with 0.3.dev0).

Update 2:

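After pulling the latest sources from master (including @ʈᵃᵢ's fix) I was able to load the driver. I also needed to change localhost to host.docker.internal in the connection string, since inside the Superset container localhost refers to the container itself rather than the machine running Drill.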
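The host change can be illustrated with a small standard-library sketch that rewrites a connection string pointing at localhost so that it targets the Docker host instead. use_docker_host is a hypothetical helper written for illustration, not something Superset or sqlalchemy-drill provides, and it assumes host.docker.internal resolves inside the container (this is the case on Docker Desktop; on Linux it may need an extra host mapping).

```python
from urllib.parse import urlsplit, urlunsplit


def use_docker_host(uri, docker_host="host.docker.internal"):
    """Swap a loopback host in a connection URI for the Docker host alias."""
    parts = urlsplit(uri)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return uri  # already points somewhere else; leave untouched

    # Keep any "user:password@" prefix and the port exactly as they were.
    userinfo, sep, _ = parts.netloc.rpartition("@")
    hostport = docker_host if parts.port is None else f"{docker_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=userinfo + sep + hostport))


if __name__ == "__main__":
    print(use_docker_host("drill+sadrill://localhost:8047/dfs?use_ssl=False"))
```

The rewritten string is what would go into Superset's database connection form in place of the localhost version from the tutorial.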


Solution

  • If you're using docker-compose, take a look at this doc for how to add local packages: https://github.com/apache/superset/tree/master/docker#local-packages