python, pip

Managing dependency conflicts in an Ubuntu Docker Python script


I am building a simple chatbot inside an Ubuntu Docker image. It runs a WebSocket server in the main program's asyncio loop, and from within that loop it needs to fluidly call text-to-speech models like Coqui XTTS and speech-to-text models like NeMo FastConformer or Parakeet asynchronously.

The trouble I'm running into is that my environment is now complex enough that different packages have conflicting dependencies: for example, Coqui requires transformers version 4.35.2, while certain NeMo tools require transformers >= 4.41.0 or they will break.

I have read about using venv to isolate Python dependencies between different projects, but what do you do when you need to use packages with conflicting dependencies within the same project, or even the same script?

Is there a simple solution to this that I am overlooking (short of ChatGPT's suggestion of redesigning my entire workflow as a collection of microservices communicating in a Docker swarm)?


Solution

  • In general, no, there isn't a way to install multiple versions of a package in a single environment (as far as I know; someone please correct me if there is a way I'm not aware of!). That means that if you really do need access to multiple versions, you'll need multiple processes running separate interpreters in separate environments. Docker and virtual environments are your best tools if you have to go that route, so in the general case, I think ChatGPT is right on this one.
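
    For example, here is a minimal sketch of that route without going all the way to microservices: the main asyncio program shells out to a worker script that runs under a second virtual environment's interpreter. The interpreter path, the stt_worker.py script, and its JSON output format are all hypothetical placeholders, not anything your libraries provide.

            import asyncio
            import json

            # Hypothetical interpreter from a separate venv that holds NeMo's
            # dependency set; the main process keeps Coqui's.
            NEMO_PYTHON = "/opt/venvs/nemo/bin/python"

            async def transcribe(audio_path: str) -> str:
                # Run speech-to-text in a separate process so its transformers
                # version never has to coexist with Coqui's in one interpreter.
                proc = await asyncio.create_subprocess_exec(
                    NEMO_PYTHON, "stt_worker.py", audio_path,
                    stdout=asyncio.subprocess.PIPE,
                    stderr=asyncio.subprocess.PIPE,
                )
                stdout, stderr = await proc.communicate()
                if proc.returncode != 0:
                    raise RuntimeError(stderr.decode())
                # Assumes the worker prints something like {"text": "..."}.
                return json.loads(stdout)["text"]

    In practice you would want the worker to stay resident (reading requests over a pipe or socket) so it doesn't reload the model on every call, which is exactly the point where this starts turning into the microservice design anyway.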

    However, in your case, there may be less drastic solutions. The original Coqui repository is no longer maintained (hence the reliance on an old version of transformers), but there is a fork called coqui-ai-TTS that is actively maintained. You shouldn't have this kind of dependency conflict with that.
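
    If you switch to the fork, the swap should be close to a drop-in, since (as far as I know) it keeps the same TTS package name and Python API as the original; check the fork's README for the exact install command. Something like the following should keep working unchanged (the model name and file paths are just illustrative):

            # Same import path and API as the original Coqui package, so
            # existing XTTS code should carry over as-is.
            from TTS.api import TTS

            tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
            tts.tts_to_file(
                text="Hello from the chatbot.",
                speaker_wav="reference_voice.wav",
                language="en",
                file_path="reply.wav",
            )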

    Another option if you really need to stick with the old Coqui: you could try to monkey-patch the bug that is tying you down to the old version of transformers. This isn't an elegant solution, and you shouldn't do it lightly because it can cause other issues down the road. But if I understand correctly, the bug in question is a relatively simple case of TTS passing the wrong argument type to a transformers function. It looks like this happens in TTS/TTS/tts/layers/xtts/stream_generator.py:

            if model_kwargs.get("attention_mask", None) is None and requires_attention_mask and accepts_attention_mask:
                model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
                    inputs_tensor,
                    generation_config.pad_token_id,
                    generation_config.eos_token_id,
                )
    

    So you might be able to replace the NewGenerationMixin.generate() function with a version that is identical except that it passes the correct arguments to _prepare_attention_mask_for_generation. You'll need to do some research and testing to make sure this works and to figure out whether you need to make other changes; look up an example of monkey-patching if you aren't familiar with the technique.
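
    As a rough illustration of the monkey-patching idea, here is a lighter-weight variant: instead of copying the whole generate() body, wrap the transformers method that the TTS code calls so it tolerates the arguments it is given. This sketch assumes your installed transformers still exposes _prepare_attention_mask_for_generation with the same (inputs, pad_token_id, eos_token_id) parameters and that the mismatch really is plain ints being passed where tensors are expected; verify both against your actual versions before relying on it.

            import torch
            from transformers.generation.utils import GenerationMixin

            # Keep a reference to the original implementation so the patch can
            # delegate to it after fixing up the arguments.
            _original_prepare = GenerationMixin._prepare_attention_mask_for_generation

            def _patched_prepare(self, inputs, pad_token_id, eos_token_id):
                # Assumption: newer transformers expects tensor token ids here,
                # while TTS's stream_generator passes plain ints. Coerce them.
                if isinstance(pad_token_id, int):
                    pad_token_id = torch.tensor(pad_token_id, device=inputs.device)
                if isinstance(eos_token_id, int):
                    eos_token_id = torch.tensor(eos_token_id, device=inputs.device)
                return _original_prepare(self, inputs, pad_token_id, eos_token_id)

            # Apply the patch before any XTTS model is instantiated. Note that
            # this affects every transformers model in the process, which is
            # part of why monkey-patching shouldn't be done lightly.
            GenerationMixin._prepare_attention_mask_for_generation = _patched_prepare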

    And of course, if you need to change more than a couple of things, a better solution might be to simply fork Coqui TTS yourself with whatever fixes you need. Depending on your needs, this might be easier than re-architecting your app as a set of microservices.