I am trying to use pyannote's models offline.
I was loading and applying a model like this:
from pyannote.audio import Pipeline

access_token = 'xxxxxxxxxxx'
model = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=access_token)

path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
model(path_in,
      num_speakers=num_speakers).labels()
That works fine.
But now I followed the instructions for offline use: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb
My directory structure is as follows:
src/
|- pyannote_offline_config.yaml
|- pyannote_pytorch_model.bin
---- YAML ----
version: 3.1.0
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: pyannote/wespeaker-voxceleb-resnet34-LM
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: src/pyannote_pytorch_model.bin
    segmentation_batch_size: 32
params:
  clustering:
    method: centroid
    min_cluster_size: 12
    threshold: 0.7045654963945799
  segmentation:
    min_duration_off: 0.0
---- Loading Model ----
path_yaml = 'src/pyannote_offline_config.yaml'
model = Pipeline.from_pretrained(path_yaml)

path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
model(path_in,
      num_speakers=num_speakers).labels()
But that results in: "A pipeline must be instantiated with pipeline.instantiate(parameters)
before it can be applied."
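One thing worth checking when this error shows up is whether the top-level `params:` block actually survived in the offline YAML: if it is missing or mis-indented, `Pipeline.from_pretrained` loads the pipeline without its hyperparameters, and applying it raises exactly this message. Below is a minimal, purely textual sanity check, assuming the config layout shown above (`check_offline_config` is my own helper name, not part of pyannote):

```python
from pathlib import Path

def check_offline_config(path: str) -> list[str]:
    """Return the top-level keys missing from an offline pyannote config.

    This is only a textual check (top-level YAML keys start at column 0),
    not a full YAML parse.
    """
    lines = Path(path).read_text().splitlines()
    missing = []
    for key in ("version:", "pipeline:", "params:"):
        if not any(line.startswith(key) for line in lines):
            missing.append(key.rstrip(":"))
    return missing
```

If this returns `["params"]`, the hyperparameter block never reached the parser and the pipeline cannot instantiate itself.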
OK, next try:
---- Loading Model ----
path_yaml = 'src/pyannote_offline_config.yaml'
model = Pipeline.from_pretrained(path_yaml)
params = {'clustering':
              {'method': 'centroid',
               'min_cluster_size': 12,
               'threshold': 0.7045654963945799},
          'segmentation':
              {'min_duration_off': 0.0}}
pipeline = model.instantiate(params)

path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
pipeline(path_in,
         num_speakers=num_speakers).labels()
But that results in: "A pipeline must be instantiated with pipeline.instantiate(parameters)
before it can be applied."
I don't understand the problem.
It works if I do it like this:
---- Loading Model ----
path_yaml = 'src/pyannote_offline_config.yaml'
model = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", path_yaml)

path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
model(path_in,
      num_speakers=num_speakers).labels()
But after an upload to GitLab, the test pipeline gives me: "Could not download 'pyannote/speaker-diarization-3.1' pipeline. It might be because the pipeline is private or gated so make sure to authenticate. Visit https://hf.co/settings/tokens to create your access token and retry with: Pipeline.from_pretrained('pyannote/speaker-diarization-3.1', ... use_auth_token=YOUR_AUTH_TOKEN)"
So it seems that something is present on my local computer that is not downloaded by the pip install. For example, if I load it without the yaml, using only model = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1"), it also works.
It works now. I went through the instructions again from the beginning and downloaded everything again. The model file on my computer was somehow corrupt.
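To rule out a corrupted download up front next time, you can compare the local file's checksum against the SHA256 that Hugging Face displays for large (LFS) files on the file's page. A small stdlib sketch (`sha256_of` is just an illustrative helper name):

```python
import hashlib
from pathlib import Path

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file in 1 MiB chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest differs from the one on the model page, the download is broken and re-downloading the file is the fix.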
There are instructions for downloading the model files here: pyannote models. There you will find all the relevant files; the most important one is usually pytorch_model.bin. Download this file and the configuration files (e.g., config.json, hyperparameters.yaml) using the small download icon.
You can also download the models automatically (instructions at Hugging Face) and then retrieve them from the cache: ~/.cache/huggingface/hub (Ubuntu).
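To find where those cached files actually ended up, here is a short stdlib sketch that scans the cache directory for the usual file names (`find_cached_model_files` is my own name; the cache root can differ, e.g. when HF_HOME is set):

```python
from pathlib import Path

def find_cached_model_files(cache_root: str = "~/.cache/huggingface/hub") -> list:
    """List likely pyannote model/config files under the Hugging Face cache."""
    root = Path(cache_root).expanduser()
    if not root.is_dir():
        return []  # cache not present (or a different HF_HOME is in use)
    wanted = {"pytorch_model.bin", "config.yaml", "hyperparameters.yaml"}
    return sorted(p for p in root.rglob("*") if p.name in wanted)
```

The returned paths can then be copied into your project's src/ directory for offline use.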
Then:
from pyannote.audio import Pipeline

PATH_PYANNOTE_MODEL = 'src/pyannote_clf/pyannote_offline_config.yaml'
PATH_PYANNOTE_EMBEDDING = 'src/pyannote_clf/embedding_model/pytorch_model.bin'

def pyann_load_model(path_model: str = PATH_PYANNOTE_MODEL):
    """
    Loads a pyannote diarization pipeline from
    src/pyannote_clf/pyannote_offline_config.yaml.
    The yaml file contains the path to the model file
    pyannote_pytorch_model.bin.

    Args:
        path_model (str): Path to the yaml file, which
            contains additional information for the
            model and the path to the model file.

    Returns:
        A pyannote pipeline.
    """
    return Pipeline.from_pretrained(path_model)
def diarize(self,
            cuda_switch: bool = False,
            dump_switch: bool = False,
            dump_output_file_name=None):
    # pipeline = Pipeline.from_pretrained(
    #     "pyannote/speaker-diarization-3.1",
    #     use_auth_token=self.access_token
    # )
    pipeline = pyann_load_model()
    self.diarization = diariza(
        input_dir=self.input_dir_to_diar,
        input_file_name=self.input_file_name_to_diar,
        pipeline=pipeline,
        cuda_switch=cuda_switch,
        dump_switch=dump_switch,
        dump_output_dir=self.dump_rttm_dir,
        dump_output_file_name=dump_output_file_name)