python azure-blob-storage azure-synapse fasttext

Loading pretrained .bin fastText models from Azure Blob Storage into an Azure Synapse notebook


Does anyone know how to load a pretrained .bin fastText model into an Azure Synapse notebook using the fasttext.load_model function? My .bin file is in an Azure Blob Storage account.

I tried to load it with

fasttext.load_model(azure_blob_storage_path_file) but I get an error like: .dfs.core.windows.net/cc.sr.300.bin cannot be opened for loading!


Solution

  • You can access the model by mounting the storage account in Azure Synapse. Follow the steps below.

    # Mount the container; replace the placeholders with your own values.
    mssparkutils.fs.mount(
        "wasbs://<container>@<storage_account_name>.blob.core.windows.net",
        "/mnt/",
        {"accountKey":"<your_storage_account_key>"})
    

    After mounting, the files are available under a local path of the form /synfs/{jobId}/mnt/{filename}, which you can use to list them.


    In my case the job ID is 0 and /mnt/ is the mount point. Find your own job ID and list the files under your mount point.

    Next, load the model using this path.

    import fasttext
    
    azure_blob_storage_path_file = "/synfs/0/mnt/model_1.bin"
    model = fasttext.load_model(azure_blob_storage_path_file)
    model.predict("Give the github links.")
    

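The path construction and loading steps above can be sketched as two small helpers. This is only an illustration: the names synfs_local_path and load_fasttext_model are hypothetical, not part of fasttext or mssparkutils, and the layout assumed is the /synfs/{jobId}/{mount point}/{filename} form described in this answer. The second helper checks that the path is a local file before calling fasttext.load_model, since passing a storage URL is what produced the "cannot be opened for loading" error in the question.

```python
import os


def synfs_local_path(job_id, mount_point, filename):
    """Build the local path of a mounted file (hypothetical helper).

    Assumes the /synfs/{jobId}/{mount point}/{filename} layout described above.
    """
    parts = ["synfs", str(job_id), mount_point.strip("/"), filename.lstrip("/")]
    return "/" + "/".join(p for p in parts if p)


def load_fasttext_model(local_path):
    """Load a fastText model, failing early if the path is not a local file.

    A storage URL (abfss://, wasbs://, https://) will not pass this check.
    """
    if not os.path.isfile(local_path):
        raise FileNotFoundError(
            f"{local_path!r} is not a local file; check the mount point and job ID."
        )
    # Imported here so the path helper can be used without fasttext installed.
    import fasttext
    return fasttext.load_model(local_path)


# Example with the values from this answer (job ID 0, mount point /mnt/):
# path = synfs_local_path(0, "/mnt/", "model_1.bin")  # "/synfs/0/mnt/model_1.bin"
# model = load_fasttext_model(path)
```

In a Synapse notebook the job ID can, as far as I know, be read with mssparkutils.env.getJobId() instead of being hard-coded; treat that as an assumption and verify it in your environment.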