azure-databricks langchain mlflow

Wrapping an LLM deployed in Azure Databricks


I have deployed an LLM on Azure Databricks. I can access it via the Databricks API:

https://adb-17272728282828282.1.azuredatabricks.net/serving-endpoints/my-code-llama/invocations
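For reference, I can already query it directly over REST, roughly like this (the bearer token and the chat-style payload shape here are just illustrative; the exact payload depends on the model):

    import requests

    # Call the serving endpoint directly; token and payload shape are examples
    resp = requests.post(
        "https://adb-17272728282828282.1.azuredatabricks.net/serving-endpoints/my-code-llama/invocations",
        headers={"Authorization": "Bearer <databricks-personal-access-token>"},
        json={"messages": [{"role": "user", "content": "How are you?"}]},
    )
    print(resp.json())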

I am aware that LangChain supports Databricks directly, but is it possible to, say, wrap the Databricks endpoint in an MLflow/OpenAI wrapper or something similar, so that I can use it in LangChain like this:

from langchain_openai import ChatOpenAI
from langchain.callbacks import StreamingStdOutCallbackHandler

llm = ChatOpenAI(
    openai_api_base="http://my-url/API",
    openai_api_key="7282",
    model_name="my-code-llama",
    max_tokens=1000,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)

I'm trying to do this because there are a lot of limitations if I just use the LangChain Databricks wrapper directly. I am quite new to this, so some support would be really great!


Solution

  • Below are the steps for wrapping a model serving endpoint in Databricks:

    Install the latest langchain library on the cluster and restart the cluster. A notebook-scoped install works as well, as sketched below.

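    A minimal notebook-scoped alternative (pin versions as needed; langchain-openai is included here because a later step uses it):

    %pip install -U langchain langchain-openai
    dbutils.library.restartPython()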

    Use the code below.

    from langchain.llms import Databricks

    # Wrap an existing Databricks model serving endpoint by name
    llm = Databricks(endpoint_name="databricks-mpt-7b-instruct")
    llm.invoke("How are you?")
    

    Output: (screenshot of the model's response)

    If you want to call this endpoint through the OpenAI-style wrapper in LangChain, as in your example, try the code below.

    Install langchain-openai as shown in the steps above.

    Note: Make sure to use a model of the correct task type. Below, I used the databricks-llama-2-70b-chat model, which serves the chat task.

    from langchain_openai import ChatOpenAI, OpenAI
    from langchain.callbacks import StreamingStdOutCallbackHandler

    chat_llm = ChatOpenAI(
        # The serving-endpoints base URL acts as an OpenAI-compatible API
        openai_api_base="https://<xxxxxx>.azuredatabricks.net/serving-endpoints",
        # Databricks personal access token
        openai_api_key="dapi-xxxxx",
        model_name="databricks-llama-2-70b-chat",
        max_tokens=1000,
        streaming=True,
        callbacks=[StreamingStdOutCallbackHandler()],
    )

    chat_llm.invoke("What is mlflow?")
    

    Output: (screenshot of the streamed model response)
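    Following the note on task types: for an endpoint serving a completion task rather than chat, the plain OpenAI wrapper would be used instead. A minimal sketch, reusing the instruct endpoint from the first example (whether your endpoint serves completions is an assumption to verify):

    from langchain_openai import OpenAI

    # Completion-style wrapper against the same OpenAI-compatible base URL;
    # the endpoint name below is illustrative
    completion_llm = OpenAI(
        openai_api_base="https://<xxxxxx>.azuredatabricks.net/serving-endpoints",
        openai_api_key="dapi-xxxxx",
        model_name="databricks-mpt-7b-instruct",
        max_tokens=1000,
    )

    completion_llm.invoke("What is mlflow?")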

    If you want to serve this behind a custom endpoint of your own, you need to use LangServe and deploy it in the cloud; a minimal sketch follows.
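    A minimal LangServe sketch (the app structure and route path are illustrative, not a full deployment):

    # app.py -- run locally with: uvicorn app:app --host 0.0.0.0 --port 8000
    from fastapi import FastAPI
    from langchain_openai import ChatOpenAI
    from langserve import add_routes

    chat_llm = ChatOpenAI(
        openai_api_base="https://<xxxxxx>.azuredatabricks.net/serving-endpoints",
        openai_api_key="dapi-xxxxx",
        model_name="databricks-llama-2-70b-chat",
    )

    app = FastAPI()

    # Exposes /my-code-llama/invoke, /my-code-llama/stream, etc.
    add_routes(app, chat_llm, path="/my-code-llama")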

    Refer to this documentation for more information.