I have deployed an LLM on Azure Databricks. I can access it via the Databricks serving API:
https://adb-17272728282828282.1.azuredatabricks.net/serving-endpoints/my-code-llama/invocations
I am aware that LangChain supports Databricks directly, but is it possible to wrap the Databricks serving endpoint in an MLflow/OpenAI-compatible wrapper so that I can use it in LangChain like this:
llm = ChatOpenAI(
    openai_api_base="http://my-url/API",
    openai_api_key="7282",
    model_name="my-code-llama",
    max_tokens=1000,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)
I am trying to do this because there are a lot of limitations if I just use the LangChain Databricks wrapper directly. I am quite new to this, so some support would be really great!
Below are the steps for wrapping a model serving endpoint in Databricks:
Install the latest langchain library on the cluster and restart the cluster.
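If you prefer a notebook-scoped install over a cluster library, a minimal sketch (run in a Databricks notebook cell):

%pip install -U langchain
# Restart the Python process so the newly installed version is picked up
dbutils.library.restartPython()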
Use the code below.
from langchain.llms import Databricks

# Inside a Databricks notebook, the workspace host and token are
# picked up from the runtime context automatically.
llm = Databricks(endpoint_name="databricks-mpt-7b-instruct")
llm("How are you?")
If you want to use this endpoint through LangChain's OpenAI client instead, try the code below.
Install langchain-openai as shown in the above steps.
Note: Make sure to use a model of the correct task type. Below, I used the model (databricks-llama-2-70b-chat) for the chat task.
from langchain_openai import ChatOpenAI
from langchain.callbacks import StreamingStdOutCallbackHandler

chat_llm = ChatOpenAI(
    # The /serving-endpoints route of the workspace exposes an OpenAI-compatible API
    openai_api_base="https://<xxxxxx>.azuredatabricks.net/serving-endpoints",
    openai_api_key="dapi-xxxxx",  # Databricks personal access token
    model_name="databricks-llama-2-70b-chat",
    max_tokens=1000,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)
chat_llm.invoke("What is mlflow?")
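Since chat_llm is a regular LangChain chat model, it composes into chains like any other. A minimal sketch, where the prompt text and topic are illustrative assumptions:

from langchain_core.prompts import ChatPromptTemplate

# Illustrative prompt; any LCEL-compatible chain works the same way
prompt = ChatPromptTemplate.from_template("Explain {topic} in one paragraph.")
chain = prompt | chat_llm
chain.invoke({"topic": "MLflow"})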
If you want to serve this as a custom endpoint of your own, you can use LangServe and deploy it in the cloud. Refer to the LangServe documentation for more information.
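For reference, a minimal LangServe sketch that serves the chain defined above; the app title, route path, and port are illustrative assumptions:

from fastapi import FastAPI
from langserve import add_routes

# Expose the chain as a REST endpoint at /code-llama
app = FastAPI(title="code-llama-serving")
add_routes(app, chain, path="/code-llama")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)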