Databricks

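The Databricks Lakehouse Platform unifies data, analytics, and AI on one platform.

This notebook provides a quick overview for getting started with Databricks LLM models. For detailed documentation of all features and configurations, see the API reference.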

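Overview

The Databricks LLM class wraps a completion endpoint hosted as one of two endpoint types:

Databricks Model Serving, recommended for production and development,
Cluster driver proxy app, recommended for interactive development.

This example notebook shows how to wrap your LLM endpoint and use it as an LLM in your LangChain application.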

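Limitations

The Databricks LLM class is a legacy implementation and has several limitations in feature compatibility.

Only synchronous invocation is supported. Streaming and async APIs are not supported.
The batch API is not supported.

To use those features, use the new ChatDatabricks class instead. ChatDatabricks supports all ChatModel APIs, including streaming, async, and batch.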

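Setup

To access Databricks models, you need to create a Databricks account, set up credentials (only if you are outside a Databricks workspace), and install the required packages.

Credentials (only if you are outside Databricks)

If you are running your LangChain app inside Databricks, you can skip this step.

Otherwise, you need to manually set the Databricks workspace hostname and personal access token in the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, respectively. See the authentication documentation for how to obtain an access token.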

import getpass
import os

os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
if "DATABRICKS_TOKEN" not in os.environ:
    os.environ["DATABRICKS_TOKEN"] = getpass.getpass(
        "Enter your Databricks access token: "
    )
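When running outside a workspace, it can help to fail early if either variable is missing. The following is a minimal sketch; the helper name `missing_databricks_env` is hypothetical and is not part of LangChain or Databricks.

```python
import os

REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_databricks_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping, so no real credentials are read or printed.
example_env = {"DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com"}
print(missing_databricks_env(example_env))  # ['DATABRICKS_TOKEN']
```

Calling `missing_databricks_env()` with no argument checks the real process environment.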

Alternatively, you can pass those parameters when initializing the Databricks class.

from langchain_community.llms import Databricks

databricks = Databricks(
    host="https://your-workspace.cloud.databricks.com",
    # We strongly recommend NOT to hardcode your access token in your code, instead use secret management tools
    # or environment variables to store your access token securely. The following example uses Databricks Secrets
    # to retrieve the access token that is available within the Databricks notebook.
    token=dbutils.secrets.get(scope="YOUR_SECRET_SCOPE", key="databricks-token"),  # noqa: F821
)

API Reference: Databricks

Installation

The LangChain Databricks integration lives in the langchain-community package. In addition, mlflow >= 2.9 is required to run the code in this notebook.

%pip install -qU langchain-community "mlflow>=2.9.0"

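Wrapping a Model Serving Endpoint

Prerequisites:

An LLM has been registered and deployed to a Databricks serving endpoint.
You have "Can Query" permission on the endpoint.

The expected MLflow model signature is:

inputs: [{"name": "prompt", "type": "string"}, {"name": "stop", "type": "list[string]"}]
outputs: [{"type": "string"}]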
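To make the signature concrete, the sketch below builds a request with the two input fields and checks a response against the single string output. The helper names are hypothetical, and sending an empty list when no stop sequences are given is an assumption of this sketch. It only illustrates the shape of the data and does not call MLflow or Databricks.

```python
def build_request(prompt, stop=None):
    """Build a payload with the two input fields named in the signature."""
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    # Assumption of this sketch: no stop sequences is sent as an empty list.
    stop = [] if stop is None else list(stop)
    if not all(isinstance(s, str) for s in stop):
        raise TypeError("stop must be a list of strings")
    return {"prompt": prompt, "stop": stop}


def is_valid_output(response):
    """The signature declares a single string output."""
    return isinstance(response, str)


print(build_request("How are you?", stop=["."]))
# {'prompt': 'How are you?', 'stop': ['.']}
```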

Invocation

from langchain_community.llms import Databricks

llm = Databricks(endpoint_name="YOUR_ENDPOINT_NAME")
llm.invoke("How are you?")


'I am happy to hear that you are in good health and as always, you are appreciated.'

llm.invoke("How are you?", stop=["."])

'Good'
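The `stop` argument lists sequences at which generation should end. As an illustration of that behaviour only, cutting a finished text at the earliest stop sequence could be sketched as follows. The helper `truncate_at_stop` is hypothetical; the real truncation is handled by the endpoint or the library, not by this function.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop sequence."""
    positions = [text.find(s) for s in stop if s and s in text]
    return text[: min(positions)] if positions else text


print(truncate_at_stop("Good. Thanks for asking.", ["."]))  # Good
```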

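Transform Input and Output

Sometimes you may want to wrap a serving endpoint whose model signature is incompatible, or you may want to insert extra configuration. You can use the transform_input_fn and transform_output_fn arguments to define additional pre- and post-processing.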

# Use `transform_input_fn` and `transform_output_fn` if the serving endpoint
# expects a different input schema and does not return a JSON string,
# respectively, or you want to apply a prompt template on top.


def transform_input(**request):
    full_prompt = f"""{request["prompt"]}
    Be Concise.
    """
    request["prompt"] = full_prompt
    return request


def transform_output(response):
    return response.upper()


llm = Databricks(
    endpoint_name="YOUR_ENDPOINT_NAME",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
)

llm.invoke("How are you?")

'I AM DOING GREAT THANK YOU.'
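Because both hooks are plain Python functions, they can be exercised locally without an endpoint. The sketch below assumes a hypothetical endpoint that returns a dictionary of the form {"candidates": [{"text": ...}]} rather than a bare string. That response shape and the function names are illustrative assumptions, not a documented Databricks format.

```python
def add_instruction(**request):
    """Input hook: append an instruction to the prompt, keep other fields."""
    request["prompt"] = f'{request["prompt"]}\nBe concise.'
    return request


def extract_text(response):
    """Output hook: pull the first candidate's text out of a dict response."""
    # Hypothetical response shape, assumed for illustration only.
    return response["candidates"][0]["text"].strip()


# Exercise the hooks locally with hand-written data.
request = add_instruction(prompt="How are you?", stop=["."])
print(request["prompt"].splitlines()[-1])  # Be concise.
print(extract_text({"candidates": [{"text": " Doing well "}]}))  # Doing well
```

Such functions would be passed as transform_input_fn=add_instruction and transform_output_fn=extract_text when constructing the Databricks class.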

Related

LLM conceptual guide
LLM how-to guides
