数据集转换

LangSmith 允许您将转换附加到数据集架构中应用添加到数据集之前，无论是从 UI、API 还是运行规则。

结合 LangSmith 的预构建 JSON 架构类型，这些使您能够轻松在将数据保存到数据集之前对其进行预处理。

转换类型

变换类型	目标类型	功能性
remove_system_messages	Array[Message]	Filters a list of messages to remove any system messages.
convert_to_openai_message	Message Array[Message]	Converts any incoming data from LangChain's internal serialization format to OpenAI's standard message format using langchain's convert_to_openai_messages. If the target field is marked as required, and no matching message is found upon entry, it will attempt to extract a message (or list of messages) from several well-known LangSmith tracing formats (e.g., any traced LangChain BaseChatModel run or traced run from the LangSmith OpenAI wrapper), and remove the original key containing the message.
convert_to_openai_tool	Array[Tool] Only available on top level fields in the inputs dictionary.	Converts any incoming data into OpenAI standard tool formats here using langchain's convert_to_openai_tool Will extract tool definitions from a run's invocation parameters if present / no tools are found at the specified key. This is useful because LangChain chat models trace tool definitions to the `extra.invocation_params` field of the run rather than inputs.
remove_extra_fields	Object	Removes any field not defined in the schema for this target object.

聊天模型预构建架构

转换的主要用例是简化将生产跟踪收集到数据集中的过程，其格式可以是跨模型提供商标准化，用于评估/少量射击提示/等下游。

为了简化最终用户的转换设置，LangSmith 提供了一个预定义的架构，该架构将执行以下作：

从您收集的运行中提取消息并将其转换为 openai 标准格式，这使它们兼容所有 LangChain ChatModel 和大多数模型提供商的 SDK，用于下游评估和实验
提取 LLM 使用的任何工具，并将它们添加到示例的输入中，以用于下游评估中的可重复性

提示

想要迭代其系统提示符的用户通常还会在其 input 消息，这将阻止您将系统提示保存到数据集中。

兼容性

LLM 运行集合架构旨在从 LangChain BaseChatModel 运行中收集数据，或从 LangSmith OpenAI 包装器中跟踪运行。

如果您跟踪的 LLM 运行不兼容，请联系 support@langchain.dev，我们可以延长支持。

如果你想将转换应用于其他类型的运行（例如，用消息历史记录表示 LangGraph 状态），请定义您的 schema 直接添加相关转换。

支持

将跟踪项目或注释队列中的运行添加到数据集时，如果它具有 LLM 运行类型，我们将应用 Chat Model 架构。

有关在新数据集上启用的信息，请参阅我们的数据集管理操作指南。

规格

有关预构建架构的完整 API 规范，请参阅以下部分：

输入架构

{
  "type": "object",
  "properties": {
    "messages": {
      "type": "array",
      "items": {
        "$ref": "https://api.smith.langchain.com/public/schemas/v1/message.json"
      }
    },
    "tools": {
      "type": "array",
      "items": {
        "$ref": "https://api.smith.langchain.com/public/schemas/v1/tooldef.json"
      }
    }
  },
  "required": ["messages"]
}

输出架构

{
  "type": "object",
  "properties": {
    "message": {
      "$ref": "https://api.smith.langchain.com/public/schemas/v1/message.json"
    }
  },
  "required": ["message"]
}

转换

转换如下所示：

[
  {
    "path": ["inputs"],
    "transformation_type": "remove_extra_fields"
  },
  {
    "path": ["inputs", "messages"],
    "transformation_type": "convert_to_openai_message"
  },
  {
    "path": ["inputs", "tools"],
    "transformation_type": "convert_to_openai_tool"
  },
  {
    "path": ["outputs"],
    "transformation_type": "remove_extra_fields"
  },
  {
    "path": ["outputs", "message"],
    "transformation_type": "convert_to_openai_message"
  }
]

数据集转换

转换类型

聊天模型预构建架构

兼容性

支持

规格

输入架构

输出架构

转换

这个页面有帮助吗？

您可以在 GitHub 上留下详细的反馈。