Log custom LLM traces

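Note

Nothing will break if you don't log LLM traces in the correct format, and the data will still be logged. However, it will not be processed or rendered in an LLM-specific way.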

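The best way to log traces from OpenAI models is to use the OpenAI wrapper in the langsmith SDK for Python and TypeScript. However, you can also log traces from custom models by following the guidelines below.

LangSmith provides special rendering and processing for LLM traces, including token counting (used when token counts are not available from the model provider) and token-based cost calculation. To make the most of this feature, you must log LLM traces in a specific format.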

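Note

The examples below use the traceable decorator/wrapper to log the model run (the recommended approach for Python and JS/TS). However, the same idea applies if you use the RunTree or the API directly.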

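Chat-style models

For chat-style models, inputs must be a list of messages in OpenAI-compatible format, represented as Python dictionaries or TypeScript objects. Each message must contain the keys role and content.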
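As a quick illustration of the input rule above, the following sketch checks a message list before it is passed to a traced function. The helper name validate_chat_messages is hypothetical and not part of the langsmith SDK; it only mirrors the stated requirement that every message carries role and content.

```python
from typing import Any


def validate_chat_messages(messages: Any) -> list[str]:
    """Return a list of problems found; an empty list means the input looks valid."""
    if not isinstance(messages, list):
        return ["messages must be a list"]
    problems = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {index} is not a dict")
            continue
        # Each message needs both keys named in the guide.
        for key in ("role", "content"):
            if key not in message:
                problems.append(f"message {index} is missing '{key}'")
    return problems


good = [{"role": "user", "content": "Hi"}]
bad = [{"role": "user"}, "not a message"]
print(validate_chat_messages(good))
print(validate_chat_messages(bad))
```

Running a check like this in tests is one way to catch malformed inputs early, since a malformed trace is still logged but loses the LLM-specific rendering.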

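Outputs are accepted in any of the following formats:

A dictionary/object that contains the key choices, whose value is a list of dictionaries/objects. Each dictionary/object must contain the key message, which maps to a message object with the keys role and content.
A dictionary/object that contains the key message, whose value is a message object with the keys role and content.
A tuple/array of two elements, where the first element is the role and the second is the content.
A dictionary/object that contains the keys role and content.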
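To make the four accepted shapes concrete, here is a minimal sketch that reduces any of them to a (role, content) pair. The function extract_role_and_content is hypothetical and written for illustration only; it is not how LangSmith parses outputs internally, and it assumes the first entry of choices is the one of interest.

```python
from typing import Any, Tuple


def extract_role_and_content(output: Any) -> Tuple[str, str]:
    """Return (role, content) from any of the four chat output shapes."""
    # Two-element tuple/array: (role, content).
    if isinstance(output, (tuple, list)) and len(output) == 2:
        return output[0], output[1]
    if isinstance(output, dict):
        # {"choices": [{"message": {...}}, ...]}: use the first choice.
        if output.get("choices"):
            message = output["choices"][0]["message"]
            return message["role"], message["content"]
        # {"message": {"role": ..., "content": ...}}
        if "message" in output:
            return output["message"]["role"], output["message"]["content"]
        # {"role": ..., "content": ...}
        if "role" in output and "content" in output:
            return output["role"], output["content"]
    raise ValueError("output does not match any accepted chat format")


message = {"role": "assistant", "content": "Hi"}
shapes = [
    {"choices": [{"message": message}]},
    {"message": message},
    ["assistant", "Hi"],
    message,
]
for shape in shapes:
    print(extract_role_and_content(shape))
```

Every shape carries the same two pieces of information, so which one to return is mostly a question of what your model client already produces.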

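The input to your function should be named messages.

You can also provide the following metadata fields to help LangSmith identify the model and calculate costs. If you use LangChain or the OpenAI wrapper, these fields are populated automatically. To learn more about how to use the metadata fields, see this guide.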

ls_provider: The provider of the model, e.g. "openai", "anthropic", etc.
ls_model_name: The name of the model, e.g. "gpt-4o-mini", "claude-3-opus-20240229", etc.

Python example, followed by the TypeScript equivalent:

from langsmith import traceable

inputs = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "I'd like to book a table for two."},
]

output = {
  "choices": [
      {
          "message": {
              "role": "assistant",
              "content": "Sure, what time would you like to book the table for?"
          }
      }
  ]
}

# Can also use one of:
# output = {
#     "message": {
#         "role": "assistant",
#         "content": "Sure, what time would you like to book the table for?"
#     }
# }
#
# output = {
#     "role": "assistant",
#     "content": "Sure, what time would you like to book the table for?"
# }
#
# output = ["assistant", "Sure, what time would you like to book the table for?"]

@traceable(
  run_type="llm",
  metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
  return output

chat_model(inputs)

import { traceable } from "langsmith/traceable";

const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "I'd like to book a table for two." }
];

const output = {
  choices: [
    {
      message: {
        role: "assistant",
        content: "Sure, what time would you like to book the table for?"
      }
    }
  ]
};

// Can also use one of:
// const output = {
//   message: {
//     role: "assistant",
//     content: "Sure, what time would you like to book the table for?"
//   }
// };
//
// const output = {
//   role: "assistant",
//   content: "Sure, what time would you like to book the table for?"
// };
//
// const output = ["assistant", "Sure, what time would you like to book the table for?"];

const chatModel = traceable(
  async ({ messages }: { messages: { role: string; content: string }[] }) => {
    return output;
  },
  { run_type: "llm", name: "chat_model", metadata: { ls_provider: "my_provider", ls_model_name: "my_model" } }
);

await chatModel({ messages });

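The code above logs a trace containing one llm-type run.

Stream outputs

For streaming, you can "reduce" the outputs into the same format as the non-streaming version. This is currently only supported in Python.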

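from langsmith import traceable

# Combine the streamed chunks into one output in the non-streaming format.
def _reduce_chunks(chunks: list):
    all_text = "".join([chunk["choices"][0]["message"]["content"] for chunk in chunks])
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}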

@traceable(
    run_type="llm",
    reduce_fn=_reduce_chunks,
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def my_streaming_chat_model(messages: list):
    for chunk in ["Hello, " + messages[1]["content"]]:
        yield {
            "choices": [
                {
                    "message": {
                        "content": chunk,
                        "role": "assistant",
                    }
                }
            ]
        }

list(
    my_streaming_chat_model(
        [
            {"role": "system", "content": "You are a helpful assistant. Please greet the user."},
            {"role": "user", "content": "polly the parrot"},
        ],
    )
)
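To see what a reducer of this kind produces without sending a trace, it can be called directly on a list of chunks. The sketch below restates the reducer as a standalone function so it runs without the langsmith package. The names reduce_chunks and make_chunk and the sample chunk texts are illustrative assumptions.

```python
def reduce_chunks(chunks: list) -> dict:
    """Join the content of streamed chunks into one non-streaming style output."""
    all_text = "".join(
        chunk["choices"][0]["message"]["content"] for chunk in chunks
    )
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}


def make_chunk(text: str) -> dict:
    """Build one streamed chunk in the chat 'choices' shape."""
    return {"choices": [{"message": {"content": text, "role": "assistant"}}]}


chunks = [make_chunk("Hello, "), make_chunk("polly "), make_chunk("the parrot")]
reduced = reduce_chunks(chunks)
print(reduced["choices"][0]["message"]["content"])  # Hello, polly the parrot
```

Because the reduced value has the same shape as a non-streaming output, the logged run can be rendered the same way regardless of how the text was produced.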

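Manually provide token counts

Token-based cost tracking

To learn how to set up token-based cost tracking based on token usage information, see this guide.

By default, LangSmith uses tiktoken to count tokens, making a best guess at the model's tokenizer based on the ls_model_name provided. Many models already include token counts as part of the response. You can send these token counts to LangSmith by providing the usage_metadata field. If token information is passed to LangSmith, the system uses it instead of tiktoken.

You can add a usage_metadata key to your function's response, containing a dictionary with the keys input_tokens, output_tokens, and total_tokens. If you use LangChain or the OpenAI wrapper, these fields are populated automatically.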
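A small helper can keep the three usage_metadata keys consistent when a provider reports only input and output counts. The sketch below is one possible way to do that; the helper name with_usage_metadata is hypothetical, and deriving total_tokens as the sum of the other two is an assumption that holds for the examples in this guide.

```python
def with_usage_metadata(output: dict, input_tokens: int, output_tokens: int) -> dict:
    """Return a copy of output with a usage_metadata entry added."""
    return {
        **output,
        "usage_metadata": {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            # Assumption: the total is simply input plus output.
            "total_tokens": input_tokens + output_tokens,
        },
    }


response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Sure, what time?"}}
    ]
}
result = with_usage_metadata(response, input_tokens=27, output_tokens=13)
print(result["usage_metadata"]["total_tokens"])  # 40
```

The helper returns a new dictionary rather than mutating its argument, so the original provider response stays unchanged.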

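Note

If ls_model_name is not present in extra.metadata, other fields from extra.invocation_metadata may be used to estimate token counts. The following fields are used, in order of precedence: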

metadata.ls_model_name
invocation_params.model
invocation_params.model_name
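The precedence order above can be sketched as a simple first-match lookup. This is an illustration only, not LangSmith's implementation: the function resolve_model_name is hypothetical, and the layout of the extra dictionary (a metadata and an invocation_params sub-dictionary) is an assumption based on the field names in the list.

```python
from typing import Optional


def resolve_model_name(extra: dict) -> Optional[str]:
    """Pick the model name using the documented order of precedence."""
    metadata = extra.get("metadata") or {}
    invocation_params = extra.get("invocation_params") or {}
    candidates = [
        metadata.get("ls_model_name"),         # 1. metadata.ls_model_name
        invocation_params.get("model"),        # 2. invocation_params.model
        invocation_params.get("model_name"),   # 3. invocation_params.model_name
    ]
    # Return the first candidate that is set.
    for candidate in candidates:
        if candidate:
            return candidate
    return None


print(resolve_model_name({
    "metadata": {"ls_model_name": "my_model"},
    "invocation_params": {"model": "other_model"},
}))  # my_model
print(resolve_model_name({"invocation_params": {"model_name": "fallback_model"}}))  # fallback_model
```

Setting ls_model_name explicitly avoids relying on the fallbacks at all.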

Python example, followed by the TypeScript equivalent:

from langsmith import traceable

inputs = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "I'd like to book a table for two."},
]

output = {
  "choices": [
      {
          "message": {
              "role": "assistant",
              "content": "Sure, what time would you like to book the table for?"
          }
      }
  ],
  "usage_metadata": {
      "input_tokens": 27,
      "output_tokens": 13,
      "total_tokens": 40,
  },
}

@traceable(
  run_type="llm",
  metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
  return output

chat_model(inputs)

import { traceable } from "langsmith/traceable";

const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "I'd like to book a table for two." },
];

const output = {
  choices: [
    {
      message: {
        role: "assistant",
        content: "Sure, what time would you like to book the table for?",
      },
    },
  ],
  usage_metadata: {
    input_tokens: 27,
    output_tokens: 13,
    total_tokens: 40,
  },
};

const chatModel = traceable(
  async ({
    messages,
  }: {
    messages: { role: string; content: string }[];
  }) => {
    return output;
  },
  { run_type: "llm", name: "chat_model", metadata: { ls_provider: "my_provider", ls_model_name: "my_model" } }
);

await chatModel({ messages });

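Instruct-style models

For instruct-style models (string in, string out), your inputs must contain a key prompt with a string value. Other inputs are also permitted. The output must be an object that, when serialized, contains the key choices with a list of dictionaries/objects. Each of these must contain the key text with a string value. The same rules for metadata and usage_metadata apply as for chat-style models.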
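The instruct-style rules can be expressed as a short check on an input/output pair. The sketch below is a hypothetical helper (check_instruct_io is not part of the langsmith SDK) that reports which of the stated requirements a pair fails to meet.

```python
def check_instruct_io(inputs: dict, output: dict) -> list[str]:
    """Return a list of problems; an empty list means the pair looks valid."""
    problems = []
    # Inputs need a string under "prompt"; extra keys are allowed.
    if not isinstance(inputs.get("prompt"), str):
        problems.append("inputs must contain a string 'prompt'")
    choices = output.get("choices")
    if not isinstance(choices, list):
        problems.append("output must contain a list under 'choices'")
        return problems
    # Every choice needs a string under "text".
    for index, choice in enumerate(choices):
        if not isinstance(choice, dict) or not isinstance(choice.get("text"), str):
            problems.append(f"choice {index} must contain a string 'text'")
    return problems


ok = check_instruct_io(
    {"prompt": "polly the parrot\n", "temperature": 0.0},
    {"choices": [{"text": "Hello, polly the parrot\n"}]},
)
not_ok = check_instruct_io({"question": "hi"}, {"choices": [{"content": "Hello"}]})
print(ok)
print(not_ok)
```

The temperature key in the first call is only there to show that additional inputs do not affect the check.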

Python example, followed by the TypeScript equivalent:

from langsmith import traceable

@traceable(
  run_type="llm",
  metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def hello_llm(prompt: str):
  return {
      "choices": [
          {"text": "Hello, " + prompt}
      ],
      "usage_metadata": {
          "input_tokens": 4,
          "output_tokens": 5,
          "total_tokens": 9,
      },
  }

hello_llm("polly the parrot\n")

import { traceable } from "langsmith/traceable";

const helloLLM = traceable(
  ({ prompt }: { prompt: string }) => {
    return {
      choices: [
        { text: "Hello, " + prompt }
      ],
      usage_metadata: {
        input_tokens: 4,
        output_tokens: 5,
        total_tokens: 9,
      },
    };
  },
  { run_type: "llm", name: "hello_llm", metadata: { ls_provider: "my_provider", ls_model_name: "my_model" } }
);

await helloLLM({ prompt: "polly the parrot\n" });

The code above logs a trace containing one llm-type run.
