如何添加聊天历史

笔记

本指南之前使用了 RunnableWithMessageHistory 抽象。您可以在 v0.2 文档中访问此版本的文档。

在 LangChain v0.3 版本发布后，我们建议 LangChain 用户利用 LangGraph 持久化将 memory 集成到新的 LangChain 应用中。

如果您的代码已经依赖于 RunnableWithMessageHistory 或 BaseChatMessageHistory，则无需进行任何更改。由于该功能适用于简单的聊天应用，且使用 RunnableWithMessageHistory 的任何代码将继续按预期工作，因此我们计划在近期不弃用此功能。

请参阅如何迁移到 LangGraph Memory 以获取更多详细信息。

在许多问答应用中，我们希望允许用户进行来回对话，这意味着应用程序需要某种“记忆”来保存过去的问题和答案，并具备将这些内容融入当前思考逻辑的能力。

在本指南中，我们重点介绍 添加用于整合历史消息的逻辑。

这主要是对话式RAG教程的精简版本。

我们将介绍两种方法：

Chains，其中我们始终执行一个检索步骤；
代理，其中我们赋予大型语言模型在是否以及如何执行检索步骤（或多个步骤）方面的自主权。

对于外部知识源，我们将使用Lilian Weng撰写的同一篇由大型语言模型驱动的自主代理博客文章，该文章来自RAG教程。

两种方法都利用 LangGraph 作为编排框架。LangGraph 实现了内置的持久化层，使其非常适合支持多轮对话的聊天应用。

设置

依赖项

在本教程中，我们将使用 OpenAI 嵌入和内存中的向量存储，但此处展示的所有内容都适用于任何嵌入，以及向量存储或检索器。

我们将使用以下软件包：

%%capture --no-stderr
%pip install --upgrade --quiet langgraph langchain-community beautifulsoup4

LangSmith

使用 LangChain 构建的许多应用程序都包含多个步骤，以及多次调用大型语言模型（LLM）。随着这些应用程序变得越来越复杂，能够检查链或代理内部的具体情况变得至关重要。实现这一点的最佳方式是使用 LangSmith。

请注意，LangSmith 并非必需，但使用它会有所帮助。如果您确实希望使用 LangSmith，请在上方链接注册后，确保设置您的环境变量以开始记录追踪信息：

os.environ["LANGSMITH_TRACING"] = "true"
if not os.environ.get("LANGSMITH_API_KEY"):
    os.environ["LANGSMITH_API_KEY"] = getpass.getpass()

组件

我们需要从 LangChain 的集成套件中选择三个组件。

一个聊天模型：

选择聊天模型:

pip install -qU "langchain[openai]"

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gpt-4o-mini", model_provider="openai")

一个嵌入模型：

选择嵌入模型：

pip install -qU langchain-openai

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

并且一个向量存储：

选择向量存储：

pip install -qU langchain-core

from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)

链式调用

RAG 教程索引了 Lilian Weng 撰写的关于由大型语言模型驱动的自主代理的博客文章。我们将在此重复相关内容。以下是加载页面内容、将其拆分为子文档，并将文档嵌入到我们的向量存储中的过程：

import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from typing_extensions import List, TypedDict

# Load and chunk contents of the blog
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(docs)

API 参考：hub |WebBaseLoader | 文档 |RecursiveCharacterTextSplitter

# Index chunks
_ = vector_store.add_documents(documents=all_splits)

如在RAG教程的第2部分中详细所述，我们可以通过将RAG应用的流程表示为一系列消息，自然地支持对话式体验：

用户输入作为 HumanMessage;
将向量存储查询作为带有工具调用的 AIMessage；
检索到的文档作为 ToolMessage;
最终响应为 AIMessage。

我们将使用工具调用来实现这一点，这还允许由大语言模型生成查询。我们可以构建一个工具来执行检索步骤：

from langchain_core.tools import tool


@tool(response_format="content_and_artifact")
def retrieve(query: str):
    """Retrieve information related to a query."""
    retrieved_docs = vector_store.similarity_search(query, k=2)
    serialized = "\n\n".join(
        (f"Source: {doc.metadata}\n" f"Content: {doc.page_content}")
        for doc in retrieved_docs
    )
    return serialized, retrieved_docs

API 参考：工具

现在我们可以构建我们的LangGraph应用程序了。

请注意，我们使用一个检查点程序来支持来回对话。LangGraph 自带一个简单的内存检查点程序，我们将在下面使用。有关更多详细信息，请参阅其文档，包括如何使用不同的持久化后端（例如 SQLite 或 Postgres）。

from langchain_core.messages import SystemMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition


# Step 1: Generate an AIMessage that may include a tool-call to be sent.
def query_or_respond(state: MessagesState):
    """Generate tool call for retrieval or respond."""
    llm_with_tools = llm.bind_tools([retrieve])
    response = llm_with_tools.invoke(state["messages"])
    # MessagesState appends messages to state instead of overwriting
    return {"messages": [response]}


# Step 2: Execute the retrieval.
tools = ToolNode([retrieve])


# Step 3: Generate a response using the retrieved content.
def generate(state: MessagesState):
    """Generate answer."""
    # Get generated ToolMessages
    recent_tool_messages = []
    for message in reversed(state["messages"]):
        if message.type == "tool":
            recent_tool_messages.append(message)
        else:
            break
    tool_messages = recent_tool_messages[::-1]

    # Format into prompt
    docs_content = "\n\n".join(doc.content for doc in tool_messages)
    system_message_content = (
        "You are an assistant for question-answering tasks. "
        "Use the following pieces of retrieved context to answer "
        "the question. If you don't know the answer, say that you "
        "don't know. Use three sentences maximum and keep the "
        "answer concise."
        "\n\n"
        f"{docs_content}"
    )
    conversation_messages = [
        message
        for message in state["messages"]
        if message.type in ("human", "system")
        or (message.type == "ai" and not message.tool_calls)
    ]
    prompt = [SystemMessage(system_message_content)] + conversation_messages

    # Run
    response = llm.invoke(prompt)
    return {"messages": [response]}


# Build graph
graph_builder = StateGraph(MessagesState)

graph_builder.add_node(query_or_respond)
graph_builder.add_node(tools)
graph_builder.add_node(generate)

graph_builder.set_entry_point("query_or_respond")
graph_builder.add_conditional_edges(
    "query_or_respond",
    tools_condition,
    {END: END, "tools": "tools"},
)
graph_builder.add_edge("tools", "generate")
graph_builder.add_edge("generate", END)

memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)

API 参考：SystemMessage | MemorySaver | StateGraph | ToolNode | tools_condition

from IPython.display import Image, display

display(Image(graph.get_graph().draw_mermaid_png()))

让我们测试一下我们的应用程序。

请注意，它会适当回应那些不需要额外检索步骤的消息：

# Specify an ID for the thread
config = {"configurable": {"thread_id": "abc123"}}

input_message = "Hello"

for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()

================================[1m Human Message [0m=================================

Hello
==================================[1m Ai Message [0m==================================

Hello! How can I assist you today?

在执行搜索时，我们可以流式输出各个步骤，以观察查询生成、检索和答案生成的过程：

input_message = "What is Task Decomposition?"

for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()

================================[1m Human Message [0m=================================

What is Task Decomposition?
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_RntwX5GMt531biEE9MqSbgLV)
 Call ID: call_RntwX5GMt531biEE9MqSbgLV
  Args:
    query: Task Decomposition
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
==================================[1m Ai Message [0m==================================

Task Decomposition is the process of breaking down a complicated task into smaller, more manageable steps. It often involves techniques like Chain of Thought (CoT), where the model is prompted to "think step by step," allowing for better handling of complex tasks. This approach enhances model performance and provides insight into the model's reasoning process.

最后，由于我们使用检查点编译了应用程序，历史消息会保留在状态中。这使得模型能够根据上下文理解用户查询：

input_message = "Can you look up some common ways of doing it?"

for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()

================================[1m Human Message [0m=================================

Can you look up some common ways of doing it?
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_kwO5rYPyJ0MftYKoKRFjKpZM)
 Call ID: call_kwO5rYPyJ0MftYKoKRFjKpZM
  Args:
    query: common methods for task decomposition
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
==================================[1m Ai Message [0m==================================

Common ways of Task Decomposition include: (1) using large language models (LLMs) with simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?"; (2) utilizing task-specific instructions, such as "Write a story outline" for creative tasks; and (3) incorporating human inputs to guide the decomposition process.

请注意，我们可以在 LangSmith 跟踪中观察发送给聊天模型的完整消息序列——包括工具调用和检索到的上下文。

对话历史也可以通过应用程序的状态进行检查：

chat_history = graph.get_state(config).values["messages"]
for message in chat_history:
    message.pretty_print()

================================[1m Human Message [0m=================================

Hello
==================================[1m Ai Message [0m==================================

Hello! How can I assist you today?
================================[1m Human Message [0m=================================

What is Task Decomposition?
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_RntwX5GMt531biEE9MqSbgLV)
 Call ID: call_RntwX5GMt531biEE9MqSbgLV
  Args:
    query: Task Decomposition
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
==================================[1m Ai Message [0m==================================

Task Decomposition is the process of breaking down a complicated task into smaller, more manageable steps. It often involves techniques like Chain of Thought (CoT), where the model is prompted to "think step by step," allowing for better handling of complex tasks. This approach enhances model performance and provides insight into the model's reasoning process.
================================[1m Human Message [0m=================================

Can you look up some common ways of doing it?
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_kwO5rYPyJ0MftYKoKRFjKpZM)
 Call ID: call_kwO5rYPyJ0MftYKoKRFjKpZM
  Args:
    query: common methods for task decomposition
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
==================================[1m Ai Message [0m==================================

Common ways of Task Decomposition include: (1) using large language models (LLMs) with simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?"; (2) utilizing task-specific instructions, such as "Write a story outline" for creative tasks; and (3) incorporating human inputs to guide the decomposition process.

代理

代理利用大型语言模型（LLM）的推理能力，在执行过程中做出决策。使用代理可以将检索过程中的额外决策权交由模型处理。尽管它们的行为不如上述“链”那样可预测，但能够为查询执行多个检索步骤，或对单个搜索进行迭代。

以下是构建一个最小化RAG代理的方法。使用LangGraph的预构建ReAct代理构造器，我们可以在一行代码中完成。

提示

查看 LangGraph 的智能体 RAG 教程，了解更高级的实现方法。

from langgraph.prebuilt import create_react_agent

agent_executor = create_react_agent(llm, [retrieve], checkpointer=memory)

API 参考：create_react_agent

让我们检查一下这个图表：

display(Image(agent_executor.get_graph().draw_mermaid_png()))

与我们早期实现的关键区别在于，这里不再以最终的生成步骤结束运行，而是工具调用会循环返回到最初的LLM调用。模型随后可以使用检索到的上下文回答问题，或生成另一个工具调用以获取更多信息。

让我们来测试一下。我们构造一个通常需要通过一系列迭代的检索步骤才能回答的问题：

config = {"configurable": {"thread_id": "def234"}}

input_message = (
    "What is the standard method for Task Decomposition?\n\n"
    "Once you get the answer, look up common extensions of that method."
)

for event in agent_executor.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    event["messages"][-1].pretty_print()

================================[1m Human Message [0m=================================

What is the standard method for Task Decomposition?

Once you get the answer, look up common extensions of that method.
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_rxBqio7dxthnMuzjr4AIquSZ)
 Call ID: call_rxBqio7dxthnMuzjr4AIquSZ
  Args:
    query: standard method for Task Decomposition
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
==================================[1m Ai Message [0m==================================
Tool Calls:
  retrieve (call_kmQMRWCKeBdtXdlJi8yZD9CO)
 Call ID: call_kmQMRWCKeBdtXdlJi8yZD9CO
  Args:
    query: common extensions of Task Decomposition methods
=================================[1m Tool Message [0m=================================
Name: retrieve

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
==================================[1m Ai Message [0m==================================

The standard method for Task Decomposition involves breaking down complex tasks into smaller, manageable steps. Here are the main techniques:

1. **Chain of Thought (CoT)**: This prompting technique encourages a model to "think step by step," allowing it to utilize more computational resources during testing to decompose challenging tasks into simpler parts. CoT not only simplifies tasks but also provides insights into the model's reasoning process.

2. **Simple Prompting**: This can involve straightforward queries like "Steps for XYZ" or "What are the subgoals for achieving XYZ?" to guide the model in identifying the necessary steps.

3. **Task-specific Instructions**: Using specific prompts tailored to the task at hand, such as "Write a story outline" for creative writing, allows for more directed decomposition.

4. **Human Inputs**: Involving human expertise can also aid in breaking down tasks effectively.

### Common Extensions of Task Decomposition Methods

1. **Tree of Thoughts**: This method extends CoT by exploring multiple reasoning possibilities at each step. It decomposes the problem into various thought steps and generates multiple thoughts per step, forming a tree structure. This can utilize search processes like breadth-first search (BFS) or depth-first search (DFS) to evaluate states through classifiers or majority voting.

These extensions build on the basic principles of task decomposition, enhancing the depth and breadth of reasoning applied to complex tasks.

请注意，该代理：

生成用于搜索任务分解标准方法的查询；
接收答案后，生成第二个查询以搜索其常见的扩展内容；
在获取所有必要上下文后，回答问题。

我们可以在 LangSmith 跟踪中查看完整的步骤序列，以及延迟和其他元数据。

下一步

我们已经介绍了构建一个基本对话式问答应用的步骤：

我们使用链来构建一个可预测的应用程序，该应用程序会为每个用户输入生成搜索查询；
我们使用代理来构建一个应用程序，该程序“决定”何时以及如何生成搜索查询。

要探索不同类型的检索器和检索策略，请访问指南中的检索器部分。

有关 LangChain 会话记忆抽象的详细指南，请访问如何添加消息历史记录（记忆） LCEL 页面。

要了解有关代理的更多信息，请访问代理模块。

设置​

依赖项​

LangSmith​

组件​

链式调用​

代理​

下一步​

设置

依赖项

LangSmith

组件

链式调用

代理

下一步