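SageMaker Tracking
Amazon SageMaker is a fully managed service for quickly and easily building, training, and deploying machine learning (ML) models.
Amazon SageMaker Experiments is a capability of Amazon SageMaker that lets you organize, track, compare, and evaluate ML experiments and model versions.
This notebook shows how to use a LangChain callback to log and track prompts and other LLM hyperparameters in SageMaker Experiments. It demonstrates the capability with three scenarios:
- Scenario 1: Single LLM - a single LLM generates output from a given prompt.
- Scenario 2: Sequential Chain - a sequential chain of two LLM calls.
- Scenario 3: Agent with Tools (Chain of Thought) - multiple tools (such as search and math) are used in addition to an LLM.
In this notebook, we create a single experiment that logs the prompts from each scenario.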
Installation and Setup
%pip install --upgrade --quiet sagemaker
%pip install --upgrade --quiet langchain-openai
%pip install --upgrade --quiet google-search-results
First, set up the required API keys:
- OpenAI: https://platform.openai.com/account/api-keys (for the OpenAI LLM)
- Google SERP API: https://serpapi.com/manage-api-key (for the Google search tool)
import os
## Add your API keys below
os.environ["OPENAI_API_KEY"] = "<ADD-KEY-HERE>"
os.environ["SERPAPI_API_KEY"] = "<ADD-KEY-HERE>"
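If you would rather not paste keys into the notebook, one option is to prompt for them only when they are missing. The following is a minimal sketch using the standard library; the helper name `ensure_env_key` is hypothetical and is not part of LangChain or SageMaker.

```python
import os
from getpass import getpass


def ensure_env_key(name, prompt_fn=getpass):
    """Return the value of environment variable `name`.

    If the variable is unset or empty, ask for it via `prompt_fn`
    (getpass by default, so the key is not echoed) and store it.
    """
    value = os.environ.get(name)
    if not value:
        value = prompt_fn(f"Enter {name}: ")
        os.environ[name] = value
    return value
```

Calling `ensure_env_key("OPENAI_API_KEY")` and `ensure_env_key("SERPAPI_API_KEY")` before creating the LLM would then have the same effect as the assignments above, without storing the keys in the notebook source.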
from langchain_community.callbacks.sagemaker_callback import SageMakerCallbackHandler
API Reference: SageMakerCallbackHandler
from langchain.agents import initialize_agent, load_tools
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI
from sagemaker.analytics import ExperimentAnalytics
from sagemaker.experiments.run import Run
from sagemaker.session import Session
LLM Prompt Tracking
# LLM Hyperparameters
HPARAMS = {
    "temperature": 0.1,
    "model_name": "gpt-3.5-turbo-instruct",
}
# Bucket used to save prompt logs (use `None` to save to the default bucket, or set a bucket name)
BUCKET_NAME = None
# Experiment name
EXPERIMENT_NAME = "langchain-sagemaker-tracker"
# Create SageMaker Session with the given bucket
session = Session(default_bucket=BUCKET_NAME)
Scenario 1 - LLM
RUN_NAME = "run-scenario-1"
PROMPT_TEMPLATE = "tell me a joke about {topic}"
INPUT_VARIABLES = {"topic": "fish"}
with Run(
    experiment_name=EXPERIMENT_NAME, run_name=RUN_NAME, sagemaker_session=session
) as run:
    # Create SageMaker Callback
    sagemaker_callback = SageMakerCallbackHandler(run)
    # Define LLM model with callback
    llm = OpenAI(callbacks=[sagemaker_callback], **HPARAMS)
    # Create prompt template
    prompt = PromptTemplate.from_template(template=PROMPT_TEMPLATE)
    # Create LLM Chain
    chain = LLMChain(llm=llm, prompt=prompt, callbacks=[sagemaker_callback])
    # Run chain
    chain.run(**INPUT_VARIABLES)
    # Reset the callback
    sagemaker_callback.flush_tracker()
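Each scenario ends by resetting the callback with `flush_tracker()`. If you want that reset to happen even when the chain raises an exception, one possible approach is a small context manager. This is only a sketch: the name `flush_on_exit` is hypothetical, and the one thing it assumes about the handler is the `flush_tracker()` method used above.

```python
from contextlib import contextmanager


@contextmanager
def flush_on_exit(callback):
    """Yield `callback` and call its flush_tracker() when the block exits.

    The flush runs in a `finally` clause, so it also happens if the
    body of the `with` block raises.
    """
    try:
        yield callback
    finally:
        callback.flush_tracker()
```

Inside the `Run` block, this could be used as `with flush_on_exit(SageMakerCallbackHandler(run)) as sagemaker_callback:`, with the chain built and run in the body and the explicit `flush_tracker()` call dropped.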
Scenario 2 - Sequential Chain
RUN_NAME = "run-scenario-2"
PROMPT_TEMPLATE_1 = """You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: {title}
Playwright: This is a synopsis for the above play:"""
PROMPT_TEMPLATE_2 = """You are a play critic from the New York Times. Given the synopsis of play, it is your job to write a review for that play.
Play Synopsis: {synopsis}
Review from a New York Times play critic of the above play:"""
INPUT_VARIABLES = {
    "input": "documentary about good video games that push the boundary of game design"
}
with Run(
    experiment_name=EXPERIMENT_NAME, run_name=RUN_NAME, sagemaker_session=session
) as run:
    # Create SageMaker Callback
    sagemaker_callback = SageMakerCallbackHandler(run)
    # Create prompt templates for the chain
    prompt_template1 = PromptTemplate.from_template(template=PROMPT_TEMPLATE_1)
    prompt_template2 = PromptTemplate.from_template(template=PROMPT_TEMPLATE_2)
    # Define LLM model with callback
    llm = OpenAI(callbacks=[sagemaker_callback], **HPARAMS)
    # Create chain1
    chain1 = LLMChain(llm=llm, prompt=prompt_template1, callbacks=[sagemaker_callback])
    # Create chain2
    chain2 = LLMChain(llm=llm, prompt=prompt_template2, callbacks=[sagemaker_callback])
    # Create Sequential chain
    overall_chain = SimpleSequentialChain(
        chains=[chain1, chain2], callbacks=[sagemaker_callback]
    )
    # Run overall sequential chain
    overall_chain.run(**INPUT_VARIABLES)
    # Reset the callback
    sagemaker_callback.flush_tracker()
Scenario 3 - Agent with Tools
RUN_NAME = "run-scenario-3"
PROMPT_TEMPLATE = "Who is the oldest person alive? And what is their current age raised to the power of 1.51?"
with Run(
    experiment_name=EXPERIMENT_NAME, run_name=RUN_NAME, sagemaker_session=session
) as run:
    # Create SageMaker Callback
    sagemaker_callback = SageMakerCallbackHandler(run)
    # Define LLM model with callback
    llm = OpenAI(callbacks=[sagemaker_callback], **HPARAMS)
    # Define tools
    tools = load_tools(["serpapi", "llm-math"], llm=llm, callbacks=[sagemaker_callback])
    # Initialize agent with all the tools
    agent = initialize_agent(
        tools, llm, agent="zero-shot-react-description", callbacks=[sagemaker_callback]
    )
    # Run agent
    agent.run(input=PROMPT_TEMPLATE)
    # Reset the callback
    sagemaker_callback.flush_tracker()
Load Log Data
Once the prompts are logged, we can easily load them and convert them to a Pandas DataFrame as follows.
# Load
logs = ExperimentAnalytics(experiment_name=EXPERIMENT_NAME)
# Convert to a pandas DataFrame
df = logs.dataframe(force_refresh=True)
print(df.shape)
df.head()
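As shown above, the experiment contains three runs (rows), one per scenario. Each run logs the prompts and the related LLM settings/hyperparameters as JSON and saves them to an S3 bucket. You can load and explore the log data from each JSON path at any time.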
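Loading one of those JSON files could look like the sketch below. It assumes you already have the `s3://` URI of a log file (the source does not say which DataFrame column holds it), that `boto3` is installed, and that AWS credentials are configured. The helper names `split_s3_uri` and `load_json_log` are hypothetical.

```python
import json
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key.json' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_json_log(uri: str):
    """Download the object at `uri` and parse it as JSON."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, key = split_s3_uri(uri)
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)
```

For example, `load_json_log(uri)` with a URI taken from a run's logged artifacts would return the parsed prompt log. The structure of that JSON is determined by the callback handler and is not described here.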