
Contextual AI Reranker

Contextual AI's Instruction-Following Reranker is the world's first reranker designed to follow custom instructions about how to prioritize documents based on specific criteria such as recency, source, and metadata. With superior performance on the BEIR benchmark (scoring 61.2 and significantly outperforming competitors), it delivers unprecedented control and accuracy for enterprise RAG applications.

Key features

  • Instruction following: dynamically control document ranking with natural-language commands
  • Conflict resolution: intelligently handles contradictory information from multiple knowledge sources
  • Superior accuracy: achieves state-of-the-art performance on industry benchmarks
  • Seamless integration: a drop-in replacement for existing rerankers in your RAG pipeline

The reranker excels at solving real-world challenges in enterprise knowledge bases, such as prioritizing recent documents over outdated ones, or favoring internal documents over external sources.
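For illustration, instructions are just natural-language strings. The examples below are hypothetical (not taken from the product docs) and only show the kind of criteria the reranker can be told to follow:

# Hypothetical examples of reranking instructions written in plain English.
recency_instruction = "Prioritize documents from the last six months over older material."
source_instruction = "Prefer internal engineering documents over external market reports."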

To learn more about our instruction-following reranker and see examples of it in action, visit our product overview.

For comprehensive documentation on Contextual AI's products, visit our developer portal.

This integration requires the contextual-client Python SDK. Learn more about it here.
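If the SDK isn't already pulled in as a dependency of the integration package, it can be installed directly. This assumes the PyPI distribution uses the same contextual-client name:

%pip install -qU contextual-client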

Overview

This integration invokes Contextual AI's reranker.

Integration details

Class | Package | Local | Serializable | JS support | Package downloads | Package latest
ContextualRerank | langchain-contextual | ❌ | beta | ❌ | PyPI - Downloads | PyPI - Version

Setup

To access Contextual's reranker model, you'll need to create a Contextual AI account, get an API key, and install the langchain-contextual integration package.

Credentials

Head to app.contextual.ai to sign up for Contextual and generate an API key. Once you've done this, set the CONTEXTUAL_AI_API_KEY environment variable:

import getpass
import os

if not os.getenv("CONTEXTUAL_AI_API_KEY"):
    os.environ["CONTEXTUAL_AI_API_KEY"] = getpass.getpass(
        "Enter your Contextual API key: "
    )

Installation

The LangChain Contextual integration lives in the langchain-contextual package:

%pip install -qU langchain-contextual

Instantiation

The Contextual Reranker accepts the following parameters:

Parameter | Type | Description
documents | list[Document] | A sequence of documents to rerank. Any metadata contained in the documents will also be used for reranking.
query | str | The query to use for reranking.
model | str | The version of the reranker to use. Currently, we just have "ctxl-rerank-en-v1-instruct".
top_n | Optional[int] | The number of results to return. If None returns all results. Defaults to self.top_n.
instruction | Optional[str] | The instruction to be used for the reranker.
callbacks | Optional[Callbacks] | Callbacks to run during the compression process.

from langchain_contextual import ContextualRerank

api_key = ""
model = "ctxl-rerank-en-v1-instruct"

compressor = ContextualRerank(
    model=model,
    api_key=api_key,
)
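If you want to cap how many documents come back, the parameter table above notes that top_n defaults to self.top_n, which suggests it can also be set on the instance. A minimal sketch, assuming top_n is accepted at construction time as in other LangChain rerankers:

compressor_top2 = ContextualRerank(
    model=model,
    api_key=api_key,
    top_n=2,  # assumed constructor field; compress_documents falls back to self.top_n
)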

Usage

First, we'll set up the global variables and examples we'll use, and instantiate our reranker client.

from langchain_core.documents import Document

query = "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"
instruction = "Prioritize internal sales documents over market analysis reports. More recent documents should be weighted higher. Enterprise portal content supersedes distributor communications."

document_contents = [
"Following detailed cost analysis and market research, we have implemented the following changes: AI training clusters will see a 15% uplift in raw compute performance, enterprise support packages are being restructured, and bulk procurement programs (100+ units) for the RTX 5090 Enterprise series will operate on a $2,899 baseline.",
"Enterprise pricing for the RTX 5090 GPU bulk orders (100+ units) is currently set at $3,100-$3,300 per unit. This pricing for RTX 5090 enterprise bulk orders has been confirmed across all major distribution channels.",
"RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead.",
]

metadata = [
    {
        "Date": "January 15, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only",
    },
    {"Date": "11/30/2023", "Source": "TechAnalytics Research Group"},
    {
        "Date": "January 25, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only",
    },
]

documents = [
    Document(page_content=content, metadata=metadata[i])
    for i, content in enumerate(document_contents)
]
reranked_documents = compressor.compress_documents(
    query=query,
    instruction=instruction,
    documents=documents,
)
API Reference: Document
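To inspect the effect of the instruction, you can print the reranked order. A minimal sketch using only the Document fields defined above:

# Print the reranked documents in their new order, most relevant first.
for doc in reranked_documents:
    print(doc.metadata.get("Date"), "|", doc.metadata.get("Source"))
    print(doc.page_content[:100], "...\n")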

Use within a chain

Examples coming soon.
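In the meantime, here is a minimal sketch of the standard LangChain compressed-retrieval pattern, wrapping the reranker around an existing retriever. base_retriever is a placeholder for any retriever you already have (for example, one created with a vector store's as_retriever() method):

from langchain.retrievers import ContextualCompressionRetriever

# `base_retriever` is a placeholder: any existing LangChain retriever works here.
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=base_retriever,
)

compressed_docs = compression_retriever.invoke(
    "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"
)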

API reference

For detailed documentation of all ContextualRerank features and configurations, head to the GitHub page: https://github.com/ContextualAI//langchain-contextual