Contextual AI Reranker
Contextual AI's instruction-following reranker is the world's first reranker designed to follow custom instructions about how to prioritize documents based on specific criteria such as recency, source, and metadata. With strong performance on the BEIR benchmark (scoring 61.2 and significantly outperforming competitors), it delivers unprecedented control and accuracy for enterprise RAG applications.
Key Capabilities
- Instruction following: dynamically control document ranking through natural-language commands
- Conflict resolution: intelligently handle contradictory information from multiple knowledge sources
- Superior accuracy: achieve state-of-the-art performance on industry benchmarks
- Seamless integration: a drop-in replacement for existing rerankers in your RAG pipeline
The reranker excels at solving real-world problems in enterprise knowledge bases, such as prioritizing recent documents over outdated ones, or preferring internal documents over external sources.
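As a rough intuition for what such an instruction means (a toy illustration in plain Python, not Contextual AI's actual algorithm), "prefer internal sources, then newer documents" behaves like a two-level sort key over document metadata:

```python
# Toy illustration only -- NOT the Contextual AI reranking algorithm.
# Shows what "prefer internal sources, then newer documents" looks like
# as a two-level sort key over (source type, date) metadata.
from datetime import date

docs = [
    {"id": "market-report", "source": "external", "date": date(2023, 11, 30)},
    {"id": "sales-portal", "source": "internal", "date": date(2025, 1, 15)},
    {"id": "spec-sheet", "source": "internal", "date": date(2025, 1, 25)},
]

# Internal sources sort first (False < True), then most recent first.
ranked = sorted(docs, key=lambda d: (d["source"] != "internal", -d["date"].toordinal()))
print([d["id"] for d in ranked])  # ['spec-sheet', 'sales-portal', 'market-report']
```

The actual reranker applies this kind of preference from a natural-language instruction rather than a hand-written sort key.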
To learn more about our instruction-following reranker and see it in action, visit our product overview.
For complete documentation of Contextual AI's products, visit our developer portal.
This integration requires the contextual-client Python SDK. Learn more about it here.
Overview
This integration invokes Contextual AI's instruction-following reranker.
Integration details
| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |
|---|---|---|---|---|---|---|
| ContextualRerank | langchain-contextual | ❌ | beta | ❌ | | |
Setup
To access Contextual's reranker models, you'll need to create a Contextual AI account, get an API key, and install the langchain-contextual integration package.
Credentials
Head to app.contextual.ai to sign up for Contextual and generate an API key. Once you've done this, set the CONTEXTUAL_AI_API_KEY environment variable:
```python
import getpass
import os

if not os.getenv("CONTEXTUAL_AI_API_KEY"):
    os.environ["CONTEXTUAL_AI_API_KEY"] = getpass.getpass(
        "Enter your Contextual API key: "
    )
```
Installation
The LangChain Contextual integration lives in the langchain-contextual package:
```python
%pip install -qU langchain-contextual
```
Instantiation
The Contextual Reranker accepts the following parameters:
| Parameter | Type | Description |
|---|---|---|
| documents | list[Document] | A sequence of documents to rerank. Any metadata contained in the documents will also be used for reranking. |
| query | str | The query to use for reranking. |
| model | str | The version of the reranker to use. Currently, we just have "ctxl-rerank-en-v1-instruct". |
| top_n | Optional[int] | The number of results to return. If None returns all results. Defaults to self.top_n. |
| instruction | Optional[str] | The instruction to be used for the reranker. |
| callbacks | Optional[Callbacks] | Callbacks to run during the compression process. |
```python
from langchain_contextual import ContextualRerank

api_key = ""
model = "ctxl-rerank-en-v1-instruct"

compressor = ContextualRerank(
    model=model,
    api_key=api_key,
)
```
Usage
First, we'll set up the global variables and examples we'll be using, then invoke our reranker client.
```python
from langchain_core.documents import Document

query = "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"
instruction = "Prioritize internal sales documents over market analysis reports. More recent documents should be weighted higher. Enterprise portal content supersedes distributor communications."

document_contents = [
    "Following detailed cost analysis and market research, we have implemented the following changes: AI training clusters will see a 15% uplift in raw compute performance, enterprise support packages are being restructured, and bulk procurement programs (100+ units) for the RTX 5090 Enterprise series will operate on a $2,899 baseline.",
    "Enterprise pricing for the RTX 5090 GPU bulk orders (100+ units) is currently set at $3,100-$3,300 per unit. This pricing for RTX 5090 enterprise bulk orders has been confirmed across all major distribution channels.",
    "RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead.",
]

metadata = [
    {
        "Date": "January 15, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only",
    },
    {"Date": "11/30/2023", "Source": "TechAnalytics Research Group"},
    {
        "Date": "January 25, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only",
    },
]

documents = [
    Document(page_content=content, metadata=metadata[i])
    for i, content in enumerate(document_contents)
]

reranked_documents = compressor.compress_documents(
    query=query,
    instruction=instruction,
    documents=documents,
)
```
API Reference: Document
Use within a chain
Examples coming soon.
API reference
For detailed documentation of all ContextualRerank features and configurations, head to the Github page: https://github.com/ContextualAI/langchain-contextual