
ZoteroRetriever

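This guide will help you get started with the Zotero retriever. For detailed documentation of all ZoteroRetriever features and configurations, see the GitHub page.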

Integration details

Retriever | Source | Package
ZoteroRetriever | Zotero API | langchain-community

Setup

If you want automated tracing of individual queries, you can also set your LangSmith API key by uncommenting the code below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installation

This retriever lives in the langchain-zotero-retriever package. We also need the pyzotero dependency:

%pip install -qU langchain-zotero-retriever pyzotero

Instantiation

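ZoteroRetriever parameters include:

  • k: Number of results to include. (Default: 50)
  • type: Type of search to perform. "top" retrieves top-level Zotero library items, "items" returns any Zotero library item. (Default: top)
  • get_fulltext: Retrieve the full text of library items that have one attached. If False, or if no text is attached, an empty string is returned as page_content. (Default: True)
  • library_id: ID of the Zotero library to search. Required to connect to a library.
  • library_type: Type of library to search. "user" for a personal library, "group" for a shared group library. (Default: user)
  • api_key: Zotero API key, if not set as an environment variable. Optional for public group libraries, required for non-public group libraries and personal libraries. If provided as the ZOTERO_API_KEY environment variable, it is picked up automatically.

from langchain_zotero_retriever.retrievers import ZoteroRetriever

retriever = ZoteroRetriever(
    k=10,
    library_id="2319375",  # a public group library that does not require an API key for access
    library_type="group",  # set this to "user" if you are using a personal library. Personal libraries require an API key
)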
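
The rules above for library_type and api_key can be checked before a retriever is created. The following is a minimal sketch under stated assumptions: zotero_kwargs is a hypothetical helper, not part of langchain-zotero-retriever, and the error it raises for a personal library without a key is this sketch's own choice rather than documented library behaviour.

```python
import os
from typing import Any, Dict, Optional


def zotero_kwargs(
    library_id: str,
    library_type: str = "user",
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for ZoteroRetriever."""
    if library_type not in ("user", "group"):
        raise ValueError("library_type must be 'user' or 'group'")

    # Prefer an explicitly passed key, then fall back to ZOTERO_API_KEY.
    key = api_key or os.environ.get("ZOTERO_API_KEY")

    # Assumption of this sketch: fail early instead of waiting for the API call.
    if library_type == "user" and not key:
        raise ValueError("personal libraries require a Zotero API key")

    kwargs: Dict[str, Any] = {
        "library_id": library_id,
        "library_type": library_type,
    }
    if key:
        kwargs["api_key"] = key
    return kwargs
```

With the package installed, the result could be unpacked into the constructor, for example ZoteroRetriever(k=10, **zotero_kwargs("2319375", "group")).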

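Usage

Apart from the query, the retriever provides these additional search parameters:

  • itemType: Type of item to search for (e.g. "book" or "journalArticle")
  • tag: Search over the tags attached to library items (see the search syntax for combining multiple tags)
  • qmode: Search mode to use. Changes what the query searches over. "everything" includes full-text content. "titleCreatorYear" searches over title, authors and year.
  • since: Return only objects modified after the specified library version. By default, everything is returned.

For the search syntax, see the Zotero API documentation: https://www.zotero.org/support/dev/web_api/v3/basics#search_syntax

For the full API schema (including the available itemTypes), see: https://github.com/zotero/zotero-schema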
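
To make the search parameters more concrete, here is a rough sketch of how they could map onto a Zotero Web API request URL. build_search_url is a hypothetical helper written for illustration. The retriever itself talks to the API through pyzotero, so the request it actually sends may differ in detail. The sketch assumes the Web API parameter names q, qmode, itemType, tag, since and limit, and that repeating the tag parameter combines tags with AND.

```python
from typing import List, Optional, Tuple, Union
from urllib.parse import urlencode


def build_search_url(
    library_id: str,
    library_type: str = "group",
    query: str = "",
    item_type: Optional[str] = None,
    tag: Optional[Union[str, List[str]]] = None,
    qmode: str = "everything",
    since: Optional[int] = None,
    limit: int = 10,
    top_only: bool = True,
) -> str:
    """Hypothetical helper: build an item search URL for the Zotero Web API."""
    prefix = "users" if library_type == "user" else "groups"
    path = "items/top" if top_only else "items"

    params: List[Tuple[str, str]] = [
        ("q", query),
        ("qmode", qmode),
        ("limit", str(limit)),
    ]
    if item_type:
        params.append(("itemType", item_type))
    if since is not None:
        params.append(("since", str(since)))

    # A list of tags becomes repeated "tag" parameters (logical AND).
    tags = [tag] if isinstance(tag, str) else (tag or [])
    params.extend(("tag", t) for t in tags)

    return f"https://api.zotero.org/{prefix}/{library_id}/{path}?{urlencode(params)}"


print(build_search_url("2319375", query="Zuboff", item_type="journalArticle"))
```

Seeing the parameters side by side in one URL also shows why short, specific queries work best: the query string is matched against item fields as given and is not interpreted as a natural-language question.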

query = "Zuboff"

retriever.invoke(query)

# note that providing tags as a list will result in a logical AND operation
tags = [
    "Surveillance",
    "Digital Capitalism",
]

retriever.invoke("", tag=tags)
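
The difference between passing tags as a list (AND) and as a single "||"-separated string (OR) can be illustrated with a small local emulation. This is only a sketch of the matching semantics: matches_tags is a hypothetical function, the real filtering happens on the Zotero server, and other parts of the search syntax (such as negation) are left out.

```python
from typing import Iterable, List, Union


def matches_tags(item_tags: Iterable[str], tag: Union[str, List[str]]) -> bool:
    """Hypothetical emulation of how a tag filter selects an item."""
    present = set(item_tags)
    conditions = [tag] if isinstance(tag, str) else tag
    # Every list entry must match (AND); inside one entry, "||" means OR.
    return all(
        any(alternative.strip() in present for alternative in condition.split("||"))
        for condition in conditions
    )


item = ["Surveillance"]
print(matches_tags(item, ["Surveillance", "Digital Capitalism"]))  # list: both tags required
print(matches_tags(item, "Surveillance || Digital Capitalism"))  # string: either tag suffices
```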

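Use within a chain

Because of the way the Zotero API search works, passing a user question directly to the ZoteroRetriever often does not return satisfactory results. For use in chains or agentic frameworks, it is recommended to turn the ZoteroRetriever into a tool. This way, the LLM can turn the user query into a more concise search query for the API. It also allows the LLM to fill in additional search parameters, such as tag or item type.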

from typing import List, Optional, Union

from langchain_core.output_parsers import PydanticToolsParser
from langchain_core.tools import StructuredTool, tool
from langchain_openai import ChatOpenAI


def retrieve(
    query: str,
    itemType: Optional[str],
    tag: Optional[Union[str, List[str]]],
    qmode: str = "everything",
    since: Optional[int] = None,
):
    # Forward the search parameters chosen by the LLM to the retriever.
    retrieved_docs = retriever.invoke(
        query, itemType=itemType, tag=tag, qmode=qmode, since=since
    )
    # Render each document as its metadata (without the abstract), followed by the abstract.
    serialized_docs = "\n\n".join(
        (
            f"Metadata: { {key: doc.metadata[key] for key in doc.metadata if key != 'abstractNote'} }\n"
            f"Abstract: {doc.metadata['abstractNote']}\n"
        )
        for doc in retrieved_docs
    )

    # Return both the text for the LLM and the raw documents.
    return serialized_docs, retrieved_docs


description = """Search and return relevant documents from a Zotero library. The following search parameters can be used:

    Args:
        query: str: The search query to be used. Try to keep this specific and short, e.g. a specific topic or author name
        itemType: Optional. Type of item to search for (e.g. "book" or "journalArticle"). Multiple types can be passed as a string separated by "||", e.g. "book || journalArticle". Defaults to all types.
        tag: Optional. For searching over tags attached to library items. If documents tagged with multiple tags are to be retrieved, pass them as a list. If documents with any of the tags are to be retrieved, pass them as a string separated by "||", e.g. "tag1 || tag2"
        qmode: Search mode to use. Changes what the query searches over. "everything" includes full-text content. "titleCreatorYear" to search over title, authors and year. Defaults to "everything".
        since: Return only objects modified after the specified library version. Defaults to return everything.
    """

retriever_tool = StructuredTool.from_function(
    func=retrieve,
    name="retrieve",
    description=description,
    return_direct=True,
)


llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

llm_with_tools = llm.bind_tools([retrieve])

q = "What journal articles do I have on Surveillance in the zotero library?"

chain = llm_with_tools | PydanticToolsParser(tools=[retrieve])

chain.invoke(q)
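
The serialization step inside retrieve indexes doc.metadata['abstractNote'] directly, which raises a KeyError if that field is missing from an item's metadata. A slightly more defensive variant is sketched below. Doc is a stand-in dataclass used so the example runs without LangChain installed, and serialize_docs and the "(no abstract)" placeholder are choices made for this sketch, not part of the retriever package.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Stand-in for a LangChain Document (page_content plus metadata)."""

    page_content: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)


def serialize_docs(docs: List[Doc]) -> str:
    """Hypothetical helper: render documents as text, tolerating missing abstracts."""
    blocks = []
    for doc in docs:
        # Keep every metadata field except the abstract, which is printed separately.
        meta = {k: v for k, v in doc.metadata.items() if k != "abstractNote"}
        # Fall back to a placeholder when the abstract is missing or empty.
        abstract = doc.metadata.get("abstractNote") or "(no abstract)"
        blocks.append(f"Metadata: {meta}\nAbstract: {abstract}\n")
    return "\n\n".join(blocks)


docs = [
    Doc(metadata={"title": "A", "abstractNote": "Some abstract."}),
    Doc(metadata={"title": "B"}),
]
print(serialize_docs(docs))
```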

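API reference

For detailed documentation of all ZoteroRetriever features and configurations, see the GitHub page.

For detailed documentation on the Zotero API, see the Zotero API reference.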