AstraDB
DataStax Astra DB is a serverless, AI-ready database built on Apache Cassandra® and made conveniently available through an easy-to-use JSON API.
Overview
The Astra DB Document Loader returns a list of LangChain Document objects read from an Astra DB collection.
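The loader takes the following arguments:
api_endpoint: Astra DB API endpoint. Looks like https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
token: Astra DB token. Looks like AstraCS:aBcD0123...
collection_name: AstraDB collection name
namespace: (Optional) AstraDB namespace (called keyspace in Astra DB)
filter_criteria: (Optional) Filter used in the query
projection: (Optional) Projection used in the query
limit: (Optional) Maximum number of documents to retrieve
extraction_function: (Optional) A function that converts an AstraDB document into the LangChain page_content string. Defaults to json.dumps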
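The extraction_function argument is easiest to understand with a small example. The sketch below assumes each Astra DB document reaches the function as a Python dict with title and reviewtext fields (the field names come from the movie_reviews example on this page). The helper name review_to_text is hypothetical. It runs on a sample dict only and does not contact Astra DB.

```python
import json


def review_to_text(doc: dict) -> str:
    """Hypothetical extraction function: build page_content from two fields.

    Assumes the document is a dict; missing fields fall back to "".
    """
    title = doc.get("title", "")
    review = doc.get("reviewtext", "")
    return f"{title}\n\n{review}".strip()


# Sample document shaped like the ones in the movie_reviews collection.
sample = {
    "_id": "659bdffa16cbc4586b11a423",
    "title": "Dangerous Men",
    "reviewtext": "What was the rush?",
}

# Default behaviour described above: the whole document as a JSON string.
default_text = json.dumps(sample)

# Custom behaviour: only the human-readable fields.
custom_text = review_to_text(sample)
print(custom_text)
```

A function like this could then be passed to the loader as extraction_function=review_to_text, so that page_content holds readable text instead of a JSON dump.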
The loader sets the following metadata for the documents it reads:
metadata={
"namespace": "...",
"api_endpoint": "...",
"collection": "..."
}
Setup
!pip install "langchain-astradb>=0.6,<0.7"
Load documents with the Document Loader
from langchain_astradb import AstraDBLoader
API Reference: AstraDBLoader
from getpass import getpass
ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
ASTRA_DB_API_ENDPOINT = https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN = ········
loader = AstraDBLoader(
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
    collection_name="movie_reviews",
    projection={"title": 1, "reviewtext": 1},
    limit=10,
)
docs = loader.load()
docs[0]
Document(metadata={'namespace': 'default_keyspace', 'api_endpoint': 'https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com', 'collection': 'movie_reviews'}, page_content='{"_id": "659bdffa16cbc4586b11a423", "title": "Dangerous Men", "reviewtext": "\\"Dangerous Men,\\" the picture\'s production notes inform, took 26 years to reach the big screen. After having seen it, I wonder: What was the rush?"}')
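Because the default extraction function is json.dumps, the page_content above is a JSON string and not plain text. One way to get the individual fields back is to parse it with json.loads. This is a minimal sketch that uses a shortened stand-in string in place of docs[0].page_content, so it runs without a database connection. The review text is abbreviated for illustration.

```python
import json

# Stand-in for docs[0].page_content (review text shortened for illustration).
page_content = (
    '{"_id": "659bdffa16cbc4586b11a423", '
    '"title": "Dangerous Men", '
    '"reviewtext": "What was the rush?"}'
)

# Parse the JSON string back into a dict of the projected fields.
record = json.loads(page_content)

title = record["title"]
review = record.get("reviewtext", "")  # tolerate documents without a review
print(f"{title}: {review}")
```

If only the text fields are needed downstream, supplying a custom extraction_function when constructing the loader avoids this parsing step.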