DashVector
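DashVector is a fully managed vector database service that supports high-dimensional dense and sparse vectors, real-time insertion, and filtered search. It is built to scale automatically and can adapt to different application requirements.
This notebook shows how to use functionality related to the DashVector vector database.
To use DashVector, you must have an API key. Installation instructions follow.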
Install
%pip install --upgrade --quiet langchain-community dashvector dashscope
We want to use DashScopeEmbeddings, so we also need a DashScope API key.
import getpass
import os
if "DASHVECTOR_API_KEY" not in os.environ:
    os.environ["DASHVECTOR_API_KEY"] = getpass.getpass("DashVector API Key:")
if "DASHSCOPE_API_KEY" not in os.environ:
    os.environ["DASHSCOPE_API_KEY"] = getpass.getpass("DashScope API Key:")
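When the notebook code is run non-interactively (a script or a scheduled job, where `getpass` cannot prompt), it may help to check up front which keys are missing. The helper below is a minimal sketch; `missing_api_keys` and `REQUIRED_KEYS` are hypothetical names for illustration and are not part of `dashvector`, `dashscope`, or LangChain.

```python
import os

# The two environment variables used in this notebook.
REQUIRED_KEYS = ("DASHVECTOR_API_KEY", "DASHSCOPE_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or empty in `env`.

    Hypothetical helper for illustration only. `env` defaults to
    os.environ; pass a plain dict to try it without touching the
    real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Example with a plain dict, so nothing is read from the real environment.
print(missing_api_keys({"DASHVECTOR_API_KEY": "example-value"}))
# prints ['DASHSCOPE_API_KEY']
```

An empty list means both keys are present and the prompts above would be skipped.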
Example
from langchain_community.embeddings.dashscope import DashScopeEmbeddings
from langchain_community.vectorstores import DashVector
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = DashScopeEmbeddings()
We can create a DashVector store from documents.
dashvector = DashVector.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = dashvector.similarity_search(query)
print(docs)
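`print(docs)` dumps the whole result list at once, which is hard to scan when chunks are close to 1000 characters long. A small formatter such as the one below may be easier to read. It is a sketch only: `preview_results` is a hypothetical helper, and it assumes nothing beyond each result exposing `page_content` and `metadata` attributes. Stand-in objects are used so the block runs without a DashVector collection.

```python
from types import SimpleNamespace


def preview_results(results, width=60):
    """Return one summary line per result: rank, metadata, truncated text.

    Hypothetical helper; it relies only on each result having
    `page_content` and `metadata` attributes.
    """
    lines = []
    for rank, doc in enumerate(results, start=1):
        # Collapse newlines and repeated spaces into single spaces.
        text = " ".join(doc.page_content.split())
        if len(text) > width:
            text = text[: width - 3] + "..."
        lines.append(f"{rank}. {doc.metadata} {text}")
    return lines


# Stand-in objects instead of real search results.
sample = [
    SimpleNamespace(page_content="foo", metadata={"key": 0}),
    SimpleNamespace(page_content="bar\nbaz", metadata={"key": 1}),
]
for line in preview_results(sample):
    print(line)
```

With real results, the same call would be `preview_results(docs)`.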
We can add texts with metadata and ids, and search with a metadata filter.
texts = ["foo", "bar", "baz"]
metadatas = [{"key": i} for i in range(len(texts))]
ids = ["0", "1", "2"]
dashvector.add_texts(texts, metadatas=metadatas, ids=ids)
docs = dashvector.similarity_search("foo", filter="key = 2")
print(docs)
[Document(page_content='baz', metadata={'key': 2})]
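The filter above is a plain string expression (`"key = 2"`). When the value comes from a variable, a tiny formatter avoids assembling that string by hand. The sketch below covers only the integer-equality form shown in this notebook; other operators and value types are deliberately left out because their syntax is not shown here. `int_equals_filter` is a hypothetical name, not a LangChain or DashVector function.

```python
def int_equals_filter(field, value):
    """Build a filter string of the form used above, e.g. "key = 2".

    Hypothetical helper: it covers only integer equality, the one form
    shown in this notebook, and rejects anything else.
    """
    if not field.isidentifier():
        raise ValueError(f"unexpected field name: {field!r}")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("only integer values are handled in this sketch")
    return f"{field} = {value}"


print(int_equals_filter("key", 2))  # prints: key = 2
```

The returned string can then be passed as the `filter` argument, for example `dashvector.similarity_search("foo", filter=int_equals_filter("key", 2))`.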
Working with the partition parameter
The partition parameter defaults to `default`. If a partition name that does not yet exist is passed in, the partition is created automatically.
texts = ["foo", "bar", "baz"]
metadatas = [{"key": i} for i in range(len(texts))]
ids = ["0", "1", "2"]
partition = "langchain"
# add texts
dashvector.add_texts(texts, metadatas=metadatas, ids=ids, partition=partition)
# similarity search
query = "What did the president say about Ketanji Brown Jackson"
docs = dashvector.similarity_search(query, partition=partition)
# delete
dashvector.delete(ids=ids, partition=partition)
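The partition rules described above (a `default` partition when none is named, and automatic creation of a missing partition on insert) can be pictured with a toy in-memory model. This is not how DashVector is implemented; `ToyPartitionedStore` is a hypothetical class that mimics only those two rules, and it assumes that each operation sees just the partition it names.

```python
from collections import defaultdict


class ToyPartitionedStore:
    """In-memory stand-in that mimics only the partition rules described above."""

    def __init__(self):
        # A missing partition is created the first time texts are added to it.
        self._partitions = defaultdict(dict)

    def add_texts(self, texts, ids, partition="default"):
        for doc_id, text in zip(ids, texts):
            self._partitions[partition][doc_id] = text

    def delete(self, ids, partition="default"):
        # .get avoids creating a partition as a side effect of deleting.
        bucket = self._partitions.get(partition, {})
        for doc_id in ids:
            bucket.pop(doc_id, None)

    def texts(self, partition="default"):
        # Return a copy so callers cannot mutate the store.
        return dict(self._partitions.get(partition, {}))


store = ToyPartitionedStore()
store.add_texts(["foo", "bar"], ids=["0", "1"])             # goes to "default"
store.add_texts(["baz"], ids=["2"], partition="langchain")  # created on first use
store.delete(ids=["2"], partition="langchain")
print(store.texts())             # {'0': 'foo', '1': 'bar'}
print(store.texts("langchain"))  # {}
```

In this model, deleting from `langchain` leaves the `default` partition untouched, which mirrors why the same `partition` value is passed to `add_texts`, `similarity_search`, and `delete` in the cell above.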