FastEmbed by Qdrant

FastEmbed by Qdrant is a lightweight, fast Python library built for embedding generation. Its main features are:

Quantized model weights

ONNX Runtime, with no PyTorch dependency

CPU-first design

Data parallelism for encoding large datasets

Dependencies

To use FastEmbed with LangChain, install the fastembed Python package.

%pip install --upgrade --quiet fastembed

from langchain_community.embeddings.fastembed import FastEmbedEmbeddings

model_name: str (default: "BAAI/bge-small-en-v1.5")

Name of the FastEmbed model to use. The FastEmbed documentation lists the supported models.
max_length: int (default: 512)

The maximum number of tokens. Behavior is unknown for values greater than 512.
cache_dir: Optional[str] (default: None)

The path to the cache directory. Defaults to local_cache in the parent directory.
threads: Optional[int] (default: None)

The number of threads a single onnxruntime session can use.
doc_embed_type: Literal["default", "passage"] (default: "default")

"default": Uses FastEmbed's default embedding method.

"passage": Prefixes the text with "passage" before embedding.
batch_size: int (default: 256)

Batch size for encoding. Higher values use more memory but are faster.
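parallel: Optional[int] (default: None)

If greater than 1, data-parallel encoding is used; this is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading is used instead.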
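The parameters above can be passed as keyword arguments when constructing FastEmbedEmbeddings. The following is a minimal sketch rather than a recommended configuration: choose_parallel, build_fastembed_kwargs, make_embeddings, and the 100,000-text threshold are hypothetical names and values introduced only for illustration. The import is deferred into make_embeddings so that the helpers can be inspected without fastembed installed.

```python
from typing import Any, Dict, Optional


def choose_parallel(num_texts: int, large_threshold: int = 100_000) -> Optional[int]:
    """Hypothetical heuristic for the `parallel` parameter.

    Returns 0 (use all available cores) for large offline jobs and
    None (no data parallelism, default onnxruntime threading) otherwise.
    The threshold is an arbitrary illustrative value, not a library default.
    """
    return 0 if num_texts >= large_threshold else None


def build_fastembed_kwargs(num_texts: int) -> Dict[str, Any]:
    """Collect the documented parameters into keyword arguments."""
    return {
        "model_name": "BAAI/bge-small-en-v1.5",  # documented default
        "max_length": 512,  # documented default; larger values have unknown behavior
        "doc_embed_type": "passage",  # one of "default" or "passage"
        "batch_size": 256,  # documented default
        "parallel": choose_parallel(num_texts),
    }


def make_embeddings(num_texts: int):
    """Instantiate FastEmbedEmbeddings (requires fastembed and langchain_community)."""
    from langchain_community.embeddings.fastembed import FastEmbedEmbeddings

    return FastEmbedEmbeddings(**build_fastembed_kwargs(num_texts))
```

Calling make_embeddings(500_000) would, under these assumptions, request data-parallel encoding on all cores, while make_embeddings(100) would leave parallel as None.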

embeddings = FastEmbedEmbeddings()

document_embeddings = embeddings.embed_documents(
    ["This is a document", "This is some other document"]
)

query_embeddings = embeddings.embed_query("This is a query")
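embed_documents returns one vector (a list of floats) per input text, and embed_query returns a single vector. One common way to use them together is to rank documents by cosine similarity to the query. The sketch below uses only the standard library; cosine_similarity and rank_documents are hypothetical helpers, and the toy three-dimensional vectors stand in for real embeddings so that the block runs without downloading a model.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: List[float], doc_vecs: List[List[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration).
toy_query = [1.0, 0.0, 0.0]
toy_docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
ranking = rank_documents(toy_query, toy_docs)
```

With real output from the calls above, the same helper could be invoked as rank_documents(query_embeddings, document_embeddings).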