MLX Local Pipelines

MLX models can be run locally through the MLXPipeline class.

The MLX Community hosts over 150 models, all open source and publicly available on the Hugging Face Model Hub, an online platform where people can easily collaborate and build ML together.

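These models can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the MLXPipeline class. For more information on mlx, see the examples repo notebook.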

To use it, you should have the mlx-lm Python package installed, as well as transformers. You can also install huggingface_hub.

%pip install --upgrade --quiet  mlx-lm transformers huggingface_hub
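Before loading a model, it can help to confirm that the packages above are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name missing_packages and the REQUIRED/OPTIONAL groupings are illustrative assumptions, not part of LangChain or mlx-lm. Note that the mlx-lm distribution is imported as mlx_lm.

```python
from importlib.util import find_spec

# Import names (not pip names): the pip package "mlx-lm" is imported as "mlx_lm".
REQUIRED = ("mlx_lm", "transformers")
OPTIONAL = ("huggingface_hub",)


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found.

    Hypothetical helper for illustration; it only checks that a module
    spec exists and does not import the package or verify its version.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing_required = missing_packages(REQUIRED)
    missing_optional = missing_packages(OPTIONAL)
    if missing_required:
        print("Missing required packages:", ", ".join(missing_required))
    if missing_optional:
        print("Missing optional packages:", ", ".join(missing_optional))
    if not missing_required and not missing_optional:
        print("All packages found.")
```

If anything is reported missing, rerunning the install command above should resolve it.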

Model Loading

Models can be loaded by specifying the model parameters using the from_model_id method.

from langchain_community.llms.mlx_pipeline import MLXPipeline

pipe = MLXPipeline.from_model_id(
    "mlx-community/quantized-gemma-2b-it",
    pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)

API Reference: MLXPipeline

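They can also be loaded by passing in an existing model and tokenizer directly, for example the pair returned by load from mlx_lm: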

from mlx_lm import load

model, tokenizer = load("mlx-community/quantized-gemma-2b-it")
pipe = MLXPipeline(model=model, tokenizer=tokenizer)

Create Chain

With the model loaded into memory, you can compose it with a prompt to form a chain.

from langchain_core.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

chain = prompt | pipe

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))

API Reference: PromptTemplate
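To make the data flow of prompt | pipe easier to follow, here is a simplified, standard-library-only sketch of the idea: the first step fills the template from the input dictionary, and the second step receives the resulting string. The Step class and fake_llm are hypothetical stand-ins written for illustration; they are not LangChain's implementation, and fake_llm does not run any MLX model.

```python
class Step:
    """Hypothetical stand-in for a composable pipeline step.

    Wraps a plain function and supports `a | b`, which yields a new Step
    that runs `a` and feeds its output to `b`.
    """

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose left to right: the output of self becomes the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.fn(value)


TEMPLATE = """Question: {question}

Answer: Let's think step by step."""

# Step 1: render the template from a dict of variables.
prompt = Step(lambda variables: TEMPLATE.format(**variables))

# Step 2: a fake model that only reports how much text it received.
fake_llm = Step(lambda text: f"[model received {len(text)} characters]")

chain = prompt | fake_llm

inputs = {"question": "What is electroencephalography?"}
rendered = prompt.invoke(inputs)
result = chain.invoke(inputs)

if __name__ == "__main__":
    print(rendered)
    print(result)
```

In the real chain, the second step would be the MLXPipeline instance, which takes the rendered prompt string and returns generated text.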

Related

LLM conceptual guide
LLM how-to guides
