Skip to main content
Open In ColabOpen on GitHub

Neo4J

Neo4j 是由 Neo4j, Inc 开发的图数据库管理系统。

数据元素 Neo4j 存储的是节点、连接它们的边,以及节点和边的属性。其开发者将其描述为一个符合 ACID 的事务型数据库,具有原生的图存储与处理能力。Neo4j 以非开源的“社区版”形式提供,采用修改版的 GNU 通用公共许可证授权,而在线备份和高可用性扩展功能则采用闭源的商业许可证授权。Neo 还在闭源的商业条款下授权 Neo4j 使用这些扩展功能。

此笔记本展示了如何使用大语言模型(LLM)为图数据库提供自然语言接口,您可以使用 Cypher 查询语言对其进行查询。

Cypher 是一种声明式的图查询语言,可在属性图中实现富有表现力且高效的数据查询。

设置

您需要有一个正在运行的 Neo4j 实例。一种选择是在 Neo4j 的 Aura 云服务中创建一个免费的 Neo4j 数据库实例。您也可以使用Neo4j Desktop 应用程序在本地运行数据库,或通过运行 Docker 容器来启动。 您可以通过执行以下脚本来运行本地 Docker 容器:

docker run \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-d \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS=\[\"apoc\"\] \
neo4j:latest

如果您使用的是 Docker 容器,则需要等待几秒钟,以便数据库启动。

from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
from langchain_openai import ChatOpenAI
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")

本指南默认使用 OpenAI 模型。

import getpass
import os

if "OPENAI_API_KEY" not in os.environ:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

填充数据库

假设您的数据库为空,您可以使用Cypher查询语言来填充它。以下Cypher语句是幂等的,这意味着无论您运行一次还是多次,数据库中的信息都将保持不变。

graph.query(
"""
MERGE (m:Movie {name:"Top Gun", runtime: 120})
WITH m
UNWIND ["Tom Cruise", "Val Kilmer", "Anthony Edwards", "Meg Ryan"] AS actor
MERGE (a:Actor {name:actor})
MERGE (a)-[:ACTED_IN]->(m)
"""
)
[]

刷新图谱模式信息

如果数据库的模式发生变化,您可以刷新生成Cypher语句所需的模式信息。

graph.refresh_schema()
print(graph.schema)
Node properties:
Movie {runtime: INTEGER, name: STRING}
Actor {name: STRING}
Relationship properties:

The relationships:
(:Actor)-[:ACTED_IN]->(:Movie)

增强的模式信息

选择增强型模式版本后,系统将自动扫描数据库中的示例值,并计算一些分布统计信息。例如,如果某个节点属性的唯一值少于10个,我们将在模式中返回所有可能的值;否则,每个节点和关系属性仅返回一个示例值。

enhanced_graph = Neo4jGraph(
url="bolt://localhost:7687",
username="neo4j",
password="password",
enhanced_schema=True,
)
print(enhanced_graph.schema)
Node properties:
- **Movie**
- `runtime`: INTEGER Min: 120, Max: 120
- `name`: STRING Available options: ['Top Gun']
- **Actor**
- `name`: STRING Available options: ['Tom Cruise', 'Val Kilmer', 'Anthony Edwards', 'Meg Ryan']
Relationship properties:

The relationships:
(:Actor)-[:ACTED_IN]->(:Movie)

查询图谱

我们现在可以使用图谱Cypher问答链来向图谱提问

chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0), graph=graph, verbose=True, allow_dangerous_requests=True
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.'}

限制结果数量

您可以通过使用 top_k 参数来限制 Cypher QA 链返回的结果数量。 默认值为 10。

chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
top_k=2,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer played in Top Gun.'}

返回中间结果

您可以使用 return_intermediate_steps 参数从Cypher QA链返回中间步骤

chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
return_intermediate_steps=True,
allow_dangerous_requests=True,
)
result = chain.invoke({"query": "Who played in Top Gun?"})
print(f"Intermediate steps: {result['intermediate_steps']}")
print(f"Final answer: {result['result']}")


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
Intermediate steps: [{'query': "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)\nWHERE m.name = 'Top Gun'\nRETURN a.name"}, {'context': [{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]}]
Final answer: Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.

返回直接结果

您可以使用 return_direct 参数从Cypher QA链直接返回结果

chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
return_direct=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': [{'a.name': 'Tom Cruise'},
{'a.name': 'Val Kilmer'},
{'a.name': 'Anthony Edwards'},
{'a.name': 'Meg Ryan'}]}

在Cypher生成提示中添加示例

您可以定义希望LLM为特定问题生成的Cypher语句

from langchain_core.prompts.prompt import PromptTemplate

CYPHER_GENERATION_TEMPLATE = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
Examples: Here are a few examples of generated Cypher statements for particular questions:
# How many people played in Top Gun?
MATCH (m:Movie {{name:"Top Gun"}})<-[:ACTED_IN]-()
RETURN count(*) AS numberOfActors

The question is:
{question}"""

CYPHER_GENERATION_PROMPT = PromptTemplate(
input_variables=["schema", "question"], template=CYPHER_GENERATION_TEMPLATE
)

chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
cypher_prompt=CYPHER_GENERATION_PROMPT,
allow_dangerous_requests=True,
)
API 参考:提示模板
chain.invoke({"query": "How many people played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (m:Movie {name:"Top Gun"})<-[:ACTED_IN]-()
RETURN count(*) AS numberOfActors
Full Context:
[{'numberOfActors': 4}]

> Finished chain.
{'query': 'How many people played in Top Gun?',
'result': 'There were 4 actors in Top Gun.'}

为 Cypher 和答案生成使用独立的 LLM

您可以使用 cypher_llmqa_llm 参数来定义不同的大语言模型(llms)

chain = GraphCypherQAChain.from_llm(
graph=graph,
cypher_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
qa_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo-16k"),
verbose=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.'}

忽略指定的节点和关系类型

您可以使用 include_typesexclude_types 在生成 Cypher 语句时忽略图结构的某些部分。

chain = GraphCypherQAChain.from_llm(
graph=graph,
cypher_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
qa_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo-16k"),
verbose=True,
exclude_types=["Movie"],
allow_dangerous_requests=True,
)
# Inspect graph schema
print(chain.graph_schema)
Node properties are the following:
Actor {name: STRING}
Relationship properties are the following:

The relationships are the following:

验证生成的Cypher语句

您可以使用 validate_cypher 参数来验证和纠正生成的 Cypher 语句中的关系方向

chain = GraphCypherQAChain.from_llm(
llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
graph=graph,
verbose=True,
validate_cypher=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.'}

将数据库结果中的上下文作为工具/函数输出提供

您可以使用 use_function_response 参数将数据库结果中的上下文作为工具/函数输出传递给大语言模型(LLM)。此方法通过让LLM更紧密地遵循所提供的上下文,提高回答的准确性和相关性。 您需要使用支持原生函数调用的LLM才能使用此功能

chain = GraphCypherQAChain.from_llm(
llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
graph=graph,
verbose=True,
use_function_response=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': 'The main actors in Top Gun are Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan.'}

使用函数响应功能时,可以通过提供 function_response_system 来提供自定义系统消息,以指示模型如何生成答案。

请注意,当使用 use_function_response 时,qa_prompt 将不起任何作用

chain = GraphCypherQAChain.from_llm(
llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
graph=graph,
verbose=True,
use_function_response=True,
function_response_system="Respond as a pirate!",
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})


> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name
Full Context:
[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]

> Finished chain.
{'query': 'Who played in Top Gun?',
'result': "Arrr matey! In the film Top Gun, ye be seein' Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan sailin' the high seas of the sky! Aye, they be a fine crew of actors, they be!"}