Graph RAG

This guide provides an introduction to Graph RAG. For detailed documentation of all supported features and configurations, refer to the Graph RAG Project Page.

Overview

The GraphRetriever from the langchain-graph-retriever package provides a LangChain retriever that combines unstructured similarity search on vectors with structured traversal of metadata properties. This enables graph-based retrieval over an existing vector store.

Integration details

Retriever | Source | PyPI Package | Latest | Project Page
GraphRetriever | github.com/datastax/graph-rag | langchain-graph-retriever | PyPI - Version | Graph RAG

Benefits

  • Link based on existing metadata: Use existing metadata fields without additional processing. Retrieve more from an existing vector store!

  • Change links on demand: Edges can be specified on-the-fly, allowing different relationships to be traversed based on the question.

  • Pluggable traversal strategies: Use built-in traversal strategies like Eager or MMR, or define custom logic to select which nodes to explore.

  • Broad compatibility: Adapters are available for a variety of vector stores, and support for additional stores is easy to add.
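To make the idea of linking on existing metadata and changing edges on demand more concrete, here is a minimal, library-free sketch. It is not the GraphRetriever implementation: the `DOCS` dictionary, the `traverse` function, and the sample animals are hypothetical, the seed documents are passed in directly rather than found by similarity search, and an edge is simplified to a single metadata field name shared by two documents.

```python
from collections import deque

# Hypothetical toy corpus: metadata only. A real vector store would also
# hold the document text and its embedding.
DOCS = {
    "heron": {"habitat": "wetlands", "type": "bird"},
    "frog": {"habitat": "wetlands", "type": "amphibian"},
    "newt": {"habitat": "forest", "type": "amphibian"},
    "owl": {"habitat": "forest", "type": "bird"},
}


def traverse(start_ids, edges, max_depth):
    """Breadth-first walk that links documents sharing a value on any edge field.

    start_ids stands in for the results of an initial similarity search.
    edges is a list of metadata field names (a simplification for this sketch).
    """
    seen = list(start_ids)  # a list keeps discovery order
    queue = deque((doc_id, 0) for doc_id in start_ids)
    while queue:
        doc_id, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand nodes at the depth limit
        meta = DOCS[doc_id]
        for other_id, other_meta in DOCS.items():
            if other_id in seen:
                continue
            # Two documents are adjacent if they agree on at least one edge field.
            if any(f in meta and meta[f] == other_meta.get(f) for f in edges):
                seen.append(other_id)
                queue.append((other_id, depth + 1))
    return seen


# The same stored metadata yields different graphs depending on the edges chosen.
by_habitat = traverse(["heron"], edges=["habitat"], max_depth=1)
by_type = traverse(["heron"], edges=["type"], max_depth=1)
both = traverse(["heron"], edges=["habitat", "type"], max_depth=2)
```

Nothing is precomputed or stored as a graph here: adjacency is derived from the metadata at query time, which is why the edge definition can differ from one question to the next.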

Setup

Installation

This retriever lives in the langchain-graph-retriever package.

pip install -qU langchain-graph-retriever

Instantiation

The following examples show how to perform graph traversal over sample Documents about animals.

Prerequisites

  1. Ensure you have Python 3.10+ installed.

  2. Install the following package that provides sample data.

    pip install -qU graph_rag_example_helpers
  3. Download the test documents:

    from graph_rag_example_helpers.datasets.animals import fetch_documents
    animals = fetch_documents()
  4. Select an embeddings model. The available providers are listed below; the code that follows is for OpenAI:

    • OpenAI
    • Azure
    • Google
    • AWS
    • HuggingFace
    • Ollama
    • Cohere
    • MistralAI
    • Nomic
    • NVIDIA
    • Voyage AI
    • IBM watsonx
    • Fake

    pip install -qU langchain-openai

    import getpass
    import os

    # Prompt for the API key only if it is not already set in the environment.
    if not os.environ.get("OPENAI_API_KEY"):
        os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

    from langchain_openai import OpenAIEmbeddings

    embeddings = OpenAIEmbeddings(model="text-embedding-3-large")