Neo4j Agent Memory

Neo4j Labs · Graph-native memory for AI agents

Your agents forget everything. Fix that.

Production-grade memory for single- and multi-agent systems. Three memory layers — conversations, entities, reasoning — in one graph. Available as a hosted service (zero infra, just an API key) or run against your own Neo4j. Python + TypeScript SDKs that interoperate.

Experimental · Community Supported

Pick your install

Two paths, each gets you started in ~30 seconds:

Hosted service (recommended for new projects)
Sign up at memory.neo4jlabs.com, get an API key, install one of the SDKs. No Neo4j to run.

# Python
pip install neo4j-agent-memory
export MEMORY_API_KEY=nams_...

# TypeScript
npm install @neo4j-labs/agent-memory
export MEMORY_API_KEY=nams_...

Self-hosted Neo4j
Bring your own Neo4j (Aura, Docker, Desktop). Full schema control, write-Cypher access, geospatial features.

pip install neo4j-agent-memory
# Point at bolt://your-neo4j:7687
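
If one codebase needs to run against either path, one option is to choose the backend from the environment at startup. The sketch below is illustrative only: `MEMORY_API_KEY` comes from the install steps above, while `NEO4J_URI`, `BackendChoice`, and `choose_backend` are hypothetical names and not part of either SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendChoice:
    """Hypothetical description of which memory backend to use."""
    kind: str                        # "hosted" or "self-hosted"
    api_key: Optional[str] = None
    bolt_uri: Optional[str] = None


def choose_backend(env: Mapping[str, str] = os.environ) -> BackendChoice:
    """Prefer the hosted service when an API key is set, else fall back to Bolt."""
    api_key = env.get("MEMORY_API_KEY")
    if api_key:
        return BackendChoice(kind="hosted", api_key=api_key)
    # NEO4J_URI is an assumed variable name; the default targets a local Neo4j.
    bolt_uri = env.get("NEO4J_URI", "bolt://localhost:7687")
    return BackendChoice(kind="self-hosted", bolt_uri=bolt_uri)
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests.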

Three Memory Types, One Graph

Most agent memory systems give you a flat context window or a simple vector store. Neo4j Agent Memory gives your agent three distinct memory layers — all connected in a single knowledge graph.

[Figure: Three memory types]

| Memory Type | What It Stores | Key Features |
|---|---|---|
| Short-Term | Conversations & experiences | Semantic message search, session scoping, metadata filters, LLM-powered summaries |
| Long-Term | Facts, entities & preferences | POLE+O entity classification, entity deduplication, temporal fact validity, geospatial queries |
| Reasoning | Tool usage & reasoning traces | Trace similarity search, tool call statistics, message-linked traces, streaming trace recording |

async with MemoryClient(settings) as memory:
    # Short-term: store conversations
    await memory.short_term.add_message(session_id, "user", "Find Italian restaurants near me")

    # Long-term: build a knowledge graph
    await memory.long_term.add_entity("La Trattoria", "ORGANIZATION", subtype="COMPANY")

    # Reasoning: record how the agent solved problems
    trace = await memory.reasoning.start_trace(session_id, task="Restaurant search")
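    # Tool calls attach to a reasoning step; add_step and its arguments are assumed here
    step = await memory.reasoning.add_step(trace.id, thought="Search for Italian restaurants", action="search_api")
    results = ["La Trattoria"]  # placeholder for the tool's real output
    await memory.reasoning.record_tool_call(step.id, "search_api", {"query": "Italian"}, results)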
    await memory.reasoning.complete_trace(trace.id, outcome="Recommended La Trattoria", success=True)

    # Get unified context for LLM prompts
    context = await memory.get_context("restaurant recommendation", session_id=session_id)

Multi-Agent Ready

Multiple agents — even in different languages — can read and write the same memory graph. Conversations stay private per session; entities, preferences, and facts are shared.

| Capability | How it works |
|---|---|
| Shared knowledge graph | Every agent in the system reads the same Entity, Preference, and Fact nodes. When Agent A learns something new, Agent B sees it on the next query, with no manual sync. |
| Per-session isolation | session_id (or user_identifier= for multi-tenant) keeps each agent’s conversations and reasoning traces separate. One agent never reads another’s chat history by accident. |
| Cross-language by design | A Python agent (PydanticAI, LangChain, CrewAI, Google ADK, AWS Strands) and a TypeScript agent (Vercel AI SDK, LangChain JS, Mastra, MCP) on the same NAMS endpoint share memory transparently. Both SDKs implement the same NAMS REST contract. |
| Multi-tenant scoping | Pass user_identifier= on any operation to scope writes and reads per end-user. One backend, many tenants; see Multi-tenant memory. |

See Cross-Agent Memory Sharing for the conceptual model and NAMS Quickstart for the deployment shape.
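
The sharing model described above can be pictured with a toy in-memory stand-in. This is a conceptual sketch only: `ToyMemoryGraph` and its methods are hypothetical and do not mirror the SDK or the Neo4j schema. It shows that entities live in one shared pool while messages are looked up by `session_id`.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ToyMemoryGraph:
    """Illustrative stand-in for a shared memory graph (not the SDK)."""

    def __init__(self) -> None:
        self.entities: Set[str] = set()  # shared by every agent
        self._messages: Dict[str, List[str]] = defaultdict(list)  # keyed by session_id

    def add_entity(self, name: str) -> None:
        self.entities.add(name)

    def add_message(self, session_id: str, text: str) -> None:
        self._messages[session_id].append(text)

    def messages(self, session_id: str) -> List[str]:
        # Reads are scoped to a single session, so other sessions stay invisible.
        return list(self._messages.get(session_id, []))


graph = ToyMemoryGraph()

# Agent A talks to a user and learns about an entity.
graph.add_message("agent-a-session", "Find Italian restaurants near me")
graph.add_entity("La Trattoria")

# Agent B sees the shared entity but none of agent A's conversation.
visible_to_b = "La Trattoria" in graph.entities
b_history = graph.messages("agent-b-session")
```

In the real system the same split would come from how nodes are linked and queried rather than from separate Python containers.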

What Makes This Different

| Capability | Why It Matters |
|---|---|
| Graph-Native, Not Bolted On | Built on Neo4j from the ground up. Entities, messages, traces, and preferences are all nodes and relationships, not rows in a table or chunks in a vector store. Query across memory types with a single traversal. |
| Vector + Graph + Spatial in One | Semantic similarity search, graph traversal, and geospatial queries all in one database. No separate vector DB, no Redis, no Postgres. Find entities by meaning, by relationship, or by location. |
| Multi-Stage Entity Extraction | Combine spaCy, GLiNER2, GLiREL, and LLM extractors in a configurable pipeline. 8 domain schemas (podcast, news, medical, legal…). Streaming extraction for 100K+ token documents. |
| Three Memory Layers, Connected | Short-term conversations inform long-term entity extraction. Reasoning traces link back to triggering messages. The agent understands what happened, what it knows, and how it solved things. |
| Wikipedia Enrichment & Geocoding | Entities are automatically enriched with Wikipedia descriptions, images, and Wikidata IDs. Location entities get coordinates for geospatial queries. All in the background. |
| Framework Agnostic | Python: LangChain, PydanticAI, LlamaIndex, CrewAI, OpenAI Agents, AWS Strands, Google ADK, Microsoft Agent Framework, AgentCore. TypeScript: Vercel AI SDK, LangChain JS, Mastra, AWS Strands, MCP tools. Or use the client directly. |
| Hosted or Self-Hosted | Use the NAMS hosted service for zero-infra deployments, or run the same SDK against your own Neo4j (Aura, Desktop, Docker) when you need write-Cypher, geospatial, or air-gapped operation. See Bolt vs NAMS. |
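
To make "vector + graph + spatial in one" concrete, here is a sketch of a single Cypher query that combines all three on a self-hosted Neo4j. `db.index.vector.queryNodes` and `point.distance` are standard Neo4j 5 features; the index name `entity_embeddings`, the `location` and `name` properties, and the `RELATED_TO` relationship are assumptions for illustration, not the library's actual schema.

```python
from typing import Any, Dict, Sequence

# Assumed schema: a vector index named `entity_embeddings`, a `location`
# point property, and RELATED_TO relationships. Adjust to the real graph.
NEARBY_SIMILAR_QUERY = """
CALL db.index.vector.queryNodes('entity_embeddings', $k, $embedding)
YIELD node, score
WHERE node.location IS NOT NULL
  AND point.distance(node.location, point({latitude: $lat, longitude: $lon})) <= $radius_m
OPTIONAL MATCH (node)-[:RELATED_TO]-(neighbor)
RETURN node.name AS name, score, collect(DISTINCT neighbor.name) AS related
ORDER BY score DESC
"""


def nearby_similar_params(
    embedding: Sequence[float],
    lat: float,
    lon: float,
    radius_m: float = 2000.0,
    k: int = 10,
) -> Dict[str, Any]:
    """Validate inputs and build the parameter map for NEARBY_SIMILAR_QUERY."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("lat/lon out of range")
    return {
        "embedding": list(embedding),
        "lat": lat,
        "lon": lon,
        "radius_m": radius_m,
        "k": k,
    }


# With the official Neo4j Python driver this could be run as (not executed here):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password")) as driver:
#       records, _, _ = driver.execute_query(
#           NEARBY_SIMILAR_QUERY,
#           parameters_=nearby_similar_params(query_vector, 41.9, 12.5),
#       )
```

Because the distance filter runs after the top-`k` vector lookup, the query can return fewer than `k` rows; raising `k` widens the candidate pool.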

Demos

Lenny’s Podcast Memory Explorer (Flagship)

299 podcast episodes transformed into a searchable knowledge graph with a full-stack AI chat agent, interactive graph visualization, geospatial map view, and Wikipedia-enriched entity cards.

| Episodes loaded | Agent tools | Memory types |
|---|---|---|
| 299 | 19 | 3 |

  • Interactive graph visualization (Neo4j NVL)

  • Geospatial map with marker clusters & heatmaps

  • Wikipedia-enriched entity cards with images

  • SSE streaming with tool call visualization

  • Automatic preference learning from conversation

  • FastAPI + PydanticAI + Next.js + Chakra UI

Full-Stack Chat Agent

A news research assistant that uses all three memory types with an interactive memory graph visualization. Double-click nodes to expand neighbors and explore the knowledge graph.

  • Memory graph visualization with expansion

  • Conversation-scoped graph filtering

  • Dual Neo4j databases (memory + news)

Domain Schema Examples

8 specialized extraction schemas for different domains. See how GLiNER2 domain schemas improve entity extraction accuracy with type descriptions and confidence tuning.

Schemas: Podcast · News · Scientific · Business · Medical · Legal · Entertainment · POLE+O

Quick Start

# Quote the extra so shells such as zsh do not expand the brackets
pip install "neo4j-agent-memory[openai]"

import asyncio
from pydantic import SecretStr
from neo4j_agent_memory import MemoryClient, MemorySettings

async def main():
    settings = MemorySettings(
        neo4j={"uri": "bolt://localhost:7687", "password": SecretStr("password")}
    )

    async with MemoryClient(settings) as memory:
        # Store a conversation
        await memory.short_term.add_message("user-123", "user", "I love Italian food!")

        # Record a long-term preference explicitly
        await memory.long_term.add_preference("food", "Loves Italian cuisine", confidence=0.9)

        # Get combined context for your LLM
        context = await memory.get_context("restaurant recommendation", session_id="user-123")
        print(context)

asyncio.run(main())
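
Once `get_context` has returned, the result still has to reach the model. A minimal sketch, assuming the context can be treated as plain text (the Quick Start prints it directly); `build_system_prompt` and the section heading it adds are hypothetical helpers, not part of the SDK.

```python
def build_system_prompt(context: str, instructions: str = "You are a helpful assistant.") -> str:
    """Append retrieved memory to the base instructions, skipping empty context."""
    context = context.strip()
    if not context:
        # No memory retrieved yet: send the instructions unchanged.
        return instructions
    return f"{instructions}\n\nRelevant memory:\n{context}"


prompt = build_system_prompt("Preference (food): Loves Italian cuisine")
```

Keeping the memory in a clearly delimited section makes it easier to see, when debugging, which parts of a response came from stored context.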

Entity Extraction Pipeline

[Figure: Multi-stage extraction pipeline: spaCy → GLiNER2 → GLiREL → LLM → Knowledge Graph]

Combine extractors in a configurable pipeline. 5 merge strategies. 8 domain schemas. Streaming extraction for 100K+ token documents.

| Stage | Purpose | Best for |
|---|---|---|
| spaCy | Fast rule-based & statistical NER | High throughput, no API cost |
| GLiNER2 | Zero-shot NER with domain schemas | Accuracy without LLM cost |
| GLiREL | Relation extraction between entities | Graph edge extraction |
| LLM fallback | High-accuracy extraction via API | Complex or ambiguous text |
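
The idea of running several extractors and merging their output can be sketched in a few lines. Everything here is hypothetical: the `Entity` shape, the stub stages, and the "highest confidence wins" rule are illustrative assumptions and are not claimed to match the library's pipeline classes or any of its merge strategies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    confidence: float
    source: str  # which stage produced it


Extractor = Callable[[str], List[Entity]]


def run_pipeline(text: str, stages: Iterable[Extractor]) -> List[Entity]:
    """Run every stage and keep the highest-confidence entity per (text, label)."""
    best: Dict[Tuple[str, str], Entity] = {}
    for stage in stages:
        for ent in stage(text):
            key = (ent.text.lower(), ent.label)
            if key not in best or ent.confidence > best[key].confidence:
                best[key] = ent
    return sorted(best.values(), key=lambda e: e.confidence, reverse=True)


# Stub stages standing in for real extractors (a fast pass and a careful pass).
def fast_stage(text: str) -> List[Entity]:
    if "La Trattoria" in text:
        return [Entity("La Trattoria", "ORGANIZATION", 0.6, "fast")]
    return []


def careful_stage(text: str) -> List[Entity]:
    found = []
    if "La Trattoria" in text:
        found.append(Entity("la trattoria", "ORGANIZATION", 0.9, "careful"))
    if "Rome" in text:
        found.append(Entity("Rome", "LOCATION", 0.8, "careful"))
    return found


merged = run_pipeline("We ate at La Trattoria in Rome.", [fast_stage, careful_stage])
```

Here the two mentions of the restaurant collapse into one entity, and the higher-confidence result from the later stage replaces the earlier one.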

Framework Integrations

Python (neo4j-agent-memory)

| Framework | Integration | Guide |
|---|---|---|
| LangChain | Memory & retriever | LangChain guide |
| PydanticAI | Deps & trace recording | PydanticAI guide |
| LlamaIndex | Memory nodes | LlamaIndex guide |
| CrewAI | Crew memory | CrewAI guide |
| OpenAI Agents | Session memory | OpenAI Agents guide |
| AWS Strands | Agent tools | Strands guide |
| Google ADK | MemoryService | Google Cloud guide |
| Microsoft Agent Framework | ContextProvider + GDS | Microsoft Agent guide |
| AWS AgentCore | HybridMemoryProvider | AgentCore guide |

TypeScript (@neo4j-labs/agent-memory)

| Framework | Integration | Guide |
|---|---|---|
| Vercel AI SDK | Middleware (3-tier context injection + auto-persist) | Vercel AI guide |
| LangChain JS | Neo4jChatMessageHistory, Neo4jEntityRetriever | LangChain JS guide |
| Mastra | Neo4jMastraMemory provider | Mastra guide |
| AWS Strands (JS) | Session storage + conversation manager | Strands guide |
| MCP tools | 12 standard tools + dispatcher (Node, Bun, Deno, Workers, Edge) | MCP guide |

Architecture

[Figure: Architecture overview: MemoryClient with three memory layers backed by Neo4j]

Documentation

| Section | Purpose | Start here |
|---|---|---|
| Tutorials | Learn by building complete examples | Build your first memory-enabled agent |
| How-To Guides | Practical recipes for common tasks | Configure entity extraction |
| Reference | Technical specifications and API docs | Configuration reference |
| Explanation | Concepts and architecture | Understanding memory types |

| Guide | Description |
|---|---|
| Build Your First Agent | Installation, setup, and your first memory-enabled agent |
| Understanding Memory Types | Deep dive into short-term, long-term, and reasoning memory |
| Configure Entity Extraction | Domain schemas, multi-stage pipelines, and custom extractors |
| Configuration Reference | Complete configuration options and environment variables |
| Framework Integrations | Using with PydanticAI, LangChain, LlamaIndex, and CrewAI |

Requirements

| If you use… | You need… |
|---|---|
| NAMS (hosted) | Python 3.10+ or Node.js 20+. An API key from memory.neo4jlabs.com. No Neo4j to run. |
| Self-hosted (Bolt) | Python 3.10+. Neo4j 5.20+ (Aura, Desktop, or Docker; APOC plugin recommended). Optional OpenAI/Anthropic/Bedrock/Vertex key for client-side embeddings and LLM extraction. |