Performance recommendations

Always specify the target database

Specify the target database on all queries, either with the ExecuteQueryWithDatabase() configuration callback in ExecuteQuery() or with the DatabaseName configuration parameter when creating new sessions. If no database is provided, the driver has to send an extra request to the server to figure out what the default database is. The overhead is minimal for a single query, but becomes significant over hundreds of queries.

Good practices

result, err := neo4j.ExecuteQuery(ctx, driver, "<QUERY>", nil,
    neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

session := driver.NewSession(ctx, neo4j.SessionConfig{
    DatabaseName: "<database-name>",
})

Bad practices

result, err := neo4j.ExecuteQuery(ctx, driver, "<QUERY>", nil,
    neo4j.EagerResultTransformer)

session := driver.NewSession(ctx, neo4j.SessionConfig{})

Be aware of the cost of transactions

When submitting queries through .ExecuteQuery() or through .ExecuteRead/Write(), the driver wraps them into a transaction. This behavior ensures that the database always ends up in a consistent state, regardless of what happens during the execution of a transaction (power outages, software crashes, etc.). As a further robustness layer, the driver also retries failed transactions with an exponential backoff.

Creating a safe execution context around a query yields an overhead that is small, but that adds up as the number of transactions increases. When each query is sent as a transaction of its own, if one transaction fails and needs to be rolled back, all the other transactions are unaffected. This is the safest mode of execution with respect to failures, but also the slowest, because the transaction overhead scales with the number of queries.

Each query as a separate transaction (low throughput)

for i := 0; i < 10000; i++ {
    neo4j.ExecuteQuery(ctx, driver, "<QUERY>", nil,
        neo4j.EagerResultTransformer,
        neo4j.ExecuteQueryWithDatabase("<database-name>"))
    // or session.ExecuteRead/Write() calls
}

A more performant approach is to group all queries into a single transaction. In this way, the transaction as a whole is isolated from others, but individual queries in the transaction are not isolated, and failure of one results in a rollback of all queries.

Group queries into one transaction (higher throughput)

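session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
_, err := session.ExecuteRead(ctx, func(tx neo4j.ManagedTransaction) (any, error) {
    for i := 0; i < 10000; i++ {
        // A failed query stops the loop and rolls back the whole transaction
        if _, err := tx.Run(ctx, "<QUERY>", nil); err != nil {
            return nil, err
        }
    }
    return nil, nil
})
if err != nil {
    panic(err)
}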

An even faster approach is to skip .ExecuteRead/Write() and call .Run() directly on the session. The queries run as auto-commit transactions and are still isolated from other concurrent queries, but if any of them fails, it is not retried. With this method, you trade some robustness for more throughput, as the queries are sent to the server as fast as it can handle them. One upper limit on the client side is given by the size of the connection pool: each call to .Run() borrows a connection, so the amount of parallel work is limited by the number of available connections.

Queries as auto-commit transactions (highest throughput)

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
for i := 0; i < 10000; i++ {
    session.Run(ctx, "<QUERY>", nil)
}

Don’t fetch large result sets all at once

When submitting queries that may result in a lot of records, don’t retrieve them all at once. The Neo4j server can retrieve records in batches and stream them to the driver as they become available. Lazy-loading a result spreads out network traffic and memory usage.

For convenience, .ExecuteQuery() always retrieves all result records at once (it is what the Eager in EagerResult stands for). To lazy-load a result, you have to use .ExecuteRead/Write() (or other forms of manually-handled transactions) and not call .Collect(ctx) on the result; iterate on it instead.

Example 1. Comparison between eager and lazy loading

Eager loading

The server has to read all 250 records from the storage before it can send even the first one to the driver (i.e. it takes more time for the client to receive the first record).
Before any record is available to the application, the driver has to receive all 250 records.
The client has to hold in memory all 250 records.

Lazy loading

The server reads the first record and sends it to the driver.
The application can process records as soon as the first record is transferred.
Waiting time and resource consumption for the remaining records is deferred to when the application requests more records.
The server’s fetch time can be used for client-side processing.
Resource consumption is bounded by the driver’s fetch size.

Time and memory comparison between eager and lazy loading

package main

import (
    "context"
    "time"
    "fmt"
    "github.com/neo4j/neo4j-go-driver/v6/neo4j"
)

// Returns 250 records, each with properties
// - `output` (an expensive computation, to slow down retrieval)
// - `dummyData` (a list of 10000 ints, about 8 KB).
var slowQuery = `
UNWIND range(1, 250) AS s
RETURN reduce(s=s, x in range(1,1000000) | s + sin(toFloat(x))+cos(toFloat(x))) AS output,
range(1, 10000) AS dummyData
`
// Delay for each processed record
var sleepTime = "0.5s"

func main() {
    ctx := context.Background()
    dbUri := "<database-uri>"
    dbUser := "<username>"
    dbPassword := "<password>"
    driver, err := neo4j.NewDriver(
        dbUri,
        neo4j.BasicAuth(dbUser, dbPassword, ""))
    if err != nil {
        panic(err)
    }
    defer driver.Close(ctx)

    err = driver.VerifyConnectivity(ctx)
    if err != nil {
        panic(err)
    }

    log("LAZY LOADING (executeRead)")
    lazyLoading(ctx, driver)

    log("EAGER LOADING (executeQuery)")
    eagerLoading(ctx, driver)
}

func lazyLoading(ctx context.Context, driver neo4j.Driver) {
    defer timer("lazyLoading")()

    sleepTimeParsed, err := time.ParseDuration(sleepTime)
    if err != nil {
        panic(err)
    }

    session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
    defer session.Close(ctx)
    session.ExecuteRead(ctx,
        func(tx neo4j.ManagedTransaction) (any, error) {
            log("Submit query")
            result, err := tx.Run(ctx, slowQuery, nil)
            if err != nil {
                return nil, err
            }
            for result.Next(ctx) {
                record := result.Record()
                output, _ := record.Get("output")
                log(fmt.Sprintf("Processing record %v", output))
                time.Sleep(sleepTimeParsed)  // proxy for some expensive operation
            }
            // Surface any error that interrupted the iteration
            return nil, result.Err()
        })
}

func eagerLoading(ctx context.Context, driver neo4j.Driver) {
    defer timer("eagerLoading")()

    log("Submit query")
    result, err := neo4j.ExecuteQuery(ctx, driver,
        slowQuery,
        nil,
        neo4j.EagerResultTransformer,
        neo4j.ExecuteQueryWithDatabase("<database-name>"))
    if err != nil {
        panic(err)
    }

    sleepTimeParsed, err := time.ParseDuration(sleepTime)
    if err != nil {
        panic(err)
    }

    // Loop through results and do something with them
    for _, record := range result.Records {
        output, _ := record.Get("output")
        log(fmt.Sprintf("Processing record %v", output))
        time.Sleep(sleepTimeParsed)  // proxy for some expensive operation
    }
}

func log(msg string) {
    fmt.Println("[", time.Now().Unix(), "] ", msg)
}

func timer(name string) func() {
    start := time.Now()
    return func() {
        fmt.Printf("-- %s took %v --\n\n", name, time.Since(start))
    }
}

Output

[ 1718802595 ]  LAZY LOADING (executeRead)
[ 1718802595 ]  Submit query
[ 1718802595 ]  Processing record 0.5309371354666308  (1)
[ 1718802595 ]  Processing record 1.5309371354662915
[ 1718802596 ]  Processing record 2.5309371354663197
...
[ 1718802720 ]  Processing record 249.53093713547042
-- lazyLoading took 2m5.467064085s --

[ 1718802720 ]  EAGER LOADING (executeQuery)
[ 1718802720 ]  Submit query
[ 1718802744 ]  Processing record 0.5309371354666308  (2)
[ 1718802744 ]  Processing record 1.5309371354662915
[ 1718802745 ]  Processing record 2.5309371354663197
...
[ 1718802869 ]  Processing record 249.53093713547042
-- eagerLoading took 2m29.113482541s --  (3)

1	With lazy loading, the first record is available sooner.
2	With eager loading, the first record is available once the result is consumed (i.e. after the server has retrieved all 250 records).
3	The total running time is longer with eager loading because the client waits until it receives the last record, whereas with lazy loading the client can process records while the server fetches the next ones. With lazy loading, the client can also stop requesting records after some condition is met (by calling .Consume(ctx) on the result object) to save time and resources.

The driver’s fetch size affects the behavior of lazy loading. It instructs the server to stream an amount of records equal to the fetch size, and then wait until the client has caught up before retrieving and sending more.

The fetch size usually bounds memory consumption on the client side, especially for results in which the record size varies little. If a single record is very large, the driver still needs to allocate space for the whole object, so memory usage may get large even with a small fetch size.

On the other hand, the fetch size doesn’t always bound memory consumption on the server side: that depends on the query. For example, a query with ORDER BY requires the whole result set to be loaded into memory for sorting, before records can be streamed to the client.

The lower the fetch size, the more messages client and server have to exchange. Especially if the server’s latency is high, a low fetch size may degrade performance.
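As a rough illustration of this trade-off, the sketch below counts how many batches are needed to deliver the 250 records from the earlier example for a few fetch sizes. The helper name roundTrips and the listed fetch sizes are hypothetical; the calculation only assumes that each batch costs at least one request/response exchange.

```go
package main

import "fmt"

// roundTrips returns how many batches are needed to deliver `records`
// records when each batch holds at most `fetchSize` records. In this sketch,
// non-positive inputs are treated as "nothing to fetch".
func roundTrips(records, fetchSize int) int {
	if records <= 0 || fetchSize <= 0 {
		return 0
	}
	return (records + fetchSize - 1) / fetchSize // ceiling division
}

func main() {
	for _, size := range []int{10, 50, 250, 1000} {
		fmt.Printf("fetch size %4d -> %2d batch(es) for 250 records\n",
			size, roundTrips(250, size))
	}
}
```

A fetch size of 10 needs 25 exchanges for the same result that a fetch size of 250 delivers in one, at the price of holding more records in memory at a time.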

Route read queries to cluster readers

In a cluster, route read queries to any reader node. You do this by:

using the ExecuteQueryWithReadersRouting() configuration callback in ExecuteQuery() calls
using ExecuteRead() instead of ExecuteWrite() (for managed transactions)
setting AccessMode: neo4j.AccessModeRead when creating a new session (for explicit transactions).

Good practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p:Person) RETURN p", nil, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"),
    neo4j.ExecuteQueryWithReadersRouting())

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
result, err := session.ExecuteRead(ctx,
    func(tx neo4j.ManagedTransaction) (any, error) {
        return tx.Run(ctx, "MATCH (p:Person) RETURN p", nil)
    })

Bad practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p:Person) RETURN p", nil, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))
    // defaults to routing = writers

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
result, err := session.ExecuteWrite(ctx,  // don't ask to write on a read-only operation
    func(tx neo4j.ManagedTransaction) (any, error) {
        return tx.Run(ctx, "MATCH (p:Person) RETURN p", nil)
    })

Create indexes

Create indexes for properties that you often filter against. For example, if you often look up Person nodes by the name property, it is beneficial to create an index on Person.name. You can create indexes with the CREATE INDEX Cypher clause, for both nodes and relationships.

// Create an index on Person.name
neo4j.ExecuteQuery(ctx, driver,
    "CREATE INDEX personName FOR (n:Person) ON (n.name)",
    nil, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

For more information, see Indexes for search performance.

Profile queries

Profile your queries to locate queries whose performance can be improved. You can profile queries by prepending them with PROFILE. The server output is available through the .Profile() method on the ResultSummary object.

result, _ := neo4j.ExecuteQuery(ctx, driver,
    "PROFILE MATCH (p {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    },
    neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))
fmt.Println(result.Summary.Profile().Arguments()["string-representation"])
/*
Planner COST
Runtime PIPELINED
Runtime version 5.0
Batch size 128

+-----------------+----------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator        | Details        | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-----------------+----------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | p              |              1 |    1 |       3 |                |                        |           |                     |
| |               +----------------+----------------+------+---------+----------------+                        |           |                     |
| +Filter         | p.name = $name |              1 |    1 |       4 |                |                        |           |                     |
| |               +----------------+----------------+------+---------+----------------+                        |           |                     |
| +AllNodesScan   | p              |             10 |    4 |       5 |            120 |                 9160/0 |   108.923 | Fused in Pipeline 0 |
+-----------------+----------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 12, total allocated memory: 184
*/

If a query is so slow that you cannot run it in a reasonable time, prepend it with EXPLAIN instead of PROFILE. The server then returns the plan that it would use to run the query, without executing it. The server output is available through the .Plan() method on the ResultSummary object.

result, _ := neo4j.ExecuteQuery(ctx, driver,
    "EXPLAIN MATCH (p {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    },
    neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))
fmt.Println(result.Summary.Plan().Arguments()["string-representation"])
/*
Planner COST
Runtime PIPELINED
Runtime version 5.0
Batch size 128

+-----------------+----------------+----------------+---------------------+
| Operator        | Details        | Estimated Rows | Pipeline            |
+-----------------+----------------+----------------+---------------------+
| +ProduceResults | p              |              1 |                     |
| |               +----------------+----------------+                     |
| +Filter         | p.name = $name |              1 |                     |
| |               +----------------+----------------+                     |
| +AllNodesScan   | p              |             10 | Fused in Pipeline 0 |
+-----------------+----------------+----------------+---------------------+

Total database accesses: ?
*/

Specify node labels

Specify node labels in all queries. This allows the query planner to work much more efficiently, and to leverage indexes where available. To learn how to combine labels, see Cypher → Label expressions.

Good practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p:Person|Animal {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    }, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
result, err := session.Run(ctx,
    "MATCH (p:Person|Animal {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    })

Bad practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    }, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
result, err := session.Run(ctx,
    "MATCH (p {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    })

Batch data creation

When creating a lot of records, batch them into a single query using the WITH and UNWIND Cypher clauses.

Good practice

numbers := make([]int, 10000)
for i := range numbers { numbers[i] = i }
neo4j.ExecuteQuery(ctx, driver, `
    WITH $numbers AS batch
    UNWIND batch AS value
    MERGE (:Number {value: value})
    `, map[string]any{
        "numbers": numbers,
    }, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

Bad practice

for i := 0; i < 10000; i++ {
    neo4j.ExecuteQuery(ctx, driver,
        "MERGE (:Number {value: $value})",
        map[string]any{
            "value": i,
        }, neo4j.EagerResultTransformer,
        neo4j.ExecuteQueryWithDatabase("<database-name>"))
}
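If the list of values is too large to send comfortably as a single parameter, one option is to split it into fixed-size batches and run one UNWIND query per batch. The following sketch shows only the splitting step; the helper name chunk and the batch size of 3000 are hypothetical, and the driver call is left as a comment.

```go
package main

import "fmt"

// chunk splits values into consecutive batches of at most size elements.
// Each batch can then be passed as the $numbers parameter of one UNWIND query.
func chunk(values []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var batches [][]int
	for start := 0; start < len(values); start += size {
		end := start + size
		if end > len(values) {
			end = len(values)
		}
		batches = append(batches, values[start:end])
	}
	return batches
}

func main() {
	numbers := make([]int, 10000)
	for i := range numbers {
		numbers[i] = i
	}
	// 3000 is an arbitrary batch size, chosen only for illustration.
	for i, batch := range chunk(numbers, 3000) {
		// In a real application, each batch would be sent with one
		// ExecuteQuery call, as in the good practice example.
		fmt.Printf("batch %d: %d values\n", i, len(batch))
	}
}
```

This keeps the number of queries small (4 instead of 10000 in this case) while bounding the size of each request.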

The most efficient way of performing a first import of large amounts of data into a new database is the neo4j-admin database import command.

Use query parameters

Always use query parameters instead of hardcoding or concatenating values into queries. Besides protecting from Cypher injections, this lets the database make use of its query cache.

Good practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p:Person {name: $name}) RETURN p",
    map[string]any{
        "name": "Alice",
    }, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
session.Run(ctx, "MATCH (p:Person {name: $name}) RETURN p", map[string]any{
    "name": "Alice",
})

Bad practices

result, err := neo4j.ExecuteQuery(ctx, driver,
    "MATCH (p:Person {name: 'Alice'}) RETURN p",
    // or "MATCH (p:Person {name: '" + name + "'}) RETURN p"
    nil, neo4j.EagerResultTransformer,
    neo4j.ExecuteQueryWithDatabase("<database-name>"))

session := driver.NewSession(ctx, neo4j.SessionConfig{DatabaseName: "<database-name>"})
defer session.Close(ctx)
session.Run(ctx, "MATCH (p:Person {name: 'Alice'}) RETURN p", nil)
           // or "MATCH (p:Person {name: '" + name + "'}) RETURN p"
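The difference between the two approaches can be seen without a database by looking at the query text that would be sent. In this minimal sketch, concatQuery and paramQuery are hypothetical helpers, and the input string is a made-up example of a malicious value.

```go
package main

import "fmt"

// concatQuery builds the query text by string concatenation (bad practice):
// the input becomes part of the Cypher text itself.
func concatQuery(name string) string {
	return "MATCH (p:Person {name: '" + name + "'}) RETURN p"
}

// paramQuery keeps the query text constant and passes the value separately,
// in the same (query, parameters) shape that ExecuteQuery and Run accept.
func paramQuery(name string) (string, map[string]any) {
	return "MATCH (p:Person {name: $name}) RETURN p", map[string]any{"name": name}
}

func main() {
	// Hypothetical malicious input that closes the string literal early.
	input := "x'}) DETACH DELETE p //"

	// The concatenated text now contains a DETACH DELETE clause.
	fmt.Println(concatQuery(input))

	// The parameterized text is unchanged; the input is only ever a value.
	query, params := paramQuery(input)
	fmt.Println(query, params)
}
```

With concatenation, every distinct value also produces a distinct query text, whereas the parameterized text stays the same across values, which is what allows a cached query to be reused.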

Concurrency

Use concurrency patterns. This is likely to be more impactful on performance if you parallelize complex and time-consuming queries in your application, but not so much if you run many simple ones.

Use MERGE for creation only when needed

The Cypher clause MERGE is convenient for data creation, as it avoids creating duplicate data when an exact match of the given pattern already exists. However, it requires the database to run two queries: it first needs to MATCH the pattern, and only then can it CREATE it (if needed).

If you already know that the data you are inserting is new, avoid MERGE and use CREATE directly instead. This practically halves the number of database queries.

Filter notifications

Filter the category and/or severity of notifications the server should raise.

Glossary

LTS: A Long Term Support release is one guaranteed to be supported for a number of years. Neo4j 4.4 and 5.26 are LTS versions.
Aura: Aura is Neo4j’s fully managed cloud service. It comes with both free and paid plans.
Cypher: Cypher is Neo4j’s graph query language that lets you retrieve data from the database. It is like SQL, but for graphs.
APOC: Awesome Procedures On Cypher (APOC) is a library of (many) functions that cannot be easily expressed in Cypher itself.
Bolt: Bolt is the protocol used for interaction between Neo4j instances and drivers. It listens on port 7687 by default.
ACID: Atomicity, Consistency, Isolation, Durability (ACID) are properties guaranteeing that database transactions are processed reliably. An ACID-compliant DBMS ensures that the data in the database remains accurate and consistent despite failures.
eventual consistency: A database is eventually consistent if it provides the guarantee that all cluster members will, at some point in time, store the latest version of the data.
causal consistency: A database is causally consistent if read and write queries are seen by every member of the cluster in the same order. This is stronger than eventual consistency.
NULL: The null marker is not a type but a placeholder for absence of value. For more information, see Cypher → Working with null.
transaction: A transaction is a unit of work that is either committed in its entirety or rolled back on failure. An example is a bank transfer: it involves multiple steps, but they must all succeed or be reverted, to avoid money being subtracted from one account but not added to the other.
backpressure: Backpressure is a force opposing the flow of data. It ensures that the client is not overwhelmed by data arriving faster than it can handle.
bookmark: A bookmark is a token representing some state of the database. By passing one or multiple bookmarks along with a query, the server will make sure that the query does not get executed before the represented state(s) have been established.
transaction function: A transaction function is a callback executed by an ExecuteRead or ExecuteWrite call. The driver automatically re-executes the callback in case of server failure.
Driver: A Driver object holds the details required to establish connections with a Neo4j database.
