The Neo4j Manual v2.3.12
by The Neo4j Team (http://neo4j.com/)

Publication date 2017-12-08 12:26:35
Copyright © 2017 Neo Technology

Starting points
• What is the Neo4j graph database?
• Cypher Query Language
• REST API
• Installation
• Upgrading
• Security
• Resources

License: Creative Commons 3.0
This book is presented in open source and licensed through Creative Commons 3.0. You are free to copy, distribute, transmit, and/or
adapt the work. This license is based upon the following conditions:

Attribution

You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that
they endorse you or your use of the work).

Share Alike

If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or
a compatible license.
Any of the above conditions can be waived if you get permission from the copyright holder.
In no way are any of the following rights affected by the license:
• Your fair dealing or fair use rights
• The author’s moral rights
• Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights

Note

For any reuse or distribution, you must make clear to the others the license terms of this work. The best
way to do this is with a direct link to this page: http://creativecommons.org/licenses/by-sa/3.0/


Preface ........................................................................................................................................................ v
I. Introduction ............................................................................................................................................. 1
1. Neo4j Highlights ............................................................................................................................. 3
2. Graph Database Concepts ............................................................................................................. 4
II. Tutorials ................................................................................................................................................ 14
3. Introduction to Cypher ................................................................................................................ 16
4. Use Cypher in an application ...................................................................................................... 46
5. Basic Data Modeling Examples ................................................................................................... 47
6. Advanced Data Modeling Examples ............................................................................................ 62
7. Languages .................................................................................................................................... 96
III. Cypher Query Language ................................................................................................................... 102
8. Introduction ................................................................................................................................ 105
9. Syntax ......................................................................................................................................... 118
10. General Clauses ....................................................................................................................... 136
11. Reading Clauses ....................................................................................................................... 154
12. Writing Clauses ........................................................................................................................ 186
13. Functions .................................................................................................................................. 214
14. Schema ..................................................................................................................................... 243
15. Query Tuning ............................................................................................................................ 253
16. Execution Plans ........................................................................................................................ 259
IV. Reference ........................................................................................................................................... 277
17. Capabilities ............................................................................................................................... 279
18. Transaction Management ........................................................................................................ 285
19. Data Import .............................................................................................................................. 295
20. Graph Algorithms ..................................................................................................................... 296
21. REST API ................................................................................................................................... 297
22. Deprecations ............................................................................................................................ 434
V. Operations .......................................................................................................................................... 435
23. Installation & Deployment ....................................................................................................... 437
24. Configuration & Performance .................................................................................................. 448
25. High Availability ........................................................................................................................ 472
26. Backup ...................................................................................................................................... 494
27. Security ..................................................................................................................................... 499
28. Monitoring ................................................................................................................................ 505
VI. Tools .................................................................................................................................................. 529
29. Import tool ............................................................................................................................... 531
30. Web Interface ........................................................................................................................... 544
31. Neo4j Shell ............................................................................................................................... 545
VII. Advanced Usage ............................................................................................................................... 561
32. Extending the Neo4j Server ..................................................................................................... 563
33. Using Neo4j embedded in Java applications ........................................................................... 576
34. The Traversal Framework ........................................................................................................ 613
35. Legacy Indexing ........................................................................................................................ 621
36. Batch Insertion ......................................................................................................................... 636
Terminology ............................................................................................................................................ 640
A. Resources ........................................................................................................................................... 644
B. Manpages ........................................................................................................................................... 645
neo4j ............................................................................................................................................... 646
neo4j-shell ...................................................................................................................................... 647
neo4j-import ................................................................................................................................... 648
neo4j-backup .................................................................................................................................. 650
neo4j-arbiter ................................................................................................................................... 651


Preface
This is the reference manual for Neo4j version 2.3.12, authored by the Neo4j Team.
The main parts of the manual are:
• Part I, “Introduction” [1] — introducing graph database concepts and Neo4j.
• Part II, “Tutorials” [14] — learn how to use Neo4j.
• Part III, “Cypher Query Language” [102] — details on the Cypher query language.
• Part IV, “Reference” [277] — detailed information on Neo4j.
• Part V, “Operations” [435] — how to install and maintain Neo4j.
• Part VI, “Tools” [529] — guides on tools.
• Part VII, “Advanced Usage” [561] — using Neo4j in more advanced ways.
• Terminology [640] — terminology about graph databases.
• Appendix A, Resources [644] — find additional documentation resources.
• Appendix B, Manpages [645] — command line documentation.

The material is practical, technical, and focused on answering specific questions. It addresses how
things work, what to do and what to avoid to successfully run Neo4j in a production environment.
The goal is to be thumb-through and rule-of-thumb friendly.
Each section should stand on its own, so you can hop right to whatever interests you. When possible,
the sections distill “rules of thumb” which you can keep in mind whenever you wander out of the house
without this manual in your back pocket.
The included code examples are executed when Neo4j is built and tested. Also, the REST API request
and response examples are captured from real interaction with a Neo4j server. Thus, the examples are
always in sync with how Neo4j actually works.
There are other documentation resources besides the manual as well, see Appendix A,
Resources [644].
Who should read this?
The topics should be relevant to architects, administrators, developers and operations personnel.
Where to get help?
You can learn a lot about Neo4j at different events. To get information on upcoming Neo4j events, have
a look here:
• http://neo4j.com/events/
• http://neo4j.meetup.com/
Get help from the Neo4j open source community; here are some starting points.
• The neo4j tag at stackoverflow: http://stackoverflow.com/questions/tagged/neo4j
• Neo4j Discussions: https://groups.google.com/forum/#!forum/neo4j
• Twitter: https://twitter.com/neo4j
Report a bug or add a feature request:
• https://github.com/neo4j/neo4j/issues
Questions regarding the documentation: the Neo4j Manual is published online with a comment
function; please use that to post any questions or comments regarding the documentation.
If you want to contribute to the Neo4j open source project, see http://neo4j.com/developer/contribute/.


Part I. Introduction

This part gives a bird’s eye view of what a graph database is and also outlines some specifics of Neo4j.

1. Neo4j Highlights ..................................................................................................................................... 3
2. Graph Database Concepts ..................................................................................................................... 4
2.1. The Neo4j Graph Database ........................................................................................................ 5
2.2. Comparing Database Models ................................................................................................... 11


Chapter 1. Neo4j Highlights
As a robust, scalable and high-performance database, Neo4j is suitable for full enterprise deployment.
It features:
• true ACID transactions,
• high availability,
• scales to billions of nodes and relationships,
• high speed querying through traversals,
• declarative graph query language.

Proper ACID behavior is the foundation of data reliability. Neo4j enforces that all operations that
modify data occur within a transaction, guaranteeing consistent data. This robustness extends
from single instance embedded graphs to multi-server high availability installations. For details, see
Chapter 18, Transaction Management [285].
Reliable graph storage can easily be added to any application. A graph can scale in size and complexity
as the application evolves, with little impact on performance. Whether starting new development, or
augmenting existing functionality, Neo4j is only limited by physical hardware.
A single server instance can handle a graph of billions of nodes and relationships. When data
throughput is insufficient, the graph database can be distributed among multiple servers in a high
availability configuration. See Chapter 25, High Availability [472] to learn more.
The graph database storage shines when storing richly-connected data. Querying is performed through
traversals, which can perform millions of traversal steps per second. A traversal step resembles a join in
an RDBMS.


Chapter 2. Graph Database Concepts
This chapter contains an introduction to the graph data model and also compares it to other data
models used when persisting data.

2.1. The Neo4j Graph Database
A graph database stores data in a graph, the most generic of data structures, capable of elegantly
representing any kind of data in a highly accessible way.
For terminology around graph databases, see Terminology [640].
Here’s an example graph which we will approach step by step in the following sections:
[Figure: an example graph — two Person nodes (name = 'Tom Hanks', born = 1956; name = 'Robert Zemeckis', born = 1951) connected to a Movie node (title = 'Forrest Gump', released = 1994) by ACTED_IN (roles = ['Forrest']) and DIRECTED relationships.]

Nodes
A graph records data in nodes and relationships. Both can have properties. This is
sometimes referred to as the Property Graph Model.
The fundamental units that form a graph are nodes and relationships. In Neo4j, both nodes and
relationships can contain properties.
Nodes are often used to represent entities, but depending on the domain relationships may be used for
that purpose as well.
Apart from properties and relationships, nodes can also be labeled with zero or more labels.
The simplest possible graph is a single Node. A Node can have zero or more named values referred to
as properties. Let’s start out with one node that has a single property named title:

[Figure: a single node with the property title = 'Forrest Gump'.]

The next step is to have multiple nodes. Let’s add two more nodes and one more property on the node
in the previous example:

[Figure: three nodes — (name = 'Tom Hanks', born = 1956), (title = 'Forrest Gump', released = 1994), (name = 'Robert Zemeckis', born = 1951).]

Relationships
Relationships organize the nodes by connecting them. A relationship connects two
nodes — a start node and an end node. Just like nodes, relationships can have properties.
Relationships between nodes are a key part of a graph database. They allow for finding related data.
Just like nodes, relationships can have properties.
A relationship connects two nodes, and is guaranteed to have valid start and end nodes.
Relationships organize nodes into arbitrary structures, allowing a graph to resemble a list, a tree,
a map, or a compound entity — any of which can be combined into yet more complex, richly interconnected structures.
Our example graph will make a lot more sense once we add relationships to it:
[Figure: the example graph with relationships — (Tom Hanks)-[ACTED_IN {roles: ['Forrest']}]->(Forrest Gump) and (Robert Zemeckis)-[DIRECTED]->(Forrest Gump).]

Our example uses ACTED_IN and DIRECTED as relationship types. The roles property on the ACTED_IN
relationship has an array value with a single item in it.
Below is an ACTED_IN relationship, with the Tom Hanks node as start node and Forrest Gump as end node.
[Figure: an ACTED_IN relationship (roles = ['Forrest']) with the Tom Hanks node as start node and the Forrest Gump node as end node.]

You could also say that the Tom Hanks node has an outgoing relationship, while the Forrest Gump node has
an incoming relationship.

Relationships are equally well traversed in either direction.

This means that there is no need to add duplicate relationships in the opposite direction
(with regard to traversal or performance).
While relationships always have a direction, you can ignore the direction where it is not useful in your
application.
Note that a node can have relationships to itself as well:
[Figure: the Tom Hanks node with a KNOWS relationship to itself.]

The example above would mean that Tom Hanks KNOWS himself.
To further enhance graph traversal all relationships have a relationship type.
Let’s have a look at what can be found by simply following the relationships of a node in our example
graph:
[Figure: following the relationships of the Tom Hanks node — ACTED_IN (roles = ['Forrest']) leads to Forrest Gump, which also has an incoming DIRECTED relationship from Robert Zemeckis.]
Using relationship direction and type

What we want to know    | Start from  | Relationship type | Direction
get actors in movie     | movie node  | ACTED_IN          | incoming
get movies with actor   | person node | ACTED_IN          | outgoing
get directors of movie  | movie node  | DIRECTED          | incoming
get movies directed by  | person node | DIRECTED          | outgoing

Properties
Both nodes and relationships can have properties.
Properties are named values where the name is a string. The supported property values are:
• Numeric values,
• String values,
• Boolean values,
• Collections of any other type of value.
NULL is not a valid property value. NULLs can instead be modeled by the absence of a key.

For further details on supported property values, see Section 33.3, “Property values” [584].
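As a quick illustration, here is a minimal Cypher sketch that sets one property of each supported type on a node (the Example label and all property names are made up for this example):
CREATE (n:Example {
  born: 1956,           // a numeric value
  name: "Tom Hanks",    // a string value
  active: true,         // a boolean value
  scores: [1, 2, 3]     // a collection of numeric values
})
RETURN n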

Labels
Labels assign roles or types to nodes.
A label is a named graph construct that is used to group nodes into sets; all nodes labeled with the
same label belong to the same set. Many database queries can work with these sets instead of the
whole graph, making queries easier to write and more efficient to execute. A node may be labeled with
any number of labels, including none, making labels an optional addition to the graph.
Labels are used when defining constraints and adding indexes for properties (see the section called
“Schema” [9]).
An example would be a label named User that you label all your nodes representing users with. With
that in place, you can ask Neo4j to perform operations only on your user nodes, such as finding all
users with a given name.
However, you can use labels for much more. For instance, since labels can be added and removed
during runtime, they can be used to mark temporary states for your nodes. You might create an Offline
label for phones that are offline, a Happy label for happy pets, and so on.
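Marking and clearing such a temporary state is done with the SET and REMOVE clauses in Cypher. A
minimal sketch (the Phone label and the id property are assumptions made for this example):
MATCH (p:Phone { id: 42 })
SET p:Offline

MATCH (p:Phone { id: 42 })
REMOVE p:Offline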
In our example, we’ll add Person and Movie labels to our graph:
[Figure: the example graph with Person labels on the Tom Hanks and Robert Zemeckis nodes and a Movie label on the Forrest Gump node.]

A node can have multiple labels; let’s add an Actor label to the Tom Hanks node.
[Figure: the Tom Hanks node labeled with both Person and Actor.]

Label names
Any non-empty Unicode string can be used as a label name. In Cypher, you may need to use the
backtick (`) syntax to avoid clashes with Cypher identifier rules or to allow non-alphanumeric characters
in a label. By convention, labels are written with CamelCase notation, with the first letter in upper case.
For instance, User or CarOwner.
Labels have an id space of an int, meaning the maximum number of labels the database can contain is
roughly 2 billion.

Traversal
A traversal navigates through a graph to find paths.
A traversal is how you query a graph, navigating from starting nodes to related nodes, finding answers
to questions like “what music do my friends like that I don’t yet own,” or “if this power supply goes
down, what web services are affected?”
Traversing a graph means visiting its nodes, following relationships according to some rules. In most
cases only a subgraph is visited, as you already know where in the graph the interesting nodes and
relationships are found.
Cypher provides a declarative way to query the graph powered by traversals and other techniques. See
Part III, “Cypher Query Language” [102] for more information.
When writing server plugins or using Neo4j embedded, Neo4j provides a callback based traversal API
which lets you specify the traversal rules. At a basic level there’s a choice between traversing breadth- or depth-first.
If we want to find out which movies Tom Hanks acted in according to our tiny example database the
traversal would start from the Tom Hanks node, follow any ACTED_IN relationships connected to the node,
and end up with Forrest Gump as the result (see the dashed lines):
[Figure: the example graph with the traversal from Tom Hanks along ACTED_IN to Forrest Gump shown with dashed lines.]
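Expressed in Cypher (a minimal sketch of the same traversal; the language itself is covered in Part III,
“Cypher Query Language” [102]):
MATCH (p:Person { name: "Tom Hanks" })-[:ACTED_IN]->(movie)
RETURN movie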

Paths
A path is one or more nodes with connecting relationships, typically retrieved as a query
or traversal result.
In the previous example, the traversal result could be returned as a path:
[Figure: a path — (Tom Hanks)-[ACTED_IN {roles: ['Forrest']}]->(Forrest Gump).]

The path above has length one.
The shortest possible path has length zero — that is, it contains only a single node and no
relationships — and can look like this:
[Figure: a length-zero path — a single Person node (Tom Hanks).]

This path has length one:
[Figure: a length-one path — the Tom Hanks node with a KNOWS relationship back to itself.]

Schema
Neo4j is a schema-optional graph database.
You can use Neo4j without any schema. Optionally you can introduce it in order to gain performance or
modeling benefits. This allows a way of working where the schema does not get in your way until you
are at a stage where you want to reap the benefits of having one.

Note

Schema commands can only be applied on the master machine in a Neo4j cluster (see
Chapter 25, High Availability [472]). If you apply them on a slave you will receive a
Neo.ClientError.Transaction.InvalidType error code (see Section 21.2, “Neo4j Status
Codes” [307]).

Indexes
Performance is gained by creating indexes, which improve the speed of looking up nodes
in the database.
Note

This feature was introduced in Neo4j 2.0, and is not the same as the legacy indexes (see
Chapter 35, Legacy Indexing [621]).
Once you’ve specified which properties to index, Neo4j will make sure your indexes are kept up to date
as your graph evolves. Any operation that looks up nodes by the newly indexed properties will see a
significant performance boost.
Indexes in Neo4j are eventually available. That means that when you first create an index, the operation
returns immediately. The index is populated in the background and so is not immediately available for
querying. Once the index has been fully populated, it comes online, meaning that it is ready to be used
in queries.
If something should go wrong with the index, it can end up in a failed state. When it is failed, it will not
be used to speed up queries. To rebuild it, you can drop and recreate the index. Look at logs for clues
about the failure.
You can track the status of your index by asking for the index state through the API you are using. Note,
however, that this is not yet possible through Cypher.
How to use indexes through the different APIs:
• Cypher: Section 14.1, “Indexes” [244]
• REST API: Section 21.15, “Indexing” [368]
• Listing Indexes via Shell: the section called “Listing Indexes and Constraints” [554]
• Java Core API: Section 33.4, “User database with indexes” [585]
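In Cypher, for example, an index on a label/property combination is created — and dropped again, for
instance to rebuild a failed index — like this (a sketch using the Person label from the examples in this
chapter):
CREATE INDEX ON :Person(name)

DROP INDEX ON :Person(name)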

Constraints
Note

This feature was introduced in Neo4j 2.0.
Neo4j can help you keep your data clean. It does so using constraints, which allow you to specify the
rules for what your data should look like. Any changes that break these rules will be denied.
In this version, unique constraints are the only available constraint type.
How to use constraints through the different APIs:
• Cypher: Section 14.2, “Constraints” [247]
• REST API: Section 21.16, “Constraints” [370]
• Listing Constraints via Shell: the section called “Listing Indexes and Constraints” [554]
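In Cypher, for example, a unique constraint is created like this (a sketch reusing the Person label from
the examples in this chapter):
CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE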


2.2. Comparing Database Models
A graph database stores data structured in the nodes and relationships of a graph. How does this
compare to other persistence models? Because a graph is a generic structure, let’s compare how a few
models would look in a graph.

A Graph Database transforms a RDBMS
Topple the stacks of records in a relational database while keeping all the relationships, and you’ll see
a graph. Where an RDBMS is optimized for aggregated data, Neo4j is optimized for highly connected
data.
Figure 2.1. RDBMS

[Figure: relational tables of records A1–A3, B1–B7 and C1–C3 linked through foreign-key references.]

Figure 2.2. Graph Database as RDBMS
[Figure: the same records A1–A3, B1–B7 and C1–C3 stored as a graph, with the references as explicit relationships.]

A Graph Database elaborates a Key-Value Store
A Key-Value model is great for lookups of simple values or lists. When the values are themselves
interconnected, you’ve got a graph. Neo4j lets you elaborate the simple data structures into more
complex, interconnected data.

11

Graph Database Concepts
Figure 2.3. Key-Value Store
[Figure: keys K1–K3 pointing to values V1–V3; some keys point to other keys as well.]
K* represents a key, V* a value. Note that some keys point to other keys as well as plain values.
Figure 2.4. Graph Database as Key-Value Store

[Figure: the same keys and values stored as a graph, with the references as relationships.]

A Graph Database relates Column-Family
Column Family (BigTable-style) databases are an evolution of key-value stores, using "families" to allow
grouping of rows. Stored in a graph, the families could become hierarchical, and the relationships
among the data become explicit.

A Graph Database navigates a Document Store
The container hierarchy of a document database accommodates nice, schema-free data that can easily
be represented as a tree — which is, of course, a graph. Refer to other documents (or document elements)
within that tree and you have a more expressive representation of the same data. In Neo4j, those
relationships are easily navigable.

Figure 2.5. Document Store

[Figure: documents D1 and D2 containing subdocuments S1–S3 and values V1–V4, with cross-references D2/S2 and D1/S1.]

D=Document, S=Subdocument, V=Value, D2/S2 = reference to subdocument in (other) document.
Figure 2.6. Graph Database as Document Store

[Figure: the same documents, subdocuments, and values as nodes, with the containment and references as relationships.]

Part II. Tutorials

The tutorial part describes how to use Neo4j. It takes you from Hello World to advanced usage of graphs.

3. Introduction to Cypher ........................................................................................................................ 16
3.1. Background and Motivation ..................................................................................................... 17
3.2. Graphs, Patterns, and Cypher .................................................................................................. 18
3.3. Patterns in Practice ................................................................................................................... 21
3.4. Getting the Results You Want ................................................................................................... 26
3.5. How to Compose Large Statements ......................................................................................... 30
3.6. Labels, Constraints and Indexes ............................................................................................... 32
3.7. Loading Data ............................................................................................................................. 34
3.8. Utilizing Data Structures ........................................................................................................... 37
3.9. Cypher vs. SQL .......................................................................................................................... 40
4. Use Cypher in an application .............................................................................................................. 46
5. Basic Data Modeling Examples ........................................................................................................... 47
5.1. Movie Database ......................................................................................................................... 48
5.2. Social Movie Database .............................................................................................................. 50
5.3. Finding Paths ............................................................................................................................. 52
5.4. Linked Lists ................................................................................................................................ 56
5.5. TV Shows ................................................................................................................................... 58
6. Advanced Data Modeling Examples .................................................................................................... 62
6.1. ACL structures in graphs .......................................................................................................... 63
6.2. Hyperedges ................................................................................................................................ 67
6.3. Basic friend finding based on social neighborhood ................................................................. 69
6.4. Co-favorited places ................................................................................................................... 70
6.5. Find people based on similar favorites .................................................................................... 72
6.6. Find people based on mutual friends and groups ................................................................... 73
6.7. Find friends based on similar tagging ...................................................................................... 74
6.8. Multirelational (social) graphs ................................................................................................... 75
6.9. Implementing newsfeeds in a graph ........................................................................................ 76
6.10. Boosting recommendation results ......................................................................................... 79
6.11. Calculating the clustering coefficient of a network ................................................................ 80
6.12. Pretty graphs ........................................................................................................................... 81
6.13. A multilevel indexing structure (path tree) ............................................................................. 85
6.14. Complex similarity computations ........................................................................................... 89
6.15. The Graphity activity stream model ....................................................................................... 90
6.16. User roles in graphs ................................................................................................................ 92
7. Languages ............................................................................................................................................ 96
7.1. How to use the REST API from Java ......................................................................................... 97


Chapter 3. Introduction to Cypher
This friendly guide will introduce you to Cypher, Neo4j’s query language.
The guide will help you:
• start thinking about graphs and patterns,
• apply this knowledge to simple problems,
• learn how to write Cypher statements,
• use Cypher for loading data,
• transition from SQL to Cypher.

If you want to keep a reference at your side while reading, please see the Cypher Refcard
(http://neo4j.com/docs/2.3.12/cypher-refcard/).

Work in Progress

There may still be unfinished parts in this chapter. Please comment on it so we can make it
suit our readers better!



3.1. Background and Motivation
Cypher provides a convenient way to express queries and other Neo4j actions. Although Cypher
is particularly useful for exploratory work, it is fast enough to be used in production. Java-based
approaches (eg, unmanaged extensions) can also be used to handle particularly demanding use cases.

Query processing
To use Cypher effectively, it’s useful to have an idea of how it works. So, let’s take a high-level look at the
way Cypher processes queries.
• Parse and validate the query.
• Generate the execution plan.
• Locate the initial node(s).
• Select and traverse relationships.
• Change and/or return values.

Preparation
Parsing and validating the Cypher statement(s) is important, but mundane. However, generating an
optimal search strategy can be far more challenging.
The execution plan must tell the database how to locate initial node(s), select relationships for traversal,
etc. This involves tricky optimization problems (eg, which actions should happen first), but we can safely
leave the details to the Neo4j engineers. So, let’s move on to locating the initial node(s).

Locate the initial node(s)
Neo4j is highly optimized for traversing property graphs. Under ideal circumstances, it can traverse
millions of nodes and relationships per second, following chains of pointers in the computer’s memory.
However, before traversal can begin, Neo4j must know one or more starting nodes. Unless the user (or,
more likely, a client program) can provide this information, Neo4j will have to search for these nodes.
A “brute force” search of the database (eg, for a specified property value) can be very time consuming.
Every node must be examined, first to see if it has the property, then to see if the value meets the
desired criteria. To avoid this effort, Neo4j creates and uses indexes: a separate index is maintained
for each label/property combination.
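For example (a sketch using the Person label that appears in the examples later in this chapter), an
index on the name property lets Neo4j locate the starting node directly instead of examining every node:
CREATE INDEX ON :Person(name)

MATCH (p:Person { name: "Tom Hanks" })
RETURN p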

Traversal and actions
Once the initial nodes are determined, Neo4j can traverse portions of the graph and perform any
requested actions. The execution plan helps Neo4j to determine which nodes are relevant, which
relationships to traverse, etc.


3.2. Graphs, Patterns, and Cypher
Nodes, Relationships, and Patterns
Neo4j’s Property Graphs are composed of nodes and relationships, either of which may have properties
(ie, attributes). Nodes represent entities (eg, concepts, events, places, things); relationships (which may
be directed) connect pairs of nodes.
However, nodes and relationships are simply low-level building blocks. The real strength of the Property
Graph lies in its ability to encode patterns of connected nodes and relationships. A single node or
relationship typically encodes very little information, but a pattern of nodes and relationships can
encode arbitrarily complex ideas.
Cypher, Neo4j’s query language, is strongly based on patterns. Specifically, patterns are used to match
desired graph structures. Once a matching structure has been found (or created), Neo4j can use it for
further processing.

Simple and Complex Patterns
A simple pattern, which has only a single relationship, connects a pair of nodes (or, occasionally, a node
to itself). For example, a Person LIVES_IN a City or a City is PART_OF a Country.
Complex patterns, using multiple relationships, can express arbitrarily complex concepts and support
a variety of interesting use cases. For example, we might want to match instances where a Person
LIVES_IN a Country. The following Cypher code combines two simple patterns into a (mildly) complex
pattern which performs this match:
(:Person) -[:LIVES_IN]-> (:City) -[:PART_OF]-> (:Country)

Pattern recognition is fundamental to the way that the brain works. Consequently, humans are very
good at working with patterns. When patterns are presented visually (eg, in a diagram or map), humans
can use them to recognize, specify, and understand concepts. As a pattern-based language, Cypher
takes advantage of this capability.

Cypher Concepts
Like SQL (used in relational databases), Cypher is a textual, declarative query language. It uses a form
of ASCII art to represent graph-related patterns. SQL-like clauses and keywords (eg, MATCH, WHERE, DELETE)
are used to combine these patterns and specify desired actions.
This combination tells Neo4j which patterns to match and what to do with the matching items (eg,
nodes, relationships, paths, collections). However, as a declarative language, Cypher does not tell
Neo4j how to find nodes, traverse relationships, etc. (This level of control is available from Neo4j’s Java
APIs; see Section 32.2, “Unmanaged Extensions” [568].)
Diagrams made up of icons and arrows are commonly used to visualize graphs; textual annotations
provide labels, define properties, etc. Cypher’s ASCII-art syntax formalizes this approach, while adapting
it to the limitations of text.

Node Syntax
Cypher uses a pair of parentheses (usually containing a text string) to represent a node, eg: (), (foo).
This is reminiscent of a circle or a rectangle with rounded end caps. Here are some ASCII-art encodings
for example Neo4j nodes, providing varying types and amounts of detail:
()
(matrix)
(:Movie)
(matrix:Movie)
(matrix:Movie {title: "The Matrix"})
(matrix:Movie {title: "The Matrix", released: 1997})

The simplest form, (), represents an anonymous, uncharacterized node. If we want to refer to the
node elsewhere, we can add an identifier, eg: (matrix). Identifiers are restricted (ie, scoped) to a single
statement: an identifier may have different (or no) meaning in another statement.
The Movie label (prefixed in use with a colon) declares the node’s type. This restricts the pattern, keeping
it from matching (say) a structure with an Actor node in this position. Neo4j’s node indexes also use
labels: each index is specific to the combination of a label and a property.
The node’s properties (eg, title) are represented as a list of key/value pairs, enclosed within a pair of
braces, eg: {...}. Properties can be used to store information and/or restrict patterns. For example, we
could match nodes whose title is "The Matrix".

Relationship Syntax
Cypher uses a pair of dashes (--) to represent an undirected relationship. Directed relationships have
an arrowhead at one end (eg, <--, -->). Bracketed expressions (eg: [...]) can be used to add details.
This may include identifiers, properties, and/or type information, eg:
-->
-[role]->
-[:ACTED_IN]->
-[role:ACTED_IN]->
-[role:ACTED_IN {roles: ["Neo"]}]->

The syntax and semantics found within a relationship’s bracket pair are very similar to those used
between a node’s parentheses. An identifier (eg, role) can be defined, to be used elsewhere in the
statement. The relationship’s type (eg, ACTED_IN) is analogous to the node’s label. The properties (eg,
roles) are entirely equivalent to node properties. (Note that the value of a property may be an array.)

Pattern Syntax
Combining the syntax for nodes and relationships, we can express patterns. The following could be a
simple pattern (or fact) in this domain:
(keanu:Person:Actor {name: "Keanu Reeves"} )
-[role:ACTED_IN
{roles: ["Neo"] } ]->
(matrix:Movie
{title: "The Matrix"} )

Like with node labels, the relationship type ACTED_IN is added as a symbol, prefixed with a colon:
:ACTED_IN. Identifiers (eg, role) can be used elsewhere in the statement to refer to the relationship.
Node and relationship properties use the same notation. In this case, we used an array property for the
roles, allowing multiple roles to be specified.

Pattern Nodes vs. Database Nodes

When a node is used in a pattern, it describes zero or more nodes in the database. Similarly,
each pattern describes zero or more paths of nodes and relationships.

Pattern Identifiers
To increase modularity and reduce repetition, Cypher allows patterns to be assigned to identifiers. This
allows the matching paths to be inspected, used in other expressions, etc.
acted_in = (:Person)-[:ACTED_IN]->(:Movie)

The acted_in variable would contain two nodes and the connecting relationship for each path that was
found or created. There are a number of functions to access details of a path, including nodes(path),
rels(path) (same as relationships(path)), and length(path).
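For instance (a small sketch building on the pattern above), a path identifier can be returned and
inspected with these functions:
MATCH p = (:Person)-[:ACTED_IN]->(:Movie)
RETURN nodes(p), relationships(p), length(p)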

Clauses
Cypher statements typically have multiple clauses, each of which performs a specific task, eg:
• create and match patterns in the graph
• filter, project, sort, or paginate results
• connect/compose partial statements
By combining Cypher clauses, we can compose more complex statements that express what we want
to know or create. Neo4j then figures out how to achieve the desired goal in an efficient manner.
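As a small sketch of such a composition (using the movie data from this chapter), the following
statement matches a pattern, filters the result, and returns sorted projections:
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
WHERE movie.released < 2000
RETURN actor.name AS name, movie.title AS title
ORDER BY name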


3.3. Patterns in Practice
Creating Data
We’ll start by looking into the clauses that allow us to create data.
To add data, we just use the patterns we already know. By providing patterns we can specify what
graph structures, labels and properties we would like to make part of our graph.
Obviously the simplest clause is called CREATE. It will just go ahead and directly create the patterns that
you specify.
For the patterns we’ve looked at so far this could look like the following:
CREATE (:Movie { title:"The Matrix",released:1997 })

If we execute this statement, Cypher returns the number of changes, in this case adding 1 node, 1 label
and 2 properties.
(empty result)
Nodes created: 1
Properties set: 2
Labels added: 1

As we started out with an empty database, we now have a database with a single node in it:

[Figure: a single Movie node — title = 'The Matrix', released = 1997.]

In case we also want to return the created data, we can add a RETURN clause, which refers to the identifier
we’ve assigned to our pattern elements.
CREATE (p:Person { name:"Keanu Reeves", born:1964 })
RETURN p

This is what gets returned:
p
Node[1]{name:"Keanu Reeves", born:1964}
1 row
Nodes created: 1
Properties set: 2
Labels added: 1

If we want to create more than one element, we can separate the elements with commas or use
multiple CREATE statements.
We can of course also create more complex structures, like an ACTED_IN relationship with information
about the character, or DIRECTED ones for the director.
CREATE (a:Person { name:"Tom Hanks",
born:1956 })-[r:ACTED_IN { roles: ["Forrest"]}]->(m:Movie { title:"Forrest Gump",released:1994 })
CREATE (d:Person { name:"Robert Zemeckis", born:1951 })-[:DIRECTED]->(m)
RETURN a,d,r,m

This is the part of the graph we just updated:
[Figure: (Tom Hanks)-[ACTED_IN {roles: ['Forrest']}]->(Forrest Gump)<-[DIRECTED]-(Robert Zemeckis).]

In most cases, we want to connect new data to existing structures. This requires that we know how to
find existing patterns in our graph data, which we will look at next.

Matching Patterns
Matching patterns is a task for the MATCH statement. We pass the same kind of patterns we’ve used so
far to MATCH to describe what we’re looking for. It is similar to query by example, only that our examples
also include the structures.

Note

A MATCH statement will search for the patterns we specify and return one row per successful
pattern match.
To find the data we’ve created so far, we can start looking for all nodes labeled with the Movie label.
MATCH (m:Movie)
RETURN m

Here’s the result:
[Figure: two Movie nodes — title = 'The Matrix', released = 1997; title = 'Forrest Gump', released = 1994.]

This should show both The Matrix and Forrest Gump.
We can also look for a specific person, like Keanu Reeves.
MATCH (p:Person { name:"Keanu Reeves" })
RETURN p

This query returns the matching node:
[Figure: a Person node — name = 'Keanu Reeves', born = 1964.]

Note that we only provide enough information to find the nodes; not all properties are required. In
most cases you have key properties like SSN, ISBN, emails, logins, geolocation or product codes to look
for.
We can also find more interesting connections, like for instance the movies titles that Tom Hanks acted
in and the roles he played.
MATCH (p:Person { name:"Tom Hanks" })-[r:ACTED_IN]->(m:Movie)
RETURN m.title, r.roles

m.title        | r.roles
"Forrest Gump" | ["Forrest"]
1 row

In this case we only returned the properties of the nodes and relationships that we were interested in.
You can access them everywhere via the dot notation identifier.property.
Of course this only lists his role as Forrest in Forrest Gump because that’s all the data we’ve added.
Now we know enough to connect new nodes to existing ones and can combine MATCH and CREATE to
attach structures to the graph.

Attaching Structures
To extend the graph with new information, we first match the existing connection points and then
attach the newly created nodes to them with relationships. Adding Cloud Atlas as a new movie for Tom
Hanks could be achieved like this:
MATCH (p:Person { name:"Tom Hanks" })
CREATE (m:Movie { title:"Cloud Atlas",released:2012 })
CREATE (p)-[r:ACTED_IN { roles: ['Zachry']}]->(m)
RETURN p,r,m

Here’s what the structure looks like in the database:
[Figure: (Tom Hanks)-[ACTED_IN {roles: ['Zachry']}]->(Cloud Atlas).]

Tip

It is important to remember that we can assign identifiers to both nodes and relationships
and use them later on, no matter if they were created or matched.

It is possible to attach both node and relationship in a single CREATE clause. For readability it helps to
split them up though.

Important

A tricky aspect of the combination of MATCH and CREATE is that we get one row per matched
pattern. This causes subsequent CREATE statements to be executed once for each row. In
many cases this is what you want. If that’s not intended, move the CREATE statement
before the MATCH, change the cardinality of the query with means discussed later, or use
the get-or-create semantics of the next clause: MERGE.

Completing Patterns
Whenever we get data from external systems or are not sure if certain information already exists in
the graph, we want to be able to express a repeatable (idempotent) update operation. In Cypher MERGE
has this function. It acts like a combination of MATCH or CREATE, which checks for the existence of data
first before creating it. With MERGE you define a pattern to be found or created. Usually, as with MATCH
you only want to include the key property to look for in your core pattern. MERGE allows you to provide
additional properties you want to set ON CREATE.
If we didn’t know whether our graph already contained Cloud Atlas, we could merge it in again.
MERGE (m:Movie { title:"Cloud Atlas" })
ON CREATE SET m.released = 2012
RETURN m

m
Node[5]{title:"Cloud Atlas", released:2012}
1 row

We get a result in both cases: either the data (potentially more than one row) that was already in
the graph, or a single, newly created Movie node.

Note

A MERGE clause without any previously assigned identifiers in it either matches the full pattern
or creates the full pattern. It never produces a partial mix of matching and creating within a
pattern. To achieve a partial match/create, make sure to use already defined identifiers for
the parts that shouldn’t be affected.
So foremost MERGE makes sure that you can’t create duplicate information or structures, but it comes
with the cost of needing to check for existing matches first. Especially on large graphs it can be costly
to scan a large set of labeled nodes for a certain property. You can alleviate some of that by creating
supporting indexes or constraints, which we’ll discuss later. But it’s still not for free, so whenever you’re
sure you won’t create duplicate data, use CREATE over MERGE.

Tip

MERGE can also assert that a relationship is only created once. For that to work you have to
pass in both nodes from a previous pattern match.

MATCH (m:Movie { title:"Cloud Atlas" })
MATCH (p:Person { name:"Tom Hanks" })
MERGE (p)-[r:ACTED_IN]->(m)
ON CREATE SET r.roles =['Zachry']
RETURN p,r,m

[Figure: (Tom Hanks)-[ACTED_IN {roles: ['Zachry']}]->(Cloud Atlas).]

In case the direction of a relationship is arbitrary, you can leave off the arrowhead. MERGE will then
check for the relationship in either direction, and create a new directed relationship if no matching
relationship was found.
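A sketch of an undirected MERGE (the FRIEND_OF relationship type is made up for this example; both
nodes come from preceding MATCH clauses):
MATCH (a:Person { name:"Tom Hanks" })
MATCH (b:Person { name:"Keanu Reeves" })
MERGE (a)-[r:FRIEND_OF]-(b)
RETURN r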
If you choose to pass in only one node from a preceding clause, MERGE offers an interesting functionality.
It will then only match within the direct neighborhood of the provided node for the given pattern and,
if nothing is found, create it. This can come in very handy for creating, for example, tree structures.
CREATE (y:Year { year:2014 })
MERGE (y)<-[:IN_YEAR]-(m10:Month { month:10 })
MERGE (y)<-[:IN_YEAR]-(m11:Month { month:11 })
RETURN y,m10,m11

This is the graph structure that gets created:
[Figure: a Year node (year = 2014) with incoming IN_YEAR relationships from two Month nodes (month = 10 and month = 11).]

Here there is no global search for the two Month nodes; they are only searched for in the context of the
2014 Year node.


3.4. Getting the Results You Want
Let’s first get some data in to retrieve results from:
CREATE (matrix:Movie { title:"The Matrix",released:1997 })
CREATE (cloudAtlas:Movie { title:"Cloud Atlas",released:2012 })
CREATE (forrestGump:Movie { title:"Forrest Gump",released:1994 })
CREATE (keanu:Person { name:"Keanu Reeves", born:1964 })
CREATE (robert:Person { name:"Robert Zemeckis", born:1951 })
CREATE (tom:Person { name:"Tom Hanks", born:1956 })
CREATE (tom)-[:ACTED_IN { roles: ["Forrest"]}]->(forrestGump)
CREATE (tom)-[:ACTED_IN { roles: ['Zachry']}]->(cloudAtlas)
CREATE (robert)-[:DIRECTED]->(forrestGump)

This is the data we will start out with:
[Figure: the starting data — Movie nodes 'The Matrix' (1997), 'Cloud Atlas' (2012) and 'Forrest Gump' (1994); Person nodes Tom Hanks (1956), Robert Zemeckis (1951) and Keanu Reeves (1964); Tom Hanks ACTED_IN 'Forrest Gump' (roles: ['Forrest']) and 'Cloud Atlas' (roles: ['Zachry']); Robert Zemeckis DIRECTED 'Forrest Gump'.]

Filtering Results
So far we’ve matched patterns in the graph and always returned all results we found. Quite often there
are conditions in play for what we want to see. As in SQL, those filter conditions are expressed
in a WHERE clause. This clause allows you to use any number of boolean expressions (predicates) combined
with AND, OR, XOR and NOT. The simplest predicates are comparisons, especially equality.
MATCH (m:Movie)
WHERE m.title = "The Matrix"
RETURN m

m
Node[0]{title:"The Matrix", released:1997}
1 row

For equality on one or more properties, a more compact syntax can be used as well:
MATCH (m:Movie { title: "The Matrix" })
RETURN m

Other options are numeric comparisons, matching regular expressions and checking the existence of
values within a collection.
The WHERE clause below includes a regular expression match, a greater than comparison and a test to
see if a value exists in a collection.
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name =~ "K.+" OR m.released > 2000 OR "Neo" IN r.roles
RETURN p,r,m

p                                    | r                              | m
Node[5]{name:"Tom Hanks", born:1956} | :ACTED_IN[1]{roles:["Zachry"]} | Node[1]{title:"Cloud Atlas", released:2012}
1 row

One aspect that might be a little surprising is that you can even use patterns as predicates. Where MATCH
expands the number and shape of patterns matched, a pattern predicate restricts the current result
set. It only allows the paths to pass that satisfy the additional patterns as well (or NOT).
MATCH (p:Person)-[:ACTED_IN]->(m)
WHERE NOT (p)-[:DIRECTED]->()
RETURN p,m

p                                    | m
Node[5]{name:"Tom Hanks", born:1956} | Node[1]{title:"Cloud Atlas", released:2012}
Node[5]{name:"Tom Hanks", born:1956} | Node[2]{title:"Forrest Gump", released:1994}
2 rows

Here we find actors, because they sport an ACTED_IN relationship, but then skip those that ever DIRECTED
any movie.
There are also more advanced ways of filtering like collection-predicates which we will look at later on.

Returning Results
So far we’ve returned only nodes, relationships, or paths directly via their identifiers. But the RETURN
clause can actually return any number of expressions. What, then, counts as an expression in Cypher?
The simplest expressions are literal values like numbers, strings and arrays as [1,2,3], and maps like
{name:"Tom Hanks", born:1964, movies:["Forrest Gump", ...], count:13}. You can access individual
properties of any node, relationship, or map with a dot-syntax like n.name. Individual elements or slices
of arrays can be retrieved with subscripts like names[0] or movies[1..-1]. Each function evaluation
like length(array), toInt("12"), substring("2014-07-01",0,4), or coalesce(p.nickname,"n/a") is also an
expression.
Predicates that you’d use in WHERE count as boolean expressions.
Of course simpler expressions can be composed and concatenated to form more complex expressions.
By default the expression itself will be used as the label for the column; in many cases you want to alias
it with a more understandable name using expression AS alias. You can later on refer to that column
using its alias.
MATCH (p:Person)
RETURN p, p.name AS name, upper(p.name), coalesce(p.nickname,"n/a") AS nickname, { name: p.name,
label:head(labels(p))} AS person

p                                          | name              | upper(p.name)     | nickname | person
Node[3]{name:"Keanu Reeves", born:1964}    | "Keanu Reeves"    | "KEANU REEVES"    | "n/a"    | {name -> "Keanu Reeves", label -> "Person"}
Node[4]{name:"Robert Zemeckis", born:1951} | "Robert Zemeckis" | "ROBERT ZEMECKIS" | "n/a"    | {name -> "Robert Zemeckis", label -> "Person"}
Node[5]{name:"Tom Hanks", born:1956}       | "Tom Hanks"       | "TOM HANKS"       | "n/a"    | {name -> "Tom Hanks", label -> "Person"}
3 rows

If you’re interested in unique results you can use the DISTINCT keyword after RETURN to indicate that.
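For example (a minimal sketch on the same data), returning each actor’s name only once:
MATCH (p:Person)-[:ACTED_IN]->(:Movie)
RETURN DISTINCT p.name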

Aggregating Information
In many cases you want to aggregate or group the data that you encounter while traversing patterns
in your graph. In Cypher aggregation happens in the RETURN clause while computing your final results.
Many common aggregation functions are supported, e.g. count, sum, avg, min, and max, but there are
several more.
Counting the number of people in your database could be achieved by this:
MATCH (:Person)
RETURN count(*) AS people

people
3
1 row

Please note that NULL values are skipped during aggregation. For aggregating only unique values use
DISTINCT, like in count(DISTINCT role).
Aggregation in Cypher just works. You specify which result columns you want to aggregate and Cypher
will use all non-aggregated columns as grouping keys.
Aggregation affects which data is still visible in ordering or later query parts.
To find out how often an actor and director worked together, you’d run this statement:
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(director:Person)
RETURN actor,director,count(*) AS collaborations

actor                                | director                                   | collaborations
Node[5]{name:"Tom Hanks", born:1956} | Node[4]{name:"Robert Zemeckis", born:1951} | 1
1 row

Frequently you want to sort and paginate after aggregating a count(x).

Ordering and Pagination
Ordering works like in other query languages, with an ORDER BY expression [ASC|DESC] clause. The
expression can be any expression discussed before as long as it is computable from the returned
information.
So for instance if you return person.name you can still ORDER BY person.age as both are accessible from
the person reference. You cannot order by things that you can’t infer from the information you return.
This is especially important with aggregation and DISTINCT return values as both remove the visibility of
data that is aggregated.
Pagination is a straightforward use of SKIP {offset} LIMIT {count}.
A common pattern is to aggregate for a count (score or frequency), order by it and only return the top-n
entries.
For instance to find the most prolific actors you could do:
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a,count(*) AS appearances
ORDER BY appearances DESC LIMIT 10;

a                                    | appearances
Node[5]{name:"Tom Hanks", born:1956} | 2
1 row
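To page through such a result you would combine SKIP and LIMIT; a sketch for fetching the second page of ten entries (our tiny example graph has no second page, so this particular query returns no rows here):

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a.name AS actor, count(*) AS appearances
ORDER BY appearances DESC
SKIP 10 LIMIT 10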

Collecting Aggregation
The most helpful aggregation function is collect, which, as the name says, collects all aggregated
values into a real array or list. This comes in very handy in many situations, as you don’t lose the detail
information while aggregating.
Collect is well suited for retrieving the typical parent-child structures, where one core entity (parent,
root or head) is returned per row with all of its dependent information in associated collections created
with collect. This means there’s no need to repeat the parent information for each child row, or to run
1+n statements to retrieve the parent and its children individually.
To retrieve the cast of each movie in our database you could use this statement:
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title AS movie, collect(a.name) AS cast, count(*) AS actors

movie          | cast          | actors
"Forrest Gump" | ["Tom Hanks"] | 1
"Cloud Atlas"  | ["Tom Hanks"] | 1
2 rows

The lists created by collect can either be used from the client consuming the Cypher results or directly
within a statement with any of the collection functions or predicates.
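For instance, a collected list can be filtered right inside the same statement; a minimal sketch that keeps only movies with more than one actor (with our small sample data both movies have a single actor, so nothing is returned):

MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
WITH m.title AS movie, collect(a.name) AS cast
WHERE size(cast) > 1
RETURN movie, cast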


3.5. How to Compose Large Statements
Let’s first get some data in to retrieve results from:
CREATE (matrix:Movie { title:"The Matrix",released:1997 })
CREATE (cloudAtlas:Movie { title:"Cloud Atlas",released:2012 })
CREATE (forrestGump:Movie { title:"Forrest Gump",released:1994 })
CREATE (keanu:Person { name:"Keanu Reeves", born:1964 })
CREATE (robert:Person { name:"Robert Zemeckis", born:1951 })
CREATE (tom:Person { name:"Tom Hanks", born:1956 })
CREATE (tom)-[:ACTED_IN { roles: ["Forrest"]}]->(forrestGump)
CREATE (tom)-[:ACTED_IN { roles: ['Zachry']}]->(cloudAtlas)
CREATE (robert)-[:DIRECTED]->(forrestGump)

Combine statements with UNION
A Cypher statement is usually quite compact. Expressing references between nodes as visual patterns
makes them easy to understand.
If you want to combine the results of two statements that have the same result structure, you can use
UNION [ALL].
For instance if you want to list both actors and directors without using the alternative relationship-type
syntax ()-[:ACTED_IN|:DIRECTED]->() you can do this:
MATCH (actor:Person)-[r:ACTED_IN]->(movie:Movie)
RETURN actor.name AS name, type(r) AS acted_in, movie.title AS title
UNION
MATCH (director:Person)-[r:DIRECTED]->(movie:Movie)
RETURN director.name AS name, type(r) AS acted_in, movie.title AS title

name              | acted_in   | title
"Tom Hanks"       | "ACTED_IN" | "Cloud Atlas"
"Tom Hanks"       | "ACTED_IN" | "Forrest Gump"
"Robert Zemeckis" | "DIRECTED" | "Forrest Gump"
3 rows

Use WITH to Chain Statements
In Cypher it’s possible to chain fragments of statements together, much like you would do within a
data-flow pipeline. Each fragment works on the output from the previous one and its results can feed
into the next one.
You use the WITH clause to combine the individual parts and declare which data flows from one to the
other. WITH is very much like RETURN with the difference that it doesn’t finish a query but prepares the
input for the next part. You can use the same expressions, aggregations, ordering and pagination as in
the RETURN clause.
The only difference is that you must alias all columns, as they would otherwise not be accessible. Only
columns that you declare in your WITH clause are available in subsequent query parts.
See below for an example where we collect the movies someone appeared in, and then filter out
everyone who appeared in only one movie.
MATCH (person:Person)-[:ACTED_IN]->(m:Movie)
WITH person, count(*) AS appearances, collect(m.title) AS movies
WHERE appearances > 1
RETURN person.name, appearances, movies

person.name | appearances | movies
"Tom Hanks" | 2           | ["Cloud Atlas", "Forrest Gump"]
1 row


Tip

If you want to filter by an aggregated value in SQL or similar languages, you would have to
use HAVING, a single-purpose clause for filtering aggregated information. In Cypher,
WHERE can be used in both cases.
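As a sketch of that difference, the Cypher counterpart of a HAVING-style filter is simply a WHERE after the aggregating WITH:

MATCH (p:Person)-[:ACTED_IN]->(:Movie)
WITH p, count(*) AS movies
WHERE movies > 1
RETURN p.name, movies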


3.6. Labels, Constraints and Indexes
Labels are a convenient way to group nodes together. They are used to restrict queries, define
constraints and create indexes.

Using Constraints
You can also specify unique constraints that guarantee uniqueness of a certain property on nodes with
a specific label.
These constraints are also used by the MERGE clause to make certain that a node only exists once.
The following will give an example of how to use labels and add constraints and indexes to them. Let’s
start out adding a constraint — in this case we decided that all Movie node titles should be unique.
CREATE CONSTRAINT ON (movie:Movie) ASSERT movie.title IS UNIQUE

Note that adding the unique constraint will add an index on that property, so we won’t do that
separately. If we drop a constraint, and still want an index on the same property, we have to create such
an index.
Constraints can be added after a label is already in use, but that requires that the existing data
complies with the constraints.
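For example, with the unique constraint on :Movie(title) in place, a MERGE like the following minimal sketch is guaranteed to match or create exactly one node for that title, no matter how often it is run:

MERGE (movie:Movie { title: "Forrest Gump" })
ON CREATE SET movie.released = 1994
RETURN movie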

Using indexes
You don’t need indexes for a graph query to run fast in general; you only need them to find your
starting points. The main reason for using indexes in a graph database is to locate those starting points
in the graph as quickly as possible. After the initial index seek you rely on in-graph structures and the
first-class citizenship of relationships in the graph database to achieve high performance.
In this case we want an index to speed up finding actors by name in the database:
CREATE INDEX ON :Actor(name)

Indexes can be added at any time. Note that it will take some time for an index to come online when
there’s existing data.
Now, let’s add some data.
CREATE (actor:Actor { name:"Tom Hanks" }),(movie:Movie { title:'Sleepless IN Seattle' }),
(actor)-[:ACTED_IN]->(movie);

Normally you don’t specify indexes when querying for data. They will be used automatically. This means
we can simply look up the Tom Hanks node, and the index will kick in behind the scenes to boost
performance.
MATCH (actor:Actor { name: "Tom Hanks" })
RETURN actor;

Labels
Now let’s say we want to add another label for a node. Here’s how to do that:
MATCH (actor:Actor { name: "Tom Hanks" })
SET actor :American;

To remove a label from nodes, this is what to do:
MATCH (actor:Actor { name: "Tom Hanks" })
REMOVE actor:American;

Related Content
For more information on labels and related topics, see:
• the section called “Labels” [7]
• Chapter 14, Schema [243]
• Section 14.2, “Constraints” [247]
• Section 14.1, “Indexes” [244]
• Section 10.8, “Using” [152]
• Section 12.3, “Set” [200]
• Section 12.5, “Remove” [205]


3.7. Loading Data
As you’ve seen you can not only query data expressively but also create data with Cypher statements.
Naturally, in most cases you wouldn’t write or generate huge statements by hand to create your data;
instead you would use an existing data source, pass it into your statement, and let it drive the graph
generation process.
That process not only includes creating completely new data but also integrating with existing
structures and updating your graph.

Parameters
In general we recommend passing in varying literal values from the outside as named parameters. This
allows Cypher to reuse existing execution plans for the statements.
Of course you can also pass in parameters for data to be imported. Those can be scalar values, maps,
lists or even lists of maps.
In your Cypher statement you can then iterate over those values (e.g. with UNWIND) to create your graph
structures.
For instance to create a movie graph from JSON data structures pulled from an API you could use:
{
  "movies" : [ {
    "title" : "Stardust",
    "released" : 2007,
    "cast" : [ {
      "actor" : { "name" : "Robert de Niro", "born" : 1943 },
      "characters" : [ "Captain Shakespeare" ]
    }, {
      "actor" : { "name" : "Michelle Pfeiffer", "born" : 1958 },
      "characters" : [ "Lamia" ]
    } ]
  } ]
}
UNWIND {movies} as movie
MERGE (m:Movie {title:movie.title}) ON CREATE SET m.released = movie.released
FOREACH (role IN movie.cast |
MERGE (a:Person {name:role.actor.name}) ON CREATE SET a.born = role.actor.born
MERGE (a)-[:ACTED_IN {roles:role.characters}]->(m)
)

Importing CSV
Cypher provides an elegant built-in way to import tabular CSV data into graph structures.
The LOAD CSV clause parses a local or remote file into a stream of rows which represent maps (with
headers) or lists. Then you can use whatever Cypher operations you want to apply to either create
nodes or relationships or to merge with existing graph structures.
As CSV files usually represent either node- or relationship-lists, you run multiple passes to create nodes
and relationships separately.
For more details, see Section 11.6, “Load CSV” [182].
movies.csv
id,title,country,year
1,Wall Street,USA,1987
2,The American President,USA,1995
3,The Shawshank Redemption,USA,1994
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/movies.csv" AS line
CREATE (m:Movie { id:line.id,title:line.title, released:toInt(line.year)});

persons.csv
id,name
1,Charlie Sheen
2,Oliver Stone
3,Michael Douglas
4,Martin Sheen
5,Morgan Freeman
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/persons.csv" AS line
MERGE (a:Person { id:line.id })
ON CREATE SET a.name=line.name;

roles.csv
personId,movieId,role
1,1,Bud Fox
4,1,Carl Fox
3,1,Gordon Gekko
4,2,A.J. MacInerney
3,2,President Andrew Shepherd
5,3,Ellis Boyd 'Red' Redding
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/roles.csv" AS line
MATCH (m:Movie { id:line.movieId })
MATCH (a:Person { id:line.personId })
CREATE (a)-[:ACTED_IN { roles: [line.role]}]->(m);

[Graph figure: the imported graph. Person nodes Charlie Sheen, Martin Sheen, Michael Douglas and
Morgan Freeman are connected via ACTED_IN relationships (carrying their roles) to the Movie nodes
'Wall Street' (1987), 'The American President' (1995) and 'The Shawshank Redemption' (1994); the
Person node Oliver Stone has no ACTED_IN relationship.]
If your file contains denormalized data, you can either run the same file with multiple passes and
simple operations as shown above or you might have to use MERGE to create entities uniquely.
For our use-case we can import the data using a CSV structure like this:
movie_actor_roles.csv
title;released;actor;born;characters
Back to the Future;1985;Michael J. Fox;1961;Marty McFly
Back to the Future;1985;Christopher Lloyd;1938;Dr. Emmet Brown
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/movie_actor_roles.csv" AS line
FIELDTERMINATOR ";"
MERGE (m:Movie { title:line.title })
ON CREATE SET m.released = toInt(line.released)
MERGE (a:Person { name:line.actor })
ON CREATE SET a.born = toInt(line.born)
MERGE (a)-[:ACTED_IN { roles:split(line.characters,",")}]->(m)


[Graph figure: the graph from above, extended with Person nodes Michael J. Fox (born 1961) and
Christopher Lloyd (born 1938), connected via ACTED_IN relationships (roles 'Marty McFly' and
'Dr. Emmet Brown') to the new Movie node 'Back to the Future' (released 1985).]

If you import a large amount of data (more than 10,000 rows), it is recommended to prefix your LOAD
CSV clause with a USING PERIODIC COMMIT hint. This allows Neo4j to commit the import regularly,
avoiding the memory churn of building up a large transaction state.
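A sketch of such an import; the batch size of 1000 is an illustrative choice, and leaving the number out makes Neo4j use its default:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/movies.csv" AS line
CREATE (m:Movie { id:line.id, title:line.title, released:toInt(line.year)});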


3.8. Utilizing Data Structures
Cypher can create and consume more complex data structures out of the box. As already mentioned
you can create literal lists ([1,2,3]) and maps ({name: value}) within a statement.
There are a number of functions that work with lists. They range from simple ones like size(list) that
returns the size of a list to reduce, which runs an expression against the elements and accumulates the
results.
Let’s first load a bit of data into the graph. If you want more details on how the data is loaded, see the
section called “Importing CSV” [34].
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/movies.csv" AS line
CREATE (m:Movie { id:line.id,title:line.title, released:toInt(line.year)});
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/persons.csv" AS line
MERGE (a:Person { id:line.id })
ON CREATE SET a.name=line.name;
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/roles.csv" AS line
MATCH (m:Movie { id:line.movieId })
MATCH (a:Person { id:line.personId })
CREATE (a)-[:ACTED_IN { roles: [line.role]}]->(m);
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/intro/movie_actor_roles.csv" AS line
FIELDTERMINATOR ";"
MERGE (m:Movie { title:line.title })
ON CREATE SET m.released = toInt(line.released)
MERGE (a:Person { name:line.actor })
ON CREATE SET a.born = toInt(line.born)
MERGE (a)-[:ACTED_IN { roles:split(line.characters,",")}]->(m)

Now, let’s try out data structures.
To begin with, collect the names of the actors per movie, and return two of them:
MATCH (movie:Movie)<-[:ACTED_IN]-(actor:Person)
RETURN movie.title AS movie, collect(actor.name)[0..2] AS two_of_cast

movie                      | two_of_cast
"The American President"   | ["Michael Douglas", "Martin Sheen"]
"Back to the Future"       | ["Christopher Lloyd", "Michael J. Fox"]
"Wall Street"              | ["Michael Douglas", "Martin Sheen"]
"The Shawshank Redemption" | ["Morgan Freeman"]
4 rows

You can also access individual elements or slices of a list quickly with list[1] or list[5..-5]. Other
functions to access parts of a list are head(list), tail(list) and last(list).
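A minimal sketch combining these accessors on a collected cast list:

MATCH (movie:Movie)<-[:ACTED_IN]-(actor:Person)
WITH movie.title AS movie, collect(actor.name) AS cast
RETURN movie, head(cast) AS first_in_list, last(cast) AS last_in_list, tail(cast) AS rest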

List Predicates
When using lists and arrays in comparisons you can use predicates like value IN list or any(x IN list
WHERE x = value). There are list predicates to satisfy conditions for all, any, none and single elements.
MATCH path =(:Person)-->(:Movie)<--(:Person)
WHERE ANY (n IN nodes(path) WHERE n.name = 'Michael Douglas')
RETURN extract(n IN nodes(path)| coalesce(n.name, n.title))

extract(n IN nodes(path)| coalesce(n.name, n.title))
["Martin Sheen", "Wall Street", "Michael Douglas"]
["Charlie Sheen", "Wall Street", "Michael Douglas"]
["Michael Douglas", "Wall Street", "Martin Sheen"]
["Michael Douglas", "Wall Street", "Charlie Sheen"]
["Martin Sheen", "The American President", "Michael Douglas"]
["Michael Douglas", "The American President", "Martin Sheen"]
6 rows

List Processing
Oftentimes you want to process lists to filter, aggregate (reduce) or transform (extract) their values.
Those transformations can be done within Cypher or in the calling code. This kind of list-processing can
reduce the amount of data handled and returned, so it might make sense to do it within the Cypher
statement.
A simple, non-graph example would be:
WITH range(1,10) AS numbers
WITH extract(n IN numbers | n*n) AS squares
WITH filter(n IN squares WHERE n > 25) AS large_squares
RETURN reduce(a = 0, n IN large_squares | a + n) AS sum_large_squares

sum_large_squares
330
1 row

In a graph query you can instead filter or aggregate the collected values, or work on array properties.
MATCH (m:Movie)<-[r:ACTED_IN]-(a:Person)
WITH m.title AS movie, collect({ name: a.name, roles: r.roles }) AS cast
RETURN movie, filter(actor IN cast WHERE actor.name STARTS WITH "M")

movie                      | filter(actor IN cast WHERE actor.name STARTS WITH "M")
"The American President"   | [{name -> "Michael Douglas", roles -> ["President Andrew Shepherd"]}, {name -> "Martin Sheen", roles -> ["A.J. MacInerney"]}]
"Back to the Future"       | [{name -> "Michael J. Fox", roles -> ["Marty McFly"]}]
"Wall Street"              | [{name -> "Michael Douglas", roles -> ["Gordon Gekko"]}, {name -> "Martin Sheen", roles -> ["Carl Fox"]}]
"The Shawshank Redemption" | [{name -> "Morgan Freeman", roles -> ["Ellis Boyd 'Red' Redding"]}]
4 rows

Unwind Lists
Sometimes you have collected information into a list, but want to use each element individually as a
row. For instance, you might want to further match patterns in the graph. Or you passed in a collection
of values but now want to create or match a node or relationship for each element. Then you can use
the UNWIND clause to unroll a list into a sequence of rows again.
For instance, a query to find the top 3 co-actors and then follow their movies and again list the cast for
each of those movies:
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(colleague:Person)
WHERE actor.name < colleague.name
WITH actor, colleague, count(*) AS frequency, collect(movie) AS movies
ORDER BY frequency DESC LIMIT 3 UNWIND movies AS m
MATCH (m)<-[:ACTED_IN]-(a)
RETURN m.title AS movie, collect(a.name) AS cast

movie                    | cast
"The American President" | ["Michael Douglas", "Martin Sheen"]
"Back to the Future"     | ["Christopher Lloyd", "Michael J. Fox"]
"Wall Street"            | ["Michael Douglas", "Martin Sheen", "Charlie Sheen", "Michael Douglas", "Martin Sheen", "Charlie Sheen"]
3 rows


3.9. Cypher vs. SQL
If you have used SQL and want to learn Cypher, this chapter is for you! We won’t dig very deep into
either of the languages, but focus on bridging the gap.

Data Model
For our example, we will use data about persons who act in, direct, produce movies.
Here’s an entity-relationship model for the example:

[Entity-relationship diagram: Person is connected to Movie by the relationships 'acted in', 'directed' and 'produced'.]

We have Person and Movie entities, which are related in three different ways, each of which has many-to-many cardinality.
In a RDBMS we would use tables for the entities as well as for the associative entities (join tables)
needed. In this case we decided to go with the following tables: movie, person, acted_in, directed,
produced. You’ll find the SQL for this below.
In Neo4j, the basic data units are nodes and relationships. Both can have properties, which correspond
to attributes in a RDBMS.
Nodes can be grouped by putting labels on them. In the example, we will use the labels Movie and
Person.
When using Neo4j, related entities can be represented directly by using relationships. There’s no need
to deal with foreign keys to handle the relationships, the database will take care of such mechanics.
Also, the relationships always have full referential integrity. There are no constraints to enable for this,
as it’s not optional; it’s really part of the underlying data model. Relationships always have a type, and
we will differentiate the different kinds of relationships by using the types ACTED_IN, DIRECTED, PRODUCED.

Sample Data
First off, let’s see how to set up our example data in a RDBMS. We’ll start out creating a few tables and
then go on to populate them.
CREATE TABLE movie (
id INTEGER,
title VARCHAR(100),
released INTEGER,
tagline VARCHAR(100)
);
CREATE TABLE person (
id INTEGER,
name VARCHAR(100),
born INTEGER
);
CREATE TABLE acted_in (
role varchar(100),
person_id INTEGER,
movie_id INTEGER
);
CREATE TABLE directed (
person_id INTEGER,
movie_id INTEGER
);
CREATE TABLE produced (
person_id INTEGER,
movie_id INTEGER
);

Populating with data:
INSERT INTO movie (id, title, released, tagline)
VALUES
  (1, 'The Matrix', 1999, 'Welcome to the Real World'),
  (2, 'The Devil''s Advocate', 1997, 'Evil has its winning ways'),
  (3, 'Monster', 2003, 'The first female serial killer of America');
INSERT INTO person (id, name, born)
VALUES
  (1, 'Keanu Reeves', 1964),
  (2, 'Carrie-Anne Moss', 1967),
  (3, 'Laurence Fishburne', 1961),
  (4, 'Hugo Weaving', 1960),
  (5, 'Andy Wachowski', 1967),
  (6, 'Lana Wachowski', 1965),
  (7, 'Joel Silver', 1952),
  (8, 'Charlize Theron', 1975),
  (9, 'Al Pacino', 1940),
  (10, 'Taylor Hackford', 1944);
INSERT INTO acted_in (role, person_id, movie_id)
VALUES
  ('Neo', 1, 1),
  ('Trinity', 2, 1),
  ('Morpheus', 3, 1),
  ('Agent Smith', 4, 1),
  ('Kevin Lomax', 1, 2),
  ('Mary Ann Lomax', 8, 2),
  ('John Milton', 9, 2),
  ('Aileen', 8, 3);
INSERT INTO directed (person_id, movie_id)
VALUES
  (5, 1),
  (6, 1),
  (10, 2);
INSERT INTO produced (person_id, movie_id)
VALUES
  (7, 1),
  (8, 3);

Doing this in Neo4j will look quite different. To begin with, we won’t create any schema up front. We’ll
come back to schema later, for now it’s enough to know that labels can be used right away without
declaring them.
In the CREATE statements below, we tell Neo4j what data we want to have in the graph. Simply put, the
parentheses denote nodes, while the arrows (-->, or, with a relationship type included, -[:DIRECTED]->)
denote relationships. For the nodes we set identifiers like TheMatrix so we can easily refer to them later
on in the statement. Note that the identifiers are scoped to the statement, and not visible to other
Cypher statements. We could use identifiers for the relationships as well, but there’s no need for that in
this case.
CREATE (TheMatrix:Movie { title:'The Matrix', released:1999, tagline:'Welcome to the Real World' })
CREATE (Keanu:Person { name:'Keanu Reeves', born:1964 })
CREATE (Carrie:Person { name:'Carrie-Anne Moss', born:1967 })
CREATE (Laurence:Person { name:'Laurence Fishburne', born:1961 })
CREATE (Hugo:Person { name:'Hugo Weaving', born:1960 })
CREATE (AndyW:Person { name:'Andy Wachowski', born:1967 })
CREATE (LanaW:Person { name:'Lana Wachowski', born:1965 })
CREATE (JoelS:Person { name:'Joel Silver', born:1952 })
CREATE (Keanu)-[:ACTED_IN { roles: ['Neo']}]->(TheMatrix),
(Carrie)-[:ACTED_IN { roles: ['Trinity']}]->(TheMatrix),
(Laurence)-[:ACTED_IN { roles: ['Morpheus']}]->(TheMatrix),
(Hugo)-[:ACTED_IN { roles: ['Agent Smith']}]->(TheMatrix),(AndyW)-[:DIRECTED]->(TheMatrix),
(LanaW)-[:DIRECTED]->(TheMatrix),(JoelS)-[:PRODUCED]->(TheMatrix)
CREATE (TheDevilsAdvocate:Movie { title:"The Devil's Advocate", released:1997,
tagline: 'Evil has its winning ways' })
CREATE (Monster:Movie { title: 'Monster', released: 2003,
tagline: 'The first female serial killer of America' })
CREATE (Charlize:Person { name:'Charlize Theron', born:1975 })
CREATE (Al:Person { name:'Al Pacino', born:1940 })
CREATE (Taylor:Person { name:'Taylor Hackford', born:1944 })
CREATE (Keanu)-[:ACTED_IN { roles: ['Kevin Lomax']}]->(TheDevilsAdvocate),
(Charlize)-[:ACTED_IN { roles: ['Mary Ann Lomax']}]->(TheDevilsAdvocate),
(Al)-[:ACTED_IN { roles: ['John Milton']}]->(TheDevilsAdvocate),
(Taylor)-[:DIRECTED]->(TheDevilsAdvocate),(Charlize)-[:ACTED_IN { roles: ['Aileen']}]->(Monster),
(Charlize)-[:PRODUCED { roles: ['Aileen']}]->(Monster)

Simple read of data
Let’s find all entries in the movie table and output their title attribute in our RDBMS:
SELECT movie.title
FROM movie;

TITLE
The Matrix
The Devil's Advocate
Monster
3 rows

Using Neo4j, find all nodes labeled Movie and output their title property:
MATCH (movie:Movie)
RETURN movie.title;

movie.title
"The Matrix"
"The Devil's Advocate"
"Monster"
3 rows

MATCH tells Neo4j to match a pattern in the graph. In this case the pattern is very simple: any node with a
Movie label on it. We bind the result of the pattern matching to the identifier movie, for use in the RETURN
clause. And as you can see, the RETURN keyword of Cypher is similar to SELECT in SQL.

Now let’s get movies released after 1998.
SELECT movie.title
FROM movie
WHERE movie.released > 1998;

TITLE
The Matrix
Monster
2 rows

In this case the addition actually looks identical in Cypher.
MATCH (movie:Movie)
WHERE movie.released > 1998
RETURN movie.title;

movie.title
"The Matrix"
"Monster"
2 rows

Note however that the semantics of WHERE in Cypher is somewhat different, see Section 11.3,
“Where” [166] for more information.

Join
Let’s list all persons and the movies they acted in.
SELECT person.name, movie.title
FROM person
JOIN acted_in AS acted_in ON acted_in.person_id = person.id
JOIN movie ON acted_in.movie_id = movie.id;

NAME               | TITLE
Keanu Reeves       | The Matrix
Keanu Reeves       | The Devil's Advocate
Carrie-Anne Moss   | The Matrix
Laurence Fishburne | The Matrix
Hugo Weaving       | The Matrix
Charlize Theron    | The Devil's Advocate
Charlize Theron    | Monster
Al Pacino          | The Devil's Advocate
8 rows

The same using Cypher:
MATCH (person:Person)-[:ACTED_IN]->(movie:Movie)
RETURN person.name, movie.title;

Here we match a Person and a Movie node whenever they are connected by an ACTED_IN relationship.
person.name          | movie.title
"Hugo Weaving"       | "The Matrix"
"Laurence Fishburne" | "The Matrix"
"Carrie-Anne Moss"   | "The Matrix"
"Keanu Reeves"       | "The Matrix"
"Al Pacino"          | "The Devil's Advocate"
"Charlize Theron"    | "The Devil's Advocate"
"Keanu Reeves"       | "The Devil's Advocate"
"Charlize Theron"    | "Monster"
8 rows

To make things slightly more complex, let’s search for the co-actors of Keanu Reeves. In SQL we use a
self join on the person table and join on the acted_in table once for Keanu, and once for the co-actors.
SELECT DISTINCT co_actor.name
FROM person AS keanu
JOIN acted_in AS acted_in1 ON acted_in1.person_id = keanu.id
JOIN acted_in AS acted_in2 ON acted_in2.movie_id = acted_in1.movie_id
JOIN person AS co_actor
ON acted_in2.person_id = co_actor.id AND co_actor.id <> keanu.id
WHERE keanu.name = 'Keanu Reeves';

NAME
Al Pacino
Carrie-Anne Moss
Charlize Theron
Hugo Weaving
Laurence Fishburne
5 rows

In Cypher, we use a pattern with two paths that target the same Movie node.
MATCH (keanu:Person)-[:ACTED_IN]->(movie:Movie),(coActor:Person)-[:ACTED_IN]->(movie)
WHERE keanu.name = 'Keanu Reeves'
RETURN DISTINCT coActor.name;

You may have noticed that we used the co_actor.id <> keanu.id predicate in SQL only. This is because
Neo4j will only match on the ACTED_IN relationship once in the same pattern. If this is not what we want,
we can split the pattern up by using two MATCH clauses like this:
MATCH (keanu:Person)-[:ACTED_IN]->(movie:Movie)
MATCH (coActor:Person)-[:ACTED_IN]->(movie)
WHERE keanu.name = 'Keanu Reeves'
RETURN DISTINCT coActor.name;

This time Keanu Reeves is included in the result as well:
coActor.name
"Al Pacino"
"Charlize Theron"
"Keanu Reeves"
"Hugo Weaving"
"Laurence Fishburne"
"Carrie-Anne Moss"
6 rows

Next, let’s find out who has both acted in and produced movies.
SELECT person.name
FROM person
WHERE person.id IN (SELECT person_id FROM acted_in)
AND person.id IN (SELECT person_id FROM produced)

NAME
Charlize Theron
1 row

In Cypher, we use patterns as predicates in this case. That is, we require the relationships to exist, but
don’t care about the connected nodes; thus the empty parentheses.
MATCH (person:Person)
WHERE (person)-[:ACTED_IN]->() AND (person)-[:PRODUCED]->()
RETURN person.name

Aggregation
Now let’s find out a bit about the directors in movies that Keanu Reeves acted in. We want to know how
many of those movies each of them directed.
SELECT director.name, count(*)
FROM person keanu
JOIN acted_in ON keanu.id = acted_in.person_id
JOIN directed ON acted_in.movie_id = directed.movie_id
JOIN person AS director ON directed.person_id = director.id
WHERE keanu.name = 'Keanu Reeves'
GROUP BY director.name
ORDER BY count(*) DESC

NAME            | C2
Andy Wachowski  | 1
Lana Wachowski  | 1
Taylor Hackford | 1
3 rows

Here’s how we’ll do the same in Cypher:
MATCH (keanu:Person { name: 'Keanu Reeves' })-[:ACTED_IN]->(movie:Movie),
(director:Person)-[:DIRECTED]->(movie)
RETURN director.name, count(*)
ORDER BY count(*) DESC

As you can see there is no GROUP BY in the Cypher equivalent. Instead, Neo4j will automatically figure out
the grouping key.


Chapter 4. Use Cypher in an application
The most direct way to use Cypher programmatically is to execute an HTTP POST operation against
the transactional Cypher endpoint. You can send a large number of statements with parameters to
the server with each request. For immediate execution you can use the /db/data/transaction/commit
endpoint with a JSON payload like this:
curl -i -H accept:application/json -H content-type:application/json -XPOST http://localhost:7474/db/data/transaction/commit \
  -d '{"statements":[{"statement":"CREATE (p:Person {name:{name},born:{born}}) RETURN p","parameters":{"name":"Keanu Reeves","born":1964}}]}'

The above command results in:
{"results":[{"columns":["p"],"data":[{"row":[{"name":"Keanu Reeves","born":1964}]}]}],"errors":[]}

You can add as many "statement" objects in the "statements" list as you want.
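For example, a payload with two statements, one parameterized write and one follow-up read, might look like this sketch:

{"statements":[
  {"statement":"CREATE (p:Person {name:{name}}) RETURN p","parameters":{"name":"Tom Hanks"}},
  {"statement":"MATCH (p:Person) RETURN count(p) AS people"}
]}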
For larger use cases that span multiple requests, but whose read and write operations should be
executed within the same transactional scope, you’d use the /db/data/transaction endpoint. This will
give you a transaction URL as the Location header, which you can continue to write to and read from. At
the end you either commit the whole transaction by POSTing to the (also returned) commit URL, or roll
it back by issuing a DELETE request against the transaction URL.
curl -i -H accept:application/json -H content-type:application/json -XPOST http://localhost:7474/db/data/transaction \
  -d '{"statements":[{"statement":"CREATE (p:Person {name:{name},born:{born}}) RETURN p","parameters":{"name":"Clint Eastwood","born":1930}}]}'

The above command results in:
HTTP/1.1 201 Created
Location: http://localhost:7474/db/data/transaction/261
{"commit":"http://localhost:7474/db/data/transaction/261/commit","transaction":{"expires":"Wed, 03 Sep 2014 23:26:51 +0000"},"errors":[],"results":[{"columns":["p"],"data":[{"row":[{"name":"Clint Eastwood","born":1930}]}]}]}

See Section 21.1, “Transactional Cypher HTTP endpoint” [298] for more information.


Chapter 5. Basic Data Modeling Examples
The following chapters contain simple examples to get you started thinking about data modeling with
graphs. If you are looking for more advanced examples you can head straight to Chapter 6, Advanced
Data Modeling Examples [62].
The examples use Cypher queries extensively; read Part III, “Cypher Query Language” [102] for more
information.


5.1. Movie Database
Our example graph consists of movies with title and year, and actors with a name. Actors have ACTS_IN
relationships to movies, representing the roles they played. Each of these relationships also has a role
attribute.
We’ll go with three movies and three actors:
CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

This gives us the following graph to play with:
[Graph figure: the three Actor nodes Keanu Reeves, Laurence Fishburne and Carrie-Anne Moss, each
connected by ACTS_IN relationships (roles 'Neo', 'Morpheus' and 'Trinity' respectively) to each of the
Movie nodes 'The Matrix', 'The Matrix Reloaded' and 'The Matrix Revolutions'.]

Let’s check how many nodes we have now:
MATCH (n)
RETURN "Hello Graph with " + count(*)+ " Nodes!" AS welcome;

Return a single node, by title:
MATCH (movie:Movie { title: 'The Matrix' })
RETURN movie;

Return the title and date of the matrix node:
MATCH (movie:Movie { title: 'The Matrix' })
RETURN movie.title, movie.year;

Which results in:
movie.title  | movie.year
"The Matrix" | "1999-03-31"
1 row

Show all actors:
MATCH (actor:Actor)
RETURN actor;

Return just the name, and order them by name:
MATCH (actor:Actor)
RETURN actor.name
ORDER BY actor.name;

Count the actors:
MATCH (actor:Actor)
RETURN count(*);

Get only the actors whose names end with “s”:
MATCH (actor:Actor)
WHERE actor.name =~ ".*s$"
RETURN actor.name;

Here are some exploratory queries for unknown datasets. Don’t run these on a live production database!
Count nodes:
MATCH (n)
RETURN count(*);

Count relationship types:
MATCH (n)-[r]->()
RETURN type(r), count(*);

type(r)   | count(*)
"ACTS_IN" | 9
1 row

List all nodes and their relationships:
MATCH (n)-[r]->(m)
RETURN n AS from, r AS `->`, m AS to;

from                               | ->                           | to
Node[3]{name:"Keanu Reeves"}       | :ACTS_IN[2]{role:"Neo"}      | Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}
Node[3]{name:"Keanu Reeves"}       | :ACTS_IN[1]{role:"Neo"}      | Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}
Node[3]{name:"Keanu Reeves"}       | :ACTS_IN[0]{role:"Neo"}      | Node[0]{title:"The Matrix", year:"1999-03-31"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[5]{role:"Morpheus"} | Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[4]{role:"Morpheus"} | Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}
Node[4]{name:"Laurence Fishburne"} | :ACTS_IN[3]{role:"Morpheus"} | Node[0]{title:"The Matrix", year:"1999-03-31"}
Node[5]{name:"Carrie-Anne Moss"}   | :ACTS_IN[8]{role:"Trinity"}  | Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}
Node[5]{name:"Carrie-Anne Moss"}   | :ACTS_IN[7]{role:"Trinity"}  | Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}
Node[5]{name:"Carrie-Anne Moss"}   | :ACTS_IN[6]{role:"Trinity"}  | Node[0]{title:"The Matrix", year:"1999-03-31"}
9 rows


5.2. Social Movie Database
Our example graph consists of movies with title and year, and actors with a name. Actors have ACTS_IN
relationships to movies, representing the roles they played. Each of these relationships also has a role
attribute.
So far, we queried the movie data; now let’s update the graph too.
CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

We will add ourselves, friends and movie ratings.
Here’s how to add a node for yourself and return it, let’s say your name is “Me”:
CREATE (me:User { name: "Me" })
RETURN me;

me
Node[6]{name:"Me"}
1 row
Nodes created: 1
Properties set: 1
Labels added: 1

Let’s check if the node is there:
MATCH (me:User { name: "Me" })
RETURN me.name;

Add a movie rating:
MATCH (me:User { name: "Me" }),(movie:Movie { title: "The Matrix" })
CREATE (me)-[:RATED { stars : 5, comment : "I love that movie!" }]->(movie);

Which movies did I rate?
MATCH (me:User { name: "Me" }),(me)-[rating:RATED]->(movie)
RETURN movie.title, rating.stars, rating.comment;

movie.title  | rating.stars | rating.comment
"The Matrix" | 5            | "I love that movie!"
1 row

We need a friend!
CREATE (friend:User { name: "A Friend" })
RETURN friend;

Add our friendship idempotently, so we can re-run the query without creating the relationship several
times. We return the relationship to check that it has not been duplicated.
MATCH (me:User { name: "Me" }),(friend:User { name: "A Friend" })
CREATE UNIQUE (me)-[friendship:FRIEND]->(friend)
RETURN friendship;

You can rerun the query and see that it doesn’t change anything the second time!
Let’s update our friendship with a since property:
MATCH (me:User { name: "Me" })-[friendship:FRIEND]->(friend:User { name: "A Friend" })
SET friendship.since='forever'
RETURN friendship;

Let’s pretend to be our friend for a moment, and see which movies that friend’s friends have rated.
MATCH (me:User { name: "A Friend" })-[:FRIEND]-(friend)-[rating:RATED]->(movie)
RETURN movie.title, avg(rating.stars) AS stars, collect(rating.comment) AS comments, count(*);

movie.title  | stars | comments               | count(*)
"The Matrix" | 5.0   | ["I love that movie!"] | 1
1 row

That’s too little data, let’s add some more friends and friendships.
MATCH (me:User { name: "Me" })
FOREACH (i IN range(1,10)| CREATE (friend:User { name: "Friend " + i }),(me)-[:FRIEND]->(friend));

Show all our friends:
MATCH (me:User { name: "Me" })-[r:FRIEND]->(friend)
RETURN type(r) AS friendship, friend.name;

friendship | friend.name
"FRIEND"   | "Friend 5"
"FRIEND"   | "Friend 4"
"FRIEND"   | "Friend 3"
"FRIEND"   | "Friend 2"
"FRIEND"   | "Friend 1"
"FRIEND"   | "Friend 10"
"FRIEND"   | "Friend 8"
"FRIEND"   | "Friend 9"
"FRIEND"   | "Friend 6"
"FRIEND"   | "Friend 7"
"FRIEND"   | "A Friend"
11 rows


5.3. Finding Paths
Our example graph consists of movies with title and year, and actors with a name. Actors have ACTS_IN
relationships to movies, representing the roles they played. Each of these relationships also has a role
attribute.
So far we have queried and updated the data; now let’s find interesting constellations, a.k.a. paths.
CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

All other movies that actors in “The Matrix” acted in, ordered by occurrence:
MATCH (:Movie { title: "The Matrix" })<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)
RETURN movie.title, count(*)
ORDER BY count(*) DESC ;

movie.title              | count(*)
"The Matrix Revolutions" | 3
"The Matrix Reloaded"    | 3
2 rows

Let’s see who acted in each of these movies:
MATCH (:Movie { title: "The Matrix" })<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)
RETURN movie.title, collect(actor.name), count(*) AS count
ORDER BY count DESC ;

movie.title              | collect(actor.name)                                        | count
"The Matrix Revolutions" | ["Carrie-Anne Moss", "Laurence Fishburne", "Keanu Reeves"] | 3
"The Matrix Reloaded"    | ["Carrie-Anne Moss", "Laurence Fishburne", "Keanu Reeves"] | 3
2 rows

What about co-acting, that is actors that acted together:
MATCH (:Movie { title: "The Matrix"
})<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)<-[:ACTS_IN]-(colleague)
RETURN actor.name, collect(DISTINCT colleague.name);

actor.name           | collect(distinct colleague.name)
"Carrie-Anne Moss"   | ["Laurence Fishburne", "Keanu Reeves"]
"Keanu Reeves"       | ["Carrie-Anne Moss", "Laurence Fishburne"]
"Laurence Fishburne" | ["Carrie-Anne Moss", "Keanu Reeves"]
3 rows

Who among those actors acted most often with anyone from the Matrix cast?
MATCH (:Movie { title: "The Matrix"
})<-[:ACTS_IN]-(actor)-[:ACTS_IN]->(movie)<-[:ACTS_IN]-(colleague)
RETURN colleague.name, count(*)
ORDER BY count(*) DESC LIMIT 10;

colleague.name       | count(*)
"Carrie-Anne Moss"   | 4
"Keanu Reeves"       | 4
"Laurence Fishburne" | 4
3 rows

Starting with paths, a path is a sequence of nodes and relationships from a start node to an end node.
We know that Trinity loves Neo, but how many paths exist between the two actors? We’ll limit the path
length of the pattern as it exhaustively searches the graph otherwise. This is done by using *0..5 in the
pattern relationship.
MATCH p =(:Actor { name: "Keanu Reeves" })-[:ACTS_IN*0..5]-(:Actor { name: "Carrie-Anne Moss" })
RETURN p, length(p)
LIMIT 10;

p                                                                                                                                                                                                                                                                                                                 | length(p)
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[0]{role:"Neo"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[6]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}]                                                                                                                                            | 2
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}, :ACTS_IN[4]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[3]{role:"Morpheus"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[6]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}, :ACTS_IN[5]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[3]{role:"Morpheus"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[6]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}, :ACTS_IN[7]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}]                                                                                                                                   | 2
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[0]{role:"Neo"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[3]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[4]{role:"Morpheus"}, Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}, :ACTS_IN[7]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}, :ACTS_IN[5]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[4]{role:"Morpheus"}, Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}, :ACTS_IN[7]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[2]{role:"Neo"}, Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}, :ACTS_IN[8]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}]                                                                                                                                | 2
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[0]{role:"Neo"}, Node[0]{title:"The Matrix", year:"1999-03-31"}, :ACTS_IN[3]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[5]{role:"Morpheus"}, Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}, :ACTS_IN[8]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
[Node[3]{name:"Keanu Reeves"}, :ACTS_IN[1]{role:"Neo"}, Node[1]{year:"2003-05-07", title:"The Matrix Reloaded"}, :ACTS_IN[4]{role:"Morpheus"}, Node[4]{name:"Laurence Fishburne"}, :ACTS_IN[5]{role:"Morpheus"}, Node[2]{year:"2003-10-27", title:"The Matrix Revolutions"}, :ACTS_IN[8]{role:"Trinity"}, Node[5]{name:"Carrie-Anne Moss"}] | 4
9 rows

But that’s a lot of data; we just want to look at the names and titles of the nodes in each path.
MATCH p =(:Actor { name: "Keanu Reeves" })-[:ACTS_IN*0..5]-(:Actor { name: "Carrie-Anne Moss" })
RETURN extract(n IN nodes(p)| coalesce(n.title,n.name)) AS `names AND titles`, length(p)
ORDER BY length(p)
LIMIT 10;

names and titles                                                                                            | length(p)
["Keanu Reeves", "The Matrix", "Carrie-Anne Moss"]                                                          | 2
["Keanu Reeves", "The Matrix Reloaded", "Carrie-Anne Moss"]                                                 | 2
["Keanu Reeves", "The Matrix Revolutions", "Carrie-Anne Moss"]                                              | 2
["Keanu Reeves", "The Matrix Reloaded", "Laurence Fishburne", "The Matrix", "Carrie-Anne Moss"]             | 4
["Keanu Reeves", "The Matrix Revolutions", "Laurence Fishburne", "The Matrix", "Carrie-Anne Moss"]          | 4
["Keanu Reeves", "The Matrix", "Laurence Fishburne", "The Matrix Reloaded", "Carrie-Anne Moss"]             | 4
["Keanu Reeves", "The Matrix Revolutions", "Laurence Fishburne", "The Matrix Reloaded", "Carrie-Anne Moss"] | 4
["Keanu Reeves", "The Matrix", "Laurence Fishburne", "The Matrix Revolutions", "Carrie-Anne Moss"]          | 4
["Keanu Reeves", "The Matrix Reloaded", "Laurence Fishburne", "The Matrix Revolutions", "Carrie-Anne Moss"] | 4
9 rows


5.4. Linked Lists
A powerful feature of a graph database is that you can create your own in-graph data structures, for
example a linked list.
This data structure uses a single node as the list reference. The reference has an outgoing relationship
to the head of the list, and an incoming relationship from the last element of the list. If the list is empty,
the reference will point to itself.
To make it clear what happens, we will show how the graph looks after each query.
To initialize an empty linked list, we simply create a node, and make it link to itself. Unlike the actual list
elements, it doesn’t have a value property.
CREATE (root { name: 'ROOT' })-[:LINK]->(root)
RETURN root

[Graph figure: the ROOT node with a LINK relationship back to itself.]

Adding values is done by finding the relationship where the new value should be placed, and replacing
it with a new node and two relationships to it. We also have to handle the fact that the before and after
nodes could be the same as the root node. The case where before, after and the root node are all the
same makes it necessary to use CREATE UNIQUE, so as not to create two new value nodes by mistake.
MATCH (root)-[:LINK*0..]->(before),(after)-[:LINK*0..]->(root),(before)-[old:LINK]->(after)
WHERE root.name = 'ROOT' AND (before.value < 25 OR before = root) AND (25 < after.value OR after = root)
CREATE UNIQUE (before)-[:LINK]->({ value:25 })-[:LINK]->(after)
DELETE old

[Graph figure: ROOT links to the node with value 25, which links back to ROOT.]

Let’s add one more value:
MATCH (root)-[:LINK*0..]->(before),(after)-[:LINK*0..]->(root),(before)-[old:LINK]->(after)
WHERE root.name = 'ROOT' AND (before.value < 10 OR before = root) AND (10 < after.value OR after = root)
CREATE UNIQUE (before)-[:LINK]->({ value:10 })-[:LINK]->(after)
DELETE old


[Graph figure: ROOT links to value 10, which links to value 25, which links back to ROOT.]

Deleting a value, conversely, is done by finding the node with the value, and the two relationships going
in and out from it, and replacing the relationships with a new one.
MATCH (root)-[:LINK*0..]->(before),(before)-[delBefore:LINK]->(del)-[delAfter:LINK]->(after),
(after)-[:LINK*0..]->(root)
WHERE root.name = 'ROOT' AND del.value = 10
CREATE UNIQUE (before)-[:LINK]->(after)
DELETE del, delBefore, delAfter

[Graph figure: ROOT links to value 25, which links back to ROOT.]

Deleting the last value node is what requires us to use CREATE UNIQUE when replacing the relationships.
Otherwise, we would end up with two relationships from the root node to itself, as both before and
after nodes are equal to the root node, meaning the pattern would match twice.
MATCH (root)-[:LINK*0..]->(before),(before)-[delBefore:LINK]->(del)-[delAfter:LINK]->(after),
(after)-[:LINK*0..]->(root)
WHERE root.name = 'ROOT' AND del.value = 25
CREATE UNIQUE (before)-[:LINK]->(after)
DELETE del, delBefore, delAfter

[Graph figure: the empty list again, the ROOT node with a LINK relationship back to itself.]

5.5. TV Shows
This example shows how TV Shows with Seasons, Episodes, Characters, Actors, Users and Reviews can
be modeled in a graph database.

Data Model
Let’s start out with an entity-relationship model of the domain at hand:

[Entity-relationship diagram: TV Show has Season; Season has Episode; Episode has Review and
featured Character; User wrote Review; Actor played Character.]

To implement this in Neo4j we’ll use the following relationship types:
Relationship Type  | Description
HAS_SEASON         | Connects a show with its seasons.
HAS_EPISODE        | Connects a season with its episodes.
FEATURED_CHARACTER | Connects an episode with its characters.
PLAYED_CHARACTER   | Connects actors with characters. Note that an actor can play multiple characters in an episode, and that the same character can be played by multiple actors as well.
HAS_REVIEW         | Connects an episode with its reviews.
WROTE_REVIEW       | Connects users with reviews they contributed.

Sample Data
Let’s create some data and see how the domain plays out in practice:
CREATE (himym:TVShow { name: "How I Met Your Mother" })
CREATE (himym_s1:Season { name: "HIMYM Season 1" })
CREATE (himym_s1_e1:Episode { name: "Pilot" })
CREATE (ted:Character { name: "Ted Mosby" })
CREATE (joshRadnor:Actor { name: "Josh Radnor" })
CREATE UNIQUE (joshRadnor)-[:PLAYED_CHARACTER]->(ted)
CREATE UNIQUE (himym)-[:HAS_SEASON]->(himym_s1)
CREATE UNIQUE (himym_s1)-[:HAS_EPISODE]->(himym_s1_e1)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(ted)
CREATE (himym_s1_e1_review1 { title: "Meet Me At The Bar In 15 Minutes & Suit Up",
content: "It was awesome" })
CREATE (wakenPayne:User { name: "WakenPayne" })
CREATE (wakenPayne)-[:WROTE_REVIEW]->(himym_s1_e1_review1)<-[:HAS_REVIEW]-(himym_s1_e1)

This is how the data looks in the database:
[Graph figure: the TVShow 'How I Met Your Mother' HAS_SEASON 'HIMYM Season 1', which HAS_EPISODE
'Pilot'. The episode FEATURED_CHARACTER 'Ted Mosby', whom the Actor 'Josh Radnor' PLAYED_CHARACTER,
and HAS_REVIEW the review 'Meet Me At The Bar In 15 Minutes & Suit Up' ('It was awesome'), which the
User 'WakenPayne' WROTE_REVIEW.]

Note that even though we could have modeled the reviews as relationships with title and content
properties on them, we made them nodes instead. We gain a lot of flexibility this way, for example if
we want to connect comments to each review, as sketched below.
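As a sketch of that flexibility, a comment could be attached to a review like this (the HAS_COMMENT relationship type and the comment node are hypothetical, not part of the example model):

MATCH (review { title: "Meet Me At The Bar In 15 Minutes & Suit Up" })
CREATE (comment { content: "Couldn't agree more!" })
CREATE (review)-[:HAS_COMMENT]->(comment)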
Now let’s add more data:
MATCH (himym:TVShow { name: "How I Met Your Mother" }),(himym_s1:Season),
(himym_s1_e1:Episode { name: "Pilot" }),
(himym)-[:HAS_SEASON]->(himym_s1)-[:HAS_EPISODE]->(himym_s1_e1)
CREATE (marshall:Character { name: "Marshall Eriksen" })
CREATE (robin:Character { name: "Robin Scherbatsky" })
CREATE (barney:Character { name: "Barney Stinson" })
CREATE (lily:Character { name: "Lily Aldrin" })
CREATE (jasonSegel:Actor { name: "Jason Segel" })
CREATE (cobieSmulders:Actor { name: "Cobie Smulders" })
CREATE (neilPatrickHarris:Actor { name: "Neil Patrick Harris" })
CREATE (alysonHannigan:Actor { name: "Alyson Hannigan" })
CREATE UNIQUE (jasonSegel)-[:PLAYED_CHARACTER]->(marshall)
CREATE UNIQUE (cobieSmulders)-[:PLAYED_CHARACTER]->(robin)
CREATE UNIQUE (neilPatrickHarris)-[:PLAYED_CHARACTER]->(barney)
CREATE UNIQUE (alysonHannigan)-[:PLAYED_CHARACTER]->(lily)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(marshall)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(robin)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(barney)
CREATE UNIQUE (himym_s1_e1)-[:FEATURED_CHARACTER]->(lily)
CREATE (himym_s1_e1_review2 { title: "What a great pilot for a show :)",
content: "The humour is great." })
CREATE (atlasredux:User { name: "atlasredux" })
CREATE (atlasredux)-[:WROTE_REVIEW]->(himym_s1_e1_review2)<-[:HAS_REVIEW]-(himym_s1_e1)

Information for a show
For a particular TV show, show all its seasons, episodes, reviews and cast members; that is, all of the
information connected to that TV show.
MATCH (tvShow:TVShow)-[:HAS_SEASON]->(season)-[:HAS_EPISODE]->(episode)
WHERE tvShow.name = "How I Met Your Mother"
RETURN season.name, episode.name

season.name      | episode.name
"HIMYM Season 1" | "Pilot"
1 row

We could also grab the reviews if there are any by slightly tweaking the query:
MATCH (tvShow:TVShow)-[:HAS_SEASON]->(season)-[:HAS_EPISODE]->(episode)
WHERE tvShow.name = "How I Met Your Mother"
WITH season, episode
OPTIONAL MATCH (episode)-[:HAS_REVIEW]->(review)
RETURN season.name, episode.name, review

season.name      | episode.name | review
"HIMYM Season 1" | "Pilot"      | Node[15]{title:"What a great pilot for a show :)", content:"The humour is great."}
"HIMYM Season 1" | "Pilot"      | Node[5]{title:"Meet Me At The Bar In 15 Minutes & Suit Up", content:"It was awesome"}
2 rows

Now let’s list the characters featured in a show. Note that in this query we only put identifiers on the
nodes we actually use later on. The other nodes of the path pattern are designated by ().
MATCH (tvShow:TVShow)-[:HAS_SEASON]->()-[:HAS_EPISODE]->()-[:FEATURED_CHARACTER]->(character)
WHERE tvShow.name = "How I Met Your Mother"
RETURN DISTINCT character.name

character.name
"Lily Aldrin"
"Barney Stinson"
"Robin Scherbatsky"
"Marshall Eriksen"
"Ted Mosby"
5 rows

Now let’s look at how to get all cast members of a show.
MATCH
(tvShow:TVShow)-[:HAS_SEASON]->()-[:HAS_EPISODE]->(episode)-[:FEATURED_CHARACTER]->()<-[:PLAYED_CHARACTER]-(actor)
WHERE tvShow.name = "How I Met Your Mother"
RETURN DISTINCT actor.name

actor.name
"Alyson Hannigan"
"Neil Patrick Harris"
"Cobie Smulders"
"Jason Segel"
"Josh Radnor"
5 rows

Information for an actor
First let’s add another TV show that Josh Radnor appeared in:
CREATE (er:TVShow { name: "ER" })
CREATE (er_s7:Season { name: "ER S7" })
CREATE (er_s7_e17:Episode { name: "Peter's Progress" })
CREATE (tedMosby:Character { name: "The Advocate " })
CREATE UNIQUE (er)-[:HAS_SEASON]->(er_s7)
CREATE UNIQUE (er_s7)-[:HAS_EPISODE]->(er_s7_e17)
WITH er_s7_e17
MATCH (actor:Actor),(episode:Episode)
WHERE actor.name = "Josh Radnor" AND episode.name = "Peter's Progress"
WITH actor, episode
CREATE (keith:Character { name: "Keith" })
CREATE UNIQUE (actor)-[:PLAYED_CHARACTER]->(keith)
CREATE UNIQUE (episode)-[:FEATURED_CHARACTER]->(keith)

And now we’ll create a query to find the episodes that he has appeared in:
MATCH (actor:Actor)-[:PLAYED_CHARACTER]->(character)<-[:FEATURED_CHARACTER]-(episode)
WHERE actor.name = "Josh Radnor"
RETURN episode.name AS Episode, character.name AS Character

Episode            | Character
"Peter's Progress" | "Keith"
"Pilot"            | "Ted Mosby"
2 rows

Now let’s go for a similar query, but add the season and show to it as well.
MATCH (actor:Actor)-[:PLAYED_CHARACTER]->(character)<-[:FEATURED_CHARACTER]-(episode),
(episode)<-[:HAS_EPISODE]-(season)<-[:HAS_SEASON]-(tvshow)
WHERE actor.name = "Josh Radnor"
RETURN tvshow.name AS Show, season.name AS Season, episode.name AS Episode,
character.name AS Character

Show                    | Season           | Episode            | Character
"ER"                    | "ER S7"          | "Peter's Progress" | "Keith"
"How I Met Your Mother" | "HIMYM Season 1" | "Pilot"            | "Ted Mosby"
2 rows


Chapter 6. Advanced Data Modeling Examples
The following chapters contain simplified examples of how different domains can be modeled
using Neo4j. The aim is not to give full examples, but to suggest possible ways to think using nodes,
relationships, graph patterns and data locality in traversals.
The examples make heavy use of Cypher queries; read Part III, “Cypher Query Language” [102] for more
information.


6.1. ACL structures in graphs
This example gives a generic overview of an approach to handling Access Control Lists (ACLs) in graphs,
and a simplified example with concrete queries.

Generic approach
In many scenarios, an application needs to handle security on some form of managed objects. This
example describes one pattern to handle this through the use of a graph structure and traversers
that build a full permissions-structure for any managed object with exclude and include overriding
possibilities. This results in a dynamic construction of ACLs based on the position and context of the
managed object.
The result is a complex security scheme that can easily be implemented in a graph structure,
supporting permissions overriding, principal and content composition, without duplicating data
anywhere.

Technique
As seen in the example graph layout, there are some key concepts in this domain model:
• The managed content (folders and files) that are connected by HAS_CHILD_CONTENT relationships
• The Principal subtree, containing the principals that can act as ACL members, referenced by
PRINCIPAL relationships.
• The aggregation of principals into groups, connected by the IS_MEMBER_OF relationship. One principal
(user or group) can be part of many groups at the same time.
• The SECURITY relationships, connecting the content composite structure to the principal composite
structure, each containing an addition/removal modifier property (e.g. "+RW").

Constructing the ACL
The calculation of the effective permissions (e.g. Read, Write, Execute) for a principal for any given ACL-managed node (content) follows a number of rules that will be encoded into the permissions-traversal:

Top-down-Traversal
This approach will let you define a generic permission pattern on the root content, and then refine that
for specific sub-content nodes and specific principals.

1. Start at the content node in question and traverse upwards to the content root node to determine the
path to it.
2. Start with an effective optimistic permissions list of "all permitted" (111 in a bit-encoded
ReadWriteExecute case), or 000 if you prefer pessimistic security handling (everything is forbidden unless
explicitly allowed).
3. Beginning from the topmost content node, look for any SECURITY relationships on it.
4. If found, check whether the principal in question is part of the end-principal of the SECURITY relationship.
5. If yes, add the "+" permission modifiers to the existing permission pattern, and revoke the "-" permission
modifiers from the pattern.
6. If two principal nodes link to the same content node, first apply the more generic principal's modifiers.
7. Repeat the security modifier search all the way down to the target content node, thus overriding
more generic permissions with those set on nodes closer to the target node.
The same algorithm is applicable for the bottom-up approach, basically just traversing from the target
content node upwards and applying the security modifiers dynamically as the traverser goes up.
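As a rough Cypher sketch of collecting these SECURITY modifiers in root-first order (the relationship
directions and the mask property name are assumptions made purely for illustration; applying the
"+"/"-" modifiers to the permission pattern would then happen in application code):

// Hypothetical sketch: walk from the content root down to the target node
// and collect the SECURITY modifiers that apply to 'user 1' on each level.
MATCH path=(root { name: 'Root folder' })-[:HAS_CHILD_CONTENT*0..]->(target { name: 'My File.pdf' })
UNWIND nodes(path) AS content
MATCH (content)-[sec:SECURITY]->(principal)
WHERE ({ name: 'user 1' })-[:IS_MEMBER_OF*0..]->(principal)
RETURN content.name AS content, principal.name AS principal, sec.mask AS modifier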

Example
Now, getting the resulting access rights for e.g. "user 1" on "My File.pdf" with a top-down approach on
the model in the graph above would go like this:
1. Having traversed upward to find the path, we start with "Root folder" and set the permissions to 11
initially (only considering Read and Write).
2. There are two SECURITY relationships to that folder. User 1 is contained in both of them, but "root" is
more generic, so apply it first, then "All principals": +W +R → 11.
3. "Home" has no SECURITY instructions, so continue.
4. "user1 Home" has SECURITY. First apply "Regular Users" (-R -W) → 00, then "user 1" (+R +W) → 11.
5. The target node "My File.pdf" has no SECURITY modifiers on it, so the effective permissions for "user
1" on "My File.pdf" are ReadWrite → 11.

Read-permission example
In this example, we are going to examine a tree structure of directories and files. Also, there are users
that own files, and roles that can be assigned to users. Roles can have permissions on directory or file
structures (here we model only canRead, as opposed to full rwx Unix permissions) and can be nested. A more
thorough example of modeling ACL structures can be found at How to Build Role-Based Access Control
in SQL¹.

1 http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/


[Graph: a role hierarchy in which Root has Role, and Role has subRole relationships to User and
SUDOers; member relationships connect User to User1 and User2, and SUDOers to Admin1 and Admin2.
Root also has FileRoot, the root of a directory tree: FileRoot contains Home and etc, Home contains
HomeU1 and HomeU2, HomeU2 contains Desktop, and etc contains init.d. leaf relationships attach File1
to HomeU1 and File2 to Desktop, owns relationships connect User1 to File1 and User2 to File2, and a
canRead relationship connects SUDOers to FileRoot.]

Find all files in the directory structure
In order to find all files contained in this structure, we need a variable length path query that follows all
contains relationships and retrieves the nodes at the other end of the leaf relationships.
MATCH ({ name: 'FileRoot' })-[:contains*0..]->(parentDir)-[:leaf]->(file)
RETURN file

resulting in:
file
Node[10]{name:"File1"}
Node[9]{name:"File2"}
2 rows

What files are owned by whom?
If we introduce the concept of ownership on files, we then can ask for the owners of the files we
find — connected via owns relationships to file nodes.
MATCH ({ name: 'FileRoot' })-[:contains*0..]->()-[:leaf]->(file)<-[:owns]-(user)
RETURN file, user

Returning the owners of all files below the FileRoot node.
file                   | user
Node[10]{name:"File1"} | Node[7]{name:"User1"}
Node[9]{name:"File2"}  | Node[6]{name:"User2"}
2 rows

Who has access to a File?
If we now want to check what users have read access to all files, we define our ACL as:
• The root directory has no access granted.
• Any user having a role that has been granted canRead access to one of the parent folders of a file has
read access.
In order to find users that can read any part of the parent folder hierarchy above the files, Cypher
provides optional variable length paths.
MATCH (file)<-[:leaf]-()<-[:contains*0..]-(dir)
OPTIONAL MATCH (dir)<-[:canRead]-(role)-[:member]->(readUser)
WHERE file.name =~ 'File.*'
RETURN file.name, dir.name, role.name, readUser.name

This returns, for each file, every directory in its parent hierarchy, along with any role that has the canRead
permission on that directory and the users holding that role.
file.name | dir.name   | role.name | readUser.name
"File2"   | "Desktop"  | <null>    | <null>
"File2"   | "HomeU2"   | <null>    | <null>
"File2"   | "Home"     | <null>    | <null>
"File2"   | "FileRoot" | "SUDOers" | "Admin2"
"File2"   | "FileRoot" | "SUDOers" | "Admin1"
"File1"   | "HomeU1"   | <null>    | <null>
"File1"   | "Home"     | <null>    | <null>
"File1"   | "FileRoot" | "SUDOers" | "Admin2"
"File1"   | "FileRoot" | "SUDOers" | "Admin1"
9 rows

The results listed above contain <null> for the optional path segments; this can be mitigated either by
asking several queries or by returning only the values that are actually needed.
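For instance, dropping the OPTIONAL keyword and merging the two patterns into one MATCH makes the
permission pattern mandatory, so only the rows where a canRead role actually exists are returned:

MATCH (file)<-[:leaf]-()<-[:contains*0..]-(dir)<-[:canRead]-(role)-[:member]->(readUser)
WHERE file.name =~ 'File.*'
RETURN file.name, dir.name, role.name, readUser.name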


6.2. Hyperedges
Imagine a user who is a member of different groups, and who additionally can have different roles in
different groups. The association of a User, a Group and a Role can be referred to as a hyperedge.
It can, however, easily be modeled in a property graph as a node that captures this n-ary relationship,
as depicted below in the U1G2R1 node.
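As a minimal sketch, the data behind this figure could be created like so (plain nodes without labels are
used here to mirror the property-only style of the figure; the statement is illustrative, not part of the
original example):

// Hypothetical setup for the hyperedge example below.
CREATE (user { name: 'User1' }),(group1 { name: 'Group1' }),(group2 { name: 'Group2' }),
       (role1 { name: 'Role1' }),(role2 { name: 'Role2' }),
       (he1 { name: 'U1G2R1' }),(he2 { name: 'U1G1R2' }),
       (user)-[:hasRoleInGroup]->(he1),(he1)-[:hasGroup]->(group2),(he1)-[:hasRole]->(role1),
       (user)-[:hasRoleInGroup]->(he2),(he2)-[:hasGroup]->(group1),(he2)-[:hasRole]->(role2)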
Figure 6.1. Graph
[Graph: User1 has hasRoleInGroup relationships to the hyperedge nodes U1G2R1 and U1G1R2. U1G2R1
has hasGroup to Group2 and hasRole to Role1; U1G1R2 has hasGroup to Group1 and hasRole to Role2.
Group1 and Group2 are isA Group, Role1 and Role2 are isA Role, and canHave relationships connect the
groups to the roles they can have.]

Find Groups
To find out what roles a user has in a particular group (here Group2), the following query traverses
the hyperedge node and provides the answer.
Query
MATCH ({ name: 'User1' })-[:hasRoleInGroup]->(hyperEdge)-[:hasGroup]->({ name: 'Group2' }),
(hyperEdge)-[:hasRole]->(role)
RETURN role.name

The role of User1 is returned:
Result
role.name
"Role1"
1 row


Find all groups and roles for a user
Here, find all groups and the roles a user has, sorted by the name of the role.
Query
MATCH ({ name: 'User1' })-[:hasRoleInGroup]->(hyperEdge)-[:hasGroup]->(group),
(hyperEdge)-[:hasRole]->(role)
RETURN role.name, group.name
ORDER BY role.name ASC

The groups and roles of User1 are returned:
Result
role.name | group.name
"Role1"   | "Group2"
"Role2"   | "Group1"
2 rows

Find common groups based on shared roles
Assume a more complicated graph:
1. Two user nodes User1, User2.
2. User1 is in Group1, Group2, Group3.
3. User1 has Role1, Role2 in Group1; Role2, Role3 in Group2; Role3, Role4 in Group3 (hyperedges).
4. User2 is in Group1, Group2, Group3.
5. User2 has Role2, Role5 in Group1; Role3, Role4 in Group2; Role5, Role6 in Group3 (hyperedges).

The graph for this looks like the following (nodes like U1G2R23 representing the HyperEdges):
Figure 6.2. Graph
[Graph: User1 connected via hasRoleInGroup to the hyperedge nodes U1G1R12, U1G2R23 and U1G3R34,
and User2 via hasRoleInGroup to U2G1R25, U2G2R34 and U2G3R56; each hyperedge node has a hasGroup
relationship to its group (Group1, Group2 or Group3) and hasRole relationships to its roles (Role1
through Role6).]

To return Group1 and Group2, in which User1 and User2 share at least one common role, the
query looks like this:
Query
MATCH (u1)-[:hasRoleInGroup]->(hyperEdge1)-[:hasGroup]->(group),(hyperEdge1)-[:hasRole]->(role),
(u2)-[:hasRoleInGroup]->(hyperEdge2)-[:hasGroup]->(group),(hyperEdge2)-[:hasRole]->(role)
WHERE u1.name = 'User1' AND u2.name = 'User2'
RETURN group.name, count(role)
ORDER BY group.name ASC

The groups where User1 and User2 share at least one common role:
Result
group.name | count(role)
"Group1"   | 1
"Group2"   | 1
2 rows


6.3. Basic friend finding based on social neighborhood
Imagine an example graph like the following one:
Figure 6.3. Graph
[Graph: the persons Joe, Sara, Bill, Derrick, Jill and Ian connected by knows relationships; Joe knows
Sara and Bill, and the friends of those friends include Derrick, Jill and Ian.]

To find out the friends of Joe’s friends that are not already his friends, the query looks like this:
Query
MATCH (joe { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)
WHERE NOT (joe)-[:knows]-(friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC, friend_of_friend.name

This returns a list of friends-of-friends ordered by the number of connections to them, and secondly by
their name.
Result
friend_of_friend.name | COUNT(*)
"Ian"                 | 2
"Derrick"             | 1
"Jill"                | 1
3 rows


6.4. Co-favorited places
Figure 6.4. Graph
[Graph: Joe and Jill with favorite relationships to places (SaunaX, CoffeeShop1, CoffeShop2 and
MelsPlace); the places CoffeeShop1, CoffeeShop2, CoffeeShop3 and MelsPlace are tagged with the Cool
and Cosy tags.]

Co-favorited places — users who like x also like y
Find the other places liked by the people who favorite this place:
• Determine who has favorited place x.
• What else have they favorited that is not place x?
Query
MATCH (place)<-[:favorite]-(person)-[:favorite]->(stuff)
WHERE place.name = 'CoffeeShop1'
RETURN stuff.name, count(*)
ORDER BY count(*) DESC, stuff.name

The list of places that are favorited by people that favorited the start place.
Result
stuff.name   | count(*)
"MelsPlace"  | 2
"CoffeShop2" | 1
"SaunaX"     | 1
3 rows

Co-Tagged places — places related through tags
Find places that are tagged with the same tags:
• Determine the tags for place x.
• What else is tagged the same as x that is not x.
Query
MATCH (place)-[:tagged]->(tag)<-[:tagged]-(otherPlace)
WHERE place.name = 'CoffeeShop1'
RETURN otherPlace.name, collect(tag.name)
ORDER BY length(collect(tag.name)) DESC, otherPlace.name

This query returns other places than CoffeeShop1 which share the same tags; they are ranked by the
number of tags.
Result
otherPlace.name | collect(tag.name)
"MelsPlace"     | ["Cosy", "Cool"]
"CoffeeShop2"   | ["Cool"]
"CoffeeShop3"   | ["Cosy"]
3 rows


6.5. Find people based on similar favorites
Figure 6.5. Graph
[Graph: Joe is a friend of Sara; Joe, Sara, Derrick and Jill have favorite relationships to the things Cats
and Bikes.]

To find possible new friends based on their liking the same things as the asking person, use a
query like this:
Query
MATCH (me { name: 'Joe' })-[:favorite]->(stuff)<-[:favorite]-(person)
WHERE NOT (me)-[:friend]-(person)
RETURN person.name, count(stuff)
ORDER BY count(stuff) DESC

The query returns the list of people who are not yet friends, ranked by how many of the same things they like.
Result
person.name | count(stuff)
"Derrick"   | 2
"Jill"      | 1
2 rows


6.6. Find people based on mutual friends and groups
Figure 6.6. Graph
[Graph: Joe and Jill both know Bill; Joe, Jill, Bob and Bill all have member_of_group relationships to
Group1.]

In this scenario, the problem is to determine mutual friends and groups, if any, between persons. If no
mutual groups or friends are found, there should be a 0 returned.
Query
MATCH (me { name: 'Joe' }),(other)
WHERE other.name IN ['Jill', 'Bob']
OPTIONAL MATCH pGroups=(me)-[:member_of_group]->(mg)<-[:member_of_group]-(other)
OPTIONAL MATCH pMutualFriends=(me)-[:knows]->(mf)<-[:knows]-(other)
RETURN other.name AS name, count(DISTINCT pGroups) AS mutualGroups,
count(DISTINCT pMutualFriends) AS mutualFriends
ORDER BY mutualFriends DESC

The question we are asking is: how many unique paths are there between me and Jill, the paths being
common group memberships and common friends? If the paths are mandatory, no results will be
returned if me and Bob lack any common friends, and we don't want that. To make a path optional, you
have to make at least one of its relationships optional, which makes the whole path optional.
Result
name   | mutualGroups | mutualFriends
"Jill" | 1            | 1
"Bob"  | 1            | 0
2 rows


6.7. Find friends based on similar tagging
Figure 6.7. Graph
[Graph: Joe, Sara and Derrick with favorite relationships to the items Cats, Horses, Surfing and Bikes;
the items are tagged with the Animals and Hobby tags.]

To find people similar to me based on the taggings of their favorited items, one approach could be:
• Determine the tags associated with what I favorite.
• What else is tagged with those tags?
• Who favorites items tagged with the same tags?
• Sort the result by how many of the same things these people like.

Query
MATCH
(me)-[:favorite]->(myFavorites)-[:tagged]->(tag)<-[:tagged]-(theirFavorites)<-[:favorite]-(people)
WHERE me.name = 'Joe' AND NOT me=people
RETURN people.name AS name, count(*) AS similar_favs
ORDER BY similar_favs DESC

The query returns the list of possible friends, ranked by how many favorited items they have in common.
Result
name      | similar_favs
"Sara"    | 2
"Derrick" | 1
2 rows


6.8. Multirelational (social) graphs
Figure 6.8. Graph
[Graph: Ben, Sara, Maria and Joe connected to each other by FOLLOWS and LOVES relationships, and by
LIKES relationships to the things cars, cats, bikes and nature.]

This example shows a multi-relational network between persons and things they like. A multi-relational
graph is a graph with more than one kind of relationship between nodes.
Query
MATCH (me { name: 'Joe' })-[r1:FOLLOWS|:LOVES]->(other)-[r2]->(me)
WHERE type(r1)=type(r2)
RETURN other.name, type(r1)

The query returns people who FOLLOW or LOVE Joe back.
Result
other.name | type(r1)
"Maria"    | "FOLLOWS"
"Maria"    | "LOVES"
"Sara"     | "FOLLOWS"
3 rows


6.9. Implementing newsfeeds in a graph
[Graph: Bob has CONFIRMED FRIEND relationships to Alice and Joe, and Alice has a PENDING FRIEND
relationship to Joe. Each person has a STATUS relationship to a status node, and the status nodes
(alice_s1/alice_s2, bob_s1/bob_s2, joe_s1/joe_s2, each with date, name and text properties) are chained
by NEXT relationships.]

Implementing a newsfeed or timeline feature is a frequent requirement for social applications. The
following examples are inspired by Newsfeed feature powered by Neo4j Graph Database². The query
asked here is:
Starting at me, retrieve the time-ordered status feed of the status updates of me and all friends that
are connected via a CONFIRMED FRIEND relationship to me.

2 https://web.archive.org/web/20121102191919/http://techfin.in/2012/10/newsfeed-feature-powered-by-neo4j-graph-database/
Query
MATCH (me { name: 'Joe' })-[rels:FRIEND*0..1]-(myfriend)
WHERE ALL (r IN rels WHERE r.status = 'CONFIRMED')
WITH myfriend
MATCH (myfriend)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)
RETURN myfriend.name AS name, statusupdates.date AS date, statusupdates.text AS text
ORDER BY statusupdates.date DESC LIMIT 3

To understand the strategy, let’s divide the query into five steps:
1. First, get the list of all my friends (along with me) through the FRIEND relationship (MATCH (me {name:
'Joe'})-[rels:FRIEND*0..1]-(myfriend)). The WHERE predicate checks whether the friend request is
pending or confirmed.

76

Advanced Data Modeling Examples
2. Get the latest status update of my friends through the STATUS relationship (MATCH (myfriend)-[:STATUS]-(latestupdate)).
3. Get subsequent status updates (along with the latest one) of my friends through NEXT relationships
(MATCH (myfriend)-[:STATUS]-(latestupdate)-[:NEXT*0..1]-(statusupdates)), which will give you the
latest and one additional status update; adjust 0..1 to whatever suits your case.
4. Sort the status updates by posted date (ORDER BY statusupdates.date DESC).
5. LIMIT the number of updates you need in every query (LIMIT 3).
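For example, to retrieve up to three status updates per friend instead of two, only the upper bound of
the NEXT hop in step 3 changes:

MATCH (me { name: 'Joe' })-[rels:FRIEND*0..1]-(myfriend)
WHERE ALL (r IN rels WHERE r.status = 'CONFIRMED')
WITH myfriend
MATCH (myfriend)-[:STATUS]-(latestupdate)-[:NEXT*0..2]-(statusupdates)
RETURN myfriend.name AS name, statusupdates.date AS date, statusupdates.text AS text
ORDER BY statusupdates.date DESC LIMIT 3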
Result
name  | date | text
"Joe" | 6    | "Joe status2"
"Bob" | 4    | "bobs status2"
"Joe" | 3    | "Joe status1"
3 rows

Here, the example shows how to add a new status update into the existing data for a user.
Query
MATCH (me)
WHERE me.name='Bob'
OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)
DELETE r
CREATE (me)-[:STATUS]->(latest_update { text:'Status',date:123 })
WITH latest_update, collect(secondlatestupdate) AS seconds
FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))
RETURN latest_update.text AS new_status

Dividing the query into steps, this query resembles adding a new item at the head of a linked list:
1. Get the latest update (if it exists) of the user through the STATUS relationship (OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)).
2. Delete the STATUS relationship between the user and secondlatestupdate (if it exists), as this would
become the second latest update now and only the latest update would be added through a STATUS
relationship; all earlier updates would be connected to their subsequent updates through a NEXT
relationship (DELETE r).
3. Now, create the new statusupdate node (with text and date as properties) and connect
this with the user through a STATUS relationship (CREATE (me)-[:STATUS]->(latest_update
{ text:'Status',date:123 })).
4. Pipe over statusupdate or an empty collection to the next query part (WITH latest_update,
collect(secondlatestupdate) AS seconds).
5. Now, create a NEXT relationship between the latest status update and the second latest status update
(if it exists) (FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))).
Result
new_status
"Status"
1 row
Nodes created: 1
Relationships created: 2
Properties set: 2
Relationships deleted: 1
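In an application, the user name and the new status values would typically be passed as parameters
rather than literals. A sketch using the {param} parameter syntax of this Cypher version (the parameter
names are illustrative assumptions):

MATCH (me)
WHERE me.name={name}
OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)
DELETE r
CREATE (me)-[:STATUS]->(latest_update { text:{text}, date:{date} })
WITH latest_update, collect(secondlatestupdate) AS seconds
FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))
RETURN latest_update.text AS new_status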


[Graph: Bob with a STATUS relationship to bob_s1 (date = 1), which has a NEXT relationship to bob_s2
(date = 4).]


6.10. Boosting recommendation results
Figure 6.9. Graph
[Graph: Clark Kent, Lois Lane, Jimmy Olsen, Anderson Cooper and Perry White connected by KNOWS
relationships (weight = 4), and by WORKS_AT relationships (weight = 2, each with an activity property) to
CNN and the Daily Planet.]

This query finds recommended friends for the origin: people who work at the same place as the
origin, or who know a person that the origin knows; additionally, the origin should not already know the
target. The recommendation is weighted by the weight of the relationship r2, and boosted with a factor
of 2 if there is an activity property on that relationship.
Query
MATCH (origin)-[r1:KNOWS|WORKS_AT]-(c)-[r2:KNOWS|WORKS_AT]-(candidate)
WHERE origin.name = "Clark Kent" AND type(r1)=type(r2) AND NOT (origin)-[:KNOWS]-(candidate)
RETURN origin.name AS origin, candidate.name AS candidate,
  SUM(ROUND(r2.weight + (COALESCE(r2.activity, 0) * 2))) AS boost
ORDER BY boost DESC LIMIT 10

This returns the recommended friends for the origin nodes and their recommendation score.
Result
origin       | candidate         | boost
"Clark Kent" | "Perry White"     | 22.0
"Clark Kent" | "Anderson Cooper" | 4.0
2 rows


6.11. Calculating the clustering coefficient of a network
Figure 6.10. Graph
[Graph: a start node with KNOWS relationships to its four neighbors; one pair of neighbors is itself
connected by a KNOWS relationship, and further KNOWS relationships extend the 2-hop neighborhood.]

In this example, adapted from Niko Gamulin's blog post on Neo4j for Social Network Analysis³, the
graph in question shows the 2-hop relationships of a sample person as nodes with KNOWS
relationships.
The clustering coefficient⁴ of a selected node is defined as the probability that two randomly selected
neighbors are connected to each other. With the number of neighbors n and the number of mutual
connections between the neighbors r, the calculation is:

    C = r / (n(n-1)/2) = 2r / (n(n-1))

The number of possible connections between the neighbors is n(n-1)/2 = n!/(2!(n-2)!) = 4!/(2!(4-2)!) =
24/4 = 6, where the number of neighbors is n = 4, and the actual number of connections is r = 1.
Therefore the clustering coefficient of the start node is 1/6.
n and r are quite simple to retrieve via the following query:
Query
MATCH (a { name: "startnode" })--(b)
WITH a, count(DISTINCT b) AS n
MATCH (a)--()-[r]-()--(a)
RETURN n, count(DISTINCT r) AS r

This returns n and r for the above calculations.
Result
n | r
4 | 1
1 row
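If desired, the coefficient itself can be computed directly in the same query; a small sketch using the
formula above:

MATCH (a { name: "startnode" })--(b)
WITH a, count(DISTINCT b) AS n
MATCH (a)--()-[r]-()--(a)
WITH n, count(DISTINCT r) AS r
RETURN 2.0*r/(n*(n-1)) AS clusteringCoefficient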

3 http://mypetprojects.blogspot.se/2012/06/social-network-analysis-with-neo4j.html
4 http://en.wikipedia.org/wiki/Clustering_coefficient


6.12. Pretty graphs
This section shows how to create some of the named pretty graphs on Wikipedia⁵.

5 http://en.wikipedia.org/wiki/Gallery_of_named_graphs

Star graph
The graph is created by first creating a center node, and then, once per element in the range, creating a
leaf node and connecting it to the center.
Query
CREATE (center)
FOREACH (x IN range(1,6)| CREATE (leaf),(center)-[:X]->(leaf))
RETURN id(center) AS id;

The query returns the id of the center node.
Result
id
0
1 row
Nodes created: 7
Relationships created: 6

Figure 6.11. Graph

[Graph: a star graph with a center node and six X relationships to leaf nodes.]

Wheel graph
This graph is created in a number of steps:
• Create a center node.
• Once per element in the range, create a leaf and connect it to the center.
• Connect neighboring leafs.
• Find the minimum and maximum leaf and connect these.
• Return the id of the center node.

Query
CREATE (center)
FOREACH (x IN range(1,6)| CREATE (leaf { count:x }),(center)-[:X]->(leaf))
WITH center
MATCH (large_leaf)<--(center)-->(small_leaf)
WHERE large_leaf.count = small_leaf.count + 1
CREATE (small_leaf)-[:X]->(large_leaf)
WITH center, min(small_leaf.count) AS min, max(large_leaf.count) AS max
MATCH (first_leaf)<--(center)-->(last_leaf)
WHERE first_leaf.count = min AND last_leaf.count = max
CREATE (last_leaf)-[:X]->(first_leaf)
RETURN id(center) AS id

The query returns the id of the center node.
Result
id
0
1 row
Nodes created: 7
Relationships created: 12
Properties set: 6

Figure 6.12. Graph
[Graph: a wheel graph: a center node with X relationships to six leaf nodes (count = 1 to 6), which are
also connected to each other in a cycle by X relationships.]

Complete graph
To create this graph, we first create 6 nodes and label them with the Leaf label. We then match all the
unique pairs of nodes, and create a relationship between them.
Query
FOREACH (x IN range(1,6)| CREATE (leaf:Leaf { count : x }))
WITH *
MATCH (leaf1:Leaf),(leaf2:Leaf)
WHERE id(leaf1) < id(leaf2)
CREATE (leaf1)-[:X]->(leaf2);

Nothing is returned by this query.
Result
(empty result)
Nodes created: 6
Relationships created: 15
Properties set: 6
Labels added: 6

82

Advanced Data Modeling Examples
Figure 6.13. Graph
Leaf

Leaf

X

count = 4

X

count = 3

X

Leaf

Leaf

XX

X

X

X

X

X

count = 5

X

X

Leaf

count = 6

X

X

count = 2

X

Leaf
count = 1

Friendship graph
This query first creates a center node and then, once per element in the range, creates a cycle graph
and connects it to the center.
Query
CREATE (center)
FOREACH (x IN range(1,3)| CREATE (leaf1),(leaf2),(center)-[:X]->(leaf1),(center)-[:X]->(leaf2),
(leaf1)-[:X]->(leaf2))
RETURN ID(center) AS id

The id of the center node is returned by the query.
Result
id
0
1 row
Nodes created: 7
Relationships created: 9

Figure 6.14. Graph

[Graph: a friendship graph: a center node connected to three pairs of leaf nodes, each pair forming a
triangle with the center via X relationships.]


6.13. A multilevel indexing structure (path tree)
In this example, a multi-level tree structure is used to index event nodes (here Event1, Event2 and Event3),
in this case with a YEAR-MONTH-DAY granularity, making this a timeline indexing structure. However,
this approach should work for a wide range of multi-level ranges.
The structure follows a couple of rules:
• Events can be indexed multiple times by connecting the indexing structure leafs with the events via a
VALUE relationship.
• The querying is done in a path-range fashion. That is, the start and end paths from the indexing root
to the start and end leafs in the tree are calculated.
• Using Cypher, the queries following different strategies can be expressed as path sections and put
together in one single query.
The graph below depicts a structure with 3 Events being attached to an index structure at different
leafs.
Figure 6.15. Graph
[Graph: an index tree from Root via 2010/2011 relationships to year nodes, via 12/01 to month nodes,
and via 31/01/02/03 to the day leafs Day 31, Day 01, Day 02 and Day 03; the day leafs are chained with
NEXT relationships, and VALUE relationships attach Event1 and Event2 to Day 31, with Event2 and Event3
also attached to January day leafs.]

Return zero range
Here, only the events indexed under one leaf (2010-12-31) are returned. The query only needs one path
segment rootPath (color Green) through the index.

Figure 6.16. Graph
[Graph: the same index tree, with the rootPath Root → 2010 → 12 → 31 and the VALUE relationships from
that leaf highlighted.]

Query
MATCH rootPath=(root)-[:`2010`]->()-[:`12`]->()-[:`31`]->(leaf),(leaf)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

Returning all events on the date 2010-12-31, in this case Event1 and Event2.
Result
event.name
"Event1"
"Event2"
2 rows

Return the full range
In this case, the range goes from the first to the last leaf of the index tree. Here, startPath (color
Greenyellow) and endPath (color Green) span the range, valuePath (color Blue) then connects the
leafs, and the values can be read from the middle node, hanging off the vals (color Red) path.

Figure 6.17. Graph
[Graph: the same index tree, with startPath (Root → 2010 → 12 → 31), endPath (Root → 2011 → 01 → 03),
the NEXT chain between the two leafs and the VALUE relationships highlighted.]

Query
MATCH startPath=(root)-[:`2010`]->()-[:`12`]->()-[:`31`]->(startLeaf),
endPath=(root)-[:`2011`]->()-[:`01`]->()-[:`03`]->(endLeaf),
valuePath=(startLeaf)-[:NEXT*0..]->(middle)-[:NEXT*0..]->(endLeaf),
vals=(middle)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

Returning all events between 2010-12-31 and 2011-01-03, in this case all events.
Result
event.name
"Event1"
"Event2"
"Event2"
"Event3"
4 rows

Return partly shared path ranges
Here, the query range results in partly shared paths when querying the index, making the introduction
of a common path segment commonPath (color Black) necessary before spanning up startPath (color
Greenyellow) and endPath (color Darkgreen). After that, valuePath (color Blue) connects the leafs and the
indexed values are returned off the vals (color Red) path.
Figure 6.18. Graph
[Graph: the same index tree, with the shared commonPath (Root → 2011 → 01), the startPath (→ 01) and
endPath (→ 03) branches, and the connecting NEXT and VALUE relationships highlighted.]

Query
MATCH commonPath=(root)-[:`2011`]->()-[:`01`]->(commonRootEnd),
startPath=(commonRootEnd)-[:`01`]->(startLeaf), endPath=(commonRootEnd)-[:`03`]->(endLeaf),
valuePath=(startLeaf)-[:NEXT*0..]->(middle)-[:NEXT*0..]->(endLeaf),
vals=(middle)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC

Returning all events between 2011-01-01 and 2011-01-03, in this case Event2 and Event3.
Result
event.name
"Event2"
"Event3"
2 rows


6.14. Complex similarity computations
Calculate similarities by complex calculations
Here, a similarity between two players in a game is calculated by the number of times they have eaten
the same food.
Query
MATCH (me { name: 'me' })-[r1:ATE]->(food)<-[r2:ATE]-(you)
WITH me,count(DISTINCT r1) AS H1,count(DISTINCT r2) AS H2,you
MATCH (me)-[r1:ATE]->(food)<-[r2:ATE]-(you)
RETURN sum((1-ABS(r1.times/H1-r2.times/H2))*(r1.times+r2.times)/(H1+H2)) AS similarity

The two players and their similarity measure.
Result
similarity
-30.0
1 row
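To see where the negative value comes from, plug in the example data shown in the graph below (one
shared food, eaten times = 10 by me and times = 5 by you, so H1 = H2 = 1):

(1 - ABS(10/1 - 5/1)) * (10 + 5) / (1 + 1) = (1 - 5) * 15 / 2 = -30.0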

Figure 6.19. Graph
[Graph: me and you, each with an ATE relationship to meat (times = 10 and times = 5 respectively).]

6.15. The Graphity activity stream model
Find Activity Streams in a network without scaling penalty
This is an approach for scaling the retrieval of activity streams in a friend graph, put forward by Rene
Pickhardt as Graphity⁶. In short, a linked list is created for every person's friends, in the order that the last
activities of these friends have occurred. When new activities occur for a friend, all the ordered friend
lists that this friend is part of are reordered, transferring computing load to the time of new event
updates instead of activity stream reads.

Tip

This approach of course makes excessive use of relationship types. This needs to be
taken into consideration when designing a production system with this approach. See
Section 17.5, “Capacity” [284] for the maximum number of relationship types.
To find the activity stream for a person, just follow the linked list of the friend list, and retrieve the
needed amount of activities from the respective activity list of each friend.
Query
MATCH p=(me { name: 'Jane' })-[:jane_knows*]->(friend),(friend)-[:has]->(status)
RETURN me.name, friend.name, status.name, length(p)
ORDER BY length(p)

This returns the activity stream for Jane.
Result
me.name | friend.name | status.name | length(p)
"Jane"  | "Bill"      | "Bill_s1"   | 1
"Jane"  | "Joe"       | "Joe_s1"    | 2
"Jane"  | "Bob"       | "Bob_s1"    | 3
3 rows

6 http://www.rene-pickhardt.de/graphity-an-efficient-graph-model-for-retrieving-the-top-k-news-feeds-for-users-in-social-networks/

Figure 6.20. Graph
[Graph: Jane's friend list as a chain of jane_knows relationships (Jane → Bill → Joe → Bob), with Bob
having his own bob_knows chain (→ Ted); each friend has a has relationship to its status nodes, which
are chained by next relationships (Bill_s1 → Bill_s2, Joe_s1 → Joe_s2, Bob_s1, Ted_s1 → Ted_s2).]

6.16. User roles in graphs
This is an example showing a hierarchy of roles. What’s interesting is that a tree is not sufficient for
storing this kind of structure, as elaborated below.

This is an implementation of an example found in the article A Model to Represent Directed Acyclic
Graphs (DAG) on SQL Databases⁷ by Kemal Erdogan⁸. The article discusses how to store directed
acyclic graphs (DAGs)⁹ in SQL-based databases. DAGs are almost trees, but with a twist: it may be possible to
reach the same node through different paths. Trees are restricted from this possibility, which makes
them much easier to handle. In our case it is “Ali” and “Engin”, as they are both admins and users, and
thus reachable through these group nodes. Reality often looks this way and can't be captured by tree
structures.

7 http://www.codeproject.com/Articles/22824/A-Model-to-Represent-Directed-Acyclic-Graphs-DAG-o
8 http://www.codeproject.com/script/Articles/MemberArticles.aspx?amid=274518
9 http://en.wikipedia.org/wiki/Directed_acyclic_graph
In the article an SQL stored procedure solution is provided. The main idea, which also has some support
from scientists, is to pre-calculate all possible (transitive) paths. Pros and cons of this approach:
• decent performance on read
• low performance on insert
• wastes lots of space
• relies on stored procedures

In Neo4j storing the roles is trivial. In this case we use PART_OF (green edges) relationships to model the
group hierarchy and MEMBER_OF (blue edges) to model membership in groups. We also connect the top
level groups to the reference node by ROOT relationships. This gives us a useful partitioning of the graph.
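As a minimal sketch, a fragment of this structure could be created like so (only part of the full example
data set, which also contains Users, Managers, Technicians and ABCTechnicians; the relationship
directions follow the queries below):

CREATE (ref { name: 'Reference_Node' }),
       (admins { name: 'Admins' })-[:ROOT]->(ref),
       (helpDesk { name: 'HelpDesk' })-[:PART_OF]->(admins),
       (ali { name: 'Ali' })-[:MEMBER_OF]->(admins),
       (demet { name: 'Demet' })-[:MEMBER_OF]->(helpDesk),
       (engin { name: 'Engin' })-[:MEMBER_OF]->(helpDesk)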
Neo4j has no predefined relationship types; you are free to create any relationship types and give them
the semantics you want.
Let's now have a look at how to retrieve information from the graph. The queries are done using
Cypher, and the Java code uses the Neo4j Traversal API (see Section 34.2, “Traversal Framework Java
API” [615], which is part of Part VII, “Advanced Usage” [561]).

Get the admins
In Cypher, we could get the admins like this:

MATCH ({ name: 'Admins' })<-[:PART_OF*0..]-(group)<-[:MEMBER_OF]-(user)
RETURN user.name, group.name

resulting in:
user.name | group.name
"Ali"     | "Admins"
"Demet"   | "HelpDesk"
"Engin"   | "HelpDesk"
3 rows

And here’s the code when using the Java Traversal API:
Node admins = getNodeByName( "Admins" );
TraversalDescription traversalDescription = db.traversalDescription()
        .breadthFirst()
        .evaluator( Evaluators.excludeStartPosition() )
        .relationships( RoleRels.PART_OF, Direction.INCOMING )
        .relationships( RoleRels.MEMBER_OF, Direction.INCOMING );
Traverser traverser = traversalDescription.traverse( admins );

resulting in the output
Found: Ali at depth: 0
Found: HelpDesk at depth: 0
Found: Demet at depth: 1
Found: Engin at depth: 1

The result is collected from the traverser using this code:
String output = "";
for ( Path path : traverser )
{
    Node node = path.endNode();
    output += "Found: " + node.getProperty( NAME ) + " at depth: "
              + ( path.length() - 1 ) + "\n";
}

Get the group memberships of a user
In Cypher:
MATCH ({ name: 'Jale' })-[:MEMBER_OF]->()-[:PART_OF*0..]->(group)
RETURN group.name

group.name
"ABCTechnicians"
"Technicians"
"Users"
3 rows

Using the Neo4j Java Traversal API, this query looks like:
Node jale = getNodeByName( "Jale" );
traversalDescription = db.traversalDescription()
        .depthFirst()
        .evaluator( Evaluators.excludeStartPosition() )
        .relationships( RoleRels.MEMBER_OF, Direction.OUTGOING )
        .relationships( RoleRels.PART_OF, Direction.OUTGOING );
traverser = traversalDescription.traverse( jale );

resulting in:
Found: ABCTechnicians at depth: 0
Found: Technicians at depth: 1
Found: Users at depth: 2

Get all groups
In Cypher:
MATCH ({ name: 'Reference_Node' })<-[:ROOT]-()<-[:PART_OF*0..]-(group)
RETURN group.name

group.name
"Users"
"Managers"
"Technicians"
"ABCTechnicians"
"Admins"
"HelpDesk"
6 rows

In Java:
Node referenceNode = getNodeByName( "Reference_Node" );
traversalDescription = db.traversalDescription()
        .breadthFirst()
        .evaluator( Evaluators.excludeStartPosition() )
        .relationships( RoleRels.ROOT, Direction.INCOMING )
        .relationships( RoleRels.PART_OF, Direction.INCOMING );
traverser = traversalDescription.traverse( referenceNode );

resulting in:
Found: Users at depth: 0
Found: Admins at depth: 0
Found: Technicians at depth: 1
Found: Managers at depth: 1
Found: HelpDesk at depth: 1
Found: ABCTechnicians at depth: 2

Get all members of all groups
Now, let’s try to find all users in the system being part of any group.
In Cypher, this looks like:
MATCH ({ name: 'Reference_Node' })<-[:ROOT]-(root), p=(root)<-[:PART_OF*0..]-()<-[:MEMBER_OF]-(user)
RETURN user.name, min(length(p))
ORDER BY min(length(p)), user.name

and results in the following output:
user.name | min(length(p))
"Ali"     | 1
"Burcu"   | 1
"Can"     | 1
"Engin"   | 1
"Demet"   | 2
"Fuat"    | 2
"Gul"     | 2
"Hakan"   | 2
"Irmak"   | 2
"Jale"    | 3
10 rows

In Java:
traversalDescription = db.traversalDescription()
        .breadthFirst()
        .evaluator(
                Evaluators.includeWhereLastRelationshipTypeIs( RoleRels.MEMBER_OF ) );
traverser = traversalDescription.traverse( referenceNode );

resulting in:
Found: Can at depth: 1
Found: Burcu at depth: 1
Found: Engin at depth: 1
Found: Ali at depth: 1
Found: Irmak at depth: 2
Found: Hakan at depth: 2
Found: Fuat at depth: 2
Found: Gul at depth: 2
Found: Demet at depth: 2
Found: Jale at depth: 3

As seen above, querying even more complex scenarios can be done using comparatively short
constructs in Cypher or Java.


Chapter 7. Languages
Please see http://neo4j.com/developer/language-guides/ for the current set of drivers!
There’s an included Java example which shows a “low-level” approach to using the Neo4j REST API from
Java.


7.1. How to use the REST API from Java
Creating a graph through the REST API from Java
The REST API uses HTTP and JSON, so that it can be used from many languages and platforms. Still,
when getting started it's useful to see some patterns that can be re-used. In this brief overview, we'll
show you how to create and manipulate a simple graph through the REST API and also how to query it.
For these examples, we've chosen the Jersey¹ client² components, which are easily downloaded via
Maven.

Start the server
Before we can perform any actions on the server, we need to start it as per Section 23.2, “Server
Installation” [439]. Next up, we’ll check the connection to the server:
WebResource resource = Client.create()
        .resource( SERVER_ROOT_URI );
ClientResponse response = resource.get( ClientResponse.class );
System.out.println( String.format( "GET on [%s], status code [%d]",
        SERVER_ROOT_URI, response.getStatus() ) );
response.close();

If the status of the response is 200 OK, then we know the server is running fine and we can continue. If
the code fails to connect to the server, then please have a look at Part V, “Operations” [435].

Note

If you get any other response than 200 OK (particularly 4xx or 5xx responses) then please
check your configuration and look in the log files in the data/log directory.

Sending Cypher
Using the REST API, we can send Cypher queries to the server. This is the main way to use Neo4j. It
allows control of the transactional boundaries as needed.
Let’s try to use this to list all the nodes in the database which have a name property.
final String txUri = SERVER_ROOT_URI + "transaction/commit";
WebResource resource = Client.create().resource( txUri );
String payload = "{\"statements\" : [ {\"statement\" : \"" + query + "\"} ]}";
ClientResponse response = resource
        .accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( payload )
        .post( ClientResponse.class );
System.out.println( String.format(
        "POST [%s] to [%s], status code [%d], returned data: "
                + System.lineSeparator() + "%s",
        payload, txUri, response.getStatus(),
        response.getEntity( String.class ) ) );
response.close();
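The query variable above holds an ordinary Cypher statement as a string. For the node listing described,
it could look something like this (a hypothetical example, since the actual string isn't shown here; note
that inside the JSON payload it must be JSON-escaped and kept on one line):

MATCH (n)
WHERE exists(n.name)
RETURN n.name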

For more details, see Section 21.1, “Transactional Cypher HTTP endpoint” [298].

Fine-grained REST API calls
For exploratory and special purposes, there is a fine grained REST API, see Chapter 21, REST API [297].
The following sections highlight some of the basic operations.
1 http://jersey.java.net/
2 https://jersey.java.net/nonav/documentation/1.9/user-guide.html#chapter_deps


Creating a node
The REST API uses POST to create nodes. Encapsulating that in Java is straightforward using the Jersey
client:
final String nodeEntryPointUri = SERVER_ROOT_URI + "node";
// http://localhost:7474/db/data/node
WebResource resource = Client.create()
        .resource( nodeEntryPointUri );
// POST {} to the node entry point URI
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( "{}" )
        .post( ClientResponse.class );
final URI location = response.getLocation();
System.out.println( String.format(
        "POST to [%s], status code [%d], location header [%s]",
        nodeEntryPointUri, response.getStatus(), location.toString() ) );
response.close();
return location;

If the call completes successfully, under the covers it will have sent an HTTP request containing a JSON
payload to the server. The server will then have created a new node in the database and responded
with a 201 Created response and a Location header with the URI of the newly created node.
In our example, we call this functionality twice to create two nodes in our database.

Adding properties
Once we have nodes in our database, we can use them to store useful data. In this case, we're going
to store information about music in our database. Let’s start by looking at the code that we use to
create nodes and add properties. Here we’ve added nodes to represent "Joe Strummer" and a band
called "The Clash".
URI firstNode = createNode();
addProperty( firstNode, "name", "Joe Strummer" );
URI secondNode = createNode();
addProperty( secondNode, "band", "The Clash" );

Inside the addProperty method we determine the resource that represents properties for the node and
decide on a name for that property. We then proceed to PUT the value of that property to the server.
String propertyUri = nodeUri.toString() + "/properties/" + propertyName;
// http://localhost:7474/db/data/node/{node_id}/properties/{property_name}
WebResource resource = Client.create()
        .resource( propertyUri );
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( "\"" + propertyValue + "\"" )
        .put( ClientResponse.class );
System.out.println( String.format( "PUT to [%s], status code [%d]",
        propertyUri, response.getStatus() ) );
response.close();

If everything goes well, we’ll get a 204 No Content back indicating that the server processed the request
but didn’t echo back the property value.

Adding relationships
Now that we have nodes to represent Joe Strummer and The Clash, we can relate them. The REST
API supports this through a POST of a relationship representation to the start node of the relationship.
Correspondingly in Java we POST some JSON to the URI of our node that represents Joe Strummer, to
establish a relationship between that node and the node representing The Clash.
URI relationshipUri = addRelationship( firstNode, secondNode, "singer",
"{ \"from\" : \"1976\", \"until\" : \"1986\" }" );

Inside the addRelationship method, we determine the URI of the Joe Strummer node’s relationships,
and then POST a JSON description of our intended relationship. This description contains the destination
node, a label for the relationship type, and any attributes for the relation as a JSON collection.
private static URI addRelationship( URI startNode, URI endNode,
        String relationshipType, String jsonAttributes )
        throws URISyntaxException
{
    URI fromUri = new URI( startNode.toString() + "/relationships" );
    String relationshipJson = generateJsonRelationship( endNode,
            relationshipType, jsonAttributes );
    WebResource resource = Client.create()
            .resource( fromUri );
    // POST JSON to the relationships URI
    ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
            .type( MediaType.APPLICATION_JSON )
            .entity( relationshipJson )
            .post( ClientResponse.class );
    final URI location = response.getLocation();
    System.out.println( String.format(
            "POST to [%s], status code [%d], location header [%s]",
            fromUri, response.getStatus(), location.toString() ) );
    response.close();
    return location;
}

If all goes well, we receive a 201 Created status code and a Location header which contains a URI of the
newly created relation.

Add properties to a relationship
Like nodes, relationships can have properties. Since we’re big fans of both Joe Strummer and the Clash,
we’ll add a rating to the relationship so that others can see he’s a 5-star singer with the band.
addMetadataToProperty( relationshipUri, "stars", "5" );

Inside the addMetadataToProperty method, we determine the URI of the properties of the relationship
and PUT our new values (since it’s PUT it will always overwrite existing values, so be careful).
private static void addMetadataToProperty( URI relationshipUri,
        String name, String value ) throws URISyntaxException
{
    URI propertyUri = new URI( relationshipUri.toString() + "/properties" );
    String entity = toJsonNameValuePairCollection( name, value );
    WebResource resource = Client.create()
            .resource( propertyUri );
    ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
            .type( MediaType.APPLICATION_JSON )
            .entity( entity )
            .put( ClientResponse.class );
    System.out.println( String.format(
            "PUT [%s] to [%s], status code [%d]", entity, propertyUri,
            response.getStatus() ) );
    response.close();
}

Assuming all goes well, we'll get a 204 No Content response back from the server (which we can check by
calling ClientResponse.getStatus()) and we've now established a very small graph that we can query.

Querying graphs
As with the embedded version of the database, the Neo4j server uses graph traversals to look for data
in graphs. Currently the Neo4j server expects a JSON payload describing the traversal to be POST-ed at
the starting node for the traversal (though this is likely to change in time to a GET-based approach).
To start this process, we use a simple class that can turn itself into the equivalent JSON, ready for POSTing to the server, and in this case we’ve hardcoded the traverser to look for all nodes with outgoing
relationships with the type "singer".
// TraversalDefinition turns into JSON to send to the Server
TraversalDefinition t = new TraversalDefinition();
t.setOrder( TraversalDefinition.DEPTH_FIRST );
t.setUniqueness( TraversalDefinition.NODE );
t.setMaxDepth( 10 );
t.setReturnFilter( TraversalDefinition.ALL );
t.setRelationships( new Relation( "singer", Relation.OUT ) );

Once we have defined the parameters of our traversal, we just need to transfer it. We do this by
determining the URI of the traversers for the start node, and then POST-ing the JSON representation of
the traverser to it.
URI traverserUri = new URI( startNode.toString() + "/traverse/node" );
WebResource resource = Client.create()
        .resource( traverserUri );
String jsonTraverserPayload = t.toJson();
ClientResponse response = resource.accept( MediaType.APPLICATION_JSON )
        .type( MediaType.APPLICATION_JSON )
        .entity( jsonTraverserPayload )
        .post( ClientResponse.class );
System.out.println( String.format(
        "POST [%s] to [%s], status code [%d], returned data: "
                + System.lineSeparator() + "%s",
        jsonTraverserPayload, traverserUri, response.getStatus(),
        response.getEntity( String.class ) ) );
response.close();

Once that request has completed, we get back our dataset of singers and the bands they belong to:
[ {
"outgoing_relationships" : "http://localhost:7474/db/data/node/82/relationships/out",
"data" : {
"band" : "The Clash",
"name" : "Joe Strummer"
},
"traverse" : "http://localhost:7474/db/data/node/82/traverse/{returnType}",
"all_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/all/{-list|&|types}",
"property" : "http://localhost:7474/db/data/node/82/properties/{key}",
"all_relationships" : "http://localhost:7474/db/data/node/82/relationships/all",
"self" : "http://localhost:7474/db/data/node/82",
"properties" : "http://localhost:7474/db/data/node/82/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/82/relationships/in",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/82/relationships/in/{-list|&|types}",
"create_relationship" : "http://localhost:7474/db/data/node/82/relationships"
}, {
"outgoing_relationships" : "http://localhost:7474/db/data/node/83/relationships/out",

100

Languages
"data" : {
},
"traverse" : "http://localhost:7474/db/data/node/83/traverse/{returnType}",
"all_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/all/{-list|&|types}",
"property" : "http://localhost:7474/db/data/node/83/properties/{key}",
"all_relationships" : "http://localhost:7474/db/data/node/83/relationships/all",
"self" : "http://localhost:7474/db/data/node/83",
"properties" : "http://localhost:7474/db/data/node/83/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/83/relationships/in",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/83/relationships/in/{-list|&|types}",
"create_relationship" : "http://localhost:7474/db/data/node/83/relationships"
} ]

Phew, is that it?
That’s a flavor of what we can do with the REST API. Naturally any of the HTTP idioms we provide on the
server can be easily wrapped, including removing nodes and relationships through DELETE. Still if you’ve
gotten this far, then switching .post() for .delete() in the Jersey client code should be straightforward.

What’s next?
The HTTP API provides a good basis for implementers of client libraries; it's also great for HTTP and
REST folks. In the future, though, we expect that idiomatic language bindings will appear to take
advantage of the REST API while providing comfortable language-level constructs for developers to use,
much as there are similar bindings for the embedded database.

Appendix: the code
• CreateSimpleGraph.java³
• Relation.java⁴
• TraversalDefinition.java⁵

3 https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/CreateSimpleGraph.java
4 https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/Relation.java
5 https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/TraversalDefinition.java


Part III. Cypher Query Language

The Cypher part is the authoritative source for details on the Cypher Query Language. For a short
introduction, see Section 8.1, “What is Cypher?” [106]. To take your first steps with Cypher, see
Chapter 3, Introduction to Cypher [16]. For the terminology used, see Terminology [640].

8. Introduction
8.1. What is Cypher?
8.2. Updating the graph
8.3. Transactions
8.4. Uniqueness
8.5. Parameters
8.6. Compatibility
9. Syntax
9.1. Values
9.2. Expressions
9.3. Identifiers
9.4. Operators
9.5. Comments
9.6. Patterns
9.7. Collections
9.8. Working with NULL
10. General Clauses
10.1. Return
10.2. Order by
10.3. Limit
10.4. Skip
10.5. With
10.6. Unwind
10.7. Union
10.8. Using
11. Reading Clauses
11.1. Match
11.2. Optional Match
11.3. Where
11.4. Start
11.5. Aggregation
11.6. Load CSV
12. Writing Clauses
12.1. Create
12.2. Merge
12.3. Set
12.4. Delete
12.5. Remove
12.6. Foreach
12.7. Create Unique
12.8. Importing CSV files with Cypher
12.9. Using Periodic Commit
13. Functions
13.1. Predicates
13.2. Scalar functions
13.3. Collection functions
13.4. Mathematical functions
13.5. String functions
14. Schema
14.1. Indexes
14.2. Constraints
14.3. Statistics
15. Query Tuning
15.1. How are queries executed?
15.2. How do I profile a query?
15.3. Basic query tuning example .................................................................................................

103

105
106
109
110
111
113
117
118
119
120
123
124
126
127
131
134
136
137
140
142
144
146
148
150
152
154
155
164
166
174
176
182
186
187
192
200
204
205
207
208
211
213
214
215
218
224
229
238
243
244
247
252
253
254
255
256

Cypher Query Language
16. Execution Plans ................................................................................................................................
16.1. Starting point operators .......................................................................................................
16.2. Expand operators ..................................................................................................................
16.3. Combining operators ............................................................................................................
16.4. Row operators .......................................................................................................................
16.5. Update Operators .................................................................................................................

104

259
260
263
265
270
275

Chapter 8. Introduction
To get an overview of Cypher, continue reading Section 8.1, “What is Cypher?” [106]. The rest of this
chapter deals with the context of Cypher statements, like for example transaction management and
how to use parameters. For the Cypher language reference itself see other chapters at Part III, “Cypher
Query Language” [102]. To take your first steps with Cypher, see Chapter 3, Introduction to Cypher [16].
For the terminology used, see Terminology [640].


8.1. What is Cypher?
Introduction
Cypher is a declarative graph query language that allows for expressive and efficient querying and
updating of the graph store. Cypher is a relatively simple but still very powerful language. Very
complicated database queries can easily be expressed through Cypher. This allows you to focus on
your domain instead of getting lost in database access.
Cypher is designed to be a humane query language, suitable for both developers and (importantly, we
think) operations professionals. Our guiding goal is to make the simple things easy, and the complex
things possible. Its constructs are based on English prose and neat iconography which helps to make
queries more self-explanatory. We have tried to optimize the language for reading and not for writing.
Being a declarative language, Cypher focuses on the clarity of expressing what to retrieve from a graph,
not on how to retrieve it. This is in contrast to imperative languages like Java, scripting languages like
Gremlin (http://gremlin.tinkerpop.com), and the JRuby Neo4j bindings (https://github.com/neo4jrb/neo4j/).
This approach makes query optimization an implementation detail instead of burdening the user with it
and requiring her to update all traversals just because the physical database structure has changed (new
indexes etc.).
Cypher is inspired by a number of different approaches and builds upon established practices for
expressive querying. Most of the keywords like WHERE and ORDER BY are inspired by SQL
(http://en.wikipedia.org/wiki/SQL). Pattern matching borrows expression approaches from SPARQL
(http://en.wikipedia.org/wiki/SPARQL). Some of the collection semantics have been borrowed from
languages such as Haskell and Python.

Structure
Cypher borrows its structure from SQL — queries are built up using various clauses.
Clauses are chained together, and they feed intermediate result sets between each other. For
example, the matching identifiers from one MATCH clause will be the context that the next clause exists
in.
The query language is comprised of several distinct clauses. You can read more details about them later
in the manual.
Here are a few clauses used to read from the graph:
• MATCH: The graph pattern to match. This is the most common way to get data from the graph.
• WHERE: Not a clause in its own right, but rather part of MATCH, OPTIONAL MATCH and WITH. Adds constraints
to a pattern, or filters the intermediate result passing through WITH.
• RETURN: What to return.
Let’s see MATCH and RETURN in action.
Imagine an example graph like the following one:


Figure 8.1. Example Graph: five nodes with name properties 'John', 'Sara', 'Joe', 'Maria' and 'Steve'.
'John' has friend relationships to 'Sara' and 'Joe'; 'Sara' has a friend relationship to 'Maria', and 'Joe'
to 'Steve'.

For example, here is a query which finds John and John’s friends-of-friends (not his direct friends),
returning both John and any friends-of-friends that are found.
MATCH (john {name: 'John'})-[:friend]->()-[:friend]->(fof)
RETURN john.name, fof.name

Resulting in:
john.name | fof.name
"John"    | "Maria"
"John"    | "Steve"
2 rows

Next up we will add filtering to set more parts in motion:
We take a list of user names and find all nodes with names from this list, match their friends and return
only those followed users who have a name property starting with S.
MATCH (user)-[:friend]->(follower)
WHERE user.name IN ['Joe', 'John', 'Sara', 'Maria', 'Steve'] AND follower.name =~ 'S.*'
RETURN user.name, follower.name

Resulting in:
user.name | follower.name
"John"    | "Sara"
"Joe"     | "Steve"
2 rows

And here are examples of clauses that are used to update the graph:
• CREATE (and DELETE): Create (and delete) nodes and relationships.
• SET (and REMOVE): Set values to properties and add labels on nodes using SET and use REMOVE to remove
them.
• MERGE: Match existing or create new nodes and patterns. This is especially useful together with
uniqueness constraints.
For more Cypher examples, see Chapter 5, Basic Data Modeling Examples [47] as well as the rest of
the Cypher part with details on the language. To use Cypher from Java, see Section 33.15, “Execute
Cypher Queries from Java” [609]. To take your first steps with Cypher, see Chapter 3, Introduction to
Cypher [16].


8.2. Updating the graph
Cypher can be used for both querying and updating your graph.

The Structure of Updating Queries
• A Cypher query part can’t both match and update the graph at the same time.
• Every part can either read and match on the graph, or make updates on it.
If you read from the graph and then update the graph, your query implicitly has two parts — the
reading is the first part, and the writing is the second part.
If your query only performs reads, Cypher will be lazy and not actually match the pattern until you ask
for the results. In an updating query, the semantics are that all the reading will be done before any
writing actually happens.
The only pattern where the query parts are implicit is when you first read and then write — any other
order and you have to be explicit about your query parts. The parts are separated using the WITH
statement. WITH is like an event horizon — it’s a barrier between a plan and the finished execution of
that plan.
When you want to filter using aggregated data, you have to chain together two reading query
parts — the first one does the aggregating, and the second filters on the results coming from the first
one.
MATCH (n {name: 'John'})-[:FRIEND]-(friend)
WITH n, count(friend) as friendsCount
WHERE friendsCount > 3
RETURN n, friendsCount

Using WITH, you specify how you want the aggregation to happen, and that the aggregation has to be
finished before Cypher can start filtering.
Here’s an example of updating the graph, writing the aggregated data to the graph:
MATCH (n {name: 'John'})-[:FRIEND]-(friend)
WITH n, count(friend) as friendsCount
SET n.friendsCount = friendsCount
RETURN n.friendsCount

You can chain together as many query parts as the available memory permits.
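For example, a minimal sketch of such chaining (the Person label and FRIEND relationship type are illustrative, not taken from this manual's examples):

MATCH (a:Person)
WITH a
ORDER BY a.name LIMIT 10
MATCH (a)-[:FRIEND]->(f)
WITH a, count(f) AS friends
RETURN a.name, friends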

Returning data
Any query can return data. If your query only reads, it has to return data — it serves no purpose if it
doesn’t, and it is not a valid Cypher query. Queries that update the graph don’t have to return anything,
but they can.
After all the parts of the query comes one final RETURN clause. RETURN is not part of any query part — it
is like the period at the end of a sentence. The RETURN clause has three sub-clauses that come with it:
SKIP/LIMIT and ORDER BY.
If you return graph elements from a query that has just deleted them — beware, you are holding a
pointer that is no longer valid. Operations on that node are undefined.


8.3. Transactions
Any query that updates the graph will run in a transaction. An updating query will always either fully
succeed, or not succeed at all.
Cypher will either create a new transaction or run inside an existing one:
• If no transaction exists in the running context, Cypher will create one and commit it once the query
finishes.
• In case there already exists a transaction in the running context, the query will run inside it, and
nothing will be persisted to disk until that transaction is successfully committed.
This can be used to have multiple queries be committed as a single transaction:
1. Open a transaction,
2. run multiple updating Cypher queries,
3. and commit all of them in one go.
Note that a query will hold the changes in memory until the whole query has finished executing. A large
query will consequently need a JVM with lots of heap space.
For using transactions over the REST API, see Section 21.1, “Transactional Cypher HTTP
endpoint” [298].
When writing server extensions or using Neo4j embedded, remember that all iterators returned from
an execution result should be either fully exhausted or closed to ensure that the resources bound to
them will be properly released. Resources include transactions started by the query, so failing to do so
may, for example, lead to deadlocks or other weird behavior.


8.4. Uniqueness
While pattern matching, Neo4j makes sure to not include matches where the same graph relationship
is found multiple times in a single pattern. In most use cases, this is a sensible thing to do.
Example: looking for a user’s friends of friends should not return said user.
Let’s create a few nodes and relationships:
CREATE (adam:User { name: 'Adam' }),(pernilla:User { name: 'Pernilla' }),(david:User { name: 'David' }),
(adam)-[:FRIEND]->(pernilla),(pernilla)-[:FRIEND]->(david)

Which gives us the following graph:
Three User nodes named 'Adam', 'Pernilla' and 'David', where Adam is a FRIEND of Pernilla and
Pernilla is a FRIEND of David.

Now let’s look for friends of friends of Adam:
MATCH (user:User { name: 'Adam' })-[r1:FRIEND]-()-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName

fofName
"David"
1 row

In this query, Cypher makes sure to not return matches where the pattern relationships r1 and r2 point
to the same graph relationship.
This is however not always desired. If the query should return the user, it is possible to spread the
matching over multiple MATCH clauses, like so:
MATCH (user:User { name: 'Adam' })-[r1:FRIEND]-(friend)
MATCH (friend)-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName

fofName
"David"
"Adam"
2 rows

Note that while the following query looks similar to the previous one, it is actually equivalent to the one
before that.

MATCH (user:User { name: 'Adam' })-[r1:FRIEND]-(friend),(friend)-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName

Here, the MATCH clause has a single pattern with two paths, while the previous query has two distinct
patterns.
fofName
"David"
1 row


8.5. Parameters
Cypher supports querying with parameters. This means developers don’t have to resort to string
building to create a query. In addition to that, it also makes caching of execution plans much easier for
Cypher.
Parameters can be used for literals and expressions in the WHERE clause, for the index value in the
START clause, in index queries, and finally for node/relationship ids. Parameters cannot be used for
property names, relationship types and labels, since these patterns are part of the query structure that
is compiled into a query plan.
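As a hedged illustration of this restriction (the Person label and the name parameter are examples only), the first query below parameterizes a property value and is valid, while the second attempts to parameterize a label and is not legal Cypher:

// Valid: the parameter supplies a property value
MATCH (n:Person)
WHERE n.name = { name }
RETURN n

// Not valid: labels are part of the compiled query structure
// MATCH (n:{ label }) RETURN n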
Accepted names for parameters are letters and numbers, and any combination of these.
For details on using parameters via the Neo4j REST API, see Section 21.1, “Transactional Cypher
HTTP endpoint” [298]. For details on parameters when using the Neo4j embedded Java API, see
Section 33.16, “Query Parameters” [611].
Below follows a comprehensive set of examples of parameter usage. The parameters are given as JSON
here. Exactly how to submit them depends on the driver in use.

String literal
Parameters
{
"name" : "Johan"
}

Query
MATCH (n)
WHERE n.name = { name }
RETURN n

You can use parameters in this syntax as well:
Parameters
{
"name" : "Johan"
}

Query
MATCH (n { name: { name }})
RETURN n

Regular expression
Parameters
{
"regex" : ".*h.*"
}

Query
MATCH (n)
WHERE n.name =~ { regex }
RETURN n.name

Case-sensitive string pattern matching
Parameters


{
"name" : "Michael"
}

Query
MATCH (n)
WHERE n.name STARTS WITH { name }
RETURN n.name

Create node with properties
Parameters
{
"props" : {
"position" : "Developer",
"name" : "Andres"
}
}

Query
CREATE ({ props })

Create multiple nodes with properties
Parameters
{
"props" : [ {
"position" : "Developer",
"awesome" : true,
"name" : "Andres"
}, {
"position" : "Developer",
"name" : "Michael",
"children" : 3
} ]
}

Query
CREATE (n:Person { props })
RETURN n

Setting all properties on node
Note that this will replace all the current properties.
Parameters
{
"props" : {
"position" : "Developer",
"name" : "Andres"
}
}

Query
MATCH (n)
WHERE n.name='Michaela'
SET n = { props }


SKIP and LIMIT
Parameters
{
"s" : 1,
"l" : 1
}

Query
MATCH (n)
RETURN n.name
SKIP { s }
LIMIT { l }

Node id
Parameters
{
"id" : 0
}

Query
MATCH (n)
WHERE id(n)= { id }
RETURN n.name

Multiple node ids
Parameters
{
"ids" : [ 0, 1, 2 ]
}

Query
MATCH (n)
WHERE id(n) IN { ids }
RETURN n.name

Index value (legacy indexes)
Parameters
{
"value" : "Michaela"
}

Query
START n=node:people(name = { value })
RETURN n

Index query (legacy indexes)
Parameters
{
"query" : "name:Andreas"
}

Query
START n=node:people({ query })
RETURN n


8.6. Compatibility
Cypher is still changing rather rapidly. Parts of the changes are internal — we add new pattern
matchers, aggregators and optimizations or write new query planners, which hopefully makes your
queries run faster.
Other changes are directly visible to our users — the syntax is still changing. New concepts are being
added and old ones changed to fit into new possibilities. To guard you from having to keep up with our
syntax changes, Neo4j allows you to use an older parser, but still gain speed from new optimizations.
There are two ways you can select which parser to use. You can configure your database with the
configuration parameter cypher_parser_version, and enter which parser you’d like to use (see the
section called “Supported Language Versions” [117]). Any Cypher query that doesn’t explicitly say
anything else will get the parser you have configured, or the latest parser if none is configured.
The other way is on a query by query basis. By simply putting CYPHER 2.2 at the beginning, that
particular query will be parsed with the 2.2 version of the parser. Below is an example using the START
clause to access a legacy index:
CYPHER 2.2
START n=node:nodes(name = "A")
RETURN n

Accessing entities by id via START
In versions of Cypher prior to 2.2 it was also possible to access specific nodes or relationships using the
START clause. In this case you could use a syntax like the following:
CYPHER 1.9
START n=node(42)
RETURN n

Note

The use of the START clause to find nodes by ID was deprecated from Cypher 2.0 onwards
and is now entirely disabled in Cypher 2.2 and up. You should instead make use of the MATCH
clause for starting points. See Section 11.1, “Match” [155] for more information on the
correct syntax for this. The START clause should only be used when accessing legacy indexes
(see Chapter 35, Legacy Indexing [621]).

Supported Language Versions
Neo4j 2.3 supports the following versions of the Cypher language:
• Neo4j Cypher 2.3
• Neo4j Cypher 2.2
• Neo4j Cypher 1.9

Tip

Each release of Neo4j supports a limited number of old Cypher Language Versions. When
you upgrade to a new release of Neo4j, please make sure that it supports the Cypher
language version you need. If not, you may need to modify your queries to work with a
newer Cypher language version.


Chapter 9. Syntax
The nitty-gritty details of Cypher syntax.


9.1. Values
All values that are handled by Cypher have a distinct type. The supported types of values are:
• Numeric values,
• String values,
• Boolean values,
• Nodes,
• Relationships,
• Paths,
• Maps from Strings to other values,
• Collections of any other type of value.

Most types of values can be constructed in a query using literal expressions (see Section 9.2,
“Expressions” [120]). Special care must be taken when using NULL, as NULL is a value of every type (see
Section 9.8, “Working with NULL” [134]). Nodes, relationships, and paths are returned as a result of
pattern matching.
Note that labels are not values but are a form of pattern syntax.


9.2. Expressions
Expressions in general
An expression in Cypher can be:
• A decimal (integer or double) literal: 13, -40000, 3.14, 6.022E23.
• A hexadecimal integer literal (starting with 0x): 0x13af, 0xFC3A9, -0x66eff.
• An octal integer literal (starting with 0): 01372, 02127, -05671.
• A string literal: "Hello", 'World'.
• A boolean literal: true, false, TRUE, FALSE.
• An identifier: n, x, rel, myFancyIdentifier, `A name with weird stuff in it[]!`.
• A property: n.prop, x.prop, rel.thisProperty, myFancyIdentifier.`(weird property name)`.
• A dynamic property: n["prop"], rel[n.city + n.zip], map[coll[0]].
• A parameter: {param}, {0}
• A collection of expressions: ["a", "b"], [1,2,3], ["a", 2, n.property, {param}], [ ].
• A function call: length(p), nodes(p).
• An aggregate function: avg(x.prop), count(*).
• A path-pattern: (a)-->()<--(b).
• An operator application: 1 + 2 and 3 < 4.
• A predicate expression is an expression that returns true or false: a.prop = "Hello", length(p) > 10,
has(a.name).
• A regular expression: a.name =~ "Tob.*"
• A case-sensitive string matching expression: a.surname STARTS WITH "Sven", a.surname ENDS WITH "son"
or a.surname CONTAINS "son"
• A CASE expression.

Note on string literals
String literals can contain these escape sequences.
Escape sequence | Character
\t              | Tab
\b              | Backspace
\n              | Newline
\r              | Carriage return
\f              | Form feed
\'              | Single quote
\"              | Double quote
\\              | Backslash
\uxxxx          | Unicode UTF-16 code point (4 hex digits must follow the \u)
\Uxxxxxxxx      | Unicode UTF-32 code point (8 hex digits must follow the \U)
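For instance, escape sequences can be used directly inside string literals (a minimal sketch; the column aliases are arbitrary):

RETURN "Line one\nLine two" AS text, "caf\u00e9" AS unicodeExample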

Case Expressions
Cypher supports CASE expressions, a generic conditional construct similar to if/else
statements in other languages. Two variants of CASE exist — the simple form and the generic form.

Simple CASE
The expression is calculated, and compared in order with the WHEN clauses until a match is found. If no
match is found the expression in the ELSE clause is used, or null, if no ELSE case exists.
Syntax:
CASE test
WHEN value THEN result
[WHEN ...]
[ELSE default]
END

Arguments:
• test: A valid expression.
• value: An expression whose result will be compared to the test expression.
• result: This is the result expression used if the value expression matches the test expression.
• default: The expression to use if no match is found.

Query
MATCH (n)
RETURN
CASE n.eyes
WHEN 'blue'
THEN 1
WHEN 'brown'
THEN 2
ELSE 3 END AS result

Result
result
2
1
2
1
3
5 rows

Generic CASE
The predicates are evaluated in order until a true value is found, and the result value is used. If no
match is found the expression in the ELSE clause is used, or null, if no ELSE case exists.
Syntax:
CASE
WHEN predicate THEN result
[WHEN ...]
[ELSE default]
END

Arguments:
• predicate: A predicate that is tested to find a valid alternative.
• result: This is the result expression used if the predicate matches.
• default: The expression to use if no match is found.
Query


MATCH (n)
RETURN
CASE
WHEN n.eyes = 'blue'
THEN 1
WHEN n.age < 40
THEN 2
ELSE 3 END AS result

Result
result
3
1
2
1
3
5 rows


9.3. Identifiers
When you reference parts of a pattern or a query, you do so by naming them. The names you give the
different parts are called identifiers.
In this example:
MATCH (n)-->(b) RETURN b

The identifiers are n and b.
Identifier names are case sensitive, and can contain underscores and alphanumeric characters (a-z,
0-9), but must always start with a letter. If other characters are needed, you can quote the identifier
using backquote (`) signs.
The same rules apply to property names.
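For example (a minimal sketch with hypothetical data), an identifier containing spaces must be quoted with backquotes everywhere it is used:

MATCH (`weird node name`)-->(b)
RETURN `weird node name`.name, b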

Identifiers are only visible in the same query part

Identifiers are not carried over to subsequent queries. If multiple query parts are chained
together using WITH, identifiers have to be listed in the WITH clause to be carried over to the
next part. For more information see Section 10.5, “With” [146].


9.4. Operators
Mathematical operators
The mathematical operators are +, -, *, /, % and ^.

Comparison operators
The comparison operators are =, <>, <, >, <=, >=, IS NULL, and IS NOT NULL. See the section called “Equality
and Comparison of Values” [124] on how they behave.
The operators STARTS WITH, ENDS WITH and CONTAINS can be used to search for a string value by its
content.

Boolean operators
The boolean operators are AND, OR, XOR, NOT.

String operators
Strings can be concatenated using the + operator. For regular expression matching the =~ operator is
used.

Collection operators
Collections can be concatenated using the + operator. To check if an element exists in a collection, you
can use the IN operator.
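A minimal sketch showing the string and collection operators side by side (the values are made up):

RETURN "Hello " + "World" AS greeting,
       [1,2] + [3] AS combined,
       3 IN [1,2,3] AS found,
       "Peter" =~ "Pet.*" AS matched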

Property operators
Note

Since version 2.0, the previously existing property operators ? and ! have been removed.
This syntax is no longer supported. Missing properties are now returned as NULL. Please use
(NOT(has(<ident>.prop)) OR <ident>.prop=<value>) if you really need the old behavior of the
? operator. — Also, the use of ? for optional relationships has been removed in favor of the
newly introduced OPTIONAL MATCH clause.

Equality and Comparison of Values
Equality
Cypher supports comparing values (see Section 9.1, “Values” [119]) by equality using the = and <>
operators.
Values of the same type are only equal if they are the same identical value (e.g. 3 = 3 and "x" <> "xy").
Maps are only equal if they map exactly the same keys to equal values and collections are only equal if
they contain the same sequence of equal values (e.g. [3, 4] = [1+2, 8/2]).
Values of different types are considered as equal according to the following rules:
• Paths are treated as collections of alternating nodes and relationships and are equal to all collections
that contain that very same sequence of nodes and relationships.
• Testing any value against NULL with both the = and the <> operators always evaluates to NULL. This includes NULL =
NULL and NULL <> NULL. The only way to reliably test if a value v is NULL is by using the special v IS NULL
or v IS NOT NULL operators.
All other combinations of types of values cannot be compared with each other. Especially, nodes,
relationships, and literal maps are incomparable with each other.
It is an error to compare values that cannot be compared.

Ordering and Comparison of Values
The comparison operators <=, < (for ascending) and >=, > (for descending) are used to compare values
for ordering. The following points give some details on how the comparison is performed.
• Numerical values are compared for ordering using numerical order (e.g. 3 < 4 is true).
• The special value java.lang.Double.NaN is regarded as being larger than all other numbers.
• String values are compared for ordering using lexicographic order (e.g. "x" < "xy").
• Boolean values are compared for ordering such that false < true.
• Comparing for ordering when one argument is NULL is NULL (e.g. NULL < 3 is NULL).
• It is an error to compare other types of values with each other for ordering.
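A few of these ordering comparisons, shown as a minimal sketch:

RETURN 3 < 4 AS numeric, "x" < "xy" AS lexicographic, false < true AS boolean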

Chaining Comparison Operations
Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y AND y <= z.
Formally, if a, b, c, ..., y, z are expressions and op1, op2, ..., opN are comparison operators, then a
op1 b op2 c ... y opN z is equivalent to a op1 b and b op2 c and ... y opN z.
Note that a op1 b op2 c does not imply any kind of comparison between a and c, so that, e.g., x < y > z
is perfectly legal (though perhaps not pretty).
The example:
MATCH (n) WHERE 21 < n.age <= 30 RETURN n

is equivalent to
MATCH (n) WHERE 21 < n.age AND n.age <= 30 RETURN n

Thus it will match all nodes where the age is between 21 and 30.
This syntax extends to all equality and inequality comparisons, as well as extending to chains longer
than three.
For example:
a < b = c <= d <> e

Is equivalent to:
a < b AND b = c AND c <= d AND d <> e

For other comparison operators, see the section called “Comparison operators” [124].


9.5. Comments
To add comments to your queries, use double slashes (//). Examples:
MATCH (n) RETURN n //This is an end of line comment
MATCH (n)
//This is a whole line comment
RETURN n
MATCH (n) WHERE n.property = "//This is NOT a comment" RETURN n


9.6. Patterns
Patterns and pattern-matching are at the very heart of Cypher, so being effective with Cypher requires
a good understanding of patterns.
Using patterns, you describe the shape of the data you’re looking for. For example, in the MATCH clause
you describe the shape with a pattern, and Cypher will figure out how to get that data for you.
The pattern describes the data using a form that is very similar to how one typically draws the shape of
property graph data on a whiteboard: usually as circles (representing nodes) and arrows between them
to represent relationships.
Patterns appear in multiple places in Cypher: in MATCH, CREATE and MERGE clauses, and in pattern
expressions. Each of these is described in more detail in:
• Section 11.1, “Match” [155]
• Section 11.2, “Optional Match” [164]
• Section 12.1, “Create” [187]
• Section 12.2, “Merge” [192]
• the section called “Using path patterns in WHERE” [170]

Patterns for nodes
The very simplest “shape” that can be described in a pattern is a node. A node is described using a pair
of parentheses, and is typically given a name. For example:
(a)

This simple pattern describes a single node, and names that node using the identifier a.

Patterns for related nodes
More interesting are patterns that describe multiple nodes and relationships between them. Cypher
patterns describe relationships by employing an arrow between two nodes. For example:
(a)-->(b)

This pattern describes a very simple data shape: two nodes, and a single relationship from one to the
other. In this example, the two nodes are both named as a and b respectively, and the relationship is
“directed”: it goes from a to b.
This way of describing nodes and relationships can be extended to cover an arbitrary number of nodes
and the relationships between them, for example:
(a)-->(b)<--(c)

Such a series of connected nodes and relationships is called a "path".
Note that the naming of the nodes in these patterns is only necessary should one need to refer to the
same node again, either later in the pattern or elsewhere in the Cypher query. If this is not necessary
then the name may be omitted, like so:
(a)-->()<--(c)

Labels
In addition to simply describing the shape of a node in the pattern, one can also describe attributes.
The simplest attribute that can be described in the pattern is a label that the node must have. For
example:
(a:User)-->(b)

One can also describe a node that has multiple labels:
(a:User:Admin)-->(b)

Specifying properties
Nodes and relationships are the fundamental structures in a graph. Neo4j uses properties on both of
these to allow for far richer models.
Properties can be expressed in patterns using a map-construct: curly brackets surrounding a number of
key-expression pairs, separated by commas. E.g. a node with two properties on it would look like:
(a { name: "Andres", sport: "Brazilian Ju-Jitsu" })

A relationship with expectations on it could look like:
(a)-[{blocked: false}]->(b)

When properties appear in patterns, they add an additional constraint to the shape of the data. In
the case of a CREATE clause, the properties will be set in the newly created nodes and relationships.
In the case of a MERGE clause, the properties will be used as additional constraints on the shape any
existing data must have (the specified properties must exactly match any existing data in the graph).
If no matching data is found, then MERGE behaves like CREATE and the properties will be set in the newly
created nodes and relationships.
Note that patterns supplied to CREATE may use a single parameter to specify properties, e.g: CREATE
(node {paramName}). This is not possible with patterns used in other clauses, as Cypher needs to know
the property names at the time the query is compiled, so that matching can be done effectively.
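As a brief, hedged illustration (the label and properties are examples only), the same map construct works with MERGE, which matches an existing node with those properties or creates one:

MERGE (a:Person { name: "Andres", sport: "Brazilian Ju-Jitsu" })
RETURN a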

Describing relationships
The simplest way to describe a relationship is by using the arrow between two nodes, as in the
previous examples. Using this technique, you can describe that the relationship should exist and the
directionality of it. If you don’t care about the direction of the relationship, the arrow head can be
omitted, like so:
(a)--(b)

As with nodes, relationships may also be given names. In this case, a pair of square brackets is used to
break up the arrow and the identifier is placed between. For example:
(a)-[r]->(b)

Much like labels on nodes, relationships can have types. To describe a relationship with a specific type,
you can specify this like so:
(a)-[r:REL_TYPE]->(b)

Unlike labels, relationships can only have one type. But if we’d like to describe some data such that the
relationship could have any one of a set of types, then they can all be listed in the pattern, separating
them with the pipe symbol | like this:
(a)-[r:TYPE1|TYPE2]->(b)

Note that this form of pattern can only be used to describe existing data (i.e. when using a pattern
with MATCH or as an expression). It will not work with CREATE or MERGE, since it’s not possible to create a
relationship with multiple types.
As with nodes, the name of the relationship can always be omitted, in this case like so:
(a)-[:REL_TYPE]->(b)


Variable length
Caution

Variable length pattern matching in versions 2.1.x and earlier does not enforce relationship
uniqueness for patterns described inside of a single MATCH clause. This means that a query
such as the following: MATCH (a)-[r]->(b), (a)-[rs*]->(c) RETURN * may include r as part of
the rs set. This behavior has changed in versions 2.2.0 and later, in such a way that r will be
excluded from the result set, as this better adheres to the rules of relationship uniqueness
as documented in Section 8.4, “Uniqueness” [111]. If you have a query pattern that needs
to retrace relationships rather than ignoring them as the relationship uniqueness rules
normally dictate, you can accomplish this using multiple MATCH clauses, as follows: MATCH
(a)-[r]->(b) MATCH (a)-[rs*]->(c) RETURN *. This will work in all versions of Neo4j that
support the MATCH clause, namely 2.0.0 and later.
Rather than describing a long path using a sequence of many node and relationship descriptions in a
pattern, many relationships (and the intermediate nodes) can be described by specifying a length in the
relationship description of a pattern. For example:
(a)-[*2]->(b)

This describes a graph of three nodes and two relationships, all in one path (a path of length 2). This is
equivalent to:
(a)-->()-->(b)

A range of lengths can also be specified: such relationship patterns are called “variable length
relationships”. For example:
(a)-[*3..5]->(b)

This is a minimum length of 3, and a maximum of 5. It describes a graph of either 4 nodes and 3
relationships, 5 nodes and 4 relationships or 6 nodes and 5 relationships, all connected together in a
single path.
Either bound can be omitted. For example, to describe paths of length 3 or more, use:
(a)-[*3..]->(b)

And to describe paths of length 5 or less, use:
(a)-[*..5]->(b)

Both bounds can be omitted, allowing paths of any length to be described:
(a)-[*]->(b)

As a simple example, let’s take the query below:
Query
MATCH (me)-[:KNOWS*1..2]-(remote_friend)
WHERE me.name = "Filipa"
RETURN remote_friend.name

Result
remote_friend.name
"Dilshad"
"Anders"
2 rows

This query finds data in the graph with a shape that fits the pattern: specifically a node (with the name
property Filipa) and then the KNOWS related nodes, one or two steps out. This is a typical example of
finding first and second degree friends.
Note that variable length relationships cannot be used with CREATE and MERGE.

Assigning to path identifiers
As described above, a series of connected nodes and relationships is called a "path". Cypher allows
paths to be named using an identifier, like so:
p = (a)-[*3..5]->(b)

You can do this in MATCH, CREATE and MERGE, but not when using patterns as expressions.
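A minimal sketch (the property value is illustrative): a named path can then be returned, or passed to path functions such as length():

MATCH p = (a { name: "A" })-[*1..3]->(b)
RETURN p, length(p)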


9.7. Collections
Cypher has good support for collections.

Collections in general
A literal collection is created by using brackets and separating the elements in the collection with
commas.
Query
RETURN [0,1,2,3,4,5,6,7,8,9] AS collection

Result
collection
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 row

In our examples, we’ll use the range function. It gives you a collection containing all numbers between
given start and end numbers. Range is inclusive in both ends.
To access individual elements in the collection, we use the square brackets again. This will extract from
the start index and up to but not including the end index.
Query
RETURN range(0,10)[3]

Result
range(0,10)[3]
3
1 row

You can also use negative numbers, to start from the end of the collection instead.
Query
RETURN range(0,10)[-3]

Result
range(0,10)[-3]
8
1 row

Finally, you can use ranges inside the brackets to return ranges of the collection.
Query
RETURN range(0,10)[0..3]

Result
range(0,10)[0..3]
[0, 1, 2]
1 row

Query
RETURN range(0,10)[0..-5]

Result
range(0,10)[0..-5]
[0, 1, 2, 3, 4, 5]
1 row

Query
RETURN range(0,10)[-5..]

Result
range(0,10)[-5..]
[6, 7, 8, 9, 10]
1 row

Query
RETURN range(0,10)[..4]

Result
range(0,10)[..4]
[0, 1, 2, 3]
1 row

Note

Out-of-bound slices are simply truncated, but out-of-bound single elements return NULL.
Query
RETURN range(0,10)[15]

Result
range(0,10)[15]

1 row

Query
RETURN range(0,10)[5..15]

Result
range(0,10)[5..15]
[5, 6, 7, 8, 9, 10]
1 row

You can get the size of a collection like this:
Query
RETURN size(range(0,10)[0..3])

Result
size(range(0,10)[0..3])
3
1 row

List comprehension
List comprehension is a syntactic construct available in Cypher for creating a collection based on
existing collections. It follows the form of the mathematical set-builder notation (set comprehension)
instead of the use of map and filter functions.
Query


RETURN [x IN range(0,10) WHERE x % 2 = 0 | x^3] AS result

Result
result
[0.0, 8.0, 64.0, 216.0, 512.0, 1000.0]
1 row

Either the WHERE part, or the expression, can be omitted, if you only want to filter or map respectively.
Query
RETURN [x IN range(0,10) WHERE x % 2 = 0] AS result

Result
result
[0, 2, 4, 6, 8, 10]
1 row

Query
RETURN [x IN range(0,10) | x^3] AS result

Result
result
[0.0, 1.0, 8.0, 27.0, 64.0, 125.0, 216.0, 343.0, 512.0, 729.0, 1000.0]
1 row

Literal maps
From Cypher, you can also construct maps. Through REST you will get JSON objects; in Java they will be
java.util.Map.
Query
RETURN { key : "Value", collectionKey: [{ inner: "Map1" }, { inner: "Map2" }]} AS result

Result
result
{key -> "Value", collectionKey -> [{inner -> "Map1"}, {inner -> "Map2"}]}
1 row


9.8. Working with NULL
Introduction to NULL in Cypher
In Cypher, NULL is used to represent missing or undefined values. Conceptually, NULL means “a missing
unknown value” and it is treated somewhat differently from other values. For example getting a
property from a node that does not have said property produces NULL. Most expressions that take NULL
as input will produce NULL. This includes boolean expressions that are used as predicates in the WHERE
clause. In this case, anything that is not TRUE is interpreted as being false.
NULL is not equal to NULL. Not knowing two values does not imply that they are the same value. So the
expression NULL = NULL yields NULL and not TRUE.
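A quick sketch of this behavior:

RETURN NULL = NULL AS equality, NULL <> NULL AS inequality, NULL IS NULL AS isNull

The first two columns are NULL, while only IS NULL gives the definite answer TRUE.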

Logical operations with NULL
The logical operators (AND, OR, XOR, IN, NOT) treat NULL as the “unknown” value of three-valued logic. Here is
the truth table for AND, OR and XOR.
a     | b     | a AND b | a OR b | a XOR b
FALSE | FALSE | FALSE   | FALSE  | FALSE
FALSE | NULL  | FALSE   | NULL   | NULL
FALSE | TRUE  | FALSE   | TRUE   | TRUE
TRUE  | FALSE | FALSE   | TRUE   | TRUE
TRUE  | NULL  | NULL    | TRUE   | NULL
TRUE  | TRUE  | TRUE    | TRUE   | FALSE
NULL  | FALSE | FALSE   | NULL   | NULL
NULL  | NULL  | NULL    | NULL   | NULL
NULL  | TRUE  | NULL    | TRUE   | NULL

The IN operator and NULL
The IN operator follows similar logic. If Cypher knows that something exists in a collection, the result
will be TRUE. Any collection that contains a NULL and doesn’t have a matching element will return NULL.
Otherwise, the result will be false. Here is a table with examples:
Expression         | Result
2 IN [1, 2, 3]     | TRUE
2 IN [1, NULL, 3]  | NULL
2 IN [1, 2, NULL]  | TRUE
2 IN [1]           | FALSE
2 IN []            | FALSE
NULL IN [1,2,3]    | NULL
NULL IN [1,NULL,3] | NULL
NULL IN []         | FALSE

Using ALL, ANY, NONE, and SINGLE follows a similar rule. If the result can be calculated definitely, TRUE or
FALSE is returned. Otherwise NULL is produced.
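For example (a minimal sketch), ALL can only give a definite answer when no NULL stands in the way:

RETURN ALL (x IN [1, 2, 3] WHERE x > 0) AS definite,
       ALL (x IN [1, NULL, 3] WHERE x > 0) AS unknown

The first column is TRUE; the second is NULL, since the NULL element leaves the predicate undecided.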

Expressions that return NULL
• Getting a missing element from a collection: [][0], head([])
• Trying to access a property that does not exist on a node or relationship: n.missingProperty
• Comparisons when either side is NULL: 1 < NULL
• Arithmetic expressions containing NULL: 1 + NULL
• Function calls where any arguments are NULL: sin(NULL)
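These cases can be verified directly (a minimal sketch):

RETURN [][0] AS missingElement, 1 + NULL AS arithmetic, sin(NULL) AS functionCall

All three columns return NULL.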


Chapter 10. General Clauses


10.1. Return
The RETURN clause defines what to include in the query result set.
In the RETURN part of your query, you define which parts of the pattern you are interested in. It can be
nodes, relationships, or properties on these.

Tip

If what you actually want is the value of a property, make sure to not return the full node/
relationship. This will improve performance.

Figure 10.1. Graph: a node with properties name = 'A', happy = 'Yes!' and age = 55, connected to a
node with name = 'B' by a BLOCKS and a KNOWS relationship.

Return nodes
To return a node, list it in the RETURN statement.
Query
MATCH (n { name: "B" })
RETURN n

The example will return the node.
Result
n
Node[1]{name:"B"}
1 row

Return relationships
To return a relationship, just include it in the RETURN list.
Query
MATCH (n { name: "A" })-[r:KNOWS]->(c)
RETURN r

The relationship is returned by the example.
Result
r
:KNOWS[0]{}
1 row

Return property
To return a property, use the dot separator, like this:
Query
MATCH (n { name: "A" })
RETURN n.name

The value of the property name gets returned.
Result
n.name
"A"
1 row

Return all elements
When you want to return all nodes, relationships and paths found in a query, you can use the * symbol.
Query
MATCH p=(a { name: "A" })-[r]->(b)
RETURN *

This returns the two nodes, the relationship and the path used in the query.
Result
a                                       | b                 | p                                                                           | r
Node[0]{name:"A", happy:"Yes!", age:55} | Node[1]{name:"B"} | [Node[0]{name:"A", happy:"Yes!", age:55}, :BLOCKS[1]{}, Node[1]{name:"B"}] | :BLOCKS[1]{}
Node[0]{name:"A", happy:"Yes!", age:55} | Node[1]{name:"B"} | [Node[0]{name:"A", happy:"Yes!", age:55}, :KNOWS[0]{}, Node[1]{name:"B"}]  | :KNOWS[0]{}
2 rows

Identifier with uncommon characters
To introduce a placeholder that is made up of characters that are outside of the English alphabet, you
can use ` (backquote) to enclose the identifier, like this:
Query
MATCH (`This isn't a common identifier`)
WHERE `This isn't a common identifier`.name='A'
RETURN `This isn't a common identifier`.happy

The node with name "A" is returned
Result
`This isn't a common identifier`.happy
"Yes!"
1 row

Column alias
If the name of the column should be different from the expression used, you can rename it by using
AS <new name>.
Query
MATCH (a { name: "A" })
RETURN a.age AS SomethingTotallyDifferent

Returns the age property of a node, but renames the column.
Result
SomethingTotallyDifferent
55
1 row

Optional properties
If a property might or might not be there, you can still select it as usual. It will be treated as NULL if it is
missing.
Query
MATCH (n)
RETURN n.age

This example returns the age when the node has that property, or null if the property is not there.
Result
n.age
55
<null>
2 rows

Other expressions
Any expression can be used as a return item — literals, predicates, properties, functions, and everything
else.
Query
MATCH (a { name: "A" })
RETURN a.age > 30, "I'm a literal",(a)-->()

Returns a predicate, a literal, and a pattern expression.
Result
a.age > 30 | "I'm a literal" | (a)-->()
true       | "I'm a literal" | [[Node[0]{name:"A", happy:"Yes!", age:55}, :BLOCKS[1]{}, Node[1]{name:"B"}], [Node[0]{name:"A", happy:"Yes!", age:55}, :KNOWS[0]{}, Node[1]{name:"B"}]]
1 row

Unique results
DISTINCT retrieves only unique rows depending on the columns that have been selected to output.
Query
MATCH (a { name: "A" })-->(b)
RETURN DISTINCT b

The node named B is returned by the query, but only once.
Result
b
Node[1]{name:"B"}
1 row


10.2. Order by
ORDER BY is a sub-clause following RETURN or WITH, and it specifies that the output should be
sorted and how.

Note that you cannot sort on nodes or relationships, just on properties on these. ORDER BY relies on
comparisons to sort the output, see the section called “Ordering and Comparison of Values” [125].
In terms of scope of identifiers, ORDER BY follows special rules, depending on if the projecting RETURN
or WITH clause is either aggregating or DISTINCT. If it is an aggregating or DISTINCT projection, only the
identifiers available in the projection are available. If the projection does not alter the output cardinality
(which aggregation and DISTINCT do), identifiers available from before the projecting clause are also
available. When the projection clause shadows already existing identifiers, only the new identifiers are
available.
Lastly, it is not allowed to use aggregating expressions in the ORDER BY sub-clause if they are not also
listed in the projecting clause. This last rule is to make sure that ORDER BY does not change the results,
only the order of them.
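For instance, a minimal sketch of the aggregating case, where the ORDER BY expression must also appear in the projection:

MATCH (n)
RETURN n.age AS age, count(*) AS howMany
ORDER BY howMany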
Figure 10.2. Graph: three nodes, 'A' (age = 34, length = 170), 'B' (age = 34) and 'C' (age = 32,
length = 185), connected by KNOWS relationships.

Order nodes by property
ORDER BY is used to sort the output.
Query
MATCH (n)
RETURN n
ORDER BY n.name

The nodes are returned, sorted by their name.
Result
n
Node[0]{name:"A", age:34, length:170}
Node[1]{name:"B", age:34}
Node[2]{name:"C", age:32, length:185}
3 rows


Order nodes by multiple properties
You can order by multiple properties by stating each identifier in the ORDER BY clause. Cypher will sort
the result by the first identifier listed, and for equal values, go to the next property in the ORDER BY
clause, and so on.
Query
MATCH (n)
RETURN n
ORDER BY n.age, n.name

This returns the nodes, sorted first by their age, and then by their name.
Result
n
Node[2]{name:"C", age:32, length:185}
Node[0]{name:"A", age:34, length:170}
Node[1]{name:"B", age:34}
3 rows

Order nodes in descending order
By adding DESC[ENDING] after the identifier to sort on, the sort will be done in reverse order.
Query
MATCH (n)
RETURN n
ORDER BY n.name DESC

The example returns the nodes, sorted by their name in reverse order.
Result
n
Node[2]{name:"C", age:32, length:185}
Node[1]{name:"B", age:34}
Node[0]{name:"A", age:34, length:170}
3 rows

Ordering NULL
When sorting the result set, NULL will always come at the end of the result set for ascending sorting, and
first when doing descending sort.
Query
MATCH (n)
RETURN n.length, n
ORDER BY n.length

The nodes are returned sorted by the length property, with a node without that property last.
Result
n.length | n
170      | Node[0]{name:"A", age:34, length:170}
185      | Node[2]{name:"C", age:32, length:185}
<null>   | Node[1]{name:"B", age:34}
3 rows


10.3. Limit
LIMIT constrains the number of rows in the output.
LIMIT accepts any expression that evaluates to a positive integer — however the expression cannot refer
to nodes or relationships.

Figure 10.3. Graph: five nodes named 'A', 'B', 'C', 'D' and 'E', connected by KNOWS relationships.

Return first part
To return a subset of the result, starting from the top, use this syntax:
Query
MATCH (n)
RETURN n
ORDER BY n.name
LIMIT 3

The top three items are returned by the example query.
Result
n
Node[2]{name:"A"}
Node[3]{name:"B"}
Node[4]{name:"C"}
3 rows

Return first from expression
Limit accepts any expression that evaluates to a positive integer as long as it is not referring to any
external identifiers:
Parameters
{
"p" : 12
}

Query
MATCH (n)
RETURN n
ORDER BY n.name
LIMIT toInt(3 * rand())+ 1

Returns one to three top items.
Result
n
Node[2]{name:"A"}
Node[3]{name:"B"}
2 rows


10.4. Skip
SKIP defines from which row to start including the rows in the output.
By using SKIP, the result set will get trimmed from the top. Please note that no guarantees are made on
the order of the result unless the query specifies the ORDER BY clause. SKIP accepts any expression that
evaluates to a positive integer — however the expression cannot refer to nodes or relationships.
Figure 10.4. Graph: five nodes named 'A', 'B', 'C', 'D' and 'E', connected by KNOWS relationships.

Skip first three
To return a subset of the result, starting from the fourth result, use the following syntax:
Query
MATCH (n)
RETURN n
ORDER BY n.name
SKIP 3

The first three nodes are skipped, and only the last two are returned in the result.
Result
n
Node[0]{name:"D"}
Node[1]{name:"E"}
2 rows

Return middle two
To return a subset of the result, starting from somewhere in the middle, use this syntax:
Query
MATCH (n)
RETURN n
ORDER BY n.name
SKIP 1
LIMIT 2

Two nodes from the middle are returned.
Result
n
Node[3]{name:"B"}
Node[4]{name:"C"}
2 rows


Skip first from expression
Skip accepts any expression that evaluates to a positive integer as long as it is not referring to any
external identifiers:
Query
MATCH (n)
RETURN n
ORDER BY n.name
SKIP toInt(3*rand())+ 1

Between one and three nodes are skipped from the top, and the remaining nodes are returned in the result.
Result
n
Node[3]{name:"B"}
Node[4]{name:"C"}
Node[0]{name:"D"}
Node[1]{name:"E"}
4 rows


10.5. With
The WITH clause allows query parts to be chained together, piping the results from one to
be used as starting points or criteria in the next.
Using WITH, you can manipulate the output before it is passed on to the following query parts. The
manipulations can be of the shape and/or number of entries in the result set.
One common usage of WITH is to limit the number of entries that are then passed on to other MATCH
clauses. By combining ORDER BY and LIMIT, it’s possible to get the top X entries by some criteria, and then
bring in additional data from the graph.
Another use is to filter on aggregated values. WITH is used to introduce aggregates which can then be
used in predicates in WHERE. These aggregate expressions create new bindings in the results. WITH can
also, like RETURN, alias expressions that are introduced into the results using the aliases as binding name.
WITH is also used to separate reading from updating of the graph. Every part of a query must be either

read-only or write-only. When going from a writing part to a reading part, the switch must be done with
a WITH clause.
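A minimal sketch of switching from a writing part to a reading part (the label and properties are illustrative):

// Write part: create a node
CREATE (p:Person { name: 'Ann' })
// WITH switches the query back to a reading part
WITH p
MATCH (other:Person)
WHERE other.name <> p.name
RETURN p.name, other.name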
Figure 10.5. Graph: five people, 'David', 'Anders', 'Ceasar', 'Bossman' and 'Emil', connected by
KNOWS and BLOCKS relationships.

Filter on aggregate function results
Aggregated results have to pass through a WITH clause to be able to filter on.
Query
MATCH (david { name: "David" })--(otherPerson)-->()
WITH otherPerson, count(*) AS foaf
WHERE foaf > 1
RETURN otherPerson

Persons connected to David who have more than one outgoing relationship will be returned
by the query.
Result
otherPerson
Node[2]{name:"Anders"}
1 row

Sort results before using collect on them
You can sort your results before passing them to collect, thus sorting the resulting collection.
Query
MATCH (n)
WITH n
ORDER BY n.name DESC LIMIT 3
RETURN collect(n.name)

A list of the names of people in reverse order, limited to 3, in a collection.
Result
collect(n.name)
["Emil", "David", "Ceasar"]
1 row

Limit branching of your path search
You can match paths, limit to a certain number, and then match again using those paths as a base,
as well as any number of similar limited searches.
Query
MATCH (n { name: "Anders" })--(m)
WITH m
ORDER BY m.name DESC LIMIT 1
MATCH (m)--(o)
RETURN o.name

Starting at Anders, find all matching nodes, order by name descending and get the top result, then find
all the nodes connected to that top result, and return their names.
Result
o.name
"Bossman"
"Anders"
2 rows


10.6. Unwind
UNWIND expands a collection into a sequence of rows.

With UNWIND, you can transform any collection back into individual rows. These collections can be
parameters that were passed in, previously COLLECTed result or other collection expressions.
One common usage of UNWIND is to create distinct collections. Another is to create data from
parameter collections that are provided to the query.
UNWIND requires you to specify a new name for the inner values.

Unwind a collection
We want to transform the literal collection into rows named x and return them.
Query
UNWIND [1,2,3] AS x
RETURN x

Each value of the original collection is returned as an individual row.
Result
x
1
2
3
3 rows

Create a distinct collection
We want to transform a collection of duplicates into a set using DISTINCT.
Query
WITH [1,1,2,2] AS coll UNWIND coll AS x
WITH DISTINCT x
RETURN collect(x) AS set

Each value of the original collection is unwound and passed through DISTINCT to create a unique set.
Result
set
[1, 2]
1 row

Create nodes from a collection parameter
Create a number of nodes and relationships from a parameter-list without using FOREACH.
Parameters
{
  "events" : [ {
    "year" : 2014,
    "id" : 1
  }, {
    "year" : 2014,
    "id" : 2
  } ]
}

Query
UNWIND { events } AS event
MERGE (y:Year { year:event.year })
MERGE (y)<-[:IN]-(e:Event { id:event.id })
RETURN e.id AS x
ORDER BY x

Each value of the original collection is unwound and passed through MERGE to find or create the nodes
and relationships.
Result
x
1
2
2 rows
Nodes created: 3
Relationships created: 2
Properties set: 3
Labels added: 3


10.7. Union
The UNION clause is used to combine the result of multiple queries.
It combines the results of two or more queries into a single result set that includes all the rows that
belong to any of the queries in the union.
The number and the names of the columns must be identical in all queries combined by using UNION.
To keep all the result rows, use UNION ALL. Using just UNION will combine and remove duplicates from the
result set.
Figure 10.6. Graph: Actor nodes 'Anthony Hopkins', 'Helen Mirren' and 'Hitchcock', and a Movie node
with title 'Hitchcock'. 'Anthony Hopkins' KNOWS 'Helen Mirren', and both ACTS_IN the movie
'Hitchcock'.

Combine two queries
Combining the results from two queries is done using UNION ALL.
Query
MATCH (n:Actor)
RETURN n.name AS name
UNION ALL MATCH (n:Movie)
RETURN n.title AS name

The combined result is returned, including duplicates.
Result
name
"Anthony Hopkins"
"Helen Mirren"
"Hitchcock"
"Hitchcock"
4 rows

Combine two queries and remove duplicates
By not including ALL in the UNION, duplicates are removed from the combined result set.
Query
MATCH (n:Actor)

RETURN n.name AS name
UNION
MATCH (n:Movie)
RETURN n.title AS name

The combined result is returned, without duplicates.
Result
name
"Anthony Hopkins"
"Helen Mirren"
"Hitchcock"
3 rows


10.8. Using
USING is used to influence the decisions of the planner when building an execution plan for
a query.
Caution

Forcing planner behavior is an advanced feature, and should be used with caution by
experienced developers and/or database administrators only, as it may cause queries to
perform poorly.
When executing a query, Neo4j needs to decide where in the query graph to start matching. This is
done by looking at the MATCH clause and the WHERE conditions and using that information to find useful
indexes.
This index might not be the best choice though — sometimes multiple indexes could be used, and
Neo4j has picked the wrong one (from a performance point of view).
You can force Neo4j to use a specific starting point through the USING clause. This is called giving an
index hint.
If your query matches large parts of an index, it might be faster to scan the label and filter out nodes
that do not match. To do this, you can use USING SCAN. It will force Cypher to not use an index that could
have been used, and instead do a label scan.

Note

You cannot use index hints if your query has a START clause.

Query using an index hint
To query using an index hint, use USING INDEX.
Query
MATCH (n:Swede)
USING INDEX n:Swede(surname)
WHERE n.surname = 'Taylor'
RETURN n

Query Plan
+-----------------+----------------+------+---------+-------------+-----------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other           |
+-----------------+----------------+------+---------+-------------+-----------------+
| +ProduceResults |              1 |    1 |       0 | n           | n               |
| |               +----------------+------+---------+-------------+-----------------+
| +NodeIndexSeek  |              1 |    1 |       2 | n           | :Swede(surname) |
+-----------------+----------------+------+---------+-------------+-----------------+

Total database accesses: 2

Query using multiple index hints
To query using multiple index hints, use USING INDEX.
Query
MATCH (m:German)-->(n:Swede)
USING INDEX m:German(surname)
USING INDEX n:Swede(surname)
WHERE m.surname = 'Plantikow' AND n.surname = 'Taylor'
RETURN m

Query Plan
+-------------------+------+---------+----------------+----------------+
| Operator          | Rows | DB Hits | Identifiers    | Other          |
+-------------------+------+---------+----------------+----------------+
| +ColumnFilter     |    1 |       0 | m              | keep columns m |
| |                 +------+---------+----------------+----------------+
| +TraversalMatcher |    1 |      11 | anon[17], m, n | n, anon[17], m |
+-------------------+------+---------+----------------+----------------+

Total database accesses: 11

Hinting a label scan
If the best performance is to be had by scanning all nodes in a label and then filtering on that set, use
USING SCAN.
Query
MATCH (m:German)
USING SCAN m:German
WHERE m.surname = 'Plantikow'
RETURN m

Query Plan
+------------------+----------------+------+---------+-------------+-----------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers | Other                       |
+------------------+----------------+------+---------+-------------+-----------------------------+
| +ProduceResults  |              1 |    1 |       0 | m           | m                           |
| |                +----------------+------+---------+-------------+-----------------------------+
| +Filter          |              1 |    1 |       1 | m           | m.surname == { AUTOSTRING0} |
| |                +----------------+------+---------+-------------+-----------------------------+
| +NodeByLabelScan |              1 |    1 |       2 | m           | :German                     |
+------------------+----------------+------+---------+-------------+-----------------------------+

Total database accesses: 3


Chapter 11. Reading Clauses
The flow of data within a Cypher query is an unordered sequence of maps with key-value pairs — a set
of possible bindings between the identifiers in the query and values derived from the database. This set
is refined and augmented by subsequent parts of the query.


11.1. Match
The MATCH clause is used to search for the pattern described in it.

Introduction
The MATCH clause allows you to specify the patterns Neo4j will search for in the database. This is
the primary way of getting data into the current set of bindings. It is worth reading up more on the
specification of the patterns themselves in Section 9.6, “Patterns” [127].
MATCH is often coupled to a WHERE part which adds restrictions, or predicates, to the MATCH patterns, making them more specific. The predicates are part of the pattern description, not a filter applied after the matching is done. This means that WHERE should always be put together with the MATCH clause it belongs to.
MATCH can occur at the beginning of the query or later, possibly after a WITH. If it is the first clause, nothing will have been bound yet, and Neo4j will design a search to find the results matching the clause
and any associated predicates specified in any WHERE part. This could involve a scan of the database,
a search for nodes of a certain label, or a search of an index to find starting points for the pattern
matching. Nodes and relationships found by this search are available as bound pattern elements, and
can be used for pattern matching of sub-graphs. They can also be used in any further MATCH clauses,
where Neo4j will use the known elements, and from there find further unknown elements.
Cypher is declarative, and so usually the query itself does not specify the algorithm to use to perform
the search. Neo4j will automatically work out the best approach to finding start nodes and matching
patterns. Predicates in WHERE parts can be evaluated before pattern matching, during pattern matching,
or after finding matches. However, there are cases where you can influence the decisions taken by
the query compiler. Read more about indexes in Section 14.1, “Indexes” [244], and more about specifying index hints to force Neo4j to use a specific index in Section 10.8, “Using” [152].

Tip

To understand more about the patterns used in the MATCH clause, read Section 9.6,
“Patterns” [127].

The following graph is used for the examples below:
Figure 11.1. Graph
[Graph: Person nodes 'Oliver Stone', 'Charlie Sheen', 'Martin Sheen', 'Michael Douglas' and 'Rob Reiner', plus a second 'Rob Reiner'/'Charlie Sheen' pair connected by a TYPE THAT HAS SPACE IN IT relationship. 'Charlie Sheen' has a FATHER relationship to 'Martin Sheen'. 'Charlie Sheen', 'Martin Sheen' and 'Michael Douglas' ACTED_IN the Movie 'Wall Street' (name = 'WallStreet'), which 'Oliver Stone' DIRECTED; 'Martin Sheen' and 'Michael Douglas' ACTED_IN the Movie 'The American President' (name = 'TheAmericanPresident'), which 'Rob Reiner' DIRECTED.]

Basic node finding
Get all nodes
By just specifying a pattern with a single node and no labels, all nodes in the graph will be returned.
Query
MATCH (n)
RETURN n

Returns all the nodes in the database.
Result
n
Node[0]{name:"Oliver Stone"}
Node[1]{name:"Charlie Sheen"}
Node[2]{name:"Martin Sheen"}
Node[3]{title:"The American President", name:"TheAmericanPresident"}
Node[4]{name:"WallStreet", title:"Wall Street"}
Node[5]{name:"Rob Reiner"}
Node[6]{name:"Michael Douglas"}
Node[7]{name:"Rob Reiner"}
Node[8]{name:"Charlie Sheen"}
9 rows

Get all nodes with a label
Getting all nodes with a label on them is done with a single node pattern where the node has a label on
it.
Query
MATCH (movie:Movie)
RETURN movie

Returns all the movies in the database.
Result
movie
Node[3]{title:"The American President", name:"TheAmericanPresident"}
Node[4]{name:"WallStreet", title:"Wall Street"}
2 rows

Related nodes
The symbol -- means related to, without regard to type or direction of the relationship.
Query
MATCH (director { name:'Oliver Stone' })--(movie)
RETURN movie.title

Returns all the movies directed by Oliver Stone.
Result
movie.title
"Wall Street"
1 row

Match with labels
To constrain your pattern with labels on nodes, you add it to your pattern nodes, using the label syntax.
Query
MATCH (charlie:Person { name:'Charlie Sheen' })--(movie:Movie)
RETURN movie

Return any nodes connected with the Person Charlie that are labeled Movie.
Result
movie
Node[4]{name:"WallStreet", title:"Wall Street"}
1 row

Relationship basics
Outgoing relationships
When the direction of a relationship is interesting, it is shown by using --> or <--, like this:
Query
MATCH (martin { name:'Martin Sheen' })-->(movie)
RETURN movie.title

Returns nodes connected to Martin by outgoing relationships.
Result
movie.title
"The American President"
"Wall Street"
2 rows

Directed relationships and identifier
If an identifier is needed, either for filtering on properties of the relationship, or to return the
relationship, this is how you introduce the identifier.
Query
MATCH (martin { name:'Martin Sheen' })-[r]->(movie)
RETURN r

Returns all outgoing relationships from Martin.
Result
r
:ACTED_IN[3]{}
:ACTED_IN[1]{}
2 rows

Match by relationship type
When you know the relationship type you want to match on, you can specify it by using a colon
together with the relationship type.
Query
MATCH (wallstreet { title:'Wall Street' })<-[:ACTED_IN]-(actor)
RETURN actor

Returns nodes that ACTED_IN Wall Street.
Result
actor
Node[6]{name:"Michael Douglas"}
Node[2]{name:"Martin Sheen"}
Node[1]{name:"Charlie Sheen"}
3 rows

Match by multiple relationship types
To match on one of multiple types, you can specify this by chaining them together with the pipe symbol |.
Query
MATCH (wallstreet { title:'Wall Street' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person

Returns nodes with an ACTED_IN or DIRECTED relationship to Wall Street.
Result
person
Node[0]{name:"Oliver Stone"}
Node[6]{name:"Michael Douglas"}
Node[2]{name:"Martin Sheen"}
Node[1]{name:"Charlie Sheen"}
4 rows

Match by relationship type and use an identifier
If you both want to introduce an identifier to hold the relationship, and specify the relationship type
you want, just add them both, like this.
Query
MATCH (wallstreet { title:'Wall Street' })<-[r:ACTED_IN]-(actor)
RETURN r

Returns nodes that ACTED_IN Wall Street.
Result
r
:ACTED_IN[2]{}
:ACTED_IN[1]{}
:ACTED_IN[0]{}
3 rows

Relationships in depth
Note

Inside a single pattern, relationships will only be matched once. You can read more about
this in Section 8.4, “Uniqueness” [111].

Relationship types with uncommon characters
Sometimes your database will have types with non-letter characters, or with spaces in them. Use ` (backtick) to quote these.
Query
MATCH (n { name:'Rob Reiner' })-[r:`TYPE THAT HAS SPACE IN IT`]->()
RETURN r

Returns a relationship of a type with spaces in it.
Result
r
:TYPE THAT HAS SPACE IN IT[8]{}
1 row

Multiple relationships
Relationships can be expressed by using multiple statements in the form of ()--(), or they can be
strung together, like this:
Query
MATCH (charlie { name:'Charlie Sheen' })-[:ACTED_IN]->(movie)<-[:DIRECTED]-(director)
RETURN charlie,movie,director

Returns the three nodes in the path.
Result
charlie                        movie                                            director
Node[1]{name:"Charlie Sheen"}  Node[4]{name:"WallStreet", title:"Wall Street"}  Node[0]{name:"Oliver Stone"}
1 row

Variable length relationships
Nodes that are a variable number of relationship→node hops away can be found using the following
syntax: -[:TYPE*minHops..maxHops]->. minHops and maxHops are optional and default to 1 and infinity
respectively. When no bounds are given the dots may be omitted.
Query
MATCH (martin { name:"Martin Sheen" })-[:ACTED_IN*1..2]-(x)
RETURN x

Returns nodes that are 1 or 2 relationships away from Martin.
Result
x
Node[4]{name:"WallStreet", title:"Wall Street"}
Node[1]{name:"Charlie Sheen"}
Node[6]{name:"Michael Douglas"}
Node[3]{title:"The American President", name:"TheAmericanPresident"}
Node[6]{name:"Michael Douglas"}
5 rows
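For reference, the bounded variants of the variable length syntax look like this (KNOWS is a placeholder relationship type; this is a sketch of the syntax forms, not a query against the graph above):

-[:KNOWS*2]->      exactly two hops
-[:KNOWS*3..5]->   three to five hops
-[:KNOWS*..5]->    one to five hops (minHops defaults to 1)
-[:KNOWS*]->       one hop to infinity (the dots omitted)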

Relationship identifier in variable length relationships
When the connection between two nodes is of variable length, a relationship identifier becomes a collection of relationships.
Query
MATCH (actor { name:'Charlie Sheen' })-[r:ACTED_IN*2]-(co_actor)
RETURN r

The query returns a collection of relationships.
Result
r
[:ACTED_IN[0]{}, :ACTED_IN[1]{}]
[:ACTED_IN[0]{}, :ACTED_IN[2]{}]
2 rows

Match with properties on a variable length path
A variable length relationship with properties defined on it means that all relationships in the path must have the property set to the given value. In this query, there are two paths between Charlie Sheen and his dad Martin Sheen. One of them includes a “blocked” relationship and the other doesn’t. In this case we first alter the original graph by using the following query to add “blocked” and “unblocked” relationships:
MATCH (charlie:Person { name:'Charlie Sheen' }),(martin:Person { name:'Martin Sheen' })
CREATE (charlie)-[:X { blocked:false }]->(:Unblocked)<-[:X { blocked:false }]-(martin)
CREATE (charlie)-[:X { blocked:true }]->(:Blocked)<-[:X { blocked:false }]-(martin);

This means that we are starting out with the following graph:
[Graph: the example graph above, extended with Blocked and Unblocked intermediate nodes between Charlie Sheen and Martin Sheen, connected by X relationships that carry a blocked property set to true or false.]
Query
MATCH p =(charlie:Person)-[* { blocked:false }]-(martin:Person)
WHERE charlie.name = 'Charlie Sheen' AND martin.name = 'Martin Sheen'
RETURN p

Returns the paths between Charlie and Martin Sheen where all relationships have the blocked property
set to FALSE.
Result
p
[Node[1]{name:"Charlie Sheen"}, :X[9]{blocked:false}, Node[9]{}, :X[10]{blocked:false}, Node[2]
{name:"Martin Sheen"}]
1 row

Zero length paths
Using variable length paths that have the lower bound zero means that two identifiers can point to
the same node. If the distance between two nodes is zero, they are by definition the same node.
Note that when matching zero length paths the result may contain a match even when matching on a
relationship type not in use.
Query
MATCH (wallstreet:Movie { title:'Wall Street' })-[*0..1]-(x)
RETURN x

Returns all nodes that are zero or one relationships away from Wall Street.
Result
x
Node[4]{name:"WallStreet", title:"Wall Street"}
Node[1]{name:"Charlie Sheen"}
Node[2]{name:"Martin Sheen"}
Node[6]{name:"Michael Douglas"}
Node[0]{name:"Oliver Stone"}
5 rows
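As a sketch of the note above about unused relationship types (NONEXISTENT is a hypothetical type that does not occur in the graph):

MATCH (wallstreet:Movie { title:'Wall Street' })-[:NONEXISTENT*0..1]-(x)
RETURN x

The zero-length part of the range matches the Wall Street node itself, so the node is returned even though no NONEXISTENT relationships exist.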

Named path
If you want to return or filter on a path in your pattern graph, you can introduce a named path.
Query
MATCH p =(michael { name:'Michael Douglas' })-->()
RETURN p

Returns the two paths starting from Michael.
Result
p
[Node[6]{name:"Michael Douglas"}, :ACTED_IN[4]{}, Node[3]{title:"The American President",
name:"TheAmericanPresident"}]
[Node[6]{name:"Michael Douglas"}, :ACTED_IN[2]{}, Node[4]{name:"WallStreet", title:"Wall Street"}]
2 rows

Matching on a bound relationship
When your pattern contains a bound relationship, and that relationship pattern doesn’t specify
direction, Cypher will try to match the relationship in both directions.
Query
MATCH (a)-[r]-(b)
WHERE id(r)= 0
RETURN a,b

This returns the two connected nodes, once as the start node, and once as the end node.
Result
a

b

Node[1]{name:"Charlie Sheen"}

Node[4]{name:"WallStreet", title:"Wall Street"}

Node[4]{name:"WallStreet", title:"Wall Street"}

Node[1]{name:"Charlie Sheen"}

2 rows

Shortest path
Single shortest path
Finding a single shortest path between two nodes is as easy as using the shortestPath function. It’s done
like this:
Query
MATCH (martin:Person { name:"Martin Sheen" }),(oliver:Person { name:"Oliver Stone" }),
p = shortestPath((martin)-[*..15]-(oliver))
RETURN p

This means: find a single shortest path between two nodes, as long as the path is max 15 relationships
long. Inside of the parentheses you define a single link of a path — the starting node, the connecting
relationship and the end node. Characteristics describing the relationship like relationship type, max
hops and direction are all used when finding the shortest path. You can also mark the path as optional.
Result
p
[Node[2]{name:"Martin Sheen"}, :ACTED_IN[1]{}, Node[4]{name:"WallStreet", title:"Wall
Street"}, :DIRECTED[5]{}, Node[0]{name:"Oliver Stone"}]
1 row

All shortest paths
Finds all the shortest paths between two nodes.
Query
MATCH (martin:Person { name:"Martin Sheen" }),(michael:Person { name:"Michael Douglas" }),
p = allShortestPaths((martin)-[*]-(michael))
RETURN p

Finds the two shortest paths between Martin and Michael.
Result
p
[Node[2]{name:"Martin Sheen"}, :ACTED_IN[3]{}, Node[3]{title:"The American President",
name:"TheAmericanPresident"}, :ACTED_IN[4]{}, Node[6]{name:"Michael Douglas"}]
[Node[2]{name:"Martin Sheen"}, :ACTED_IN[1]{}, Node[4]{name:"WallStreet", title:"Wall
Street"}, :ACTED_IN[2]{}, Node[6]{name:"Michael Douglas"}]
2 rows

Get node or relationship by id
Node by id
Search for nodes by id can be done with the id function in a predicate.

Note

Neo4j reuses its internal ids when nodes and relationships are deleted. This means that applications that use, and rely on, internal Neo4j ids are brittle and at risk of making mistakes. Rather, use application-generated ids.
Query
MATCH (n)
WHERE id(n)= 1
RETURN n

The corresponding node is returned.
Result
n
Node[1]{name:"Charlie Sheen"}
1 row
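As a sketch of the application-generated id approach (the uid property is hypothetical and not part of the example graph), store your own identifier as a regular property and match on that instead:

MATCH (n { uid: '1001' })
RETURN n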

Relationship by id
Search for relationships by id can be done with the id function in a predicate.
This is not recommended practice. See the section called “Node by id” [162] for more information on
the use of Neo4j ids.
Query

MATCH ()-[r]->()
WHERE id(r)= 0
RETURN r

The relationship with id 0 is returned.
Result
r
:ACTED_IN[0]{}
1 row

Multiple nodes by id
Multiple nodes are selected by specifying them in an IN clause.
Query
MATCH (n)
WHERE id(n) IN [1, 2, 0]
RETURN n

This returns the nodes listed in the IN expression.
Result
n
Node[0]{name:"Oliver Stone"}
Node[1]{name:"Charlie Sheen"}
Node[2]{name:"Martin Sheen"}
3 rows


11.2. Optional Match
The OPTIONAL MATCH clause is used to search for the pattern described in it, while using NULLs
for missing parts of the pattern.

Introduction
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does. The difference is that if no matches are found, OPTIONAL MATCH will use NULLs for missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher equivalent of the outer join in SQL.

Either the whole pattern is matched, or nothing is matched. Remember that WHERE is part of the pattern
description, and the predicates will be considered while looking for matches, not after. This matters
especially in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to put WHERE together with
the MATCH it belongs to.

Tip

To understand the patterns used in the OPTIONAL MATCH clause, read Section 9.6,
“Patterns” [127].

The following graph is used for the examples below:
Figure 11.2. Graph
[Graph: Person nodes 'Charlie Sheen', 'Oliver Stone', 'Michael Douglas', 'Martin Sheen' and 'Rob Reiner'. 'Charlie Sheen' has a FATHER relationship to 'Martin Sheen'. 'Charlie Sheen', 'Martin Sheen' and 'Michael Douglas' ACTED_IN the Movie 'Wall Street' (name = 'WallStreet'), which 'Oliver Stone' DIRECTED; 'Martin Sheen' and 'Michael Douglas' ACTED_IN the Movie 'The American President' (name = 'TheAmericanPresident'), which 'Rob Reiner' DIRECTED.]
Relationship
If a relationship is optional, use the OPTIONAL MATCH clause. This is similar to how a SQL outer join works. If the relationship is there, it is returned. If it's not, NULL is returned in its place.
Query
MATCH (a:Movie { title: 'Wall Street' })
OPTIONAL MATCH (a)-->(x)
RETURN x

Returns NULL, since the node has no outgoing relationships.
Result
x
<null>
1 row

Properties on optional elements
Returning a property from an optional element that is NULL will also return NULL.
Query
MATCH (a:Movie { title: 'Wall Street' })
OPTIONAL MATCH (a)-->(x)
RETURN x, x.name

Returns the element x (NULL in this query), and NULL as its name.
Result
x        x.name
<null>   <null>
1 row

Optional typed and named relationship
Just as with a normal relationship, you can decide which identifier it goes into, and what relationship
type you need.
Query
MATCH (a:Movie { title: 'Wall Street' })
OPTIONAL MATCH (a)-[r:ACTS_IN]->()
RETURN r

This returns a node, and NULL, since the node has no outgoing ACTS_IN relationships.
Result
r

1 row


11.3. Where
WHERE adds constraints to the patterns in a MATCH or OPTIONAL MATCH clause or filters the results of a WITH clause.

WHERE is not a clause in its own right — rather, it's part of MATCH, OPTIONAL MATCH, START and WITH.
In the case of WITH and START, WHERE simply filters the results.
For MATCH and OPTIONAL MATCH on the other hand, WHERE adds constraints to the patterns described. It should not be seen as a filter after the matching is finished.

Important

In the case of multiple MATCH / OPTIONAL MATCH clauses, the predicate in WHERE is always
a part of the patterns in the directly preceding MATCH / OPTIONAL MATCH. Both results and
performance may be impacted if the WHERE is put inside the wrong MATCH clause.
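As a sketch of why placement matters (using the graph below, with an illustrative age threshold): in the first query the predicate is part of the optional pattern, so when nothing satisfies it the row survives with b as NULL; in the second, the WHERE filters rows after the match, removing them entirely.

MATCH (a { name:'Andres' })
OPTIONAL MATCH (a)-[:KNOWS]->(b)
WHERE b.age > 40
RETURN a.name, b.name

MATCH (a { name:'Andres' })
OPTIONAL MATCH (a)-[:KNOWS]->(b)
WITH a, b
WHERE b.age > 40
RETURN a.name, b.name

The first query would return one row with b.name as NULL; the second would return no rows.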
Figure 11.3. Graph
[Graph: a Swedish node 'Andres' (age = 36, belt = 'white') with KNOWS relationships to 'Tobias' (age = 25, address = 'Sweden/Malmo') and 'Peter' (age = 34, email = 'peter_n@example.com').]
Basic usage
Boolean operations
You can use the expected boolean operators AND and OR, and also the boolean function NOT. See
Section 9.8, “Working with NULL” [134] for more information on how this works with NULL.
Query
MATCH (n)
WHERE n.name = 'Peter' XOR (n.age < 30 AND n.name = "Tobias") OR NOT (n.name = "Tobias" OR
n.name="Peter")
RETURN n

Result
n
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
Node[2]{name:"Andres", age:36, belt:"white"}
3 rows

Filter on node label
To filter nodes by label, write a label predicate after the WHERE keyword using WHERE n:foo.
Query
MATCH (n)
WHERE n:Swedish
RETURN n

The "Andres" node will be returned.
Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

Filter on node property
To filter on a property, write your clause after the WHERE keyword. Filtering on relationship properties
works just the same way.
Query
MATCH (n)
WHERE n.age < 30
RETURN n

"Tobias" is returned because he is younger than 30.
Result
n
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
1 row

Filter on dynamic node property
To filter on a property using a dynamically computed name, use square bracket syntax.
Parameters
{
"prop" : "AGE"
}

Query
MATCH (n)
WHERE n[toLower({ prop })] < 30
RETURN n

"Tobias" is returned because he is younger than 30.
Result
n
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
1 row

Property exists
Use the EXISTS() function to only include nodes or relationships in which a property exists.
Query
MATCH (n)
WHERE exists(n.belt)
RETURN n

"Andres" will be returned because he is the only one with a belt property.

Important

The HAS() function has been superseded by EXISTS() and will be removed in a future release.

Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

String matching
The start and end of strings can be matched using STARTS WITH and ENDS WITH. To match regardless of
location in a string, use CONTAINS. The matching is case-sensitive.

Match the start of a string
The STARTS WITH operator is used to perform case-sensitive matching on the start of strings.
Query
MATCH (n)
WHERE n.name STARTS WITH 'Pet'
RETURN n

"Peter" will be returned because his name starts with Pet.
Result
n
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row

Match the end of a string
The ENDS WITH operator is used to perform case-sensitive matching on the end of strings.
Query
MATCH (n)
WHERE n.name ENDS WITH 'ter'
RETURN n

"Peter" will be returned because his name ends with ter.
Result
n
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row

Match anywhere in a string
The CONTAINS operator is used to perform case-sensitive matching regardless of location in strings.
Query
MATCH (n)
WHERE n.name CONTAINS 'ete'
RETURN n

"Peter" will be returned because his name contains ete.
Result
n
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row

String matching negation
Use the NOT keyword to exclude all matches on a given string from your result:
Query
MATCH (n)
WHERE NOT n.name ENDS WITH 's'
RETURN n

"Peter" will be returned because his name does not end with s.
Result
n
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row
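If case-insensitive string matching is needed, one approach (a minimal sketch, reusing the toLower function seen earlier) is to normalize the case before comparing:

MATCH (n)
WHERE toLower(n.name) CONTAINS 'pet'
RETURN n

Against the graph above this would return the "Peter" node.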

Regular expressions
Cypher supports filtering using regular expressions. The regular expression syntax is inherited from the Java regular expressions (https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html). This includes support for flags that change how strings are matched, including case-insensitive (?i), multiline (?m) and dotall (?s). Flags are given at the start of the regular expression, for example MATCH (n) WHERE n.name =~ '(?i)Lon.*' RETURN n will return nodes with name London or with name LonDoN.

Regular expressions
You can match on regular expressions by using =~ "regexp", like this:
Query
MATCH (n)
WHERE n.name =~ 'Tob.*'
RETURN n

"Tobias" is returned because his name starts with Tob.
Result
n
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
1 row

Escaping in regular expressions
If you need a forward slash inside of your regular expression, escape it. Remember that the backslash needs to be escaped in string literals.
Query
MATCH (n)
WHERE n.address =~ 'Sweden\\/Malmo'
RETURN n

"Tobias" is returned because his address is in Sweden/Malmo.
Result
n
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
1 row

Case insensitive regular expressions
By pre-pending a regular expression with (?i), the whole expression becomes case insensitive.
Query
MATCH (n)
WHERE n.name =~ '(?i)ANDR.*'
RETURN n

"Andres" is returned because his name starts with ANDR regardless of case.
Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

Using path patterns in WHERE
Filter on patterns
Patterns are expressions in Cypher, expressions that return a collection of paths. Collection expressions
are also predicates — an empty collection represents false, and a non-empty represents true.
So, patterns are not only expressions, they are also predicates. The only limitation to your pattern is
that you must be able to express it in a single path. You can not use commas between multiple paths
like you do in MATCH. You can achieve the same effect by combining multiple patterns with AND.
Note that you can not introduce new identifiers here. Although it might look very similar to the MATCH
patterns, the WHERE clause is all about eliminating matched subgraphs. MATCH (a)-[*]->(b) is very
different from WHERE (a)-[*]->(b); the first will produce a subgraph for every path it can find between
a and b, and the latter will eliminate any matched subgraphs where a and b do not have a directed
relationship chain between them.
Query
MATCH (tobias { name: 'Tobias' }),(others)
WHERE others.name IN ['Andres', 'Peter'] AND (tobias)<--(others)
RETURN others

Nodes that have an outgoing relationship to the "Tobias" node are returned.
Result
others
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

Filter on patterns using NOT
The NOT function can be used to exclude a pattern.
Query
MATCH (persons),(peter { name: 'Peter' })
WHERE NOT (persons)-->(peter)
RETURN persons

Nodes that do not have an outgoing relationship to the "Peter" node are returned.
Result
persons
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
2 rows

Filter on patterns with properties
You can also add properties to your patterns:
Query
MATCH (n)
WHERE (n)-[:KNOWS]-({ name:'Tobias' })
RETURN n

Finds all nodes that have a KNOWS relationship to a node with the name "Tobias".
Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

Filtering on relationship type
You can put the exact relationship type in the MATCH pattern, but sometimes you want to be able to do
more advanced filtering on the type. You can use the special property TYPE to compare the type with
something else. In this example, the query does a regular expression comparison with the name of the
relationship type.
Query
MATCH (n)-[r]->()
WHERE n.name='Andres' AND type(r)=~ 'K.*'
RETURN r

This returns relationships that have a type whose name starts with K.
Result
r
:KNOWS[1]{}
:KNOWS[0]{}
2 rows

Collections
IN operator
To check if an element exists in a collection, you can use the IN operator.
Query
MATCH (a)
WHERE a.name IN ["Peter", "Tobias"]
RETURN a

This query shows how to check if a property exists in a literal collection.
Result
a
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
2 rows

Missing properties and values
Default to false if property is missing
As missing properties evaluate to NULL, the comparison in the example will evaluate to FALSE for nodes without the belt property.
Query
MATCH (n)
WHERE n.belt = 'white'
RETURN n

Only nodes with white belts are returned.
Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
1 row

Default to true if property is missing
If you want to compare a property on a graph element, but only if it exists, you can compare the
property against both the value you are looking for and NULL, like:
Query
MATCH (n)
WHERE n.belt = 'white' OR n.belt IS NULL
RETURN n
ORDER BY n.name

This returns all nodes, even those without the belt property.
Result
n
Node[2]{name:"Andres", age:36, belt:"white"}
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
3 rows

Filter on NULL
Sometimes you might want to test if a value or an identifier is NULL. This is done just like SQL does it, with IS NULL. Also like SQL, the negative is IS NOT NULL, although NOT (x IS NULL) also works.
Query
MATCH (person)
WHERE person.name = 'Peter' AND person.belt IS NULL
RETURN person

Nodes that have name Peter but no belt property are returned.
Result
person
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row
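Conversely, IS NOT NULL keeps only entities where the property is present. A minimal sketch against the same graph:

MATCH (person)
WHERE person.belt IS NOT NULL
RETURN person

This would return only the "Andres" node.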

Using ranges
Simple range
To check for an element being inside a specific range, use the inequality operators <, <=, >=, >.
Query
MATCH (a)
WHERE a.name >= 'Peter'
RETURN a

Nodes having a name property lexicographically greater than or equal to Peter are returned.
Result
a
Node[0]{address:"Sweden/Malmo", name:"Tobias", age:25}
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
2 rows

Composite range
Several inequalities can be used to construct a range.
Query
MATCH (a)
WHERE a.name > 'Andres' AND a.name < 'Tobias'
RETURN a

Nodes having a name property lexicographically between Andres and Tobias are returned.
Result
a
Node[1]{email:"peter_n@example. com", name:"Peter", age:34}
1 row


11.4. Start
Find starting points through legacy indexes.
Important

The START clause should only be used when accessing legacy indexes (see Chapter 35, Legacy
Indexing [621]). In all other cases, use MATCH instead (see Section 11.1, “Match” [155]).
In Cypher, every query describes a pattern, and in that pattern one can have multiple starting points.
A starting point is a relationship or a node where a pattern is anchored. Using START you can only
introduce starting points by legacy index seeks. Note that trying to use a legacy index that doesn’t exist
will generate an error.
This is the graph the examples are using:
Figure 11.4. Graph
[Graph: Node[0] (name = 'A') with KNOWS relationships to Node[1] (name = 'B') and Node[2] (name = 'C').]
Get node or relationship from index
Node by index seek
When the starting point can be found by using index seeks, it can be done like this: node:index-name(key
= "value"). In this example, there exists a node index named nodes.
Query
START n=node:nodes(name = "A")
RETURN n

The query returns the node indexed with the name "A".
Result
n
Node[0]{name:"A"}
1 row

Relationship by index seek
When the starting point can be found by using index seeks, it can be done like this: relationship:index-name(key = "value").
Query
START r=relationship:rels(name = "Andrés")
RETURN r

The relationship indexed with the name property set to "Andrés" is returned by the query.
Result
r
:KNOWS[0]{name:"Andrés"
1 row

Node by index query
When the starting point can be found by more complex Lucene queries, this is the syntax to use: node:index-name("query"). This allows you to write more advanced index queries.
Query
START n=node:nodes("name:A")
RETURN n

The node indexed with name "A" is returned by the query.
Result
n
Node[0]{name:"A"}
1 row


11.5. Aggregation
Introduction
To calculate aggregated data, Cypher offers aggregation, much like SQL’s GROUP BY.
Aggregate functions take multiple input values and calculate an aggregated value from them. Examples
are avg that calculates the average of multiple numeric values, or min that finds the smallest numeric
value in a set of values.
Aggregation can be done over all the matching subgraphs, or it can be further divided by introducing
key values. These are non-aggregate expressions, that are used to group the values going into the
aggregate functions.
So, if the return statement looks something like this:
RETURN n, count(*)

We have two return expressions: n, and count(*). The first, n, is not an aggregate function, and so it will be the grouping key. The latter, count(*), is an aggregate expression. So the matching subgraphs will be divided into different buckets, depending on the grouping key. The aggregate function will then run on these buckets, calculating the aggregate values.
If you want to use aggregations to sort your result set, the aggregation must be included in the RETURN
to be used in your ORDER BY.
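A minimal sketch of this (the friends alias is illustrative):

MATCH (me:Person)-->(friend:Person)
RETURN me.name, count(friend) AS friends
ORDER BY friends DESC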
The last piece of the puzzle is the DISTINCT keyword. It is used to make all values unique before running
them through an aggregate function.
An example might be helpful. In this case, we are running the query against the following data:
[Graph: Person 'A' (property = 13) KNOWS Persons 'B' (property = 33, eyes = 'blue') and 'C' (property = 44, eyes = 'blue'); 'B' and 'C' each KNOW a Person 'D' (eyes = 'brown'); there is also a second Person 'D'.]
Query
MATCH (me:Person)-->(friend:Person)-->(friend_of_friend:Person)
WHERE me.name = 'A'
RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)

In this example we are trying to find all our friends of friends, and count them. The first aggregate
function, count(DISTINCT friend_of_friend), will only see a friend_of_friend once — DISTINCT removes
the duplicates. The latter aggregate function, count(friend_of_friend), might very well see the same
friend_of_friend multiple times. In this case, both B and C know D and thus D will get counted twice,
when not using DISTINCT.
Result
count(distinct friend_of_friend)   count(friend_of_friend)
1                                  2
1 row

The following examples are assuming the example graph structure below.
Figure 11.5. Graph
[Graph: Person 'A' (property = 13) with KNOWS relationships to Persons 'B' (property = 33, eyes = 'blue'), 'C' (property = 44, eyes = 'blue') and 'D' (eyes = 'brown').]
COUNT
COUNT is used to count the number of rows. COUNT can be used in two forms — COUNT(*), which just counts the number of matching rows, and COUNT(<expression>), which counts the number of non-NULL values in <expression>.

Count nodes
To count the number of nodes, for example the number of nodes connected to one node, you can use
count(*).
Query
MATCH (n { name: 'A' })-->(x)
RETURN n, count(*)

This returns the start node and the count of related nodes.
Result
n                                count(*)
Node[1]{name:"A", property:13}   3
1 row

Group Count Relationship Types
To count the groups of relationship types, return the types and count them with count(*).
Query
MATCH (n { name: 'A' })-[r]->()
RETURN type(r), count(*)

The relationship types and their group counts are returned by the query.
Result
type(r)   count(*)
"KNOWS"   3
1 row


Count entities
Instead of counting the number of results with count(*), it might be more expressive to include the
name of the identifier you care about.
Query
MATCH (n { name: 'A' })-->(x)
RETURN count(x)

The example query returns the number of connected nodes from the start node.
Result
count(x)
3
1 row

Count non-null values
You can count the non-NULL values by using count(<expression>).
Query
MATCH (n:Person)
RETURN count(n.property)

The count of related nodes with the property property set is returned by the query.
Result
count(n.property)
3
1 row

Statistics
sum
The sum aggregation function simply sums all the numeric values it encounters. NULLs are silently
dropped.
Query
MATCH (n:Person)
RETURN sum(n.property)

This returns the sum of all the values in the property property.
Result
sum(n.property)
90
1 row

avg
avg calculates the average of a numeric column.
Query
MATCH (n:Person)
RETURN avg(n.property)

The average of all the values in the property property is returned by the example query.
Result
avg(n.property)
30.0
1 row

percentileDisc
percentileDisc calculates the percentile of a given value over a group, with a percentile from 0.0 to 1.0.

It uses a rounding method, returning the nearest value to the percentile. For interpolated values, see
percentileCont.
Query
MATCH (n:Person)
RETURN percentileDisc(n.property, 0.5)

The 50th percentile of the values in the property property is returned by the example query. In this case,
0.5 is the median, or 50th percentile.
Result
percentileDisc(n.property, 0.5)
33
1 row

percentileCont
percentileCont calculates the percentile of a given value over a group, with a percentile from 0.0 to 1.0.

It uses a linear interpolation method, calculating a weighted average between two values, if the desired
percentile lies between them. For nearest values using a rounding method, see percentileDisc.
Query
MATCH (n:Person)
RETURN percentileCont(n.property, 0.4)

The 40th percentile of the values in the property property is returned by the example query, calculated
with a weighted average.
Result
percentileCont(n.property, 0.4)
29.0
1 row

stdev
stdev calculates the standard deviation for a given value over a group. It uses a standard two-pass method, with N - 1 as the denominator, and should be used when taking a sample of the population for an unbiased estimate. When the standard deviation of the entire population is being calculated, stdevp should be used.
Query
MATCH (n)
WHERE n.name IN ['A', 'B', 'C']
RETURN stdev(n.property)

The standard deviation of the values in the property property is returned by the example query.
Result
stdev(n.property)
15.716233645501712
1 row


stdevp
stdevp calculates the standard deviation for a given value over a group. It uses a standard two-pass method, with N as the denominator, and should be used when calculating the standard deviation for an entire population. When the standard deviation of only a sample of the population is being calculated, stdev should be used.
Query
MATCH (n)
WHERE n.name IN ['A', 'B', 'C']
RETURN stdevp(n.property)

The population standard deviation of the values in the property property is returned by the example
query.
Result
stdevp(n.property)
12.832251036613439
1 row

max
max finds the largest value in a numeric column.
Query
MATCH (n:Person)
RETURN max(n.property)

The largest of all the values in the property property is returned.
Result
max(n.property)
44
1 row

min
min takes a numeric property as input, and returns the smallest value in that column.
Query
MATCH (n:Person)
RETURN min(n.property)

This returns the smallest of all the values in the property property.
Result
min(n.property)
13
1 row

collect
collect collects all the values into a list. It will ignore NULLs.
Query
MATCH (n:Person)
RETURN collect(n.property)

Returns a single row, with all the values collected.
Result
collect(n.property)
[13, 33, 44]
1 row

DISTINCT
All aggregation functions also take the DISTINCT modifier, which removes duplicates from the values. So,
to count the number of unique eye colors from nodes related to a, this query can be used:
Query
MATCH (a:Person { name: 'A' })-->(b)
RETURN count(DISTINCT b.eyes)

Returns the number of eye colors.
Result
count(distinct b.eyes)
2
1 row


11.6. Load CSV
LOAD CSV is used to import data from CSV files.

• The URL of the CSV file is specified by using FROM followed by an arbitrary expression evaluating to the
URL in question.
• It is required to specify an identifier for the CSV data using AS.
• LOAD CSV supports resources compressed with gzip, Deflate, as well as ZIP archives.
• CSV files can be stored on the database server and are then accessible using a file:/// URL.
Alternatively, LOAD CSV also supports accessing CSV files via HTTPS, HTTP, and FTP.
• LOAD CSV will follow HTTP redirects but for security reasons it will not follow redirects that change the protocol, for example if the redirect is going from HTTPS to HTTP.

Configuration settings for file URLs
allow_file_urls [465]
    This setting determines if Cypher will allow the use of file:/// URLs when loading data using LOAD CSV. Such URLs identify files on the filesystem of the database server. Default is true.
dbms.security.load_csv_file_url_root [468]
    Sets the root directory for file:/// URLs used with the Cypher LOAD CSV clause. This must be set to a single directory on the filesystem of the database server, and will make all requests to load from file:/// URLs relative to the specified directory (similar to how a unix chroot operates). By default, this setting is not configured.
    • When not set, file URLs will be resolved as relative to the root of the database server filesystem. If this is the case, a file URL will typically look like file:///home/username/myfile.csv or file:///C:/Users/username/myfile.csv. Using these URLs in LOAD CSV will read content from files on the database server filesystem, specifically /home/username/myfile.csv and C:\Users\username\myfile.csv respectively. For security reasons you may not want users to be able to load files located anywhere on the database server filesystem, and should set dbms.security.load_csv_file_url_root to a safe directory to load files from.
    • When set, file URLs will be resolved as relative to the directory it is set to. In this case a file URL will typically look like file:///myfile.csv or file:///myproject/myfile.csv.
    • If set to data/import, the above URLs in LOAD CSV would read content from data/import/myfile.csv and data/import/myproject/myfile.csv respectively, where both are relative to the database install directory.
    • If set to /home/neo4j, the above URLs in LOAD CSV would read content from /home/neo4j/myfile.csv and /home/neo4j/myproject/myfile.csv respectively.
There is also a worked example, see Section 12.8, “Importing CSV files with Cypher” [211].
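As a sketch, these settings live in conf/neo4j.properties; the directory value below is purely illustrative, not a recommendation:

# conf/neo4j.properties
allow_file_urls=true
dbms.security.load_csv_file_url_root=/var/lib/neo4j/import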

CSV file format
The CSV file to use with LOAD CSV must have the following characteristics:
• the character encoding is UTF-8;
• the end line termination is system dependent, e.g., it is \n on unix or \r\n on windows;
• the default field terminator is ,;
• the field terminator character can be changed by using the option FIELDTERMINATOR available in the LOAD CSV command;
• quoted strings are allowed in the CSV file and the quotes are dropped when reading the data;
• the character for string quotation is double quote ";
• the escape character is \.

Import data from a CSV file
To import data from a CSV file into Neo4j, you can use LOAD CSV to get the data into your query. Then
you write it to your database using the normal updating clauses of Cypher.
artists.csv
"1","ABBA","1992"
"2","Roxette","1986"
"3","Europe","1979"
"4","The Cardigans","1992"

Query
LOAD CSV FROM 'http://neo4j.com/docs/2.3.12/csv/artists.csv' AS line
CREATE (:Artist { name: line[1], year: toInt(line[2])})

A new node with the Artist label is created for each row in the CSV file. In addition, two columns from
the CSV file are set as properties on the nodes.
Result
(empty result)
Nodes created: 4
Properties set: 8
Labels added: 4

Import data from a CSV file containing headers
When your CSV file has headers, you can view each row in the file as a map instead of as an array of
strings.
artists-with-headers.csv
"Id","Name","Year"
"1","ABBA","1992"
"2","Roxette","1986"
"3","Europe","1979"
"4","The Cardigans","1992"

Query
LOAD CSV WITH HEADERS FROM 'http://neo4j.com/docs/2.3.12/csv/artists-with-headers.csv' AS line
CREATE (:Artist { name: line.Name, year: toInt(line.Year)})

This time, the file starts with a single row containing column names. Indicate this using WITH HEADERS and
you can access specific fields by their corresponding column name.
Result
(empty result)
Nodes created: 4
Properties set: 8
Labels added: 4


Import data from a CSV file with a custom field delimiter
Sometimes, your CSV file has other field delimiters than commas. You can specify which delimiter your
file uses using FIELDTERMINATOR.
artists-fieldterminator.csv
"1";"ABBA";"1992"
"2";"Roxette";"1986"
"3";"Europe";"1979"
"4";"The Cardigans";"1992"

Query
LOAD CSV FROM 'http://neo4j.com/docs/2.3.12/csv/artists-fieldterminator.csv' AS line FIELDTERMINATOR ';'
CREATE (:Artist { name: line[1], year: toInt(line[2])})

As values in this file are separated by a semicolon, a custom FIELDTERMINATOR is specified in the LOAD CSV
clause.
Result
(empty result)
Nodes created: 4
Properties set: 8
Labels added: 4

Importing large amounts of data
If the CSV file contains a significant number of rows (approaching hundreds of thousands or millions),
USING PERIODIC COMMIT can be used to instruct Neo4j to perform a commit after a number of rows. This
reduces the memory overhead of the transaction state. By default, the commit will happen every 1000
rows. For more information, see Section 12.9, “Using Periodic Commit” [213].
Query
USING PERIODIC COMMIT
LOAD CSV FROM 'http://neo4j.com/docs/2.3.12/csv/artists.csv' AS line
CREATE (:Artist { name: line[1], year: toInt(line[2])})

Result
(empty result)
Nodes created: 4
Properties set: 8
Labels added: 4

Setting the rate of periodic commits
You can set the number of rows as in the example, where it is set to 500 rows.
Query
USING PERIODIC COMMIT 500
LOAD CSV FROM 'http://neo4j.com/docs/2.3.12/csv/artists.csv' AS line
CREATE (:Artist { name: line[1], year: toInt(line[2])})

Result
(empty result)
Nodes created: 4
Properties set: 8
Labels added: 4


Import data containing escaped characters
In this example, we both have additional quotes around the values, as well as escaped quotes inside
one value.
artists-with-escaped-char.csv
"1","The ""Symbol""","1992"

Query
LOAD CSV FROM 'http://neo4j.com/docs/2.3.12/csv/artists-with-escaped-char.csv' AS line
CREATE (a:Artist { name: line[1], year: toInt(line[2])})
RETURN a.name AS name, a.year AS year, length(a.name) AS length

Note that strings are wrapped in quotes in the output here. You can see that when comparing to the
length of the string in this case!
Result
name             year   length
"The "Symbol""   1992   12
1 row
Nodes created: 1
Properties set: 2
Labels added: 1


Chapter 12. Writing Clauses
Write data to the database.


12.1. Create
The CREATE clause is used to create graph elements — nodes and relationships.
Tip

In the CREATE clause, patterns are used a lot. Read Section 9.6, “Patterns” [127] for an
introduction.

Create nodes
Create single node
Creating a single node is done by issuing the following query.
Query
CREATE (n)

Nothing is returned from this query, except the count of affected nodes.
Result
(empty result)
Nodes created: 1

Create multiple nodes
Creating multiple nodes is done by separating them with a comma.
Query
CREATE (n),(m)

Result
(empty result)
Nodes created: 2

Create a node with a label
To add a label when creating a node, use the syntax below.
Query
CREATE (n:Person)

Nothing is returned from this query.
Result
(empty result)
Nodes created: 1
Labels added: 1

Create a node with multiple labels
To add labels when creating a node, use the syntax below. In this case, we add two labels.
Query
CREATE (n:Person:Swedish)

Nothing is returned from this query.
Result
(empty result)
Nodes created: 1
Labels added: 2

Create node and add labels and properties
When creating a new node with labels, you can add properties at the same time.
Query
CREATE (n:Person { name : 'Andres', title : 'Developer' })

Nothing is returned from this query.
Result
(empty result)
Nodes created: 1
Properties set: 2
Labels added: 1

Return created node
Creating a single node is done by issuing the following query.
Query
CREATE (a { name : 'Andres' })
RETURN a

The newly created node is returned.
Result
a
Node[0]{name:"Andres"}
1 row
Nodes created: 1
Properties set: 1

Create relationships
Create a relationship between two nodes
To create a relationship between two nodes, we first get the two nodes. Once the nodes are loaded, we
simply create a relationship between them.
Query
MATCH (a:Person),(b:Person)
WHERE a.name = 'Node A' AND b.name = 'Node B'
CREATE (a)-[r:RELTYPE]->(b)
RETURN r

The created relationship is returned by the query.
Result
r
:RELTYPE[0]{}
1 row
Relationships created: 1


Create a relationship and set properties
Setting properties on relationships is done in a similar manner to how it’s done when creating nodes.
Note that the values can be any expression.
Query
MATCH (a:Person),(b:Person)
WHERE a.name = 'Node A' AND b.name = 'Node B'
CREATE (a)-[r:RELTYPE { name : a.name + '<->' + b.name }]->(b)
RETURN r

The newly created relationship is returned by the example query.
Result
r
:RELTYPE[0]{name:"Node A<->Node B"}
1 row
Relationships created: 1
Properties set: 1

Create a full path
When you use CREATE and a pattern, all parts of the pattern that are not already in scope at this time will
be created.
Query
CREATE p =(andres { name:'Andres' })-[:WORKS_AT]->(neo)<-[:WORKS_AT]-(michael { name:'Michael' })
RETURN p

This query creates three nodes and two relationships in one go, assigns it to a path identifier, and
returns it.
Result
p
[Node[0]{name:"Andres"}, :WORKS_AT[0]{}, Node[1]{}, :WORKS_AT[1]{}, Node[2]{name:"Michael"}]
1 row
Nodes created: 3
Relationships created: 2
Properties set: 2

Use parameters with CREATE
Create node with a parameter for the properties
You can also create a graph entity from a map. All the key/value pairs in the map will be set as
properties on the created relationship or node. In this case we add a Person label to the node as well.
Parameters
{
  "props" : {
    "name" : "Andres",
    "position" : "Developer"
  }
}

Query
CREATE (n:Person { props })
RETURN n

Result
n
Node[0]{name:"Andres", position:"Developer"}
1 row
Nodes created: 1
Properties set: 2
Labels added: 1

Create multiple nodes with a parameter for their properties
By providing Cypher an array of maps, it will create a node for each map.
Parameters
{
  "props" : [ {
    "name" : "Andres",
    "position" : "Developer"
  }, {
    "name" : "Michael",
    "position" : "Developer"
  } ]
}

Query
UNWIND { props } AS map
CREATE (n)
SET n = map

Result
(empty result)
Nodes created: 2
Properties set: 4

Create multiple nodes with a parameter for their properties using old syntax
By providing Cypher an array of maps, it will create a node for each map.

Note

When you do this, you can’t create anything else in the same CREATE clause.

Note

This syntax is deprecated in Neo4j version 2.3. It may be removed in a future major release.
See the above example using UNWIND for how to achieve the same functionality.
Parameters
{
  "props" : [ {
    "name" : "Andres",
    "position" : "Developer"
  }, {
    "name" : "Michael",
    "position" : "Developer"
  } ]
}

Query
CREATE (n { props })
RETURN n

Result
n
Node[0]{name:"Andres", position:"Developer"}
Node[1]{name:"Michael", position:"Developer"}
2 rows
Nodes created: 2
Properties set: 4


12.2. Merge
The MERGE clause ensures that a pattern exists in the graph. Either the pattern already
exists, or it needs to be created.

Introduction
MERGE either matches existing nodes and binds them, or it creates new data and binds that. It's like a combination of MATCH and CREATE that additionally allows you to specify what happens if the data was matched or created.

For example, you can specify that the graph must contain a node for a user with a certain name. If
there isn’t a node with the correct name, a new node will be created and its name property set.
When using MERGE on full patterns, the behavior is that either the whole pattern matches, or the whole
pattern is created. MERGE will not partially use existing patterns — it’s all or nothing. If partial matches are
needed, this can be accomplished by splitting a pattern up into multiple MERGE clauses.
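As a sketch of the splitting technique (names taken from the example graph below): each MERGE matches or creates its part of the pattern independently, so existing nodes are reused even if the relationship has to be created.

MERGE (charlie:Person { name:'Charlie Sheen' })
MERGE (wallStreet:Movie { title:'Wall Street' })
MERGE (charlie)-[:ACTED_IN]->(wallStreet)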
As with MATCH, MERGE can match multiple occurrences of a pattern. If there are multiple matches, they will
all be passed on to later stages of the query.
The last part of MERGE is the ON CREATE and ON MATCH. These allow a query to express additional changes to the properties of a node or relationship, depending on whether the element was MATCHed in the database or whether it was CREATEd.
The rule planner (see Section 15.1, “How are queries executed?” [254]) expands a MERGE pattern from
the end point that has the identifier with the lowest lexicographical order. This means that it might
choose a suboptimal expansion path, expanding from a node with a higher degree. The pattern MERGE
(a:A)-[:R]->(b:B) will always expand from a to b, so if it is known that b nodes are a better choice for
start point, renaming identifiers could improve performance.
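A sketch of that renaming trick (labels A and B and type R are placeholders): in the first pattern a sorts before b, so expansion starts from the A node; renaming a to something that sorts after b makes the planner expand from the B node instead.

MERGE (a:A)-[:R]->(b:B)    // expands from the A node, since a sorts before b
MERGE (x:A)-[:R]->(b:B)    // expands from the B node, since b sorts before x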
The following graph is used for the examples below:
Figure 12.1. Graph
[Graph: Person nodes 'Charlie Sheen' (bornIn = 'New York', chauffeurName = 'John Brown'), 'Oliver Stone' (bornIn = 'New York', chauffeurName = 'Bill White'), 'Michael Douglas' (bornIn = 'New Jersey', chauffeurName = 'John Brown'), 'Martin Sheen' (bornIn = 'Ohio', chauffeurName = 'Bob Brown') and 'Rob Reiner' (bornIn = 'New York', chauffeurName = 'Ted Green'). 'Charlie Sheen' has a FATHER relationship to 'Martin Sheen'; ACTED_IN and DIRECTED relationships connect the Persons to the Movies 'Wall Street' (name = 'WallStreet') and 'The American President' (name = 'TheAmericanPresident').]
Merge nodes
Merge single node with a label
Merging a single node with a given label.
Query
MERGE (robert:Critic)
RETURN robert, labels(robert)

A new node is created because there are no nodes labeled Critic in the database.
Result
robert      labels(robert)
Node[7]{}   ["Critic"]
1 row
Nodes created: 1
Labels added: 1

Merge single node with properties
Merging a single node with properties where not all properties match any existing node.
Query
MERGE (charlie { name:'Charlie Sheen', age:10 })
RETURN charlie

A new node with the name Charlie Sheen will be created since not all properties matched the existing
Charlie Sheen node.
Result
charlie
Node[7]{name:"Charlie Sheen", age:10}
1 row
Nodes created: 1
Properties set: 2

Merge single node specifying both label and property
Merging a single node with both label and property matching an existing node.
Query
MERGE (michael:Person { name:'Michael Douglas' })
RETURN michael.name, michael.bornIn

Michael Douglas will be matched and the name and bornIn properties returned.
Result
michael.name        michael.bornIn
"Michael Douglas"   "New Jersey"
1 row

Merge single node derived from an existing node property
For some property p in each bound node in a set of nodes, a single new node is created for each
unique value for p.
Query
MATCH (person:Person)
MERGE (city:City { name: person.bornIn })
RETURN person.name, person.bornIn, city

Three nodes labeled City are created, each of which contains a name property with the value of New
York, Ohio, and New Jersey, respectively. Note that even though the MATCH clause results in three bound
nodes having the value New York for the bornIn property, only a single New York node (i.e. a City node
with a name of New York) is created. As the New York node is not matched for the first bound node, it
is created. However, the newly-created New York node is matched and bound for the second and third
bound nodes.

Result
person.name         person.bornIn   city
"Oliver Stone"      "New York"      Node[7]{name:"New York"}
"Charlie Sheen"     "New York"      Node[7]{name:"New York"}
"Martin Sheen"      "Ohio"          Node[8]{name:"Ohio"}
"Rob Reiner"        "New York"      Node[7]{name:"New York"}
"Michael Douglas"   "New Jersey"    Node[9]{name:"New Jersey"}
5 rows
Nodes created: 3
Properties set: 3
Labels added: 3

Use ON CREATE and ON MATCH
Merge with ON CREATE
Merge a node and set properties if the node needs to be created.
Query
MERGE (keanu:Person { name:'Keanu Reeves' })
ON CREATE SET keanu.created = timestamp()
RETURN keanu.name, keanu.created

The query creates the keanu node and sets a timestamp at creation time.
Result
keanu.name     | keanu.created
"Keanu Reeves" | 1512735402884
1 row
Nodes created: 1
Properties set: 2
Labels added: 1

Merge with ON MATCH
Merging nodes and setting properties on found nodes.
Query
MERGE (person:Person)
ON MATCH SET person.found = TRUE
RETURN person.name, person.found

The query finds all the Person nodes, sets a property on them, and returns them.
Result
person.name       | person.found
"Oliver Stone"    | true
"Charlie Sheen"   | true
"Martin Sheen"    | true
"Rob Reiner"      | true
"Michael Douglas" | true
5 rows
Properties set: 5

Merge with ON CREATE and ON MATCH
Merge a node and set properties if the node needs to be created.
Query

MERGE (keanu:Person { name:'Keanu Reeves' })
ON CREATE SET keanu.created = timestamp()
ON MATCH SET keanu.lastSeen = timestamp()
RETURN keanu.name, keanu.created, keanu.lastSeen

The query creates the keanu node and sets a timestamp at creation time. If keanu had already existed,
a different property would have been set.
Result
keanu.name     | keanu.created | keanu.lastSeen
"Keanu Reeves" | 1512735405486 | <null>
1 row
Nodes created: 1
Properties set: 2
Labels added: 1

Merge with ON MATCH setting multiple properties
If multiple properties should be set, simply separate them with commas.
Query
MERGE (person:Person)
ON MATCH SET person.found = TRUE , person.lastAccessed = timestamp()
RETURN person.name, person.found, person.lastAccessed

Result
person.name       | person.found | person.lastAccessed
"Oliver Stone"    | true         | 1512735404553
"Charlie Sheen"   | true         | 1512735404553
"Martin Sheen"    | true         | 1512735404553
"Rob Reiner"      | true         | 1512735404553
"Michael Douglas" | true         | 1512735404553
5 rows
Properties set: 10

Merge relationships
Merge on a relationship
MERGE can be used to match or create a relationship.
Query
MATCH (charlie:Person { name:'Charlie Sheen' }),(wallStreet:Movie { title:'Wall Street' })
MERGE (charlie)-[r:ACTED_IN]->(wallStreet)
RETURN charlie.name, type(r), wallStreet.title

Charlie Sheen had already been marked as acting in Wall Street, so the existing relationship is found and
returned. Note that in order to match or create a relationship when using MERGE, at least one bound
node must be specified, which is done via the MATCH clause in the above example.
Result
charlie.name    | type(r)    | wallStreet.title
"Charlie Sheen" | "ACTED_IN" | "Wall Street"
1 row

Merge on multiple relationships
When MERGE is used on a whole pattern, either everything matches, or everything is created.
Query

MATCH (oliver:Person { name:'Oliver Stone' }),(reiner:Person { name:'Rob Reiner' })
MERGE (oliver)-[:DIRECTED]->(movie:Movie)<-[:ACTED_IN]-(reiner)
RETURN movie

In our example graph, Oliver Stone and Rob Reiner have never worked together. When we try to MERGE a
movie between them, Neo4j will not use any of the existing movies already connected to either person.
Instead, a new movie node is created.
Result
movie
Node[7]{}
1 row
Nodes created: 1
Relationships created: 2
Labels added: 1

Merge on an undirected relationship
MERGE can also be used with an undirected relationship. When it needs to create a new one, it will
pick a direction.
Query

MATCH (charlie:Person { name:'Charlie Sheen' }),(oliver:Person { name:'Oliver Stone' })
MERGE (charlie)-[r:KNOWS]-(oliver)
RETURN r

As Charlie Sheen and Oliver Stone do not know each other, this MERGE query will create a :KNOWS
relationship between them. The direction of the created relationship is arbitrary.
Result
r
:KNOWS[8]{}
1 row
Relationships created: 1

Merge on a relationship between two existing nodes
MERGE can be used in conjunction with preceding MATCH and MERGE clauses to create a relationship
between two bound nodes m and n, where m is returned by MATCH and n is created or matched by the
earlier MERGE.
Query
MATCH (person:Person)
MERGE (city:City { name: person.bornIn })
MERGE (person)-[r:BORN_IN]->(city)
RETURN person.name, person.bornIn, city

This builds on the example from the section called “Merge single node derived from an existing node
property” [193]. The second MERGE creates a BORN_IN relationship between each person and a city
corresponding to the value of the person’s bornIn property. Charlie Sheen, Rob Reiner and Oliver Stone all
have a BORN_IN relationship to the same City node (New York).
Result
person.name       | person.bornIn | city
"Oliver Stone"    | "New York"    | Node[7]{name:"New York"}
"Charlie Sheen"   | "New York"    | Node[7]{name:"New York"}
"Martin Sheen"    | "Ohio"        | Node[8]{name:"Ohio"}
"Rob Reiner"      | "New York"    | Node[7]{name:"New York"}
"Michael Douglas" | "New Jersey"  | Node[9]{name:"New Jersey"}
5 rows
Nodes created: 3
Relationships created: 5
Properties set: 3
Labels added: 3

Merge on a relationship between an existing node and a merged node derived from a
node property
MERGE can be used to simultaneously create both a new node n and a relationship between a bound
node m and n.
Query

MATCH (person:Person)
MERGE (person)-[r:HAS_CHAUFFEUR]->(chauffeur:Chauffeur { name: person.chauffeurName })
RETURN person.name, person.chauffeurName, chauffeur

As MERGE found no matches — in our example graph, there are no nodes labeled with Chauffeur and no
HAS_CHAUFFEUR relationships — MERGE creates five nodes labeled with Chauffeur, each of which contains
a name property whose value corresponds to each matched Person node’s chauffeurName property
value. MERGE also creates a HAS_CHAUFFEUR relationship between each Person node and the newly-created
corresponding Chauffeur node. As Charlie Sheen and Michael Douglas both have a chauffeur with the
same name — John Brown — a new node is created in each case, resulting in two Chauffeur nodes having
a name of John Brown, correctly denoting the fact that even though the name property may be identical,
these are two separate people. This is in contrast to the example shown above in the section called
“Merge on a relationship between two existing nodes” [196], where we used the first MERGE to bind the
City nodes to prevent them from being recreated (and thus duplicated) in the second MERGE.
Result
person.name       | person.chauffeurName | chauffeur
"Oliver Stone"    | "Bill White"         | Node[7]{name:"Bill White"}
"Charlie Sheen"   | "John Brown"         | Node[8]{name:"John Brown"}
"Martin Sheen"    | "Bob Brown"          | Node[9]{name:"Bob Brown"}
"Rob Reiner"      | "Ted Green"          | Node[10]{name:"Ted Green"}
"Michael Douglas" | "John Brown"         | Node[11]{name:"John Brown"}
5 rows
Nodes created: 5
Relationships created: 5
Properties set: 5
Labels added: 5

Using unique constraints with MERGE
Cypher prevents getting conflicting results from MERGE when using patterns that involve uniqueness
constraints. In this case, there must be at most one node that matches the pattern.
For example, given two uniqueness constraints on :Person(id) and :Person(ssn), a query such as
MERGE (n:Person { id: 12, ssn: 437 }) will fail if there are two different nodes (one with id 12 and one
with ssn 437), or if there is only one node with only one of the properties. In other words, there must be
exactly one node that matches the pattern, or no matching nodes.
Note that the following examples assume the existence of uniqueness constraints that have been
created using:
CREATE CONSTRAINT ON (n:Person) ASSERT n.name IS UNIQUE;
CREATE CONSTRAINT ON (n:Person) ASSERT n.role IS UNIQUE;

Merge using unique constraints creates a new node if no node is found
Merge using unique constraints creates a new node if no node is found.
Query
MERGE (laurence:Person { name: 'Laurence Fishburne' })
RETURN laurence.name

The query creates the laurence node. If laurence had already existed, MERGE would just match the existing
node.
Result
laurence.name
"Laurence Fishburne"
1 row
Nodes created: 1
Properties set: 1
Labels added: 1

Merge using unique constraints matches an existing node
Merge using unique constraints matches an existing node.
Query
MERGE (oliver:Person { name:'Oliver Stone' })
RETURN oliver.name, oliver.bornIn

The oliver node already exists, so MERGE just matches it.
Result
oliver.name    | oliver.bornIn
"Oliver Stone" | "New York"
1 row

Merge with unique constraints and partial matches
Merge using unique constraints fails when finding partial matches.
Query
MERGE (michael:Person { name:'Michael Douglas', role:'Gordon Gekko' })
RETURN michael

While there is a matching unique michael node with the name Michael Douglas, there is no unique node
with the role of Gordon Gekko and MERGE fails to match.
Error message
Merge did not find a matching node and can not create a new node due to conflicts
with both existing and missing unique nodes. The conflicting constraints are on:
:Person.name and :Person.role

Merge with unique constraints and conflicting matches
Merge using unique constraints fails when finding conflicting matches.
Query
MERGE (oliver:Person { name:'Oliver Stone', role:'Gordon Gekko' })
RETURN oliver

While there is a matching unique oliver node with the name Oliver Stone, there is also another unique
node with the role of Gordon Gekko and MERGE fails to match.
Error message
Merge did not find a matching node and can not create a new node due to conflicts
with both existing and missing unique nodes. The conflicting constraints are on:
:Person.name and :Person.role

Using map parameters with MERGE
MERGE does not support map parameters the way, for example, CREATE does. To use map parameters with
MERGE, it is necessary to spell out the expected properties explicitly, as in the following example. For more
information on parameters, see Section 8.5, “Parameters” [113].
Parameters
{
  "param" : {
    "name" : "Keanu Reeves",
    "role" : "Neo"
  }
}

Query
MERGE (person:Person { name: { param }.name, role: { param }.role })
RETURN person.name, person.role

Result
person.name    | person.role
"Keanu Reeves" | "Neo"
1 row
Nodes created: 1
Properties set: 2
Labels added: 1


12.3. Set
The SET clause is used to update labels on nodes and properties on nodes and
relationships.
SET can also be used with maps from parameters to set properties.

Note

Setting labels on a node is an idempotent operation — if you try to set a label on a node
that already has that label, nothing happens. The query statistics will tell you whether
anything needed to be done.
The examples use this graph as a starting point:

[Figure: four nodes — Stefan, Emil, Peter (age 34), and Swedish-labeled Andres (age 36, hungry true) —
connected by KNOWS relationships.]

Set a property
To set a property on a node or relationship, use SET.
Query
MATCH (n { name: 'Andres' })
SET n.surname = 'Taylor'
RETURN n

The newly changed node is returned by the query.
Result
n
Node[3]{surname:"Taylor", name:"Andres", age:36, hungry:true}
1 row
Properties set: 1

Remove a property
Normally you remove a property by using REMOVE, but it’s sometimes handy to do it using the SET
command. One example is if the property comes from a parameter.
Query


MATCH (n { name: 'Andres' })
SET n.name = NULL
RETURN n

The node is returned by the query, and the name property is now missing.
Result
n
Node[3]{hungry:true, age:36}
1 row
Properties set: 1

Copying properties between nodes and relationships
You can also use SET to copy all properties from one graph element to another. Remember that doing
this will remove all other properties on the receiving graph element.
Query
MATCH (at { name: 'Andres' }),(pn { name: 'Peter' })
SET at = pn
RETURN at, pn

The Andres node has had all its properties replaced by the properties in the Peter node.
Result
at                            | pn
Node[3]{name:"Peter", age:34} | Node[2]{name:"Peter", age:34}
1 row
Properties set: 3

Adding properties from maps
When setting properties from a map (literal, parameter, or graph element), you can use the += form of
SET to only add properties, without removing any of the existing properties on the graph element.
Query
MATCH (peter { name: 'Peter' })
SET peter += { hungry: TRUE , position: 'Entrepreneur' }

Result
(empty result)
Properties set: 2

Set a property using a parameter
Use a parameter to give the value of a property.
Parameters
{
  "surname" : "Taylor"
}

Query
MATCH (n { name: 'Andres' })
SET n.surname = { surname }
RETURN n

The Andres node has got a surname added.
Result
n
Node[3]{surname:"Taylor", name:"Andres", age:36, hungry:true}
1 row
Properties set: 1

Set all properties using a parameter
This will replace all existing properties on the node with the new set provided by the parameter.
Parameters
{
  "props" : {
    "name" : "Andres",
    "position" : "Developer"
  }
}

Query
MATCH (n { name: 'Andres' })
SET n = { props }
RETURN n

The Andres node has had all its properties replaced by the properties in the props parameter.
Result
n
Node[3]{name:"Andres", position:"Developer"}
1 row
Properties set: 4

Set multiple properties using one SET clause
If you want to set multiple properties in one go, simply separate them with a comma.
Query
MATCH (n { name: 'Andres' })
SET n.position = 'Developer', n.surname = 'Taylor'

Result
(empty result)
Properties set: 2

Set a label on a node
To set a label on a node, use SET.
Query
MATCH (n { name: 'Stefan' })
SET n :German
RETURN n

The newly labeled node is returned by the query.
Result
n
Node[1]{name:"Stefan"}
1 row
Labels added: 1


Set multiple labels on a node
To set multiple labels on a node, use SET and separate the different labels using :.
Query
MATCH (n { name: 'Emil' })
SET n :Swedish:Bossman
RETURN n

The newly labeled node is returned by the query.
Result
n
Node[0]{name:"Emil"}
1 row
Labels added: 2


12.4. Delete
The DELETE clause is used to delete graph elements — nodes, relationships or paths.
For removing properties and labels, see Section 12.5, “Remove” [205]. Remember that you cannot
delete a node without also deleting the relationships that start or end on that node. Either explicitly
delete the relationships, or use DETACH DELETE.
The examples start out with the following database:
[Figure: three nodes — Andres (age 36), Tobias (age 25), and Peter (age 34) — with Andres having KNOWS
relationships to Tobias and Peter.]

Delete single node
To delete a node, use the DELETE clause.
Query
MATCH (n:Useless)
DELETE n

Result
(empty result)
Nodes deleted: 1

Delete all nodes and relationships
This query isn’t for deleting large amounts of data, but is nice when playing around with small example
data sets.
Query
MATCH (n)
DETACH DELETE n

Result
(empty result)
Nodes deleted: 3
Relationships deleted: 2

Delete a node with all its relationships
When you want to delete a node and any relationship going to or from it, use DETACH DELETE.
Query
MATCH (n { name:'Andres' })
DETACH DELETE n

Result
(empty result)
Nodes deleted: 1
Relationships deleted: 2


12.5. Remove
The REMOVE clause is used to remove properties and labels from graph elements.
For deleting nodes and relationships, see Section 12.4, “Delete” [204].

Note

Removing labels from a node is an idempotent operation: If you try to remove a label from a
node that does not have that label on it, nothing happens. The query statistics will tell you if
something needed to be done or not.
The examples start out with the following database:
[Figure: three nodes — Andres (age 36, label Swedish), Tobias (age 25, label Swedish), and Peter (age 34,
labels Swedish and German) — connected by KNOWS relationships.]

Remove a property
Neo4j doesn’t allow storing null in properties. Instead, if no value exists, the property is simply not there.
Consequently, removing a property value from a node or a relationship is also done with REMOVE.
Query
MATCH (andres { name: 'Andres' })
REMOVE andres.age
RETURN andres

The node is returned, and no property age exists on it.
Result
andres
Node[2]{name:"Andres"}
1 row
Properties set: 1

Remove a label from a node
To remove labels, you use REMOVE.
Query
MATCH (n { name: 'Peter' })
REMOVE n:German
RETURN n

Result
n
Node[1]{name:"Peter", age:34}
1 row
Labels removed: 1


Removing multiple labels
To remove multiple labels, you use REMOVE.
Query
MATCH (n { name: 'Peter' })
REMOVE n:German:Swedish
RETURN n

Result
n
Node[1]{name:"Peter", age:34}
1 row
Labels removed: 2


12.6. Foreach
The FOREACH clause is used to update data within a collection, whether the components of a
path or the result of an aggregation.
Collections and paths are key concepts in Cypher. To use them for updating data, you can use the
FOREACH construct. It allows you to run updating commands on the elements of a collection — a path, or a
collection created by aggregation.
The identifier context inside the FOREACH parentheses is separate from the one outside it. This means
that if you CREATE a node identifier inside a FOREACH, you will not be able to use it outside of the
FOREACH statement, unless you match to find it.
Inside the FOREACH parentheses, you can use any of the updating commands — CREATE, CREATE UNIQUE,
MERGE, DELETE, and FOREACH.
If you want to execute an additional MATCH for each element in a collection then UNWIND (see Section 10.6,
“Unwind” [148]) would be a more appropriate command.
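As a minimal sketch of that alternative (the collection values and the visited property here are
hypothetical), UNWIND turns each element into a row that subsequent clauses can MATCH on:

UNWIND ['A', 'B', 'C'] AS value
MATCH (n { name: value })
SET n.visited = TRUE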
Figure 12.2. Data for the examples
[Figure: four nodes named A, B, C, and D connected in a chain by KNOWS relationships.]

Mark all nodes along a path
This query will set the property marked to true on all nodes along a path.
Query
MATCH p = (begin)-[*]->(end)
WHERE begin.name = 'A' AND end.name = 'D'
FOREACH (n IN nodes(p) | SET n.marked = TRUE)

Nothing is returned from this query, but four properties are set.
Result
(empty result)
Properties set: 4


12.7. Create Unique
The CREATE UNIQUE clause is a mix of MATCH and CREATE — it will match what it can, and create
what is missing.

Introduction
Tip

MERGE might be what you want to use instead of CREATE UNIQUE. Note, however, that MERGE
doesn’t give as strong guarantees for relationships being unique.

CREATE UNIQUE is in the middle of MATCH and CREATE — it will match what it can, and create what is missing.
CREATE UNIQUE will always make the least change possible to the graph — if it can use parts of the
existing graph, it will.

Another difference to MATCH is that CREATE UNIQUE assumes the pattern to be unique. If multiple matching
subgraphs are found, an error will be generated.

Tip

In the CREATE UNIQUE clause, patterns are used a lot. Read Section 9.6, “Patterns” [127] for an
introduction.

The examples start out with the following data set:

[Figure: a root node with X relationships to nodes A, B, and C; A has a KNOWS relationship to C.]

Create unique nodes
Create node if missing
If the pattern described needs a node, and it can’t be matched, a new node will be created.
Query
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[:LOVES]-(someone)
RETURN someone

The root node doesn’t have any LOVES relationships, and so a node is created, and also a relationship to
that node.
Result
someone
Node[4]{}
1 row
Nodes created: 1
Relationships created: 1

Create nodes with values
The pattern described can also contain values on the node. These are given using the following syntax:
prop: <expression>.
Query
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[:X]-(leaf { name:'D' })
RETURN leaf

No node connected with the root node has the name D, and so a new node is created to match the
pattern.
Result
leaf
Node[4]{name:"D"}
1 row
Nodes created: 1
Relationships created: 1
Properties set: 1

Create labeled node if missing
If the pattern described needs a labeled node and there is none with the given labels, Cypher will create
a new one.
Query
MATCH (a { name: 'A' })
CREATE UNIQUE (a)-[:KNOWS]-(c:blue)
RETURN c

The A node is connected in a KNOWS relationship to the c node, but since C doesn’t have the :blue label, a
new node labeled as :blue is created along with a KNOWS relationship from A to it.
Result
c
Node[4]{}
1 row
Nodes created: 1
Relationships created: 1
Labels added: 1

Create unique relationships
Create relationship if it is missing
CREATE UNIQUE is used to describe the pattern that should be found or created.
Query
MATCH (lft { name: 'A' }),(rgt)
WHERE rgt.name IN ['B', 'C']
CREATE UNIQUE (lft)-[r:KNOWS]->(rgt)
RETURN r

The left node is matched against the two right nodes. One relationship already exists and can be
matched, and the other relationship is created before it is returned.
Result
r
:KNOWS[4]{}
:KNOWS[3]{}
2 rows
Relationships created: 1

Create relationship with values
Relationships to be created can also be matched on values.
Query
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[r:X { since:'forever' }]-()
RETURN r

In this example, we want the relationship to have a value, and since no such relationship can be found,
a new node and relationship are created. Note that since we are not interested in the created node, we
don’t name it.
Result
r
:X[4]{since:"forever"}
1 row
Nodes created: 1
Relationships created: 1
Properties set: 1

Describe complex pattern
The pattern described by CREATE UNIQUE can be separated by commas, just like in MATCH and CREATE.
Query
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[:FOO]->(x),(root)-[:BAR]->(x)
RETURN x

This example pattern uses two paths, separated by a comma.
Result
x
Node[4]{}
1 row
Nodes created: 1
Relationships created: 2


12.8. Importing CSV files with Cypher
This tutorial will show you how to import data from CSV files using LOAD CSV.
In this example, we’re given three CSV files: a list of persons, a list of movies, and a list of which role was
played by some of these persons in each movie.
CSV files can be stored on the database server and are then accessible using a file:// URL.
Alternatively, LOAD CSV also supports accessing CSV files via HTTPS, HTTP, and FTP. LOAD CSV will follow HTTP
redirects, but for security reasons it will not follow redirects that change the protocol, for example if
the redirect is going from HTTPS to HTTP.
For more details, see Section 11.6, “Load CSV” [182].
Using the following Cypher queries, we’ll create a node for each person, a node for each movie and a
relationship between the two with a property denoting the role. We’re also keeping track of the country
in which each movie was made.
Let’s start with importing the persons:
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/import/persons.csv" AS csvLine
CREATE (p:Person { id: toInt(csvLine.id), name: csvLine.name })

The CSV file we’re using looks like this:
persons.csv
id,name
1,Charlie Sheen
2,Oliver Stone
3,Michael Douglas
4,Martin Sheen
5,Morgan Freeman

Now, let’s import the movies. This time, we’re also creating a relationship to the country in which the
movie was made. If you were storing this data in a SQL database, this would be a one-to-many
relationship.
We’re using MERGE to create nodes that represent countries. Using MERGE avoids creating duplicate
country nodes in the case where multiple movies have been made in the same country.

Important

When using MERGE or MATCH with LOAD CSV we need to make sure we have an index
(see Section 14.1, “Indexes” [244]) or a unique constraint (see Section 14.2,
“Constraints” [247]) on the property we’re merging. This will ensure the query executes in
a performant way.
Before running our query to connect movies and countries we’ll create an index for the name property
on the Country label to ensure the query runs as fast as it can:
CREATE INDEX ON :Country(name)
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/import/movies.csv" AS csvLine
MERGE (country:Country { name: csvLine.country })
CREATE (movie:Movie { id: toInt(csvLine.id), title: csvLine.title, year:toInt(csvLine.year)})
CREATE (movie)-[:MADE_IN]->(country)

movies.csv
id,title,country,year
1,Wall Street,USA,1987
2,The American President,USA,1995
3,The Shawshank Redemption,USA,1994

Lastly, we create the relationships between the persons and the movies. Since the relationship is a
many to many relationship, one actor can participate in many movies, and one movie has many actors
in it. We have this data in a separate file.
We’ll index the id property on Person and Movie nodes. The id property is a temporary property used
to look up the appropriate nodes for a relationship when importing the third file. By indexing the id
property, node lookup (e.g. by MATCH) will be much faster. Since we expect the ids to be unique in each
set, we’ll create a unique constraint. This protects us from invalid data since constraint creation will
fail if there are multiple nodes with the same id property. Creating a unique constraint also creates a
unique index (which is faster than a regular index).
CREATE CONSTRAINT ON (person:Person) ASSERT person.id IS UNIQUE
CREATE CONSTRAINT ON (movie:Movie) ASSERT movie.id IS UNIQUE

Now importing the relationships is a matter of finding the nodes and then creating relationships
between them.
For this query we’ll use USING PERIODIC COMMIT (see Section 12.9, “Using Periodic Commit” [213]) which
is helpful for queries that operate on large CSV files. This hint tells Neo4j that the query might build up
inordinate amounts of transaction state, and so needs to be periodically committed. In this case we
also set the limit to 500 rows per commit.
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/import/roles.csv" AS csvLine
MATCH (person:Person { id: toInt(csvLine.personId)}),(movie:Movie { id: toInt(csvLine.movieId)})
CREATE (person)-[:PLAYED { role: csvLine.role }]->(movie)

roles.csv
personId,movieId,role
1,1,Bud Fox
4,1,Carl Fox
3,1,Gordon Gekko
4,2,A.J. MacInerney
3,2,President Andrew Shepherd
5,3,Ellis Boyd 'Red' Redding

Finally, as the id property was only necessary to import the relationships, we can drop the constraints
and the id property from all movie and person nodes.
DROP CONSTRAINT ON (person:Person) ASSERT person.id IS UNIQUE
DROP CONSTRAINT ON (movie:Movie) ASSERT movie.id IS UNIQUE
MATCH (n)
WHERE n:Person OR n:Movie
REMOVE n.id


12.9. Using Periodic Commit
Note

See Section 12.8, “Importing CSV files with Cypher” [211] on how to import data from CSV
files.
Importing large amounts of data using LOAD CSV with a single Cypher query may fail due to memory
constraints. This will manifest itself as an OutOfMemoryError.
For this situation only, Cypher provides the global USING PERIODIC COMMIT query hint for updating
queries using LOAD CSV. You can optionally set the limit for the number of rows per commit like so: USING
PERIODIC COMMIT 500.
PERIODIC COMMIT will process the rows until the number of rows reaches a limit. Then the current
transaction will be committed and replaced with a newly opened transaction. If no limit is set, a default
value will be used.
See the section called “Importing large amounts of data” [184] in Section 11.6, “Load CSV” [182] for
examples of USING PERIODIC COMMIT with and without setting the number of rows per commit.
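For illustration only, a minimal sketch of the hint in front of a LOAD CSV query (the file name and
property used here are hypothetical):

USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///people.csv" AS row
CREATE (:Person { name: row.name })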

Important

Using periodic commit will prevent running out of memory when importing large amounts
of data. However, it will also break transactional isolation and thus it should only be used
where needed.


Chapter 13. Functions
This chapter contains information on all functions in Cypher. Note that related information exists in
Section 9.4, “Operators” [124].

Note

Most functions in Cypher will return NULL if an input parameter is NULL.
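For example, a call such as the following simply returns NULL instead of raising an error:

RETURN upper(NULL)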


13.1. Predicates
Predicates are boolean functions that return true or false for a given set of input. They are most
commonly used to filter out subgraphs in the WHERE part of a query.
See also the section called “Comparison operators” [124].
Figure 13.1. Graph
[Figure: five nodes — Alice (age 38, eyes 'brown', labels foo and bar), Bob (age 25, eyes 'blue'), Charlie
(age 53, eyes 'green'), Daniel (age 54, eyes 'brown'), and Eskil (age 41, eyes 'blue', array ['one', 'two',
'three']) — connected by KNOWS relationships; Bob also has a MARRIED relationship to Eskil, and one node
carries the label Spouse.]

ALL
Tests whether a predicate holds for all elements of this collection.
Syntax: ALL(identifier in collection WHERE predicate)
Arguments:
• collection: An expression that returns a collection
• identifier: This is the identifier that can be used from the predicate.
• predicate: A predicate that is tested against all items in the collection.
Query
MATCH p=(a)-[*1..3]->(b)
WHERE a.name='Alice' AND b.name='Daniel' AND ALL (x IN nodes(p) WHERE x.age > 30)
RETURN p

All nodes in the returned paths will have an age property greater than 30.
Result
p
[Node[2]{name:"Alice", age:38, eyes:"brown"}, :KNOWS[1]{}, Node[4]{name:"Charlie", age:53,
eyes:"green"}, :KNOWS[3]{}, Node[0]{name:"Daniel", age:54, eyes:"brown"}]
1 row

ANY
Tests whether a predicate holds for at least one element in the collection.
Syntax: ANY(identifier in collection WHERE predicate)
Arguments:
• collection: An expression that returns a collection
• identifier: This is the identifier that can be used from the predicate.
• predicate: A predicate that is tested against all items in the collection.
Query
MATCH (a)
WHERE a.name='Eskil' AND ANY (x IN a.array WHERE x = "one")
RETURN a

The returned node has at least one value equal to "one" in the array property named array.
Result
a
Node[1]{array:["one", "two", "three"], name:"Eskil", age:41, eyes:"blue"}
1 row

NONE
Returns true if the predicate holds for no element in the collection.
Syntax: NONE(identifier in collection WHERE predicate)
Arguments:
• collection: An expression that returns a collection
• identifier: This is the identifier that can be used from the predicate.
• predicate: A predicate that is tested against all items in the collection.
Query
MATCH p=(n)-[*1..3]->(b)
WHERE n.name='Alice' AND NONE (x IN nodes(p) WHERE x.age = 25)
RETURN p

No node in the returned paths has an age property set to 25.
Result
p
[Node[2]{name:"Alice", age:38, eyes:"brown"}, :KNOWS[1]{}, Node[4]{name:"Charlie", age:53, eyes:"green"}]
[Node[2]{name:"Alice", age:38, eyes:"brown"}, :KNOWS[1]{}, Node[4]{name:"Charlie", age:53,
eyes:"green"}, :KNOWS[3]{}, Node[0]{name:"Daniel", age:54, eyes:"brown"}]
2 rows

SINGLE
Returns true if the predicate holds for exactly one of the elements in the collection.
Syntax: SINGLE(identifier in collection WHERE predicate)
Arguments:
• collection: An expression that returns a collection
• identifier: This is the identifier that can be used from the predicate.
• predicate: A predicate that is tested against all items in the collection.
Query
MATCH p=(n)-->(b)
WHERE n.name='Alice' AND SINGLE (var IN nodes(p) WHERE var.eyes = "blue")
RETURN p

Exactly one node in every returned path will have the eyes property set to "blue".
Result
p
[Node[2]{name:"Alice", age:38, eyes:"brown"}, :KNOWS[0]{}, Node[3]{name:"Bob", age:25, eyes:"blue"}]
1 row

EXISTS
Returns true if a match for the pattern exists in the graph, or the property exists in the node,
relationship or map.
Syntax: EXISTS( pattern-or-property )
Arguments:
• pattern-or-property: A pattern or a property (in the form identifier.prop).
Query
MATCH (n)
WHERE EXISTS(n.name)
RETURN n.name AS name, EXISTS((n)-[:MARRIED]->()) AS is_married

This query returns all the nodes with a name property along with a boolean true/false indicating if they
are married.
Result
name      | is_married
"Daniel"  | false
"Eskil"   | false
"Alice"   | false
"Bob"     | true
"Charlie" | false
5 rows


13.2. Scalar functions
Scalar functions return a single value.

Important

The LENGTH and SIZE functions are quite similar, and so it is important to take note of the
difference. Due to backwards compatibility LENGTH currently works on four types: strings,
paths, collections and pattern expressions. However, for clarity it is recommended to only
use LENGTH on strings and paths, and use the new SIZE function on collections and pattern
expressions. LENGTH on those types may be deprecated in the future.
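To make the distinction concrete, a minimal illustration:

RETURN length('hello') AS stringLength, size(['a', 'b', 'c']) AS collectionSize

This returns 5 for the string and 3 for the collection.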
Figure 13.2. Graph
[Figure: five nodes — Alice (age 38, eyes 'brown', labels foo and bar), Bob (age 25, eyes 'blue'), Charlie
(age 53, eyes 'green'), Daniel (age 54, eyes 'brown'), and Eskil (age 41, eyes 'blue', array ['one', 'two',
'three']) — connected by KNOWS relationships; Bob also has a MARRIED relationship to Eskil, and one node
carries the label Spouse.]

SIZE
To return or filter on the size of a collection, use the SIZE() function.
Syntax: SIZE( collection )
Arguments:
• collection: An expression that returns a collection
Query
RETURN size(['Alice', 'Bob']) AS col

The number of items in the collection is returned by the query.
Result
col
2
1 row

SIZE of pattern expression
This is the same SIZE() method described before, but instead of passing in a collection directly, you
provide a pattern expression that can be used in a match query to provide a new set of results. The size
of the result is calculated, not the length of the expression itself.
Syntax: SIZE( pattern expression )
Arguments:
• pattern expression: A pattern expression that returns a collection
Query
MATCH (a)
WHERE a.name='Alice'
RETURN size((a)-->()-->()) AS fof

The number of sub-graphs matching the pattern expression is returned by the query.
Result
fof
3
1 row

LENGTH
To return or filter on the length of a path, use the LENGTH() function.
Syntax: LENGTH( path )
Arguments:
• path: An expression that returns a path
Query
MATCH p=(a)-->(b)-->(c)
WHERE a.name='Alice'
RETURN length(p)

The length of the path p is returned by the query.
Result
length(p)
2
2
2
3 rows

LENGTH of string
To return or filter on the length of a string, use the LENGTH() function.
Syntax: LENGTH( string )
Arguments:
• string: An expression that returns a string
Query
MATCH (a)
WHERE length(a.name)> 6
RETURN length(a.name)

The length of the name Charlie is returned by the query.
Result
length(a.name)
7
1 row

TYPE
Returns a string representation of the relationship type.
Syntax: TYPE( relationship )
Arguments:
• relationship: A relationship.
Query
MATCH (n)-[r]->()
WHERE n.name='Alice'
RETURN type(r)

The relationship type of r is returned by the query.
Result
type(r)
"KNOWS"
"KNOWS"
2 rows

ID
Returns the id of the relationship or node.
Syntax: ID( property-container )
Arguments:
• property-container: A node or a relationship.
Query
MATCH (a)
RETURN id(a)

This returns the node id for each of the five nodes.
Result
id(a)
0
1
2
3
4
5 rows

COALESCE
Returns the first non-NULL value in the list of expressions passed to it. In case all arguments are NULL,
NULL will be returned.
Syntax: COALESCE( expression [, expression]* )
Arguments:
• expression: The expression that might return NULL.
Query
MATCH (a)
WHERE a.name='Alice'
RETURN coalesce(a.hairColor, a.eyes)

Result
coalesce(a.hairColor, a.eyes)
"brown"
1 row

HEAD
HEAD returns the first element in a collection.

Syntax: HEAD( expression )
Arguments:
• expression: This expression should return a collection of some kind.
Query
MATCH (a)
WHERE a.name='Eskil'
RETURN a.array, head(a.array)

The first element of the array property is returned.
Result
a.array                 | head(a.array)
["one", "two", "three"] | "one"
1 row

LAST
LAST returns the last element in a collection.

Syntax: LAST( expression )
Arguments:
• expression: This expression should return a collection of some kind.
Query
MATCH (a)
WHERE a.name='Eskil'
RETURN a.array, last(a.array)

The last element of the array property is returned.
Result
a.array                 | last(a.array)
["one", "two", "three"] | "three"
1 row


TIMESTAMP
TIMESTAMP returns the difference, measured in milliseconds, between the current time and midnight,
January 1, 1970 UTC. It will return the same value throughout a single query, even if the query is a
long-running one.
Syntax: TIMESTAMP()
Arguments:
Query
RETURN timestamp()

The time in milliseconds is returned.
Result
timestamp()
1512735441499
1 row

STARTNODE
STARTNODE returns the starting node of a relationship

Syntax: STARTNODE( relationship )
Arguments:
• relationship: An expression that returns a relationship
Query
MATCH (x:foo)-[r]-()
RETURN startNode(r)

Result
startNode(r)
Node[2]{name:"Alice", age:38, eyes:"brown"}
Node[2]{name:"Alice", age:38, eyes:"brown"}
2 rows

ENDNODE
ENDNODE returns the end node of a relationship.

Syntax: ENDNODE( relationship )
Arguments:
• relationship: An expression that returns a relationship
Query
MATCH (x:foo)-[r]-()
RETURN endNode(r)

Result
endNode(r)
Node[4]{name:"Charlie", age:53, eyes:"green"}
Node[3]{name:"Bob", age:25, eyes:"blue"}
2 rows


TOINT
TOINT converts the argument to an integer. A string is parsed as if it were an integer number. If the
parsing fails, NULL will be returned. A floating point number will be cast into an integer.

Syntax: TOINT( expression )
Arguments:
• expression: An expression that returns anything
Query
RETURN toInt("42"), toInt("not a number")

Result
toInt("42") | toInt("not a number")
42          | <null>
1 row

TOFLOAT
TOFLOAT converts the argument to a float. A string is parsed as if it were a floating point number. If the
parsing fails, NULL will be returned. An integer will be cast to a floating point number.

Syntax: TOFLOAT( expression )
Arguments:
• expression: An expression that returns anything
Query
RETURN toFloat("11.5"), toFloat("not a number")

Result
toFloat("11.5") | toFloat("not a number")
11.5            | <null>
1 row


13.3. Collection functions
Collection functions return collections of things — nodes in a path, and so on.
See also the section called “Collection operators” [124].
Figure 13.3. Graph
[Figure: five nodes — Alice (age 38, eyes 'brown', labels foo and bar), Bob (age 25, eyes 'blue'), Charlie
(age 53, eyes 'green'), Daniel (age 54, eyes 'brown'), and Eskil (age 41, eyes 'blue', array ['one', 'two',
'three']) — connected by KNOWS relationships; Bob also has a MARRIED relationship to Eskil, and one node
carries the label Spouse.]

NODES
Returns all nodes in a path.
Syntax: NODES( path )
Arguments:
• path: A path.
Query
MATCH p=(a)-->(b)-->(c)
WHERE a.name='Alice' AND c.name='Eskil'
RETURN nodes(p)

All the nodes in the path p are returned by the example query.
Result
nodes(p)
[Node[2]{name:"Alice", age:38, eyes:"brown"}, Node[3]{name:"Bob", age:25, eyes:"blue"}, Node[1]{array:
["one", "two", "three"], name:"Eskil", age:41, eyes:"blue"}]
1 row

RELATIONSHIPS
Returns all relationships in a path.
Syntax: RELATIONSHIPS( path )
Arguments:
• path: A path.
Query
MATCH p=(a)-->(b)-->(c)
WHERE a.name='Alice' AND c.name='Eskil'
RETURN relationships(p)

All the relationships in the path p are returned.
Result
relationships(p)
[:KNOWS[0]{}, :MARRIED[4]{}]
1 row

LABELS
Returns a collection of string representations for the labels attached to a node.
Syntax: LABELS( node )
Arguments:
• node: Any expression that returns a single node
Query
MATCH (a)
WHERE a.name='Alice'
RETURN labels(a)

The labels of a are returned by the query.
Result
labels(a)
["foo", "bar"]
1 row

KEYS
Returns a collection of string representations for the property names of a node, relationship, or map.
Syntax: KEYS( property-container )
Arguments:
• property-container: A node, a relationship, or a literal map.
Query
MATCH (a)
WHERE a.name='Alice'
RETURN keys(a)

The names of the properties of a are returned by the query.
Result
keys(a)
["name", "age", "eyes"]
1 row

EXTRACT
To return a single property, or the value of a function, from a collection of nodes or relationships, you
can use EXTRACT. It will go through a collection, run an expression on every element, and return the
results in a collection with these values. It works like the map method in functional languages such as
Lisp and Scala.
Syntax: EXTRACT( identifier in collection | expression )
Arguments:
• collection: An expression that returns a collection
• identifier: The closure will have an identifier introduced in its context. Here you decide which
identifier to use.
• expression: This expression will run once per value in the collection, and produces the result
collection.
Query
MATCH p=(a)-->(b)-->(c)
WHERE a.name='Alice' AND b.name='Bob' AND c.name='Daniel'
RETURN extract(n IN nodes(p)| n.age) AS extracted

The age properties of all nodes in the path are returned.
Result
extracted
[38, 25, 54]
1 row

FILTER
FILTER returns all the elements in a collection that comply with a predicate.

Syntax: FILTER(identifier in collection WHERE predicate)
Arguments:
• collection: An expression that returns a collection
• identifier: This is the identifier that can be used from the predicate.
• predicate: A predicate that is tested against all items in the collection.
Query
MATCH (a)
WHERE a.name='Eskil'
RETURN a.array, filter(x IN a.array WHERE size(x)= 3)

This returns the property named array and a list of the values in it that have size 3.
Result
a.array                 | filter(x IN a.array WHERE size(x) = 3)
["one", "two", "three"] | ["one", "two"]
1 row

TAIL
TAIL returns all but the first element in a collection.

Syntax: TAIL( expression )
Arguments:
• expression: This expression should return a collection of some kind.
Query


MATCH (a)
WHERE a.name='Eskil'
RETURN a.array, tail(a.array)

This returns the property named array and all elements of that property except the first one.
Result
a.array                 | tail(a.array)
["one", "two", "three"] | ["two", "three"]
1 row

RANGE
Returns numerical values in a range with a non-zero step value step. The range is inclusive at both ends.
Syntax: RANGE( start, end [, step] )
Arguments:
• start: A numerical expression.
• end: A numerical expression.
• step: A numerical expression.
Query
RETURN range(0,10), range(2,18,3)

Two lists of numbers are returned.
Result
range(0,10)                        | range(2,18,3)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | [2, 5, 8, 11, 14, 17]
1 row

REDUCE
To run an expression against individual elements of a collection, and store the result of the expression
in an accumulator, you can use REDUCE. It will go through a collection, run an expression on every
element, storing the partial result in the accumulator. It works like the fold or reduce method in
functional languages such as Lisp and Scala.
Syntax: REDUCE( accumulator = initial, identifier in collection | expression )
Arguments:
• accumulator: An identifier that will hold the result and the partial results as the collection is iterated.
• initial: An expression that runs once to give a starting value to the accumulator.
• collection: An expression that returns a collection.
• identifier: The closure will have an identifier introduced in its context. Here you decide which
identifier to use.
• expression: This expression will run once per value in the collection, and produces the result value.
Query
MATCH p=(a)-->(b)-->(c)
WHERE a.name='Alice' AND b.name='Bob' AND c.name='Daniel'
RETURN reduce(totalAge = 0, n IN nodes(p)| totalAge + n.age) AS reduction

The age properties of all nodes in the path are summed and returned as a single value.
Result
reduction
117
1 row


13.4. Mathematical functions
These functions all operate on numerical expressions only, and will return an error if used on any other
values.
See also the section called “Mathematical operators” [124].
Figure 13.4. Graph
[Figure: five nodes — Alice (age 38, eyes 'brown', labels foo and bar), Bob (age 25, eyes 'blue'), Charlie
(age 53, eyes 'green'), Daniel (age 54, eyes 'brown'), and Eskil (age 41, eyes 'blue', array ['one', 'two',
'three']) — connected by KNOWS relationships; Bob also has a MARRIED relationship to Eskil, and one node
carries the label Spouse.]

ABS
ABS returns the absolute value of a number.

Syntax: ABS( expression )
Arguments:
• expression: A numeric expression.
Query
MATCH (a),(e)
WHERE a.name = 'Alice' AND e.name = 'Eskil'
RETURN a.age, e.age, abs(a.age - e.age)

The absolute value of the age difference is returned.
Result
a.age | e.age | abs(a.age - e.age)
38    | 41    | 3.0
1 row

ACOS
ACOS returns the arccosine of the expression, in radians.

Syntax: ACOS( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN acos(0.5)

The arccosine of 0.5.
Result
acos(0.5)
1.0471975511965979
1 row

ASIN
ASIN returns the arcsine of the expression, in radians.

Syntax: ASIN( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN asin(0.5)

The arcsine of 0.5.
Result
asin(0.5)
0.5235987755982989
1 row

ATAN
ATAN returns the arctangent of the expression, in radians.

Syntax: ATAN( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN atan(0.5)

The arctangent of 0.5.
Result
atan(0.5)
0.4636476090008061
1 row

ATAN2
ATAN2 returns the arctangent2 of a set of coordinates, in radians.

Syntax: ATAN2( expression , expression)
Arguments:
• expression: A numeric expression for y.
• expression: A numeric expression for x.
Query
RETURN atan2(0.5, 0.6)

The arctangent2 of 0.5, 0.6.
Result
atan2(0.5, 0.6)
0.6947382761967033
1 row

CEIL
CEIL returns the smallest integer greater than or equal to the number.

Syntax: CEIL( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN ceil(0.1)

The ceil of 0.1 is returned.
Result
ceil(0.1)
1.0
1 row

COS
COS returns the cosine of the expression.

Syntax: COS( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN cos(0.5)

The cosine of 0.5 is returned.
Result
cos(0.5)
0.8775825618903728
1 row

COT
COT returns the cotangent of the expression.

Syntax: COT( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN cot(0.5)

The cotangent of 0.5 is returned.
Result
cot(0.5)
1.830487721712452
1 row

DEGREES
DEGREES converts radians to degrees.

Syntax: DEGREES( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN degrees(3.14159)

The number of degrees in something close to pi.
Result
degrees(3.14159)
179.99984796050427
1 row

E
E returns the constant, e.

Syntax: E()
Arguments:
Query
RETURN e()

The constant e is returned (the base of natural log).
Result
e()
2.718281828459045
1 row

EXP
EXP returns the value e raised to the power of the expression.

Syntax: EXP( expression )
Arguments:
• expression: A numeric expression.
Query


RETURN exp(2)

The exp of 2 is returned: e^2.
Result
exp(2)
7.38905609893065
1 row

FLOOR
FLOOR returns the greatest integer less than or equal to the expression.

Syntax: FLOOR( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN floor(0.9)

The floor of 0.9 is returned.
Result
floor(0.9)
0.0
1 row

HAVERSIN
HAVERSIN returns half the versine of the expression.

Syntax: HAVERSIN( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN haversin(0.5)

The haversine of 0.5 is returned.
Result
haversin(0.5)
0.06120871905481362
1 row

Spherical distance using the haversin function
The haversin function may be used to compute the distance on the surface of a sphere between two
points (each given by their latitude and longitude). In this example the spherical distance (in km)
between Berlin in Germany (at lat 52.5, lon 13.4) and San Mateo in California (at lat 37.5, lon -122.3) is
calculated using an average earth radius of 6371 km.
Query
CREATE (ber:City { lat: 52.5, lon: 13.4 }),(sm:City { lat: 37.5, lon: -122.3 })
RETURN 2 * 6371 * asin(sqrt(haversin(radians(sm.lat - ber.lat))+ cos(radians(sm.lat))*
cos(radians(ber.lat))* haversin(radians(sm.lon - ber.lon)))) AS dist

The distance between Berlin and San Mateo is returned (about 9129 km).
Result
dist
9129.969740051658
1 row
Nodes created: 2
Properties set: 4
Labels added: 2

LOG
LOG returns the natural logarithm of the expression.

Syntax: LOG( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN log(27)

The log of 27 is returned.
Result
log(27)
3.295836866004329
1 row

LOG10
LOG10 returns the base 10 logarithm of the expression.

Syntax: LOG10( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN log10(27)

The log10 of 27 is returned.
Result
log10(27)
1.4313637641589874
1 row

PI
PI returns the mathematical constant pi.

Syntax: PI()
Arguments:
Query


RETURN pi()

The constant pi is returned.
Result
pi()
3.141592653589793
1 row

RADIANS
RADIANS converts degrees to radians.

Syntax: RADIANS( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN radians(180)

The number of radians in 180 is returned (pi).
Result
radians(180)
3.141592653589793
1 row

RAND
RAND returns a random double between 0 and 1.0.

Syntax: RAND()
Arguments:
Query
RETURN rand() AS x1

A random number is returned.
Result
x1
0.38708134468955
1 row

ROUND
ROUND returns the numerical expression, rounded to the nearest integer.

Syntax: ROUND( expression )
Arguments:
• expression: A numerical expression.
Query


RETURN round(3.141592)

Result
round(3.141592)
3.0
1 row

SIGN
SIGN returns the signum of a number — zero if the expression is zero, -1 for any negative number, and 1
for any positive number.

Syntax: SIGN( expression )
Arguments:
• expression: A numerical expression
Query
RETURN sign(-17), sign(0.1)

Result
sign(-17) | sign(0.1)
-1.0      | 1.0
1 row

SIN
SIN returns the sine of the expression.

Syntax: SIN( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN sin(0.5)

The sine of 0.5 is returned.
Result
sin(0.5)
0.479425538604203
1 row

SQRT
SQRT returns the square root of a number.

Syntax: SQRT( expression )
Arguments:
• expression: A numerical expression
Query
RETURN sqrt(256)

Result
sqrt(256)
16.0
1 row

TAN
TAN returns the tangent of the expression.

Syntax: TAN( expression )
Arguments:
• expression: A numeric expression.
Query
RETURN tan(0.5)

The tangent of 0.5 is returned.
Result
tan(0.5)
0.5463024898437905
1 row


13.5. String functions
These functions all operate on string expressions only, and will return an error if used on any other
values. The exception to this rule is TOSTRING(), which also accepts numbers.
See also the section called “String operators” [124].
Figure 13.5. Graph
[Figure: five nodes — Alice (age 38, eyes 'brown', labels foo and bar), Bob (age 25, eyes 'blue'), Charlie
(age 53, eyes 'green'), Daniel (age 54, eyes 'brown'), and Eskil (age 41, eyes 'blue', array ['one', 'two',
'three']) — connected by KNOWS relationships; Bob also has a MARRIED relationship to Eskil, and one node
carries the label Spouse.]

STR
STR returns a string representation of the expression. If the expression returns a string, the result will
be wrapped in quotation marks.
Syntax: STR( expression )
Arguments:

• expression: An expression that returns anything
Query
RETURN str(1), str("hello")

Result
str(1) | str("hello")
"1"    | ""hello""
1 row

Note

The STR() function is deprecated from Neo4j version 2.3 and onwards. This means it may be
removed in a future Neo4j major release.

REPLACE
REPLACE returns a string with the search string replaced by the replace string. It replaces all occurrences.

Syntax: REPLACE( original, search, replace )
Arguments:
• original: An expression that returns a string
• search: An expression that returns a string to search for
• replace: An expression that returns the string to replace the search string with
Query
RETURN replace("hello", "l", "w")

Result
replace("hello", "l", "w")
"hewwo"
1 row

SUBSTRING
SUBSTRING returns a substring of the original, with a 0-based index start and length. If length is omitted,

it returns a substring from start until the end of the string.
Syntax: SUBSTRING( original, start [, length] )
Arguments:
• original: An expression that returns a string
• start: An expression that returns a positive number
• length: An expression that returns a positive number
Query
RETURN substring("hello", 1, 3), substring("hello", 2)

Result
substring("hello", 1, 3) | substring("hello", 2)
"ell"                    | "llo"
1 row

LEFT
LEFT returns a string containing the specified number of leftmost characters of the original string.

Syntax: LEFT( original, length )
Arguments:
• original: An expression that returns a string
• length: An expression that returns a positive number
Query
RETURN left("hello", 3)

Result
left("hello", 3)
"hel"
1 row

RIGHT
RIGHT returns a string containing the specified number of rightmost characters of the original string.

Syntax: RIGHT( original, length )
Arguments:
• original: An expression that returns a string
• length: An expression that returns a positive number
Query
RETURN right("hello", 3)

Result
right("hello", 3)
"llo"
1 row

LTRIM
LTRIM returns the original string with whitespace removed from the left side.

Syntax: LTRIM( original )
Arguments:
• original: An expression that returns a string
Query
RETURN ltrim(" hello")

Result
ltrim(" hello")
"hello"
1 row

RTRIM
RTRIM returns the original string with whitespace removed from the right side.

Syntax: RTRIM( original )
Arguments:
• original: An expression that returns a string
Query
RETURN rtrim("hello

")

Result
rtrim("hello ")
"hello"
1 row

TRIM
TRIM returns the original string with whitespace removed from both sides.

Syntax: TRIM( original )
Arguments:
• original: An expression that returns a string
Query
RETURN trim(" hello ")

Result
trim(" hello ")
"hello"
1 row

LOWER
LOWER returns the original string in lowercase.

Syntax: LOWER( original )
Arguments:
• original: An expression that returns a string
Query
RETURN lower("HELLO")

Result
lower("HELLO")
"hello"
1 row

UPPER
UPPER returns the original string in uppercase.

Syntax: UPPER( original )
Arguments:
• original: An expression that returns a string
Query
RETURN upper("hello")

Result
upper("hello")
"HELLO"
1 row

SPLIT
SPLIT returns the collection of strings obtained by splitting the original string around the split pattern.

Syntax: SPLIT( original, splitPattern )
Arguments:
• original: An expression that returns a string
• splitPattern: The string to split the original string with
Query
RETURN split("one,two", ",")

Result
split("one,two", ",")
["one", "two"]
1 row

REVERSE
REVERSE returns the original string reversed.

Syntax: REVERSE( original )
Arguments:
• original: An expression that returns a string
Query
RETURN reverse("anagram")

Result
reverse("anagram")
"margana"
1 row

TOSTRING
TOSTRING converts the argument to a string. It converts integral and floating point numbers to strings,

and if called with a string will leave it unchanged.
Syntax: TOSTRING( expression )
Arguments:

• expression: An expression that returns a number or a string
Query
RETURN toString(11.5), toString("already a string")

Result
toString(11.5) | toString("already a string")
"11.5"         | "already a string"
1 row


Chapter 14. Schema
Neo4j 2.0 introduced an optional schema for the graph, based around the concept of labels. Labels are
used in the specification of indexes, and for defining constraints on the graph. Together, indexes and
constraints are the schema of the graph. Cypher includes data definition language (DDL) statements for
manipulating the schema.


14.1. Indexes
A database index is a redundant copy of information in the database for the purpose of making
retrieving said data more efficient. This comes at the cost of additional storage space and slower writes,
so deciding what to index and what not to index is an important and often non-trivial task.
Cypher allows the creation of indexes over a property for all nodes that have a given label. Once
an index has been created, it will automatically be managed and kept up to date by the database
whenever the graph is changed. Neo4j will automatically pick up and start using the index once it has
been created and brought online.

Create an index
To create an index on a property for all nodes that have a label, use CREATE INDEX ON. Note that the index
is not immediately available, but will be created in the background.
Query
CREATE INDEX ON :Person(name)

Result
(empty result)

Drop an index
To drop an index on all nodes that have a label and property combination, use the DROP INDEX clause.
Query
DROP INDEX ON :Person(name)

Result
(empty result)
Indexes removed: 1

Use index
There is usually no need to specify which indexes to use in a query; Cypher will figure that out by itself.
For example the query below will use the Person(name) index, if it exists. If you want Cypher to use
specific indexes, you can enforce it using hints. See Section 10.8, “Using” [152].
Query
MATCH (person:Person { name: 'Andres' })
RETURN person

Query Plan
+-----------------+----------------+------+---------+-------------+---------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other         |
+-----------------+----------------+------+---------+-------------+---------------+
| +ProduceResults |              1 |    1 |       0 | person      | person        |
| +NodeIndexSeek  |              1 |    1 |       2 | person      | :Person(name) |
+-----------------+----------------+------+---------+-------------+---------------+
Total database accesses: 2
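
For illustration, here is a sketch of the same lookup with an explicit index hint, using the USING INDEX syntax described in Section 10.8, “Using” [152]:
MATCH (person:Person)
USING INDEX person:Person(name)
WHERE person.name = 'Andres'
RETURN person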

Use index with WHERE using equality
Indexes are also automatically used for equality comparisons of an indexed property in the WHERE
clause. If you want Cypher to use specific indexes, you can enforce it using hints. See Section 10.8,
“Using” [152].
Query
MATCH (person:Person)
WHERE person.name = 'Andres'
RETURN person

Query Plan
+-----------------+----------------+------+---------+-------------+---------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other         |
+-----------------+----------------+------+---------+-------------+---------------+
| +ProduceResults |              1 |    1 |       0 | person      | person        |
| +NodeIndexSeek  |              1 |    1 |       2 | person      | :Person(name) |
+-----------------+----------------+------+---------+-------------+---------------+
Total database accesses: 2

Use index with WHERE using inequality
Indexes are also automatically used for inequality (range) comparisons of an indexed property in
the WHERE clause. If you want Cypher to use specific indexes, you can enforce it using hints. See
Section 10.8, “Using” [152].
Query
MATCH (person:Person)
WHERE person.name > 'B'
RETURN person

Query Plan
+-----------------------+----------------+------+---------+-------------+--------------------------------+
| Operator              | Estimated Rows | Rows | DB Hits | Identifiers | Other                          |
+-----------------------+----------------+------+---------+-------------+--------------------------------+
| +ProduceResults       |             33 |    1 |       0 | person      | person                         |
| +NodeIndexSeekByRange |             33 |    1 |       2 | person      | :Person(name) > { AUTOSTRING0} |
+-----------------------+----------------+------+---------+-------------+--------------------------------+
Total database accesses: 2

Use index with IN
The IN predicate on person.name in the following query will use the Person(name) index, if it exists. If you
want Cypher to use specific indexes, you can enforce it using hints. See Section 10.8, “Using” [152].
Query
MATCH (person:Person)
WHERE person.name IN ['Andres', 'Mark']
RETURN person

Query Plan
+-----------------+----------------+------+---------+-------------+---------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other         |
+-----------------+----------------+------+---------+-------------+---------------+
| +ProduceResults |              2 |    2 |       0 | person      | person        |
| +NodeIndexSeek  |              2 |    2 |       4 | person      | :Person(name) |
+-----------------+----------------+------+---------+-------------+---------------+
Total database accesses: 4


Use index with STARTS WITH
The STARTS WITH predicate on person.name in the following query will use the Person(name) index, if it
exists.

Note

The similar operators ENDS WITH and CONTAINS cannot currently be solved using indexes.
Query
MATCH (person:Person)
WHERE person.name STARTS WITH 'And'
RETURN person

Query Plan
+-----------------------+----------------+------+---------+-------------+-------------------------------------------+
| Operator              | Estimated Rows | Rows | DB Hits | Identifiers | Other                                     |
+-----------------------+----------------+------+---------+-------------+-------------------------------------------+
| +ProduceResults       |             26 |    1 |       0 | person      | person                                    |
| +NodeIndexSeekByRange |             26 |    1 |       2 | person      | :Person(name STARTS WITH { AUTOSTRING0}) |
+-----------------------+----------------+------+---------+-------------+-------------------------------------------+
Total database accesses: 2

Use index when checking for the existence of a property
The has(p.name) predicate in the following query will use the Person(name) index, if it exists.
Query
MATCH (p:Person)
WHERE HAS (p.name)
RETURN p

Query Plan
+-----------------+----------------+------+---------+-------------+---------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other         |
+-----------------+----------------+------+---------+-------------+---------------+
| +ProduceResults |              2 |    2 |       0 | p           | p             |
| +NodeIndexScan  |              2 |    2 |       3 | p           | :Person(name) |
+-----------------+----------------+------+---------+-------------+---------------+
Total database accesses: 3


14.2. Constraints
Neo4j helps enforce data integrity with the use of constraints. Constraints can be applied
to either nodes or relationships. Unique node property constraints can be created, as well
as node and relationship property existence constraints.
You can use unique property constraints to ensure that property values are unique for all nodes with
a specific label. Unique constraints do not mean that all nodes have to have a unique value for the
properties — nodes without the property are not subject to this rule.
You can use property existence constraints to ensure that a property exists for all nodes with a specific
label or for all relationships with a specific type. All queries that try to create new nodes or relationships
without the property, or that try to remove the mandatory property, will fail.

Note

Property existence constraints are only available in the Neo4j Enterprise Edition. Note that
databases with property existence constraints cannot be opened using Neo4j Community
Edition.
You can have multiple constraints for a given label and you can also combine unique and property
existence constraints on the same property.
Remember that adding constraints is an atomic operation that can take a while — all existing data has
to be scanned before Neo4j can turn the constraint “on”.
Note that adding a unique property constraint on a property will also add an index on that property, so
you cannot add such an index separately. Cypher will use that index for lookups just like other indexes.
If you drop a unique property constraint and still want an index on the property, you will have to create
the index.
The existing constraints can be listed using the REST API, see Section 21.16, “Constraints” [370].
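
For example, to both require an isbn on every Book node and keep it unique, you could create two separate constraints on the same property (a sketch; run each statement on its own, and note that the existence constraint requires Enterprise Edition):
CREATE CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE
CREATE CONSTRAINT ON (book:Book) ASSERT exists(book.isbn)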

Unique node property constraints
Create uniqueness constraint
To create a constraint that makes sure that your database will never contain more than one node with a
specific label and one property value, use the IS UNIQUE syntax.
Query
CREATE CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE

Result
(empty result)
Unique constraints added: 1

Drop uniqueness constraint
By using DROP CONSTRAINT, you remove a constraint from the database.
Query
DROP CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE

Result
(empty result)
Unique constraints removed: 1

Create a node that complies with unique property constraints
Create a Book node with an isbn that isn’t already in the database.
Query
CREATE (book:Book { isbn: '1449356265', title: 'Graph Databases' })

Result
(empty result)
Nodes created: 1
Properties set: 2
Labels added: 1

Create a node that breaks a unique property constraint
Create a Book node with an isbn that is already used in the database.
Query
CREATE (book:Book { isbn: '1449356265', title: 'Graph Databases' })

In this case the node isn’t created in the graph.
Error message
Node 0 already exists with label Book and property "isbn"=[1449356265]

Failure to create a unique property constraint due to conflicting nodes
Create a unique property constraint on the property isbn on nodes with the Book label when there are
two nodes with the same isbn.
Query
CREATE CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE

In this case the constraint can’t be created because it is violated by existing data. We may choose to use
Section 14.1, “Indexes” [244] instead or remove the offending nodes and then re-apply the constraint.
Error message
Unable to create CONSTRAINT ON ( book:Book ) ASSERT book.isbn IS UNIQUE:
Multiple nodes with label `Book` have property `isbn` = '1449356265':
node(0)
node(1)

Node property existence constraints
Create node property existence constraint
To create a constraint that makes sure that all nodes with a certain label have a certain property, use
the ASSERT exists(identifier.propertyName) syntax.
Query
CREATE CONSTRAINT ON (book:Book) ASSERT exists(book.isbn)

Result
(empty result)
Property existence constraints added: 1

Drop node property existence constraint
By using DROP CONSTRAINT, you remove a constraint from the database.
Query
DROP CONSTRAINT ON (book:Book) ASSERT exists(book.isbn)

Result
(empty result)
Property existence constraints removed: 1

Create a node that complies with property existence constraints
Create a Book node with an existing isbn property.
Query
CREATE (book:Book { isbn: '1449356265', title: 'Graph Databases' })

Result
(empty result)
Nodes created: 1
Properties set: 2
Labels added: 1

Create a node that breaks a property existence constraint
Trying to create a Book node without an isbn property, given a property existence constraint on
:Book(isbn).
Query
CREATE (book:Book { title: 'Graph Databases' })

In this case the node isn’t created in the graph.
Error message
Node 1 with label "Book" must have the property "isbn" due to a constraint

Removing an existence constrained node property
Trying to remove the isbn property from an existing node book, given a property existence constraint on
:Book(isbn).
Query
MATCH (book:Book { title: 'Graph Databases' })
REMOVE book.isbn

In this case the property is not removed.
Error message
Node 0 with label "Book" must have the property "isbn" due to a constraint

Failure to create a node property existence constraint due to existing node
Create a constraint on the property isbn on nodes with the Book label when there already exists a node
without an isbn.
Query
CREATE CONSTRAINT ON (book:Book) ASSERT exists(book.isbn)

In this case the constraint can’t be created because it is violated by existing data. We may choose to
remove the offending nodes and then re-apply the constraint.
Error message
Unable to create CONSTRAINT ON ( book:Book ) ASSERT exists(book.isbn):
Node(0) with label `Book` has no value for property `isbn`

249

Schema

Relationship property existence constraints
Create relationship property existence constraint
To create a constraint that makes sure that all relationships with a certain type have a certain property,
use the ASSERT exists(identifier.propertyName) syntax.
Query
CREATE CONSTRAINT ON ()-[like:LIKED]-() ASSERT exists(like.day)

Result
(empty result)
Property existence constraints added: 1

Drop relationship property existence constraint
To remove a constraint from the database, use DROP CONSTRAINT.
Query
DROP CONSTRAINT ON ()-[like:LIKED]-() ASSERT exists(like.day)

Result
(empty result)
Property existence constraints removed: 1

Create a relationship that complies with property existence constraints
Create a LIKED relationship with an existing day property.
Query
CREATE (user:User)-[like:LIKED { day: 'yesterday' }]->(book:Book)

Result
(empty result)
Nodes created: 2
Relationships created: 1
Properties set: 1
Labels added: 2

Create a relationship that breaks a property existence constraint
Trying to create a LIKED relationship without a day property, given a property existence constraint
:LIKED(day).
Query
CREATE (user:User)-[like:LIKED]->(book:Book)

In this case the relationship isn’t created in the graph.
Error message
Relationship 1 with type "LIKED" must have the property "day" due to a constraint

Removing an existence constrained relationship property
Trying to remove the day property from an existing relationship like of type LIKED, given a property
existence constraint :LIKED(day).
Query
MATCH (user:User)-[like:LIKED]->(book:Book)
REMOVE like.day

In this case the property is not removed.
Error message
Relationship 0 with type "LIKED" must have the property "day" due to a constraint

Failure to create a relationship property existence constraint due to existing relationship
Create a constraint on the property day on relationships with the LIKED type when there already exists a
relationship without a property named day.
Query
CREATE CONSTRAINT ON ()-[like:LIKED]-() ASSERT exists(like.day)

In this case the constraint can’t be created because it is violated by existing data. We may choose to
remove the offending relationships and then re-apply the constraint.
Error message
Unable to create CONSTRAINT ON ()-[ liked:LIKED ]-() ASSERT exists(liked.day):
Relationship(0) with type `LIKED` has no value for property `day`


14.3. Statistics
When you issue a Cypher query, it gets compiled to an execution plan (see Chapter 16, Execution
Plans [259]) that can run and answer your question. To produce an efficient plan for your query,
Neo4j needs information about your database, such as the schema: what indexes and constraints exist?
Neo4j will also use statistical information it keeps about your database to optimize the execution
plan. With this information, Neo4j can decide which access pattern leads to the best performing plans.
The statistical information that Neo4j keeps is:
1. The number of nodes with a certain label.
2. Selectivity per index.
3. The number of relationships by type.
4. The number of relationships by type, ending or starting from a node with a specific label.

Neo4j keeps the statistics up to date in two different ways. For label counts for example, the number
is updated whenever you set or remove a label from a node. For indexes, Neo4j needs to scan the full
index to produce the selectivity number. Since this is potentially a very time-consuming operation,
these numbers are collected in the background when enough data on the index has been changed.

Configuration options
Execution plans are cached and will not be replanned until the statistical information used to produce
the plan has changed. The following configuration options allow you to control how sensitive
replanning should be to updates of the database.

index_background_sampling_enabled
    Controls whether indexes will automatically be re-sampled when they have been updated
    enough. The Cypher query planner depends on accurate statistics to create efficient plans,
    so it is important that they are kept up to date as the database evolves.

    Tip
    If background sampling is turned off, make sure to trigger manual sampling when data has
    been updated.

index_sampling_update_percentage
    Controls how large a portion of the index has to have been updated before a new sampling
    run is triggered.

dbms.cypher.statistics_divergence_threshold
    Controls how much the above statistical information is allowed to change before an execution
    plan is considered stale and has to be replanned. If the relative change in any of the
    statistics is larger than this threshold, the plan will be thrown away and a new one will be
    created. A threshold of 0.0 means always replan, and a value of 1.0 means never replan.

Managing statistics from the shell
Usage:
schema sample -a                      will sample all indexes.
schema sample -l Person -p name       will sample the index for label Person on property name (if existing).
schema sample -a -f                   will force a sample of all indexes.
schema sample -f -l :Person -p name   will force sampling of a specific index.

Chapter 15. Query Tuning
Neo4j works very hard to execute queries as fast as possible.
However, when optimizing for maximum query execution performance, it may be helpful to rephrase
queries using knowledge about the domain and the application.
The overall goal of manual query performance optimization is to ensure that only necessary data is
retrieved from the graph. At the very least, data should be filtered out as early as possible in order to
reduce the amount of work that has to be done at later stages of query execution. This also goes for what
gets returned: avoid returning whole nodes and relationships; instead, pick the data you need and return
only that. You should also make sure to set an upper limit on variable length patterns, so they don’t
cover larger portions of the dataset than needed.
Each Cypher query gets optimized and transformed into an execution plan by the Cypher execution
engine. To minimize the resources used for this, make sure to use parameters instead of literals when
possible. This allows Cypher to re-use your queries instead of having to parse and build new execution
plans.
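
For example, instead of embedding a literal, a sketch of the same lookup as a parameterized query (Cypher 2.3 uses the { param } placeholder syntax):
MATCH (p:Person { name: { name }})
RETURN p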
To read more about the execution plan operators mentioned in this chapter, see Chapter 16, Execution
Plans [259].


15.1. How are queries executed?
Each query is turned into an execution plan by something called the execution planner. The execution
plan tells Neo4j which operations to perform when executing the query. Two different execution
planning strategies are included in Neo4j:
Rule   This planner has rules that are used to produce execution plans. The planner considers
       available indexes, but does not use statistical information to guide the query compilation.

Cost   This planner uses the statistics service in Neo4j to assign cost to alternative plans and
       picks the cheapest one. While this should lead to superior execution plans in most cases,
       it is still under development.

By default, Neo4j 2.3.12 will use the cost planner for some queries, but not all. You can force
it to use a specific planner by using the query.planner.version configuration setting (see
dbms.cypher.planner [467]), or by prepending your query with CYPHER planner=cost or CYPHER
planner=rule. Neo4j might still not use the planner you selected — not all queries are solvable by the
cost planner at this point. Note that using PLANNER COST or PLANNER RULE in order to switch between
planners has been deprecated and will stop working in future versions.
You can see which planner was used by looking at the execution plan.
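
For example, to force the rule planner for a single query, prepend the directive like this:
CYPHER planner=rule
MATCH (p:Person)
RETURN p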

Note

When Cypher is building execution plans, it looks at the schema to see if it can find indexes
it can use. These index decisions are only valid until the schema changes, so adding or
removing indexes leads to the execution plan cache being flushed.


15.2. How do I profile a query?
There are two options to choose from when you want to analyze a query by looking at its execution
plan:
EXPLAIN   If you want to see the execution plan but not run the statement, prepend your Cypher
          statement with EXPLAIN. The statement will always return an empty result and make no
          changes to the database.

PROFILE   If you want to run the statement and see which operators are doing most of the work,
          use PROFILE. This will run your statement and keep track of how many rows pass through
          each operator, and how much each operator needs to interact with the storage layer to
          retrieve the necessary data. Please note that profiling your query uses more resources,
          so you should not profile unless you are actively working on a query.

See Chapter 16, Execution Plans [259] for a detailed explanation of each of the operators contained in
an execution plan.
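
For instance, a quick sketch of inspecting a plan without executing the statement:
EXPLAIN
MATCH (n)
RETURN n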

Tip

Being explicit about what types and labels you expect relationships and nodes to have in
your query helps Neo4j use the best possible statistical information, which leads to better
execution plans. This means that when you know that a relationship can only be of a certain
type, you should add that to the query. The same goes for labels, where declaring labels on
both the start and end nodes of a relationship helps Neo4j find the best way to execute the
statement.


15.3. Basic query tuning example
We’ll start with a basic example to help you get the hang of profiling queries. The following examples
will use a movies data set.
Let’s start by importing the data:
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.12/csv/query-tuning/movies.csv" AS line
MERGE (m:Movie { title:line.title })
ON CREATE SET m.released = toInt(line.released), m.tagline = line.tagline

LOAD CSV WITH HEADERS FROM 'http://neo4j.com/docs/2.3.12/csv/query-tuning/actors.csv' AS line
MATCH (m:Movie { title:line.title })
MERGE (p:Person { name:line.name })
ON CREATE SET p.born = toInt(line.born)
MERGE (p)-[:ACTED_IN { roles:split(line.roles,";")}]->(m)

LOAD CSV WITH HEADERS FROM 'http://neo4j.com/docs/2.3.12/csv/query-tuning/directors.csv' AS line
MATCH (m:Movie { title:line.title })
MERGE (p:Person { name:line.name })
ON CREATE SET p.born = toInt(line.born)
MERGE (p)-[:DIRECTED]->(m)

Let’s say we want to write a query to find Tom Hanks. The naive way of doing this would be to write the
following:
MATCH (p { name:"Tom Hanks" })
RETURN p

This query will find the Tom Hanks node, but as the number of nodes in the database increases it will
become slower and slower. We can profile the query to find out why that is.
You can learn more about the options for profiling queries in Section 15.2, “How do I profile a
query?” [255] but in this case we’re going to prefix our query with PROFILE:
PROFILE
MATCH (p { name:"Tom Hanks" })
RETURN p
+-----------------+----------------+------+---------+-------------+--------------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other                    |
+-----------------+----------------+------+---------+-------------+--------------------------+
| +ProduceResults |             16 |    1 |       0 | p           | p                        |
| +Filter         |             16 |    1 |     163 | p           | p.name == { AUTOSTRING0} |
| +AllNodesScan   |            163 |  163 |     164 | p           |                          |
+-----------------+----------------+------+---------+-------------+--------------------------+
Total database accesses: 327

The first thing to keep in mind when reading execution plans is that you need to read from the bottom
up.
In that vein, starting from the last row, the first thing we notice is that the value in the Rows column
seems high given there is only one node with the name property Tom Hanks in the database. If we look
across to the Operator column we’ll see that AllNodesScan has been used which means that the query
planner scanned through all the nodes in the database.
Moving up to the previous row we see the Filter operator which will check the name property on each of
the nodes passed through by AllNodesScan.
This seems like an inefficient way of finding Tom Hanks given that we are looking at many nodes that
aren’t even people and therefore aren’t what we’re looking for.
The solution to this problem is that whenever we’re looking for a node we should specify a label to help
the query planner narrow down the search space. For this query we’d need to add a Person label.
MATCH (p:Person { name:"Tom Hanks" })
RETURN p

This query will be faster than the first one, but as the number of people in our database increases we
again notice that the query slows down.
Again we can profile the query to work out why:
PROFILE
MATCH (p:Person { name:"Tom Hanks" })
RETURN p
+------------------+----------------+------+---------+-------------+--------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers | Other                    |
+------------------+----------------+------+---------+-------------+--------------------------+
| +ProduceResults  |             13 |    1 |       0 | p           | p                        |
| +Filter          |             13 |    1 |     125 | p           | p.name == { AUTOSTRING0} |
| +NodeByLabelScan |            125 |  125 |     126 | p           | :Person                  |
+------------------+----------------+------+---------+-------------+--------------------------+
Total database accesses: 251

This time the Rows value on the last row has been reduced, so we are no longer scanning some of the
nodes that we were before, which is a good start. The NodeByLabelScan operator indicates that we
achieved this by first doing a linear scan of all the Person nodes in the database.
Once we’ve done that, we again scan through all those nodes using the Filter operator, comparing the
name property of each one.
This might be acceptable in some cases but if we’re going to be looking up people by name frequently
then we’ll see better performance if we create an index on the name property for the Person label:
CREATE INDEX ON :Person(name)

Now if we run the query again it will run more quickly:
MATCH (p:Person { name:"Tom Hanks" })
RETURN p

Let’s profile the query to see why that is:
PROFILE
MATCH (p:Person { name:"Tom Hanks" })
RETURN p
+-----------------+----------------+------+---------+-------------+---------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other         |
+-----------------+----------------+------+---------+-------------+---------------+
| +ProduceResults |              1 |    1 |       0 | p           | p             |
| +NodeIndexSeek  |              1 |    1 |       2 | p           | :Person(name) |
+-----------------+----------------+------+---------+-------------+---------------+
Total database accesses: 2

Our execution plan is down to a single row and uses the Node Index Seek operator which does a
schema index seek (see Section 14.1, “Indexes” [244]) to find the appropriate node.
Chapter 16. Execution Plans
Neo4j breaks down the work of executing a query into small pieces called operators. Each operator is
responsible for a small part of the overall query. The operators are connected together in a pattern
called an execution plan.
Each operator is annotated with statistics.
Rows            The number of rows that the operator produced. Only available if the query was
                profiled.

EstimatedRows   If Neo4j used the cost-based compiler you will see the estimated number of rows
                that will be produced by the operator. The compiler uses this estimate to choose
                a suitable execution plan.

DbHits          Each operator will ask the Neo4j storage engine to do work such as retrieving or
                updating data. A database hit is an abstract unit of this storage engine work.

See Section 15.2, “How do I profile a query?” [255] for how to view the execution plan for your query.
For a deeper understanding of how each operator works, see the relevant section. Operators are
grouped into high-level categories. Please remember that the statistics of the actual database the
queries are run on will decide the plan used. There is no guarantee that a specific query will always be
solved with the same plan.


16.1. Starting point operators
These operators find parts of the graph from which to start.

All Nodes Scan
Reads all nodes from the node store. The identifier that will contain the nodes is seen in the arguments.
If your query is using this operator, you are very likely to see performance problems on any non-trivial
database.
Query
MATCH (n)
RETURN n

Query Plan
+-----------------+----------------+------+---------+-------------+-------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other |
+-----------------+----------------+------+---------+-------------+-------+
| +ProduceResults |             35 |   35 |       0 | n           | n     |
| +AllNodesScan   |             35 |   35 |      36 | n           |       |
+-----------------+----------------+------+---------+-------------+-------+
Total database accesses: 36

Directed Relationship By Id Seek
Reads one or more relationships by id from the relationship store. Produces both the relationship and
the nodes on either side.
Query
MATCH (n1)-[r]->()
WHERE id(r)= 0
RETURN r, n1

Query Plan
+-----------------------------------+----------------+------+---------+-----------------+--------------------------------------------+
| Operator                          | Estimated Rows | Rows | DB Hits | Identifiers     | Other                                      |
+-----------------------------------+----------------+------+---------+-----------------+--------------------------------------------+
| +ProduceResults                   |              1 |    1 |       0 | n1, r           | r, n1                                      |
| +DirectedRelationshipByIdSeekPipe |              1 |    1 |       1 | anon[17], n1, r | EntityByIdRhs(SingleSeekArg({ AUTOINT0})) |
+-----------------------------------+----------------+------+---------+-----------------+--------------------------------------------+
Total database accesses: 1

Node by Id seek
Reads one or more nodes by id from the node store.
Query
MATCH (n)
WHERE id(n)= 0
RETURN n

Query Plan
+-----------------+----------------+------+---------+-------------+-------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other |
+-----------------+----------------+------+---------+-------------+-------+
| +ProduceResults |              1 |    1 |       0 | n           | n     |
| +NodeByIdSeek   |              1 |    1 |       1 | n           |       |
+-----------------+----------------+------+---------+-------------+-------+
Total database accesses: 1

Node by label scan
Fetches all nodes with a specific label on them from the node label index.
Query
MATCH (person:Person)
RETURN person

Query Plan
+------------------+----------------+------+---------+-------------+---------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers | Other   |
+------------------+----------------+------+---------+-------------+---------+
| +ProduceResults  |             14 |   14 |       0 | person      | person  |
| +NodeByLabelScan |             14 |   14 |      15 | person      | :Person |
+------------------+----------------+------+---------+-------------+---------+
Total database accesses: 15

Node index seek
Finds nodes using an index seek. The node identifier and the index used are shown in the arguments of
the operator. If the index is a unique index, the operator is called NodeUniqueIndexSeek instead.
Query
MATCH (location:Location { name: "Malmo" })
RETURN location

Query Plan
+-----------------+----------------+------+---------+-------------+-----------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other           |
+-----------------+----------------+------+---------+-------------+-----------------+
| +ProduceResults |              1 |    1 |       0 | location    | location        |
| +NodeIndexSeek  |              1 |    1 |       2 | location    | :Location(name) |
+-----------------+----------------+------+---------+-------------+-----------------+
Total database accesses: 2
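
As a sketch, if the name property were backed by a uniqueness constraint instead (see Section 14.2, “Constraints” [247]), the same MATCH would be solved by the NodeUniqueIndexSeek operator:
CREATE CONSTRAINT ON (location:Location) ASSERT location.name IS UNIQUE

MATCH (location:Location { name: "Malmo" })
RETURN location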

Node index range seek
Finds nodes using an index seek where the value of the property matches a given prefix string. This
operator can be used for STARTS WITH and comparators such as <, >, <= and >=.
Query
MATCH (l:Location)
WHERE l.name STARTS WITH 'Lon'
RETURN l

Query Plan
+-----------------------+----------------+------+---------+-------------+---------------------------------------------+
| Operator              | Estimated Rows | Rows | DB Hits | Identifiers | Other                                       |
+-----------------------+----------------+------+---------+-------------+---------------------------------------------+
| +ProduceResults       |             26 |    1 |       0 | l           | l                                           |
| +NodeIndexSeekByRange |             26 |    1 |       2 | l           | :Location(name STARTS WITH { AUTOSTRING0}) |
+-----------------------+----------------+------+---------+-------------+---------------------------------------------+
Total database accesses: 2

Node index scan
An index scan goes through all values stored in an index, and can be used to find all nodes with a
particular label having a specified property (e.g. exists(n.prop)).
Query
MATCH (l:Location)
WHERE HAS (l.name)
RETURN l

Query Plan
+-----------------+----------------+------+---------+-------------+-----------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other           |
+-----------------+----------------+------+---------+-------------+-----------------+
| +ProduceResults |             10 |   10 |       0 | l           | l               |
| +NodeIndexScan  |             10 |   10 |      11 | l           | :Location(name) |
+-----------------+----------------+------+---------+-------------+-----------------+
Total database accesses: 11

Undirected Relationship By Id Seek
Reads one or more relationships by id from the relationship store. For each relationship, two rows are
produced with start and end nodes arranged differently.
Query
MATCH (n1)-[r]-()
WHERE id(r)= 1
RETURN r, n1

Query Plan
+---------------------------------+----------------+------+---------+-----------------+-------+
| Operator                        | Estimated Rows | Rows | DB Hits | Identifiers     | Other |
+---------------------------------+----------------+------+---------+-----------------+-------+
| +ProduceResults                 |              1 |    2 |       0 | n1, r           | r, n1 |
| +UndirectedRelationshipByIdSeek |              1 |    2 |       1 | anon[16], n1, r |       |
+---------------------------------+----------------+------+---------+-----------------+-------+
Total database accesses: 1


16.2. Expand operators
These operators explore the graph by expanding graph patterns.

Expand All
Given a start node, expand-all will follow relationships coming in or out, depending on the pattern
relationship. Can also handle variable length pattern relationships.
Query
MATCH (p:Person { name: "me" })-[:FRIENDS_WITH]->(fof)
RETURN fof

Query Plan
+-----------------+----------------+------+---------+------------------+----------------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers      | Other                      |
+-----------------+----------------+------+---------+------------------+----------------------------+
| +ProduceResults |              0 |    1 |       0 | fof              | fof                        |
| +Expand(All)    |              0 |    1 |       2 | anon[30], fof, p | (p)-[:FRIENDS_WITH]->(fof) |
| +NodeIndexSeek  |              1 |    1 |       2 | p                | :Person(name)              |
+-----------------+----------------+------+---------+------------------+----------------------------+
Total database accesses: 4
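
Expand-all also covers variable length pattern relationships; as a sketch, the following query would expand one or two FRIENDS_WITH hops from the start node (variable length patterns are solved by the related VarLengthExpand(All) operator):
MATCH (p:Person { name: "me" })-[:FRIENDS_WITH*1..2]->(fof)
RETURN fof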

Expand Into
When both the start and end node have already been found, expand-into is used to find all connecting
relationships between the two nodes.
Query
MATCH (p:Person { name: "me" })-[:FRIENDS_WITH]->(fof)-->(p)
RETURN fof

Query Plan
+-----------------+----------------+------+---------+----------------------------+----------------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers                | Other                      |
+-----------------+----------------+------+---------+----------------------------+----------------------------+
| +ProduceResults |              0 |    0 |       0 | fof                        | fof                        |
| +Filter         |              0 |    0 |       0 | anon[30], anon[53], fof, p | NOT(anon[30] == anon[53])  |
| +Expand(Into)   |              0 |    0 |       0 | anon[30], anon[53], fof, p | (p)-[:FRIENDS_WITH]->(fof) |
| +Expand(All)    |              0 |    0 |       1 | anon[53], fof, p           | (p)<--(fof)                |
| +NodeIndexSeek  |              1 |    1 |       2 | p                          | :Person(name)              |
+-----------------+----------------+------+---------+----------------------------+----------------------------+
Total database accesses: 3

Optional Expand All
Optional expand traverses relationships from a given node, and makes sure that predicates are
evaluated before producing rows.
If no matching relationships are found, a single row with NULL for the relationship and end node
identifier is produced.
Query
MATCH (p:Person)
OPTIONAL MATCH (p)-[works_in:WORKS_IN]->(l)
WHERE works_in.duration > 180
RETURN p, l

Query Plan
+----------------------+----------------+------+---------+----------------+------------------------------+
| Operator             | Estimated Rows | Rows | DB Hits | Identifiers    | Other                        |
+----------------------+----------------+------+---------+----------------+------------------------------+
| +ProduceResults      |             14 |   15 |       0 | l, p           | p, l                         |
| +OptionalExpand(All) |             14 |   15 |      44 | l, p, works_in | (p)-[works_in:WORKS_IN]->(l) |
| +NodeByLabelScan     |             14 |   14 |      15 | p              | :Person                      |
+----------------------+----------------+------+---------+----------------+------------------------------+
Total database accesses: 59


16.3. Combining operators
Node Hash Join
Using a hash table, a node hash join joins the inputs coming from the left with the inputs coming from
the right. The join key is specified in the arguments of the operator.
Query
MATCH (andy:Person { name:'Andreas' })-[:WORKS_IN]->(loc)<-[:WORKS_IN]-(matt:Person { name:'Mattis' })
RETURN loc

Query Plan
+--------------------+----------------+------+---------+-------------------------------------+---------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers                         | Other                     |
+--------------------+----------------+------+---------+-------------------------------------+---------------------------+
| +ProduceResults    |             35 |    0 |       0 | loc                                 | loc                       |
| +Filter            |             35 |    0 |       0 | anon[37], anon[56], andy, loc, matt | NOT(anon[37] == anon[56]) |
| +NodeHashJoin      |             35 |    0 |       0 | anon[37], anon[56], andy, loc, matt | loc                       |
| |\                 |                |      |         |                                     |                           |
| | +Expand(All)     |             35 |    0 |       0 | anon[56], loc, matt                 | (matt)-[:WORKS_IN]->(loc) |
| | +NodeIndexSeek   |              1 |    0 |       1 | matt                                | :Person(name)             |
| +Expand(All)       |             35 |    0 |       1 | anon[37], andy, loc                 | (andy)-[:WORKS_IN]->(loc) |
| +NodeIndexSeek     |              1 |    1 |       2 | andy                                | :Person(name)             |
+--------------------+----------------+------+---------+-------------------------------------+---------------------------+
Total database accesses: 4

Apply
Apply works by performing a nested loop. Every row being produced on the left hand side of the Apply
operator will be fed to the Argument operator on the right hand side, and then Apply will yield the
results coming from the RHS. Apply, being a nested loop, can be seen as a warning that a better plan
was not found.
Query
MATCH (p:Person)-[:FRIENDS_WITH]->(f)
WITH p, count(f) AS fs
WHERE fs > 0
OPTIONAL MATCH (p)-[:WORKS_IN*1..2]->(city)
RETURN p, city

Query Plan
+---------------------------+----------------+------+---------+----------------------------------+--------------------------+
| Operator                  | Estimated Rows | Rows | DB Hits | Identifiers                      | Other                    |
+---------------------------+----------------+------+---------+----------------------------------+--------------------------+
| +ProduceResults           |              1 |    2 |       0 | city, p                          | p, city                  |
| +Apply                    |              1 |    2 |       0 | anon[92], anon[126], city, fs, p |                          |
| |\                        |                |      |         |                                  |                          |
| | +Apply                  |              1 |    2 |       0 | anon[92], anon[126], city, fs, p |                          |
| | |\                      |                |      |         |                                  |                          |
| | | +Optional             |              1 |    2 |       0 | anon[126], city, p               |                          |
| | | +VarLengthExpand(All) |              1 |    2 |       6 | anon[126], city, p               | (p)-[:WORKS_IN*]->(city) |
| | | +Argument             |              1 |    2 |       0 | p                                |                          |
| | +Filter                 |              1 |    2 |       0 | anon[92], fs, p                  | anon[92]                 |
| | +Argument               |              1 |    2 |       0 | anon[92], fs, p                  |                          |
| +Projection               |              1 |    2 |       0 | anon[92], fs, p                  | p; fs; fs > { AUTOINT0}  |
| +EagerAggregation         |              1 |    2 |       0 | fs, p                            | p                        |
| +Expand(All)              |              2 |    2 |      16 | anon[17], f, p                   | (p)-[:FRIENDS_WITH]->(f) |
| +NodeByLabelScan          |             14 |   14 |      15 | p                                | :Person                  |
+---------------------------+----------------+------+---------+----------------------------------+--------------------------+
Total database accesses: 37

Anti Semi Apply
Tests for the absence of a pattern predicate. A pattern predicate that is prepended by NOT is solved with
AntiSemiApply.
Query
MATCH (me:Person { name: "me" }),(other:Person)
WHERE NOT (me)-[:FRIENDS_WITH]->(other)
RETURN other

Query Plan
+--------------------+----------------+------+---------+---------------------+-------------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers         | Other                         |
+--------------------+----------------+------+---------+---------------------+-------------------------------+
| +ProduceResults    |              4 |   13 |       0 | other               | other                         |
| +AntiSemiApply     |              4 |   13 |       0 | me, other           |                               |
| |\                 |                |      |         |                     |                               |
| | +Expand(Into)    |              0 |    0 |      50 | anon[73], me, other | (me)-[:FRIENDS_WITH]->(other) |
| | +Argument        |             14 |   14 |       0 | me, other           |                               |
| +CartesianProduct  |             14 |   14 |       0 | me, other           |                               |
| |\                 |                |      |         |                     |                               |
| | +NodeByLabelScan |             14 |   14 |      15 | other               | :Person                       |
| +NodeIndexSeek     |              1 |    1 |       2 | me                  | :Person(name)                 |
+--------------------+----------------+------+---------+---------------------+-------------------------------+
Total database accesses: 67

Let Anti Semi Apply
Tests for the absence of a pattern predicate. When a query contains multiple pattern predicates
LetSemiApply will be used to evaluate the first of these. It will record the result of evaluating the
predicate but will leave any filtering to another operator. The following query will find all the people
who don’t have anyfriend or who work somewhere. The LetSemiApply operator will be used to check for
the absence of the FRIENDS_WITH relationship from each person.
Query
MATCH (other:Person)
WHERE NOT ((other)-[:FRIENDS_WITH]->()) OR (other)-[:WORKS_IN]->()
RETURN other

Query Plan
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers               | Other                       |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| +ProduceResults    |             11 |   14 |       0 | other                     | other                       |
| +SelectOrSemiApply |             11 |   14 |       0 | anon[42], other           | anon[42]                    |
| |\                 |                |      |         |                           |                             |
| | +Expand(All)     |             15 |    0 |       2 | anon[82], anon[96], other | (other)-[:WORKS_IN]->()     |
| | +Argument        |             14 |    2 |       0 | other                     |                             |
| +LetAntiSemiApply  |             14 |   14 |       0 | anon[42], other           |                             |
| |\                 |                |      |         |                           |                             |
| | +Expand(All)     |              2 |    0 |      14 | anon[50], anon[68], other | (other)-[:FRIENDS_WITH]->() |
| | +Argument        |             14 |   14 |       0 | other                     |                             |
| +NodeByLabelScan   |             14 |   14 |      15 | other                     | :Person                     |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
Total database accesses: 31

Let Semi Apply
Tests for the existence of a pattern predicate. When a query contains multiple pattern predicates
LetSemiApply will be used to evaluate the first of these. It will record the result of evaluating the
predicate but will leave any filtering to a another operator. The following query will find all the people
who have a friend or who work somewhere. The LetSemiApply operator will be used to check for the
existence of the FRIENDS_WITH relationship from each person.
Query
MATCH (other:Person)
WHERE (other)-[:FRIENDS_WITH]->() OR (other)-[:WORKS_IN]->()
RETURN other

Query Plan
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers               | Other                       |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| +ProduceResults    |             13 |   14 |       0 | other                     | other                       |
| +SelectOrSemiApply |             13 |   14 |       0 | anon[38], other           | anon[38]                    |
| |\                 |                |      |         |                           |                             |
| | +Expand(All)     |             15 |    0 |      12 | anon[77], anon[91], other | (other)-[:WORKS_IN]->()     |
| | +Argument        |             14 |   12 |       0 | other                     |                             |
| +LetSemiApply      |             14 |   14 |       0 | anon[38], other           |                             |
| |\                 |                |      |         |                           |                             |
| | +Expand(All)     |              2 |    0 |      14 | anon[46], anon[64], other | (other)-[:FRIENDS_WITH]->() |
| | +Argument        |             14 |   14 |       0 | other                     |                             |
| +NodeByLabelScan   |             14 |   14 |      15 | other                     | :Person                     |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
Total database accesses: 41

Select Or Anti Semi Apply
Tests for the absence of a pattern predicate and evaluates a predicate.
Query
MATCH (other:Person)
WHERE other.age > 25 OR NOT (other)-[:FRIENDS_WITH]->()
RETURN other

Query Plan
+------------------------+----------------+------+---------+---------------------------+-----------------------------+
| Operator               | Estimated Rows | Rows | DB Hits | Identifiers               | Other                       |
+------------------------+----------------+------+---------+---------------------------+-----------------------------+
| +ProduceResults        |              4 |   12 |       0 | other                     | other                       |
| +SelectOrAntiSemiApply |              4 |   12 |      28 | other                     | other.age > { AUTOINT0}     |
| |\                     |                |      |         |                           |                             |
| | +Expand(All)         |              2 |    0 |      14 | anon[68], anon[86], other | (other)-[:FRIENDS_WITH]->() |
| | +Argument            |             14 |   14 |       0 | other                     |                             |
| +NodeByLabelScan       |             14 |   14 |      15 | other                     | :Person                     |
+------------------------+----------------+------+---------+---------------------------+-----------------------------+
Total database accesses: 57

Select Or Semi Apply
Tests for the existence of a pattern predicate and evaluates a predicate. This operator allows for the
mixing of normal predicates and pattern predicates that check for the existence of a pattern. First the
normal expression predicate is evaluated, and only if it returns FALSE is the costly pattern predicate
evaluation performed.
Query
MATCH (other:Person)
WHERE other.age > 25 OR (other)-[:FRIENDS_WITH]->()
RETURN other

Query Plan
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers               | Other                       |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
| +ProduceResults    |             11 |    2 |       0 | other                     | other                       |
| +SelectOrSemiApply |             11 |    2 |      28 | other                     | other.age > { AUTOINT0}     |
| |\                 |                |      |         |                           |                             |
| | +Expand(All)     |              2 |    0 |      14 | anon[64], anon[82], other | (other)-[:FRIENDS_WITH]->() |
| | +Argument        |             14 |   14 |       0 | other                     |                             |
| +NodeByLabelScan   |             14 |   14 |      15 | other                     | :Person                     |
+--------------------+----------------+------+---------+---------------------------+-----------------------------+
Total database accesses: 57

Semi Apply
Tests for the existence of a pattern predicate. SemiApply takes a row from its child operator and feeds
it to the Argument operator on the right hand side of SemiApply. If the right hand side operator tree
yields at least one row, the row from the left hand side is yielded by the SemiApply operator. This makes
SemiApply a filtering operator, used mostly for pattern predicates in queries.
Query
MATCH (other:Person)
WHERE (other)-[:FRIENDS_WITH]->()
RETURN other

Query Plan
+------------------+----------------+------+---------+---------------------------+-----------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers               | Other                       |
+------------------+----------------+------+---------+---------------------------+-----------------------------+
| +ProduceResults  |             11 |    2 |       0 | other                     | other                       |
| +SemiApply       |             11 |    2 |       0 | other                     |                             |
| |\               |                |      |         |                           |                             |
| | +Expand(All)   |              2 |    0 |      14 | anon[46], anon[64], other | (other)-[:FRIENDS_WITH]->() |
| | +Argument      |             14 |   14 |       0 | other                     |                             |
| +NodeByLabelScan |             14 |   14 |      15 | other                     | :Person                     |
+------------------+----------------+------+---------+---------------------------+-----------------------------+
Total database accesses: 29

Triadic
Triadic is used to solve triangular queries, such as the very common "find my friend-of-friends that are
not already my friends". It does so by putting all the "friends" in a set, and using that set to check
whether the friend-of-friends are already connected to me.
Query
MATCH (me:Person)-[:FRIENDS_WITH]-()-[:FRIENDS_WITH]-(other)
WHERE NOT (me)-[:FRIENDS_WITH]-(other)
RETURN other

Query Plan
+-------------------+----------------+------+---------+-----------------------------------------+----------------------------+
| Operator          | Estimated Rows | Rows | DB Hits | Identifiers                             | Other                      |
+-------------------+----------------+------+---------+-----------------------------------------+----------------------------+
| +ProduceResults   |              0 |    2 |       0 | other                                   | other                      |
| +TriadicSelection |              0 |    2 |       0 | anon[18], anon[35], anon[37], me, other | me, anon[35], other        |
| |\                |                |      |         |                                         |                            |
| | +Filter         |              0 |    2 |       0 | anon[18], anon[35], anon[37], me, other | NOT(anon[18] == anon[37])  |
| | +Expand(All)    |              0 |    6 |      10 | anon[18], anon[35], anon[37], me, other | ()-[:FRIENDS_WITH]-(other) |
| | +Argument       |              4 |    4 |       0 | anon[18], anon[35], me                  |                            |
| +Expand(All)      |              4 |    4 |      18 | anon[18], anon[35], me                  | (me)-[:FRIENDS_WITH]-()    |
| +NodeByLabelScan  |             14 |   14 |      15 | me                                      | :Person                    |
+-------------------+----------------+------+---------+-----------------------------------------+----------------------------+
Total database accesses: 43


16.4. Row operators
These operators take rows produced by another operator and transform them to a different set of rows.

Eager
For isolation purposes this operator makes sure that operations affecting subsequent operations are
executed fully for the whole dataset before execution continues. Otherwise it could trigger endless
loops by matching data that was just created. The Eager operator can cause high memory usage
when importing data or migrating graph structures. In such cases, split up your operations into simpler
steps, e.g. import nodes and relationships separately. Alternatively, return the records to be
updated and run an update statement afterwards.
Query
MATCH (p:Person)
MERGE (:Person:Clone { name:p.name })

Query Plan
+--------------+------+---------+-------------+----------------------------------+
| Operator     | Rows | DB Hits | Identifiers | Other                            |
+--------------+------+---------+-------------+----------------------------------+
| +EmptyResult |    0 |       0 |             |                                  |
| +UpdateGraph |   14 |     195 | anon[23], p | MergeNode; p.name; :Person(name) |
| +Eager       |   14 |       0 | p           |                                  |
| +NodeByLabel |   14 |      15 | p           | :Person                          |
+--------------+------+---------+-------------+----------------------------------+
Total database accesses: 210
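
As a sketch of the advice to split operations into simpler steps, the clones could be created in a second statement driven by a parameter; here { names } is a hypothetical parameter holding the names collected by the first statement:
MATCH (p:Person)
RETURN collect(p.name) AS names

UNWIND { names } AS name
MERGE (:Person:Clone { name: name })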

Distinct
Removes duplicate rows from the incoming stream of rows.
Query
MATCH (l:Location)<-[:WORKS_IN]-(p:Person)
RETURN DISTINCT l

Query Plan
+------------------+----------------+------+---------+----------------+----------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers    | Other                |
+------------------+----------------+------+---------+----------------+----------------------+
| +ProduceResults  |             14 |    6 |       0 | l              | l                    |
| +Distinct        |             14 |    6 |       0 | l              | l                    |
| +Filter          |             15 |   15 |      15 | anon[19], l, p | p:Person             |
| +Expand(All)     |             15 |   15 |      25 | anon[19], l, p | (l)<-[:WORKS_IN]-(p) |
| +NodeByLabelScan |             10 |   10 |      11 | l              | :Location            |
+------------------+----------------+------+---------+----------------+----------------------+
Total database accesses: 51

Eager Aggregation
Eagerly loads the underlying results and stores them in a hash map, using the grouping keys as the keys
for the map.
Query
MATCH (l:Location)<-[:WORKS_IN]-(p:Person)
RETURN l.name AS location, COLLECT(p.name) AS people

Query Plan
+-------------------+----------------+------+---------+--------------------------+----------------------+
| Operator          | Estimated Rows | Rows | DB Hits | Identifiers              | Other                |
+-------------------+----------------+------+---------+--------------------------+----------------------+
| +ProduceResults   |              4 |    6 |       0 | location, people         | location, people     |
| |                 +----------------+------+---------+--------------------------+----------------------+
| +EagerAggregation |              4 |    6 |      15 | location, people         | location             |
| |                 +----------------+------+---------+--------------------------+----------------------+
| +Projection       |             15 |   15 |      15 | anon[19], l, location, p | l.name; p            |
| |                 +----------------+------+---------+--------------------------+----------------------+
| +Filter           |             15 |   15 |      15 | anon[19], l, p           | p:Person             |
| |                 +----------------+------+---------+--------------------------+----------------------+
| +Expand(All)      |             15 |   15 |      25 | anon[19], l, p           | (l)<-[:WORKS_IN]-(p) |
| |                 +----------------+------+---------+--------------------------+----------------------+
| +NodeByLabelScan  |             10 |   10 |      11 | l                        | :Location            |
+-------------------+----------------+------+---------+--------------------------+----------------------+
Total database accesses: 81

Filter
Filters each row coming from the child operator, only passing through rows that evaluate the predicates
to TRUE.
Query
MATCH (p:Person)
WHERE p.name =~ "^a.*"
RETURN p

Query Plan
+------------------+----------------+------+---------+-------------+----------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers | Other                      |
+------------------+----------------+------+---------+-------------+----------------------------+
| +ProduceResults  |             14 |    0 |       0 | p           | p                          |
| |                +----------------+------+---------+-------------+----------------------------+
| +Filter          |             14 |    0 |      14 | p           | p.name ~= /{ AUTOSTRING0}/ |
| |                +----------------+------+---------+-------------+----------------------------+
| +NodeByLabelScan |             14 |   14 |      15 | p           | :Person                    |
+------------------+----------------+------+---------+-------------+----------------------------+
Total database accesses: 29

Limit
Returns the first n rows from the incoming input.
Query
MATCH (p:Person)
RETURN p
LIMIT 3

Query Plan
+------------------+----------------+------+---------+-------------+------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers | Other      |
+------------------+----------------+------+---------+-------------+------------+
| +ProduceResults  |              3 |    3 |       0 | p           | p          |
| |                +----------------+------+---------+-------------+------------+
| +Limit           |              3 |    3 |       0 | p           | Literal(3) |
| |                +----------------+------+---------+-------------+------------+
| +NodeByLabelScan |             14 |    3 |       4 | p           | :Person    |
+------------------+----------------+------+---------+-------------+------------+
Total database accesses: 4

Projection
For each row from its input, projection evaluates a set of expressions and produces a row with the
results of the expressions.
Query
RETURN "hello" AS greeting

Query Plan
+-----------------+----------------+------+---------+-------------+-----------------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other           |
+-----------------+----------------+------+---------+-------------+-----------------+
| +ProduceResults |              1 |    1 |       0 | greeting    | greeting        |
| |               +----------------+------+---------+-------------+-----------------+
| +Projection     |              1 |    1 |       0 | greeting    | { AUTOSTRING0}  |
+-----------------+----------------+------+---------+-------------+-----------------+
Total database accesses: 0

Skip
Skips n rows from the incoming rows.
Query
MATCH (p:Person)
RETURN p
ORDER BY p.id
SKIP 1

Query Plan
+------------------+----------------+------+---------+--------------------------+-----------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers              | Other                 |
+------------------+----------------+------+---------+--------------------------+-----------------------+
| +ProduceResults  |             14 |   13 |       0 | p                        | p                     |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +Projection      |             14 |   13 |       0 | anon[35], anon[59], p, p | anon[35]              |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +Skip            |             14 |   13 |       0 | anon[35], anon[59], p    | { AUTOINT0}           |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +Sort            |             14 |   14 |       0 | anon[35], anon[59], p    | anon[59]              |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +Projection      |             14 |   14 |      28 | anon[35], anon[59], p    | anon[35]; anon[35].id |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +Projection      |             14 |   14 |       0 | anon[35], p              | p                     |
| |                +----------------+------+---------+--------------------------+-----------------------+
| +NodeByLabelScan |             14 |   14 |      15 | p                        | :Person               |
+------------------+----------------+------+---------+--------------------------+-----------------------+
Total database accesses: 43

Sort
Sorts rows by a provided key.
Query
MATCH (p:Person)
RETURN p
ORDER BY p.name

Query Plan
+------------------+----------------+------+---------+--------------------------+-------------------------+
| Operator
| Estimated Rows | Rows | DB Hits | Identifiers
| Other
|
+------------------+----------------+------+---------+--------------------------+-------------------------+
| +ProduceResults |
14 |
14 |
0 | p
| p
|
| |
+----------------+------+---------+--------------------------+-------------------------+
| +Projection
|
14 |
14 |
0 | anon[24], anon[37], p, p | anon[24]
|
| |
+----------------+------+---------+--------------------------+-------------------------+
| +Sort
|
14 |
14 |
0 | anon[24], anon[37], p
| anon[37]
|
| |
+----------------+------+---------+--------------------------+-------------------------+
| +Projection
|
14 |
14 |
14 | anon[24], anon[37], p
| anon[24]; anon[24].name |
| |
+----------------+------+---------+--------------------------+-------------------------+
| +Projection
|
14 |
14 |
0 | anon[24], p
| p
|
| |
+----------------+------+---------+--------------------------+-------------------------+
| +NodeByLabelScan |
14 |
14 |
15 | p
| :Person
|
+------------------+----------------+------+---------+--------------------------+-------------------------+
Total database accesses: 29

Top
Returns the first n rows sorted by a provided key. The physical operator is called Top. Instead of sorting
the whole input, only the top n rows are kept.
Query
MATCH (p:Person)
RETURN p
ORDER BY p.name
LIMIT 2

Query Plan
+------------------+----------------+------+---------+--------------------------+-------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Identifiers              | Other                   |
+------------------+----------------+------+---------+--------------------------+-------------------------+
| +ProduceResults  |              2 |    2 |       0 | p                        | p                       |
| |                +----------------+------+---------+--------------------------+-------------------------+
| +Projection      |              2 |    2 |       0 | anon[24], anon[37], p, p | anon[24]                |
| |                +----------------+------+---------+--------------------------+-------------------------+
| +Top             |              2 |    2 |       0 | anon[24], anon[37], p    | Literal(2);             |
| |                +----------------+------+---------+--------------------------+-------------------------+
| +Projection      |             14 |   14 |      14 | anon[24], anon[37], p    | anon[24]; anon[24].name |
| |                +----------------+------+---------+--------------------------+-------------------------+
| +Projection      |             14 |   14 |       0 | anon[24], p              | p                       |
| |                +----------------+------+---------+--------------------------+-------------------------+
| +NodeByLabelScan |             14 |   14 |      15 | p                        | :Person                 |
+------------------+----------------+------+---------+--------------------------+-------------------------+
Total database accesses: 29

Union
Union concatenates the results from the right plan after the results of the left plan.
Query
MATCH (p:Location)
RETURN p.name
UNION ALL MATCH (p:Country)
RETURN p.name

Query Plan
+--------------------+----------------+------+---------+-------------+-----------+
| Operator           | Estimated Rows | Rows | DB Hits | Identifiers | Other     |
+--------------------+----------------+------+---------+-------------+-----------+
| +ProduceResults    |             10 |   11 |       0 | p.name      | p.name    |
| |                  +----------------+------+---------+-------------+-----------+
| +Union             |             10 |   11 |       0 | p.name      |           |
| |\                 +----------------+------+---------+-------------+-----------+
| | +Projection      |              1 |    1 |       1 | p, p.name   | p.name    |
| | |                +----------------+------+---------+-------------+-----------+
| | +NodeByLabelScan |              1 |    1 |       2 | p           | :Country  |
| |                  +----------------+------+---------+-------------+-----------+
| +Projection        |             10 |   10 |      10 | p, p.name   | p.name    |
| |                  +----------------+------+---------+-------------+-----------+
| +NodeByLabelScan   |             10 |   10 |      11 | p           | :Location |
+--------------------+----------------+------+---------+-------------+-----------+
Total database accesses: 24

Unwind
Takes a collection of values and returns one row per item in the collection.
Query
UNWIND range(1,5) AS value
RETURN value;

Query Plan
+-----------------+----------------+------+---------+-------------+-------+
| Operator        | Estimated Rows | Rows | DB Hits | Identifiers | Other |
+-----------------+----------------+------+---------+-------------+-------+
| +ProduceResults |             10 |    5 |       0 | value       | value |
| |               +----------------+------+---------+-------------+-------+
| +UNWIND         |             10 |    5 |       0 | value       |       |
| |               +----------------+------+---------+-------------+-------+
| +Argument       |              1 |    1 |       0 |             |       |
+-----------------+----------------+------+---------+-------------+-------+
Total database accesses: 0


16.5. Update Operators
These operators are used in queries that update the graph.

Constraint Operation
Creates a constraint on a (label,property) pair. The following query will create a unique constraint on
the name property of nodes with the Country label.
Query
CREATE CONSTRAINT ON (c:Country) ASSERT c.name IS UNIQUE

Query Plan
+----------------------+------+---------+
| Operator             | Rows | DB Hits |
+----------------------+------+---------+
| +ConstraintOperation |    0 |       3 |
+----------------------+------+---------+
Total database accesses: 3

Empty Result
Eagerly loads everything coming in to the EmptyResult operator and discards it.
Query
CREATE (:Person)

Query Plan
+--------------+------+---------+-------------+------------+
| Operator     | Rows | DB Hits | Identifiers | Other      |
+--------------+------+---------+-------------+------------+
| +EmptyResult |    0 |       0 |             |            |
| |            +------+---------+-------------+------------+
| +UpdateGraph |    1 |       2 | anon[7]     | CreateNode |
+--------------+------+---------+-------------+------------+
Total database accesses: 2

Update Graph
Applies updates to the graph.
Query
CREATE (:Person { name: "Alistair" })

Query Plan
+--------------+------+---------+-------------+------------+
| Operator     | Rows | DB Hits | Identifiers | Other      |
+--------------+------+---------+-------------+------------+
| +EmptyResult |    0 |       0 |             |            |
| |            +------+---------+-------------+------------+
| +UpdateGraph |    1 |       4 | anon[7]     | CreateNode |
+--------------+------+---------+-------------+------------+
Total database accesses: 4

Merge Into
When both the start and end node have already been found, merge-into is used to find all connecting
relationships, or to create a new relationship between the two nodes.
Query
MATCH (p:Person { name: "me" }),(f:Person { name: "Andres" })
MERGE (p)-[:FRIENDS_WITH]->(f)

Query Plan
+--------------+------+---------+----------------+--------------------------------+
| Operator     | Rows | DB Hits | Identifiers    | Other                          |
+--------------+------+---------+----------------+--------------------------------+
| +EmptyResult |    0 |       0 |                |                                |
| |            +------+---------+----------------+--------------------------------+
| +Merge(Into) |    1 |       5 | anon[68], f, p | (p)-[:FRIENDS_WITH]->(f)       |
| |            +------+---------+----------------+--------------------------------+
| +SchemaIndex |    1 |       2 | f, p           | { AUTOSTRING1}; :Person(name)  |
| |            +------+---------+----------------+--------------------------------+
| +SchemaIndex |    1 |       2 | p              | { AUTOSTRING0}; :Person(name)  |
+--------------+------+---------+----------------+--------------------------------+
Total database accesses: 9


Part IV. Reference

The reference part is the authoritative source for details on Neo4j usage. It covers details on
capabilities, transactions, indexing and queries among other topics.

17. Capabilities ....................................................................................................................................... 279
	17.1. Data Security ............................................................................................................................. 280
	17.2. Data Integrity ............................................................................................................................. 281
	17.3. Data Integration ........................................................................................................................ 282
	17.4. Availability and Reliability ......................................................................................................... 283
	17.5. Capacity ..................................................................................................................................... 284
18. Transaction Management ................................................................................................................... 285
	18.1. Interaction cycle ........................................................................................................................ 286
	18.2. Isolation levels ........................................................................................................................... 287
	18.3. Default locking behavior ........................................................................................................... 288
	18.4. Deadlocks .................................................................................................................................. 289
	18.5. Delete semantics ....................................................................................................................... 292
	18.6. Creating unique nodes .............................................................................................................. 293
	18.7. Transaction events .................................................................................................................... 294
19. Data Import ......................................................................................................................................... 295
20. Graph Algorithms ................................................................................................................................ 296
21. REST API ............................................................................................................................................... 297
	21.1. Transactional Cypher HTTP endpoint ....................................................................................... 298
	21.2. Neo4j Status Codes ................................................................................................................... 307
	21.3. REST API Authentication and Authorization ............................................................................. 312
	21.4. Service root ............................................................................................................................... 316
	21.5. Streaming .................................................................................................................................. 317
	21.6. Legacy Cypher HTTP endpoint .................................................................................................. 318
	21.7. Property values ......................................................................................................................... 332
	21.8. Nodes ......................................................................................................................................... 333
	21.9. Relationships ............................................................................................................................. 338
	21.10. Relationship types ................................................................................................................... 349
	21.11. Node properties ...................................................................................................................... 350
	21.12. Relationship properties ........................................................................................................... 354
	21.13. Node labels .............................................................................................................................. 360
	21.14. Node degree ............................................................................................................................ 366
	21.15. Indexing ................................................................................................................................... 368
	21.16. Constraints ............................................................................................................................... 370
	21.17. Traversals ................................................................................................................................. 374
	21.18. Graph Algorithms .................................................................................................................... 401
	21.19. Batch operations ..................................................................................................................... 408
	21.20. Legacy indexing ....................................................................................................................... 416
	21.21. Unique Indexing ...................................................................................................................... 422
	21.22. WADL Support ......................................................................................................................... 432
	21.23. Using the REST API from WebLogic ........................................................................................ 433
22. Deprecations ....................................................................................................................................... 434

Chapter 17. Capabilities


17.1. Data Security
Some data may need to be protected from unauthorized access (e.g., theft, modification). Neo4j
does not deal with data encryption explicitly, but supports all means built into the Java programming
language and the JVM to protect data by encrypting it before storing it.
Furthermore, data can be easily secured by running on an encrypted datastore at the file system level.
Finally, data protection should be considered in the upper layers of the surrounding system in order to
prevent problems with scraping, malicious data insertion, and other threats.


17.2. Data Integrity
In order to keep data consistent, a good database needs mechanisms and structures that guarantee
the integrity of all stored data. In Neo4j, data integrity is guaranteed both for graph elements (Nodes,
Relationships and Properties) and for non-graph data, such as the indexes. Neo4j’s transactional
architecture ensures that data is protected and provides for fast recovery from an unexpected failure,
without the need to rebuild internal indexes or other costly operations.


17.3. Data Integration
Most enterprises rely primarily on relational databases to store their data, but this may cause
performance limitations. In some of these cases, Neo4j can be used as an extension to supplement
search/lookup for faster decision making. However, in any situation where multiple data repositories
contain the same data, synchronization can be an issue.
In some applications, it is acceptable for the search platform to be slightly out of sync with the
relational database. In others, tight data integrity (e.g., between Neo4j and the RDBMS) is necessary.
Typically, this has to be addressed for data changing in real time and for bulk data changes happening
in the RDBMS.
A few strategies for synchronizing integrated data follow.

Event-based Synchronization
In this scenario, all data stores, both RDBMS and Neo4j, are fed with domain-specific events via
an event bus. Thus, the data held in the different backends is not actually synchronized but rather
replicated.

Periodic Synchronization
Another viable scenario is the periodic export of the latest changes in the RDBMS to Neo4j via some
form of SQL query. This allows a small amount of latency in the synchronization, but has the advantage
of using the RDBMS as the master for all data purposes. The same process can be applied with Neo4j
as the master data source.

Periodic Full Export/Import of Data
Using the Batch Inserter tools for Neo4j, even large amounts of data can be imported into the database
in very short times. Thus, a full export from the RDBMS and import into Neo4j becomes possible. If the
propagation lag between the RDBMS and Neo4j is not a big issue, this is a very viable solution.


17.4. Availability and Reliability
Most mission-critical systems require the database subsystem to be accessible at all times. Neo4j
ensures availability and reliability through a few different strategies.

Operational Availability
In order not to create a single point of failure, Neo4j supports different approaches which provide
transparent fallback and/or recovery from failures.

Online backup (Cold spare)
In this approach, a single instance of the master database is used, with Online Backup enabled. In
case of a failure, the backup files can be mounted onto a new Neo4j instance and reintegrated into the
application.

Online Backup High Availability (Hot spare)
Here, a Neo4j "backup" instance listens to online transfers of changes from the master. In the event of
a failure of the master, the backup is already running and can directly take over the load.

High Availability cluster
This approach uses a cluster of database instances, with one (read/write) master and a number of
(read-only) slaves. Failing slaves can simply be restarted and brought back online. Alternatively, a new
slave may be added by cloning an existing one. Should the master instance fail, a new master will be
elected by the remaining cluster nodes.

Disaster Recovery / Resiliency
In case of a breakdown of a major part of the IT infrastructure, there need to be mechanisms in place
that enable the fast recovery and regrouping of the remaining services and servers. In Neo4j, there are
different components that are suitable to be part of a disaster recovery strategy.

Prevention
• Online Backup High Availability to other locations outside the current data center.
• Online Backup to different file system locations: this is a simpler form of backup, applying changes
directly to backup files; it is thus more suited for local backup scenarios.
• Neo4j High Availability cluster: a cluster of one write-master Neo4j server and a number of
read-slaves, getting transaction logs from the master. Write-master failover is handled by quorum election
among the read-slaves for a new master.

Detection
• SNMP and JMX monitoring can be used for the Neo4j database.

Correction
• Online Backup: A new Neo4j server can be started directly on the backed-up files and take over new
requests.
• Neo4j High Availability cluster: A broken Neo4j read slave can be reinserted into the cluster, getting
the latest updates from the master. Alternatively, a new server can be inserted by copying an existing
server and applying the latest updates to it.


17.5. Capacity
File Sizes
Neo4j relies on Java’s Non-blocking I/O subsystem for all file handling. Furthermore, while the storage
file layout is optimized for interconnected data, Neo4j does not require raw devices. Thus, file sizes are
only limited by the underlying operating system’s capacity to handle large files. Physically, there is no
built-in limit of the file handling capacity in Neo4j.
Neo4j has a built-in page cache that caches the contents of the storage files. If there is not enough
RAM to keep the storage files resident, then Neo4j will page parts of the files in and out as necessary,
while keeping the most popular parts of the files resident at all times. Thus, ACID speed degrades
gracefully as RAM becomes the limiting factor.

Read speed
Enterprises want to optimize the use of hardware to deliver the maximum business value from
available resources. Neo4j’s approach to reading data provides the best possible usage of all available
hardware resources. Neo4j does not block or lock any read operations; thus, there is no danger for
deadlocks in read operations and no need for read transactions. With a threaded read access to the
database, queries can be run simultaneously on as many processors as may be available. This provides
very good scale-up scenarios with bigger servers.

Write speed
Write speed is a consideration for many enterprise applications. However, there are two different
scenarios:
1. sustained continuous operation and
2. bulk access (e.g., backup, initial or batch loading).
To support the disparate requirements of these scenarios, Neo4j supports two modes of writing to the
storage layer.
In transactional, ACID-compliant normal operation, isolation level is maintained and read operations
can occur at the same time as the writing process. At every commit, the data is persisted to disk and
can be recovered to a consistent state upon system failures. This requires disk write access and a real
flushing of data. Thus, the write speed of Neo4j on a single server in continuous mode is limited by the
I/O capacity of the hardware. Consequently, the use of fast SSDs is highly recommended for production
scenarios.
Neo4j has a Batch Inserter that operates directly on the store files. This mode does not provide
transactional security, so it can only be used when there is a single write thread. Because data is written
sequentially, and never flushed to the logical logs, huge performance boosts are achieved. The Batch
Inserter is optimized for non-transactional bulk import of large amounts of data.

Data size
In Neo4j, data size is mainly limited by the address space of the primary keys for Nodes, Relationships,
Properties and RelationshipTypes. Currently, the address space is as follows:
nodes                 2^35 (∼ 34 billion)
relationships         2^35 (∼ 34 billion)
properties            2^36 to 2^38 depending on property types (maximum ∼ 274 billion, always at least ∼ 68 billion)
relationship types    2^16 (∼ 65 000)

Chapter 18. Transaction Management
In order to fully maintain data integrity and ensure good transactional behavior, Neo4j supports the
ACID properties:
• atomicity: If any part of a transaction fails, the database state is left unchanged.
• consistency: Any transaction will leave the database in a consistent state.
• isolation: During a transaction, modified data cannot be accessed by other operations.
• durability: The DBMS can always recover the results of a committed transaction.

Specifically:
• All database operations that access the graph, indexes, or the schema must be performed in a
transaction.
• The default isolation level is READ_COMMITTED.
• Data retrieved by traversals is not protected from modification by other transactions.
• Non-repeatable reads may occur (i.e., only write locks are acquired and held until the end of the
transaction).
• One can manually acquire write locks on nodes and relationships to achieve higher level of isolation
(SERIALIZABLE).
• Locks are acquired at the Node and Relationship level.
• Deadlock detection is built into the core transaction management.


18.1. Interaction cycle
All database operations that access the graph, indexes, or the schema must be performed in a
transaction. Transactions are thread-confined and can be nested as “flat nested transactions”. Flat
nested transactions means that all nested transactions are added to the scope of the top-level
transaction. A nested transaction can mark the top-level transaction for rollback, meaning the entire
transaction will be rolled back. It is not possible to roll back only the changes made in a nested transaction.
The interaction cycle of working with transactions looks like this:
1. Begin a transaction.
2. Perform database operations.
3. Mark the transaction as successful or not.
4. Finish the transaction.

It is very important to finish each transaction. The transaction will not release the locks or memory it
has acquired until it has been finished. The idiomatic use of transactions in Neo4j is a try-finally
block: start the transaction, then try to perform the graph operations. The last operation in the
try block should mark the transaction as successful, while the finally block should finish the transaction.
Finishing the transaction will perform commit or rollback depending on the success status.
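As a rough sketch of that idiom (assuming graphDb is an available GraphDatabaseService; with
try-with-resources, closing the transaction performs the finish step described above):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;

public class TransactionIdiom
{
    public static void createNamedNode( GraphDatabaseService graphDb )
    {
        // close() finishes the transaction: it commits if success() was
        // called, and rolls back otherwise.
        try ( Transaction tx = graphDb.beginTx() )
        {
            Node node = graphDb.createNode();
            node.setProperty( "name", "example" );
            tx.success(); // mark the transaction as successful
        }
    }
}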

Caution

All modifications performed in a transaction are kept in memory. This means that very large
updates have to be split into several top level transactions to avoid running out of memory.
It must be a top level transaction since splitting up the work in many nested transactions will
just add all the work to the top level transaction.
In an environment that makes use of thread pooling, other errors may occur when failing to finish a
transaction properly. Consider a leaked transaction that did not get finished properly. It will be tied
to a thread, and when that thread gets scheduled to perform work starting a new (what looks to be a)
top-level transaction, it will actually be a nested transaction. If the leaked transaction state is “marked
for rollback” (which will happen if a deadlock was detected) no more work can be performed on that
transaction. Trying to do so will result in an error on each call to a write operation.


18.2. Isolation levels
Transactions in Neo4j use a read-committed isolation level, which means they will see data as soon as it
has been committed and will not see data in other transactions that have not yet been committed. This
type of isolation is weaker than serialization but offers significant performance advantages whilst being
sufficient for the overwhelming majority of cases.
In addition, the Neo4j Java API (see Part VII, “Advanced Usage” [561]) enables explicit locking of nodes
and relationships. Using locks gives the opportunity to simulate the effects of higher levels of isolation
by obtaining and releasing locks explicitly. For example, if a write lock is taken on a common node or
relationship, then all transactions will serialize on that lock — giving the effect of a serialization isolation
level.
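As a minimal sketch of this technique (assuming graphDb is an available GraphDatabaseService and
node is an entity the competing transactions share):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Lock;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;

public class ExplicitLocking
{
    // Increment a counter property; the explicit write lock serializes all
    // concurrent transactions that take the same lock on the same node.
    public static void increment( GraphDatabaseService graphDb, Node node )
    {
        try ( Transaction tx = graphDb.beginTx() )
        {
            Lock lock = tx.acquireWriteLock( node );
            long value = (long) node.getProperty( "prop", 0L );
            node.setProperty( "prop", value + 1 );
            // the lock is released automatically when the transaction finishes
            tx.success();
        }
    }
}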

Lost Updates in Cypher
In Cypher it is possible to acquire write locks to simulate improved isolation in some cases. Consider
the case where multiple concurrent Cypher queries increment the value of a property. Due to the
limitations of the read-committed isolation level, the increments will not result in a deterministic final
value.
For example, the following query, if run by one hundred concurrent clients, will very likely not
increment the property n.prop to 100, but some value lower than 100.
MATCH (n:X {id: 42})
SET n.prop = n.prop + 1

This is because all queries will read the value of n.prop within their own transaction. They will not
see the incremented value from any other transaction that has not yet committed. In the worst case
scenario the final value could be as low as 1, if all threads perform the read before any has committed
their transaction.
To ensure deterministic behavior, it is necessary to grab a write lock on the node in question. In Cypher
there is no explicit support for this, but we can work around this limitation by writing to a temporary
property.
MATCH (n:X {id: 42})
SET n._LOCK_ = true
SET n.prop = n.prop + 1
REMOVE n._LOCK_

The existence of the SET n._LOCK_ statement before the read of n.prop ensures the lock
is acquired before the read action, and no updates will be lost due to the enforced serialization of all
concurrent queries on that specific node.


18.3. Default locking behavior
• When adding, changing or removing a property on a node or relationship a write lock will be taken on
the specific node or relationship.
• When creating or deleting a node a write lock will be taken for the specific node.
• When creating or deleting a relationship a write lock will be taken on the specific relationship and
both its nodes.
The locks will be added to the transaction and released when the transaction finishes.


18.4. Deadlocks
Understanding deadlocks
Since locks are used, it is possible for deadlocks to happen. Neo4j will, however, detect any deadlock
(caused by acquiring a lock) before it happens and throw an exception. Before the exception is
thrown the transaction is marked for rollback. All locks acquired by the transaction are still held,
but will be released when the transaction is finished (in the finally block, as pointed out earlier). Once
the locks are released, other transactions that were waiting for locks held by the transaction causing the
deadlock can proceed. The work performed by the transaction causing the deadlock can then be retried
by the user if needed.
Experiencing frequent deadlocks is an indication of concurrent write requests happening in such a
way that it is not possible to execute them while at the same time living up to the intended isolation and
consistency. The solution is to make sure concurrent updates happen in a reasonable way. For example,
given two specific nodes (A and B), adding or deleting relationships to both these nodes in random
order for each transaction will result in deadlocks when there are two or more transactions doing that
concurrently. One solution is to make sure that updates always happen in the same order (first A, then
B). Another solution is to make sure that each thread/transaction does not have writes to a node or
relationship that conflict with those of some other concurrent transaction. This can for example be achieved by
letting a single thread do all updates of a specific type.

Important

Deadlocks caused by the use of other synchronization than the locks managed by Neo4j can
still happen. Since all operations in the Neo4j API are thread safe unless specified otherwise,
there is no need for external synchronization. Other code that requires synchronization
should be synchronized in such a way that it never performs any Neo4j operation in the
synchronized block.

Deadlock handling example code
Below you’ll find examples of how deadlocks can be handled in server extensions/plugins or when
using Neo4j embedded.

Tip

The full source code used for the code snippets can be found at DeadlockDocTest.java¹.

When dealing with deadlocks in code, there are several issues you may want to address:
• Only do a limited amount of retries, and fail if a threshold is reached.
• Pause between each attempt to allow the other transaction to finish before trying again.
• A retry-loop can be useful not only for deadlocks, but for other types of transient errors as well.
In the following sections you’ll find example code in Java which shows how this can be implemented.

Handling deadlocks using TransactionTemplate
If you don’t want to write all the code yourself, there is a class called TransactionTemplate² that will help
you achieve what’s needed. Below is an example of how to create, customize, and use this template for
retries in transactions.
First, define the base template:
TransactionTemplate template = new TransactionTemplate(  ).retries( 5 ).backoff( 3, TimeUnit.SECONDS );

¹ https://github.com/neo4j/neo4j/blob/2.3.12/community/kernel/src/test/java/examples/DeadlockDocTest.java
² http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/helpers/TransactionTemplate.html

Next, specify the database to use and a function to execute:
Object result = template.with(graphDatabaseService).execute( new Function<Transaction, Object>()
{
    @Override
    public Object apply( Transaction transaction ) throws RuntimeException
    {
        Object result = null;
        return result;
    }
} );

The operations that could lead to a deadlock should go into the apply method.
The TransactionTemplate uses a fluent API for configuration, and you can choose whether to set
everything at once, or (as in the example) provide some details just before using it. The template allows
setting a predicate for what exceptions to retry on, and also allows for easy monitoring of events that
take place.

Handling deadlocks using a retry loop
If you want to roll your own retry-loop code, see below for inspiration. Here’s an example of what a
retry block might look like:
Throwable txEx = null;
int RETRIES = 5;
int BACKOFF = 3000;
for ( int i = 0; i < RETRIES; i++ )
{
    try ( Transaction tx = graphDatabaseService.beginTx() )
    {
        Object result = doStuff(tx);
        tx.success();
        return result;
    }
    catch ( Throwable ex )
    {
        txEx = ex;
        // Add whatever exceptions to retry on here
        if ( !(ex instanceof DeadlockDetectedException) )
        {
            break;
        }
    }
    // Wait so that we don't immediately get into the same deadlock
    if ( i < RETRIES - 1 )
    {
        try
        {
            Thread.sleep( BACKOFF );
        }
        catch ( InterruptedException e )
        {
            throw new TransactionFailureException( "Interrupted", e );
        }
    }
}
if ( txEx instanceof TransactionFailureException )
{
    throw ((TransactionFailureException) txEx);
}
else if ( txEx instanceof Error )
{
    throw ((Error) txEx);
}
else if ( txEx instanceof RuntimeException )
{
    throw ((RuntimeException) txEx);
}
else
{
    throw new TransactionFailureException( "Failed", txEx );
}

The above is the gist of what such a retry block looks like; you can customize it to fit your needs.


18.5. Delete semantics
When deleting a node or a relationship, all properties for that entity will be automatically removed, but
the relationships of a node will not be removed.

Caution

Neo4j enforces a constraint (upon commit) that all relationships must have a valid
start node and end node. In effect this means that trying to delete a node that still has
relationships attached to it will throw an exception upon commit. It is however possible
to choose in which order to delete the node and the attached relationships as long as no
relationships exist when the transaction is committed.
The delete semantics can be summarized in the following bullets:
• All properties of a node or relationship will be removed when it is deleted.
• A deleted node can not have any attached relationships when the transaction commits.
• It is possible to acquire a reference to a deleted relationship or node that has not yet been
committed.
• Any write operation on a node or relationship after it has been deleted (but not yet committed) will
throw an exception.
• After commit, trying to acquire a new reference to a deleted node or relationship, or to work with an
old one, will throw an exception.
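A minimal sketch of a deletion that respects these semantics (assuming graphDb is an available
GraphDatabaseService) might look like this:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.Transaction;

public class DeleteExample
{
    // Delete the relationships first, then the node, all in one transaction.
    public static void deleteNodeWithRelationships( GraphDatabaseService graphDb, Node node )
    {
        try ( Transaction tx = graphDb.beginTx() )
        {
            for ( Relationship relationship : node.getRelationships() )
            {
                relationship.delete();
            }
            node.delete(); // legal now: no relationships remain at commit time
            tx.success();
        }
    }
}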


18.6. Creating unique nodes
In many use cases, a certain level of uniqueness is desired among entities. You could for instance
imagine that only one user with a certain e-mail address may exist in a system. If multiple concurrent
threads naively try to create the user, duplicates will be created. There are three main strategies for
ensuring uniqueness, and they all work across High Availability and single-instance deployments.

Single thread
By using a single thread, no two threads will even try to create a particular entity simultaneously. On
High Availability, an external single-threaded client can perform the operations on the cluster.

Get or create
The preferred way to get or create a unique node is to use unique constraints and Cypher. See the
section called “Get or create unique node using Cypher and unique constraints” [605] for more
information.
By using put-if-absent³ functionality, entity uniqueness can be guaranteed using a legacy index. Here
the legacy index acts as the lock and will only lock the smallest part needed to guarantee uniqueness
across threads and transactions.
See the section called “Get or create unique node using a legacy index” [605] for how to do this using
the core Java API. When using the REST API, see Section 21.21, “Unique Indexing” [422].
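A rough sketch of this pattern, assuming an available GraphDatabaseService named graphDb and an
illustrative legacy node index called "users" (both assumptions, not fixed API names):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.index.Index;

public class GetOrCreateExample
{
    public static Node getOrCreateUser( GraphDatabaseService graphDb, String email )
    {
        try ( Transaction tx = graphDb.beginTx() )
        {
            Index<Node> index = graphDb.index().forNodes( "users" ); // hypothetical index name
            Node candidate = graphDb.createNode();
            candidate.setProperty( "email", email );
            // putIfAbsent returns the previously indexed node, or null if ours won
            Node existing = index.putIfAbsent( candidate, "email", email );
            if ( existing != null )
            {
                candidate.delete(); // lost the race; use the node that won
                candidate = existing;
            }
            tx.success();
            return candidate;
        }
    }
}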

Pessimistic locking
Important

While this is a working solution, please consider using the preferred approach, the section called “Get
or create” [293], instead.
By using explicit, pessimistic locking, unique creation of entities can be achieved in a multi-threaded
environment. It is most commonly done by locking on a single or a set of common nodes.
See the section called “Pessimistic locking for node creation” [606] for how to do this using the core
Java API.

³ http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/index/Index.html#putIfAbsent%28T,%20java.lang.String,%20java.lang.Object%29


18.7. Transaction events
Transaction event handlers can be registered to receive Neo4j transaction events. Once a handler has
been registered at a GraphDatabaseService instance, it will receive events about what has happened in
each transaction which is about to be committed. Handlers won’t get notified about transactions
which haven’t performed any write operation, or which won’t be committed (either because
Transaction#success() hasn’t been called or because the transaction has been marked as failed with
Transaction#failure()). Right before a transaction is about to be committed, the beforeCommit method
is called with the entire diff of modifications made in the transaction. At this point the transaction is
still running, so changes can still be made. However, there’s no guarantee that other handlers will see
such changes, since the order in which handlers are executed is undefined. This method can also throw
an exception and will, in such a case, prevent the transaction from being committed (in which case a
call to afterRollback will follow). If beforeCommit is successfully executed in all registered handlers, the
transaction will be committed and the afterCommit method will be called with the same transaction
data as well as the object returned from beforeCommit. In afterCommit the transaction has been closed,
so accessing data outside of what TransactionData covers requires a new transaction to be opened. A
TransactionEventHandler gets notified only about transactions that have changes accessible via
TransactionData, so some indexing and schema changes will not trigger these events.
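A minimal sketch of such a handler (the node-counting use case is illustrative, not part of the API) might
look like the following; registration happens once via GraphDatabaseService#registerTransactionEventHandler:

import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.event.TransactionData;
import org.neo4j.graphdb.event.TransactionEventHandler;

// Counts nodes created in each transaction; the value returned from
// beforeCommit is handed back to afterCommit/afterRollback as the state object.
public class CreatedNodesHandler implements TransactionEventHandler<Integer>
{
    @Override
    public Integer beforeCommit( TransactionData data ) throws Exception
    {
        int created = 0;
        for ( Node node : data.createdNodes() )
        {
            created++;
        }
        return created;
    }

    @Override
    public void afterCommit( TransactionData data, Integer created )
    {
        System.out.println( created + " node(s) created" );
    }

    @Override
    public void afterRollback( TransactionData data, Integer created )
    {
        // called e.g. when beforeCommit threw and the commit was prevented
    }
}

Registering it on a database instance is a single call:
graphDb.registerTransactionEventHandler( new CreatedNodesHandler() );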


Chapter 19. Data Import
For importing data using Cypher and CSV, see Section 12.8, “Importing CSV files with Cypher” [211].
For high-performance data import, see Chapter 29, Import tool [531].


Chapter 20. Graph Algorithms
Neo4j graph algorithms is a component that contains Neo4j implementations of some common
algorithms for graphs. It includes algorithms like:
• Shortest paths,
• all paths,
• all simple paths,
• Dijkstra and
• A*.

The graph algorithms are included with Neo4j.
For usage examples, see Section 21.18, “Graph Algorithms” [401] (REST API) and Section 33.11, “Graph
Algorithm examples” [602] (embedded database). The shortest path algorithm can be used from
Cypher as well, see the section called “Shortest path” [161].
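For embedded use, a rough sketch of invoking the shortest path algorithm (the KNOWS relationship type
and the depth limit of 15 are illustrative assumptions; call this inside a transaction) could look like this:

import org.neo4j.graphalgo.GraphAlgoFactory;
import org.neo4j.graphalgo.PathFinder;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.PathExpanders;

public class ShortestPathExample
{
    // Finds all shortest paths between two nodes over KNOWS relationships,
    // limited to a maximum depth of 15.
    public static Iterable<Path> shortestKnowsPaths( Node start, Node end )
    {
        PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
                PathExpanders.forTypeAndDirection(
                        DynamicRelationshipType.withName( "KNOWS" ), Direction.BOTH ), 15 );
        return finder.findAllPaths( start, end );
    }
}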


Chapter 21. REST API
The Neo4j REST API is designed with discoverability in mind, so that you can start with a GET on
the Section 21.4, “Service root” [316] and from there discover URIs to perform other requests.
The examples below uses URIs in the examples; they are subject to change in the future, so for
future-proofness discover URIs where possible, instead of relying on the current layout. The default
1
representation is json , both for responses and for data sent with POST/PUT requests.
Below follows a listing of ways to interact with the REST API. For language bindings to the REST API, see
Chapter 7, Languages [96].
To interact with the JSON interface you must explicitly set the request header Accept:application/json
for those requests that responds with data. You should also set the header Content-Type:application/
json if your request sends data, for example when you’re creating a relationship. The examples include
the relevant request and response headers.
The server supports streaming results, with better performance and lower memory overhead. See
Section 21.5, “Streaming” [317] for more information.

¹ http://www.json.org/


21.1. Transactional Cypher HTTP endpoint
The default way to interact with Neo4j is by using this endpoint.
The Neo4j transactional HTTP endpoint allows you to execute a series of Cypher statements within
the scope of a transaction. The transaction may be kept open across multiple HTTP requests, until
the client chooses to commit or roll back. Each HTTP request can include a list of statements, and for
convenience you can include statements along with a request to begin or commit a transaction.
The server guards against orphaned transactions by using a timeout. If there are no requests for a
given transaction within the timeout period, the server will roll it back. You can configure the timeout
in the server configuration, by setting org.neo4j.server.transaction.timeout to the number of seconds
before timeout. The default timeout is 60 seconds.
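For example, to lower the timeout to 30 seconds you could set the following line in the server
configuration (the conf/neo4j-server.properties location is an assumption based on the usual
server layout):

# conf/neo4j-server.properties
org.neo4j.server.transaction.timeout=30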
The key difference between the transactional HTTP endpoint for Cypher and the Cypher endpoint (see
Section 21.6, “Legacy Cypher HTTP endpoint” [318]) is the ability to use the same transaction across
multiple HTTP requests. The Cypher endpoint always attempts to commit a transaction at the end of
each HTTP request. There have also been improvements to the serialization format.

Note
• Literal line breaks are not allowed inside Cypher statements.
• Open transactions are not shared among members of an HA cluster. Therefore, if you use
this endpoint in an HA cluster, you must ensure that all requests for a given transaction
are sent to the same Neo4j instance.
• Cypher queries with USING PERIODIC COMMIT (see Section 12.9, “Using Periodic
Commit” [213]) may only be executed when creating a new transaction and immediately
committing it with a single HTTP request (see the section called “Begin and commit a
transaction in one request” [298] for how to do that).
• The serialization format for Cypher results is mostly the same as the Cypher endpoint.
However, the format for raw entities is slightly less verbose and does not include
hypermedia links.

Tip

In order to speed up queries in repeated scenarios, try not to use literals but replace
them with parameters wherever possible. This will let the server cache query plans. See
Section 8.5, “Parameters” [113] for more information.

Begin and commit a transaction in one request
If there is no need to keep a transaction open across multiple HTTP requests, you can begin a
transaction, execute statements, and commit with just a single HTTP request.
Example request
• POST http://localhost:7474/db/data/transaction/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)"
  } ]
}

Example response
• 200: OK
• Content-Type: application/json

{
  "results" : [ {
    "columns" : [ "id(n)" ],
    "data" : [ {
      "row" : [ 18 ]
    } ]
  } ],
  "errors" : [ ]
}
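If you are calling the endpoint from Java, a minimal sketch using only the JDK (the URL and payload
mirror the example above; no authentication is assumed) might look like this:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class CommitInOneRequest
{
    public static void main( String[] args ) throws Exception
    {
        URL url = new URL( "http://localhost:7474/db/data/transaction/commit" );
        String payload = "{ \"statements\" : [ { \"statement\" : \"CREATE (n) RETURN id(n)\" } ] }";

        HttpURLConnection http = (HttpURLConnection) url.openConnection();
        http.setRequestMethod( "POST" );
        http.setRequestProperty( "Accept", "application/json; charset=UTF-8" );
        http.setRequestProperty( "Content-Type", "application/json" );
        http.setDoOutput( true );
        try ( OutputStream out = http.getOutputStream() )
        {
            out.write( payload.getBytes( StandardCharsets.UTF_8 ) );
        }

        try ( Scanner in = new Scanner( http.getInputStream(), "UTF-8" ) )
        {
            System.out.println( in.useDelimiter( "\\A" ).next() ); // prints the JSON response
        }
    }
}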

Execute multiple statements
You can send multiple Cypher statements in the same request. The response will contain the result of
each statement.
Example request
• POST http://localhost:7474/db/data/transaction/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)"
  }, {
    "statement" : "CREATE (n {props}) RETURN n",
    "parameters" : {
      "props" : {
        "name" : "My Node"
      }
    }
  } ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "results" : [ {
    "columns" : [ "id(n)" ],
    "data" : [ {
      "row" : [ 14 ]
    } ]
  }, {
    "columns" : [ "n" ],
    "data" : [ {
      "row" : [ {
        "name" : "My Node"
      } ]
    } ]
  } ],
  "errors" : [ ]
}

Begin a transaction
You begin a new transaction by posting zero or more Cypher statements to the transaction endpoint.
The server will respond with the result of your statements, as well as the location of your open
transaction.
Example request
• POST http://localhost:7474/db/data/transaction
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n {props}) RETURN n",
    "parameters" : {
      "props" : {
        "name" : "My Node"
      }
    }
  } ]
}

Example response
• 201: Created
• Content-Type: application/json
• Location: http://localhost:7474/db/data/transaction/9
{
  "commit" : "http://localhost:7474/db/data/transaction/9/commit",
  "results" : [ {
    "columns" : [ "n" ],
    "data" : [ {
      "row" : [ {
        "name" : "My Node"
      } ]
    } ]
  } ],
  "transaction" : {
    "expires" : "Fri, 08 Dec 2017 11:04:46 +0000"
  },
  "errors" : [ ]
}

Execute statements in an open transaction
Given that you have an open transaction, you can make a number of requests, each of which executes
additional statements, and keeps the transaction open by resetting the transaction timeout.
Example request
• POST http://localhost:7474/db/data/transaction/11
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN n"
  } ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "commit" : "http://localhost:7474/db/data/transaction/11/commit",
  "results" : [ {
    "columns" : [ "n" ],
    "data" : [ {
      "row" : [ { } ]
    } ]
  } ],
  "transaction" : {
    "expires" : "Fri, 08 Dec 2017 11:04:46 +0000"
  },
  "errors" : [ ]
}

Execute statements in an open transaction in REST format for the return
Given that you have an open transaction, you can make a number of requests, each of which executes
additional statements, and keeps the transaction open by resetting the transaction timeout. Specifying
the REST format will give back full Neo4j Rest API representations of the Neo4j Nodes, Relationships and
Paths, if returned.
Example request
• POST http://localhost:7474/db/data/transaction/1
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN n",
    "resultDataContents" : [ "REST" ]
  } ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "commit" : "http://localhost:7474/db/data/transaction/1/commit",
  "results" : [ {
    "columns" : [ "n" ],
    "data" : [ {
      "rest" : [ {
        "outgoing_relationships" : "http://localhost:7474/db/data/node/12/relationships/out",
        "labels" : "http://localhost:7474/db/data/node/12/labels",
        "all_typed_relationships" : "http://localhost:7474/db/data/node/12/relationships/all/{-list|&|types}",
        "traverse" : "http://localhost:7474/db/data/node/12/traverse/{returnType}",
        "self" : "http://localhost:7474/db/data/node/12",
        "property" : "http://localhost:7474/db/data/node/12/properties/{key}",
        "properties" : "http://localhost:7474/db/data/node/12/properties",
        "outgoing_typed_relationships" : "http://localhost:7474/db/data/node/12/relationships/out/{-list|&|types}",
        "incoming_relationships" : "http://localhost:7474/db/data/node/12/relationships/in",
        "create_relationship" : "http://localhost:7474/db/data/node/12/relationships",
        "paged_traverse" : "http://localhost:7474/db/data/node/12/paged/traverse/{returnType}{?pageSize,leaseTime}",
        "all_relationships" : "http://localhost:7474/db/data/node/12/relationships/all",
        "incoming_typed_relationships" : "http://localhost:7474/db/data/node/12/relationships/in/{-list|&|types}",
        "metadata" : {
          "id" : 12,
          "labels" : [ ]
        },
        "data" : { }
      } ]
    } ]
  } ],
  "transaction" : {
    "expires" : "Fri, 08 Dec 2017 11:04:42 +0000"
  },
  "errors" : [ ]
}

Reset transaction timeout of an open transaction
Every orphaned transaction is automatically expired after a period of inactivity. This may be prevented
by resetting the transaction timeout.
The timeout may be reset by sending a keep-alive request to the server that executes an empty list
of statements. This request will reset the transaction timeout and return the new time at which the
transaction will expire as an RFC1123 formatted timestamp value in the “transaction” section of the
response.
Example request
• POST http://localhost:7474/db/data/transaction/2
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "commit" : "http://localhost:7474/db/data/transaction/2/commit",
  "results" : [ ],
  "transaction" : {
    "expires" : "Fri, 08 Dec 2017 11:04:45 +0000"
  },
  "errors" : [ ]
}

Commit an open transaction
Given you have an open transaction, you can send a commit request. Optionally, you submit additional
statements along with the request that will be executed before committing the transaction.
Example request
• POST http://localhost:7474/db/data/transaction/6/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)"
  } ]
}

Example response
• 200: OK
• Content-Type: application/json

{
  "results" : [ {
    "columns" : [ "id(n)" ],
    "data" : [ {
      "row" : [ 17 ]
    } ]
  } ],
  "errors" : [ ]
}

Rollback an open transaction
Given that you have an open transaction, you can send a rollback request. The server will roll back the
transaction. Any further statements trying to run in this transaction will fail immediately.
Example request
• DELETE http://localhost:7474/db/data/transaction/3
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
  "results" : [ ],
  "errors" : [ ]
}

Include query statistics
By setting includeStats to true for a statement, query statistics will be returned for it.
Example request
• POST http://localhost:7474/db/data/transaction/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)",
    "includeStats" : true
  } ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "results" : [ {
    "columns" : [ "id(n)" ],
    "data" : [ {
      "row" : [ 16 ]
    } ],
    "stats" : {
      "contains_updates" : true,
      "nodes_created" : 1,
      "nodes_deleted" : 0,
      "properties_set" : 0,
      "relationships_created" : 0,
      "relationship_deleted" : 0,
      "labels_added" : 0,
      "labels_removed" : 0,
      "indexes_added" : 0,
      "indexes_removed" : 0,
      "constraints_added" : 0,
      "constraints_removed" : 0
    }
  } ],
  "errors" : [ ]
}

Return results in graph format
If you want to understand the graph structure of nodes and relationships returned by your query, you
can specify the "graph" results data format. For example, this is useful when you want to visualise the
graph structure. The format collates all the nodes and relationships from all columns of the result, and
also flattens collections of nodes and relationships, including paths.
Example request
• POST http://localhost:7474/db/data/transaction/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
  "statements" : [ {
    "statement" : "CREATE ( bike:Bike { weight: 10 } ) CREATE ( frontWheel:Wheel { spokes: 3 } ) CREATE ( backWheel:Wheel { spokes: 32 } ) CREATE p1 = (bike)-[:HAS { position: 1 } ]->(frontWheel) CREATE p2 = (bike)-[:HAS { position: 2 } ]->(backWheel) RETURN bike, p1, p2",
    "resultDataContents" : [ "row", "graph" ]
  } ]
}

Example response
• 200: OK
• Content-Type: application/json
{
  "results" : [ {
    "columns" : [ "bike", "p1", "p2" ],
    "data" : [ {
      "row" : [ {
        "weight" : 10
      }, [ {
        "weight" : 10
      }, {
        "position" : 1
      }, {
        "spokes" : 3
      } ], [ {
        "weight" : 10
      }, {
        "position" : 2
      }, {
        "spokes" : 32
      } ] ],
      "graph" : {
        "nodes" : [ {
          "id" : "19",
          "labels" : [ "Bike" ],
          "properties" : {
            "weight" : 10
          }
        }, {
          "id" : "21",
          "labels" : [ "Wheel" ],
          "properties" : {
            "spokes" : 32
          }
        }, {
          "id" : "20",
          "labels" : [ "Wheel" ],
          "properties" : {
            "spokes" : 3
          }
        } ],
        "relationships" : [ {
          "id" : "9",
          "type" : "HAS",
          "startNode" : "19",
          "endNode" : "20",
          "properties" : {
            "position" : 1
          }
        }, {
          "id" : "10",
          "type" : "HAS",
          "startNode" : "19",
          "endNode" : "21",
          "properties" : {
            "position" : 2
          }
        } ]
      }
    } ]
  } ],
  "errors" : [ ]
}

Handling errors
The result of any request against the transaction endpoint is streamed back to the client. Therefore
the server does not know whether the request will be successful or not when it sends the HTTP status
code.
Because of this, all requests against the transactional endpoint will return a 200 or 201 status code,
regardless of whether statements were successfully executed. At the end of the response payload, the
server includes a list of errors that occurred while executing statements. If this list is empty, the request
completed successfully.
If any errors occur while executing statements, the server will roll back the transaction.
In this example, we send the server an invalid statement to demonstrate error handling.
For more information on the status codes, see Section 21.2, “Neo4j Status Codes” [307].
Example request
• POST http://localhost:7474/db/data/transaction/10/commit
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"statements" : [ {

305

REST API
"statement" : "This is not a valid Cypher Statement."
} ]
}

Example response
• 200: OK
• Content-Type: application/json
{
"results" : [ ],
"errors" : [ {
"code" : "Neo.ClientError.Statement.InvalidSyntax",
"message" : "Invalid input 'T': expected  (line 1, column 1 (offset: 0))\n\"This is not a valid Cypher Statement.
\"\n ^"
} ]
}
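
Because the HTTP status does not reflect statement failures, a client must inspect the errors list itself. A minimal check, sketched in Python with the requests library (an assumption, not part of the API itself):

import requests

body = requests.post(
    "http://localhost:7474/db/data/transaction/commit",
    json={ "statements" : [ { "statement" : "This is not a valid Cypher Statement." } ] }
).json()

if body["errors"]:
    # The server has already rolled the transaction back at this point.
    for error in body["errors"]:
        print(error["code"], ":", error["message"])
else:
    print("committed:", body["results"])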

21.2. Neo4j Status Codes
The transactional endpoint may include zero or more status codes in any response, indicating issues
or information for the client. Each status code follows the same format: "Neo.[Classification].[Category].
[Title]". The fact that a status code is returned by the server does not always mean there is a fatal error:
status codes can also indicate transient problems that may go away if you retry the request.
What effect a status code has can be determined from its classification.

Note

This is not the same thing as HTTP status codes. Neo4j Status Codes are returned in the
response body, at the very end of the response.

Classifications

Classification       Description                                                Effect on transaction
ClientError          The client sent a bad request; changing the request       Rollback
                     might yield a successful outcome.
ClientNotification   There are notifications about the request sent by the     None
                     client.
DatabaseError        The database failed to service the request.               Rollback
TransientError       The database cannot service the request right now;        Rollback
                     retrying later might yield a successful outcome.
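
Since the classification alone determines the effect, a client can react generically. The following Python sketch is illustrative only; it simply splits a status code on the dots described above:

def classify(status_code):
    """Return the classification of a code such as
    'Neo.ClientError.Statement.InvalidSyntax'."""
    return status_code.split(".")[1]

def should_retry(status_code):
    # Only TransientError suggests the same request may succeed later.
    return classify(status_code) == "TransientError"

def was_rolled_back(status_code):
    # Every classification except ClientNotification rolls the
    # transaction back.
    return classify(status_code) != "ClientNotification"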

Status codes
This is a complete list of all status codes Neo4j may return, and what they mean.
• Neo.ClientError.General.ReadOnly: This is a read-only database; writing to or modifying the database is not allowed.
• Neo.ClientError.LegacyIndex.NoSuchIndex: The request (directly or indirectly) referred to an index that does not exist.
• Neo.ClientError.Request.Invalid: The client provided an invalid request.
• Neo.ClientError.Request.InvalidFormat: The client provided a request that was missing required fields, or had values that are not allowed.
• Neo.ClientError.Schema.ConstraintAlreadyExists: Unable to perform the operation because it would clash with a pre-existing constraint.
• Neo.ClientError.Schema.ConstraintVerificationFailure: Unable to create the constraint because data that exists in the database violates it.
• Neo.ClientError.Schema.ConstraintViolation: A constraint imposed by the database was violated.
• Neo.ClientError.Schema.IllegalTokenName: The token name used, such as a label, relationship type or property key, is not valid. Tokens cannot be empty strings and cannot be null.
• Neo.ClientError.Schema.IndexAlreadyExists: Unable to perform the operation because it would clash with a pre-existing index.
• Neo.ClientError.Schema.IndexBelongsToConstraint: A requested operation cannot be performed on the specified index because the index is part of a constraint. If you want to drop the index, for instance, you must drop the constraint.
• Neo.ClientError.Schema.IndexLimitReached: The maximum number of index entries supported has been reached; no more entities can be indexed.
• Neo.ClientError.Schema.LabelLimitReached: The maximum number of labels supported has been reached; no more labels can be created.
• Neo.ClientError.Schema.NoSuchConstraint: The request (directly or indirectly) referred to a constraint that does not exist.
• Neo.ClientError.Schema.NoSuchIndex: The request (directly or indirectly) referred to an index that does not exist.
• Neo.ClientError.Security.AuthenticationFailed: The client provided an incorrect username and/or password.
• Neo.ClientError.Security.AuthenticationRateLimit: The client has provided incorrect authentication details too many times in a row.
• Neo.ClientError.Security.AuthorizationFailed: The client does not have privileges to perform the operation requested.
• Neo.ClientError.Statement.ArithmeticError: Invalid use of arithmetic, such as dividing by zero.
• Neo.ClientError.Statement.ConstraintViolation: A constraint imposed by the statement is violated by the data in the database.
• Neo.ClientError.Statement.EntityNotFound: The statement is directly referring to an entity that does not exist.
• Neo.ClientError.Statement.InvalidArguments: The statement is attempting to perform operations using invalid arguments.
• Neo.ClientError.Statement.InvalidSemantics: The statement is syntactically valid, but expresses something that the database cannot do.
• Neo.ClientError.Statement.InvalidSyntax: The statement contains invalid or unsupported syntax.
• Neo.ClientError.Statement.InvalidType: The statement is attempting to perform operations on values with types that are not supported by the operation.
• Neo.ClientError.Statement.NoSuchLabel: The statement is referring to a label that does not exist.
• Neo.ClientError.Statement.NoSuchProperty: The statement is referring to a property that does not exist.
• Neo.ClientError.Statement.ParameterMissing: The statement is referring to a parameter that was not provided in the request.
• Neo.ClientError.Transaction.ConcurrentRequest: There were concurrent requests accessing the same transaction, which is not allowed.
• Neo.ClientError.Transaction.EventHandlerThrewException: A transaction event handler threw an exception. The transaction will be rolled back.
• Neo.ClientError.Transaction.HookFailed: Transaction hook failure.
• Neo.ClientError.Transaction.InvalidType: The transaction is of the wrong type to service the request. For instance, a transaction that has had schema modifications performed in it cannot be used to subsequently perform data operations, and vice versa.
• Neo.ClientError.Transaction.MarkedAsFailed: The transaction was marked as both successful and failed. Failure takes precedence, so the transaction was rolled back although it may have looked like it was going to be committed.
• Neo.ClientError.Transaction.UnknownId: The request referred to a transaction that does not exist.
• Neo.ClientError.Transaction.ValidationFailed: Transaction changes did not pass validation checks.
• Neo.ClientNotification.Statement.CartesianProduct: This query builds a cartesian product between disconnected patterns.
• Neo.ClientNotification.Statement.DeprecationWarning: This feature is deprecated and will be removed in future versions.
• Neo.ClientNotification.Statement.DynamicPropertyWarning: Queries using dynamic properties will use neither index seeks nor index scans for those properties.
• Neo.ClientNotification.Statement.EagerWarning: The execution plan for this query contains the Eager operator, which forces all dependent data to be materialized in main memory before proceeding.
• Neo.ClientNotification.Statement.IndexMissingWarning: Adding a schema index may speed up this query.
• Neo.ClientNotification.Statement.JoinHintUnfulfillableWarning: The database was unable to plan a hinted join.
• Neo.ClientNotification.Statement.JoinHintUnsupportedWarning: Queries with join hints are not supported by the RULE planner.
• Neo.ClientNotification.Statement.LabelMissingWarning: The provided label is not in the database.
• Neo.ClientNotification.Statement.PlannerUnsupportedWarning: This query is not supported by the COST planner.
• Neo.ClientNotification.Statement.PropertyNameMissingWarning: The provided property name is not in the database.
• Neo.ClientNotification.Statement.RelTypeMissingWarning: The provided relationship type is not in the database.
• Neo.ClientNotification.Statement.RuntimeUnsupportedWarning: This query is not supported by the compiled runtime.
• Neo.ClientNotification.Statement.UnboundedPatternWarning: The provided pattern is unbounded; consider adding an upper limit to the number of node hops.
• Neo.DatabaseError.General.CorruptSchemaRule: A malformed schema rule was encountered. Please contact your support representative.
• Neo.DatabaseError.General.FailedIndex: The request (directly or indirectly) referred to an index that is in a failed state. The index needs to be dropped and recreated manually.
• Neo.DatabaseError.General.UnknownFailure: An unknown failure occurred.
• Neo.DatabaseError.Schema.ConstraintCreationFailure: Creating a requested constraint failed.
• Neo.DatabaseError.Schema.ConstraintDropFailure: The database failed to drop a requested constraint.
• Neo.DatabaseError.Schema.DuplicateSchemaRule: The request referred to a schema rule that is defined multiple times.
• Neo.DatabaseError.Schema.IndexCreationFailure: Failed to create an index.
• Neo.DatabaseError.Schema.IndexDropFailure: The database failed to drop a requested index.
• Neo.DatabaseError.Schema.NoSuchLabel: The request accessed a label that did not exist.
• Neo.DatabaseError.Schema.NoSuchPropertyKey: The request accessed a property that does not exist.
• Neo.DatabaseError.Schema.NoSuchRelationshipType: The request accessed a relationship type that does not exist.
• Neo.DatabaseError.Schema.NoSuchSchemaRule: The request referred to a schema rule that does not exist.
• Neo.DatabaseError.Statement.ExecutionFailure: The database was unable to execute the statement.
• Neo.DatabaseError.Transaction.CouldNotBegin: The database was unable to start the transaction.
• Neo.DatabaseError.Transaction.CouldNotCommit: The database was unable to commit the transaction.
• Neo.DatabaseError.Transaction.CouldNotRollback: The database was unable to roll back the transaction.
• Neo.DatabaseError.Transaction.CouldNotWriteToLog: The database was unable to write the transaction to its log.
• Neo.DatabaseError.Transaction.ReleaseLocksFailed: The transaction was unable to release one or more of its locks.
• Neo.TransientError.General.DatabaseUnavailable: The database is not currently available to serve your request; refer to the database logs for more details. Retrying your request at a later time may succeed.
• Neo.TransientError.Network.UnknownFailure: An unknown network failure occurred; a retry may resolve the issue.
• Neo.TransientError.Schema.ModifiedConcurrently: The database schema was modified while this transaction was running; the transaction should be retried.
• Neo.TransientError.Security.ModifiedConcurrently: The user was modified concurrently to this request.
• Neo.TransientError.Statement.ExternalResourceFailure: The external resource is not available.
• Neo.TransientError.Transaction.AcquireLockTimeout: The transaction was unable to acquire a lock, for instance due to a timeout or the transaction thread being interrupted.
• Neo.TransientError.Transaction.ConstraintsChanged: Database constraints changed since the start of this transaction.
• Neo.TransientError.Transaction.DeadlockDetected: This transaction, and at least one more transaction, has acquired locks in a way that it will wait indefinitely, and the database has aborted it. Retrying this transaction will most likely be successful.
• Neo.TransientError.Transaction.LockClientStopped: The transaction was terminated; no more locks can be acquired.
• Neo.TransientError.Transaction.Outdated: The transaction has seen state which has been invalidated by applied updates while the transaction was active. The transaction may succeed if retried.
• Neo.TransientError.Transaction.Terminated: Explicitly terminated by the user.

21.3. REST API Authentication and Authorization
In order to prevent unauthorized access to Neo4j, the REST API supports authorization and
authentication. When enabled, requests to the REST API must be authorized using the username
and password of a valid user. Authorization is enabled by default, see the section called “Server
authentication and authorization” [500] for how to disable it.
When Neo4j is first installed you can authenticate with the default user neo4j and the default password
neo4j. However, the default password must be changed (see the section called “User status and
password changing” [313]) before access to resources will be permitted. This can easily be done via
the Neo4j Browser, or via direct HTTP calls.
The username and password combination is local to each Neo4j instance. If you wish to have multiple
instances in a cluster, you should ensure that all instances share the same credentials. For automated
deployments, you may also copy security configuration from another Neo4j instance (see the section
called “Copying security configuration from one instance to another” [315]).

Authenticating
Missing authorization
If an Authorization header is not supplied, the server will reply with an error.
Example request
• GET http://localhost:7474/db/data/
• Accept: application/json; charset=UTF-8
Example response
• 401: Unauthorized
• Content-Type: application/json; charset=UTF-8
• WWW-Authenticate: None
{
"errors" : [ {
"message" : "No authorization header supplied.",
"code" : "Neo.ClientError.Security.AuthorizationFailed"
} ]
}

Authenticate to access the server
Authenticate by sending a username and a password to Neo4j using HTTP Basic Auth. Requests should
include an Authorization header with a value of Basic <payload>, where "payload" is a base64-encoded
string of "username:password".
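
For example, the header for the user neo4j with the password secret (the credentials used in the examples below) could be built like this in Python; the requests library is an assumption, not a requirement:

import base64
import requests

def basic_auth_header(username, password):
    payload = base64.b64encode(
        "{0}:{1}".format(username, password).encode("utf-8")
    ).decode("ascii")
    return { "Authorization" : "Basic " + payload }

headers = basic_auth_header("neo4j", "secret")
# Produces: Authorization: Basic bmVvNGo6c2VjcmV0
print(requests.get("http://localhost:7474/user/neo4j", headers=headers).json())
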
Example request
• GET http://localhost:7474/user/neo4j
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6c2VjcmV0
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"username" : "neo4j",
"password_change" : "http://localhost:7474/user/neo4j/password",

312

REST API
"password_change_required" : false
}

Incorrect authentication
If an incorrect username or password is provided, the server replies with an error.
Example request
• POST http://localhost:7474/db/data/
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6aW5jb3JyZWN0
Example response
• 401: Unauthorized
• Content-Type: application/json; charset=UTF-8
• WWW-Authenticate: None
{
"errors" : [ {
"message" : "Invalid username or password.",
"code" : "Neo.ClientError.Security.AuthorizationFailed"
} ]
}

Required password changes
In some cases, like the very first time Neo4j is accessed, the user will be required to choose a new
password. The database will signal that a new password is required and deny access.
See the section called “User status and password changing” [313] for how to set a new password.
Example request
• GET http://localhost:7474/db/data/
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6bmVvNGo=
Example response
• 403: Forbidden
• Content-Type: application/json; charset=UTF-8
{
"password_change" : "http://localhost:7474/user/neo4j/password",
"errors" : [ {
"message" : "User is required to change their password.",
"code" : "Neo.ClientError.Security.AuthorizationFailed"
} ]
}

User status and password changing
User status
Given that you know the current password, you can ask the server for the user status.
Example request
• GET http://localhost:7474/user/neo4j
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6c2VjcmV0
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"username" : "neo4j",
"password_change" : "http://localhost:7474/user/neo4j/password",
"password_change_required" : false
}

User status on first access
On first access, and using the default password, the user status will indicate that the user's password
requires changing.
Example request
• GET http://localhost:7474/user/neo4j
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6bmVvNGo=
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"username" : "neo4j",
"password_change" : "http://localhost:7474/user/neo4j/password",
"password_change_required" : true
}

Changing the user password
Given that you know the current password, you can ask the server to change a user's password. You can
choose any password you like, as long as it is different from the current password.
Example request
• POST http://localhost:7474/user/neo4j/password
• Accept: application/json; charset=UTF-8
• Authorization: Basic bmVvNGo6bmVvNGo=
• Content-Type: application/json

{
"password" : "secret"
}

Example response
• 200: OK
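
The same call is easy to script. A Python sketch (assuming the requests library) that changes the default password and then verifies the new credentials:

import requests

# Change the password using the current (default) credentials.
requests.post(
    "http://localhost:7474/user/neo4j/password",
    json={ "password" : "secret" },
    auth=("neo4j", "neo4j"),
)

# Subsequent requests must authenticate with the new password.
status = requests.get(
    "http://localhost:7474/user/neo4j", auth=("neo4j", "secret")
).json()
print(status["password_change_required"])  # expected: False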

Access when auth is disabled
When auth is disabled
When auth has been disabled in the configuration, requests can be sent without an Authorization
header.
Example request
• GET http://localhost:7474/db/data/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"node" : "http://localhost:7474/db/data/node",
"node_index" : "http://localhost:7474/db/data/index/node",
"relationship_index" : "http://localhost:7474/db/data/index/relationship",
"extensions_info" : "http://localhost:7474/db/data/ext",
"relationship_types" : "http://localhost:7474/db/data/relationship/types",
"batch" : "http://localhost:7474/db/data/batch",
"cypher" : "http://localhost:7474/db/data/cypher",
"indexes" : "http://localhost:7474/db/data/schema/index",
"constraints" : "http://localhost:7474/db/data/schema/constraint",
"transaction" : "http://localhost:7474/db/data/transaction",
"node_labels" : "http://localhost:7474/db/data/labels",
"neo4j_version" : "2.3.12"
}

Copying security configuration from one instance to another
In many cases, such as automated deployments, you may want to start a Neo4j instance with preconfigured authentication and authorization. This is possible by copying the auth database file from a
pre-existing Neo4j instance to your new instance.
This file is located at data/dbms/auth, and simply copying that file into a new Neo4j instance will transfer
your password and authorization token.
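
A deployment script only needs an ordinary file copy. For instance, in Python (both paths below are placeholders for your own installation directories):

import shutil

# Hypothetical paths; adjust them to your installations.
source = "/opt/neo4j-template/data/dbms/auth"
target = "/opt/neo4j-new/data/dbms/auth"

# Copy the auth database file before starting the new instance.
shutil.copy(source, target)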

21.4. Service root
Get service root
The service root is your starting point to discover the REST API. It contains the basic starting points for
the database, and some version and extension information.
Figure 21.1. Final Graph

Example request
• GET http://localhost:7474/db/data/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"node" : "http://localhost:7474/db/data/node",
"node_index" : "http://localhost:7474/db/data/index/node",
"relationship_index" : "http://localhost:7474/db/data/index/relationship",
"extensions_info" : "http://localhost:7474/db/data/ext",
"relationship_types" : "http://localhost:7474/db/data/relationship/types",
"batch" : "http://localhost:7474/db/data/batch",
"cypher" : "http://localhost:7474/db/data/cypher",
"indexes" : "http://localhost:7474/db/data/schema/index",
"constraints" : "http://localhost:7474/db/data/schema/constraint",
"transaction" : "http://localhost:7474/db/data/transaction",
"node_labels" : "http://localhost:7474/db/data/labels",
"neo4j_version" : "2.3.12"
}

21.5. Streaming
All responses from the REST API can be transmitted as JSON streams, resulting in better performance
and lower memory overhead on the server side. To use streaming, supply the header X-Stream: true
with each request.
Example request
• GET http://localhost:7474/db/data/
• Accept: application/json
• X-Stream: true
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8; stream=true
{
"extensions" : { },
"node" : "http://localhost:7474/db/data/node",
"node_index" : "http://localhost:7474/db/data/index/node",
"relationship_index" : "http://localhost:7474/db/data/index/relationship",
"extensions_info" : "http://localhost:7474/db/data/ext",
"relationship_types" : "http://localhost:7474/db/data/relationship/types",
"batch" : "http://localhost:7474/db/data/batch",
"cypher" : "http://localhost:7474/db/data/cypher",
"indexes" : "http://localhost:7474/db/data/schema/index",
"constraints" : "http://localhost:7474/db/data/schema/constraint",
"transaction" : "http://localhost:7474/db/data/transaction",
"node_labels" : "http://localhost:7474/db/data/labels",
"neo4j_version" : "2.3.12"
}
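
To benefit from streaming, the client should also consume the response incrementally rather than buffering it. A sketch with the Python requests library (an assumption, not a requirement):

import requests

response = requests.get(
    "http://localhost:7474/db/data/",
    headers={ "Accept" : "application/json", "X-Stream" : "true" },
    stream=True,  # do not buffer the whole body in memory
)
for chunk in response.iter_content(chunk_size=8192):
    # Handle each chunk as it arrives instead of waiting for the end.
    print(len(chunk), "bytes received")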

21.6. Legacy Cypher HTTP endpoint
Note

This endpoint is deprecated. Please transition to using the new transactional endpoint (see
Section 21.1, “Transactional Cypher HTTP endpoint” [298]). Among other things it allows you
to run multiple Cypher statements in the same transaction.
The Neo4j REST API allows querying with Cypher, see Part III, “Cypher Query Language” [102]. The
results are returned as a list of string headers (columns), and a data part, consisting of a list of all rows,
every row consisting of a list of REST representations of the field value — Node, Relationship, Path or any
simple value like String.

Tip

In order to speed up queries in repeated scenarios, try not to use literals but replace them
with parameters wherever possible in order to let the server cache query plans, see the
section called “Use parameters” [318] for details. Also see Section 8.5, “Parameters” [113]
for where parameters can be used.

Use parameters
Cypher supports queries with parameters which are submitted as JSON.
MATCH (x { name: { startName }})-[r]-(friend)
WHERE friend.name = { name }
RETURN TYPE(r)

Figure 21.2. Final Graph
(Node[80] {name: 'I'})-[:know]->(Node[79] {name: 'you'})

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH (x {name: {startName}})-[r]-(friend) WHERE friend.name = {name} RETURN TYPE(r)",
"params" : {
"startName" : "I",
"name" : "you"
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "TYPE(r)" ],
"data" : [ [ "know" ] ]
}
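
The same request can of course be issued from code. A Python sketch using the requests library (and keep in mind that this endpoint is deprecated in favour of the transactional one):

import requests

payload = {
    "query" : "MATCH (x {name: {startName}})-[r]-(friend) "
              "WHERE friend.name = {name} RETURN TYPE(r)",
    "params" : { "startName" : "I", "name" : "you" },
}
result = requests.post(
    "http://localhost:7474/db/data/cypher", json=payload
).json()
print(result["columns"])  # ['TYPE(r)']
print(result["data"])     # [['know']]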

Create a node
Create a node with a label and a property using Cypher. See the request for the parameter sent with
the query.
CREATE (n:Person { name : { name }})
RETURN n

Figure 21.3. Final Graph
Node[66]:Person {name: 'Andres'}

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "CREATE (n:Person { name : {name} }) RETURN n",
"params" : {
"name" : "Andres"
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "n" ],
"data" : [ [ {
"labels" : "http://localhost:7474/db/data/node/66/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/66/relationships/out",
"data" : {
"name" : "Andres"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/66/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/66/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/66",
"property" : "http://localhost:7474/db/data/node/66/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/66/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/66/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/66/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/66/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/66/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/66/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/66/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 66,
"labels" : [ "Person" ]
}
} ] ]
}

Create a node with multiple properties
Create a node with a label and multiple properties using Cypher. See the request for the parameter
sent with the query.
CREATE (n:Person { props })
RETURN n

Figure 21.4. Final Graph
Node[63]:Person {awesome: true, children: 3, name: 'Michael', position: 'Developer'}

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "CREATE (n:Person { props } ) RETURN n",
"params" : {
"props" : {
"position" : "Developer",
"name" : "Michael",
"awesome" : true,
"children" : 3
}
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "n" ],
"data" : [ [ {
"labels" : "http://localhost:7474/db/data/node/63/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/63/relationships/out",
"data" : {
"position" : "Developer",
"awesome" : true,
"name" : "Michael",
"children" : 3
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/63/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/63/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/63",
"property" : "http://localhost:7474/db/data/node/63/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/63/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/63/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/63/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/63/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/63/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/63/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/63/relationships/in/{-list|&|types}",
"metadata" : {

320

REST API
"id" : 63,
"labels" : [ "Person" ]
}
} ] ]
}

Create multiple nodes with properties
Create multiple nodes with properties using Cypher. See the request for the parameter sent with the
query.
UNWIND { props } AS map
CREATE (n:Person)
SET n = map
RETURN n

Figure 21.5. Final Graph
Node[66]:Person {name: 'Andres'}
Node[67]:Person {name: 'Andres', position: 'Developer'}
Node[68]:Person {name: 'Michael', position: 'Developer'}

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "UNWIND {props} as map CREATE (n:Person) SET n = map RETURN n",
"params" : {
"props" : [ {
"name" : "Andres",
"position" : "Developer"
}, {
"name" : "Michael",
"position" : "Developer"
} ]
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "n" ],
"data" : [ [ {
"labels" : "http://localhost:7474/db/data/node/67/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/67/relationships/out",
"data" : {
"position" : "Developer",
"name" : "Andres"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/67/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/67/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/67",
"property" : "http://localhost:7474/db/data/node/67/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/67/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/67/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/67/relationships/in",
"extensions" : { },

321

REST API
"create_relationship" : "http://localhost:7474/db/data/node/67/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/67/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/67/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/67/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 67,
"labels" : [ "Person" ]
}
} ], [ {
"labels" : "http://localhost:7474/db/data/node/68/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/68/relationships/out",
"data" : {
"position" : "Developer",
"name" : "Michael"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/68/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/68",
"property" : "http://localhost:7474/db/data/node/68/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/68/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/68/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/68/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/68/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/68/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/68/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 68,
"labels" : [ "Person" ]
}
} ] ]
}

Set all properties on a node using Cypher
Set all properties on a node.
CREATE (n:Person { name: 'this property is to be deleted' })
SET n = { props }
RETURN n

Figure 21.6. Final Graph
Node[94]:Person {awesome: true, children: 3, firstName: 'Michael', position: 'Developer'}

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "CREATE (n:Person { name: 'this property is to be deleted' } ) SET n = { props } RETURN n",
"params" : {
"props" : {
"position" : "Developer",
"firstName" : "Michael",
"awesome" : true,
"children" : 3

322

REST API
}
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "n" ],
"data" : [ [ {
"labels" : "http://localhost:7474/db/data/node/94/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/94/relationships/out",
"data" : {
"position" : "Developer",
"awesome" : true,
"children" : 3,
"firstName" : "Michael"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/94/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/94/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/94",
"property" : "http://localhost:7474/db/data/node/94/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/94/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/94/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/94/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/94/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/94/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/94/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/94/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 94,
"labels" : [ "Person" ]
}
} ] ]
}

Send a query
A simple query returning all nodes connected to a given node, together with the relationship type and
the name and age properties of each connected node; a property that does not exist is returned as NULL:
MATCH (x { name: 'I' })-[r]->(n)
RETURN type(r), n.name, n.age

Figure 21.7. Final Graph
(Node[88] {name: 'I'})-[:know]->(Node[86] {name: 'you'})
(Node[88] {name: 'I'})-[:know]->(Node[87] {name: 'him', age: 25})

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH (x {name: 'I'})-[r]->(n) RETURN type(r), n.name, n.age",
"params" : { }
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "type(r)", "n.name", "n.age" ],
"data" : [ [ "know", "him", 25 ], [ "know", "you", null ] ]
}

Return paths
Paths can be returned just like other return types.
MATCH path =(x { name: 'I' })--(friend)
RETURN path, friend.name

Figure 21.8. Final Graph
(Node[93] {name: 'I'})-[:know]->(Node[92] {name: 'you'})

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH path = (x {name: 'I'})--(friend) RETURN path, friend.name",
"params" : { }
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "path", "friend.name" ],
"data" : [ [ {
"directions" : [ "->" ],
"start" : "http://localhost:7474/db/data/node/93",
"nodes" : [ "http://localhost:7474/db/data/node/93", "http://localhost:7474/db/data/node/92" ],
"length" : 1,
"relationships" : [ "http://localhost:7474/db/data/relationship/21" ],
"end" : "http://localhost:7474/db/data/node/92"

324

REST API
}, "you" ] ]
}

Nested results
When sending queries that return nested results like lists and maps, these will get serialized into nested
JSON representations according to their types.
MATCH (n)
WHERE n.name IN ['I', 'you']
RETURN collect(n.name)

Figure 21.9. Final Graph
(Node[90] {name: 'I'})-[:know]->(Node[89] {name: 'you'})

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH (n) WHERE n.name in ['I', 'you'] RETURN collect(n.name)",
"params" : { }
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "collect(n.name)" ],
"data" : [ [ [ "you", "I" ] ] ]
}

Retrieve query metadata
By passing in the additional query parameter includeStats=true when you execute Cypher queries,
metadata about the query will be returned, such as how many labels were added or removed by the query.
MATCH (n { name: 'I' })
SET n:Actor
REMOVE n:Director
RETURN labels(n)

Figure 21.10. Final Graph
Node[91]:Actor {name: 'I'}

Example request
• POST http://localhost:7474/db/data/cypher?includeStats=true
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH (n {name: 'I'}) SET n:Actor REMOVE n:Director RETURN labels(n)",
"params" : { }
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"columns" : [ "labels(n)" ],
"data" : [ [ [ "Actor" ] ] ],
"stats" : {
"relationships_created" : 0,
"nodes_deleted" : 0,
"relationship_deleted" : 0,
"indexes_added" : 0,
"properties_set" : 0,
"constraints_removed" : 0,
"indexes_removed" : 0,
"labels_removed" : 1,
"constraints_added" : 0,
"labels_added" : 1,
"nodes_created" : 0,
"contains_updates" : true
}
}

Errors
Errors on the server will be reported as a JSON-formatted message, exception name and stacktrace.
MATCH (x { name: 'I' })
RETURN x.dummy/0

Figure 21.11. Final Graph
(Node[73] {name: 'I'})-[:know]->(Node[72] {name: 'you'})

Example request
• POST http://localhost:7474/db/data/cypher
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"query" : "MATCH (x {name: 'I'}) RETURN x.dummy/0",
"params" : { }

326

REST API
}

Example response
• 400: Bad Request
• Content-Type: application/json; charset=UTF-8
{
"message": "/ by zero",
"exception": "BadInputException",
"fullname": "org.neo4j.server.rest.repr.BadInputException",
"stackTrace": [
"org.neo4j.server.rest.repr.RepresentationExceptionHandlingIterable.exceptionOnNext(RepresentationExceptionHandlingIterable.java:39)",
"org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:55)",
"org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)",
"org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)",
"org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
"org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
"org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)",
"org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)",
"org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)",
"org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)",
"org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)",
"org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)",
"org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"cause": {
"message": "/ by zero",
"errors": [
{
"message": "/ by zero",
"code": "Neo.ClientError.Statement.ArithmeticError"
}
],
"cause": {
"message": "/ by zero",
"errors": [
{
"message": "/ by zero",
"code": "Neo.ClientError.Statement.ArithmeticError"
}
],
"cause": {
"message": "/ by zero",
"errors": [
{
"message": "/ by zero",
"code": "Neo.ClientError.Statement.ArithmeticError"
}
],
"cause": {
"errors": [
{
"code": "Neo.DatabaseError.General.UnknownFailure",
"stackTrace": "org.neo4j.cypher.internal.frontend.v2_3.ArithmeticException\n\tat
org.neo4j.cypher.internal.compiler.v2_3.commands.expressions.Divide.apply(Divide.scala:36)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun$internalCreateResults$1$$anonfun$apply
$1.apply(ProjectionPipe.scala:48)\n\tat org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun
$internalCreateResults$1.apply(ProjectionPipe.scala:46)\n\tat scala.collection.immutable.Map
$Map1.foreach(Map.scala:116)\n\tat org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun
$internalCreateResults$1.apply(ProjectionPipe.scala:46)\n\tat org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe
$$anonfun$internalCreateResults$1.apply(ProjectionPipe.scala:45)\n\tat scala.collection.Iterator$$anon
$11.next(Iterator.scala:370)\n\tat scala.collection.Iterator$$anon$11.next(Iterator.scala:370)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:75)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:72)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$failIfThrows$1.apply(ResultIterator.scala:121)\n
\tat org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.decoratedCypherException(ResultIterator.scala:130)\n
\tat org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.failIfThrows(ResultIterator.scala:119)\n
\tat org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.next(ResultIterator.scala:72)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.next(ResultIterator.scala:50)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult.next(PipeExecutionResult.scala:77)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult$$anon$2.next(PipeExecutionResult.scala:70)\n\tat
org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult$$anon$2.next(PipeExecutionResult.scala:68)\n
\tat org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1$$anonfun$next
$1.apply(CompatibilityFor2_3.scala:234)\n\tat org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1$
$anonfun$next$1.apply(CompatibilityFor2_3.scala:234)\n\tat
org.neo4j.cypher.internal.compatibility.exceptionHandlerFor2_3$.runSafely(CompatibilityFor2_3.scala:116)\n\tat
org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1.next(CompatibilityFor2_3.scala:234)\n\tat
org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1.next(CompatibilityFor2_3.scala:229)\n
\tat org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:233)\n
\tat org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:55)\n\tat
org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:53)\n
\tat org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)\n\tat
org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)\n
\tat org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)\n\tat
org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)\n\tat
org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)\n
\tat org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)\n
\tat org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)\n\tat
org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)\n
\tat org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)\n\tat
org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)\n\tat
org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)\n\tat
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)\n\tat
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
java.lang.reflect.Method.invoke(Method.java:606)\n\tat com.sun.jersey.spi.container.JavaMethodInvokerFactory
$1.invoke(JavaMethodInvokerFactory.java:60)\n\tat
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider
$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)\n\tat
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)\n
\tat org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)\n
\tat com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)\n\tat
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)\n\tat
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)\n\tat
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)\n\tat
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)\n\tat
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)\n
\tat com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)\n
\tat com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)\n
\tat com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)\n\tat
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)\n\tat
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)\n\tat
javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)\n\tat
org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)\n
\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\n
\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\n
\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\n
\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n
\tat org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:497)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)\n\tat
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)\n\tat org.eclipse.jetty.io.AbstractConnection
$2.run(AbstractConnection.java:540)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)\n
\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)\n\tat
java.lang.Thread.run(Thread.java:745)\n"
}
],
"exception": "ArithmeticException",
"fullname": "org.neo4j.cypher.internal.frontend.v2_3.ArithmeticException",
"stackTrace": [
"org.neo4j.cypher.internal.compiler.v2_3.commands.expressions.Divide.apply(Divide.scala:36)",
"org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun$internalCreateResults$1$$anonfun$apply
$1.apply(ProjectionPipe.scala:48)",
"org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun$internalCreateResults$1$$anonfun$apply
$1.apply(ProjectionPipe.scala:46)",
"scala.collection.immutable.Map$Map1.foreach(Map.scala:116)",
"org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun$internalCreateResults
$1.apply(ProjectionPipe.scala:46)",
"org.neo4j.cypher.internal.compiler.v2_3.pipes.ProjectionPipe$$anonfun$internalCreateResults
$1.apply(ProjectionPipe.scala:45)",
"scala.collection.Iterator$$anon$11.next(Iterator.scala:370)",
"scala.collection.Iterator$$anon$11.next(Iterator.scala:370)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:75)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:72)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator$$anonfun$failIfThrows$1.apply(ResultIterator.scala:121)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.decoratedCypherException(ResultIterator.scala:130)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.failIfThrows(ResultIterator.scala:119)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.next(ResultIterator.scala:72)",
"org.neo4j.cypher.internal.compiler.v2_3.ClosingIterator.next(ResultIterator.scala:50)",
"org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult.next(PipeExecutionResult.scala:77)",
"org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult$$anon$2.next(PipeExecutionResult.scala:70)",
"org.neo4j.cypher.internal.compiler.v2_3.PipeExecutionResult$$anon$2.next(PipeExecutionResult.scala:68)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1$$anonfun$next
$1.apply(CompatibilityFor2_3.scala:234)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1$$anonfun$next
$1.apply(CompatibilityFor2_3.scala:234)",
"org.neo4j.cypher.internal.compatibility.exceptionHandlerFor2_3$.runSafely(CompatibilityFor2_3.scala:116)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon
$1.next(CompatibilityFor2_3.scala:234)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon
$1.next(CompatibilityFor2_3.scala:229)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:233)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:55)",
"org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:53)",
"org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)",
"org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)",
"org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
"org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
"org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)",
"org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)",
"org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)",
"org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)",
"org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)",
"org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)",
"org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "ArithmeticException",
"fullname": "org.neo4j.cypher.ArithmeticException",
"stackTrace": [
"org.neo4j.cypher.internal.compatibility.exceptionHandlerFor2_3$.arithmeticException(CompatibilityFor2_3.scala:63)",
"org.neo4j.cypher.internal.compatibility.exceptionHandlerFor2_3$.arithmeticException(CompatibilityFor2_3.scala:60)",
"org.neo4j.cypher.internal.frontend.v2_3.ArithmeticException.mapToPublic(CypherException.scala:111)",
"org.neo4j.cypher.internal.compatibility.exceptionHandlerFor2_3$.runSafely(CompatibilityFor2_3.scala:121)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1.next(CompatibilityFor2_3.scala:234)",
"org.neo4j.cypher.internal.compatibility.ExecutionResultWrapperFor2_3$$anon$1.next(CompatibilityFor2_3.scala:229)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:233)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:55)",
"org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:53)",
"org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)",
"org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)",
"org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
"org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
"org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)",
"org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)",
"org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)",
"org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)",
"org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)",
"org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)",
"org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "QueryExecutionKernelException",
"fullname": "org.neo4j.kernel.impl.query.QueryExecutionKernelException",
"stackTrace": [
"org.neo4j.cypher.javacompat.ExecutionResult.converted(ExecutionResult.java:391)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:237)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:55)",
"org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:53)",
"org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)",
"org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)",
"org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
"org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
"org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)",
"org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)",
"org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)",
"org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)",
"org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)",
"org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)",
"org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "QueryExecutionException",
"fullname": "org.neo4j.graphdb.QueryExecutionException",
"stackTrace": [
"org.neo4j.kernel.impl.query.QueryExecutionKernelException.asUserException(QueryExecutionKernelException.java:35)",
"org.neo4j.cypher.javacompat.ExecutionResult.converted(ExecutionResult.java:391)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:237)",
"org.neo4j.cypher.javacompat.ExecutionResult.next(ExecutionResult.java:55)",
"org.neo4j.helpers.collection.ExceptionHandlingIterable$1.next(ExceptionHandlingIterable.java:53)",
"org.neo4j.helpers.collection.IteratorWrapper.next(IteratorWrapper.java:47)",

330

REST API
"org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:64)",
"org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)",
"org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)",
"org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:58)",
"org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)",
"org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:245)",
"org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:177)",
"org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:160)",
"org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:73)",
"org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:127)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"errors": [
{
"message": "/ by zero",
"code": "Neo.ClientError.Request.InvalidFormat"
}
]
}

21.7. Property values
The REST API allows setting properties on nodes and relationships through direct RESTful operations.
However, there are restrictions as to what types of values can be used as property values. Allowed
value types are as follows:
• Numbers: Both integer values, with capacity as Java’s Long type, and floating points, with capacity as
Java’s Double.
• Booleans.
• Strings.
• Arrays of the basic types above.

Arrays
There are two important points to be made about array values. First, all values in the array must be of
the same type. That means either all integers, all floats, all booleans or all strings. Mixing types is not
currently supported.
Second, storing empty arrays is only possible given certain preconditions. Because the JSON transfer
format does not contain type information for arrays, type is inferred from the values in the array. If the
array is empty, the Neo4j Server cannot determine the type. In these cases, it will check if an array is
already stored for the given property, and will use the stored array’s type when storing the empty array.
If no array exists already, the server will reject the request.
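
As a hedged illustration of this precondition, using the single-property endpoint PUT /db/data/node/<id>/properties/<key> and the Python requests library (the node id 4 is just an example):

import requests

url = "http://localhost:7474/db/data/node/4/properties/scores"

# Storing a non-empty array works; its element type is recorded.
requests.put(url, json=[ 1, 2, 3 ])

# Storing an empty array can now succeed, because the server can reuse
# the type of the array already stored under this property.
response = requests.put(url, json=[ ])
print(response.status_code)  # expected 204 once a typed array exists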

Property keys
You can list all property keys ever used in the database. This includes any property keys you have used
and since deleted.
There is currently no way to tell which ones are in use and which ones are not, short of walking the
entire set of properties in the database.

List all property keys
Example request
• GET http://localhost:7474/db/data/propertykeys
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ "key", "happy", "since", "name", "öäüÖÄÜß", "value", "age", "non-existent", "ID", "cost" ]

21.8. Nodes
Create node
Figure 21.12. Final Graph
Node[8]

Example request
• POST http://localhost:7474/db/data/node
• Accept: application/json; charset=UTF-8
Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/node/8
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/8/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/8/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/8/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/8/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/8",
"property" : "http://localhost:7474/db/data/node/8/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/8/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/8/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/8/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/8/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/8/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/8/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/8/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 8,
"labels" : [ ]
},
"data" : { }
}

Create node with properties
Figure 21.13. Final Graph: Node[4] (foo = 'bar')

Example request
• POST http://localhost:7474/db/data/node
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"foo" : "bar"

}

Example response
• 201: Created
• Content-Length: 1209
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/node/4

{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/4/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/4/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/4/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/4/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/4",
"property" : "http://localhost:7474/db/data/node/4/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/4/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/4/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/4/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/4/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/4/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/4/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/4/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 4,
"labels" : [ ]
},
"data" : {
"foo" : "bar"
}
}

Get node
Note that the response contains URI templates for the available operations, such as getting the node’s properties and relationships.
Figure 21.14. Final Graph: Node[54]

Example request
• GET http://localhost:7474/db/data/node/54
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/54/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/54/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/54/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/54/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/54",
"property" : "http://localhost:7474/db/data/node/54/properties/{key}",

"properties" : "http://localhost:7474/db/data/node/54/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/54/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/54/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/54/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/54/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/54/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/54/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 54,
"labels" : [ ]
},
"data" : { }
}

Get non-existent node
Figure 21.15. Final Graph: Node[58]

Example request
• GET http://localhost:7474/db/data/node/5800000
• Accept: application/json; charset=UTF-8
Example response
• 404: Not Found
• Content-Type: application/json; charset=UTF-8
{
"message": "Cannot find node with id [5800000] in database.",
"exception": "NodeNotFoundException",
"fullname": "org.neo4j.server.rest.web.NodeNotFoundException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.node(DatabaseActions.java:174)",
"org.neo4j.server.rest.web.DatabaseActions.getNode(DatabaseActions.java:219)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.getNode(RestfulGraphDatabase.java:279)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"cause": {
"message": "Node 5800000 not found",
"errors": [
{
"message": "Node 5800000 not found",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"cause": {
"message": "Unable to load NODE with id 5800000.",
"errors": [
{
"message": "Unable to load NODE with id 5800000.",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"exception": "EntityNotFoundException",
"fullname": "org.neo4j.kernel.api.exceptions.EntityNotFoundException",

"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getNodeById(GraphDatabaseFacade.java:228)",
"org.neo4j.server.rest.web.DatabaseActions.node(DatabaseActions.java:170)",
"org.neo4j.server.rest.web.DatabaseActions.getNode(DatabaseActions.java:219)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.getNode(RestfulGraphDatabase.java:279)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "NotFoundException",
"fullname": "org.neo4j.graphdb.NotFoundException",
"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getNodeById(GraphDatabaseFacade.java:228)",
"org.neo4j.server.rest.web.DatabaseActions.node(DatabaseActions.java:170)",
"org.neo4j.server.rest.web.DatabaseActions.getNode(DatabaseActions.java:219)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.getNode(RestfulGraphDatabase.java:279)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"errors": [
{
"message": "Cannot find node with id [5800000] in database.",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
]
}

Delete node
Figure 21.16. Starting Graph: Node[5]

Figure 21.17. Final Graph: (empty)

Example request
• DELETE http://localhost:7474/db/data/node/5
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Nodes with relationships cannot be deleted
The relationships on a node have to be deleted before the node can be deleted.

Tip

You can use DETACH DELETE in Cypher to delete nodes and their relationships in one go.

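As a sketch of the Cypher alternative, a node and its relationships can be removed in one request through the transactional Cypher endpoint. This is an illustration only, assuming a local unauthenticated server, the Python requests library, and a hypothetical node id:

import requests

payload = {"statements": [{
    # DETACH DELETE removes the node together with all of its relationships.
    "statement": "MATCH (n) WHERE id(n) = {id} DETACH DELETE n",
    "parameters": {"id": 12}
}]}
r = requests.post("http://localhost:7474/db/data/transaction/commit", json=payload)
print(r.json()["errors"])  # [] on success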
Figure 21.18. Starting Graph: Node[12] -[LOVES]-> Node[13]

Example request
• DELETE http://localhost:7474/db/data/node/12
• Accept: application/json; charset=UTF-8
Example response
• 409: Conflict
• Content-Type: application/json; charset=UTF-8
{
"message": "The node with id 12 cannot be deleted. Check that the node is orphaned before deletion.",
"exception": "ConstraintViolationException",
"fullname": "org.neo4j.graphdb.ConstraintViolationException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.deleteNode(DatabaseActions.java:228)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteNode(RestfulGraphDatabase.java:293)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"errors": [
{
"message": "The node with id 12 cannot be deleted. Check that the node is orphaned before deletion.",
"code": "Neo.ClientError.Schema.ConstraintViolation"
}
]
}


21.9. Relationships
Relationships are first-class citizens in the Neo4j REST API. They can be accessed either stand-alone or
through the nodes they are attached to.
The general pattern to get relationships from a node is:
GET http://localhost:7474/db/data/node/123/relationships/{dir}/{-list|&|types}

Here, dir is one of all, in or out, and types is an ampersand-separated list of relationship types. See the
examples below for more information.

Get Relationship by ID
Figure 21.19. Final Graph: Node[19] (name = 'I') -[know]-> Node[18] (name = 'you')

Example request
• GET http://localhost:7474/db/data/relationship/9
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/19",
"property" : "http://localhost:7474/db/data/relationship/9/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/9",
"properties" : "http://localhost:7474/db/data/relationship/9/properties",
"type" : "know",
"end" : "http://localhost:7474/db/data/node/18",
"metadata" : {
"id" : 9,
"type" : "know"
},
"data" : { }
}

Create relationship
Upon successful creation of a relationship, the new relationship is returned.

Figure 21.20. Starting Graph: Node[1] (name = 'Joe') -[knows]-> Node[0] (name = 'Sara')

Figure 21.21. Final Graph: Node[1] (name = 'Joe') -[knows]-> Node[0] (name = 'Sara'); Node[1] -[LOVES]-> Node[0]

Example request
• POST http://localhost:7474/db/data/node/1/relationships
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/0",
"type" : "LOVES"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/relationship/1
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/1",
"property" : "http://localhost:7474/db/data/relationship/1/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/1",
"properties" : "http://localhost:7474/db/data/relationship/1/properties",
"type" : "LOVES",
"end" : "http://localhost:7474/db/data/node/0",
"metadata" : {
"id" : 1,
"type" : "LOVES"
},
"data" : { }
}

Create a relationship with properties
Upon successful creation of a relationship, the new relationship is returned.
Figure 21.22. Starting Graph: Node[11] (name = 'Joe') -[knows]-> Node[10] (name = 'Sara')

Figure 21.23. Final Graph: Node[11] (name = 'Joe') -[knows]-> Node[10] (name = 'Sara'); Node[11] -[LOVES {foo = 'bar'}]-> Node[10]

Example request
• POST http://localhost:7474/db/data/node/11/relationships
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/10",
"type" : "LOVES",
"data" : {
"foo" : "bar"
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/relationship/8
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/11",
"property" : "http://localhost:7474/db/data/relationship/8/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/8",
"properties" : "http://localhost:7474/db/data/relationship/8/properties",
"type" : "LOVES",
"end" : "http://localhost:7474/db/data/node/10",
"metadata" : {
"id" : 8,
"type" : "LOVES"
},
"data" : {

"foo" : "bar"
}
}

Delete relationship
Figure 21.24. Starting Graph: Node[5] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[4] (name = 'Juliet')

Figure 21.25. Final Graph: Node[5] (name = 'Romeo'); Node[4] (name = 'Juliet')

Example request
• DELETE http://localhost:7474/db/data/relationship/2
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Get all properties on a relationship
Figure 21.26. Final Graph: Node[11] (name = 'Romeo') -[LOVES {cost = 'high', since = '1day'}]-> Node[10] (name = 'Juliet')

Example request
• GET http://localhost:7474/db/data/relationship/5/properties
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"since" : "1day",
"cost" : "high"
}

Set all properties on a relationship
Figure 21.27. Starting Graph: Node[9] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[8] (name = 'Juliet')

Figure 21.28. Final Graph: Node[21] (name = 'Romeo') -[LOVES {happy = false}]-> Node[20] (name = 'Juliet')

Example request
• PUT http://localhost:7474/db/data/relationship/10/properties
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"happy" : false
}

Example response
• 204: No Content


Get single property on a relationship
Figure 21.29. Final Graph: Node[13] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[12] (name = 'Juliet')

Example request
• GET http://localhost:7474/db/data/relationship/6/properties/cost
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
"high"

Set single property on a relationship
Figure 21.30. Starting Graph: Node[9] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[8] (name = 'Juliet')

Figure 21.31. Final Graph: Node[9] (name = 'Romeo') -[LOVES {cost = 'deadly'}]-> Node[8] (name = 'Juliet')

Example request
• PUT http://localhost:7474/db/data/relationship/4/properties/cost
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
"deadly"

Example response
• 204: No Content

Get all relationships
Figure 21.32. Final Graph: Node[99] -[LIKES]-> Node[97]; Node[97] -[HATES]-> Node[100]; Node[97] -[LIKES]-> Node[98]; Node[101]

Example request
• GET http://localhost:7474/db/data/node/97/relationships/all
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"start" : "http://localhost:7474/db/data/node/97",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/84",
"property" : "http://localhost:7474/db/data/relationship/84/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/84/properties",
"type" : "HATES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/100",
"metadata" : {
"id" : 84,
"type" : "HATES"
}
}, {
"start" : "http://localhost:7474/db/data/node/99",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/83",
"property" : "http://localhost:7474/db/data/relationship/83/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/83/properties",

"type" : "LIKES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/97",
"metadata" : {
"id" : 83,
"type" : "LIKES"
}
}, {
"start" : "http://localhost:7474/db/data/node/97",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/82",
"property" : "http://localhost:7474/db/data/relationship/82/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/82/properties",
"type" : "LIKES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/98",
"metadata" : {
"id" : 82,
"type" : "LIKES"
}
} ]

Get incoming relationships
Figure 21.33. Final Graph: Node[118] -[LIKES]-> Node[116]; Node[116] -[HATES]-> Node[119]; Node[116] -[LIKES]-> Node[117]; Node[120]

Example request
• GET http://localhost:7474/db/data/node/116/relationships/in
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"start" : "http://localhost:7474/db/data/node/118",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/94",
"property" : "http://localhost:7474/db/data/relationship/94/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/94/properties",
"type" : "LIKES",
"extensions" : { },

"end" : "http://localhost:7474/db/data/node/116",
"metadata" : {
"id" : 94,
"type" : "LIKES"
}
} ]

Get outgoing relationships
Figure 21.34. Final Graph: Node[143] -[LIKES]-> Node[141]; Node[141] -[HATES]-> Node[144]; Node[141] -[LIKES]-> Node[142]; Node[145]

Example request
• GET http://localhost:7474/db/data/node/141/relationships/out
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"start" : "http://localhost:7474/db/data/node/141",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/110",
"property" : "http://localhost:7474/db/data/relationship/110/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/110/properties",
"type" : "HATES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/144",
"metadata" : {
"id" : 110,
"type" : "HATES"
}
}, {
"start" : "http://localhost:7474/db/data/node/141",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/108",
"property" : "http://localhost:7474/db/data/relationship/108/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/108/properties",
"type" : "LIKES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/142",
"metadata" : {

"id" : 108,
"type" : "LIKES"
}
} ]

Get typed relationships


Note that the "&" needs to be encoded, for example as "%26", when using cURL (http://curl.haxx.se/) from the terminal.
Figure 21.35. Final Graph: Node[74] -[LIKES]-> Node[72]; Node[72] -[HATES]-> Node[75]; Node[72] -[LIKES]-> Node[73]; Node[76]

Example request
• GET http://localhost:7474/db/data/node/72/relationships/all/LIKES&HATES
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"start" : "http://localhost:7474/db/data/node/72",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/69",
"property" : "http://localhost:7474/db/data/relationship/69/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/69/properties",
"type" : "HATES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/75",
"metadata" : {
"id" : 69,
"type" : "HATES"
}
}, {
"start" : "http://localhost:7474/db/data/node/74",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/68",
"property" : "http://localhost:7474/db/data/relationship/68/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/68/properties",
"type" : "LIKES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/72",
"metadata" : {
"id" : 68,
"type" : "LIKES"
}
}, {
"start" : "http://localhost:7474/db/data/node/72",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/67",
"property" : "http://localhost:7474/db/data/relationship/67/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/67/properties",
"type" : "LIKES",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/73",
"metadata" : {
"id" : 67,
"type" : "LIKES"
}
} ]

Get relationships on a node without relationships
Figure 21.36. Final Graph: Node[128] -[LIKES]-> Node[126]; Node[126] -[HATES]-> Node[129]; Node[126] -[LIKES]-> Node[127]; Node[130]

Example request
• GET http://localhost:7474/db/data/node/130/relationships/all
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ ]


21.10. Relationship types
Get relationship types
Example request
• GET http://localhost:7474/db/data/relationship/types
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json
[ "LOVES", "KNOWS" ]


21.11. Node properties
Set property on node
Setting a property will retain the other existing properties on the node. Note that a single value is
submitted not as a map but just as a bare value (which is valid JSON), as in the example below.
Figure 21.37. Final Graph: Node[182] (foo2 = 'bar2', foo = 'bar')

Example request
• PUT http://localhost:7474/db/data/node/182/properties/foo
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
"bar"

Example response
• 204: No Content

Update node properties
This will replace all existing properties on the node with the new set of attributes.
Figure 21.38. Final Graph: Node[180] (age = '18') -[knows]-> Node[181] (name = 'joe')

Example request
• PUT http://localhost:7474/db/data/node/180/properties
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"age" : "18"
}

Example response
• 204: No Content


Get properties for node
Figure 21.39. Final Graph: Node[55] (foo = 'bar')

Example request
• GET http://localhost:7474/db/data/node/55/properties
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"foo" : "bar"
}

Get property for node
Get a single node property from a node.
Figure 21.40. Final Graph: Node[54] (foo = 'bar')

Example request
• GET http://localhost:7474/db/data/node/54/properties/foo
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
"bar"

Property values cannot be null
This example shows the response you get when trying to set a property to null.
Example request
• POST http://localhost:7474/db/data/node
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"foo" : null
}

Example response
• 400: Bad Request
• Content-Type: application/json; charset=UTF-8
{
"message": "Could not set property \"foo\", unsupported type: null",
"exception": "PropertyValueException",
"fullname": "org.neo4j.server.rest.web.PropertyValueException",
"stackTrace": [
"org.neo4j.server.rest.domain.PropertySettingStrategy.setProperty(PropertySettingStrategy.java:141)",
"org.neo4j.server.rest.domain.PropertySettingStrategy.setProperties(PropertySettingStrategy.java:88)",
"org.neo4j.server.rest.web.DatabaseActions.createNode(DatabaseActions.java:205)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.createNode(RestfulGraphDatabase.java:252)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"errors": [
{
"message": "Could not set property \"foo\", unsupported type: null",
"code": "Neo.ClientError.Statement.InvalidArguments"
}
]
}

Property values cannot be nested
Nesting properties is not supported. You could, for example, store the nested JSON as a string instead, as sketched after the example below.
Example request
• POST http://localhost:7474/db/data/node/
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"foo" : {
"bar" : "baz"
}
}

Example response
• 400: Bad Request
• Content-Type: application/json; charset=UTF-8
{
"message": "Could not set property \"foo\", unsupported type: {bar\u003dbaz}",
"exception": "PropertyValueException",
"fullname": "org.neo4j.server.rest.web.PropertyValueException",
"stackTrace": [
"org.neo4j.server.rest.domain.PropertySettingStrategy.setProperty(PropertySettingStrategy.java:141)",
"org.neo4j.server.rest.domain.PropertySettingStrategy.setProperties(PropertySettingStrategy.java:88)",
"org.neo4j.server.rest.web.DatabaseActions.createNode(DatabaseActions.java:205)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.createNode(RestfulGraphDatabase.java:252)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"errors": [
{
"message": "Could not set property \"foo\", unsupported type: {bar\u003dbaz}",
"code": "Neo.ClientError.Statement.InvalidArguments"

}
]
}
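
A sketch of the string workaround mentioned above (hypothetical node id; a local unauthenticated server and the Python requests library are assumed):

import json
import requests

URL = "http://localhost:7474/db/data/node/4/properties/foo"
nested = {"bar": "baz"}

# Serialize the nested structure; the server stores a plain string property.
requests.put(URL, json=json.dumps(nested))

# Reading it back returns the string, which the client parses again.
print(json.loads(requests.get(URL).json()))  # {'bar': 'baz'}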

Delete all properties from node
Figure 21.41. Starting Graph: Node[52] (jim = 'tobias')

Figure 21.42. Final Graph: Node[52]

Example request
• DELETE http://localhost:7474/db/data/node/52/properties
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Delete a named property from a node
To delete a single property from a node, see the example below.
Figure 21.43. Starting Graph: Node[51] (name = 'tobias')

Figure 21.44. Final Graph: Node[51]

Example request
• DELETE http://localhost:7474/db/data/node/51/properties/name
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content


21.12. Relationship properties
Update relationship properties
Figure 21.45. Starting Graph: Node[108] -[KNOWS]-> Node[109]

Figure 21.46. Final Graph: Node[108] -[KNOWS {jim = 'tobias'}]-> Node[109]

Example request
• PUT http://localhost:7474/db/data/relationship/54/properties
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"jim" : "tobias"
}

Example response
• 204: No Content


Remove properties from a relationship
Figure 21.47. Starting Graph: Node[1] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[0] (name = 'Juliet')

Figure 21.48. Final Graph: Node[1] (name = 'Romeo') -[LOVES]-> Node[0] (name = 'Juliet')

Example request
• DELETE http://localhost:7474/db/data/relationship/0/properties
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Remove property from a relationship
See the example request below.
Figure 21.49. Starting Graph: Node[7] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[6] (name = 'Juliet')

Figure 21.50. Final Graph: Node[7] (name = 'Romeo') -[LOVES]-> Node[6] (name = 'Juliet')

Example request
• DELETE http://localhost:7474/db/data/relationship/3/properties/cost
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Remove non-existent property from a relationship
Attempting to remove a property that doesn’t exist results in an error.
Figure 21.51. Starting Graph: Node[3] (name = 'Romeo') -[LOVES {cost = 'high'}]-> Node[2] (name = 'Juliet')

Example request
• DELETE http://localhost:7474/db/data/relationship/1/properties/non-existent
• Accept: application/json; charset=UTF-8
Example response
• 404: Not Found
• Content-Type: application/json; charset=UTF-8
{
"message": "Relationship[1] does not have a property \"non-existent\"",
"exception": "NoSuchPropertyException",
"fullname": "org.neo4j.server.rest.web.NoSuchPropertyException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:670)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:812)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",

"java.lang.Thread.run(Thread.java:745)"
],
"errors": [
{
"message": "Relationship[1] does not have a property \"non-existent\"",
"code": "Neo.ClientError.Statement.NoSuchProperty"
}
]
}

Remove properties from a non-existent relationship
Attempting to remove all properties from a relationship which doesn’t exist results in an error.
Example request
• DELETE http://localhost:7474/db/data/relationship/1234/properties
• Accept: application/json; charset=UTF-8
Example response
• 404: Not Found
• Content-Type: application/json; charset=UTF-8
{
"message": "org.neo4j.graphdb.NotFoundException: Relationship 1234 not found",
"exception": "RelationshipNotFoundException",
"fullname": "org.neo4j.server.rest.web.RelationshipNotFoundException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:188)",
"org.neo4j.server.rest.web.DatabaseActions.removeAllRelationshipProperties(DatabaseActions.java:660)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteAllRelationshipProperties(RestfulGraphDatabase.java:792)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"cause": {
"message": "Relationship 1234 not found",
"errors": [
{
"message": "Relationship 1234 not found",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"cause": {
"message": "Unable to load RELATIONSHIP with id 1234.",
"errors": [
{
"message": "Unable to load RELATIONSHIP with id 1234.",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"exception": "EntityNotFoundException",
"fullname": "org.neo4j.kernel.api.exceptions.EntityNotFoundException",
"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getRelationshipById(GraphDatabaseFacade.java:248)",
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:184)",
"org.neo4j.server.rest.web.DatabaseActions.removeAllRelationshipProperties(DatabaseActions.java:660)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteAllRelationshipProperties(RestfulGraphDatabase.java:792)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",

"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "NotFoundException",
"fullname": "org.neo4j.graphdb.NotFoundException",
"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getRelationshipById(GraphDatabaseFacade.java:248)",
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:184)",
"org.neo4j.server.rest.web.DatabaseActions.removeAllRelationshipProperties(DatabaseActions.java:660)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteAllRelationshipProperties(RestfulGraphDatabase.java:792)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"errors": [
{
"message": "org.neo4j.graphdb.NotFoundException: Relationship 1234 not found",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
]
}

Remove property from a non-existent relationship
Attempting to remove a property from a relationship which doesn’t exist results in an error.
Example request
• DELETE http://localhost:7474/db/data/relationship/1234/properties/cost
• Accept: application/json; charset=UTF-8
Example response
• 404: Not Found
• Content-Type: application/json; charset=UTF-8
{
"message": "org.neo4j.graphdb.NotFoundException: Relationship 1234 not found",
"exception": "RelationshipNotFoundException",
"fullname": "org.neo4j.server.rest.web.RelationshipNotFoundException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:188)",
"org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:666)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:812)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"cause": {
"message": "Relationship 1234 not found",
"errors": [
{
"message": "Relationship 1234 not found",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"cause": {
"message": "Unable to load RELATIONSHIP with id 1234.",
"errors": [
{
"message": "Unable to load RELATIONSHIP with id 1234.",

"code": "Neo.ClientError.Statement.EntityNotFound"
}
],
"exception": "EntityNotFoundException",
"fullname": "org.neo4j.kernel.api.exceptions.EntityNotFoundException",
"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getRelationshipById(GraphDatabaseFacade.java:248)",
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:184)",
"org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:666)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:812)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "NotFoundException",
"fullname": "org.neo4j.graphdb.NotFoundException",
"stackTrace": [
"org.neo4j.kernel.impl.factory.GraphDatabaseFacade.getRelationshipById(GraphDatabaseFacade.java:248)",
"org.neo4j.server.rest.web.DatabaseActions.relationship(DatabaseActions.java:184)",
"org.neo4j.server.rest.web.DatabaseActions.removeRelationshipProperty(DatabaseActions.java:666)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.deleteRelationshipProperty(RestfulGraphDatabase.java:812)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"errors": [
{
"message": "org.neo4j.graphdb.NotFoundException: Relationship 1234 not found",
"code": "Neo.ClientError.Statement.EntityNotFound"
}
]
}


21.13. Node labels
Adding a label to a node
Figure 21.52. Starting Graph: Node[146] (name = 'Clint Eastwood')

Figure 21.53. Final Graph: Node[146] : Person (name = 'Clint Eastwood')

Example request
• POST http://localhost:7474/db/data/node/146/labels
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
"Person"

Example response
• 204: No Content

Adding multiple labels to a node
Figure 21.54. Starting Graph: Node[157] (name = 'Clint Eastwood')

Figure 21.55. Final Graph: Node[157] : Person, Actor (name = 'Clint Eastwood')

Example request
• POST http://localhost:7474/db/data/node/157/labels
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
[ "Person", "Actor" ]

Example response
• 204: No Content

Adding a label with an invalid name
Labels with empty names are not allowed; all other non-empty strings are accepted as label names.
Adding an invalid label to a node leads to an HTTP 400 response.
Example request
• POST http://localhost:7474/db/data/node/164/labels
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
""

Example response
• 400: Bad Request
• Content-Type: application/json; charset=UTF-8
{
"message": "Unable to add label, see nested exception.",
"exception": "BadInputException",
"fullname": "org.neo4j.server.rest.repr.BadInputException",
"stackTrace": [
"org.neo4j.server.rest.web.DatabaseActions.addLabelToNode(DatabaseActions.java:319)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.addNodeLabel(RestfulGraphDatabase.java:446)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
],
"cause": {
"message": "Invalid label name \u0027\u0027.",
"errors": [
{
"message": "Invalid label name \u0027\u0027.",
"code": "Neo.ClientError.Schema.ConstraintViolation"
}
],
"cause": {
"message": "\u0027\u0027 is not a valid token name. Only non-null, non-empty strings are allowed.",
"errors": [
{
"message": "\u0027\u0027 is not a valid token name. Only non-null, non-empty strings are allowed.",
"code": "Neo.ClientError.Schema.IllegalTokenName"
}
],
"exception": "IllegalTokenNameException",
"fullname": "org.neo4j.kernel.api.exceptions.schema.IllegalTokenNameException",
"stackTrace": [

"org.neo4j.kernel.impl.api.DataIntegrityValidatingStatementOperations.checkValidTokenName(DataIntegrityValidatingStatementOperations.ja

"org.neo4j.kernel.impl.api.DataIntegrityValidatingStatementOperations.labelGetOrCreateForName(DataIntegrityValidatingStatementOperation
"org.neo4j.kernel.impl.api.OperationsFacade.labelGetOrCreateForName(OperationsFacade.java:802)",
"org.neo4j.kernel.impl.core.NodeProxy.addLabel(NodeProxy.java:620)",
"org.neo4j.server.rest.web.DatabaseActions.addLabelToNode(DatabaseActions.java:314)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.addNodeLabel(RestfulGraphDatabase.java:446)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"exception": "ConstraintViolationException",
"fullname": "org.neo4j.graphdb.ConstraintViolationException",
"stackTrace": [
"org.neo4j.kernel.impl.core.NodeProxy.addLabel(NodeProxy.java:631)",

"org.neo4j.server.rest.web.DatabaseActions.addLabelToNode(DatabaseActions.java:314)",
"org.neo4j.server.rest.web.RestfulGraphDatabase.addNodeLabel(RestfulGraphDatabase.java:446)",
"java.lang.reflect.Method.invoke(Method.java:606)",
"org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)",
"org.neo4j.server.rest.web.CollectUserAgentFilter.doFilter(CollectUserAgentFilter.java:69)",
"java.lang.Thread.run(Thread.java:745)"
]
},
"errors": [
{
"message": "Unable to add label, see nested exception.",
"code": "Neo.ClientError.Request.InvalidFormat"
}
]
}

Replacing labels on a node
This removes any labels currently on a node, and replaces them with the labels passed in as the
request body.
Figure 21.56. Starting Graph: Node[147] : Person (name = 'Clint Eastwood')

Figure 21.57. Final Graph: Node[147] : Actor, Director (name = 'Clint Eastwood')

Example request
• PUT http://localhost:7474/db/data/node/147/labels
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
[ "Actor", "Director" ]

Example response
• 204: No Content

Removing a label from a node
Figure 21.58. Starting Graph: Node[148] : Person (name = 'Clint Eastwood')

Figure 21.59. Final Graph: Node[148] (name = 'Clint Eastwood')

Example request
• DELETE http://localhost:7474/db/data/node/148/labels/Person
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Removing a non-existent label from a node
Figure 21.60. Starting Graph: Node[149] (name = 'Clint Eastwood')

Figure 21.61. Final Graph: Node[149] (name = 'Clint Eastwood')

Example request
• DELETE http://localhost:7474/db/data/node/149/labels/Person
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Listing labels for a node
Figure 21.62. Final Graph: Node[153] : Actor, Director (name = 'Clint Eastwood')

Example request
• GET http://localhost:7474/db/data/node/153/labels
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ "Actor", "Director" ]

Get all nodes with a label
Figure 21.63. Final Graph: Node[158] : Director (name = 'Steven Spielberg'); Node[159] : Actor, Director (name = 'Clint Eastwood'); Node[160] : Actor (name = 'Donald Sutherland')
Example request
• GET http://localhost:7474/db/data/label/Actor/nodes
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/159/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/159/relationships/out",
"data" : {
"name" : "Clint Eastwood"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/159/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/159/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/159",
"property" : "http://localhost:7474/db/data/node/159/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/159/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/159/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/159/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/159/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/159/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/159/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/159/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 159,
"labels" : [ "Actor", "Director" ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/160/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/160/relationships/out",
"data" : {
"name" : "Donald Sutherland"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/160/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/160/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/160",
"property" : "http://localhost:7474/db/data/node/160/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/160/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/160/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/160/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/160/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/160/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/160/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/160/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 160,
"labels" : [ "Actor" ]
}
} ]

Get nodes by label and property
You can retrieve all nodes with a given label and property by passing one property as a query
parameter. Notice that the property value is JSON-encoded and then URL-encoded.
If there is an index available on the label/property combination you send, that index will be used. If no
index is available, all nodes with the given label will be scanned and filtered to find matching nodes.
Currently, it is not possible to search using multiple properties.
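The double encoding can be performed like this; a minimal Python sketch (the label and value are hypothetical; a local unauthenticated server and the requests library are assumed):

import json
import urllib.parse
import requests

label = "Person"
value = "Clint Eastwood"

# JSON-encode the value first ("Clint Eastwood" -> "\"Clint Eastwood\""),
# then URL-encode the result for use as a query parameter.
encoded = urllib.parse.quote(json.dumps(value))
url = "http://localhost:7474/db/data/label/%s/nodes?name=%s" % (label, encoded)
for node in requests.get(url).json():
    print(node["self"], node["data"])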
Figure 21.64. Final Graph: Node[161] : Person (name = 'Steven Spielberg'); Node[162] : Person (name = 'Clint Eastwood'); Node[163] : Person

Example request
• GET http://localhost:7474/db/data/label/Person/nodes?name=%22Clint+Eastwood%22
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/162/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/162/relationships/out",
"data" : {
"name" : "Clint Eastwood"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/162/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/162/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/162",
"property" : "http://localhost:7474/db/data/node/162/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/162/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/162/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/162/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/162/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/162/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/162/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/162/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 162,
"labels" : [ "Person" ]
}
} ]

List all labels
By default, the server will return labels in use only. If you also want to return labels not in use, append
the "in_use=0" query parameter.
Example request
• GET http://localhost:7474/db/data/labels
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ "Director", "Person", "Actor" ]
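
Both variants can be compared with a short sketch (an illustration only; the Python requests library and a local unauthenticated server are assumed):

import requests

URL = "http://localhost:7474/db/data/labels"

in_use = set(requests.get(URL).json())                            # labels in use only
all_labels = set(requests.get(URL, params={"in_use": 0}).json())  # include unused
print(sorted(all_labels - in_use))  # labels that exist but are no longer in use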


21.14. Node degree
The node degree is the number of relationships associated with a node. Neo4j stores the degree for
each node, making this a useful mechanism to quickly get the number of relationships a node has. You
can also optionally filter degree by direction and/or relationship type.
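The three forms of the degree URI are summarized in this sketch (hypothetical node id and relationship types; the Python requests library is assumed):

import requests

BASE = "http://localhost:7474/db/data/node/20/degree"

print(requests.get(BASE + "/all").json())               # total degree
print(requests.get(BASE + "/out").json())               # outgoing only
print(requests.get(BASE + "/out/KNOWS&LIKES").json())   # outgoing, filtered by type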

Get the degree of a node
Return the total number of relationships associated with a node.
Figure 21.65. Final Graph: Node[20] (name = 'Root') -[knows]-> Node[19] (name = 'Johan'); Node[20] -[knows]-> Node[21] (name = 'Mattias')

Example request
• GET http://localhost:7474/db/data/node/20/degree/all
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
2

Get the degree of a node by direction
Return the number of relationships of a particular direction for a node. Specify all, in or out.
Figure 21.66. Final Graph: Node[23] (name = 'Root') -[knows]-> Node[22] (name = 'Johan'); Node[23] -[knows]-> Node[24] (name = 'Mattias')

Example request
• GET http://localhost:7474/db/data/node/23/degree/out
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8

2

Get the degree of a node by direction and types
If you are only interested in the degree of a particular relationship type, or a set of relationship types,
you specify relationship types after the direction. You can combine multiple relationship types by using
the & character.
Figure 21.67. Final Graph: Node[17] (name = 'Root') -[LIKES]-> Node[15] (name = 'Cookie'); Node[17] -[KNOWS]-> Node[16] (name = 'Johan'); Node[17] -[KNOWS]-> Node[18] (name = 'Mattias')

Example request
• GET http://localhost:7474/db/data/node/17/degree/out/KNOWS&LIKES
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
3


21.15. Indexing
Note

This documents schema-based indexes, a feature introduced in Neo4j 2.0. See Section 21.20,
“Legacy indexing” [416] for legacy indexing.
For more details about indexes and the optional schema in Neo4j, see the section called “Schema” [9].

Create index
This will start a background job in the database that will create and populate the index. You can check
the status of your index by listing all the indexes for the relevant label.
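A client can therefore create the index and then poll the listing until it appears; a minimal sketch with hypothetical label and property names (a local unauthenticated server and the Python requests library are assumed):

import time
import requests

BASE = "http://localhost:7474/db/data/schema/index"

# Kick off background creation of an index on :Person(name).
requests.post(BASE + "/Person", json={"property_keys": ["name"]})

# Poll the label's index listing until the new index shows up.
while not any(ix["property_keys"] == ["name"]
              for ix in requests.get(BASE + "/Person").json()):
    time.sleep(0.5)
print("index created")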
Example request
• POST http://localhost:7474/db/data/schema/index/label_1512731149613_1
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"property_keys" : [ "property_1512731149613_1" ]
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"label" : "label_1512731149613_1",
"property_keys" : [ "property_1512731149613_1" ]
}

List indexes for a label
Example request
• GET http://localhost:7474/db/data/schema/index/label_1512731149346_1
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512731149346_1" ],
"label" : "label_1512731149346_1"
} ]

Drop index
Drop an index for a label and a property.
Example request
• DELETE http://localhost:7474/db/data/schema/index/label_1512731149507_1/property_1512731149507_1
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content


21.16. Constraints
Create uniqueness constraint
Create a uniqueness constraint on a property.
Example request
• POST http://localhost:7474/db/data/schema/constraint/label_1512731127400_1/uniqueness/
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"property_keys" : [ "property_1512731127400_1" ]
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"label" : "label_1512731127400_1",
"type" : "UNIQUENESS",
"property_keys" : [ "property_1512731127400_1" ]
}

Get a specific uniqueness constraint
Get a specific uniqueness constraint for a label and a property.
Example request
• GET http://localhost:7474/db/data/schema/constraint/label_1512731127985_1/uniqueness/property_1512731127985_1
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512731127985_1" ],
"label" : "label_1512731127985_1",
"type" : "UNIQUENESS"
} ]

Get all uniqueness constraints for a label
Example request
• GET http://localhost:7474/db/data/schema/constraint/label_1512731127891_1/uniqueness/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {

"property_keys" : [ "property_1512731127891_1" ],
"label" : "label_1512731127891_1",
"type" : "UNIQUENESS"
}, {
"property_keys" : [ "property_1512731127891_2" ],
"label" : "label_1512731127891_1",
"type" : "UNIQUENESS"
} ]

Drop uniqueness constraint
Drop uniqueness constraint for a label and a property.
Example request
• DELETE http://localhost:7474/db/data/schema/constraint/label_1512731127311_1/uniqueness/property_1512731127311_1
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Get a specific node property existence constraint
Get a specific node property existence constraint for a label and a property.
Example request
• GET http://localhost:7474/db/data/schema/constraint/label_1512734748264_1/existence/property_1512734748264_1
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512734748264_1" ],
"label" : "label_1512734748264_1",
"type" : "NODE_PROPERTY_EXISTENCE"
} ]

Get all node property existence constraints for a label
Example request
• GET http://localhost:7474/db/data/schema/constraint/label_1512734748447_1/existence/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512734748447_1" ],
"label" : "label_1512734748447_1",
"type" : "NODE_PROPERTY_EXISTENCE"
}, {

"property_keys" : [ "property_1512734748447_2" ],
"label" : "label_1512734748447_1",
"type" : "NODE_PROPERTY_EXISTENCE"
} ]

Get all constraints for a label
Example request
• GET http://localhost:7474/db/data/schema/constraint/label_1512731127693_1
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512731127693_1" ],
"label" : "label_1512731127693_1",
"type" : "UNIQUENESS"
} ]

Get a specific relationship property existence constraint
Get a specific relationship property existence constraint for a label and a property.
Example request
• GET http://localhost:7474/db/data/schema/relationship/constraint/relationshipType_1512734746380_1/existence/property_1512734746380_1
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"relationshipType" : "relationshipType_1512734746380_1",
"property_keys" : [ "property_1512734746380_1" ],
"type" : "RELATIONSHIP_PROPERTY_EXISTENCE"
} ]

Get all relationship property existence constraints for a type
Example request
• GET http://localhost:7474/db/data/schema/relationship/constraint/relationshipType_1512734748073_1/existence/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"relationshipType" : "relationshipType_1512734748073_1",
"property_keys" : [ "property_1512734748073_2" ],
"type" : "RELATIONSHIP_PROPERTY_EXISTENCE"

}, {
"relationshipType" : "relationshipType_1512734748073_1",
"property_keys" : [ "property_1512734748073_1" ],
"type" : "RELATIONSHIP_PROPERTY_EXISTENCE"
} ]

Get all constraints
Example request
• GET http://localhost:7474/db/data/schema/constraint
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"property_keys" : [ "property_1512731127457_1" ],
"label" : "label_1512731127457_1",
"type" : "UNIQUENESS"
}, {
"property_keys" : [ "property_1512731127400_1" ],
"label" : "label_1512731127400_1",
"type" : "UNIQUENESS"
} ]


21.17. Traversals
Warning

The Traversal REST endpoint executes arbitrary JavaScript code under the hood as part
of the evaluator definitions. In hosted and open environments, this can constitute a
security risk. In such cases, consider using declarative approaches like Part III, “Cypher Query
Language” [102], writing your own server-side plugin that executes the interesting traversals
with the Java API (see Section 32.1, “Server Plugins” [564]), or securing your server, see
Chapter 27, Security [499].
Traversals are performed from a start node. The traversal is controlled by the URI and the body sent
with the request.
returnType

The kind of objects in the response is determined by traverse/{returnType} in the URL.
returnType can have one of these values:
• node
• relationship
• path: contains full representations of start and end node, the rest are URIs.
• fullpath: contains full representations of all nodes and relationships.

To decide how the graph should be traversed you can use these parameters in the request body:
order

Decides in which order to visit nodes. Possible values:
• breadth_first: see Breadth-first search (http://en.wikipedia.org/wiki/Breadth-first_search).
• depth_first: see Depth-first search (http://en.wikipedia.org/wiki/Depth-first_search).

relationships

Decides which relationship types and directions should be followed. The direction can be one of:
• all
• in
• out

uniqueness

Decides how uniqueness should be calculated. For details on the different uniqueness values, see the Java API on Uniqueness (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/traversal/Uniqueness.html). Possible values:
• node_global
• none
• relationship_global
• node_path
• relationship_path

prune_evaluator

Decides whether the traverser should continue down a path or if the path should be pruned so that the traverser won’t continue down it. You can write your own prune evaluator (see the section called “Traversal using a return filter” [375]) or use the built-in none prune evaluator.

return_filter

Decides whether the current position should be included in the result. You can provide your own code for this (see the section called “Traversal using a return filter” [375]), or use one of the built-in filters:
• all
• all_but_start_node

max_depth

Is a short-hand way of specifying a prune evaluator which prunes after a certain depth. If not specified, a max depth of 1 is used. If a prune_evaluator is specified instead of a max_depth, no max depth limit is set.

The position object in the body of the return_filter and prune_evaluator is a Path object (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/Path.html) representing
the path from the start node to the current traversal position.
Out of the box, the REST API supports JavaScript code in filters and evaluators. The script body will be
executed in a Java context which has access to the full Neo4j Java API (http://neo4j.com/docs/2.3.12/javadocs/). See the examples for the exact
syntax of the request.

Traversal using a return filter
In this example, the none prune evaluator is used and a return filter is supplied in order to return all
names containing "t". The result is to be returned as nodes and the max depth is set to 3.
Figure 21.68. Final Graph: Node[87] (name = 'Root') -[knows]-> Node[85] (name = 'Johan'); Node[87] -[knows]-> Node[90] (name = 'Mattias'); Node[85] -[knows]-> Node[86] (name = 'Emil'); Node[86] -[knows]-> Node[88] (name = 'Tobias'); Node[86] -[knows]-> Node[89] (name = 'Peter'); Node[88] -[loves]-> Node[84] (name = 'Sara')

Example request
• POST http://localhost:7474/db/data/node/87/traverse/node
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"order" : "breadth_first",
"return_filter" : {
"body" : "position.endNode().getProperty('name').toLowerCase().contains('t')",
"language" : "javascript"
},
"prune_evaluator" : {
"body" : "position.length() > 10",
"language" : "javascript"
},
"uniqueness" : "node_global",
"relationships" : [ {
"direction" : "all",
"type" : "knows"
}, {
"direction" : "all",
"type" : "loves"
} ],
"max_depth" : 3
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/87/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/87/relationships/out",
"data" : {
"name" : "Root"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/87/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/87/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/87",
"property" : "http://localhost:7474/db/data/node/87/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/87/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/87/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/87/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/87/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/87/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/87/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/87/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 87,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/90/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/90/relationships/out",
"data" : {
"name" : "Mattias"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/90/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/90/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/90",
"property" : "http://localhost:7474/db/data/node/90/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/90/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/90/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/90/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/90/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/90/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/90/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/90/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 90,
"labels" : [ ]
}
}, {

"labels" : "http://localhost:7474/db/data/node/88/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/88/relationships/out",
"data" : {
"name" : "Tobias"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/88/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/88",
"property" : "http://localhost:7474/db/data/node/88/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/88/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/88/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/88/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/88/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/88/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/88/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 88,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/89/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/89/relationships/out",
"data" : {
"name" : "Peter"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/89/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/89/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/89",
"property" : "http://localhost:7474/db/data/node/89/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/89/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/89/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/89/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/89/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/89/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/89/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/89/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 89,
"labels" : [ ]
}
} ]

Return relationships from a traversal
Figure 21.69. Final Graph
[Graph diagram: 'I' own 'car' and 'I' know 'you'.]

Example request
• POST http://localhost:7474/db/data/node/81/traverse/relationship
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"order" : "breadth_first",
"uniqueness" : "none",
"return_filter" : {
"language" : "builtin",
"name" : "all"
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"start" : "http://localhost:7474/db/data/node/81",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/37",
"property" : "http://localhost:7474/db/data/relationship/37/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/37/properties",
"type" : "own",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/79",
"metadata" : {
"id" : 37,
"type" : "own"
}
}, {
"start" : "http://localhost:7474/db/data/node/81",
"data" : { },
"self" : "http://localhost:7474/db/data/relationship/36",
"property" : "http://localhost:7474/db/data/relationship/36/properties/{key}",
"properties" : "http://localhost:7474/db/data/relationship/36/properties",
"type" : "know",
"extensions" : { },
"end" : "http://localhost:7474/db/data/node/80",
"metadata" : {
"id" : 36,
"type" : "know"
}
} ]

Return paths from a traversal
Figure 21.70. Final Graph
[Graph diagram: 'I' own 'car' and 'I' know 'you'.]

Example request
• POST http://localhost:7474/db/data/node/71/traverse/path
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"order" : "breadth_first",
"uniqueness" : "none",
"return_filter" : {
"language" : "builtin",
"name" : "all"
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"directions" : [ ],
"start" : "http://localhost:7474/db/data/node/71",
"nodes" : [ "http://localhost:7474/db/data/node/71" ],
"length" : 0,
"relationships" : [ ],
"end" : "http://localhost:7474/db/data/node/71"
}, {
"directions" : [ "->" ],
"start" : "http://localhost:7474/db/data/node/71",
"nodes" : [ "http://localhost:7474/db/data/node/71", "http://localhost:7474/db/data/node/70" ],
"length" : 1,
"relationships" : [ "http://localhost:7474/db/data/relationship/28" ],
"end" : "http://localhost:7474/db/data/node/70"
}, {
"directions" : [ "->" ],
"start" : "http://localhost:7474/db/data/node/71",
"nodes" : [ "http://localhost:7474/db/data/node/71", "http://localhost:7474/db/data/node/69" ],
"length" : 1,
"relationships" : [ "http://localhost:7474/db/data/relationship/29" ],
"end" : "http://localhost:7474/db/data/node/69"
} ]

Traversal returning nodes below a certain depth
Here, all nodes at a traversal depth below 3 are returned.

Figure 21.71. Final Graph
[Graph diagram: 'Root' knows 'Johan' and 'Mattias'; 'Johan' knows 'Emil'; 'Emil' knows 'Peter' and 'Tobias'; a loves relationship leads on to 'Sara'.]

Example request
• POST http://localhost:7474/db/data/node/75/traverse/node
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"return_filter" : {
"body" : "position.length()<3;",
"language" : "javascript"
},
"prune_evaluator" : {
"name" : "none",
"language" : "builtin"
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/75/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/75/relationships/out",
"data" : {
"name" : "Root"
},

"all_typed_relationships" : "http://localhost:7474/db/data/node/75/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/75/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/75",
"property" : "http://localhost:7474/db/data/node/75/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/75/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/75/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/75/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/75/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/75/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/75/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/75/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 75,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/78/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/78/relationships/out",
"data" : {
"name" : "Mattias"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/78/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/78",
"property" : "http://localhost:7474/db/data/node/78/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/78/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/78/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/78/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/78/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/78/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/78/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 78,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/73/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/73/relationships/out",
"data" : {
"name" : "Johan"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/73/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/73/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/73",
"property" : "http://localhost:7474/db/data/node/73/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/73/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/73/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/73/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/73/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/73/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/73/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/73/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 73,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/74/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/74/relationships/out",
"data" : {
"name" : "Emil"

},
"all_typed_relationships" : "http://localhost:7474/db/data/node/74/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/74/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/74",
"property" : "http://localhost:7474/db/data/node/74/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/74/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/74/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/74/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/74/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/74/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/74/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/74/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 74,
"labels" : [ ]
}
} ]

Creating a paged traverser
Paged traversers are created by POST-ing a traversal description to the link identified by the
paged_traverser key in a node representation. When creating a paged traverser, the same options apply
as for a regular traverser, meaning that node, path, or fullpath can be targeted.
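Note that the paged_traverse entry is a URI template: for the node in the example below it reads http://localhost:7474/db/data/node/297/paged/traverse/{returnType}{?pageSize,leaseTime}, so creating a paged traverser over nodes with the default page size and lease time means POST-ing to http://localhost:7474/db/data/node/297/paged/traverse/node, as shown next.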
Example request
• POST http://localhost:7474/db/data/node/297/paged/traverse/node
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"prune_evaluator" : {
"language" : "builtin",
"name" : "none"
},
"return_filter" : {
"language" : "javascript",
"body" : "position.endNode().getProperty('name').contains('1');"
},
"order" : "depth_first",
"relationships" : {
"type" : "NEXT",
"direction" : "out"
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/node/297/paged/traverse/node/ab9eaa01c9874006851712ae68e07c76
[ {
"labels" : "http://localhost:7474/db/data/node/298/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/298/relationships/out",
"data" : {
"name" : "1"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/298/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/298/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/298",
"property" : "http://localhost:7474/db/data/node/298/properties/{key}",

"properties" : "http://localhost:7474/db/data/node/298/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/298/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/298/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/298/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/298/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/298/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/298/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 298,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/307/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/307/relationships/out",
"data" : {
"name" : "10"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/307/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/307/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/307",
"property" : "http://localhost:7474/db/data/node/307/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/307/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/307/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/307/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/307/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/307/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/307/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/307/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 307,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/308/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/308/relationships/out",
"data" : {
"name" : "11"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/308/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/308/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/308",
"property" : "http://localhost:7474/db/data/node/308/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/308/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/308/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/308/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/308/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/308/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/308/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/308/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 308,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/309/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/309/relationships/out",
"data" : {
"name" : "12"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/309/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/309/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/309",

"property" : "http://localhost:7474/db/data/node/309/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/309/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/309/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/309/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/309/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/309/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/309/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/309/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 309,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/310/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/310/relationships/out",
"data" : {
"name" : "13"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/310/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/310/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/310",
"property" : "http://localhost:7474/db/data/node/310/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/310/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/310/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/310/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/310/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/310/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/310/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/310/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 310,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/311/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/311/relationships/out",
"data" : {
"name" : "14"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/311/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/311/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/311",
"property" : "http://localhost:7474/db/data/node/311/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/311/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/311/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/311/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/311/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/311/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/311/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/311/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 311,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/312/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/312/relationships/out",
"data" : {
"name" : "15"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/312/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/312/traverse/{returnType}",

"self" : "http://localhost:7474/db/data/node/312",
"property" : "http://localhost:7474/db/data/node/312/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/312/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/312/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/312/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/312/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/312/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/312/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/312/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 312,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/313/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/313/relationships/out",
"data" : {
"name" : "16"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/313/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/313/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/313",
"property" : "http://localhost:7474/db/data/node/313/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/313/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/313/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/313/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/313/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/313/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/313/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/313/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 313,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/314/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/314/relationships/out",
"data" : {
"name" : "17"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/314/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/314/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/314",
"property" : "http://localhost:7474/db/data/node/314/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/314/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/314/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/314/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/314/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/314/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/314/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/314/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 314,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/315/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/315/relationships/out",
"data" : {
"name" : "18"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/315/relationships/all/{-list|&|types}",

"traverse" : "http://localhost:7474/db/data/node/315/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/315",
"property" : "http://localhost:7474/db/data/node/315/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/315/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/315/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/315/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/315/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/315/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/315/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/315/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 315,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/316/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/316/relationships/out",
"data" : {
"name" : "19"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/316/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/316/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/316",
"property" : "http://localhost:7474/db/data/node/316/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/316/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/316/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/316/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/316/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/316/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/316/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/316/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 316,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/318/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/318/relationships/out",
"data" : {
"name" : "21"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/318/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/318/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/318",
"property" : "http://localhost:7474/db/data/node/318/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/318/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/318/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/318/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/318/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/318/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/318/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/318/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 318,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/328/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/328/relationships/out",
"data" : {
"name" : "31"
},

"all_typed_relationships" : "http://localhost:7474/db/data/node/328/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/328/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/328",
"property" : "http://localhost:7474/db/data/node/328/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/328/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/328/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/328/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/328/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/328/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/328/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/328/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 328,
"labels" : [ ]
}
} ]

Paging through the results of a paged traverser
Paged traversers hold state on the server, and allow clients to page through the results of a traversal.
To progress to the next page of traversal results, the client issues an HTTP GET request on the paged
traversal URI, which causes the traversal to fill the next page (or partially fill it if insufficient results are
available).
Note that if a traverser expires through inactivity it will cause a 404 response on the next GET request.
Traversers' leases are renewed on every successful access, for the same amount of time as originally
specified.
When the paged traverser reaches the end of its results, the client can expect a 404 response as the
traverser is disposed of by the server.
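Putting this together, a client can simply keep issuing GET requests against the paged traversal URI until the server answers 404. The following is a minimal Java sketch of such a loop (the traverser URI is the one from the example below; error handling is omitted):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class PagedTraversalClient
{
    public static void main( String[] args ) throws Exception
    {
        // URI returned in the Location header when the paged traverser was created
        String traverserUri = "http://localhost:7474/db/data/node/330/paged/traverse/node/4e702c468db740f8a4b990d06c359f4d";
        while ( true )
        {
            HttpURLConnection connection = (HttpURLConnection) new URL( traverserUri ).openConnection();
            connection.setRequestProperty( "Accept", "application/json" );
            if ( connection.getResponseCode() == 404 )
            {
                // Traverser exhausted or lease expired: the server has disposed of it
                break;
            }
            try ( BufferedReader reader = new BufferedReader(
                    new InputStreamReader( connection.getInputStream(), "UTF-8" ) ) )
            {
                String line;
                while ( (line = reader.readLine()) != null )
                {
                    System.out.println( line ); // one page of JSON results
                }
            }
        }
    }
}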
Example request
• GET http://localhost:7474/db/data/node/330/paged/traverse/node/4e702c468db740f8a4b990d06c359f4d
• Accept: application/json
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/661/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/661/relationships/out",
"data" : {
"name" : "331"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/661/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/661/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/661",
"property" : "http://localhost:7474/db/data/node/661/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/661/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/661/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/661/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/661/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/661/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/661/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/661/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 661,
"labels" : [ ]
}
}, {

"labels" : "http://localhost:7474/db/data/node/671/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/671/relationships/out",
"data" : {
"name" : "341"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/671/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/671/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/671",
"property" : "http://localhost:7474/db/data/node/671/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/671/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/671/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/671/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/671/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/671/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/671/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/671/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 671,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/681/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/681/relationships/out",
"data" : {
"name" : "351"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/681/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/681/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/681",
"property" : "http://localhost:7474/db/data/node/681/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/681/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/681/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/681/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/681/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/681/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/681/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/681/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 681,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/691/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/691/relationships/out",
"data" : {
"name" : "361"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/691/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/691/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/691",
"property" : "http://localhost:7474/db/data/node/691/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/691/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/691/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/691/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/691/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/691/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/691/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/691/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 691,
"labels" : [ ]
}

}, {
"labels" : "http://localhost:7474/db/data/node/701/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/701/relationships/out",
"data" : {
"name" : "371"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/701/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/701/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/701",
"property" : "http://localhost:7474/db/data/node/701/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/701/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/701/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/701/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/701/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/701/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/701/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/701/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 701,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/711/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/711/relationships/out",
"data" : {
"name" : "381"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/711/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/711/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/711",
"property" : "http://localhost:7474/db/data/node/711/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/711/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/711/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/711/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/711/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/711/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/711/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/711/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 711,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/721/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/721/relationships/out",
"data" : {
"name" : "391"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/721/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/721/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/721",
"property" : "http://localhost:7474/db/data/node/721/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/721/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/721/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/721/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/721/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/721/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/721/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/721/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 721,
"labels" : [ ]

}
}, {
"labels" : "http://localhost:7474/db/data/node/731/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/731/relationships/out",
"data" : {
"name" : "401"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/731/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/731/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/731",
"property" : "http://localhost:7474/db/data/node/731/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/731/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/731/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/731/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/731/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/731/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/731/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/731/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 731,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/740/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/740/relationships/out",
"data" : {
"name" : "410"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/740/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/740/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/740",
"property" : "http://localhost:7474/db/data/node/740/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/740/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/740/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/740/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/740/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/740/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/740/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/740/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 740,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/741/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/741/relationships/out",
"data" : {
"name" : "411"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/741/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/741/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/741",
"property" : "http://localhost:7474/db/data/node/741/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/741/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/741/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/741/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/741/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/741/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/741/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/741/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 741,

"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/742/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/742/relationships/out",
"data" : {
"name" : "412"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/742/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/742/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/742",
"property" : "http://localhost:7474/db/data/node/742/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/742/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/742/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/742/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/742/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/742/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/742/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/742/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 742,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/743/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/743/relationships/out",
"data" : {
"name" : "413"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/743/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/743/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/743",
"property" : "http://localhost:7474/db/data/node/743/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/743/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/743/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/743/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/743/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/743/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/743/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/743/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 743,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/744/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/744/relationships/out",
"data" : {
"name" : "414"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/744/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/744/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/744",
"property" : "http://localhost:7474/db/data/node/744/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/744/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/744/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/744/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/744/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/744/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/744/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/744/relationships/in/{-list|&|types}",
"metadata" : {

"id" : 744,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/745/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/745/relationships/out",
"data" : {
"name" : "415"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/745/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/745/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/745",
"property" : "http://localhost:7474/db/data/node/745/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/745/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/745/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/745/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/745/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/745/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/745/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/745/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 745,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/746/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/746/relationships/out",
"data" : {
"name" : "416"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/746/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/746/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/746",
"property" : "http://localhost:7474/db/data/node/746/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/746/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/746/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/746/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/746/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/746/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/746/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/746/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 746,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/747/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/747/relationships/out",
"data" : {
"name" : "417"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/747/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/747/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/747",
"property" : "http://localhost:7474/db/data/node/747/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/747/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/747/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/747/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/747/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/747/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/747/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/747/relationships/in/{-list|&|types}",

"metadata" : {
"id" : 747,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/748/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/748/relationships/out",
"data" : {
"name" : "418"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/748/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/748/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/748",
"property" : "http://localhost:7474/db/data/node/748/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/748/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/748/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/748/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/748/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/748/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/748/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/748/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 748,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/749/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/749/relationships/out",
"data" : {
"name" : "419"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/749/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/749/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/749",
"property" : "http://localhost:7474/db/data/node/749/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/749/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/749/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/749/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/749/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/749/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/749/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/749/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 749,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/751/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/751/relationships/out",
"data" : {
"name" : "421"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/751/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/751/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/751",
"property" : "http://localhost:7474/db/data/node/751/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/751/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/751/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/751/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/751/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/751/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/751/relationships/all",

"incoming_typed_relationships" : "http://localhost:7474/db/data/node/751/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 751,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/761/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/761/relationships/out",
"data" : {
"name" : "431"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/761/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/761/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/761",
"property" : "http://localhost:7474/db/data/node/761/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/761/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/761/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/761/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/761/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/761/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/761/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/761/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 761,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/771/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/771/relationships/out",
"data" : {
"name" : "441"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/771/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/771/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/771",
"property" : "http://localhost:7474/db/data/node/771/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/771/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/771/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/771/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/771/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/771/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/771/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/771/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 771,
"labels" : [ ]
}
} ]

Paged traverser page size
The default page size is 50 items, but depending on the application larger or smaller page sizes might
be appropriate. This can be set by adding a pageSize query parameter.
Example request
• POST http://localhost:7474/db/data/node/33/paged/traverse/node?pageSize=1
• Accept: application/json
• Content-Type: application/json
{
"prune_evaluator" : {

"language" : "builtin",
"name" : "none"
},
"return_filter" : {
"language" : "javascript",
"body" : "position.endNode().getProperty('name').contains('1');"
},
"order" : "depth_first",
"relationships" : {
"type" : "NEXT",
"direction" : "out"
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/node/33/paged/traverse/
node/2c4a4adef0274953a358c675c884584d
[ {
"labels" : "http://localhost:7474/db/data/node/34/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/34/relationships/out",
"data" : {
"name" : "1"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/34/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/34",
"property" : "http://localhost:7474/db/data/node/34/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/34/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/34/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/34/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/34/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/34/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 34,
"labels" : [ ]
}
} ]
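
Since smaller pages mean more round trips while larger pages hold more results in server memory, a
client usually creates the traverser once and then keeps requesting pages until none remain. The
following Python sketch is illustrative only (it uses the third-party requests library; the base URL and
node id are taken from the example above): subsequent pages are fetched with GET against the
Location URI, and a 404 signals that the traverser is exhausted or its lease has expired.

import requests

# Illustrative client sketch, not part of Neo4j: create a paged traverser
# with pageSize=1, then page through the results.
BASE = "http://localhost:7474/db/data"

traversal = {
    "prune_evaluator": {"language": "builtin", "name": "none"},
    "return_filter": {
        "language": "javascript",
        "body": "position.endNode().getProperty('name').contains('1');"
    },
    "order": "depth_first",
    "relationships": {"type": "NEXT", "direction": "out"}
}

# The 201 response carries the first page in its body and the traverser's
# URI in the Location header.
resp = requests.post(BASE + "/node/33/paged/traverse/node",
                     params={"pageSize": 1}, json=traversal,
                     headers={"Accept": "application/json"})
resp.raise_for_status()
traverser_uri = resp.headers["Location"]
pages = [resp.json()]

# Each further GET returns the next page; 404 means no pages are left
# (or the lease timed out).
while True:
    page = requests.get(traverser_uri, headers={"Accept": "application/json"})
    if page.status_code == 404:
        break
    page.raise_for_status()
    pages.append(page.json())

print(sum(len(p) for p in pages), "nodes in total")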

Paged traverser timeout
The default timeout for a paged traverser is 60 seconds, but depending on the application, longer or
shorter timeouts might be appropriate. The timeout can be set by adding a leaseTime query parameter
giving the number of seconds the paged traverser should last.
Example request
• POST http://localhost:7474/db/data/node/807/paged/traverse/node?leaseTime=10
• Accept: application/json
• Content-Type: application/json
{
"prune_evaluator" : {
"language" : "builtin",
"name" : "none"
},
"return_filter" : {
"language" : "javascript",

"body" : "position.endNode().getProperty('name').contains('1');"
},
"order" : "depth_first",
"relationships" : {
"type" : "NEXT",
"direction" : "out"
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/node/807/paged/traverse/node/
e25e0cff49b2449598c322fe313e86b0
[ {
"labels" : "http://localhost:7474/db/data/node/808/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/808/relationships/out",
"data" : {
"name" : "1"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/808/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/808/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/808",
"property" : "http://localhost:7474/db/data/node/808/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/808/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/808/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/808/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/808/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/808/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/808/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/808/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 808,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/817/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/817/relationships/out",
"data" : {
"name" : "10"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/817/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/817/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/817",
"property" : "http://localhost:7474/db/data/node/817/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/817/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/817/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/817/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/817/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/817/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/817/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/817/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 817,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/818/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/818/relationships/out",
"data" : {

"name" : "11"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/818/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/818/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/818",
"property" : "http://localhost:7474/db/data/node/818/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/818/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/818/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/818/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/818/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/818/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/818/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/818/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 818,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/819/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/819/relationships/out",
"data" : {
"name" : "12"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/819/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/819/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/819",
"property" : "http://localhost:7474/db/data/node/819/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/819/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/819/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/819/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/819/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/819/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/819/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/819/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 819,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/820/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/820/relationships/out",
"data" : {
"name" : "13"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/820/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/820/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/820",
"property" : "http://localhost:7474/db/data/node/820/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/820/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/820/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/820/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/820/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/820/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/820/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/820/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 820,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/821/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/821/relationships/out",

"data" : {
"name" : "14"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/821/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/821/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/821",
"property" : "http://localhost:7474/db/data/node/821/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/821/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/821/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/821/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/821/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/821/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/821/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/821/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 821,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/822/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/822/relationships/out",
"data" : {
"name" : "15"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/822/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/822/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/822",
"property" : "http://localhost:7474/db/data/node/822/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/822/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/822/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/822/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/822/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/822/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/822/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/822/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 822,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/823/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/823/relationships/out",
"data" : {
"name" : "16"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/823/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/823/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/823",
"property" : "http://localhost:7474/db/data/node/823/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/823/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/823/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/823/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/823/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/823/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/823/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/823/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 823,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/824/labels",

"outgoing_relationships" : "http://localhost:7474/db/data/node/824/relationships/out",
"data" : {
"name" : "17"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/824/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/824/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/824",
"property" : "http://localhost:7474/db/data/node/824/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/824/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/824/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/824/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/824/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/824/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/824/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/824/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 824,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/825/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/825/relationships/out",
"data" : {
"name" : "18"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/825/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/825/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/825",
"property" : "http://localhost:7474/db/data/node/825/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/825/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/825/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/825/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/825/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/825/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/825/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/825/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 825,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/826/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/826/relationships/out",
"data" : {
"name" : "19"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/826/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/826/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/826",
"property" : "http://localhost:7474/db/data/node/826/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/826/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/826/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/826/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/826/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/826/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/826/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/826/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 826,
"labels" : [ ]
}
}, {

"labels" : "http://localhost:7474/db/data/node/828/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/828/relationships/out",
"data" : {
"name" : "21"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/828/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/828/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/828",
"property" : "http://localhost:7474/db/data/node/828/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/828/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/828/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/828/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/828/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/828/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/828/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/828/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 828,
"labels" : [ ]
}
}, {
"labels" : "http://localhost:7474/db/data/node/838/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/838/relationships/out",
"data" : {
"name" : "31"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/838/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/838/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/838",
"property" : "http://localhost:7474/db/data/node/838/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/838/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/838/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/838/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/838/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/838/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/838/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/838/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 838,
"labels" : [ ]
}
} ]
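
Because an expired lease and an exhausted traverser both answer 404, a long-running client that asks
for a short leaseTime should be prepared to recreate the traverser. A hedged sketch of that pattern
follows (the helper is illustrative; the node id and traversal body come from the example above):

import requests

BASE = "http://localhost:7474/db/data"

def create_traverser(node_id, body, lease_seconds):
    # Ask for a custom lease; the Location header identifies the traverser.
    resp = requests.post(f"{BASE}/node/{node_id}/paged/traverse/node",
                         params={"leaseTime": lease_seconds}, json=body,
                         headers={"Accept": "application/json"})
    resp.raise_for_status()
    return resp.headers["Location"], resp.json()

# With a 10 second lease, a client that pauses between pages may find the
# traverser gone (404) and must create a new one, starting over from the top.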


21.18. Graph Algorithms
Neo4j comes with a number of built-in graph algorithms. They are performed from a start node. The
traversal is controlled by the URI and the body sent with the request. These are the parameters that can
be used:
algorithm
    The algorithm to choose. If not set, default is shortestPath. algorithm can have one of
    these values:
    • shortestPath
    • allSimplePaths
    • allPaths
    • dijkstra (optionally with cost_property and default_cost parameters)

max_depth
    The maximum depth as an integer for algorithms like shortestPath, where applicable.
    Default is 1.

Find all shortest paths
The shortestPath algorithm can find multiple paths between the same nodes, as in this example.
Figure 21.72. Final Graph
[Graph diagram omitted: seven nodes named 'a' through 'g' (Node[20]–Node[26]) connected by outgoing 'to' relationships.]

Example request
• POST http://localhost:7474/db/data/node/26/paths
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/21",
"max_depth" : 3,
"relationships" : {
"type" : "to",
"direction" : "out"
},
"algorithm" : "shortestPath"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"directions" : [ "->", "->" ],
"start" : "http://localhost:7474/db/data/node/26",
"nodes" : [ "http://localhost:7474/db/data/node/26", "http://localhost:7474/db/data/node/22", "http://localhost:7474/db/
data/node/21" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/28", "http://localhost:7474/db/data/relationship/34" ],
"end" : "http://localhost:7474/db/data/node/21"
}, {
"directions" : [ "->", "->" ],
"start" : "http://localhost:7474/db/data/node/26",
"nodes" : [ "http://localhost:7474/db/data/node/26", "http://localhost:7474/db/data/node/25", "http://localhost:7474/db/
data/node/21" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/27", "http://localhost:7474/db/data/relationship/36" ],
"end" : "http://localhost:7474/db/data/node/21"
} ]
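
Programmatically, the whole request reduces to a single POST. A minimal Python wrapper might look
like the sketch below; it is illustrative client code, not part of the Neo4j API, and the node ids match
the example above.

import requests

BASE = "http://localhost:7474/db/data"

def find_paths(start_id, to_id, algorithm="shortestPath", max_depth=3,
               rel_type="to", direction="out"):
    body = {
        "to": f"{BASE}/node/{to_id}",
        "max_depth": max_depth,
        "relationships": {"type": rel_type, "direction": direction},
        "algorithm": algorithm
    }
    resp = requests.post(f"{BASE}/node/{start_id}/paths", json=body,
                         headers={"Accept": "application/json; charset=UTF-8"})
    resp.raise_for_status()
    return resp.json()

# Both shortest 'a' -> 'g' paths from the example above:
for path in find_paths(26, 21):
    print(path["length"], path["nodes"])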

Find one of the shortest paths
If no path algorithm is specified, a shortestPath algorithm with a max depth of 1 will be chosen. In this
example, max_depth is set to 3 in order to find shortest paths that traverse at most 3 relationships.

Figure 21.73. Final Graph
[Graph diagram omitted: seven nodes named 'a' through 'g' (Node[13]–Node[19]) connected by outgoing 'to' relationships.]

Example request
• POST http://localhost:7474/db/data/node/19/path
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/14",
"max_depth" : 3,
"relationships" : {
"type" : "to",
"direction" : "out"
},
"algorithm" : "shortestPath"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"directions" : [ "->", "->" ],
"start" : "http://localhost:7474/db/data/node/19",
"nodes" : [ "http://localhost:7474/db/data/node/19", "http://localhost:7474/db/data/node/18", "http://localhost:7474/db/
data/node/14" ],
"length" : 2,

"relationships" : [ "http://localhost:7474/db/data/relationship/17", "http://localhost:7474/db/data/relationship/26" ],
"end" : "http://localhost:7474/db/data/node/14"
}

Execute a Dijkstra algorithm and get a single path
This example runs a Dijkstra algorithm over a graph whose relationships carry different cost
properties. Note that the request URI ends with /path, which means a single path is what we want
here.
Figure 21.74. Final Graph
[Graph diagram omitted: six nodes named 'a' through 'f' (Node[27]–Node[32]) connected by 'to' relationships whose cost properties range from 0.5 to 1.5.]

Example request
• POST http://localhost:7474/db/data/node/32/path
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/29",
"cost_property" : "cost",
"relationships" : {
"type" : "to",
"direction" : "out"
},
"algorithm" : "dijkstra"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"directions" : [ "->", "->", "->" ],
"weight" : 1.5,
"start" : "http://localhost:7474/db/data/node/32",
"nodes" : [ "http://localhost:7474/db/data/node/32", "http://localhost:7474/db/data/node/31", "http://localhost:7474/db/
data/node/28", "http://localhost:7474/db/data/node/29" ],
"length" : 3,
"relationships" : [ "http://localhost:7474/db/data/relationship/38", "http://localhost:7474/db/data/relationship/40",
"http://localhost:7474/db/data/relationship/41" ],
"end" : "http://localhost:7474/db/data/node/29"
}
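
Note that the reported weight (1.5) is the sum of the cost properties along the path, while length
counts relationships. The algorithm list above also names a default_cost parameter; the hedged sketch
below supplies it so that relationships lacking the cost property still get a weight (the fallback value
1.0 is an illustrative assumption).

import requests

BASE = "http://localhost:7474/db/data"

body = {
    "to": BASE + "/node/29",
    "cost_property": "cost",
    "default_cost": 1.0,  # assumed fallback for relationships lacking 'cost'
    "relationships": {"type": "to", "direction": "out"},
    "algorithm": "dijkstra"
}
resp = requests.post(BASE + "/node/32/path", json=body,
                     headers={"Accept": "application/json; charset=UTF-8"})
resp.raise_for_status()
path = resp.json()
print(path["weight"], path["length"])  # total cost vs. number of hops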

Execute a Dijkstra algorithm with equal weights on relationships
The following executes a Dijkstra search on a graph with equal weights on all relationships. This
example is included to show the difference when the same graph structure is used but the path weight
equals the number of hops.
Figure 21.75. Final Graph
[Graph diagram omitted: six nodes named 'a' through 'f' (Node[33]–Node[38]) connected by 'to' relationships, each with cost = 1.]

Example request
• POST http://localhost:7474/db/data/node/38/path
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/35",
"cost_property" : "cost",
"relationships" : {
"type" : "to",
"direction" : "out"
},

"algorithm" : "dijkstra"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"directions" : [ "->", "->" ],
"weight" : 2.0,
"start" : "http://localhost:7474/db/data/node/38",
"nodes" : [ "http://localhost:7474/db/data/node/38", "http://localhost:7474/db/data/node/33", "http://localhost:7474/db/
data/node/35" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/46", "http://localhost:7474/db/data/relationship/50" ],
"end" : "http://localhost:7474/db/data/node/35"
}

Execute a Dijkstra algorithm and get multiple paths
This example runs a Dijkstra algorithm over a graph whose relationships carry different cost
properties. Note that the request URI ends with /paths, which means multiple paths are returned,
in case they exist.
Figure 21.76. Final Graph
[Graph diagram omitted: six nodes named 'a' through 'f' (Node[7]–Node[12]) connected by 'to' relationships whose cost properties range from 0.5 to 1.5.]

Example request
• POST http://localhost:7474/db/data/node/12/paths
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"to" : "http://localhost:7474/db/data/node/9",
"cost_property" : "cost",
"relationships" : {
"type" : "to",
"direction" : "out"
},
"algorithm" : "dijkstra"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"directions" : [ "->", "->" ],
"weight" : 1.5,
"start" : "http://localhost:7474/db/data/node/12",
"nodes" : [ "http://localhost:7474/db/data/node/12", "http://localhost:7474/db/data/node/7", "http://localhost:7474/db/data/
node/9" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/12", "http://localhost:7474/db/data/relationship/16" ],
"end" : "http://localhost:7474/db/data/node/9"
}, {
"directions" : [ "->", "->", "->" ],
"weight" : 1.5,
"start" : "http://localhost:7474/db/data/node/12",
"nodes" : [ "http://localhost:7474/db/data/node/12", "http://localhost:7474/db/data/node/11", "http://localhost:7474/db/
data/node/8", "http://localhost:7474/db/data/node/9" ],
"length" : 3,
"relationships" : [ "http://localhost:7474/db/data/relationship/11", "http://localhost:7474/db/data/relationship/13",
"http://localhost:7474/db/data/relationship/14" ],
"end" : "http://localhost:7474/db/data/node/9"
} ]


21.19. Batch operations
Batch operations let you execute multiple API calls through a single HTTP request, which
significantly improves performance for large insert and update operations.
This service is transactional. If any of the operations fails (returns a non-2xx HTTP status
code), the transaction is rolled back and no changes are applied.

Important

You cannot use this resource to execute Cypher queries with USING PERIODIC COMMIT.

Execute multiple operations in batch
The batch service expects an array of job descriptions as input, each job description describing an
action to be performed via the normal server API.
Each job description should contain a to attribute, with a value relative to the data API root (so
http://localhost:7474/db/data/node becomes just /node), and a method attribute containing the HTTP
verb to use. Optionally you may provide a body attribute, and an id attribute to help you keep track of
responses, although responses are guaranteed to be returned in the same order the job descriptions are
received.
The following figure outlines the different parts of the job descriptions:

Figure 21.77. Starting Graph
[Graph diagram omitted: Node[41] (name = 'Joe') has a 'knows' relationship to Node[40] (name = 'John').]

Figure 21.78. Final Graph
[Graph diagram omitted: Node[41] (now age = 1) still 'knows' Node[40] (name = 'John'); Node[42] and Node[43] (age = 1 each) have been created.]

Example request
• POST http://localhost:7474/db/data/batch
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
[ {
"method" : "PUT",
"to" : "/node/41/properties",
"body" : {
"age" : 1
},
"id" : 0
}, {
"method" : "GET",
"to" : "/node/41",
"id" : 1
}, {
"method" : "POST",
"to" : "/node",
"body" : {
"age" : 1
},
"id" : 2
}, {
"method" : "POST",
"to" : "/node",
"body" : {
"age" : 1
},
"id" : 3
} ]

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"id" : 0,
"from" : "/node/41/properties"
}, {
"id" : 1,
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/41/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/41/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/41/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/41/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/41",
"property" : "http://localhost:7474/db/data/node/41/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/41/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/41/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/41/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/41/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/41/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/41/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/41/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 41,
"labels" : [ ]
},
"data" : {
"age" : 1
}

},
"from" : "/node/41"
}, {
"id" : 2,
"location" : "http://localhost:7474/db/data/node/42",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/42/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/42/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/42/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/42",
"property" : "http://localhost:7474/db/data/node/42/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/42/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/42/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/42/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/42/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/42/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 42,
"labels" : [ ]
},
"data" : {
"age" : 1
}
},
"from" : "/node"
}, {
"id" : 3,
"location" : "http://localhost:7474/db/data/node/43",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/43/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/43/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/43/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/43/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/43",
"property" : "http://localhost:7474/db/data/node/43/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/43/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/43/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/43/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/43/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/43/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/43/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/43/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 43,
"labels" : [ ]
},
"data" : {
"age" : 1
}
},
"from" : "/node"
} ]
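
Driving the batch endpoint from code is a single POST of the job array. The sketch below is
illustrative client code (the jobs mirror the example above); keep in mind the transactional contract
stated earlier, that one failing job rolls back the entire batch.

import requests

BASE = "http://localhost:7474/db/data"

jobs = [
    {"method": "PUT",  "to": "/node/41/properties", "body": {"age": 1}, "id": 0},
    {"method": "GET",  "to": "/node/41", "id": 1},
    {"method": "POST", "to": "/node", "body": {"age": 1}, "id": 2}
]

resp = requests.post(BASE + "/batch", json=jobs,
                     headers={"Accept": "application/json; charset=UTF-8"})
resp.raise_for_status()

# Results arrive in job order; the optional 'id' lets you correlate anyway.
for result in resp.json():
    print(result["id"], result.get("location", result["from"]))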

Refer to items created earlier in the same batch job
The batch operation API allows you to refer to the URI returned from a created resource in subsequent
job descriptions, within the same batch call.
Use the {[JOB ID]} special syntax to inject URIs from created resources into JSON strings in subsequent
job descriptions.
Figure 21.79. Final Graph
[Graph diagram omitted: Node[33] (name = 'bob') has a KNOWS relationship (since = '2010') to Node[34] (age = 12).]

Example request
• POST http://localhost:7474/db/data/batch
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
[ {
"method" : "POST",
"to" : "/node",
"id" : 0,
"body" : {
"name" : "bob"
}
}, {
"method" : "POST",
"to" : "/node",
"id" : 1,
"body" : {
"age" : 12
}
}, {
"method" : "POST",
"to" : "{0}/relationships",
"id" : 3,
"body" : {
"to" : "{1}",
"data" : {
"since" : "2010"
},
"type" : "KNOWS"
}
}, {
"method" : "POST",
"to" : "/index/relationship/my_rels",
"id" : 4,
"body" : {
"key" : "since",
"value" : "2010",
"uri" : "{3}"
}
} ]

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"id" : 0,

"location" : "http://localhost:7474/db/data/node/33",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/33/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/33/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/33/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/33",
"property" : "http://localhost:7474/db/data/node/33/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/33/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/33/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/33/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/33/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/33/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 33,
"labels" : [ ]
},
"data" : {
"name" : "bob"
}
},
"from" : "/node"
}, {
"id" : 1,
"location" : "http://localhost:7474/db/data/node/34",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/34/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/34/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/34/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/34",
"property" : "http://localhost:7474/db/data/node/34/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/34/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/34/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/34/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/34/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/34/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/34/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 34,
"labels" : [ ]
},
"data" : {
"age" : 12
}
},
"from" : "/node"
}, {
"id" : 3,
"location" : "http://localhost:7474/db/data/relationship/14",
"body" : {
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/33",
"property" : "http://localhost:7474/db/data/relationship/14/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/14",
"properties" : "http://localhost:7474/db/data/relationship/14/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/34",
"metadata" : {
"id" : 14,

"type" : "KNOWS"
},
"data" : {
"since" : "2010"
}
},
"from" : "http://localhost:7474/db/data/node/33/relationships"
}, {
"id" : 4,
"location" : "http://localhost:7474/db/data/index/relationship/my_rels/since/2010/14",
"body" : {
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/33",
"property" : "http://localhost:7474/db/data/relationship/14/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/14",
"properties" : "http://localhost:7474/db/data/relationship/14/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/34",
"metadata" : {
"id" : 14,
"type" : "KNOWS"
},
"data" : {
"since" : "2010"
},
"indexed" : "http://localhost:7474/db/data/index/relationship/my_rels/since/2010/14"
},
"from" : "/index/relationship/my_rels"
} ]
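
Built programmatically, the placeholder strings are just literal values in the job array; the server
substitutes them before executing each job. An illustrative sketch follows (node properties are taken
from the example above):

import requests

BASE = "http://localhost:7474/db/data"

jobs = [
    {"method": "POST", "to": "/node", "id": 0, "body": {"name": "bob"}},
    {"method": "POST", "to": "/node", "id": 1, "body": {"age": 12}},
    # "{0}" and "{1}" are replaced by the URIs created by jobs 0 and 1.
    {"method": "POST", "to": "{0}/relationships", "id": 2,
     "body": {"to": "{1}", "data": {"since": "2010"}, "type": "KNOWS"}}
]

resp = requests.post(BASE + "/batch", json=jobs,
                     headers={"Accept": "application/json; charset=UTF-8"})
resp.raise_for_status()
rel_uri = resp.json()[2]["location"]  # URI of the new KNOWS relationship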

Execute multiple operations in batch streaming
Figure 21.80. Final Graph
[Graph diagram omitted: nodes Node[89]–Node[100] with KNOWS and FRIENDS relationships left over from other examples; this batch sets age = 1 on Node[98] and creates Node[99] and Node[100] (age = 1 each).]

Example request
• POST http://localhost:7474/db/data/batch
• Accept: application/json
• Content-Type: application/json
• X-Stream: true

[ {
"method" : "PUT",
"to" : "/node/98/properties",
"body" : {
"age" : 1
},
"id" : 0
}, {
"method" : "GET",
"to" : "/node/98",
"id" : 1
}, {

"method" : "POST",
"to" : "/node",
"body" : {
"age" : 1
},
"id" : 2
}, {
"method" : "POST",
"to" : "/node",
"body" : {
"age" : 1
},
"id" : 3
} ]

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"id" : 0,
"from" : "/node/98/properties",
"body" : null,
"status" : 204
}, {
"id" : 1,
"from" : "/node/98",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/98/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/98/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/98/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/98",
"property" : "http://localhost:7474/db/data/node/98/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/98/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/98/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/98/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/98/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/98/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/98/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 98,
"labels" : [ ]
},
"data" : {
"age" : 1
}
},
"status" : 200
}, {
"id" : 2,
"from" : "/node",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/99/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/99/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/99/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/99",
"property" : "http://localhost:7474/db/data/node/99/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/99/properties",

"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/99/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/99/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/99/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/99/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/99/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 99,
"labels" : [ ]
},
"data" : {
"age" : 1
}
},
"location" : "http://localhost:7474/db/data/node/99",
"status" : 201
}, {
"id" : 3,
"from" : "/node",
"body" : {
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/100/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/100/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/100/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/100",
"property" : "http://localhost:7474/db/data/node/100/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/100/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/100/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/100/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/100/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/100/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/100/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 100,
"labels" : [ ]
},
"data" : {
"age" : 1
}
},
"location" : "http://localhost:7474/db/data/node/100",
"status" : 201
} ]
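
With X-Stream: true the server writes results as they are produced instead of buffering the whole
batch, and each job result carries its own status field (compare the non-streaming responses above,
which omit it). A hedged sketch of a streaming client follows; whether the body is consumed
incrementally or parsed whole is up to the client.

import requests

BASE = "http://localhost:7474/db/data"

jobs = [
    {"method": "POST", "to": "/node", "body": {"age": 1}, "id": 0},
    {"method": "POST", "to": "/node", "body": {"age": 1}, "id": 1}
]

resp = requests.post(BASE + "/batch", json=jobs,
                     headers={"Accept": "application/json", "X-Stream": "true"},
                     stream=True)  # hand the body over as it arrives
resp.raise_for_status()

# Each streamed job result includes its own HTTP status.
for result in resp.json():
    print(result["id"], result["status"], result.get("location"))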


21.20. Legacy indexing
Note

This documents the legacy indexing in Neo4j, which is no longer the preferred way to handle
indexes. Consider looking at Section 21.15, “Indexing” [368].
An index can contain either nodes or relationships.

Note

To create an index with default configuration, simply start using it by adding nodes/
relationships to it. It will then be automatically created for you.
What default configuration means depends on how you have configured your database. If you haven’t
changed any indexing configuration, it means the indexes will be using a Lucene-based backend.
All the examples below show how to perform operations on node indexes, but they are just as
applicable to relationship indexes. Simply change the "node" part of the URL to "relationship".
If you want to customize the index settings, see the section called “Create node index with
configuration” [416].

Create node index
Note

Instead of creating the index this way, you can simply start to use it, and it will be created
automatically with default configuration.
Example request
• POST http://localhost:7474/db/data/index/node/
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"name" : "index_1512731131617_1"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731131617_1/
{
"template" : "http://localhost:7474/db/data/index/node/index_1512731131617_1/{key}/{value}"
}

Create node index with configuration
This request is only necessary if you want to customize the index settings. If you are happy with the
defaults, you can just start indexing nodes/relationships, as non-existent indexes will automatically be
created as you do. See Section 35.10, “Configuration and fulltext indexes” [633] for more information
on index configuration.
Example request
• POST http://localhost:7474/db/data/index/node/
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"name" : "fulltext",
"config" : {
"type" : "fulltext",
"provider" : "lucene"
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/fulltext/
{
"template" : "http://localhost:7474/db/data/index/node/fulltext/{key}/{value}",
"type" : "fulltext",
"provider" : "lucene"
}

Delete node index
Example request
• DELETE http://localhost:7474/db/data/index/node/index_1512731130679_1
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

List node indexes
Example request
• GET http://localhost:7474/db/data/index/node/
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"index_1512731130380_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130380_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130775_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130775_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130053_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130053_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},

"index_1512731130554_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130554_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130024_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130024_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130371_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130371_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130827_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130827_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130887_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130887_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130007_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130007_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130500_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130500_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130883_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130883_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130034_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130034_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130439_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130439_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
},
"index_1512731130318_1" : {
"template" : "http://localhost:7474/db/data/index/node/index_1512731130318_1/{key}/{value}",
"provider" : "lucene",
"type" : "exact"
}
}

Add node to index
Associates a node with the given key/value pair in the given index.

Note

Spaces in the URI have to be encoded as %20.


Caution

This does not overwrite previous entries. If you index the same key/value/item combination
twice, two index entries are created. To do update-type operations, you need to delete the
old entry before adding a new one.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731130380_1
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"value" : "some value",
"uri" : "http://localhost:7474/db/data/node/33",
"key" : "some-key"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731130380_1/some-key/some%20value/33
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/33/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/33/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/33/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/33",
"property" : "http://localhost:7474/db/data/node/33/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/33/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/33/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/33/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/33/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/33/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/33/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 33,
"labels" : [ ]
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731130380_1/some-key/some%20value/33"
}
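
Because re-indexing the same key/value/node combination creates a second entry (see the caution
above), an "update" is a delete followed by an add. The sketch below shows that pattern; the helper is
illustrative, the index name, key and node id are taken from the example, and the value is
percent-encoded as the note about %20 requires.

import requests
from urllib.parse import quote

BASE = "http://localhost:7474/db/data"
index, key, node_id = "index_1512731130380_1", "some-key", 33

def reindex(old_value, new_value):
    node_uri = f"{BASE}/node/{node_id}"
    # 1. Remove the stale entry for this node/key/value triple
    #    (values are percent-encoded, so spaces become %20).
    requests.delete(f"{BASE}/index/node/{index}/{key}/{quote(old_value)}/{node_id}")
    # 2. Add the fresh entry.
    resp = requests.post(f"{BASE}/index/node/{index}",
                         json={"uri": node_uri, "key": key, "value": new_value})
    resp.raise_for_status()

reindex("some value", "some newer value")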

Remove all entries with a given node from an index
Example request
• DELETE http://localhost:7474/db/data/index/node/index_1512731130827_1/38
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Remove all entries with a given node and key from an index
Example request
• DELETE http://localhost:7474/db/data/index/node/index_1512731131119_1/kvkey2/41
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Remove all entries with a given node, key and value from an index
Example request
• DELETE http://localhost:7474/db/data/index/node/index_1512731130439_1/kvkey1/value1/34
• Accept: application/json; charset=UTF-8
Example response
• 204: No Content

Find node by exact match
Note

Spaces in the URI have to be encoded as %20.
Example request
• GET http://localhost:7474/db/data/index/node/index_1512731131564_1/key/the%2520value
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731131564_1/key/the%2520value/49",
"labels" : "http://localhost:7474/db/data/node/49/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/49/relationships/out",
"data" : { },
"all_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/49/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/49",
"property" : "http://localhost:7474/db/data/node/49/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/49/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/49/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/49/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/49/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/49/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/49/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 49,
"labels" : [ ]
}
} ]


Find node by query
The query language used here depends on what type of index you are querying. The default index type
is Lucene, in which case you should use the Lucene query language. Below is an example of a fuzzy
search over multiple keys.
See: http://lucene.apache.org/core/3_6_2/queryparsersyntax.html
Getting the results in a predefined ordering requires adding the parameter
order=ordering

where ordering is one of index, relevance or score. In this case an additional field named score will be
added to each result, holding the float score reported by the query.
Example request
• GET http://localhost:7474/db/data/index/node/index_1512731131511_1?query=Name:Build~0.1%20AND
%20Gender:Male
• Accept: application/json; charset=UTF-8
Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
[ {
"labels" : "http://localhost:7474/db/data/node/48/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/48/relationships/out",
"data" : {
"Name" : "Builder"
},
"all_typed_relationships" : "http://localhost:7474/db/data/node/48/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/48/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/48",
"property" : "http://localhost:7474/db/data/node/48/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/48/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/48/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/48/relationships/in",
"extensions" : { },
"create_relationship" : "http://localhost:7474/db/data/node/48/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/48/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/48/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/48/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 48,
"labels" : [ ]
}
} ]
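
The same query issued from code, with an ordering requested so that each hit also carries the score
field described above (the index name and query string come from the example; treat this as an
illustrative client, not part of the Neo4j API):

import requests

BASE = "http://localhost:7474/db/data"

resp = requests.get(f"{BASE}/index/node/index_1512731131511_1",
                    params={"query": "Name:Build~0.1 AND Gender:Male",
                            "order": "score"},
                    headers={"Accept": "application/json; charset=UTF-8"})
resp.raise_for_status()
for hit in resp.json():
    print(hit["data"], hit.get("score"))  # score present because an order was given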


21.21. Unique Indexing
Note

As of Neo4j 2.0, unique constraints have been added. These make Neo4j enforce uniqueness,
guaranteeing that it is maintained. See the section called “Constraints” [10] for details.
In most cases, unique constraints should be used rather than the features described below.
There are two modes of uniqueness enforcement:
• URL Parameter uniqueness=get_or_create: Create a new node/relationship and index it if no existing
one can be found. If an existing node/relationship is found, discard the sent data and return the
existing node/relationship.
• URL Parameter uniqueness=create_or_fail: Create a new node/relationship if no existing one can be
found in the index. If an existing node/relationship is found, return a conflict error.
For more information, see Section 18.6, “Creating unique nodes” [293].
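
From a client's perspective the two modes differ mainly in the response: with get_or_create the status
code says whether the node was created (201) or already existed (200), while create_or_fail answers
409 Conflict for a duplicate. A hedged sketch follows (the index name "people" is an illustrative
assumption; the payload mirrors the examples below).

import requests

BASE = "http://localhost:7474/db/data"

payload = {"key": "name", "value": "Tobias",
           "properties": {"name": "Tobias", "sequence": 1}}

resp = requests.post(f"{BASE}/index/node/people",  # 'people' is an assumed index name
                     params={"uniqueness": "get_or_create"}, json=payload)
created = resp.status_code == 201  # 200 would mean the node already existed
node = resp.json()

# With uniqueness=create_or_fail, the same duplicate request answers 409 Conflict.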

Get or create unique node (create)
The node is created if it doesn’t exist in the unique index already.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731131455_1?uniqueness=get_or_create
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Tobias",
"properties" : {
"name" : "Tobias",
"sequence" : 1
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731131455_1/name/Tobias/47
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/47/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/47/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/47/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/47/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/47",
"property" : "http://localhost:7474/db/data/node/47/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/47/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/47/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/47/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/47/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/47/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/47/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/47/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 47,
"labels" : [ ]

},
"data" : {
"sequence" : 1,
"name" : "Tobias"
},
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731131455_1/name/Tobias/47"
}

Get or create unique node (existing)
Here, no new node is created; the existing unique node is returned instead, since another node is
already indexed with the same data. The node data returned is that of the already existing node.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731130775_1?uniqueness=get_or_create
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"properties" : {
"name" : "Peter",
"sequence" : 2
}
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731130775_1/name/Peter/37
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/37/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/37/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/37/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/37/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/37",
"property" : "http://localhost:7474/db/data/node/37/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/37/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/37/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/37/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/37/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/37/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/37/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/37/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 37,
"labels" : [ ]
},
"data" : {
"sequence" : 1,
"name" : "Peter"
},
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731130775_1/name/Peter/37"
}

Create a unique node or return fail (create)
Here, in case of an already existing node, an error should be returned. In this example, no existing
indexed node is found and a new node is created.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731131401_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Tobias",
"properties" : {
"name" : "Tobias",
"sequence" : 1
}
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731131401_1/name/Tobias/46
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/46/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/46/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/46/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/46/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/46",
"property" : "http://localhost:7474/db/data/node/46/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/46/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/46/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/46/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/46/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/46/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/46/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/46/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 46,
"labels" : [ ]
},
"data" : {
"sequence" : 1,
"name" : "Tobias"
},
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731131401_1/name/Tobias/46"
}

Create a unique node or return fail (fail)
Here, in case of an already existing node, an error should be returned. In this example, an existing node
indexed with the same data is found and an error is returned.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731130318_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"properties" : {
"name" : "Peter",
"sequence" : 2
}
}

Example response
• 409: Conflict
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/31/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/31/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/31/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/31",
"property" : "http://localhost:7474/db/data/node/31/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/31/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/31/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/31/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/31/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/31/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/31/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 31,
"labels" : [ ]
},
"data" : {
"sequence" : 1,
"name" : "Peter"
},
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731130318_1/name/Peter/31"
}

Add an existing node to unique index (not indexed)
Associates a node with the given key/value pair in the given unique index.
In this example, we are using create_or_fail uniqueness.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731131177_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"value" : "some value",
"uri" : "http://localhost:7474/db/data/node/42",
"key" : "some-key"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/node/index_1512731131177_1/some-key/some%20value/42
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/42/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/42/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/42/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/42",
"property" : "http://localhost:7474/db/data/node/42/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/42/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/42/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/42/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/42/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/42/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/42/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 42,
"labels" : [ ]
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731131177_1/some-key/some%20value/42"
}

Add an existing node to unique index (already indexed)
In this case, the node already exists in the index, and thus we get an HTTP 409 (conflict) status response, as we have set the uniqueness to create_or_fail.
Example request
• POST http://localhost:7474/db/data/index/node/index_1512731131320_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"value" : "some value",
"uri" : "http://localhost:7474/db/data/node/45",
"key" : "some-key"
}

Example response
• 409: Conflict
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"labels" : "http://localhost:7474/db/data/node/44/labels",
"outgoing_relationships" : "http://localhost:7474/db/data/node/44/relationships/out",
"all_typed_relationships" : "http://localhost:7474/db/data/node/44/relationships/all/{-list|&|types}",
"traverse" : "http://localhost:7474/db/data/node/44/traverse/{returnType}",
"self" : "http://localhost:7474/db/data/node/44",
"property" : "http://localhost:7474/db/data/node/44/properties/{key}",
"properties" : "http://localhost:7474/db/data/node/44/properties",
"outgoing_typed_relationships" : "http://localhost:7474/db/data/node/44/relationships/out/{-list|&|types}",
"incoming_relationships" : "http://localhost:7474/db/data/node/44/relationships/in",
"create_relationship" : "http://localhost:7474/db/data/node/44/relationships",
"paged_traverse" : "http://localhost:7474/db/data/node/44/paged/traverse/{returnType}{?pageSize,leaseTime}",
"all_relationships" : "http://localhost:7474/db/data/node/44/relationships/all",
"incoming_typed_relationships" : "http://localhost:7474/db/data/node/44/relationships/in/{-list|&|types}",
"metadata" : {
"id" : 44,
"labels" : [ ]
},
"data" : {
"some-key" : "some value"
},
"indexed" : "http://localhost:7474/db/data/index/node/index_1512731131320_1/some-key/some%20value/44"
}

Get or create unique relationship (create)
Create a unique relationship in an index. If a relationship matching the given key and value already
exists in the index, it will be returned. If not, a new relationship will be created.

Note

The type and direction of the relationship are not regarded when determining uniqueness.
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135301_1/?uniqueness=get_or_create
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Tobias",
"start" : "http://localhost:7474/db/data/node/141",
"end" : "http://localhost:7474/db/data/node/142",
"type" : "knowledge"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/relationship/index_1512731135301_1/name/Tobias/63
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/141",
"property" : "http://localhost:7474/db/data/relationship/63/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/63",
"properties" : "http://localhost:7474/db/data/relationship/63/properties",
"type" : "knowledge",
"end" : "http://localhost:7474/db/data/node/142",
"metadata" : {
"id" : 63,
"type" : "knowledge"
},
"data" : {
"name" : "Tobias"
},
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135301_1/name/Tobias/63"
}

Get or create unique relationship (existing)
Here, in case of an already existing relationship, the sent data is ignored and the existing relationship
returned.
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135386_1?uniqueness=get_or_create
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"start" : "http://localhost:7474/db/data/node/145",
"end" : "http://localhost:7474/db/data/node/146",
"type" : "KNOWS"
}

Example response
• 200: OK
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/143",
"property" : "http://localhost:7474/db/data/relationship/64/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/64",
"properties" : "http://localhost:7474/db/data/relationship/64/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/144",
"metadata" : {
"id" : 64,
"type" : "KNOWS"
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135386_1/name/Peter/64"
}

Create a unique relationship or return fail (create)
Here, in case of an already existing relationship, an error should be returned. In this example, no
existing relationship is found and a new relationship is created.
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135526_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Tobias",
"start" : "http://localhost:7474/db/data/node/153",
"end" : "http://localhost:7474/db/data/node/154",
"type" : "KNOWS"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/relationship/index_1512731135526_1/name/Tobias/67
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/153",
"property" : "http://localhost:7474/db/data/relationship/67/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/67",
"properties" : "http://localhost:7474/db/data/relationship/67/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/154",
"metadata" : {
"id" : 67,
"type" : "KNOWS"
},
"data" : {
"name" : "Tobias"
},
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135526_1/name/Tobias/67"
}

Create a unique relationship or return fail (fail)
Here, in case of an already existing relationship, an error should be returned. In this example, an
existing relationship is found and an error is returned.
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135143_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"start" : "http://localhost:7474/db/data/node/133",
"end" : "http://localhost:7474/db/data/node/134",
"type" : "KNOWS"
}

Example response
• 409: Conflict
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/131",
"property" : "http://localhost:7474/db/data/relationship/59/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/59",
"properties" : "http://localhost:7474/db/data/relationship/59/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/132",
"metadata" : {
"id" : 59,
"type" : "KNOWS"
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135143_1/name/Peter/59"
}

Add an existing relationship to a unique index (not indexed)
Associates a relationship with the given key/value pair in the given unique index. If a relationship is already indexed under the given key and value, an HTTP 409 (conflict) status will be returned, as we are using create_or_fail. In this example, the relationship is not yet indexed, so it is added to the index and returned.
It’s possible to use get_or_create uniqueness as well.

Note

The type and direction of the relationship are not regarded when determining uniqueness.
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135093_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"uri" : "http://localhost:7474/db/data/relationship/58"
}

Example response
• 201: Created
• Content-Type: application/json; charset=UTF-8
• Location: http://localhost:7474/db/data/index/relationship/index_1512731135093_1/name/Peter/58
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/129",
"property" : "http://localhost:7474/db/data/relationship/58/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/58",
"properties" : "http://localhost:7474/db/data/relationship/58/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/130",
"metadata" : {
"id" : 58,
"type" : "KNOWS"
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135093_1/name/Peter/58"
}

Add an existing relationship to a unique index (already indexed)
Example request
• POST http://localhost:7474/db/data/index/relationship/index_1512731135193_1?uniqueness=create_or_fail
• Accept: application/json; charset=UTF-8
• Content-Type: application/json
{
"key" : "name",
"value" : "Peter",
"uri" : "http://localhost:7474/db/data/relationship/61"
}

Example response
• 409: Conflict
• Content-Type: application/json; charset=UTF-8
{
"extensions" : { },
"start" : "http://localhost:7474/db/data/node/135",
"property" : "http://localhost:7474/db/data/relationship/60/properties/{key}",
"self" : "http://localhost:7474/db/data/relationship/60",
"properties" : "http://localhost:7474/db/data/relationship/60/properties",
"type" : "KNOWS",
"end" : "http://localhost:7474/db/data/node/136",
"metadata" : {
"id" : 60,
"type" : "KNOWS"
},
"data" : { },
"indexed" : "http://localhost:7474/db/data/index/relationship/index_1512731135193_1/name/Peter/60"
}

21.22. WADL Support
The Neo4j REST API is a truly RESTful interface relying on hypermedia controls (links) to advertise
permissible actions to users. Hypermedia is a dynamic interface style where declarative constructs
(semantic markup) are used to inform clients of their next legal choices just in time.

Caution

RESTful APIs cannot be modeled by static interface description languages like WSDL or
WADL.
However, for some use cases developers may wish to expose a WADL description of the Neo4j REST API, particularly when using tooling that expects one.
In those cases WADL generation may be enabled by adding to your server’s conf/neo4j.properties file:
unsupported_wadl_generation_enabled=true
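Once the setting is in place and the server restarted, the description can be requested over HTTP. A sketch, assuming the common Jersey default path /application.wadl (this path is an assumption, not something documented here):
curl http://localhost:7474/application.wadl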

Caution

WADL is not an officially supported part of the Neo4j server API because WADL is
insufficiently expressive to capture the set of potential interactions a client can drive
with Neo4j server. Expect the WADL description to be incomplete, and in some cases
contradictory to the real API. In any case where the WADL description disagrees with
the REST API, the REST API should be considered authoritative. WADL generation may be
withdrawn at any point in the Neo4j release cycle.

21.23. Using the REST API from WebLogic
When deploying an application to WebLogic you may run into problems when Neo4j responds with an
HTTP status of 204 No Content. The response does not contain an entity body in such cases.
This can cause WebLogic to throw java.net.SocketTimeoutException: Read timed out for no obvious
reason.
If you encounter this, please try setting UseSunHttpHandler to true. You can for example do this by adding
the following to the WebLogic startup script:
-DUseSunHttpHandler=true

The WebLogic startup script is called bin/startWebLogic.sh (bin\startWebLogic.cmd on Windows).
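A minimal sketch of that addition, assuming the conventional WebLogic script layout where extra JVM flags are carried in the JAVA_OPTIONS variable:
# in bin/startWebLogic.sh (use the equivalent set syntax in the .cmd script on Windows)
JAVA_OPTIONS="${JAVA_OPTIONS} -DUseSunHttpHandler=true"
export JAVA_OPTIONS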


Chapter 22. Deprecations
This section outlines deprecations in Neo4j 2.3.12 or earlier in order to help you find a smoother
transition path to future releases. All features listed below may be removed in a future major release.
Cypher ExecutionEngine
  There’s no need to use ExecutionEngine anymore. Instead, use the execute methods on GraphDatabaseService (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/GraphDatabaseService.html#execute(java.lang.String)). ExecutionEngine and the related classes/interfaces have been deprecated; see the javadocs for details.
Embedded Java API
  See the Deprecated list in the Javadoc (http://neo4j.com/docs/2.3.12/javadocs/deprecated-list.html).
Graph Matching
  The graph-matching component will be removed in future releases.
Windows scripts
  The .bat files used to operate the database and tools on Windows are being phased out and will be removed in future releases, in favor of modern, equivalent PowerShell scripts. For more information, see Section 23.3, “Windows PowerShell module” [442].
STR() function
  The STR() function is deprecated from Neo4j version 2.3 and onwards.
HAS() function
  The HAS() function is deprecated from Neo4j version 2.3 and onwards. Please use EXISTS() instead.
Bare node patterns
  The usage of node identifiers without enclosing them in parentheses, such as in MATCH n-->m RETURN n.prop, is deprecated from Neo4j version 2.3 and onwards. Please use MATCH (n)-->(m) RETURN n.prop instead.
Create nodes from a parameter list of maps
  The syntax CREATE (:X {param}), where param is a list of maps, is deprecated in Neo4j version 2.3 and onwards. To achieve the same functionality, use UNWIND instead: UNWIND {param} AS props CREATE (n:X) SET n = props


Part V. Operations

This part describes how to install and maintain a Neo4j installation. This includes topics such as backing
up the database and monitoring the health of the database as well as diagnosing issues.

23. Installation & Deployment ............................................................................................................... 437
23.1. System Requirements ........................................................................................................... 438
23.2. Server Installation ................................................................................................................. 439
23.3. Windows PowerShell module ............................................................................................... 442
23.4. Upgrading .............................................................................................................................. 445
23.5. Setup for remote debugging ................................................................................................ 446
23.6. Usage Data Collector ............................................................................................................ 447
24. Configuration & Performance .......................................................................................................... 448
24.1. Introduction ........................................................................................................................... 449
24.2. Server Configuration ............................................................................................................. 450
24.3. Server Performance Tuning .................................................................................................. 454
24.4. Performance Guide ............................................................................................................... 455
24.5. Logical logs ............................................................................................................................ 459
24.6. Compressed storage of property values .............................................................................. 460
24.7. Memory mapped IO settings ................................................................................................ 462
24.8. Configuration Settings Reference ......................................................................................... 464
25. High Availability ................................................................................................................................ 472
25.1. Architecture ........................................................................................................................... 473
25.2. HA Setup and configuration ................................................................................................. 474
25.3. How Neo4j HA operates ....................................................................................................... 480
25.4. Arbiter Instances ................................................................................................................... 481
25.5. Upgrade of a Neo4j HA Cluster ............................................................................................ 482
25.6. High Availability setup tutorial .............................................................................................. 483
25.7. REST endpoint for HA status information ............................................................................ 489
25.8. Setting up HAProxy as a load balancer ................................................................................ 491
26. Backup .............................................................................................................................................. 494
26.1. Introducing Backup ............................................................................................................... 495
26.2. Performing Backups .............................................................................................................. 497
26.3. Restoring Your Data .............................................................................................................. 498
27. Security ............................................................................................................................................. 499
27.1. Securing Neo4j Server .......................................................................................................... 500
28. Monitoring ........................................................................................................................................ 505
28.1. Adjusting remote JMX access to the Neo4j Server ............................................................... 506
28.2. How to connect to a Neo4j instance using JMX and JConsole .............................................. 507
28.3. How to connect to the JMX monitoring programmatically ................................................... 510
28.4. Reference of supported JMX MBeans ................................................................................... 511
28.5. Metrics Reporting .................................................................................................................. 523


Chapter 23. Installation & Deployment
Neo4j is accessed as a standalone server, either directly through a REST interface or through a
language-specific driver.
Neo4j can be installed as a server, running either as a headless application or as a system service. For information on installing the Neo4j Server, see Section 23.2, “Server Installation” [439].
For running Neo4j in high availability mode, see Chapter 25, High Availability [472].

23.1. System Requirements
Memory constrains graph size, disk I/O constrains read/write performance, as always.

CPU
Performance is generally memory or I/O bound for large graphs, and compute bound for graphs that fit in memory.
Minimum: Intel Core i3
Recommended: Intel Core i7; IBM POWER8

Memory
More memory allows even larger graphs, but it needs to be configured properly to avoid lengthy garbage collection operations. See Section 24.4, “Performance Guide” [455] for suggestions.
Minimum: 2GB
Recommended: 16—32GB or more

Disk
Aside from capacity, the performance characteristics of the disk are the most important when selecting storage. Neo4j workloads tend significantly toward random reads. Prefer media with low average seek time: SSD over spinning disks. Consult the section called “Disks, RAM and other tips” [456] for more details.
Minimum: 10GB SATA
Recommended: SSD with SATA

Filesystem
For proper ACID behavior, the filesystem must support flush (fsync, fdatasync). See the section called “Linux filesystem tuning” [457] for a discussion on how to configure the filesystem on Linux for optimal performance.
Minimum: ext4 (or similar)
Recommended: ext4, ZFS

Software
Neo4j requires a Java Virtual Machine to operate. Community Edition installers for Windows and Mac include a JVM for convenience. Other distributions, including all distributions of Neo4j Enterprise Edition, require that the JVM be provided.
Java: OpenJDK 8 (preferred) or 7 (http://openjdk.java.net/), or Oracle Java 8 (preferred) or 7 (http://www.oracle.com/technetwork/java/javase/downloads/index.html); IBM Java 8 (preferred) or 7 (POWER8 only; http://www.ibm.com/developerworks/java/jdk/)
Operating Systems: Linux, HP-UX, and Windows Server 2012 for production; additionally, Windows XP and Mac OS X for development
Architectures: x86; OpenPOWER (POWER8, requires IBM Java or OpenJDK, version 7 or 8)

Important

It is strongly recommended to use an up-to-date Java release, as there are issues in early versions of both Java 7 and Java 8 that are known to adversely affect Neo4j.

23.2. Server Installation
Deployment Scenarios
As a developer, you may wish to download Neo4j and run it locally on your desktop computer. We
recommend this as an easy way to discover Neo4j.
• For Windows, see the section called “Windows” [439].
• For Unix/Linux, see the section called “Linux” [440].
• For OSX, see the section called “Mac OSX” [440].
As a systems administrator, you may wish to install Neo4j using a packaging system so you can ensure that all machines in a cluster have identical installations. See the section called “Linux Packages” [440] for more information on this.
For information on High Availability, please refer to Chapter 25, High Availability [472].

Prerequisites
With the exception of our Windows and Mac Installers, you’ll need a Java Virtual Machine installed on your computer. We recommend that you install OpenJDK 8 (preferred) or 7 (http://openjdk.java.net/), or Oracle Java 8 (preferred) or 7 (http://www.oracle.com/technetwork/java/javase/downloads/index.html).

Setting Proper File Permissions
When installing Neo4j Server, keep in mind that the bin/neo4j executable will need to be run by
some OS system user, and that user will need write permissions to some files/directories. This goes
specifically for the data/graph.db directory. That user will also need execute permissions on other files,
such as those in bin/.
It is recommended to either choose or create a user who will own and manage the Neo4j Server. This user should own the entire Neo4j directory, so make sure to untar/unzip it as this user and not with sudo (UNIX/Linux/OS X) etc.
If data/graph.db is not writable by the user, Neo4j won’t be able to write anything either to the store or its log files. As a result, any logs would be appended to console.log. The following error message would indicate a possible permissions issue: Write transactions to database disabled.
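A minimal sketch of these recommendations, assuming an illustrative installation directory of /opt/neo4j:
# create a dedicated user and give it ownership of the whole installation
sudo useradd --system neo4j
sudo chown -R neo4j:neo4j /opt/neo4j
# run the server as that user rather than as root
sudo -u neo4j /opt/neo4j/bin/neo4j start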

Windows
Windows Installer
1. Download the version that you want from http://neo4j.com/download/.
• Select the appropriate version and architecture for your platform.
2. Double-click the downloaded installer file.
3. Follow the prompts.

Note

The installer will prompt to be granted Administrator privileges. Newer versions of Windows
come with a SmartScreen feature that may prevent the installer from running — you can
make it run anyway by clicking "More info" on the "Windows protected your PC" screen.

Tip

If you install Neo4j using the windows installer and you already have an existing instance
of Neo4j the installer will select a new install directory by default. If you specify the same
directory it will ask if you want to upgrade. This should proceed without issue although
some users have reported a JRE is damaged error. If you see this error simply install Neo4j
into a different location.

Windows Console Application
1. Download the latest release from http://neo4j.com/download/.
• Select the appropriate Zip distribution.
2. Right-click the downloaded file, click Extract All.
• Refer to the top-level extracted directory as: NEO4J_HOME
3. Consult Section 23.3, “Windows PowerShell module” [442] for how to start or install Neo4j.

Note

Some users have reported problems on Windows when using the ZoneAlarm firewall. If you
are having problems getting large responses from the server, or if the web interface does
not work, try disabling ZoneAlarm. Contact ZoneAlarm support to get information on how to
resolve this.

Linux
Linux Packages
• For Debian packages, see the instructions at http://debian.neo4j.org/.
After installation you may have to do some platform specific configuration and performance tuning. For
that, refer to Section 24.4, “Performance Guide” [455].

Unix Console Application
1. Download the latest release from http://neo4j.com/download/.
• Select the appropriate tar.gz distribution for your platform.
2. Extract the contents of the archive, using tar -xf <filename>
• Refer to the top-level extracted directory as: NEO4J_HOME
3. Change directory to: $NEO4J_HOME
• Run: ./bin/neo4j console
4. Stop the server by typing Ctrl-C in the console.

Linux Service
The neo4j command can also be used with start, stop, restart or status instead of console. By using
these actions, you can create a Neo4j service. See the neo4j man page for further details.

Caution

This approach to running Neo4j as a service is deprecated. We strongly advise you to run
Neo4j from a package where feasible.
You can build your own init.d script. See for instance the Linux Standard Base specification on system initialization (http://refspecs.linuxfoundation.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/tocsysinit.html), or one of the many samples (https://gist.github.com/chrisvest/7673244) and tutorials (http://www.linux.com/learn/tutorials/442412-managing-linux-daemons-with-init-scripts).

Mac OSX
Mac OSX Installer
1. Download the .dmg installer that you want from http://neo4j.com/download/.
2. Click the downloaded installer file.
3. Drag the Neo4j icon into the Applications folder.

Tip

If you install Neo4j using the Mac installer and already have an existing instance of Neo4j the
installer will ensure that both the old and new versions can co-exist on your system.

Running Neo4j from the Terminal
The server can be started in the background from the terminal with the command neo4j start, and
then stopped again with neo4j stop. The server can also be started in the foreground with neo4j
console — then its log output will be printed to the terminal.
The neo4j-shell command can be used to interact with Neo4j from the command line using Cypher. It
will automatically connect to any server that is running on localhost with the default port, otherwise
it will show a help message. You can alternatively start the shell with an embedded Neo4j instance,
by using the -path path/to/data argument — note that only a single instance of Neo4j can access the
database files at a time.
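For example, assuming the current directory is the installation directory:
# connect to a server already running on localhost with the default port
bin/neo4j-shell
# or open the store directly with an embedded instance (no server may be running)
bin/neo4j-shell -path data/graph.db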

OSX Service
Use the standard OSX system tools to create a service based on the neo4j command.

A note on Java on OS X Mavericks
Unlike previous versions, OS X Mavericks does not come with Java pre-installed. You might encounter this the first time you run Neo4j, when OS X will trigger a popup offering to install Java SE 6.
Java SE 6 is incompatible with Neo4j 2.3.12, so we strongly advise you to skip installing Java SE 6 if you
have no other uses for it. Instead, for Neo4j 2.3.12 we recommend you install Java SE 8 (preferred) or 7
from Oracle (http://www.oracle.com/technetwork/java/javase/downloads/index.html) as that is what we
support for production use.

Multiple Server instances on one machine
Neo4j can be set up to run as several instances on one machine, providing for instance several
databases for development.
For how to set this up, see the section called “Alternative setup: Creating a local cluster for
testing” [486]. Just use the Neo4j edition of your choice, follow the guide and remember to not set the
servers to run in HA mode.

23.3. Windows PowerShell module
The Neo4j PowerShell module allows administrators to:
• audit and set Neo4j configuration settings,
• install, start and stop Neo4j Windows® Services
• and start tools, such as Neo4j Shell and Neo4j Import.

The PowerShell module is installed as part of the ZIP file distributions of Neo4j (http://neo4j.com/download/).

System Requirements
• Requires PowerShell v2.0 or above.
• Supported on either 32 or 64 bit operating systems.

How do I import the module?
The module file is located in the bin directory of your Neo4j installation, i.e. where you unzipped the
downloaded file. For example, if Neo4j was installed in C:\Neo4j then the module would be imported
like this:
Import-Module C:\Neo4j\bin\Neo4j-Management.psd1

This will add the module to the current session.

Note

On Windows it is sometimes necessary to Unblock a downloaded zip file before you can
import its contents as a module. If you right-click on the zip file and choose "Properties" you
will get a dialog. Bottom-right on that dialog you will find an "Unblock" button. Click that.
Then you should be able to import the module.

Note

Running scripts has to be enabled on the system. This can for example be achieved by
executing the following from an elevated PowerShell prompt:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned

For more information see About execution policies (http://go.microsoft.com/fwlink/?LinkID=135).
Once the module has been imported you can start an interactive console version of a Neo4j Server like
this:
'C:\Neo4j' | Start-Neo4jServer -Console -Wait

To stop the server, issue ctrl-c in the console window that was created by the command.

How do I get help about the module?
Once the module is imported you can query the available commands like this:
Get-Command -Module Neo4j-Management

The output should be similar to the following:
CommandType     Name                          Version    Source
-----------     ----                          -------    ------
Function        Get-Neo4jHome                 2.3.0      Neo4j-Management
Function        Get-Neo4jServer               2.3.0      Neo4j-Management
Function        Get-Neo4jSetting              2.3.0      Neo4j-Management
Function        Initialize-Neo4jHACluster     2.3.0      Neo4j-Management
Function        Initialize-Neo4jServer        2.3.0      Neo4j-Management
Function        Install-Neo4jArbiter          2.3.0      Neo4j-Management
Function        Install-Neo4jServer           2.3.0      Neo4j-Management
Function        Remove-Neo4jSetting           2.3.0      Neo4j-Management
Function        Restart-Neo4jArbiter          2.3.0      Neo4j-Management
Function        Restart-Neo4jServer           2.3.0      Neo4j-Management
Function        Set-Neo4jSetting              2.3.0      Neo4j-Management
Function        Start-Neo4jArbiter            2.3.0      Neo4j-Management
Function        Start-Neo4jBackup             2.3.0      Neo4j-Management
Function        Start-Neo4jImport             2.3.0      Neo4j-Management
Function        Start-Neo4jServer             2.3.0      Neo4j-Management
Function        Start-Neo4jShell              2.3.0      Neo4j-Management
Function        Stop-Neo4jArbiter             2.3.0      Neo4j-Management
Function        Stop-Neo4jServer              2.3.0      Neo4j-Management
Function        Uninstall-Neo4jArbiter        2.3.0      Neo4j-Management
Function        Uninstall-Neo4jServer         2.3.0      Neo4j-Management

The module also supports the standard PowerShell help commands.
Get-Help Initialize-Neo4jServer

To see examples for a command, do it like this:
Get-Help Initialize-Neo4jServer -examples

Basic Examples
• Retrieve basic information about the Neo4j Server, e.g. version and edition.
Get-Neo4jServer C:\Neo4j

• Retrieve all of the settings of a Neo4j Server and display in a nice table
'C:\Neo4j' | Get-Neo4jSetting | `
Select ConfigurationFile, Name, Value | `
Format-Table

• The module uses the pipeline so you can export the settings, modify or filter them. For example, only
show settings with the value of True:
'C:\Neo4j' | Get-Neo4jSetting | `
Where { $_.Value -eq 'True' } | `
Select ConfigurationFile, Name, Value | `
Format-Table

• Quickly configure a Neo4j Server from saved settings in a CSV file.
Import-CSV -Path 'C:\Neo4jSettings.CSV' | Set-Neo4jSetting -Force

Advanced examples
• You can quickly configure and start an interactive console version of a Neo4j Server like this:
'C:\Neo4j' | `
Initialize-Neo4jServer -ListenOnIPAddress 127.0.0.1 -PassThru | `
Start-Neo4jServer -Console -Wait

To stop the server, issue ctrl-c in the console window that was created by the command.
• You can quickly configure and start a Service version of a Neo4j Server.

Note

The following must be executed from an elevated PowerShell prompt, where the Neo4j
module has been imported into the session.


'C:\Neo4j' | `
Initialize-Neo4jServer -ListenOnIPAddress 127.0.0.1 -PassThru | `
Install-Neo4jServer -PassThru | `
Start-Neo4jServer

To stop the server do this:
'C:\Neo4j' | Stop-Neo4jServer

• Create a three node cluster on the local computer. This example assumes three installations of the Enterprise edition of Neo4j, installed at C:\Neo4j-1, C:\Neo4j-2 and C:\Neo4j-3.
'C:\Neo4j-1' |
Initialize-Neo4jServer `
-ListenOnIPAddress 127.0.0.1 `
-HTTPPort 7474 `
-OnlineBackupServer '127.0.0.1:6362' `
-PassThru |
Initialize-Neo4jHACluster `
-ServerID 1 `
-InitialHosts '127.0.0.1:5001' `
-ClusterServer '127.0.0.1:5001' `
-HAServer '127.0.0.1:6001' `
-PassThru |
Start-Neo4jServer -Console
'C:\Neo4j-2' |
Initialize-Neo4jServer `
-ListenOnIPAddress 127.0.0.1 `
-HTTPPort 7475 `
-ClearExistingDatabase `
-OnlineBackupServer '127.0.0.1:6363' `
-PassThru |
Initialize-Neo4jHACluster `
-ServerID 2 `
-InitialHosts '127.0.0.1:5001' `
-ClusterServer '127.0.0.1:5002' `
-HAServer '127.0.0.1:6002' `
-DisallowClusterInit `
-PassThru |
Start-Neo4jServer -Console
'C:\Neo4j-3' |
Initialize-Neo4jServer `
-ListenOnIPAddress 127.0.0.1 `
-HTTPPort 7476 `
-ClearExistingDatabase `
-OnlineBackupServer '127.0.0.1:6364' `
-PassThru |
Initialize-Neo4jHACluster `
-ServerID 3 `
-InitialHosts '127.0.0.1:5001' `
-ClusterServer '127.0.0.1:5003' `
-HAServer '127.0.0.1:6003' `
-DisallowClusterInit `
-PassThru |
Start-Neo4jServer -Console

Common PowerShell parameters
The module commands support the common PowerShell parameters of Verbose, Debug, WhatIf etc.


23.4. Upgrading
This section describes upgrading a single Neo4j instance. To upgrade a Neo4j HA cluster (Neo4j
Enterprise), a very specific procedure must be followed. Please see Section 25.5, “Upgrade of a Neo4j HA
Cluster” [482].
Throughout these instructions, the files used to store the Neo4j data are referred to as database files. The location of these files is specified by configuring the org.neo4j.server.database.location variable in conf/neo4j-server.properties.

Disk space requirements

An upgrade requires substantial free disk space, as it makes an entire copy of the database.
The upgraded database may also require larger data files overall.
It is recommended to make available an extra 50% disk space on top of the existing
database files.
In addition to this, don’t forget to reserve the disk space needed for the pre-upgrade
backup.

Supported upgrade paths
Before upgrading to a new major or minor release, the database must first be upgraded to the latest
version within the relevant release. The latest version is available at this page: http://neo4j.com/
download/other-releases. The following Neo4j upgrade paths are supported:
• 1.9.latest → 2.3.12
• 2.0.latest → 2.3.12
• 2.1.latest → 2.3.12
• 2.2.latest → 2.3.12
• 2.3.any → 2.3.12

Upgrade instructions
1. Cleanly shut down the database if it is running.
2. Make a backup copy of the database files. If using the online backup tool available with Neo4j
Enterprise, ensure that backups have completed successfully.
3. Install Neo4j 2.3.12.
4. Review the parameter settings in the files under conf directory in the previous installation, and
transfer any custom set parameters to the 2.3.12 installation. Be aware of parameters that have
changed names between versions. Also, ensure that you configure the 2.3.12 installation to use the
same database file directory as the previous installation.
5. Set the Neo4j configuration parameter allow_store_upgrade=true in the conf/neo4j.properties file of
the 2.3.12 installation. Neo4j will fail to start without this configuration.
6. Start up Neo4j 2.3.12.
7. The database upgrade will take place during startup.
8. Information about the upgrade and a progress indicator are logged into the messages.log file inside
the database file directory.
9. When the upgrade has finished, allow_store_upgrade should be set to false or be removed.
10. It is good practice to make a full backup immediately after the upgrade; a condensed command sketch of these steps follows.
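A condensed sketch of the steps above for a Unix console installation; the paths are illustrative:
# 1-2: cleanly shut down the old installation and back up the database files
/opt/neo4j-2.2/bin/neo4j stop
cp -a /opt/neo4j-2.2/data/graph.db /backups/graph.db-pre-upgrade
# 5: allow the store upgrade in the new installation
echo 'allow_store_upgrade=true' >> /opt/neo4j-2.3.12/conf/neo4j.properties
# 6-8: start the new server; the upgrade runs during startup and logs to messages.log
/opt/neo4j-2.3.12/bin/neo4j start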

Cypher compatibility

The Cypher language may evolve between Neo4j versions. For backward compatibility,
Neo4j provides directives which allow explicitly selecting a previous Cypher language
version. This is possible to do globally or for individual statements, as described in the
Cypher Compatibility section.


23.5. Setup for remote debugging
In order to configure the Neo4j server for remote debugging sessions, the Java debugging parameters need to be passed to the Java process through the configuration. They live in the conf/neo4j-wrapper.properties file.
In order to specify the parameters, add a line for the additional Java arguments like this:
wrapper.java.additional=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005

This configuration will start a Neo4j server ready for remote debugging attachment on localhost, port 5005. Use these parameters to attach to the process from Eclipse, IntelliJ or your remote debugger of choice after starting the server.
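Besides IDE debuggers, the connection can be verified with jdb, the command-line debugger shipped with the JDK:
jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=5005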


23.6. Usage Data Collector
The Neo4j Usage Data Collector is a sub-system that gathers usage data, reporting it to the UDC server at udc.neo4j.org. It is easy to disable, and does not collect any data that is confidential. For more information about what is being sent, see below.
The Neo4j team uses this information as a form of automatic, effortless feedback from the Neo4j
community. We want to verify that we are doing the right thing by matching download statistics with
usage statistics. After each release, we can see if there is a larger retention span of the server software.
The data collected is clearly stated here. If any future versions of this system collect additional data, we
will clearly announce those changes.
The Neo4j team is very concerned about your privacy. We do not disclose any personally identifiable
information.

Technical Information
To gather good statistics about Neo4j usage, UDC collects this information:
• Kernel version: The build number, and if there are any modifications to the kernel.
• Store id: A randomized globally unique id created at the same time a database is created.
• Ping count: UDC holds an internal counter which is incremented for every ping, and reset for every
restart of the kernel.
• Source: This is either "neo4j" or "maven". If you downloaded Neo4j from the Neo4j website, it’s "neo4j"; if you are using Maven to get Neo4j, it will be "maven".
• Java version: The referrer string shows which version of Java is being used.
• Registration id: For registered server instances.
• Tags about the execution context (e.g. test, language, web-container, app-container, spring, ejb).
• Neo4j Edition (community, enterprise).
• A hash of the current cluster name (if any).
• Distribution information for Linux (rpm, dpkg, unknown).
• User-Agent header for tracking usage of REST client drivers
• MAC address to uniquely identify instances behind firewalls.
• The number of processors on the server.
• The amount of memory on the server.
• The JVM heap size.
• The number of nodes, relationships, labels and properties in the database.
After startup, UDC waits for ten minutes before sending the first ping. It does this for two reasons: first, we don’t want the startup to be slower because of UDC, and second, we want to keep pings from automatic tests to a minimum. The ping to the UDC servers is done with an HTTP GET.

How to disable UDC
UDC is easily turned off by disabling it in the database configuration: in conf/neo4j.properties for Neo4j server, or in the configuration passed to the database in embedded mode.
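For example, to turn it off in a server installation (the setting itself is documented below):
# in conf/neo4j.properties
neo4j.ext.udc.enabled=false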
Usage Data Collector configuration settings
• neo4j.ext.udc.enabled: Enable the UDC extension.

neo4j.ext.udc.enabled
Description: Enable the UDC extension.
Valid values: neo4j.ext.udc.enabled is a boolean.
Default value: true


Chapter 24. Configuration & Performance
In order to get optimum performance out of Neo4j for your application there are a few parameters that
can be tweaked. The two main components that can be configured are the Neo4j caches and the JVM
that Neo4j runs in. The following sections describe how to tune these.


24.1. Introduction
To gain good performance, these are the things to look into first:
• Make sure the JVM is not spending too much time performing garbage collection. The goal is to have a large enough heap to make sure that heavy/peak load will not result in so-called GC thrashing. Performance can drop by as much as two orders of magnitude when GC thrashing happens.
• Start the JVM with the -server flag and a good-sized heap. Having too large a heap may also hurt performance, so you may have to try some different heap sizes.
• Use a concurrent garbage collector. We find that -XX:+UseG1GC works well in most use-cases.
• Give the Neo4j page cache generous amounts of memory. On a dedicated server, there are four big
memory allocations you need to balance: the operating system, the Neo4j JVM, the Neo4j page cache
and the paging memory for the Lucene indexes.
• The operating system on a dedicated server can usually make do with 1 to 2 GBs of memory, but
the more physical memory the machine has, the more memory the operating system will need.
• The Neo4j JVM needs enough heap memory for the transaction state and query processing, plus
some head-room for the garbage collector. Because the heap memory needs are so workload
dependent, it is common to see configurations from 1 GB, up to 32 GBs of heap memory.
• The Neo4j page cache should preferably have enough memory to keep the entire data set in
memory, which is to say, the page cache should be big enough to fit all of the neostore.* files that
are not neostore.transaction.db.* files.
• Lastly, leave enough memory for the operating system page cache, to fit the contents of the index
and schema directories, since it will impact index lookup performance if the indexes cannot fit in
memory.
See the Section 24.4, “Performance Guide” [455] chapter for more information on how to tune Neo4j.
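As a sketch only, the balance described above might look like this on a dedicated 32 GB server; the numbers merely illustrate the split and are not recommendations:
# conf/neo4j-wrapper.conf: fix the JVM heap (values in MB) and use the G1 collector
wrapper.java.initmemory=8192
wrapper.java.maxmemory=8192
wrapper.java.additional=-XX:+UseG1GC
# conf/neo4j.properties: size the page cache to hold the store files
dbms.pagecache.memory=16g
# the remaining memory is left to the operating system and its page cache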

How to add configuration settings
When the database is given no configuration, it will try to use reasonable defaults. This is seldom
optimal, however, because the database and the JVM have no knowledge about your workload or your
performance requirements.
The way you configure Neo4j depends on your mode of deployment: whether you are using the database as an embedded library, or as a server.
Embedded
  When creating the embedded Neo4j instance it is possible to pass in parameters contained in a map where keys and values are strings; see the section called “Starting an embedded database with configuration settings” [579] for an example.
Server
  When using the Neo4j REST server, see Section 24.2, “Server Configuration” [450] for how to add configuration settings for the database to the server.


24.2. Server Configuration
Quick info
• The server’s primary configuration file is found under conf/neo4j-server.properties.
• Low-level performance tuning parameters and configuration of legacy indexes and the
remote shell are found in conf/neo4j.properties.
• Configuration of the daemonizing wrapper is found in conf/neo4j-wrapper.properties.
• HTTP logging configuration is found in conf/neo4j-http-logging.xml.

Important server configuration parameters
The main configuration file for the server can be found at conf/neo4j-server.properties. This file contains
several important settings, and although the defaults are sensible, administrators might choose to make
changes (especially to the port settings).
Set the location on disk of the database directory like this:
org.neo4j.server.database.location=data/graph.db

Note

On Windows systems, absolute locations including drive letters need to read "c:/data/db".
Specify the HTTP server port supporting data, administrative, and UI access:
org.neo4j.server.webserver.port=7474

Specify the client accept pattern for the webserver (default is 127.0.0.1, localhost only):
#allow any client to connect
org.neo4j.server.webserver.address=0.0.0.0

For securing the Neo4j Server, see also Chapter 27, Security [499].
Set the location of the round-robin database directory which gathers metrics on the running server
instance:
org.neo4j.server.webadmin.rrdb.location=data/graph.db/../rrd

Set the URI path for the REST data API through which the database is accessed. This should be a relative
path.
org.neo4j.server.webadmin.data.uri=/db/data/

Set the management URI for the administration API that the Webadmin tool uses. This should be a relative path.
org.neo4j.server.webadmin.management.uri=/db/manage

To force the server to use IPv4 network addresses, add a new parameter in conf/neo4j-wrapper.conf under the section Java Additional Parameters:
wrapper.java.additional=-Djava.net.preferIPv4Stack=true

Specify the number of threads used by the Neo4j Web server to control the level of concurrent HTTP
requests that the server will service.

org.neo4j.server.webserver.maxthreads=200

Note

The default value is the number of CPUs reported available by the JVM, limited to a
maximum of 500. The limit can be exceeded by specifying a larger value.
The server guards against orphaned transactions by using a timeout. If there are no requests for a
given transaction within the timeout period, the server will roll it back. You can configure the timeout
period by setting the following property to the number of seconds before timeout. The default timeout
is 60 seconds.
org.neo4j.server.transaction.timeout=60

Low-level performance tuning parameters can be explicitly set by referring to the following property:
org.neo4j.server.db.tuning.properties=neo4j.properties

If this property isn’t set, the server will look for a file called neo4j.properties in the same directory as the
neo4j-server.properties file.
If this property isn’t set, and there is no neo4j.properties file in the default configuration directory, then
the server will log a warning. Subsequently, at runtime, the database engine will attempt to tune itself
based on the prevailing conditions.

Neo4j Database performance configuration
The fine-tuning of the low-level Neo4j graph database engine is specified in a separate properties file,
conf/neo4j.properties.
The graph database engine has a range of performance tuning options which are enumerated in
Section 24.3, “Server Performance Tuning” [454]. Note that factors other than Neo4j tuning should
be considered when performance tuning a server, including general server load, memory and file
contention, and even garbage collection penalties on the JVM, though such considerations are beyond
the scope of this configuration document.

HTTP logging configuration
As well as logging events happening within the Neo4j server, it is possible to log the HTTP requests and
responses that the server consumes and produces. Configuring HTTP logging requires operators to enable the logger, configure where it logs to, and then optionally configure the log format.

Important

By default the HTTP logger uses Common Log Format (http://en.wikipedia.org/wiki/Common_Log_Format), meaning that most Web server tooling can automatically consume such logs. In general users should only enable HTTP logging,
select an output directory, and if necessary alter the rollover and retention policies.
To enable HTTP logging, edit the conf/neo4j-server.properties file to resemble the following:
org.neo4j.server.http.log.enabled=true
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml

Using org.neo4j.server.http.log.enabled=true tells the server that HTTP logging is enabled.
HTTP logging can be disabled completely by setting this property to false. The setting
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml specifies the logging format and rollover
policy file that governs how HTTP log output is presented and archived. The defaults provided with Neo4j server use an hourly log rotation and Common Log Format.
If logging is set up to use log files then the server will check that the log file directory exists and is
writable. If this check fails, then the server will not start and will report the failure to another available
channel like standard out.

Tip

Neo4j server now has experimental support for logging full request and response bodies. It
is enabled by setting the following property in neo4j-server.properties:
org.neo4j.server.http.unsafe.content_log.enabled=true

The following logging pattern must also be specified in neo4j-http-logging.xml:
%fullRequest\n\n%fullResponse

This functionality fully duplicates HTTP requests and responses, logging them out to disk. As
such it is strongly advised to not run this in a production setting because of the potential
to constrain performance. However it can prove useful in testing and pre-production
environments.

Using X-Forwarded-Proto and X-Forwarded-Host to parameterize the base
URI for REST responses
There are occasions, for example when you want to host Neo4j server behind a proxy (e.g. one that handles HTTPS traffic), when you still want Neo4j to respect the base URI of that externally visible proxy.
Ordinarily Neo4j uses the HOST header of the HTTP request to construct URIs in its responses. Where a proxy is involved however, this is often undesirable. Instead Neo4j uses the X-Forwarded-Host and X-Forwarded-Proto headers provided by proxies to parameterize the URIs in the responses from the database’s REST API. From the outside it looks as if the proxy generated that payload. If an X-Forwarded-Host header value contains more than one address (X-Forwarded-Host allows comma-and-space separated lists of addresses), Neo4j picks the first, which represents the client request.
In order to take advantage of this functionality your proxy server must be configured to transmit these
headers to the Neo4j server. Failure to transmit both X-Forwarded-Host and X-Forwarded-Proto headers
will result in the original base URI being used.
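A simple way to observe this behavior is to send the headers by hand and inspect the URIs in the service root document:
curl -H 'X-Forwarded-Host: example.com' -H 'X-Forwarded-Proto: https' \
  http://localhost:7474/db/data/
# the discovery URIs in the response should now begin with https://example.com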

Enabling logging from the garbage collector
To get garbage collection logging output you have to pass the corresponding option to the server JVM
by setting the following value in conf/neo4j-wrapper.conf:
wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log

This line is already present and only needs uncommenting. Note that GC logging output is not directed
to the console; you will find the logging statements in data/log/neo4j-gc.log, or whatever location you set
the option to.
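If more detail is wanted in the GC log, additional standard HotSpot options can be passed the same
way; for example (these are plain JVM flags, not Neo4j-specific settings):
wrapper.java.additional=-XX:+PrintGCDetails
wrapper.java.additional=-XX:+PrintGCDateStamps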

Web Interface configuration settings
Whitelist for remote guides in Browser
The Browser can :play guides from remote locations. You can specify a whitelist of hosts from which
the Browser will be allowed to fetch content.
In the conf/neo4j-server.properties file:
# To allow the default hosts
dbms.browser.remote_content_hostname_whitelist="http://guides.neo4j.com,https://guides.neo4j.com,http://localhost,https://localhost"
# To allow all hosts (enterprise edition only)
dbms.browser.remote_content_hostname_whitelist="*"

Outgoing connections for Browser
The Browser can load services and data from remote locations; this configuration setting can be used
to restrict that.
In the conf/neo4j-server.properties file:
dbms.security.allow_outgoing_browser_connections=true

Disabling console types in Webadmin
You may, for security reasons, want to disable the Neo4j Shell in Webadmin. Shells allow arbitrary
code execution, and so they could constitute a security risk if you do not trust all users of your Neo4j
Server.
In the conf/neo4j-server.properties file:
# To disable all shells:
org.neo4j.server.manage.console_engines=
# To enable only the Neo4j Shell:
org.neo4j.server.manage.console_engines=shell

24.3. Server Performance Tuning
At the heart of the Neo4j server is a regular Neo4j storage engine instance. That engine can be tuned in
the same way as the other embedded configurations, using the same file format. The only difference is
that the server must be told where to find the fine-tuning configuration.
Quick info
• The conf/neo4j.properties file is a standard configuration file that databases load in order to
tune their memory use

Specifying Neo4j tuning properties
The conf/neo4j-server.properties file in the server distribution is the main configuration file for
the server. In this file we can specify a second properties file that contains the database tuning
settings (that is, the neo4j.properties file). This is done by setting a single property to point to a valid
neo4j.properties file:
org.neo4j.server.db.tuning.properties={neo4j.properties file}

On restarting the server the tuning enhancements specified in the neo4j.properties file will be loaded
and configured into the underlying database engine.
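For example, to point at the tuning file shipped in the server's conf directory (a typical value):
org.neo4j.server.db.tuning.properties=conf/neo4j.properties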

Specifying JVM tuning properties
Tuning the standalone server is achieved by editing the neo4j-wrapper.conf file in the conf directory of
NEO4J_HOME.
Edit the following properties:
neo4j-wrapper.conf JVM tuning properties

Property Name              Meaning
wrapper.java.initmemory    initial heap size (in MB)
wrapper.java.maxmemory     maximum heap size (in MB)
wrapper.java.additional    additional literal JVM parameter
For more information on the tuning properties, see Section 24.4, “Performance Guide” [455].
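For illustration, a minimal sketch of these properties might look as follows; the 2048 MB heap is a
hypothetical value that must be sized for your machine (setting the initial and maximum sizes to the
same value avoids pauses from heap resizing, as discussed in Section 24.4):
wrapper.java.initmemory=2048
wrapper.java.maxmemory=2048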


24.4. Performance Guide
This is the Neo4j performance guide. It will give you guidance on how to tune Neo4j to achieve
maximum performance.

Try this first
The first thing to look at when Neo4j is not performing as expected is how the Cypher queries are being
executed. Make sure that they don’t do more work than they have to. Some queries may accidentally
be written in a way that generates a large cartesian product. Other queries may have to perform
expensive label scans because an important index is missing. The Chapter 15, Query Tuning [253]
chapter has more information on how to investigate Cypher performance issues.
The second thing to look at is the Java Virtual Machine process. Make sure that it has enough memory
to do its work. If there is not enough memory to keep the JVM heap resident, the operating system
will swap it out to storage. When garbage collection takes place, the swapped-out heap memory
has to be swapped back in, and something else swapped out. This is called swap-thrashing and has
a dramatic impact on the performance of a database, rendering it practically unusable. A well-tuned
Neo4j database should not have any swap activity in its steady state.
Assigning sufficient memory to the JVM also limits the time spent in garbage collection. The goal is to
have a large enough heap to handle peak load without thrashing occurring in the garbage collector.
Performance can drop as much as two orders of magnitude when GC-thrashing happens.
Start the JVM with the -server flag and an -Xmx<size> option, for example -Xmx512m for 512 MB of
memory or -Xmx3g for 3 GB. Having too large a heap may also hurt performance, so you may have to
try out some different heap sizes. Make sure that you are using a concurrent garbage collector. We find
that -XX:+UseG1GC works well in most use cases.
The next thing to look at is the file caching memory. Neo4j uses its own page cache for the store files,
and relies on the operating system for caching the index files. Make sure that the dbms.pagecache.memory
setting (in neo4j.properties) is large enough to fit the entire store, if possible. But also make sure that
you are not allocating so much memory to the JVM and the Neo4j page cache that there is no memory
left for the operating system to cache the Lucene index files. For more information on configuration see
Chapter 24, Configuration & Performance [448].
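As a sketch of how the pieces fit together, a hypothetical dedicated server with 32 GB of RAM might be
divided up roughly like this (illustrative numbers, not a recommendation):

# conf/neo4j-wrapper.conf: an 8 GB heap
wrapper.java.initmemory=8192
wrapper.java.maxmemory=8192
# conf/neo4j.properties: a 16 GB page cache
dbms.pagecache.memory=16g

This leaves roughly 8 GB for the operating system, the Lucene index files, and other processes.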

Configuring heap size and garbage collection
The size of the JVM heap is an important aspect of the performance of any Java application. The heap
is separated into an old generation and a young generation. New objects are allocated in the young
generation, and then later moved to the old generation, if they stay live (in use) for long enough. When
a generation fills up, the garbage collector performs a collection, during which all other threads in the
process are paused. The young generation is quick to collect since the pause time correlates with the
live set of objects, and is independent of the size of the young generation. In the old generation, pause
times roughly correlate with the size of the heap. For this reason, the heap should ideally be sized and
tuned such that transaction and query state never makes it to the old generation.

Note

When using Neo4j Server, JVM configuration goes into the conf/neo4j-wrapper.conf file. See
Section 24.2, “Server Configuration” [450].
In server deployments, the heap size is configured with the wrapper.java.maxmemory (in MBs) setting in
the neo4j-wrapper.conf file. For embedded, the heap size is specified by giving the -Xmx???m command
line flag to the java process, where the ??? is the maximum heap size in MBs. The initial size of the heap
is specified by the wrapper.java.initmemory setting, or with the -Xms???m flag, or chosen heuristically
by the JVM itself if left unspecified. The JVM will automatically grow the heap as needed, up to the
maximum size. The growing of the heap requires a full GC cycle. If you know that you will need all the
heap memory, you can set the initial heap size and the maximum heap size to the same value. This way
the pause that happens when the garbage collector grows the heap can be avoided.
The ratio of the size between the old generation and the new generation of the heap is controlled
by the -XX:NewRatio=N flag. N is typically between 2 and 8 by default. A ratio of 2 means that the old
generation size, divided by the new generation size, is equal to 2. In other words, two thirds of the heap
memory will be dedicated to the old generation. A ratio of 3 will dedicate three quarters of the heap
to the old generation, and a ratio of 1 will keep the two generations about the same size. A ratio of 1 is
quite aggressive, but may be necessary if your transactions change a lot of data. Having a large new
generation can also be important if you run Cypher queries that need to keep a lot of data resident, for
example when sorting big result sets.
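For example, to keep the two generations about the same size in a server deployment, the ratio could
be set in conf/neo4j-wrapper.conf (an illustrative value; only change it if profiling points to generation
sizing):
wrapper.java.additional=-XX:NewRatio=1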
If the new generation is too small, short-lived objects may be moved to the old generation too soon.
This is called premature promotion and will slow the database down by increasing the frequency of
old generation GC cycles. If the new generation is too big, the garbage collector may decide that the
old generation does not have enough space to fit all the objects it expects to promote from the new to
the old generation. This turns new generation GC cycles into old generation GC cycles, again slowing
the database down. Running more concurrent threads means that more allocations can take place in a
given span of time, in turn increasing the pressure on the new generation in particular.

Caution

The Compressed OOPs feature in the JVM allows object references to be compressed to use
only 32 bits. The feature saves a lot of memory, but is not enabled for heaps larger than
32 GB. Gains from increasing the heap size beyond 32 GB can therefore be small or even
negative, unless the increase is significant (64 GB or above).
Neo4j has a number of long-lived objects that stay around in the old generation, effectively for the
lifetime of the Java process. To process them efficiently, and without adversely affecting the GC pause
time, we recommend using a concurrent garbage collector.

Tip

The recommended garbage collector to use when running Neo4j in production is the
G1 garbage collector. G1 is turned on by default in server deployments. For embedded
deployments, it can be turned on by supplying -XX:+UseG1GC as a JVM parameter.
How to tune the specific GC algorithm depends on both the JVM version and the workload. It is
recommended to test the GC settings under realistic load for days or weeks. Problems like heap
fragmentation can take a long time to surface.

Disks, RAM and other tips
As with any persistence solution, performance depends a lot on the persistence media used. Better
disks equal better performance.
If you have multiple disks or persistence media available it may be a good idea to divide the store
files and transaction logs across those disks. Keeping the store files on disks with low seek time can
do wonders for read operations. Today a typical mechanical drive has an average seek time of about
5ms. This can cause a query or traversal to be very slow when the amount of RAM assigned to the
page cache is too small. A good, new SATA-enabled SSD has an average seek time of less than 100
microseconds, meaning those scenarios will execute at least 50 times faster. However, this is still tens
or hundreds of times slower than accessing RAM.
To avoid hitting disk you need more RAM. On a standard mechanical drive you can handle graphs with
a few tens of millions of primitives (nodes, relationships and properties) with 2-3 GB of RAM. A server
with 8-16 GB of RAM can handle graphs with hundreds of millions of primitives, and a good server
with 16-64 GB can handle billions of primitives. However, if you invest in a good SSD you will be able to
handle much larger graphs on less RAM.
Use tools like dstat or vmstat to gather information when your application is running. If the swap or
paging numbers are high, that is a sign that the Lucene indexes don’t quite fit in memory. In this case,
queries that do index lookups will have high latencies.
When Neo4j starts up, its page cache is empty and needs to warm up. This can take a while, especially
for large stores. It is not uncommon to see a long period with many blocks being read from the drive,
and high IO wait times.
Neo4j also flushes its page cache in the background, so it is not uncommon to see a steady trickle of
blocks being written to the drive during steady-state. This background flushing only produces a small
amount of IO wait, however. If the IO wait times are high during steady-state, it may be a sign that
Neo4j is bottle-necked on the random IO performance of the drive. The best drives for running Neo4j
are fast SSDs that can take lots of random IOPS.

Linux filesystem tuning
Databases often produce many small and random reads when querying data, and few sequential writes
when committing changes. Neo4j is no different in this regard.
By default, most Linux distributions schedule IO requests using the Completely Fair Queuing (CFQ)
algorithm, which provides a good balance between throughput and latency. The particular IO workload
of a database, however, is better served by the Deadline scheduler. The Deadline scheduler gives
preference to read requests, and processes them as soon as possible. This tends to decrease the
latency of reads, while the latency of writes goes up. Since the writes are usually sequential, their
lingering in the IO queue increases the chance of overlapping or adjacent write requests being merged
together. This effectively reduces the number of writes that are sent to the drive.
On Linux, the IO scheduler for a drive, in this case sda, can be changed at runtime like this:
$ echo 'deadline' > /sys/block/sda/queue/scheduler
$ cat /sys/block/sda/queue/scheduler
noop [deadline] cfq

Another recommended practice is to disable file and directory access time updates. This way, the file
system won’t have to issue writes that update this meta-data, thus improving write performance. This
can be accomplished by setting the noatime,nodiratime mount options in fstab, or when issuing the disk
mount command.
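For instance, an /etc/fstab entry along the following lines applies the options on every mount (the
device and mount point are hypothetical):

/dev/sda1  /var/lib/neo4j  ext4  defaults,noatime,nodiratime  0  2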
There may be other tuning options relevant to any particular file system, but it is important to make
sure that barriers are enabled. Barriers prevent certain reorderings of writes. They are important for
maintaining the integrity of the transaction log, in case a power failure happens.

Setting the number of open files
Linux platforms impose an upper limit on the number of concurrent files a user may have open. This
number is reported for the current user and session with the ulimit -n command:
user@localhost:~$ ulimit -n
1024

The usual default of 1024 is often not enough. This is especially true when many indexes are used or a
server installation handles many connections. Network sockets count against the limit as well. Users
are therefore encouraged to increase the limit to a healthy value of 40 000 or more, depending on
usage patterns. It is possible to set the limit with the ulimit command, but only for the root user, and it
only affects the current session. To set the value system wide, follow the instructions for your platform.
What follows is the procedure to set the open file descriptor limit to 40 000 for user neo4j under
Ubuntu 10.04 and later.

Note

If you opted to run the neo4j service as a different user, change the first field in step 2
accordingly.
1. Become root, since all operations that follow require editing protected system files.
user@localhost:~$ sudo su
Password:
root@localhost:~$

2. Edit /etc/security/limits.conf and add these two lines:
neo4j   soft   nofile   40000
neo4j   hard   nofile   40000

3. Edit /etc/pam.d/su and uncomment or add the following line:
session    required    pam_limits.so

4. A restart is required for the settings to take effect.
After the above procedure, the neo4j user will have a limit of 40 000 simultaneous open files. If you
continue experiencing exceptions on Too many open files or Could not stat() directory, you may
have to raise the limit further.


24.5. Logical logs
Logical logs in Neo4j are the journal of which operations happen, and are the source of truth in
scenarios where the database needs to be recovered after a crash or similar. Logs are rotated every
now and then (by default when they surpass 250 MB in size) and the number of historical logs to keep
can be configured. Purposes of keeping a history of logical logs include being able to serve incremental
backups as well as keeping an HA cluster running.
For any given configuration at least the latest non-empty logical log will be kept, but configuration can
be supplied to control how much more to keep. There are several different means of controlling it and
the format in which configuration is supplied is:
keep_logical_logs=<true/false>
keep_logical_logs=<amount> <type>

For example:
# Will keep logical logs indefinitely
keep_logical_logs=true
# Will keep only the most recent non-empty log
keep_logical_logs=false
# Will keep logical logs which contain any transaction committed within 30 days
keep_logical_logs=30 days
# Will keep logical logs which contain any of the most recent 500 000 transactions
keep_logical_logs=500k txs

Full list:

Type     Description                                                                           Example
files    Number of most recent logical log files to keep                                       "10 files"
size     Max disk size to allow log files to occupy                                            "300M size" or "1G size"
txs      Number of latest transactions to keep                                                 "250k txs" or "5M txs"
hours    Keep logs which contain any transaction committed within N hours from current time    "10 hours"
days     Keep logs which contain any transaction committed within N days from current time     "50 days"

24.6. Compressed storage of property values
Neo4j can in many cases compress and inline the storage of property values, such as short arrays and
strings.

Compressed storage of short arrays
Neo4j will try to store your primitive arrays in a compressed way, so as to save disk space and possibly
an I/O operation. To do that, it employs a "bit-shaving" algorithm that tries to reduce the number of bits
required for storing the members of the array. In particular:
1. For each member of the array, it determines the position of the leftmost set bit.
2. It determines the largest such position among all members of the array.
3. It reduces all members to that number of bits.
4. It stores those values, prefixed by a small header.

That means that if even a single negative value is included in the array, the original size of the
primitives will be used, since a negative value has its most significant bit set.
There is a possibility that the result can be inlined in the property record if:
• It is less than 24 bytes after compression.
• It has less than 64 members.
For example, an array long[] {0L, 1L, 2L, 4L} will be inlined, as the largest entry (4) will require 3 bits
to store so the whole array will be stored in 4 × 3 = 12 bits. The array long[] {-1L, 1L, 2L, 4L} however
will require the whole 64 bits for the -1 entry so it needs 64 × 4 = 32 bytes and it will end up in the
dynamic store.
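A minimal sketch of this size calculation in Java (an illustration of the idea above, not Neo4j's actual
implementation):

// Returns the number of bits per member after "bit-shaving", or 64 if
// no compression is possible (any negative value has its sign bit set).
static int bitsPerMember( long[] array )
{
    int maxBits = 1;
    for ( long value : array )
    {
        if ( value < 0 )
        {
            return 64;
        }
        // Position of the leftmost set bit; "| 1" keeps a minimum of one bit for zero.
        maxBits = Math.max( maxBits, 64 - Long.numberOfLeadingZeros( value | 1 ) );
    }
    return maxBits;
}

// Example: bitsPerMember( new long[]{ 0L, 1L, 2L, 4L } ) == 3, so the four
// members fit in 4 × 3 = 12 bits plus the small header.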

Compressed storage of short strings
Neo4j will try to classify your string in a short string class and, if it manages to, it will treat it
accordingly. In that case the string will be stored without indirection in the property store, inlined in
the property record, meaning that the dynamic string store will not be involved in storing that value,
leading to a reduced disk footprint. Additionally, when no string record is needed to store the property,
it can be read and written in a single lookup, leading to performance improvements and less disk space
required.
The various classes for short strings are:
• Numerical, consisting of digits 0..9 and the punctuation space, period, dash, plus, comma and
apostrophe.
• Date, consisting of digits 0..9 and the punctuation space, dash, colon, slash, plus and comma.
• Hex (lower case), consisting of digits 0..9 and lower case letters a..f.
• Hex (upper case), consisting of digits 0..9 and upper case letters A..F.
• Upper case, consisting of upper case letters A..Z, and the punctuation space, underscore, period,
dash, colon and slash.
• Lower case, like upper case but with lower case letters a..z instead.
• E-mail, consisting of lower case letters a..z and the punctuation comma, underscore, period, dash,
plus and the at sign (@).
• URI, consisting of lower case letters a..z, digits 0..9 and most punctuation available.
• Alpha-numerical, consisting of both upper and lower case letters a..z, A..Z, digits 0..9 and the
punctuation space and underscore.
• Alpha-symbolical, consisting of both upper and lower case letters a..z, A..Z and the punctuation space,
underscore, period, dash, colon, slash, plus, comma, apostrophe, at sign, pipe and semicolon.
• European, consisting of most accented European characters and digits plus the punctuation space,
dash, underscore and period (like Latin 1 but with less punctuation).
• Latin 1.
• UTF-8.
In addition to the string’s contents, the number of characters also determines if the string can be
inlined or not. Each class has its own character count limit:

Character count limits

String class                               Character count limit
Numerical, Date and Hex                    54
Uppercase, Lowercase and E-mail            43
URI, Alphanumerical and Alphasymbolical    36
European                                   31
Latin 1                                    27
UTF-8                                      14

That means that the largest inlineable string is 54 characters long and must be of the Numerical class,
and that all strings of 14 characters or less will always be inlined.
Also note that the above limits are for the default 41-byte PropertyRecord layout; if that parameter is
changed by editing the source and recompiling, the limits above have to be recalculated.


24.7. Memory mapped IO settings
Introduction
Quick info
• The page cache is sometimes called low level cache, file system cache or file buffer cache.
• It caches the Neo4j data as stored on the durable media.
• The default configuration of the cache relies on heuristics and assumes that the machine is
dedicated to running Neo4j, so you might want to tune it yourself to get the most out of the
available memory.
• There is just one setting for the file buffer cache: dbms.pagecache.memory — it specifies how
much memory Neo4j is allowed to use for this cache.
Each file in the Neo4j store is accessed through the Neo4j page cache, when reading from, or writing
to, the store files. Since there is only one page cache, there is only one setting for specifying how much
memory Neo4j is allowed to use for page caching. The shared page cache ensures that memory is split
across the various store files in the most optimal manner, depending on how the database is used
and what data is popular. ("Optimal" is an informal comparison to the store-specific memory mapping
settings of previous versions of Neo4j; we are not claiming that the page replacement algorithms are
optimal in the formal sense, since truly optimal page replacement requires knowledge of events
arbitrarily far into the future.)
The memory for the page cache is allocated outside the normal Java heap, so you need to take both the
Java heap, and the page cache, into consideration in your capacity planning. Other processes running
on the OS will impact the availability of such memory. Neo4j will require all of the heap memory of
the JVM, plus the memory to be used for the page cache, to be available as physical memory. Other
processes may thus not use more than what is available after the configured memory allocation is
made for Neo4j.

Important

Make sure that your system is configured such that it will never need to swap. If memory
belonging to the Neo4j process gets swapped out, it can lead to considerable performance
degradation.
The amount of memory available to the page cache is configured using the dbms.pagecache.memory
setting. With that setting, you specify the number of bytes available to the page cache, e.g. 150m or 4g.
The default page memory setting is 50% of the machine's memory, after subtracting the memory that is
reserved for the Java heap.
For optimal performance, you will want as much of your data as possible to fit in the page cache.
You can sum up the size of all the *store.db* files in your store file directory to figure out how big a
page cache you need to fit all your data. For instance, on a POSIX system you can get the total by
running $ du -hc *store.db* in your data/graph.db directory. Obviously the store files will grow as you
add more nodes, relationships and properties, so configuring more page cache memory than you have
data is recommended when possible.
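For example, on a POSIX system (paths follow the default data/graph.db layout):
$ cd data/graph.db
$ du -hc *store.db*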

Configuration

dbms.pagecache.memory
Possible values: The maximum amount of memory to use for the page cache, either in bytes, or in
greater byte-like units, such as 100m for 100 megabytes, or 4g for 4 gigabytes.
Effect: The amount of memory to use for mapping the store files, in a unit of bytes. This will
automatically be rounded down to the nearest whole page. This value cannot be zero. For extremely
small and memory-constrained deployments, it is recommended to still reserve at least a couple of
megabytes for the page cache.

dump_configuration
Possible values: true or false
Effect: If set to true the current configuration settings will be written to the default system output,
mostly the console or the logfiles.

When configuring the amount of memory allowed for the page cache and the JVM heap, make sure to
also leave room for the operating system's page cache, and other programs and services the system
might want to run. It is important to configure the memory usage such that the Neo4j JVM process
won't need to use any swap memory, as this would cause a significant drag on the performance of the
system.
When reading the configuration parameters on startup Neo4j will automatically configure the
parameters that are not specified. The cache size will be configured based on the available memory
on the computer, with the assumption that the machine is dedicated to running Neo4j. Specifically,
Neo4j will look at how much memory the machine has, subtract the JVM heap allocation from that, and
then use 50% of what is left for the page cache. This is the default configuration when nothing else is
specified.

Batch insert example
Read general information on batch insertion in Chapter 36, Batch Insertion [636].
The configuration should suit the data set you are about to inject using BatchInsert. Let's say we have a
random-like graph with 10M nodes and 100M relationships. Each node (and maybe some relationships)
has different properties of string and Java primitive types. The important thing is that the page cache
has enough memory to work with, so that it doesn't slow down the BatchInserter:
dbms.pagecache.memory=4g

The configuration above will more or less fit the entire graph in memory. A rough formula to calculate
the memory needed can look like this:
bytes_needed = number_of_nodes * 15
+ number_of_relationships * 34
+ number_of_properties * 64

Note that the size of the individual property very much depends on what data it contains. The numbers
given in the above formula are only a rough estimate.
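Plugging the example above into the formula, and assuming for illustration one property per node (the
original figures do not fix the property count):

bytes_needed = 10 000 000 × 15       (nodes:          ~150 MB)
             + 100 000 000 × 34      (relationships:  ~3.4 GB)
             + 10 000 000 × 64       (properties:     ~640 MB)
             ≈ 4.2 GB

which is in line with the dbms.pagecache.memory=4g suggested above.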


24.8. Configuration Settings Reference
On this page you’ll find the main configuration settings you can use with Neo4j. They can be set in the
conf/neo4j.properties file when using the Neo4j Server (see Section 24.2, “Server Configuration” [450]).
If you use the embedded database, you can pass them in as a map (see the section called “Starting an
embedded database with configuration settings” [579]).
For additional configuration settings, see:
• Section 24.2, “Server Configuration” [450]
• Settings for the remote shell extension [546]
• High Availability configuration settings [475]
• Cluster configuration settings [477]
• Online backup configuration settings [495]
• Consistency check configuration settings [495]
• Usage Data Collector configuration settings [447]
• Metrics settings [524]

List of configuration settings

• allow_file_urls: Determines if Cypher will allow using file URLs when loading data using LOAD CSV.
• allow_store_upgrade: Whether to allow a store upgrade in case the current version of the database
starts against an older store version.
• cypher_parser_version: Set this to specify the default parser (language version).
• dbms.checkpoint.interval.time: Configures the time interval between check-points.
• dbms.checkpoint.interval.tx: Configures the transaction interval between check-points.
• dbms.cypher.hints.error: Set this to specify the behavior when Cypher planner or runtime hints
cannot be fulfilled.
• dbms.cypher.min_replan_interval: The minimum lifetime of a query plan before a query is considered
for replanning.
• dbms.cypher.planner: Set this to specify the default planner for the default language version.
• dbms.cypher.statistics_divergence_threshold: The threshold when a plan is considered stale.
• dbms.pagecache.memory: The amount of memory to use for mapping the store files, in bytes (or
kilobytes with the k suffix, megabytes with m and gigabytes with g).
• dbms.pagecache.swapper: Specify which page swapper to use for doing paged IO.
• dbms.querylog.enabled: Log executed queries that take longer than the configured threshold.
• dbms.querylog.filename: Log executed queries that take longer than the configured threshold.
• dbms.querylog.max_archives: Maximum number of history files for the query log.
• dbms.querylog.parameter_logging_enabled: Log parameters for executed queries that took longer
than the configured threshold.
• dbms.querylog.rotation.threshold: Specifies at which file size the query log will auto-rotate.
• dbms.querylog.threshold: If the execution of a query takes more time than this threshold, the query is
logged, provided query logging is enabled.
• dbms.security.load_csv_file_url_root: Sets the root directory for file URLs used with the Cypher LOAD
CSV clause.
• dense_node_threshold: Relationship count threshold for considering a node to be dense.
• dump_configuration: Print out the effective Neo4j configuration after startup.
• index_background_sampling_enabled: Enable or disable background index sampling.
• index_sampling_buffer_size: Size of buffer used by index sampling.
• index_sampling_update_percentage: Percentage of index updates of total index size required before
sampling of a given index is triggered.
• keep_logical_logs: Make Neo4j keep the logical transaction logs for being able to backup the
database.
• logical_log_rotation_threshold: Specifies at which file size the logical log will auto-rotate.
• lucene_searcher_cache_size: The maximum number of open Lucene index searchers.
• query_cache_size: The number of Cypher query execution plans that are cached.
• read_only: Only allow read operations from this Neo4j instance.
• relationship_grab_size: How many relationships to read at a time during iteration.
• store.internal_log.level: Log level threshold.
• store.internal_log.max_archives: Maximum number of history files for the internal log.
• store.internal_log.rotation_delay: Minimum time interval after last rotation of the internal log before
it may be rotated again.
• store.internal_log.rotation_threshold: Threshold for rotation of the internal log.

Deprecated settings

• batched_writes: Whether or not transactions are appended to the log in batches.
• cache_type: The type of cache to use for nodes and relationships.
• log_mapped_memory_stats: Log memory mapping statistics regularly.
• log_mapped_memory_stats_filename: The file where memory mapping statistics will be recorded.
• log_mapped_memory_stats_interval: The number of records to be loaded between regular logging of
memory mapping statistics.
• neostore.nodestore.db.mapped_memory: The size to allocate for memory mapping the node store.
• neostore.propertystore.db.arrays.mapped_memory: The size to allocate for memory mapping the
array property store.
• neostore.propertystore.db.index.keys.mapped_memory: The size to allocate for memory mapping
the store for property key strings.
• neostore.propertystore.db.index.mapped_memory: The size to allocate for memory mapping the
store for property key indexes.
• neostore.propertystore.db.mapped_memory: The size to allocate for memory mapping the property
value store.
• neostore.propertystore.db.strings.mapped_memory: The size to allocate for memory mapping the
string property store.
• neostore.relationshipstore.db.mapped_memory: The size to allocate for memory mapping the
relationship store.
• store_dir: The directory where the database files are located.
• use_memory_mapped_buffers: Use memory mapped buffers for accessing the native storage layer.

allow_file_urls
Description: Determines if Cypher will allow using file URLs when loading data using LOAD CSV. Setting
this value to false will cause Neo4j to fail LOAD CSV clauses that load data from the file system.
Valid values: allow_file_urls is a boolean.
Default value: true

allow_store_upgrade
Description: Whether to allow a store upgrade in case the current version of the database starts
against an older store version. Setting this to true does not guarantee a successful upgrade; it just
allows an upgrade to be performed.
Valid values: allow_store_upgrade is a boolean.
Default value: false

batched_writes
Description: Whether or not transactions are appended to the log in batches.
Valid values: batched_writes is a boolean.
Default value: true
Deprecated: Write batching can no longer be turned off.

cache_type
Description: The type of cache to use for nodes and relationships. This configuration setting is no
longer applicable from Neo4j 2.3. Configuration has been simplified to only require tuning of the page
cache.
Valid values: cache_type is a string.
Default value: deprecated
Deprecated: The cache_type configuration setting has been deprecated.

cypher_parser_version
Description: Set this to specify the default parser (language version).
Valid values: cypher_parser_version is one of 1.9, 2.2, 2.3, default.
Default value: default

dbms.checkpoint.interval.time
Description: Configures the time interval between check-points. The database will not check-point
more often than this (unless check-pointing is triggered by a different event), but might check-point
less often than this interval, if performing a check-point takes longer than the configured interval. A
check-point is a point in the transaction logs from which recovery would start. Longer check-point
intervals typically mean that recovery will take longer to complete in case of a crash. On the other
hand, a longer check-point interval can also reduce the I/O load that the database places on the
system, as each check-point implies flushing and forcing all the store files. The default is 5m for a
check-point every 5 minutes. Other supported units are s for seconds, and ms for milliseconds.
Valid values: dbms.checkpoint.interval.time is a duration (valid units are ms, s, m).
Default value: 5m

dbms.checkpoint.interval.tx
Description: Configures the transaction interval between check-points. The database will not
check-point more often than this (unless check-pointing is triggered by a different event), but might
check-point less often than this interval, if performing a check-point takes longer than the configured
interval. A check-point is a point in the transaction logs from which recovery would start. Longer
check-point intervals typically mean that recovery will take longer to complete in case of a crash. On
the other hand, a longer check-point interval can also reduce the I/O load that the database places on
the system, as each check-point implies flushing and forcing all the store files. The default is 100000 for
a check-point every 100000 transactions.
Valid values: dbms.checkpoint.interval.tx is an integer which is minimum 1.
Default value: 100000

dbms.cypher.hints.error
Description: Set this to specify the behavior when Cypher planner or runtime hints cannot be fulfilled.
If true, then non-conformance will result in an error, otherwise only a warning is generated.
Valid values: dbms.cypher.hints.error is a boolean.
Default value: false

dbms.cypher.min_replan_interval
Description: The minimum lifetime of a query plan before a query is considered for replanning.
Valid values: dbms.cypher.min_replan_interval is a duration (valid units are ms, s, m).
Default value: 10s

dbms.cypher.planner
Description: Set this to specify the default planner for the default language version.
Valid values: dbms.cypher.planner is one of COST, RULE, default.
Default value: default

dbms.cypher.statistics_divergence_threshold
Description: The threshold when a plan is considered stale. If any of the underlying statistics used to
create the plan has changed more than this value, the plan is considered stale and will be replanned. A
value of 0 means always replan, and 1 means never replan.
Valid values: dbms.cypher.statistics_divergence_threshold is a double which is minimum 0.0, and is
maximum 1.0.
Default value: 0.75

dbms.pagecache.memory
Description: The amount of memory to use for mapping the store files, in bytes (or kilobytes with the k
suffix, megabytes with m and gigabytes with g). If Neo4j is running on a dedicated server, then it is
generally recommended to leave about 2-4 gigabytes for the operating system, give the JVM enough
heap to hold all your transaction state and query context, and then leave the rest for the page cache.
The default page cache memory assumes the machine is dedicated to running Neo4j, and is
heuristically set to 50% of RAM minus the max Java heap size.
Valid values: dbms.pagecache.memory is a byte size (valid multipliers are k, m, g, K, M, G) which is
minimum 245760.
Default value: 3049197568

dbms.pagecache.swapper
Description: Specify which page swapper to use for doing paged IO. This is only used when integrating
with proprietary storage technology.
Valid values: dbms.pagecache.swapper is a string.

dbms.querylog.enabled
Description: Log executed queries that take longer than the configured threshold. NOTE: This feature is
only available in the Neo4j Enterprise Edition.
Valid values: dbms.querylog.enabled is a boolean.
Default value: false

dbms.querylog.filename
Description: Log executed queries that take longer than the configured threshold.
Valid values: dbms.querylog.filename is a path.

dbms.querylog.max_archives
Description: Maximum number of history files for the query log.
Valid values: dbms.querylog.max_archives is an integer which is minimum 1.
Default value: 7

dbms.querylog.parameter_logging_enabled
Description: Log parameters for executed queries that took longer than the configured threshold.
Valid values: dbms.querylog.parameter_logging_enabled is a boolean.
Default value: true

dbms.querylog.rotation.threshold
Description: Specifies at which file size the query log will auto-rotate. 0 means that no rotation will
automatically occur based on file size.
Valid values: dbms.querylog.rotation.threshold is a byte size (valid multipliers are k, m, g, K, M, G)
which is minimum 0, and is maximum 9223372036854775807.
Default value: 20m

dbms.querylog.threshold
Description: If the execution of a query takes more time than this threshold, the query is logged,
provided query logging is enabled. Defaults to 0 seconds, that is, all queries are logged.
Valid values: dbms.querylog.threshold is a duration (valid units are ms, s, m).
Default value: 0s

dbms.security.load_csv_file_url_root
Description: Sets the root directory for file URLs used with the Cypher LOAD CSV clause. This must be
set to a single directory, restricting access to only those files within that directory and its
subdirectories.
Valid values: dbms.security.load_csv_file_url_root is a path.

dense_node_threshold
Description: Relationship count threshold for considering a node to be dense.
Valid values: dense_node_threshold is an integer which is minimum 1.
Default value: 50

dump_configuration
Description: Print out the effective Neo4j configuration after startup.
Valid values: dump_configuration is a boolean.
Default value: false

index_background_sampling_enabled
Description: Enable or disable background index sampling.
Valid values: index_background_sampling_enabled is a boolean.
Default value: true

index_sampling_buffer_size
Description: Size of buffer used by index sampling.
Valid values: index_sampling_buffer_size is a byte size (valid multipliers are k, m, g, K, M, G) which is
minimum 1048576, and is maximum 2147483647.
Default value: 64m

index_sampling_update_percentage
Description: Percentage of index updates of total index size required before sampling of a given index
is triggered.
Valid values: index_sampling_update_percentage is an integer which is minimum 0.
Default value: 5

keep_logical_logs
Description: Make Neo4j keep the logical transaction logs for being able to backup the database. Can
be used for specifying the threshold to prune logical logs after. For example "10 days" will prune logical
logs that only contain transactions older than 10 days from the current time, or "100k txs" will keep the
100k latest transactions and prune any older transactions.
Valid values: keep_logical_logs is a string which must be true/false or of format
<number><optional unit> <type>, for example 100M size for limiting logical log space on disk to 100 MB,
or 200k txs for limiting the number of transactions to keep to 200 000.
Default value: 7 days

log_mapped_memory_stats
Description: Log memory mapping statistics regularly.
Valid values: log_mapped_memory_stats is a boolean.
Default value: false
Deprecated: This is no longer used.

log_mapped_memory_stats_filename
Description: The file where memory mapping statistics will be recorded.
Valid values: log_mapped_memory_stats_filename is a path which is relative to store_dir.
Default value: mapped_memory_stats.log
Deprecated: This is no longer used.

log_mapped_memory_stats_interval
Description: The number of records to be loaded between regular logging of memory mapping
statistics.
Valid values: log_mapped_memory_stats_interval is an integer.
Default value: 1000000
Deprecated: This is no longer used.

logical_log_rotation_threshold
Description: Specifies at which file size the logical log will auto-rotate. 0 means that no rotation will
automatically occur based on file size.
Valid values: logical_log_rotation_threshold is a byte size (valid multipliers are k, m, g, K, M, G) which is
minimum 1048576.
Default value: 250M

lucene_searcher_cache_size
Description: The maximum number of open Lucene index searchers.
Valid values: lucene_searcher_cache_size is an integer which is minimum 1.
Default value: 2147483647

neostore.nodestore.db.mapped_memory
Description: The size to allocate for memory mapping the node store.
Valid values: neostore.nodestore.db.mapped_memory is a byte size (valid multipliers are k, m, g, K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.propertystore.db.arrays.mapped_memory
Description: The size to allocate for memory mapping the array property store.
Valid values: neostore.propertystore.db.arrays.mapped_memory is a byte size (valid multipliers are k, m,
g, K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.propertystore.db.index.keys.mapped_memory
Description: The size to allocate for memory mapping the store for property key strings.
Valid values: neostore.propertystore.db.index.keys.mapped_memory is a byte size (valid multipliers are
k, m, g, K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.propertystore.db.index.mapped_memory
Description: The size to allocate for memory mapping the store for property key indexes.
Valid values: neostore.propertystore.db.index.mapped_memory is a byte size (valid multipliers are k, m,
g, K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.propertystore.db.mapped_memory
Description: The size to allocate for memory mapping the property value store.
Valid values: neostore.propertystore.db.mapped_memory is a byte size (valid multipliers are k, m, g, K,
M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.propertystore.db.strings.mapped_memory
Description: The size to allocate for memory mapping the string property store.
Valid values: neostore.propertystore.db.strings.mapped_memory is a byte size (valid multipliers are k,
m, g, K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

neostore.relationshipstore.db.mapped_memory
Description: The size to allocate for memory mapping the relationship store.
Valid values: neostore.relationshipstore.db.mapped_memory is a byte size (valid multipliers are k, m, g,
K, M, G).
Deprecated: Replaced by the dbms.pagecache.memory setting.

query_cache_size
Description: The number of Cypher query execution plans that are cached.
Valid values: query_cache_size is an integer which is minimum 0.
Default value: 1000

read_only
Description: Only allow read operations from this Neo4j instance. This mode still requires write access
to the directory for lock purposes.
Valid values: read_only is a boolean.
Default value: false

relationship_grab_size
Description: How many relationships to read at a time during iteration.
Valid values: relationship_grab_size is an integer which is minimum 1.
Default value: 100

store.internal_log.level
Description: Log level threshold.
Valid values: store.internal_log.level is one of DEBUG, INFO, WARN, ERROR, NONE.
Default value: INFO

store.internal_log.max_archives
Description: Maximum number of history files for the internal log.
Valid values: store.internal_log.max_archives is an integer which is minimum 1.
Default value: 7

store.internal_log.rotation_delay
Description: Minimum time interval after last rotation of the internal log before it may be rotated
again.
Valid values: store.internal_log.rotation_delay is a duration (valid units are ms, s, m).
Default value: 300s

store.internal_log.rotation_threshold
Description: Threshold for rotation of the internal log.
Valid values: store.internal_log.rotation_threshold is a byte size (valid multipliers are k, m, g, K, M, G)
which is minimum 0, and is maximum 9223372036854775807.
Default value: 20m

store_dir
Description: The directory where the database files are located.
Valid values: store_dir is a path.
Deprecated: This is no longer used.

use_memory_mapped_buffers
Description: Use memory mapped buffers for accessing the native storage layer.
Valid values: use_memory_mapped_buffers is a boolean.
Default value: true
Deprecated: This setting has been obsoleted. Neo4j no longer relies on the memory-mapping
capabilities of the operating system.

Chapter 25. High Availability
Note

The High Availability features are only available in the Neo4j Enterprise Edition.
Neo4j High Availability or “Neo4j HA” provides the following two main features:
1. It enables a fault-tolerant database architecture, where several Neo4j slave databases can be
configured to be exact replicas of a single Neo4j master database. This allows the end-user system to
remain fully functional, with both read and write access to the database, in the event of hardware
failure.
2. It enables a horizontally scaling read-mostly architecture that lets the system handle more read load
than a single Neo4j database instance can.


25.1. Architecture
Neo4j HA has been designed to make the transition from single-machine to multi-machine operation
simple: the existing application does not need to change.
Consider an existing application with Neo4j embedded and running on a single machine. To deploy
such an application in a multi-machine setup, the only required change is to switch the creation of the
GraphDatabaseService from GraphDatabaseFactory to HighlyAvailableGraphDatabaseFactory. Since both
implement the same interface, no additional changes are required.
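As a minimal sketch of that single change (the store path and addresses are hypothetical; the
ha.server_id and ha.initial_hosts settings are covered in Section 25.2):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.HighlyAvailableGraphDatabaseFactory;

GraphDatabaseService db = new HighlyAvailableGraphDatabaseFactory()
        .newEmbeddedDatabaseBuilder( "/var/lib/neo4j/data/graph.db" )
        .setConfig( "ha.server_id", "1" )
        .setConfig( "ha.initial_hosts", "192.168.0.1:5001,192.168.0.2:5001" )
        .newGraphDatabase();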
Figure 25.1. Multiple Neo4j instances in HA mode

When running Neo4j in HA mode there is always a single master and zero or more slaves. Compared to
other master-slave replication setups, Neo4j HA can handle write requests on all machines, so there is
no need to redirect writes to the master specifically.
A slave will handle writes by synchronizing with the master to preserve consistency. Writes to the
master can be configured to be optimistically pushed to zero or more slaves: the master tries to push
to the slaves before the transaction completes, but if the push fails the transaction will still be
successful (this differs from a normal replication factor). All updates will, however, eventually
propagate from the master to the other slaves, so a write from one slave may not be immediately
visible on all other slaves. This is the only difference between multiple machines running in HA mode
compared to single-machine operation. All other ACID characteristics are the same.


25.2. HA Setup and configuration
Neo4j HA can be set up to accommodate differing requirements for load, fault tolerance and available
hardware.
In HA mode, Neo4j instances form a cluster. The instances monitor each other's availability to take
account of instances joining and leaving the cluster. They elect one instance to be the master, and
designate the other instances to be slaves.
For installation instructions of a High Availability cluster see Section 25.6, “High Availability setup
tutorial” [483].

Specifying cluster members
Specify the instances that should form the cluster by supplying ha.initial_hosts, a comma-separated
list of URLs. When each instance starts, if it can contact any of the initial hosts, then it will form a cluster
with them, otherwise it will start its own cluster.
Note that the parameter is called ha.initial_hosts because it’s only used when instances initially join
the cluster. This means that you can extend the cluster without changing the configuration of existing
instances.
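For example, in conf/neo4j.properties (the addresses are hypothetical; ha.server_id, described below,
must be unique per instance):

ha.server_id=1
ha.initial_hosts=192.168.0.1:5001,192.168.0.2:5001,192.168.0.3:5001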

Server configuration
If you are running Neo4j server, specify org.neo4j.server.database.mode=HA in conf/neo4jserver.properties.
Settings available in the Enterprise server

• dbms.browser.credential_timeout: Configure the Neo4j Browser to time out logged in users after this
idle period.
• dbms.browser.remote_content_hostname_whitelist: Whitelist of hosts for the Neo4j Browser to be
allowed to fetch content from.
• dbms.browser.store_credentials: Configure the Neo4j Browser to store or not store user credentials.
• dbms.security.allow_outgoing_browser_connections: Configure the policy for outgoing Neo4j Browser
connections.
• org.neo4j.server.database.mode: Configure the operating mode of the database — SINGLE for standalone operation or HA for operating as a member in a cluster.
dbms.browser.credential_timeout
Description: Configure the Neo4j Browser to time out logged in users after this idle period. Setting this
to 0 indicates no limit.
Valid values: dbms.browser.credential_timeout is a duration (valid units are ms, s, m).
Default value: 0

dbms.browser.remote_content_hostname_whitelist
Description: Whitelist of hosts for the Neo4j Browser to be allowed to fetch content from.
Valid values: dbms.browser.remote_content_hostname_whitelist is a string.
Default value: http://guides.neo4j.com,https://guides.neo4j.com,http://localhost,https://localhost

dbms.browser.store_credentials
Description: Configure the Neo4j Browser to store or not store user credentials.
Valid values: dbms.browser.store_credentials is a boolean.
Default value: true

dbms.security.allow_outgoing_browser_connections
Description: Configure the policy for outgoing Neo4j Browser connections.
Valid values: dbms.security.allow_outgoing_browser_connections is a boolean.
Default value: true

org.neo4j.server.database.mode
Description: Configure the operating mode of the database — SINGLE for stand-alone operation or HA
for operating as a member in a cluster.
Valid values: org.neo4j.server.database.mode is a string.
Default value: SINGLE

Database configuration
HA configuration parameters should be supplied alongside general Neo4j parameters in
conf/neo4j.properties. There are many configurable parameters. In most cases it isn't necessary to
modify the default values. The only parameters that need to be specified are ha.server_id and
ha.initial_hosts.
High Availability configuration settings

• dbms.security.ha_status_auth_enabled: Require authorization for access to the HA status endpoints.
• ha.branched_data_policy: Policy for how to handle branched data.
• ha.com_chunk_size: Max size of the data chunks that flow between master and slaves in HA.
• ha.internal_state_switch_timeout: Timeout for waiting for internal conditions during state switch, like
for transactions to complete, before switching to master or slave.
• ha.lock_read_timeout: Timeout for taking remote (write) locks on slaves.
• ha.max_concurrent_channels_per_slave: Maximum number of connections a slave can have to the
master.
• ha.pull_apply_batch_size: Size of batches of transactions applied on slaves when pulling from master.
• ha.pull_interval: Interval of pulling updates from master.
• ha.read_timeout: How long a slave will wait for response from master before giving up.
• ha.server: Hostname and port to bind the HA server.
• ha.slave_only: Whether this instance should only participate as slave in cluster.
• ha.state_switch_timeout: Timeout for request threads waiting for instance to become master or
slave.
• ha.tx_push_factor: The amount of slaves the master will ask to replicate a committed transaction.
• ha.tx_push_strategy: Push strategy of a transaction to a slave during commit.

dbms.security.ha_status_auth_enabled
Description: Require authorization for access to the HA status endpoints.
Valid values: dbms.security.ha_status_auth_enabled is a boolean.
Default value: true

ha.branched_data_policy
Description: Policy for how to handle branched data.
Valid values: ha.branched_data_policy is one of keep_all, keep_last, keep_none.
Default value: keep_all

ha.com_chunk_size
Description: Max size of the data chunks that flow between master and slaves in HA. A bigger size may
increase throughput, but may also be more sensitive to variations in bandwidth, whereas a lower size
increases tolerance for bandwidth variations.
Valid values: ha.com_chunk_size is a byte size (valid multipliers are k, m, g, K, M, G) which is minimum
1024.
Default value: 2M

ha.internal_state_switch_timeout
Description: Timeout for waiting for internal conditions during state switch, like for transactions to
complete, before switching to master or slave.
Valid values: ha.internal_state_switch_timeout is a duration (valid units are ms, s, m).
Default value: 10s

ha.lock_read_timeout
Description: Timeout for taking remote (write) locks on slaves.
Valid values: ha.lock_read_timeout is a duration (valid units are ms, s, m).
Default value: 20s

ha.max_concurrent_channels_per_slave
Description: Maximum number of connections a slave can have to the master.
Valid values: ha.max_concurrent_channels_per_slave is an integer which is minimum 1.
Default value: 20

ha.pull_apply_batch_size
Description: Size of batches of transactions applied on slaves when pulling from master.
Valid values: ha.pull_apply_batch_size is an integer.
Default value: 100

ha.pull_interval
Description: Interval of pulling updates from master.
Valid values: ha.pull_interval is a duration (valid units are ms, s, m).
Default value: 0s

ha.read_timeout
Description: How long a slave will wait for response from master before giving up.
Valid values: ha.read_timeout is a duration (valid units are ms, s, m).
Default value: 20s

ha.server
Description: Hostname and port to bind the HA server.
Valid values: ha.server is a hostname and port.
Default value: 0.0.0.0:6001-6011

ha.slave_only
Description: Whether this instance should only participate as slave in cluster. If set to true, it will never
be elected as master.
Valid values: ha.slave_only is a boolean.
Default value: false

ha.state_switch_timeout
Description: Timeout for request threads waiting for instance to become master or slave.
Valid values: ha.state_switch_timeout is a duration (valid units are ms, s, m).
Default value: 120s

ha.tx_push_factor
Description: The amount of slaves the master will ask to replicate a committed transaction.
Valid values: ha.tx_push_factor is an integer which is minimum 0.
Default value: 1

ha.tx_push_strategy
Description: Push strategy of a transaction to a slave during commit.
Valid values: ha.tx_push_strategy is one of round_robin, fixed, fixed_descending, fixed_ascending.
Default value: fixed_descending

Cluster configuration settings

• ha.allow_init_cluster: Whether to allow this instance to create a cluster if unable to join.
• ha.broadcast_timeout: Timeout for broadcasting values in cluster.
• ha.cluster_name: The name of a cluster.
• ha.cluster_server: Host and port to bind the cluster management communication.
• ha.configuration_timeout: Timeout for waiting for configuration from an existing cluster member
during cluster join.
• ha.default_timeout: Default timeout used for clustering timeouts.
• ha.election_timeout: Timeout for waiting for other members to finish a role election.
• ha.heartbeat_interval: How often heartbeat messages should be sent.
• ha.heartbeat_timeout: Timeout for heartbeats between cluster members.
• ha.initial_hosts: A comma-separated list of other members of the cluster to join.
• ha.join_timeout: Timeout for joining a cluster.
• ha.learn_timeout: Timeout for learning values.
• ha.leave_timeout: Timeout for waiting for cluster leave to finish.
• ha.max_acceptors: Maximum number of servers to involve when agreeing to membership changes.
• ha.paxos_timeout: Default timeout for all Paxos timeouts.
• ha.phase1_timeout: Timeout for Paxos phase 1.
• ha.phase2_timeout: Timeout for Paxos phase 2.
• ha.server_id: Id for a cluster instance.

ha.allow_init_cluster

Description

Whether to allow this instance to create a cluster if unable to join.

Valid values

ha.allow_init_cluster is a boolean.

Default value

true

ha.broadcast_timeout

Description

Timeout for broadcasting values in cluster. Must consider end-to-end duration of
the Paxos algorithm. This value is the default value for the ha.join_timeout and ha.leave_timeout settings.

Valid values

ha.broadcast_timeout is a duration (valid units are ms, s, m).

Default value

30s

ha.cluster_name

Description

The name of a cluster.

Valid values

ha.cluster_name is a string which must be a valid cluster name.

Default value

neo4j.ha

ha.cluster_server

Description

Host and port to bind the cluster management communication.

Valid values

ha.cluster_server is a hostname and port.

Default value

0.0.0.0:5001-5099

ha.configuration_timeout

Description

Timeout for waiting for configuration from an existing cluster member during
cluster join.

Valid values

ha.configuration_timeout is a duration (valid units are ms, s, m).

Default value

1s

ha.default_timeout

Description

Default timeout used for clustering timeouts. Override specific timeout settings
with proper values if necessary. This value is the default value for the ha.heartbeat_interval,
ha.paxos_timeout and ha.election_timeout settings.

Valid values

ha.default_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.election_timeout

Description

Timeout for waiting for other members to finish a role election. Defaults to ha.paxos_timeout.

Valid values

ha.election_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.heartbeat_interval

Description

How often heartbeat messages should be sent. Defaults to ha.default_timeout.

Valid values

ha.heartbeat_interval is a duration (valid units are ms, s, m).

Default value

5s

ha.heartbeat_timeout

Description

Timeout for heartbeats between cluster members. Should be at least twice that of
ha.heartbeat_interval.

Valid values

ha.heartbeat_timeout is a duration (valid units are ms, s, m).

Default value

11s

ha.initial_hosts

Description

A comma-separated list of other members of the cluster to join.

Valid values

ha.initial_hosts is a list separated by "," where items are a hostname and port.

Mandatory

The ha.initial_hosts configuration setting is mandatory.

ha.join_timeout

Description

Timeout for joining a cluster. Defaults to ha.broadcast_timeout.

Valid values

ha.join_timeout is a duration (valid units are ms, s, m).

Default value

30s

ha.learn_timeout

Description

Timeout for learning values. Defaults to ha.paxos_timeout.

Valid values

ha.learn_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.leave_timeout

Description

Timeout for waiting for cluster leave to finish. Defaults to ha.broadcast_timeout.

Valid values

ha.leave_timeout is a duration (valid units are ms, s, m).

Default value

30s

ha.max_acceptors

Description

Maximum number of servers to involve when agreeing to membership changes.
In very large clusters, the probability of half the cluster failing is low, but
protecting against any arbitrary half failing is expensive. Therefore you may wish
to set this parameter to a value less than the cluster size.

Valid values

ha.max_acceptors is an integer which is minimum 1.

Default value

21

ha.paxos_timeout

Description

Default timeout for all Paxos timeouts. Defaults to ha.default_timeout. This value is the default value
for the ha.phase1_timeout, ha.phase2_timeout and ha.learn_timeout settings.

Valid values

ha.paxos_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.phase1_timeout

Description

Timeout for Paxos phase 1. Defaults to ha.paxos_timeout.

Valid values

ha.phase1_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.phase2_timeout

Description

Timeout for Paxos phase 2. Defaults to ha.paxos_timeout.

Valid values

ha.phase2_timeout is a duration (valid units are ms, s, m).

Default value

5s

ha.server_id

Description

Id for a cluster instance. Must be unique within the cluster.

Valid values

ha.server_id is an instance id, which has to be a valid integer.

Mandatory

The ha.server_id configuration setting is mandatory.


25.3. How Neo4j HA operates
A Neo4j HA cluster operates cooperatively — each database instance contains the logic needed in order
to coordinate with the other members of the cluster. On startup a Neo4j HA database instance will try
to connect to an existing cluster specified by configuration. If the cluster exists, the instance will join it
as a slave. Otherwise the cluster will be created and the instance will become its master.
When performing a write transaction on a slave, each write operation will be synchronized with the
master (locks will be acquired on both master and slave). When the transaction commits it will first be
committed on the master and then, if successful, on the slave. To ensure consistency, a slave has to be
up to date with the master before performing a write operation. This is built into the communication
protocol between the slave and master, so that updates will be applied to a slave communicating with
its master automatically.
Write transactions performed directly through the master will execute in the same way as in
normal non-HA mode. On success the transaction will be pushed out to a configurable number of
slaves (default one slave). This is done optimistically, meaning that if the push fails the transaction will still
be successful. It is also possible to set the push factor to 0 for higher write performance when writing
directly through the master, although this increases the risk of losing any transaction not yet pulled by
another slave if the master goes down.
Slaves can also be configured to pull updates asynchronously by setting the ha.pull_interval [476]
option.
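For example, a deployment that wants both optimistic pushes and periodic pulls might combine the two settings in conf/neo4j.properties (the values here are illustrative, not recommendations):

# Ask one slave to replicate each committed transaction (the default)
ha.tx_push_factor=1
# Additionally, let slaves pull updates from the master every 10 seconds
ha.pull_interval=10s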
Whenever a Neo4j database instance becomes unavailable, for example due to hardware failure or
a network outage, the other database instances in the cluster will detect that and mark it as temporarily
failed. A database instance that becomes available after being unavailable will automatically catch
up with the cluster. If the master goes down another (best suited) member will be elected and have
its role switched from slave to master after a quorum has been reached within the cluster. When the
new master has performed its role switch it will broadcast its availability to all the other members of
the cluster. Normally a new master is elected and started within just a few seconds and during this
time no writes can take place (the writes will block or in rare cases throw an exception). The only time
this is not true is when an old master had changes that did not get replicated to any other member
before becoming unavailable. If the new master is elected and performs changes before the old master
recovers, there will be two "branches" of the database after the point where the old master became
unavailable. The old master will move away its database (its "branch") and download a full copy from
the new master, to become available as a slave in the cluster.
All this can be summarized as:
• Write transactions can be performed on any database instance in a cluster.
• Neo4j HA is fault tolerant and can continue to operate from any number of machines down to a
single machine.
• Slaves will be automatically synchronized with the master on write operations.
• If the master fails a new master will be elected automatically.
• The cluster automatically handles instances becoming unavailable (for example due to network
issues), and also makes sure to accept them as members in the cluster when they are available again.
• Transactions are atomic, consistent and durable but eventually propagated out to other slaves.
• Updates to slaves are eventually consistent by nature but can be configured to be pushed
optimistically from master during commit.
• If the master goes down any running write transaction will be rolled back and new transactions will
block or fail until a new master has become available.
• Reads are highly available and the ability to handle read load scales with more database instances in
the cluster.


25.4. Arbiter Instances
A typical deployment of Neo4j will use a cluster of 3 machines to provide fault-tolerance and read
scalability. This setup is described in Section 25.6, “High Availability setup tutorial” [483].
While having at least 3 instances is necessary for failover to happen in case the master becomes
unavailable, it is not required for all instances to run the full Neo4j stack, which includes the database
engine. Instead, so-called arbiter instances can be deployed. They can be regarded as cluster
participants in that their role is to take part in master elections with the single purpose of breaking ties
in the election process. This makes it possible to run a cluster of two Neo4j database
instances with an additional arbiter instance, and still tolerate the failure of any single one of the three
instances.
Arbiter instances are configured in the same way as Neo4j HA members are — through the conf/
neo4j.properties file. Settings that are not cluster specific are of course ignored, so you can easily start
up an arbiter instance in place of a properly configured Neo4j instance.
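As a sketch, a minimal arbiter configuration might contain only the cluster settings (the server id and hostnames below are placeholders):

# conf/neo4j.properties on the arbiter
ha.server_id = 3
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001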
To start an arbiter instance, call
neo4j_home$ ./bin/neo4j-arbiter start

You can also stop, install and remove it as a service and ask for its status in exactly the same way as for
Neo4j instances. See also Section 23.2, “Server Installation” [439].


25.5. Upgrade of a Neo4j HA Cluster
Upgrading a Neo4j HA cluster to Neo4j 2.3.12 requires following a specific process in order to ensure
that the cluster remains consistent, and that all cluster instances are able to join and participate in the
cluster following their upgrade. Neo4j 2.3.12 does not support rolling upgrades.

Back up the Neo4j database
• Before starting any upgrade procedure, it is very important to make a full backup of your database.
• For detailed instructions on backing up your Neo4j database, refer to the backup guide.

Shut down the cluster
• Shut down the database slave instances one by one.
• Shut down the master last.

Upgrade the master
1. Install Neo4j 2.3.12 on the master, keeping the database files untouched.
2. Disable HA in the configuration, by setting the variable org.neo4j.server.database.mode=SINGLE in conf/
neo4j.properties.
3. Upgrade as described for a single instance of Neo4j.
4. When upgrade has finished, shut down Neo4j again.
5. Re-enable HA in the configuration by setting org.neo4j.server.database.mode=HA in conf/
neo4j.properties.
6. Make a full backup of the Neo4j database. Please note that backups from before the upgrade are no
longer valid for update via the incremental online backup. Therefore it is important to perform a full
backup, using an empty target directory, at this point.

Upgrade the slaves
On each slave:
1. Remove all database files.
2. Install Neo4j 2.3.12.
3. Review the parameter settings in the files under the conf directory in the previous installation, and
transfer any custom-set parameters to the 2.3.12 installation. Be aware of parameters that have
changed names between versions. Also, ensure that you configure the 2.3.12 installation to use the
same database file directory as the previous installation.
4. If applicable, copy the security configuration from the master, since this is not propagated
automatically. See the section called “Copying security configuration from one instance to
another” [315] for instructions.

Tip

At this point, an alternative is to manually copy the database files from the master to the
slaves, as sketched below. Doing so avoids the need to sync from the master when starting, which can save
considerable time when upgrading large databases.
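A minimal sketch of such a manual copy, assuming the default store location data/graph.db and ssh access between the machines (the hostnames and installation paths are hypothetical):

# Run on the master, once it has been shut down after its upgrade
rsync -a /opt/neo4j-2.3.12/data/graph.db/ neo4j-02:/opt/neo4j-2.3.12/data/graph.db/
rsync -a /opt/neo4j-2.3.12/data/graph.db/ neo4j-03:/opt/neo4j-2.3.12/data/graph.db/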

Restart the cluster
1. Start the master instance.
2. Start the slaves, one by one. Once a slave has joined the cluster, it will sync the database from the
master instance.


25.6. High Availability setup tutorial
This guide will help you understand how to configure and deploy a Neo4j High Availability cluster. Two
scenarios will be considered:
• Configuring three instances to be deployed on three separate machines in a setting similar to what
might be encountered in a production environment.
• Modifying the former to make it possible to run a cluster of three instances on the same physical
machine, which is particularly useful during development.

Background
Each instance in a Neo4j HA cluster must be assigned an integer ID, which serves as its unique
identifier. At startup, a Neo4j instance contacts the other instances specified in the ha.initial_hosts
configuration option.
When an instance establishes a connection to any other, it determines the current state of the cluster
and ensures that it is eligible to join. To be eligible the Neo4j instance must host the same database
store as other members of the cluster (although it is allowed to be in an older state), or be a new
deployment without a database store.

Explicitly configure IP Addresses/Hostnames for a cluster

Neo4j will attempt to configure IP addresses for itself in the absence of explicit
configuration. However in typical operational environments where machines have multiple
network cards and support IPv4 and IPv6 it is strongly recommended that the operator
explicitly sets the IP address/hostname configuration for each machine in the cluster.
Let’s examine the available settings and the values they accept.

ha.server_id
ha.server_id is the cluster identifier for each instance. It must be a positive integer and must be unique

among all Neo4j instances in the cluster.
For example, ha.server_id=1.

ha.cluster_server
ha.cluster_server is an address/port setting that specifies where the Neo4j instance will listen for
cluster communications (like heartbeat messages). The default port is 5001. In the absence of a specified

IP address, Neo4j will attempt to find a valid interface for binding. While this behavior typically results
in a well-behaved server, it is strongly recommended that users explicitly choose an IP address bound to
the network interface of their choosing to ensure a coherent cluster deployment.
For example, ha.cluster_server=192.168.33.22:5001 will listen for cluster communications on the
network interface bound to the 192.168.33.0 subnet on port 5001.

ha.initial_hosts
ha.initial_hosts is a comma separated list of address/port pairs, which specify how to reach other
Neo4j instances in the cluster (as configured via their ha.cluster_server option). These hostname/ports

will be used when the Neo4j instances start, to allow them to find and join the cluster. Specifying an
instance’s own address is permitted.

Warning

Do not use any whitespace in this configuration option.
For example, ha.initial_hosts=192.168.33.22:5001,192.168.33.21:5001 will attempt to reach Neo4j
instances listening on 192.168.33.22 on port 5001 and 192.168.33.21 on port 5001 on the 192.168.33.0
subnet.

ha.server
ha.server is an address/port setting that specifies where the Neo4j instance will listen for transactions
(changes to the graph data) from the cluster master. The default port is 6001. In the absence of a

specified IP address, Neo4j will attempt to find a valid interface for binding. While this behavior typically
results in a well-behaved server, it is strongly recommended that users explicitly choose an IP address
bound to the network interface of their choosing to ensure a coherent cluster topology.
ha.server must use a different port from ha.cluster_server.

For example, ha.server=192.168.33.22:6001 will listen for transactions from the cluster master on the
network interface bound to the 192.168.33.0 subnet on port 6001.

Address/port format

The ha.cluster_server and ha.server configuration options are specified as <IP address>:<port>.
For ha.server the IP address must be the address assigned to one of the host’s network
interfaces.
For ha.cluster_server the IP address must be the address assigned to one of the host’s
network interfaces, or the value 0.0.0.0, which will cause Neo4j to listen on every network
interface.
Either the address or the port can be omitted, in which case the default for that part will be
used. If the address is omitted, then the port must be preceded with a colon (e.g. :5001).
The syntax for setting the port range is: <IP address>:<first port>[-<second port>]. In this
case, Neo4j will test each port in sequence, and select the first that is unused. Note that
this usage is not permitted when the hostname is specified as 0.0.0.0 (the "all interfaces"
address).

Getting started: Setting up a production cluster
Download and configure
• Download Neo4j Enterprise from the Neo4j download site (http://neo4j.com/download/), and unpack on three separate machines.
• Configure the HA related settings for each installation as outlined below. Note that all three
installations have the same configuration except for the ha.server_id property.
Neo4j instance #1 — neo4j-01.local

conf/neo4j.properties
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 1
# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

Neo4j instance #2 — neo4j-02.local

conf/neo4j.properties
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 2
# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

Neo4j instance #3 — neo4j-03.local

conf/neo4j.properties
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 3
# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

Start the Neo4j Servers
Start the Neo4j servers as usual. Note that the startup order does not matter.
neo4j-01$ ./bin/neo4j start
neo4j-02$ ./bin/neo4j start
neo4j-03$ ./bin/neo4j start

Startup Time

When running in HA mode, the startup script returns immediately instead of waiting for the
server to become available. This is because the instance does not accept any requests until
a cluster has been formed, which on cold start is when all the servers in ha.initial_hosts
are running and have joined the cluster. In the example above this happens when you have
started all three instances. To keep track of the startup state you can follow the messages in
console.log - the path to that is printed before the startup script returns.
Now, you should be able to access the three servers and check their HA status. Open the locations
below in a web browser and issue the following command in the editor after having set a password for
the database: :play sysinfo
• http://neo4j-01.local:7474/
• http://neo4j-02.local:7474/
• http://neo4j-03.local:7474/

Tip

You can replace database #3 with an arbiter instance, see Section 25.4, “Arbiter
Instances” [481].

That’s it! You now have a Neo4j HA cluster of three instances running. You can start by making a
change on any instance and those changes will be propagated between them. For more HA related
configuration options take a look at Section 25.2, “HA Setup and configuration” [474].

Alternative setup: Creating a local cluster for testing
If you want to start a cluster similar to the one described above, but for development and testing
purposes, it is convenient to run all Neo4j instances on the same machine. This is easy to achieve,
although it requires some additional configuration as the defaults will conflict with each other.
Furthermore, the default dbms.pagecache.memory assumes that Neo4j has the machine to itself. If we
in this example assume that the machine has 4 gigabytes of memory, and that each JVM consumes
500 megabytes of memory, then we can allocate 500 megabytes of memory to the page cache of each
server.

Download and configure
1. Download Neo4j Enterprise from the Neo4j download site (http://neo4j.com/download/), and unpack into three separate directories on your test machine.
2. Configure the HA related settings for each installation as outlined below.
Neo4j instance #1 — ~/neo4j-01

conf/neo4j.properties
# Reduce the default page cache memory allocation
dbms.pagecache.memory=500m
# Port to listen to for incoming backup requests.
online_backup_server = 127.0.0.1:6366
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 1
# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5001
# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6363

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7484

Neo4j instance #2 — ~/neo4j-02

conf/neo4j.properties
# Reduce the default page cache memory allocation
dbms.pagecache.memory=500m
# Port to listen to for incoming backup requests.
online_backup_server = 127.0.0.1:6367
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 2
# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5002
# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6364

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7485

Neo4j instance #3 — ~/neo4j-03

conf/neo4j.properties
# Reduce the default page cache memory allocation
dbms.pagecache.memory=500m
# Port to listen to for incoming backup requests.
online_backup_server = 127.0.0.1:6368
# Unique server id for this Neo4j instance
# can not be negative and must be unique
ha.server_id = 3
# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5003
# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6365

conf/neo4j-server.properties
# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7476
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7486

Start the Neo4j Servers
Start the Neo4j servers as usual. Note that the startup order does not matter.
localhost:~/neo4j-01$ ./bin/neo4j start
localhost:~/neo4j-02$ ./bin/neo4j start
localhost:~/neo4j-03$ ./bin/neo4j start

Now, you should be able to access the three servers and check their HA status. Open the locations
below in a web browser and issue the following command in the editor after having set a password for
the database: :play sysinfo
• http://127.0.0.1:7474/
• http://127.0.0.1:7475/
• http://127.0.0.1:7476/


25.7. REST endpoint for HA status information
Introduction
A common use case for Neo4j HA clusters is to direct all write requests to the master while using slaves
for read operations, distributing the read load across the cluster and gaining failover capabilities for
your deployment. The most common way to achieve this is to place a load balancer in front of the HA
cluster, an example being shown with HA Proxy. As you can see in that guide, it makes use of a REST
endpoint to discover which instance is the master and direct write load to it. In this section, we’ll deal
with this REST endpoint and explain its semantics.

The endpoints
Each HA instance comes with 3 endpoints regarding its HA status. They are complementary, and each
may be used depending on your load balancing needs and your production setup. Those are:
• /db/manage/server/ha/master
• /db/manage/server/ha/slave
• /db/manage/server/ha/available
The /master and /slave endpoints can be used to direct write and non-write traffic respectively to
specific instances. This is the optimal way to take advantage of Neo4j’s scaling characteristics. The /
available endpoint exists for the general case of directing arbitrary request types to instances that are
available for transaction processing.
To use the endpoints, perform an HTTP GET operation on any of them, and a response will be returned as summarized in the following table:
HA REST endpoint responses

Endpoint                        Instance State   Returned Code    Body text
/db/manage/server/ha/master    Master           200 OK           true
                               Slave            404 Not Found    false
                               Unknown          404 Not Found    UNKNOWN
/db/manage/server/ha/slave     Master           404 Not Found    false
                               Slave            200 OK           true
                               Unknown          404 Not Found    UNKNOWN
/db/manage/server/ha/available Master           200 OK           master
                               Slave            200 OK           slave
                               Unknown          404 Not Found    UNKNOWN

Examples
From the command line, a common way to query those endpoints is to use curl. With no arguments, curl
will do an HTTP GET on the URI provided and will output the body text, if any. If you also want to get the
response code, just add the -v flag for verbose output. Here are some examples:
• Requesting master endpoint on a running master with verbose output
> curl -v localhost:7474/db/manage/server/ha/master
* About to connect() to localhost port 7474 (#0)
*   Trying ::1...
* connected
* Connected to localhost (::1) port 7474 (#0)
> GET /db/manage/server/ha/master HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:7474

> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Access-Control-Allow-Origin: *
< Transfer-Encoding: chunked
< Server: Jetty(6.1.25)
<
* Connection #0 to host localhost left intact
true* Closing connection #0

• Requesting slave endpoint on a running master without verbose output:
> curl localhost:7474/db/manage/server/ha/slave
false

• Finally, requesting the master endpoint on a slave with verbose output
> curl -v localhost:7475/db/manage/server/ha/master
* About to connect() to localhost port 7475 (#0)
*   Trying ::1...
* connected
* Connected to localhost (::1) port 7475 (#0)
> GET /db/manage/server/ha/master HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:7475
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Content-Type: text/plain
< Access-Control-Allow-Origin: *
< Transfer-Encoding: chunked
< Server: Jetty(6.1.25)
<
* Connection #0 to host localhost left intact
false* Closing connection #0

Unknown status

The UNKNOWN status exists to describe when a Neo4j instance is neither master nor slave. For
example, the instance could be transitioning between states (master to slave in a recovery
scenario or slave being promoted to master in the event of failure). If the UNKNOWN status is
returned, the client should not treat the instance as a master or a slave, and should instead
pick another instance in the cluster to use, wait for the instance to transition out of the UNKNOWN
state, or undertake restorative action via a systems administrator.


25.8. Setting up HAProxy as a load balancer
In the Neo4j HA architecture, the cluster is typically fronted by a load balancer. In this section we will
explore how to set up HAProxy to perform load balancing across the HA cluster.
For this tutorial we will assume a Linux environment with HAProxy already installed. See http://
haproxy.1wt.eu/ for downloads and installation instructions.

Configuring HAProxy
HAProxy can be configured in many ways. The full documentation is available at their website.
For this example, we will configure HAProxy to load balance requests to three HA servers. Simply write
the following configuration to /etc/haproxy.cfg:
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j

backend neo4j
    option httpchk GET /db/manage/server/ha/available
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable

HAProxy can now be started by running:
/usr/sbin/haproxy -f /etc/haproxy.cfg

You can connect to http://<haproxy-ip>:8080/haproxy?stats to view the status dashboard. This
dashboard can be moved to run on port 80, and authentication can also be added. See the HAProxy
documentation for details on this.

Optimizing for reads and writes
Neo4j provides a catalogue of health check URLs (see Section 25.7, “REST endpoint for HA status
information” [489]) that HAProxy (or any load balancer for that matter) can use to distinguish machines
using HTTP response codes. In the example above we used the /available endpoint, which directs
requests to machines that are generally available for transaction processing (they are alive!).
However, it is possible to have requests directed to slaves only, or to the master only. If you are able to
distinguish in your application between requests that write, and requests that only read, then you can
take advantage of two (logical) load balancers: one that sends all your writes to the master, and one
that sends all your read-only requests to a slave. In HAProxy you build logical load balancers by adding
multiple backends.
The trade-off here is that while Neo4j allows slaves to proxy writes for you, this indirection
unnecessarily ties up resources on the slave and adds latency to your write requests. Conversely, you
don’t particularly want read traffic to tie up resources on the master; Neo4j allows you to scale out for
reads, but writes are still constrained to a single instance. If possible, that instance should exclusively
do writes to ensure maximum write performance.
The following example excludes the master from the set of machines using the /slave endpoint.
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    option httpchk GET /db/manage/server/ha/slave
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check

listen admin
    bind *:8080
    stats enable

Note

In practice, writing to a slave is uncommon. While writing to slaves has the benefit of
ensuring that data is persisted in two places (the slave and the master), it comes at a
cost. The cost is that the slave must immediately become consistent with the master by
applying any missing transactions and then synchronously apply the new transaction with
the master. This is a more expensive operation than writing to the master and having the
master push changes to one or more slaves.

Cache-based sharding with HAProxy
Neo4j HA enables what is called cache-based sharding. If the dataset is too big to fit into the cache of
any single machine, then by applying a consistent routing algorithm to requests, the caches on each
machine will actually cache different parts of the graph. A typical routing key could be user ID.
In this example, the user ID is a query parameter in the URL being requested. This will route the same
user to the same machine for each request.
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    balance url_param user_id
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable

Naturally the health check and query parameter-based routing can be combined to only route requests
to slaves by user ID. Other load balancing algorithms are also available, such as routing by source IP
(source), by the URI (uri) or by HTTP headers (hdr(name)).


Chapter 26. Backup
Note

The Backup features are only available in the Neo4j Enterprise Edition.


26.1. Introducing Backup
Backups are performed over the network, from a running Neo4j server and into a local copy of the
database store (the backup). The backup is run using the neo4j-backup tool, which is provided alongside
Neo4j Enterprise.

Important

Neo4j Server must be configured to run a backup service. This is enabled via the
configuration parameter online_backup_enabled, and is enabled by default. The interface and
port the backup service listens on is configured via the parameter online_backup_server and
defaults to the loopback interface and port 6362. It is typical to reconfigure this to listen on
an external interface, by setting online_backup_server=<external-interface>:6362. It can also
be configured to listen on all interfaces by setting online_backup_server=0.0.0.0:6362.
Performing a backup requires specifying the target host, an optional port, and the backup location. The
backup tool will automatically select a full or incremental backup, based on whether an existing backup
is present at that location.
The relevant configuration settings are found below.
Online backup configuration settings

• online_backup_enabled: Enable support for running online backups.
• online_backup_server: Listening server for online backups.
online_backup_enabled

Description

Enable support for running online backups.

Valid values

online_backup_enabled is a boolean.

Default value

true

online_backup_server

Description

Listening server for online backups.

Valid values

online_backup_server is a hostname and port.

Default value

0.0.0.0:6362-6372

Consistency check configuration settings

• consistency_check_graph: Perform checks between nodes, relationships, properties, types and
tokens.
• consistency_check_indexes: Perform checks on indexes.
• consistency_check_label_scan_store: Perform checks on the label scan store.
• consistency_check_property_owners: Perform optional additional checking on property ownership.
• consistency_check_report_file: File name for inconsistencies log file.
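For example, on a very large database the more expensive checks can be turned off while keeping the basic graph checks (a hypothetical conf/neo4j.properties excerpt; see the detailed descriptions below):

# Keep the core store checks, skip the more expensive ones
consistency_check_graph=true
consistency_check_indexes=false
consistency_check_label_scan_store=false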
consistency_check_graph

Description

Perform checks between nodes, relationships, properties, types and tokens.

Valid values

consistency_check_graph is a boolean.

Default value

true

consistency_check_indexes

Description

Perform checks on indexes. Checking indexes is more expensive than checking
the native stores, so it may be useful to turn off this check for very large
databases.

Valid values

consistency_check_indexes is a boolean.

Default value

true

consistency_check_label_scan_store

Description

Perform checks on the label scan store. Checking this store is more expensive
than checking the native stores, so it may be useful to turn off this check for very
large databases.

Valid values

consistency_check_label_scan_store is a boolean.

Default value

true

consistency_check_property_owners

Description

Perform optional additional checking on property ownership. This can detect a
theoretical inconsistency where a property could be owned by multiple entities.
However, the check is very expensive in time and memory, so it is skipped by
default.

Valid values

consistency_check_property_owners is a boolean.

Default value

false

consistency_check_report_file

Description

File name for inconsistencies log file. If not specified, logs to a file in the store
directory.

Valid values

consistency_check_report_file is a path.


26.2. Performing Backups
Tip

When using Neo4j in embedded mode, the way to perform backup is still the same.

Backup Commands
# Performing a full backup: create a blank directory and run the backup tool
mkdir /mnt/backup/neo4j-backup
./bin/neo4j-backup -host 192.168.1.34 -to /mnt/backup/neo4j-backup

# Performing an incremental backup: just specify the location of your previous backup
./bin/neo4j-backup -host 192.168.1.34 -to /mnt/backup/neo4j-backup

# Performing an incremental backup where the service is listening on a non-default port
./bin/neo4j-backup -host 192.168.1.34 -port 9999 -to /mnt/backup/neo4j-backup

Incremental Backups
An incremental backup is performed whenever an existing backup directory is specified. The backup
tool will then copy any new transactions from the Neo4j server and apply them to the backup. The
result will be an updated backup that is consistent with the current server state.
However, the incremental backup may fail for a number of reasons:
• If the existing directory doesn’t contain a valid backup.
• If the existing directory contains a backup of a different database store.
• If the existing directory contains a backup from a previous database version.

Note

Note that when copying the outstanding transactions, the backup tool needs access to the
historical logical logs. These logical logs are kept by Neo4j and automatically removed after
a period of time, based on the keep_logical_logs configuration. If the required logical logs
have already been removed, the backup tool will do a full backup instead.
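As a sketch, the retention period can be extended in conf/neo4j.properties so that, for example, a daily incremental backup always finds the logs it needs (the value is illustrative):

# Keep logical logs for at least 7 days
keep_logical_logs=7 days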

Online Backup from Java
In order to programmatically perform a backup of your data, first full and subsequently incremental,
from a JVM-based program, you need to write Java code like the following:
// Requires org.neo4j:neo4j-backup on the classpath; assertTrue is from JUnit
OnlineBackup backup = OnlineBackup.from( "127.0.0.1" );
backup.full( backupPath.getPath() );
assertTrue( "Should be consistent", backup.isConsistent() );
// A second run against the same directory performs an incremental backup
backup.incremental( backupPath.getPath() );
For more information, please see the Javadocs for OnlineBackup (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/backup/OnlineBackup.html).


26.3. Restoring Your Data
The Neo4j backups are fully functional databases. To use a backup, all you need to do is replace your
database folder with the backup. Just make sure the database isn’t running while replacing the folder.
To restore from backup in a clustered environment, follow these steps:
1. Shut down all instances in the cluster.
2. Restore the backup to the individual database folders.
3. Restart the instances.
Make sure there’s at least one neostore.transaction.db.nnn file included in the backup. If there isn’t, start
up one instance in stand-alone mode first, and issue one updating transaction against it (any sort of
write, like creating a node). Then copy that database to all instances in your cluster.
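A minimal sketch of a single-instance restore, assuming a default installation layout where the active store lives in data/graph.db (all paths are illustrative):

# Stop the database before touching the store directory
./bin/neo4j stop
# Move the old store aside and put the backup in its place
mv data/graph.db data/graph.db.old
cp -R /mnt/backup/neo4j-backup data/graph.db
# Start the database again
./bin/neo4j start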


Chapter 27. Security
Neo4j in itself does not enforce security on the data level. However, there are different aspects that
should be considered when using Neo4j in different scenarios. See Section 27.1, “Securing Neo4j
Server” [500] for details.


27.1. Securing Neo4j Server
Secure the port and the remote client connections it accepts
By default, the Neo4j Server is bundled with a Web server that binds to host localhost on port 7474,
answering only requests from the local machine.
This is configured in the conf/neo4j-server.properties file:
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474
# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection.
#org.neo4j.server.webserver.address=0.0.0.0

If you want the server to listen to external hosts, configure the Web server in conf/neo4j-server.properties
by setting the property org.neo4j.server.webserver.address=0.0.0.0, which will cause
the server to bind to all available network interfaces. Note that firewalls et cetera have to be configured
accordingly as well.

Server authentication and authorization
Neo4j requires clients to supply authentication credentials when accessing the REST API. Without valid
credentials, access to the database will be forbidden.
The authentication and authorization data is stored under data/dbms/auth. If necessary, this file can
be copied over to other neo4j instances to ensure they share the same username/password (see the
section called “Copying security configuration from one instance to another” [315]).
Please refer to Section 21.3, “REST API Authentication and Authorization” [312] for additional details.
When accessing Neo4j over unsecured networks, make sure HTTPS is configured and used for access
(see the section called “HTTPS support” [500]).
If necessary, authentication may be disabled. This will allow any client to access the database without
supplying authentication credentials.
# Disable authorization
dbms.security.auth_enabled=false

Warning

Disabling authentication is not recommended, and should only be done if the operator
has a good understanding of their network security, including protection against cross-site
scripting (XSS) attacks (http://en.wikipedia.org/wiki/Cross-site_scripting) via web browsers. Developers should not disable authentication if
they have a local installation using the default listening ports.

HTTPS support
The Neo4j server includes built-in support for SSL encrypted communication over HTTPS. The first time
the server starts, it automatically generates a self-signed SSL certificate and a private key. Because the
certificate is self-signed, it is not safe to rely on for production use; instead, you should provide your
own key and certificate for the server to use.
To provide your own key and certificate, replace the generated key and certificate, or change the conf/
neo4j-server.properties file to set the location of your certificate and key:
# Certificate location (auto generated if the file does not exist)
dbms.security.tls_certificate_file=ssl/snakeoil.cert
# Private key location (auto generated if the file does not exist)
dbms.security.tls_key_file=ssl/snakeoil.key
Note that the key should be unencrypted. Make sure you set correct permissions on the private key, so
that only the Neo4j server user can read/write it.
Neo4j also supports chained SSL certificates. This requires all certificates to be combined in one file
in PEM format, while the private key needs to be in DER format.
You can set what port the HTTPS connector should bind to in the same configuration file, as well as turn
HTTPS off:
# Turn https-support on/off
org.neo4j.server.webserver.https.enabled=true
# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=443

Arbitrary code execution
Important

The Neo4j server exposes remote scripting functionality by default that allows full access to
the underlying system. Exposing your server without implementing a security layer presents
a substantial security vulnerability.
By default, the Neo4j Server comes with some places where arbitrary code execution can happen.
These are the Section 21.17, “Traversals” [374] REST endpoints. To secure these, either disable them
completely by removing offending plugins from the server classpath, or secure access to these URLs
through proxies or Authorization Rules. Also, the Java Security Manager, see http://docs.oracle.com/
javase/7/docs/technotes/guides/security/index.html, can be used to secure parts of the codebase.

Server authorization rules
Administrators may require more fine-grained security policies in addition to the basic authorization
and/or IP-level restrictions on the Web server. Neo4j server supports administrators in allowing or
disallowing access to specific aspects of the database based on credentials that users or applications
provide.
To facilitate domain-specific authorization policies in Neo4j Server, security rules can
be implemented and registered with the server. This makes scenarios like user and
role based security and authentication against external lookup services possible. See
org.neo4j.server.rest.security.SecurityRule in the javadocs downloadable from Maven Central
(org.neo4j.app:neo4j-server, http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.neo4j.app%22%20AND%20a%3A%22neo4j-server%22).

Caution

The use of Server Authorization Rules may interact unexpectedly with the built-in
authentication and authorization (see the section called “Server authentication and
authorization” [500]), if enabled.

Enforcing Server Authorization Rules
In this example, a (dummy) failing security rule is registered to deny access to all URIs to the server by
listing the rules class in neo4j-server.properties:
org.neo4j.server.rest.security_rules=my.rules.PermanentlyFailingSecurityRule

with the rule source code of:
public class PermanentlyFailingSecurityRule implements SecurityRule
{
    public static final String REALM = "WallyWorld"; // as per RFC2617 :-)

    @Override
    public boolean isAuthorized( HttpServletRequest request )
    {
        return false; // always fails - a production implementation performs
                      // deployment-specific authorization logic here
    }

    @Override
    public String forUriPath()
    {
        return "/*";
    }

    @Override
    public String wwwAuthenticateHeader()
    {
        return SecurityFilter.basicAuthenticationResponse( REALM );
    }
}

With this rule registered, any access to the server will be denied. In a production-quality
implementation the rule would likely look up credentials/claims in a 3rd-party directory service (e.g. LDAP)
or in a local database of authorized users; a sketch of this follows the example response below.
Example request
• POST http://localhost:7474/db/data/node
• Accept: application/json; charset=UTF-8
Example response
• 401: Unauthorized
• WWW-Authenticate: Basic realm="WallyWorld"
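As an illustration of such a production-oriented rule, the sketch below decodes the Basic authentication header and checks it against a hard-coded credential. The class name and the credential check are hypothetical; a real implementation would consult LDAP or a user database instead:

public class LocalUserStoreSecurityRule implements SecurityRule
{
    public static final String REALM = "WallyWorld";

    @Override
    public boolean isAuthorized( HttpServletRequest request )
    {
        String header = request.getHeader( "Authorization" );
        if ( header == null || !header.startsWith( "Basic " ) )
        {
            return false;
        }
        // Decode "username:password" from the Basic auth header (Java 8)
        String decoded = new String( java.util.Base64.getDecoder()
                .decode( header.substring( "Basic ".length() ) ) );
        // Hypothetical lookup; replace with a directory service or database query
        return "operations:secret".equals( decoded );
    }

    @Override
    public String forUriPath()
    {
        return "/*";
    }

    @Override
    public String wwwAuthenticateHeader()
    {
        return SecurityFilter.basicAuthenticationResponse( REALM );
    }
}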

Using Wildcards to Target Security Rules
In this example, a security rule is registered to deny access to all URIs to the server by listing the rule(s)
class(es) in neo4j-server.properties. In this case, the rule is registered using a wildcard URI path (where
* characters can be used to signify any part of the path). For example /users* means the rule will be
bound to any resources under the /users root path. Similarly /users*type* will bind the rule to resources
matching URIs like /users/fred/type/premium.
org.neo4j.server.rest.security_rules=my.rules.PermanentlyFailingSecurityRuleWithWildcardPath

with the rule source code of:
public String forUriPath()
{
    return "/protected/*";
}

With this rule registered, any access to URIs under /protected/ will be denied by the server. Using
wildcards allows flexible targeting of security rules to arbitrary parts of the server’s API, including any
unmanaged extensions or managed plugins that have been registered.
Example request
• GET http://localhost:7474/protected/tree/starts/here/dummy/more/stuff
• Accept: application/json
Example response
• 401: Unauthorized
• WWW-Authenticate: Basic realm="WallyWorld"

Using Complex Wildcards to Target Security Rules
In this example, a security rule is registered to deny access to all URIs matching a complex pattern. The
config looks like this:
org.neo4j.server.rest.security_rules=my.rules.PermanentlyFailingSecurityRuleWithComplexWildcardPath

with the rule source code of:
public class PermanentlyFailingSecurityRuleWithComplexWildcardPath implements SecurityRule
{
    public static final String REALM = "WallyWorld"; // as per RFC2617 :-)

    @Override
    public boolean isAuthorized( HttpServletRequest request )
    {
        return false;
    }

    @Override
    public String forUriPath()
    {
        return "/protected/*/something/else/*/final/bit";
    }

    @Override
    public String wwwAuthenticateHeader()
    {
        return SecurityFilter.basicAuthenticationResponse( REALM );
    }
}

Example request
• GET http://localhost:7474/protected/wildcard_replacement/x/y/z/something/else/more_wildcard_replacement/a/b/c/final/bit/more/stuff
• Accept: application/json
Example response
• 401: Unauthorized
• WWW-Authenticate: Basic realm="WallyWorld"

Using a proxy
Although the Neo4j server has a number of security features built-in (see the above chapters), for
sensitive deployments it is often sensible to front against the outside world it with a proxy like Apache
3
mod_proxy .
This provides a number of advantages:
• Control access to the Neo4j server to specific IP addresses, URL patterns and IP ranges. This can be
used to, for instance, make only the /db/data namespace accessible to non-local clients, while the
/db/admin URLs only respond to a specific IP address:
<Proxy *>
  Order Deny,Allow
  Deny from all
  Allow from 192.168.0
</Proxy>


While it is possible to develop plugins using Neo4j’s SecurityRule (see above), operations
professionals would often prefer to configure proxy servers such as Apache. However, it should be
noted that in cases where both approaches are being used, they will work harmoniously provided
that the behavior is consistent across proxy server and SecurityRule plugins.
• Run Neo4j Server as a non-root user on a Linux/Unix system on a port < 1000 (e.g. port 80) using
ProxyPass /neo4jdb/data http://localhost:7474/db/data
ProxyPassReverse /neo4jdb/data http://localhost:7474/db/data

• Simple load balancing in a clustered environment to load-balance read load using the Apache
mod_proxy_balancer plugin (http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html):

<Proxy balancer://mycluster>
  BalancerMember http://192.168.1.50:80
  BalancerMember http://192.168.1.51:80
</Proxy>
ProxyPass /test balancer://mycluster

LOAD CSV
The Cypher LOAD CSV clause can load files from the filesystem, and its default configuration allows any
file on the system to be read using a file:/// URL. This presents a security vulnerability in production
environments where database users should not otherwise have access to files on the system. For
production deployments, configure the dbms.security.load_csv_file_url_root [468] setting, which will
make all files identified in a file:/// URL relative to the specified directory, similarly to how a unix
chroot works. Alternatively, set the allow_file_urls [465] setting to false, which disables the use of
file:/// URLs entirely. Further information can be found in Section 11.6, “Load CSV” [182].
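As a sketch, a locked-down production configuration could look like this in conf/neo4j.properties (the import directory is a placeholder):

# Resolve file:/// URLs in LOAD CSV relative to this directory only
dbms.security.load_csv_file_url_root=/var/lib/neo4j/import
# Alternatively, disable file URLs entirely
#allow_file_urls=false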

Neo4j Web Interface Security
For configuration settings to consider in order to get the level of security you want to achieve, see the
section called “Web Interface configuration settings” [452].


Chapter 28. Monitoring
Note

Most of the monitoring features are only available in the Enterprise edition of Neo4j.
In order to be able to continuously get an overview of the health of a Neo4j database, there are
different levels of monitoring facilities available. Most of these are exposed through JMX
(http://www.oracle.com/technetwork/java/javase/tech/javamanagement-140525.html). Neo4j
Enterprise also has the ability to automatically report metrics to commonly used monitoring systems,
like Graphite (http://graphite.wikidot.com) and Ganglia (http://ganglia.sourceforge.net).

28.1. Adjusting remote JMX access to the Neo4j Server
By default, the Neo4j Enterprise Server edition does not allow remote JMX connections, since the
relevant options in the conf/neo4j-wrapper.conf configuration file are commented out. To enable this
feature, you have to remove the # characters from the various com.sun.management.jmxremote options
there.
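Once uncommented, those options typically look something like the following sketch; the exact port and file locations are assumptions, so check the comments in your own conf/neo4j-wrapper.conf:

wrapper.java.additional=-Dcom.sun.management.jmxremote.port=3637
wrapper.java.additional=-Dcom.sun.management.jmxremote.authenticate=true
wrapper.java.additional=-Dcom.sun.management.jmxremote.ssl=false
wrapper.java.additional=-Dcom.sun.management.jmxremote.password.file=conf/jmx.password
wrapper.java.additional=-Dcom.sun.management.jmxremote.access.file=conf/jmx.access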
When commented in, the default values are set up to allow remote JMX connections with certain roles,
refer to the conf/jmx.password, conf/jmx.access and conf/neo4j-wrapper.conf files for details.
Make sure that conf/jmx.password has the correct file permissions. The owner of the file has to be the
user that will run the service, and the permissions should be read only for that user. On Unix systems,
this is 0600.
On Windows, follow the tutorial at http://docs.oracle.com/javase/7/docs/technotes/guides/
management/security-windows.html to set the correct permissions. If you are running the service
under the Local System Account, the user that owns the file and has access to it should be SYSTEM.
With this setup, you should be able to connect to JMX monitoring of the Neo4j server using <IP address of the server>:3637, with the username monitor and the password Neo4j.
Note that it is possible that you have to update the permissions and/or ownership of the conf/
jmx.password and conf/jmx.access files — refer to the relevant section in conf/neo4j-wrapper.conf for
details.

Warning

For maximum security, please adjust at least the password settings in conf/jmx.password for
a production installation.
For more details, see: http://docs.oracle.com/javase/7/docs/technotes/guides/management/agent.html.


28.2. How to connect to a Neo4j instance using JMX and JConsole
First, start your embedded database or the Neo4j Server, for instance using
$NEO4J_HOME/bin/neo4j start

Now, start JConsole with
$JAVA_HOME/bin/jconsole

Connect to the process running your Neo4j database instance:
Figure 28.1. Connecting JConsole to the Neo4j Java process

Now, beside the MBeans exposed by the JVM, you will see an org.neo4j section in the MBeans tab.
Under that, you will have access to all the monitoring information exposed by Neo4j.

For opening JMX to remote monitoring access, please see Section 28.1, “Adjusting remote JMX access to
the Neo4j Server” [506] and the JMX documentation
(http://docs.oracle.com/javase/7/docs/technotes/guides/management/agent.html). When using Neo4j in embedded mode, make sure
to pass the com.sun.management.jmxremote.port=portNum or other configuration as JVM parameters to your
running Java process.

Figure 28.2. Neo4j MBeans View


28.3. How to connect to the JMX monitoring programmatically
In order to programmatically connect to the Neo4j JMX server, there are some convenience methods
in the Neo4j Management component to help you find out the most commonly used monitoring
attributes of Neo4j. See Section 33.12, “Reading a management attribute” [604] for an example.
Once you have access to this information, you can use it to, for instance, expose the values to SNMP
(http://en.wikipedia.org/wiki/Simple_Network_Management_Protocol) or other monitoring systems.
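For illustration, a plain JMX client built only on the standard javax.management API can connect to a server configured as in Section 28.1 and list the Neo4j MBeans. The host, port and credentials below are assumptions matching that setup; the Neo4j Management convenience methods mentioned above remain the documented route:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class Neo4jJmxExample
{
    public static void main( String[] args ) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:3637/jmxrmi" );
        java.util.Map<String, Object> env = new java.util.HashMap<>();
        // Credentials as configured in conf/jmx.password (assumed values)
        env.put( JMXConnector.CREDENTIALS, new String[] { "monitor", "Neo4j" } );
        try ( JMXConnector connector = JMXConnectorFactory.connect( url, env ) )
        {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            // List all MBeans in the org.neo4j domain
            for ( ObjectName name : mbeans.queryNames( new ObjectName( "org.neo4j:*" ), null ) )
            {
                System.out.println( name );
            }
        }
    }
}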


28.4. Reference of supported JMX MBeans
MBeans exposed by Neo4j

• Branched Store: Information about the branched stores present in this HA cluster member.
• Configuration: The configuration parameters used to configure Neo4j.
• Diagnostics: Diagnostics provided by Neo4j.
• High Availability: Information about an instance participating in a HA cluster.
• Index sampler: Handle index sampling.
• Kernel: Information about the Neo4j kernel.
• Locking: Information about the Neo4j lock status.
• Memory Mapping: The status of Neo4j memory mapping.
• Page cache: Information about the Neo4j page cache.
• Primitive count: Estimates of the numbers of different kinds of Neo4j primitives.
• Store file sizes: Information about the sizes of the different parts of the Neo4j graph store.
• Transactions: Information about the Neo4j transaction manager.

Note

For additional information on the primitive datatypes (int, long etc.) used in the JMX
attributes, please see Property value types [584] in the section called “Properties” [7].
MBean Branched Store (org.neo4j.management.BranchedStore) Attributes
Information about the branched stores present in this HA cluster member.
• BranchedStores: A list of the branched stores. Type: org.neo4j.management.BranchedStoreInfo[] as CompositeData[]. Read: yes. Write: no.
MBean Configuration (org.neo4j.jmx.impl.ConfigurationBean) Attributes
The configuration parameters used to configure Neo4j. All attributes in this table are of type
String, and all are readable (Read: yes) but not writable (Write: no).
• allow_store_upgrade: Whether to allow a store upgrade in case the current version of the database starts against an older store version. Setting this to true does not guarantee successful upgrade, it just allows an upgrade to be performed.
• array_block_size: Specifies the block size for storing arrays. This parameter is only honored when the store is created, otherwise it is ignored. The default block size is 120 bytes, and the overhead of each block is the same as for string blocks, i.e., 8 bytes.
• batch_inserter_batch_size: Specifies the number of operations that the batch inserter will try to group into one batch before flushing data into underlying storage.
• batched_writes: Whether or not transactions are appended to the log in batches.
• cache_type: The type of cache to use for nodes and relationships. This configuration setting is no longer applicable from Neo4j 2.3. Configuration has been simplified to only require tuning of the page cache.
• dbms.checkpoint.interval.time: Configures the time interval between check-points. The database will not check-point more often than this (unless check pointing is triggered by a different event), but might check-point less often than this interval, if performing a check-point takes longer time than the configured interval. A check-point is a point in the transaction logs from which recovery would start. Longer check-point intervals typically mean that recovery will take longer to complete in case of a crash. On the other hand, a longer check-point interval can also reduce the I/O load that the database places on the system, as each check-point implies a flushing and forcing of all the store files. The default is 5m for a check-point every 5 minutes. Other supported units are s for seconds, and ms for milliseconds.
• dbms.checkpoint.interval.tx: Configures the transaction interval between check-points. The database will not check-point more often than this (unless check pointing is triggered by a different event), but might check-point less often than this interval, if performing a check-point takes longer time than the configured interval. A check-point is a point in the transaction logs from which recovery would start. Longer check-point intervals typically mean that recovery will take longer to complete in case of a crash. On the other hand, a longer check-point interval can also reduce the I/O load that the database places on the system, as each check-point implies a flushing and forcing of all the store files. The default is 100000 for a check-point every 100000 transactions.
• cypher_parser_version: Set this to specify the default parser (language version).
• dbms.cypher.compiler_tracing: Enable tracing of compilation in cypher.
• dbms.cypher.hints.error: Set this to specify the behavior when Cypher planner or runtime hints cannot be fulfilled. If true, then non-conformance will result in an error, otherwise only a warning is generated.
• dbms.cypher.planner: Set this to specify the default planner for the default language version.
• dbms.cypher.runtime: Set this to specify the default runtime for the default language version.
• dbms.pagecache.memory: The amount of memory to use for mapping the store files, in bytes (or kilobytes with the k suffix, megabytes with m and gigabytes with g). If Neo4j is running on a dedicated server, then it is generally recommended to leave about 2-4 gigabytes for the operating system, give the JVM enough heap to hold all your transaction state and query context, and then leave the rest for the page cache. The default page cache memory assumes the machine is dedicated to running Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
• dbms.pagecache.pagesize: Target size for pages of mapped memory. If set to 0, then a reasonable default is chosen, depending on the storage device used.
• dbms.pagecache.swapper: Specify which page swapper to use for doing paged IO. This is only used when integrating with proprietary storage technology.
• dbms.querylog.enabled: Log executed queries that take longer than the configured threshold. NOTE: This feature is only available in the Neo4j Enterprise Edition.
• dbms.querylog.filename: Log executed queries that take longer than the configured threshold.
• dbms.querylog.max_archives: Maximum number of history files for the query log.
• dbms.querylog.parameter_logging_enabled: Log parameters for executed queries that took longer than the configured threshold.
• dbms.querylog.rotation.threshold: Specifies at which file size the query log will auto-rotate. 0 means that no rotation will automatically occur based on file size.
• dbms.querylog.threshold: If the execution of a query takes more time than this threshold, the query is logged, provided query logging is enabled. Defaults to 0 seconds, that is, all queries are logged.
• dbms.security.ha_status_auth_enabled: Require authorization for access to the HA status endpoints.
• dense_node_threshold: Relationship count threshold for considering a node to be dense.
• dump_configuration: Print out the effective Neo4j configuration after startup.
• edition: Configuration attribute.
• ephemeral: Configuration attribute.
• forced_kernel_id: An identifier that uniquely identifies this graph database instance within this JVM. Defaults to an auto-generated number depending on how many instances are started in this JVM.
• gc_monitor_threshold: The amount of time in ms the monitor thread has to be blocked before logging a message that it was blocked.
• gc_monitor_wait_time: Amount of time in ms the GC monitor thread will wait before taking another measurement.
• ha.allow_init_cluster: Whether to allow this instance to create a cluster if unable to join.
• ha.branched_data_policy: Policy for how to handle branched data.
• ha.broadcast_timeout: Timeout for broadcasting values in cluster. Must consider end-to-end duration of the Paxos algorithm. This value is the default value for the ha.join_timeout and ha.leave_timeout settings.
• ha.cluster_name: The name of a cluster.
• ha.cluster_server: Host and port to bind the cluster management communication.
• ha.com_chunk_size: Max size of the data chunks that flow between master and slaves in HA. A bigger size may increase throughput, but may also be more sensitive to variations in bandwidth, whereas a lower size increases tolerance for bandwidth variations.
• ha.configuration_timeout: Timeout for waiting for configuration from an existing cluster member during cluster join.
• ha.default_timeout: Default timeout used for clustering timeouts. Override specific timeout settings with proper values if necessary. This value is the default value for the ha.heartbeat_interval, ha.paxos_timeout and ha.learn_timeout settings.
• ha.election_timeout: Timeout for waiting for other members to finish a role election. Defaults to ha.paxos_timeout.
• ha.heartbeat_interval: How often heartbeat messages should be sent. Defaults to ha.default_timeout.
• ha.heartbeat_timeout: Timeout for heartbeats between cluster members. Should be at least twice that of ha.heartbeat_interval.
• ha.initial_hosts: A comma-separated list of other members of the cluster to join.
• ha.internal_state_switch_timeout: Timeout for waiting for internal conditions during state switch, like for transactions to complete, before switching to master or slave.
• ha.join_timeout: Timeout for joining a cluster. Defaults to ha.broadcast_timeout.
• ha.learn_timeout: Timeout for learning values. Defaults to ha.default_timeout.
• ha.leave_timeout: Timeout for waiting for cluster leave to finish. Defaults to ha.broadcast_timeout.
• ha.lock_read_timeout: Timeout for taking remote (write) locks on slaves. Defaults to ha.read_timeout.
• ha.max_acceptors: Maximum number of servers to involve when agreeing to membership changes. In very large clusters, the probability of half the cluster failing is low, but protecting against any arbitrary half failing is expensive. Therefore you may wish to set this parameter to a value less than the cluster size.
• ha.max_concurrent_channels_per_slave: Maximum number of connections a slave can have to the master.
• ha.paxos_timeout: Default timeout for all Paxos timeouts. Defaults to ha.default_timeout. This value is the default value for the ha.phase1_timeout, ha.phase2_timeout and ha.election_timeout settings.
• ha.phase1_timeout: Timeout for Paxos phase 1. Defaults to ha.paxos_timeout.
• ha.phase2_timeout: Timeout for Paxos phase 2. Defaults to ha.paxos_timeout.
• ha.pull_apply_batch_size: Size of batches of transactions applied on slaves when pulling from master.
• ha.pull_interval: Interval of pulling updates from master.
• ha.read_timeout: How long a slave will wait for response from master before giving up.
• ha.server_id: Id for a cluster instance. Must be unique within the cluster.
• ha.server: Hostname and port to bind the HA server.
• ha.slave_only: Whether this instance should only participate as slave in cluster. If set to true, it will never be elected as master.
• ha.state_switch_timeout: Timeout for request threads waiting for instance to become master or slave.
• ha.tx_push_factor: The amount of slaves the master will ask to replicate a committed transaction.
• ha.tx_push_strategy: Push strategy of a transaction to a slave during commit.
• index_background_sampling_enabled: Enable or disable background index sampling.
• index_sampling_buffer_size: Size of buffer used by index sampling.
• index_sampling_update_percentage: Percentage of index updates of total index size required before sampling of a given index is triggered.
• intercept_deserialized_transactions: Determines whether any TransactionInterceptors loaded will intercept externally received transactions (for example in HA) before they reach the logical log and are applied to the store.
• jmx.port: Configuration attribute.
• keep_logical_logs: Make Neo4j keep the logical transaction logs for being able to backup the database. Can be used for specifying the threshold to prune logical logs after. For example "10 days" will prune logical logs that only contain transactions older than 10 days from the current time, or "100k txs" will keep the 100k latest transactions and prune any older transactions.
• label_block_size: Specifies the block size for storing labels exceeding in-lined space in the node record. This parameter is only honored when the store is created, otherwise it is ignored. The default block size is 60 bytes, and the overhead of each block is the same as for string blocks, i.e., 8 bytes.
• log_mapped_memory_stats_filename: The file where memory mapping statistics will be recorded.
• log_mapped_memory_stats_interval: The number of records to be loaded between regular logging of memory mapping statistics.
• log_mapped_memory_stats: Log memory mapping statistics regularly.
• logical_log_rotation_threshold: Specifies at which file size the logical log will auto-rotate. 0 means that no rotation will automatically occur based on file size.
• neostore.nodestore.db.mapped_memory: The size to allocate for memory mapping the node store.
• neostore.propertystore.db.arrays.mapped_memory: The size to allocate for memory mapping the array property store.
• neostore.propertystore.db.index.keys.mapped_memory: The size to allocate for memory mapping the store for property key strings.
• neostore.propertystore.db.index.mapped_memory: The size to allocate for memory mapping the store for property key indexes.
• neostore.propertystore.db.mapped_memory: The size to allocate for memory mapping the property value store.
• neostore.propertystore.db.strings.mapped_memory: The size to allocate for memory mapping the string property store.
• neostore.relationshipstore.db.mapped_memory: The size to allocate for memory mapping the relationship store.
• node_auto_indexing: Controls the auto indexing feature for nodes. Setting it to false shuts it down, while true enables it by default for properties listed in the node_keys_indexable setting.
• node_keys_indexable: A list of property names (comma separated) that will be indexed by default. This applies to nodes only.
• online_backup_enabled: Enable support for running online backups.
• online_backup_server: Listening server for online backups.
• read_only: Only allow read operations from this Neo4j instance. This mode still requires write access to the directory for lock purposes.
• rebuild_idgenerators_fast: Use a quick approach for rebuilding the ID generators. This gives quicker recovery time, but will limit the ability to reuse the space of deleted entities.
• relationship_auto_indexing: Controls the auto indexing feature for relationships. Setting it to false shuts it down, while true enables it by default for properties listed in the relationship_keys_indexable setting.
• relationship_grab_size: How many relationships to read at a time during iteration.
• relationship_keys_indexable: A list of property names (comma separated) that will be indexed by default. This applies to relationships only.
• remote_shell_enabled: Enable a remote shell server which Neo4j Shell clients can log in to.
• remote_shell_host: Remote host for shell. By default, the shell server listens only on the loopback interface, but you can specify the IP address of any network interface or use 0.0.0.0 for all interfaces.
• remote_shell_name: The name of the shell.
• remote_shell_port: The port the shell will listen on.
• remote_shell_read_only: Read only mode. Will only allow read operations.
• store.internal_log.debug_contexts: Internal log contexts that should output debug level logging.
• store.internal_log.level: Log level threshold.
• store.internal_log.location: The location of the internal diagnostics log.
• store.internal_log.max_archives: Maximum number of history files for the internal log.
• store.internal_log.rotation_delay: Minimum time interval after last rotation of the internal log before it may be rotated again.
• store.internal_log.rotation_threshold: Threshold for rotation of the internal log.
• store.interval.log.rotation: Maximum time interval for log rotation to wait for active transaction completion.
• store_dir: The directory where the database files are located.
• string_block_size: Specifies the block size for storing strings. This parameter is only honored when the store is created, otherwise it is ignored. Note that each character in a string occupies two bytes, meaning that a block size of 120 (the default size) will hold a 60 character long string before overflowing into a second block. Also note that each block carries an overhead of 8 bytes. This means that if the block size is 120, the size of the stored records will be 128 bytes.
• transaction_start_timeout: The maximum amount of time to wait for the database to become available, when starting a new transaction.
• unsupported.dbms.id_reuse_safe_zone: Duration for which the master will buffer ids and not reuse them, to allow slaves to read consistently. Slaves will also terminate transactions longer than this duration when applying the received transaction stream, to make sure they do not read potentially inconsistent/reused records.
• unsupported.dbms.index.archive_failed: Create an archive of an index before re-creating it if it fails to load on startup.
• use_memory_mapped_buffers: Use memory mapped buffers for accessing the native storage layer.

MBean Diagnostics (org.neo4j.management.Diagnostics) Attributes
Diagnostics provided by Neo4j.
• DiagnosticsProviders: A list of the ids for the registered diagnostics providers. Type: List (java.util.List). Read: yes. Write: no.

MBean Diagnostics (org.neo4j.management.Diagnostics) Operations
• dumpAll: Dump diagnostics information to JMX. ReturnType: String. Signature: (no parameters).
• dumpToLog: Dump diagnostics information to the log. ReturnType: void. Signature: (no parameters).
• dumpToLog: Dump diagnostics information to the log. ReturnType: void. Signature: java.lang.String.
• extract: Operation exposed for management. ReturnType: String. Signature: java.lang.String.

MBean High Availability (org.neo4j.management.HighAvailability) Attributes
Information about an instance participating in a HA cluster.
• Alive: Whether this instance is alive or not. Type: boolean. Read: yes. Write: no.
• Available: Whether this instance is available or not. Type: boolean. Read: yes. Write: no.
• InstanceId: The identifier used to identify this server in the HA cluster. Type: String. Read: yes. Write: no.
• InstancesInCluster: Information about all instances in this cluster. Type: org.neo4j.management.ClusterMemberInfo[] as CompositeData[]. Read: yes. Write: no.
• LastCommittedTxId: The latest transaction id present in this instance's store. Type: long. Read: yes. Write: no.
• LastUpdateTime: The time when the data on this instance was last updated from the master. Type: String. Read: yes. Write: no.
• Role: The role this instance has in the cluster. Type: String. Read: yes. Write: no.

MBean High Availability (org.neo4j.management.HighAvailability) Operations
• update: (If this is a slave) Update the database on this instance with the latest transactions from the master. ReturnType: String. Signature: (no parameters).

MBean Kernel (org.neo4j.jmx.Kernel) Attributes
Information about the Neo4j kernel.
• KernelStartTime: The time from which this Neo4j instance was in operational mode. Type: Date (java.util.Date). Read: yes. Write: no.
• KernelVersion: The version of Neo4j. Type: String. Read: yes. Write: no.
• MBeanQuery: An ObjectName that can be used as a query for getting all management beans for this Neo4j instance. Type: javax.management.ObjectName. Read: yes. Write: no.
• ReadOnly: Whether this is a read only instance. Type: boolean. Read: yes. Write: no.
• StoreCreationDate: The time when this Neo4j graph store was created. Type: Date (java.util.Date). Read: yes. Write: no.
• StoreDirectory: The location where the Neo4j store is located. Type: String. Read: yes. Write: no.
• StoreId: An identifier that, together with store creation time, uniquely identifies this Neo4j graph store. Type: String. Read: yes. Write: no.
• StoreLogVersion: The current version of the Neo4j store logical log. Type: long. Read: yes. Write: no.

MBean Locking (org.neo4j.management.LockManager) Attributes
Information about the Neo4j lock status.
• Locks: Information about all locks held by Neo4j. Type: java.util.List as CompositeData[]. Read: yes. Write: no.
• NumberOfAvertedDeadlocks: The number of lock sequences that would have lead to a deadlock situation that Neo4j has detected and averted (by throwing DeadlockDetectedException). Type: long. Read: yes. Write: no.

MBean Locking (org.neo4j.management.LockManager) Operations
• getContendedLocks: getContendedLocks. ReturnType: java.util.List as CompositeData[]. Signature: long.

MBean Memory Mapping (org.neo4j.management.MemoryMapping) Attributes
The status of Neo4j memory mapping.
• MemoryPools: Get information about each pool of memory mapped regions from store files with memory mapping enabled. Type: org.neo4j.management.WindowPoolInfo[] as CompositeData[]. Read: yes. Write: no.

MBean Page cache (org.neo4j.management.PageCache) Attributes
Information about the Neo4j page cache. All attributes are of type long, readable (Read: yes) and
not writable (Write: no).
• BytesRead: Number of bytes read from durable storage.
• BytesWritten: Number of bytes written to durable storage.
• EvictionExceptions: Number of exceptions caught during page eviction.
• Evictions: Number of page evictions.
• Faults: Number of page faults.
• FileMappings: Number of files that have been mapped into the page cache.
• FileUnmappings: Number of files that have been unmapped from the page cache.
• Flushes: Number of page flushes.
• Pins: Number of page pins.
• Unpins: Number of page unpins.

MBean Primitive count (org.neo4j.jmx.Primitives) Attributes
Estimates of the numbers of different kinds of Neo4j primitives. All attributes are of type long,
readable (Read: yes) and not writable (Write: no).
• NumberOfNodeIdsInUse: An estimation of the number of nodes used in this Neo4j instance.
• NumberOfPropertyIdsInUse: An estimation of the number of properties used in this Neo4j instance.
• NumberOfRelationshipIdsInUse: An estimation of the number of relationships used in this Neo4j instance.
• NumberOfRelationshipTypeIdsInUse: The number of relationship types used in this Neo4j instance.

MBean Store file sizes (org.neo4j.jmx.StoreFile) Attributes
Information about the sizes of the different parts of the Neo4j graph store. All attributes are of
type long, readable (Read: yes) and not writable (Write: no).
• ArrayStoreSize: The amount of disk space used to store array properties, in bytes.
• LogicalLogSize: The amount of disk space used by the current Neo4j logical log, in bytes.
• NodeStoreSize: The amount of disk space used to store nodes, in bytes.
• PropertyStoreSize: The amount of disk space used to store properties (excluding string values and array values), in bytes.
• RelationshipStoreSize: The amount of disk space used to store relationships, in bytes.
• StringStoreSize: The amount of disk space used to store string properties, in bytes.
• TotalStoreSize: The total disk space used by this Neo4j instance, in bytes.
MBean Transactions (org.neo4j.management.TransactionManager) Attributes
Information about the Neo4j transaction manager. All attributes are of type long, readable
(Read: yes) and not writable (Write: no).
• LastCommittedTxId: The id of the latest committed transaction.
• NumberOfCommittedTransactions: The total number of committed transactions.
• NumberOfOpenedTransactions: The total number of started transactions.
• NumberOfOpenTransactions: The number of currently open transactions.
• NumberOfRolledBackTransactions: The total number of rolled back transactions.
• PeakNumberOfConcurrentTransactions: The highest number of transactions ever opened concurrently.

MBean Index sampler (org.neo4j.management.IndexSamplingManager) Operations
• triggerIndexSampling: triggerIndexSampling. ReturnType: void. Signature: java.lang.String, java.lang.String, boolean.

28.5. Metrics Reporting
Note

Metrics reporting is only available in the Neo4j Enterprise Edition.

Introducing Metrics
Neo4j Enterprise can be configured to continuously export Neo4j-specific metrics to Graphite, Ganglia,
or CSV files. This makes it easy to monitor the health of running Neo4j instances.
Neo4j Enterprise can expose metrics for the following parts of the database, and does so by default:
# default setting for enabling all supported metrics
metrics.enabled=true
# default setting for enabling all Neo4j specific metrics
metrics.neo4j.enabled=true
# setting for exposing metrics about transactions; number of transactions started, committed, etc.
metrics.neo4j.tx.enabled=true
# setting for exposing metrics about the Neo4j page cache; page faults, evictions, flushes and exceptions, etc.
metrics.neo4j.pagecache.enabled=true
# setting for exposing metrics about approximately how many entities are in the database; nodes, relationships, properties, etc.
metrics.neo4j.counts.enabled=true
# setting for exposing metrics about the network usage of the HA cluster component
metrics.neo4j.network.enabled=true

Graphite Configuration
For Graphite integration add the following settings to neo4j.properties:
# default is 'false'
metrics.graphite.enabled=true
metrics.graphite.server=<hostname or IP>:2003
metrics.graphite.interval=<reporting interval, e.g. 3s>
metrics.prefix=<common prefix for this instance's metrics>

Start the Neo4j Server and connect to Graphite via a web browser in order to monitor your Neo4j
metrics.
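With the default metrics.prefix of neo4j, the series reported to Graphite get names such as
neo4j.transaction.committed or neo4j.page_cache.page_faults; see the section called "Available
Metrics" below.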

Ganglia Configuration
For Ganglia integration add the following settings to neo4j.properties:
# default is 'false'
metrics.ganglia.enabled=true
metrics.ganglia.server=<hostname or IP>:8469
metrics.ganglia.interval=<reporting interval, e.g. 3s>
metrics.prefix=<common prefix for this instance's metrics>

Export to CSV Configuration
For storing metrics in local CSV files add the following settings to neo4j.properties:
# default is 'false'
metrics.csv.enabled=true
metrics.csv.path=<path to CSV file or directory>
metrics.csv.file=<single or split>
metrics.csv.interval=<reporting interval, e.g. 3s>

Note

The CSV exporter does not automatically rotate the output files, so it is recommended to
also set up a CRON job to periodically archive the files.
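A sketch of such a crontab entry is shown below; the schedule and paths are placeholders, and the
entry assumes metrics.csv.path points at a directory of split CSV files:
0 3 * * * tar czf /backups/neo4j-metrics-$(date +\%F).tar.gz -C /path/to/metrics-csv-dir .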

Configuration Settings Reference for Metrics
Metrics settings

• metrics.csv.enabled: Set to true to enable exporting metrics to CSV files.
• metrics.csv.interval: The reporting interval for the CSV files.
• metrics.csv.path: The target location of the CSV files.
• metrics.enabled: The default enablement value for all the supported metrics.
• metrics.graphite.enabled: Set to true to enable exporting metrics to Graphite.
• metrics.graphite.interval: The reporting interval for Graphite.
• metrics.graphite.server: The hostname or IP address of the Graphite server.
• metrics.jvm.buffers.enabled: Enable reporting metrics about the buffer pools.
• metrics.jvm.gc.enabled: Enable reporting metrics about the duration of garbage collections.
• metrics.jvm.memory.enabled: Enable reporting metrics about the memory usage.
• metrics.jvm.threads.enabled: Enable reporting metrics about the current number of threads running.
• metrics.neo4j.checkpointing.enabled: Enable reporting metrics about Neo4j check pointing.
• metrics.neo4j.cluster.enabled: Enable reporting metrics about HA cluster info.
• metrics.neo4j.counts.enabled: Enable reporting metrics about approximately how many entities are in the database.
• metrics.neo4j.enabled: The default enablement value for all Neo4j specific support metrics.
• metrics.neo4j.logrotation.enabled: Enable reporting metrics about the Neo4j log rotation.
• metrics.neo4j.network.enabled: Enable reporting metrics about the network usage.
• metrics.neo4j.pagecache.enabled: Enable reporting metrics about the Neo4j page cache.
• metrics.neo4j.tx.enabled: Enable reporting metrics about transactions.
• metrics.prefix: A common prefix for the reported metrics field names.

Deprecated settings

• metrics.csv.file: Write to a single CSV file or to multiple files.
• metrics.ganglia.enabled: Set to true to enable exporting metrics to Ganglia.
• metrics.ganglia.interval: The reporting interval for Ganglia.
• metrics.ganglia.server: The hostname or IP address of the Ganglia server.

metrics.csv.enabled
Description: Set to true to enable exporting metrics to CSV files.
Valid values: metrics.csv.enabled is a boolean.
Default value: false

metrics.csv.file
Description: Write to a single CSV file or to multiple files. Set to single (the default) for reporting the metrics in a single CSV file (given by metrics.csv.path), with a column per metrics field. Or set to split to produce a CSV file for each metrics field, in a directory given by metrics.csv.path.
Valid values: metrics.csv.file is one of single, split.
Default value: single
Deprecated: This setting will be removed in the next major release.

metrics.csv.interval
Description: The reporting interval for the CSV files. That is, how often new rows with numbers are appended to the CSV files.
Valid values: metrics.csv.interval is a duration (valid units are ms, s, m).
Default value: 3s

metrics.csv.path
Description: The target location of the CSV files. Depending on the metrics.csv.file setting, this is either the path to an individual CSV file, that has each of the reported metrics fields as columns, or it is a path to a directory wherein a CSV file per reported field will be written. Relative paths will be interpreted relative to the configured Neo4j store directory.
Valid values: metrics.csv.path is a path.

metrics.enabled
Description: The default enablement value for all the supported metrics. Set this to false to turn off all metrics by default. The individual settings can then be used to selectively re-enable specific metrics.
Valid values: metrics.enabled is a boolean.
Default value: false

metrics.ganglia.enabled
Description: Set to true to enable exporting metrics to Ganglia.
Valid values: metrics.ganglia.enabled is a boolean.
Default value: false
Deprecated: Ganglia support is experimental, and not guaranteed to work. This built-in support has been deprecated and will be removed from a subsequent version.

metrics.ganglia.interval
Description: The reporting interval for Ganglia. That is, how often to send updated metrics to Ganglia.
Valid values: metrics.ganglia.interval is a duration (valid units are ms, s, m).
Default value: 3s
Deprecated: Ganglia support is experimental, and not guaranteed to work. This built-in support has been deprecated and will be removed from a subsequent version.

metrics.ganglia.server
Description: The hostname or IP address of the Ganglia server.
Valid values: metrics.ganglia.server is a hostname and port.
Default value: :8469
Deprecated: Ganglia support is experimental, and not guaranteed to work. This built-in support has been deprecated and will be removed from a subsequent version.

metrics.graphite.enabled
Description: Set to true to enable exporting metrics to Graphite.
Valid values: metrics.graphite.enabled is a boolean.
Default value: false

metrics.graphite.interval
Description: The reporting interval for Graphite. That is, how often to send updated metrics to Graphite.
Valid values: metrics.graphite.interval is a duration (valid units are ms, s, m).
Default value: 3s

metrics.graphite.server
Description: The hostname or IP address of the Graphite server.
Valid values: metrics.graphite.server is a hostname and port.
Default value: :2003

metrics.jvm.buffers.enabled
Description: Enable reporting metrics about the buffer pools.
Valid values: metrics.jvm.buffers.enabled is a boolean.
Default value: false

metrics.jvm.gc.enabled
Description: Enable reporting metrics about the duration of garbage collections.
Valid values: metrics.jvm.gc.enabled is a boolean.
Default value: false

metrics.jvm.memory.enabled
Description: Enable reporting metrics about the memory usage.
Valid values: metrics.jvm.memory.enabled is a boolean.
Default value: false

metrics.jvm.threads.enabled
Description: Enable reporting metrics about the current number of threads running.
Valid values: metrics.jvm.threads.enabled is a boolean.
Default value: false

metrics.neo4j.checkpointing.enabled
Description: Enable reporting metrics about Neo4j check pointing; when it occurs and how much time it takes to complete.
Valid values: metrics.neo4j.checkpointing.enabled is a boolean.
Default value: false

metrics.neo4j.cluster.enabled
Description: Enable reporting metrics about HA cluster info.
Valid values: metrics.neo4j.cluster.enabled is a boolean.
Default value: false

metrics.neo4j.counts.enabled
Description: Enable reporting metrics about approximately how many entities are in the database; nodes, relationships, properties, etc.
Valid values: metrics.neo4j.counts.enabled is a boolean.
Default value: false

metrics.neo4j.enabled
Description: The default enablement value for all Neo4j specific support metrics. Set this to false to turn off all Neo4j specific metrics by default. The individual metrics.neo4j.* metrics can then be turned on selectively.
Valid values: metrics.neo4j.enabled is a boolean.
Default value: false

metrics.neo4j.logrotation.enabled
Description: Enable reporting metrics about the Neo4j log rotation; when it occurs and how much time it takes to complete.
Valid values: metrics.neo4j.logrotation.enabled is a boolean.
Default value: false

metrics.neo4j.network.enabled
Description: Enable reporting metrics about the network usage.
Valid values: metrics.neo4j.network.enabled is a boolean.
Default value: false

metrics.neo4j.pagecache.enabled
Description: Enable reporting metrics about the Neo4j page cache; page faults, evictions, flushes, exceptions, etc.
Valid values: metrics.neo4j.pagecache.enabled is a boolean.
Default value: false

metrics.neo4j.tx.enabled
Description: Enable reporting metrics about transactions; number of transactions started, committed, etc.
Valid values: metrics.neo4j.tx.enabled is a boolean.
Default value: false

metrics.prefix
Description: A common prefix for the reported metrics field names. By default, this is either neo4j, or a computed value based on the cluster and instance names, when running in an HA configuration.
Valid values: metrics.prefix is a string.
Default value: neo4j

Available Metrics

Database Metrics
• neo4j.transaction.peak_concurrent: The highest peak of concurrent transactions ever seen on this machine
• neo4j.transaction.started: The total number of started transactions
• neo4j.transaction.terminated: The total number of terminated transactions
• neo4j.transaction.rollbacks: The total number of rolled back transactions
• neo4j.transaction.committed: The total number of committed transactions
• neo4j.transaction.active: The number of currently active transactions
• neo4j.page_cache.eviction_exceptions: The total number of exceptions seen during the eviction process in the page cache
• neo4j.page_cache.flushes: The total number of flushes executed by the page cache
• neo4j.page_cache.unpins: The total number of page unpins executed by the page cache
• neo4j.page_cache.pins: The total number of page pins executed by the page cache
• neo4j.page_cache.evictions: The total number of page evictions executed by the page cache
• neo4j.page_cache.page_faults: The total number of page faults that happened in the page cache
• neo4j.check_point.events: The total number of check point events executed so far
• neo4j.check_point.total_time: The total time spent in check pointing so far
• neo4j.log_rotation.events: The total number of transaction log rotations executed so far
• neo4j.log_rotation.total_time: The total time spent in rotating transaction logs so far
• neo4j.ids_in_use.relationship_type: The total number of different relationship types stored in the database
• neo4j.ids_in_use.property: The total number of different property names used in the database
• neo4j.ids_in_use.relationship: The total number of relationships stored in the database
• neo4j.ids_in_use.node: The total number of nodes stored in the database

Network Metrics
• neo4j.network.slave_network_tx_writes: The amount of bytes transmitted on the network containing the transaction data from a slave to the master in order to be committed
• neo4j.network.master_network_store_writes: The amount of bytes transmitted on the network while copying stores from one machine to another
• neo4j.network.master_network_tx_writes: The amount of bytes transmitted on the network containing the transaction data from a master to the slaves in order to propagate committed transactions

Cluster Metrics
• neo4j.cluster.slave_pull_updates: The total number of update pulls executed by this instance
• neo4j.cluster.slave_pull_update_up_to_tx: The highest transaction id that has been pulled in the last pull updates by this instance

Java Virtual Machine Metrics
These metrics are environment dependent and they may vary on different hardware and with JVM
configurations. Typically these metrics will show information about garbage collections (for example
the number of events and time spent collecting), memory pools and buffers, and finally the number of
active threads running.

Part VI. Tools

The Tools part describes available Neo4j tools and how to use them.

29. Import tool ....................................................................................................................................... 531
29.1. CSV file header format .......................................................................................................... 532
29.2. Command line usage ............................................................................................................ 533
29.3. Import tool examples ............................................................................................................ 536
30. Web Interface ................................................................................................................................... 544
31. Neo4j Shell ....................................................................................................................................... 545
31.1. Starting the shell ................................................................................................................... 546
31.2. Passing options and arguments ........................................................................................... 548
31.3. Enum options ........................................................................................................................ 549
31.4. Filters ..................................................................................................................................... 550
31.5. Node titles ............................................................................................................................. 551
31.6. How to use (individual commands) ...................................................................................... 552
31.7. An example shell session ...................................................................................................... 557
31.8. A Matrix example .................................................................................................................. 558

Chapter 29. Import tool
The import tool is used to create a new Neo4j database from data in CSV files.
This chapter explains how to use the tool, format the input data and concludes with an example
bringing everything together.
These are some things you’ll need to keep in mind when creating your input files:

• Fields are comma separated by default but a different delimiter can be specified.
• All files must use the same delimiter.
• Multiple data sources can be used for both nodes and relationships.
• A data source can optionally be provided using multiple files.
• A header which provides information on the data fields must be on the first row of each data source.
• Fields without corresponding information in the header will not be read.
• UTF-8 encoding is used.

Tip

Indexes are not created during the import. Instead you’ll need to add indexes afterwards
(see the section called “Indexes” [9]).

Note

Data cannot be imported into an existing database using this tool.
• If you want to load small to medium sized CSV files see Section 11.6, “Load CSV” [182].
• If you want to bulk import into an existing database see Chapter 36, Batch
Insertion [636].


29.1. CSV file header format
The header row of each data source specifies how the fields should be interpreted. The same delimiter
is used for the header row as for the rest of the data.
The header contains information for each field, with the format <name>:<field_type>. The <name> is
used as the property key for values, and ignored in other cases. The following <field_type> settings
can be used for both nodes and relationships:
Property value: Use one of int, long, float, double, boolean, byte, short, char, string to designate the data type. If no data type is given, this defaults to string. To define an array type, append [] to the type. By default, array values are separated by ;. A different delimiter can be specified with --array-delimiter.
IGNORE: Ignore this field completely.

See below for the specifics of node and relationship data source headers.

Nodes
The following field types additionally apply to node data sources:
ID: Each node must have a unique id which is used during the import. The ids are used to find the correct nodes when creating relationships. Note that the id has to be unique across all nodes in the import, even nodes with different labels.
LABEL: Read one or more labels from this field. Like array values, multiple labels are separated by ;, or by the character specified with --array-delimiter.

Relationships
For relationship data sources, there are three mandatory fields:
TYPE: The relationship type to use for the relationship.
START_ID: The id of the start node of the relationship to create.
END_ID: The id of the end node of the relationship to create.

ID spaces
The import tool assumes that node identifiers are unique across node files. If this isn’t the case then we
can define an id space. Id spaces are defined in the ID field of node files.
For example, to specify the Person id space we would use the field type ID(Person) in our persons
node file. We also need to reference that id space in our relationships file i.e. START_ID(Person) or
END_ID(Person).
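For example, the headers could look like this (a sketch with hypothetical file names, additionally
assuming a Movie id space for the movie nodes):
persons-header.csv
personId:ID(Person),name
roles-header.csv
:START_ID(Person),role,:END_ID(Movie),:TYPE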


29.2. Command line usage
Linux
Under Unix/Linux/OSX, the command is named neo4j-import. Depending on the installation type, the
tool is either available globally, or used by executing ./bin/neo4j-import from inside the installation
directory.

Windows
For help with running the import tool using Windows PowerShell, see Section 23.3, “Windows
PowerShell module” [442].

Options
--into <store-dir>
Database directory to import into. Must not contain existing database.

--nodes[:Label1:Label2] "<file1>,<file2>,…"
Node CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.

--relationships[:RELATIONSHIP_TYPE] "<file1>,<file2>,…"
Relationship CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.

--delimiter <delimiter-character>
Delimiter character, or TAB, between values in CSV data. The default option is ,.

--array-delimiter <array-delimiter-character>
Delimiter character, or TAB, between array elements within a value in CSV data. The default option is ;.

--quote <quotation-character>
Character to treat as quotation character for values in CSV data. The default option is ". Quotes inside quotes escaped like """Go away"", he said." and "\"Go away\", he said." are supported. If you have set "'" to be used as the quotation character, you could write the previous example like this instead: '"Go away", he said.'

--multiline-fields <true/false>
Whether or not fields from input source can span multiple lines, i.e. contain newline characters. Default value: false

--input-encoding <character-set>
Character set that input data is encoded in. Provided value must be one out of the available character sets in the JVM, as provided by Charset#availableCharsets(). If no input encoding is provided, the default character set of the JVM will be used.

--ignore-empty-strings <true/false>
Whether or not empty string fields, i.e. "" from input source are ignored, i.e. treated as null. Default value: false

--id-type <id-type>
One out of [STRING, INTEGER, ACTUAL], specifying how ids in node/relationship input files are treated. STRING: arbitrary strings for identifying nodes. INTEGER: arbitrary integer values for identifying nodes. ACTUAL: (advanced) actual node ids. The default option is STRING. Default value: STRING

--processors <max processor count>
(advanced) Max number of processors used by the importer. Defaults to the number of available processors reported by the JVM. There is a certain amount of minimum threads needed, so for that reason there is no lower bound for this value. For optimal performance this value shouldn't be greater than the number of available processors.

--stacktrace <true/false>
Enable printing of error stack traces.

--bad-tolerance <max number of bad entries>
Number of bad entries before the import is considered failed. This tolerance threshold is about relationships referring to missing nodes. Format errors in input data are still treated as errors. Default value: 1000

--skip-bad-relationships <true/false>
Whether or not to skip importing relationships that refer to missing node ids, i.e. either start or end node id/group referring to a node that wasn't specified by the node input data. Skipped relationships will be logged, containing at most the number of entities specified by bad-tolerance. Default value: true

--skip-duplicate-nodes <true/false>
Whether or not to skip importing nodes that have the same id/group. In the event of multiple nodes within the same group having the same id, the first encountered will be imported whereas consecutive such nodes will be skipped. Skipped nodes will be logged, containing at most the number of entities specified by bad-tolerance. Default value: false

--ignore-extra-columns <true/false>
Whether or not to ignore extra columns in the data not specified by the header. Skipped columns will be logged, containing at most the number of entities specified by bad-tolerance. Default value: false

--db-config <path/to/neo4j.properties>
(advanced) File specifying database-specific configuration. For more information consult the manual about available configuration options for a neo4j configuration file. Only configuration affecting the store at time of creation will be read. Examples of supported config are: dense_node_threshold, string_block_size, array_block_size

Output and statistics
While an import is running through its different stages, some statistics and figures are printed in the
console. The general interpretation of that output is to look at the horizontal line, which is divided up
into sections, each section representing one type of work going on in parallel with the other sections.
The wider a section is, the more time is spent there relative to the other sections, the widest being the
bottleneck, also marked with *. If a section has a double line, instead of just a single line, it means that
multiple threads are executing the work in that section. To the far right a number is displayed telling
how many entities (nodes or relationships) have been processed by that stage.
As an example:
[*>:20,25 MB/s------------------|PREPARE(3)====================|RELATIONSHIP(2)===============] 16M

Would be interpreted as:

• > data being read, and perhaps parsed, at 20,25 MB/s, data that is being passed on to …
• PREPARE preparing the data for …
• RELATIONSHIP creating actual relationship records and …
• v writing the relationships to the store. This step isn’t visible in this example, because it’s so cheap compared to the other sections.

Observing the section sizes can give hints about where performance can be improved. In the example
above, the bottleneck is the data read section (marked with >), which might indicate that the disk is
being slow, or is poorly handling simultaneous read and write operations (since the last section often
revolves around writing to disk).

Verbose error information
In some cases if an unexpected error occurs it might be useful to supply the command line option
--stacktrace to the import (and rerun the import to actually see the additional information). This
will have the error printed with additional debug information, useful for both developers and issue
reporting.


29.3. Import tool examples
Let’s look at a few examples. We’ll use a data set containing movies, actors and roles.

Tip

While you’ll usually want to store your node identifier as a property on the node for looking
it up later, it’s not mandatory. If you don’t want the identifier to be persisted then don’t
specify a property name in the :ID field.

Basic example
First we’ll look at the movies. Each movie has an id, which is used to refer to it in other data sources, a
title and a year. Along with these properties we’ll also add the node labels Movie and Sequel.
By default the import tool expects CSV files to be comma delimited.
movies.csv
movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

Next up are the actors. They have an id - in this case a shorthand - and a name and all have the Actor
label.
actors.csv
personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

Finally we have the roles that an actor plays in a movie which will be represented by relationships in
the database. In order to create a relationship between nodes we refer to the ids used in actors.csv
and movies.csv in the START_ID and END_ID fields. We also need to provide a relationship type (in this case
ACTED_IN) in the :TYPE field.
roles.csv
:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN

With all data in place, we execute the following command:
neo4j-import --into path_to_target_directory --nodes movies.csv --nodes actors.csv --relationships roles.csv

We’re now ready to start up a database from the target directory. (see Section 23.2, “Server
Installation” [439])
Once we’ve got the database up and running we can add appropriate indexes. (see Section 3.6, “Labels,
Constraints and Indexes” [32].)


Tip

It is possible to import only nodes using the import tool - just don’t specify a relationships
file when calling neo4j-import. If you do this you’ll need to create relationships later by
another method - the import tool only works for initial graph population.

Customizing configuration options
We can customize the configuration options that the import tool uses (see the section called
“Options” [533]) if our data doesn’t fit the default format. The following CSV files are delimited by ;, use
| as their array delimiter and use ' for quotes.
movies2.csv
movieId:ID;title;year:int;:LABEL
tt0133093;'The Matrix';1999;Movie
tt0234215;'The Matrix Reloaded';2003;Movie|Sequel
tt0242653;'The Matrix Revolutions';2003;Movie|Sequel

actors2.csv
personId:ID;name;:LABEL
keanu;'Keanu Reeves';Actor
laurence;'Laurence Fishburne';Actor
carrieanne;'Carrie-Anne Moss';Actor

roles2.csv
:START_ID;role;:END_ID;:TYPE
keanu;'Neo';tt0133093;ACTED_IN
keanu;'Neo';tt0234215;ACTED_IN
keanu;'Neo';tt0242653;ACTED_IN
laurence;'Morpheus';tt0133093;ACTED_IN
laurence;'Morpheus';tt0234215;ACTED_IN
laurence;'Morpheus';tt0242653;ACTED_IN
carrieanne;'Trinity';tt0133093;ACTED_IN
carrieanne;'Trinity';tt0234215;ACTED_IN
carrieanne;'Trinity';tt0242653;ACTED_IN

We can then import these files with the following command line options:
neo4j-import --into path_to_target_directory --nodes movies2.csv --nodes actors2.csv --relationships roles2.csv --delimiter
";" --array-delimiter "|" --quote "'"

Using separate header files
When dealing with very large CSV files it’s more convenient to have the header in a separate file. This
makes it easier to edit the header as you avoid having to open a huge data file just to change it.

Tip

The import tool can also process single file compressed archives, e.g. --nodes nodes.csv.gz or --relationships rels.zip.

We’ll use the same data as in the previous example but put the headers in separate files.
movies3-header.csv
movieId:ID,title,year:int,:LABEL

movies3.csv
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors3-header.csv
personId:ID,name,:LABEL

actors3.csv
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

roles3-header.csv
:START_ID,role,:END_ID,:TYPE

roles3.csv
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN

Note how the file groups are enclosed in quotation marks in the command:
neo4j-import --into path_to_target_directory --nodes "movies3-header.csv,movies3.csv" --nodes "actors3-header.csv,actors3.csv"
--relationships "roles3-header.csv,roles3.csv"

Multiple input files
As well as using a separate header file you can also provide multiple nodes or relationships files. This
may be useful when processing the output from a Hadoop pipeline for example. Files within such
an input group can be specified with multiple match strings, delimited by ,, where each match string
can be either: the exact file name or a regular expression matching one or more files. Multiple matching
files will be sorted according to their characters and their natural number sort order for file names
containing numbers.
movies4-header.csv
movieId:ID,title,year:int,:LABEL

movies4-part1.csv
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel

movies4-part2.csv
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors4-header.csv
personId:ID,name,:LABEL

actors4-part1.csv
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor

actors4-part2.csv
carrieanne,"Carrie-Anne Moss",Actor

roles4-header.csv
:START_ID,role,:END_ID,:TYPE

roles4-part1.csv
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN

roles4-part2.csv
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN

The call to neo4j-import would look like this:
neo4j-import --into path_to_target_directory --nodes "movies4-header.csv,movies4-part1.csv,movies4-part2.csv" --nodes
"actors4-header.csv,actors4-part1.csv,actors4-part2.csv" --relationships "roles4-header.csv,roles4-part1.csv,roles4-part2.csv"

Types and labels
Using the same label for every node
If you want to use the same node label(s) for every node in your nodes file you can do this by specifying
the appropriate value as an option to neo4j-import. In this example we’ll put the label Movie on every
node specified in movies5.csv:
movies5.csv
movieId:ID,title,year:int
tt0133093,"The Matrix",1999

Tip

There’s then no need to specify the :LABEL field in the node file if you pass it as a command
line option. If you do then both the label provided in the file and the one provided on the
command line will be added to the node.
In this case, we’ll put the labels Movie and Sequel on the nodes specified in sequels5.csv.
sequels5.csv
movieId:ID,title,year:int
tt0234215,"The Matrix Reloaded",2003
tt0242653,"The Matrix Revolutions",2003

actors5.csv
personId:ID,name
keanu,"Keanu Reeves"
laurence,"Laurence Fishburne"
carrieanne,"Carrie-Anne Moss"

roles5.csv
:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN

The call to neo4j-import would look like this:
neo4j-import --into path_to_target_directory --nodes:Movie movies5.csv --nodes:Movie:Sequel sequels5.csv --nodes:Actor
actors5.csv --relationships roles5.csv

Using the same relationship type for every relationship
If you want to use the same relationship type for every relationship in your relationships file you can
do this by specifying the appropriate value as an option to neo4j-import. In this example we’ll put the
relationship type ACTED_IN on every relationship specified in roles6.csv:
movies6.csv
movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors6.csv
personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

roles6.csv
:START_ID,role,:END_ID
keanu,"Neo",tt0133093
keanu,"Neo",tt0234215
keanu,"Neo",tt0242653
laurence,"Morpheus",tt0133093
laurence,"Morpheus",tt0234215
laurence,"Morpheus",tt0242653
carrieanne,"Trinity",tt0133093
carrieanne,"Trinity",tt0234215
carrieanne,"Trinity",tt0242653

Tip

If you provide a relationship type on the command line and in the relationships file the one
in the file will be applied.

The call to neo4j-import would look like this:
neo4j-import --into path_to_target_directory --nodes movies6.csv --nodes actors6.csv --relationships:ACTED_IN roles6.csv

Property types
The type for properties specified in nodes and relationships files is defined in the header row (see
Section 29.1, “CSV file header format” [532]).
The following example creates a small graph containing one actor and one movie connected by an
ACTED_IN relationship. There is a roles property on the relationship which contains an array of the
characters played by the actor in a movie.
movies7.csv
movieId:ID,title,year:int,:LABEL
tt0099892,"Joe Versus the Volcano",1990,Movie

actors7.csv
personId:ID,name,:LABEL
meg,"Meg Ryan",Actor

roles7.csv
:START_ID,roles:string[],:END_ID,:TYPE
meg,"DeDe;Angelica Graynamore;Patricia Graynamore",tt0099892,ACTED_IN

The arguments to neo4j-import would be the following:
neo4j-import --into path_to_target_directory --nodes movies7.csv --nodes actors7.csv --relationships roles7.csv

ID handling
Each node processed by neo4j-import must provide a unique id. We use this id to find the correct nodes
when creating relationships.

Working with sequential or auto incrementing identifiers
The import tool assumes that identifiers are unique across node files. This may not be
the case for data sets which use sequential, auto incremented, or otherwise colliding identifiers. Such
data sets can define id spaces, where identifiers are unique within their respective id space.
For example, if movies and people both use sequential identifiers, then we would define Movie and Actor
id spaces.
movies8.csv
movieId:ID(Movie),title,year:int,:LABEL
1,"The Matrix",1999,Movie
2,"The Matrix Reloaded",2003,Movie;Sequel
3,"The Matrix Revolutions",2003,Movie;Sequel

actors8.csv
personId:ID(Actor),name,:LABEL
1,"Keanu Reeves",Actor
2,"Laurence Fishburne",Actor
3,"Carrie-Anne Moss",Actor

We also need to reference the appropriate id space in our relationships file so it knows which nodes to
connect together:
roles8.csv
:START_ID(Actor),role,:END_ID(Movie)
1,"Neo",1
1,"Neo",2
1,"Neo",3
2,"Morpheus",1
2,"Morpheus",2
2,"Morpheus",3
3,"Trinity",1
3,"Trinity",2
3,"Trinity",3

The command line arguments would remain the same as before:
neo4j-import --into path_to_target_directory --nodes movies8.csv --nodes actors8.csv --relationships:ACTED_IN roles8.csv

Bad input data
The import tool has a threshold for how many bad entities (nodes/relationships) to tolerate and skip
before failing the import. By default 1000 bad entities are tolerated. A bad tolerance of 0 will, for
example, fail the import on the first bad entity. For more information, see the --bad-tolerance option.
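As a sketch, to make the import fail on the very first bad entity (using the files from the example below), the tolerance can be set explicitly:
neo4j-import --into path_to_target_directory --nodes movies9.csv --nodes actors9.csv --relationships roles9.csv --bad-tolerance 0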
There are different types of bad input, which we will look into.

Relationships referring to missing nodes
Relationships that refer to missing node ids, either for :START_ID or :END_ID, are considered bad
relationships. Whether or not such relationships are skipped is controlled with the --skip-bad-relationships
flag, which can have the value true or false, or no value, which means true. Specifying false means that
any bad relationship is considered an error and will fail the import. For more information, see the
--skip-bad-relationships option.
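For instance, a sketch of an import that treats every bad relationship as an error (using the files from this example) would be:
neo4j-import --into path_to_target_directory --nodes movies9.csv --nodes actors9.csv --relationships roles9.csv --skip-bad-relationships false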
In the following example there is a missing emil node referenced in the roles file.
movies9.csv
movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors9.csv
personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

roles9.csv
:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN
emil,"Emil",tt0133093,ACTED_IN

The command line arguments would remain the same as before:
neo4j-import --into path_to_target_directory --nodes movies9.csv --nodes actors9.csv --relationships roles9.csv

Since there was only one bad relationship the import process will complete successfully and a
not-imported.bad file will be created and populated with the bad relationships.
not-imported.bad
InputRelationship:
source: roles9.csv:11
properties: [role, Emil]
startNode: emil
endNode: tt0133093
type: ACTED_IN
refering to missing node emil

Multiple nodes with same id within same id space
Nodes that specify an :ID which has already been specified within the id space are considered bad nodes.
Whether or not such nodes are skipped is controlled with the --skip-duplicate-nodes flag, which can have
the value true or false, or no value, which means true. Specifying false means that any duplicate node
is considered an error and will fail the import. For more information, see the --skip-duplicate-nodes
option.
In the following example there is a node id that is specified twice within the same id space.
actors10.csv
personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor
laurence,"Laurence Harvey",Actor
neo4j-import --into path_to_target_directory --nodes actors10.csv --skip-duplicate-nodes

Since there was only one bad node the import process will complete successfully and a
not-imported.bad file will be created and populated with the bad node.
not-imported.bad
Id 'laurence' is defined more than once in global id space, at least at actors10.csv:3 and actors10.csv:5


Chapter 30. Web Interface
The Neo4j Web Interface is the primary user interface for Neo4j.
The tool is available at http://127.0.0.1:7474/ after you have installed the Neo4j Server.
See the tool itself for more information!


Chapter 31. Neo4j Shell
Neo4j shell is a command-line shell for running Cypher queries. There are also commands to get
information about the database. In addition, you can browse the graph much like you would use the
Unix shell, with commands like cd, ls and pwd, to browse your local file system.
It’s a nice tool for development and debugging. This guide will show you how to get it going!


31.1. Starting the shell
When used together with a Neo4j server, simply issue the following at the command line:
./bin/neo4j-shell

For help with running shell using Windows PowerShell, see Section 23.3, “Windows PowerShell
module” [442].
For the full list of options, see the reference in the Shell manual page.
To connect to a running Neo4j database, use the section called “Read-only mode” [547] for local
databases and see the section called “Enabling the shell server” [546] for remote databases.
You need to make sure that the shell jar file is on the classpath when you start up your Neo4j instance.

Enabling the shell server
Shell is enabled from the main configuration of Neo4j, see Section 24.8, “Configuration Settings
Reference” [464]. Here are the available settings:

Settings for the remote shell extension:
• remote_shell_enabled: Enable a remote shell server which Neo4j Shell clients can log in to.
• remote_shell_host: Remote host for shell.
• remote_shell_name: The name of the shell.
• remote_shell_port: The port the shell will listen on.
• remote_shell_read_only: Read only mode.

remote_shell_enabled
    Description: Enable a remote shell server which Neo4j Shell clients can log in to.
    Valid values: remote_shell_enabled is a boolean.
    Default value: false

remote_shell_host
    Description: Remote host for shell. By default, the shell server listens only on the loopback
    interface, but you can specify the IP address of any network interface or use 0.0.0.0 for all interfaces.
    Valid values: remote_shell_host is a string which must be a valid name.
    Default value: 127.0.0.1

remote_shell_name
    Description: The name of the shell.
    Valid values: remote_shell_name is a string which must be a valid name.
    Default value: shell

remote_shell_port
    Description: The port the shell will listen on.
    Valid values: remote_shell_port is an integer which must be a valid port number (in the range
    0 to 65535).
    Default value: 1337

remote_shell_read_only
    Description: Read only mode. Will only allow read operations.
    Valid values: remote_shell_read_only is a boolean.
    Default value: false
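For example, a minimal sketch of enabling the remote shell server in the main Neo4j configuration (assuming the default name, host and port are acceptable) could look like:
remote_shell_enabled=true
remote_shell_host=127.0.0.1
remote_shell_port=1337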

There are two ways to start the shell, either by connecting to a remote shell server or by pointing it to a
Neo4j store path.

Connecting to a shell server
To start the shell and connect to a running server, run:
neo4j-shell

Alternatively, supply the -port and -name options, depending on how the remote shell server was enabled.
Then you’ll get the shell prompt like this:
neo4j-sh (0)$

Pointing the shell to a path
To start the shell by just pointing it to a Neo4j store path you run the shell jar file. Given that the right
neo4j-kernel-<version>.jar and JTA jar files are in the same path as your neo4j-shell-<version>.jar file, you
run it with:
$ neo4j-shell -path path/to/neo4j-db

Read-only mode
By issuing the -readonly switch when starting the shell with a store path, changes cannot be made to
the database during the session.
$ neo4j-shell -readonly -path path/to/neo4j-db

Run a command and then exit
It is possible to tell the shell to just start, execute a command and then exit. This opens up for uses in
background jobs and for handling huge output from, for example, an ls command, where you can pipe
the output to less or another reader of your choice, or even to a file or to another neo4j-shell, e.g. for
importing a dump of another database or a Cypher result. When used in command mode the shell will
not output a welcome message. Some examples of usage:
$ neo4j-shell -c "cd -a 24 && set name Mattias"
$ neo4j-shell -c "trav -r KNOWS" | less

Pass Neo4j configuration options
By setting the -config switch, you can provide a properties file that will be used to configure your Neo4j
instance, if started in embedded mode.
$ neo4j-shell -config conf/neo4j.properties -path mydb

Execute a file and then exit
To execute commands from a file and then exit, just provide a -file filename. This is faster than piping
to the shell, which still handles the input as if it were user input. A typical use is reading a dump file
directly from the command line and executing it against the given database. For example:
$ neo4j-shell -file export.cql > result.txt

Supplying - as the filename reads from stdin instead.


31.2. Passing options and arguments
Passing options and arguments to your commands is very similar to many CLI commands in a *nix
environment. Options are prefixed with a -, and a single prefix can carry one or more option letters.
Some options expect a value to be associated with them. Arguments are string values which aren't
prefixed with -. Let's look at ls as an example:
ls -r -f KNOWS:out -v 12345 will make a verbose listing of node 12345's outgoing relationships of type
KNOWS. The node id, 12345, is an argument to ls which tells it to do the listing on that node instead of the
current node (see the pwd command). However, a shorter version of this can be written:
ls -rfv KNOWS:out 12345. Here all three options are written together after a single - prefix. Even though
f is in the middle it gets associated with the KNOWS:out value. The reason for this is that the ls command
doesn't expect any values associated with the r or v options, so it can infer the right values for the
right options.
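A minimal sketch from within the shell (assuming a node with id 12345 exists), showing that the long and grouped forms are equivalent:
neo4j-sh (0)$ ls -r -f KNOWS:out -v 12345
neo4j-sh (0)$ ls -rfv KNOWS:out 12345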


31.3. Enum options
Some options expect a value which is one of the values in an enum, for example the direction part of
relationship type filtering, where the values are INCOMING, OUTGOING and BOTH. All such values can be
supplied in a shorter form: it's enough to write the start of the value and the interpreter will find what
you really meant, for example out, in, i or even INCOMING.
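As a sketch, each of the following supplies the same OUTGOING value to the direction part of a relationship type filter:
neo4j-sh (0)$ ls -r -f KNOWS:OUTGOING
neo4j-sh (0)$ ls -r -f KNOWS:out
neo4j-sh (0)$ ls -r -f KNOWS:o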


31.4. Filters
Some commands make use of filters for varying purposes, for example -f in ls and in trav. A filter
is supplied as a JSON (http://www.json.org/) object (with or without the surrounding {} brackets). Both
keys and values can contain regular expressions for more flexible matching. An example of a filter could
be .*url.*:http.*neo4j.*,name:Neo4j. The filter option is also accompanied by the options -i and -l, which
stand for ignore case (ignore casing of the characters) and loose matching (it's considered a match even
if the filter value just matches a part of the compared value, not necessarily the entire value). So for a
case-insensitive, loose filter you can supply a filter with -f -i -l, or -fil for short.
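A minimal sketch (assuming the current node has a name property containing "Neo4j"): both invocations below apply the same case-insensitive, loose filter:
neo4j-sh (0)$ ls -f -i -l name:neo
neo4j-sh (0)$ ls -fil name:neo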


31.5. Node titles
To make it easier to navigate your graph the shell can display a title for each node, for example in ls -r. It
will display the relationships as well as the nodes on the other side of the relationships. The title is
displayed together with each node and its best suited property value from a list of property keys.
If you're standing on a node which has two KNOWS relationships to other nodes, it'd be difficult to know
which friend is which. The title feature addresses this by reading a list of property keys, grabbing the
first existing property value of those keys, and displaying it as a title for the node. So you may specify
a list (with or without regular expressions), for example name,title.*,caption, and the title for each node
will be the property value of the first existing key in that list. The list is defined by the client (you) using
the TITLE_KEYS environment variable (see the section called “Environment variables” [553]); the default
is .*name.*,.*title.*.
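As a sketch, preferring a caption property (a hypothetical key) before the default name and title patterns could be done with:
neo4j-sh (0)$ export TITLE_KEYS=caption,.*name.*,.*title.*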


31.6. How to use (individual commands)
The shell is modeled after Unix shells like bash that you use to walk around your local file system. It
has some of the same commands, like cd and ls. When you first start the shell (see instructions above),
you will get a list of all the available commands. Use man <command> to get more info about a particular
command. Some notes:

Comments
Single line comments, which will be ignored, can be made by using the prefix //. Example:
// This is a comment

Current node/relationship and path
You have a current node/relationship and a "current path" (like a current working directory in bash) that
you’ve traversed so far. When the shell first starts you are not positioned on any entity, but you can
cd your way through the graph (check your current path at any time with the pwd command). cd can be
used in different ways:
• cd <node-id> will traverse one relationship to the supplied node id. The node must have a direct
relationship to the current node.
• cd -a <node-id> will do an absolute path change, which means the supplied node doesn't have to
have a direct relationship to the current node.
• cd -r <relationship-id> will traverse to a relationship instead of a node. The relationship must have
the current node as either start or end point. To see the relationship ids use the ls -vr command on
nodes.
• cd -ar <relationship-id> will do an absolute path change which means the relationship can be any
relationship in the graph.
• cd .. will traverse back one step to the previous location, removing the last path item from your
current path (pwd).
• cd start (only if your current location is a relationship). Traverses to the start node of the relationship.
• cd end (only if your current location is a relationship). Traverses to the end node of the relationship.

Listing the contents of a node/relationship
List contents of the current node/relationship (or any other node) with the ls command. Please note
that it will give an empty output if the current node/relationship has no properties or relationships (for
example in the case of a brand new graph). ls can take a node id as argument as well as filters, see
Section 31.4, “Filters” [550] and for information about how to specify direction see Section 31.3, “Enum
options” [549]. Use man ls for more info.

Creating nodes and relationships
You create new nodes by connecting them with relationships to the current node. For example, mkrel
-t A_RELATIONSHIP_TYPE -d OUTGOING -c will create a new node (-c) and draw to it an OUTGOING relationship
of type A_RELATIONSHIP_TYPE from the current node. If you already have two nodes which you'd like
to draw a relationship between (without creating a new node) you can do, for example, mkrel -t
A_RELATIONSHIP_TYPE -d OUTGOING -n <other-node-id> and it will just create a new relationship between
the current node and that other node.

Setting, renaming and removing properties
Property operations are done with the set, mv and rm commands. These commands operate on the
current node/relationship.
Use set <key> <value>, optionally with the -t option (for value type), to set a property. It supports every
type of value that Neo4j supports. An example of setting a property of type int:
$ set -t int age 29

And an example of setting a double[] property:
$ set -t double[] my_values [1.4,12.2,13]

Example of setting a String property containing a JSON string:
mkrel -c -d i -t DOMAIN_OF --np "{'app':'foobar'}"

• rm <key> removes a property.
• mv <key> <new-key> renames a property from one key to another.

Deleting nodes and relationships
Deletion of nodes and relationships is done with the rmnode and rmrel commands. rmnode deletes
nodes; if the node to be deleted still has relationships, they can also be deleted by supplying the -f
option. rmrel deletes relationships; it tries to ensure connectedness in the graph, but relationships can
be deleted regardless with the -f option. rmrel can also delete the node on the other side of the deleted
relationship if it's left with no more relationships, see the -d option.

Environment variables
The shell uses environment variables à la bash to keep session information, such as the current path
and more. The commands for this mimic the bash commands export and env. For example, you can at
any time issue an export STACKTRACES=true command to set the STACKTRACES environment variable to true.
This will then result in stacktraces being printed if an exception or error should occur. Allowed values
are all parseable JSON strings, so maps {age:10,name:"Mattias"} and arrays [1,2,3] are also supported.
Variables can also be assigned to each other. E.g. a=b will result in a containing the value of b.
This becomes especially interesting as all shell variables are automatically passed to Cypher statements
as parameters. That makes it easy to query for certain start nodes or create nodes and relationships
with certain provided properties (as maps).
Values are removed by setting them to null or an empty value. List environment variables using env.
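A minimal sketch (assuming a node with id 0 exists), setting a variable and then using it as a Cypher parameter:
neo4j-sh (?)$ export me=0
neo4j-sh (?)$ start n=node({me}) return n;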

Executing groovy/python scripts

The shell has support for executing scripts, such as Groovy (http://groovy.codehaus.org) and Python
(http://www.python.org), the latter via Jython (http://www.jython.org). As of now the scripts (*.groovy,
*.py) must exist on the server side and are called from a client with, for example, gsh
--renamePerson 1234 "Mathias" "Mattias" --doSomethingElse, where the scripts renamePerson.groovy
and doSomethingElse.groovy must exist on the server side in any of the paths given by the GSH_PATH
environment variable (defaults to .:src:src/script). This variable is like the Java classpath, separated by a
:. The python/jython scripts can be executed with jsh in a similar fashion, however the scripts have
the .py extension and the environment variable for the paths is JSH_PATH.
When writing the scripts, assume that an args variable (a String[]) is made available, containing the
supplied arguments. In the case of the renamePerson example above the array would contain ["1234",
"Mathias", "Mattias"]. Also please write your output to the out variable, such as
out.println( "My tracing text" ), so that it will be printed at the shell client instead of the server.

Traverse
You can traverse the graph with the trav command which allows for simple traversing from the current
node. You can supply which relationship types (with regex matching) and optionally direction as well
as property filters for matching nodes. In addition to that you can supply a command line to execute
for each match. An example: trav -o depth -r KNOWS:both,HAS_.*:incoming -c "ls $n". This means:
traverse depth first for relationships of type KNOWS disregarding direction and incoming relationships
of type matching HAS_.*, and do an ls <matching node> for each match. The node filtering is supplied
with the -f option, see Section 31.4, “Filters” [550]. See Section 31.3, “Enum options” [549] for the
traversal order option. Even relationship types/directions are supplied using the same format as filters.

Query with Cypher
You can use Cypher to query the graph. For that, use the match or start command. You can also use
create statements to create nodes and relationships and use the cypher VERSION prefix to select a
certain cypher version.

Tip

Cypher queries need to be terminated by a semicolon ;.

Cypher commands are given all shell variables as parameters and the special self parameter for the
current node or relationship.
• start n = node(0) return n; will give you a listing of the node with ID 0
• cypher 1.9 start n = node(0) return n; will execute the query with Cypher version 1.9
• START n = node({self}) MATCH (n)-[:KNOWS]->(friend) RETURN friend; will return the nodes connected
to the current node.
• START n=node({me}) CREATE (me)-[r:KNOWS]->(friend {props}); will create the friend and the
relationship according to the variables available.

Listing Indexes and Constraints
The schema command allows you to list all existing indexes and constraints together with their current
status.

Note

This command does not list legacy indexes. For working with legacy indexes, please see the
section called “Legacy Indexing” [554].
List all indexes and constraints:
schema

List indexes or constraints on :Person nodes for the property name:
schema -l :Person -p name

The schema command supports the following parameters:
• -l :Label only list indexes or constraints for the given label :Label
• -p propertyKey only list indexes or constraints for the given property key propertyKey
• -v if an index is in the FAILED state, print a verbose error cause if available
Indexes and constraints can be created or removed using Cypher or the Java Core API. They are
updated automatically whenever the graph is changed. See the section called “Schema” [9] for more
information.

Legacy Indexing
It’s possible to query and manipulate legacy indexes via the index command.
Example: index -i persons name (will index the name for the current node or relationship in the
"persons" legacy index).
• -g will do exact lookup in the legacy index and display hits. You can supply -c with a command to be
executed for each hit.
• -q will ask the legacy index a query and display hits. You can supply -c with a command to be
executed for each hit.
• --cd will change current location to the hit from the query. It’s just a convenience for using the -c
option.
• --ls will do a listing of the contents for each hit. It’s just a convenience for using the -c option.
• -i will index a key-value pair into a legacy index for the current node/relationship. If no value is given
the property value for that key for the current node is used as value.
• -r will remove a key-value pair (if it exists) from a legacy index for the current node/relationship. Key
and value are optional.
• -t will set the legacy index type to work with, for example index -t Relationship --delete friends will
delete the friends relationship index.
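As a sketch (assuming a node indexed under the name "Keanu Reeves" in the persons legacy index), an exact lookup listing the contents of each hit could look like:
neo4j-sh (0)$ index -g persons name "Keanu Reeves" --ls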

Transactions
It is useful to be able to test changes and then either commit or roll back said changes.
Transactions can be nested. With a nested transaction, a commit does not write any changes to disk,
except for the top level transaction. A rollback, however, works regardless of the level of the transaction:
it will roll back all open transactions.
• begin transaction Starts a transaction.
• commit Commits a transaction.
• rollback Rolls back all open transactions.
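A minimal sketch of trying out a change and then discarding it (assuming the current node is node 0):
neo4j-sh (0)$ begin transaction
neo4j-sh (0)$ set name "Test"
neo4j-sh (Test,0)$ rollback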

Dumping the database or Cypher statement results
Experimental feature

The dump command has incomplete functionality. It might not work for your use case or data
size.
As a simple way of exporting a database or a subset of it, the dump command converts the graph of a
Cypher result or the whole database into a single Cypher create statement.
Examples:
• dump dumps the whole database as a single Cypher create statement
• dump START n=node({self}) MATCH p=(n)-[r:KNOWS*]->(m) RETURN n,r,m; dumps the transitive friendship
graph of the current node.
• neo4j-shell -path db1 -c 'dump MATCH p=(n:Person {name:"Mattias"})-[r:KNOWS]->(m) RETURN p;' |
neo4j-shell -path db2 -file - imports the subgraph of the first database (db1) into the second (db2)

Example Dump Scripts
 create a new node and go to it
neo4j-sh (?)$ mknode --cd --np "{'name':'Neo'}"
 create a relationship
neo4j-sh (Neo,0)$ mkrel -c -d i -t LIKES --np "{'app':'foobar'}"
 Export the cypher statement results
neo4j-sh (Neo,0)$ dump MATCH (n)-[r]-(m) WHERE n = {self} return n,r,m;
begin
create (_0 {`name`:"Neo"})
create (_1 {`app`:"foobar"})
create _1-[:`LIKES`]->_0
;
commit
 create an index

neo4j-sh (?)$ create index on :Person(name);
+-------------------+
| No data returned. |
+-------------------+
Indexes added: 1
35 ms

 create one labeled node and a relationship
neo4j-sh (?)$ create (m:Person:Hacker {name:'Mattias'}), (m)-[:KNOWS]->(m);
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1
Relationships created: 1
Properties set: 1
Labels added: 2
218 ms

 Export the whole database including indexes
neo4j-sh (?)$ dump
begin
create index on :`Person`(`name`)
create (_0:`Person`:`Hacker` {`name`:"Mattias"})
create _0-[:`KNOWS`]->_0
;
commit


31.7. An example shell session
 Create a node
neo4j-sh (?)$ mknode --cd
 where are we?
neo4j-sh (0)$ pwd
Current is (0)
(0)

 On the current node, set the key "name" to value "Jon"
neo4j-sh (0)$ set name "Jon"
 send a cypher query
neo4j-sh (Jon,0)$ match n where id(n) = 0 return n;
+---------------------+
| n                   |
+---------------------+
| Node[0]{name:"Jon"} |
+---------------------+
1 row
386 ms

 make an incoming relationship of type LIKES, create the end node with the node properties specified.
neo4j-sh (Jon,0)$ mkrel -c -d i -t LIKES --np "{'app':'foobar'}"
 where are we?
neo4j-sh (Jon,0)$ ls
*name =[Jon]
(me)<-[:LIKES]-(1)

 change to the newly created node
neo4j-sh (Jon,0)$ cd 1
 list relationships, including relationship id
neo4j-sh (1)$ ls -avr
(me)-[:LIKES,0]->(Jon,0)

 create one more KNOWS relationship and the end node
neo4j-sh (1)$ mkrel -c -d i -t KNOWS --np "{'name':'Bob'}"
 print current history stack
neo4j-sh (1)$ pwd
Current is (1)
(Jon,0)-->(1)

 verbose list relationships
neo4j-sh (1)$ ls -avr
(me)-[:LIKES,0]->(Jon,0)
(me)<-[:KNOWS,1]-(Bob,2)


31.8. A Matrix example
This example creates a graph of the characters in the Matrix via the shell and then executes Cypher
queries against it:
Figure 31.1. Shell Matrix Example
[Figure: a graph in which a ROOT relationship points from the reference node to 'Thomas Andersson',
who KNOWS (age = 3) 'Morpheus' and KNOWS 'Trinity'; 'Morpheus' KNOWS (age = 90) 'Trinity' and
KNOWS 'Cypher'; 'Cypher' KNOWS 'Agent Smith'; 'Agent Smith' is CODED_BY 'The Architect'.]
The following is a sample shell session creating the Matrix graph and querying it.
 Create a reference node
neo4j-sh (?)$ mknode --cd
 create the Thomas Andersson node
neo4j-sh (0)$ mkrel -t ROOT -c -v
Node (1) created
Relationship [:ROOT,0] created

 go to the new node
neo4j-sh (0)$ cd 1

 set the name property
neo4j-sh (1)$ set name "Thomas Andersson"
 create Thomas direct friends
neo4j-sh (Thomas Andersson,1)$ mkrel -t KNOWS -cv
Node (2) created
Relationship [:KNOWS,1] created

 go to the new node
neo4j-sh (Thomas Andersson,1)$ cd 2
 set the name property
neo4j-sh (2)$ set name "Trinity"
 go back in the history stack
neo4j-sh (Trinity,2)$ cd ..
 create Thomas direct friends
neo4j-sh (Thomas Andersson,1)$ mkrel -t KNOWS -cv
Node (3) created
Relationship [:KNOWS,2] created

 go to the new node
neo4j-sh (Thomas Andersson,1)$ cd 3
 set the name property
neo4j-sh (3)$ set name "Morpheus"
 create relationship to Trinity
neo4j-sh (Morpheus,3)$ mkrel -t KNOWS 2
 list the relationships of node 3
neo4j-sh (Morpheus,3)$ ls -rv
(me)-[:KNOWS,3]->(Trinity,2)
(me)<-[:KNOWS,2]-(Thomas Andersson,1)

 change the current position to relationship 2
neo4j-sh (Morpheus,3)$ cd -r 2
 set the age property on the relationship
neo4j-sh [:KNOWS,2]$ set -t int age 3
 back to Morpheus
neo4j-sh [:KNOWS,2]$ cd ..
 next relationship
neo4j-sh (Morpheus,3)$ cd -r 3
 set the age property on the relationship
neo4j-sh [:KNOWS,3]$ set -t int age 90
 position to the start node of the current relationship
neo4j-sh [:KNOWS,3]$ cd start
 new node
neo4j-sh (Morpheus,3)$ mkrel -t KNOWS -c
 list relationships on the current node
neo4j-sh (Morpheus,3)$ ls -r
(me)-[:KNOWS]->(4)
(me)-[:KNOWS]->(Trinity,2)
(me)<-[:KNOWS]-(Thomas Andersson,1)


 go to Cypher
neo4j-sh (Morpheus,3)$ cd 4
 set the name
neo4j-sh (4)$ set name Cypher
 create new node from Cypher
neo4j-sh (Cypher,4)$ mkrel -ct KNOWS
 list relationships
neo4j-sh (Cypher,4)$ ls -r
(me)-[:KNOWS]->(5)
(me)<-[:KNOWS]-(Morpheus,3)

 go to the Agent Smith node
neo4j-sh (Cypher,4)$ cd 5
 set the name
neo4j-sh (5)$ set name "Agent Smith"
 outgoing relationship and new node
neo4j-sh (Agent Smith,5)$ mkrel -cvt CODED_BY
Node (6) created
Relationship [:CODED_BY,6] created

 go there
neo4j-sh (Agent Smith,5)$ cd 6
 set the name
neo4j-sh (6)$ set name "The Architect"
 go to the first node in the history stack
neo4j-sh (The Architect,6)$ cd
 Morpheus' friends, looking up Morpheus by name in the Neo4j autoindex
neo4j-sh (?)$ start morpheus = node:node_auto_index(name='Morpheus') match morpheus-[:KNOWS]-zionist return zionist.name;
+--------------------+
| zionist.name       |
+--------------------+
| "Cypher"           |
| "Trinity"          |
| "Thomas Andersson" |
+--------------------+
3 rows
129 ms

 Morpheus' friends, looking up Morpheus by name in the Neo4j autoindex
neo4j-sh (?)$ cypher 2.2 start morpheus = node:node_auto_index(name='Morpheus') match morpheus-[:KNOWS]-zionist return
zionist.name;
+--------------------+
| zionist.name       |
+--------------------+
| "Cypher"           |
| "Trinity"          |
| "Thomas Andersson" |
+--------------------+
3 rows
997 ms


Part VII. Advanced Usage

This part contains information on advanced usage of Neo4j. Among the topics covered are embedding
Neo4j in your own software and writing plugins for the Neo4j Server.
You might want to keep the Neo4j JavaDocs (http://neo4j.com/docs/2.3.12/javadocs/) handy while reading!

32. Extending the Neo4j Server ............................................................................................................. 563
32.1. Server Plugins ........................................................................................................................ 564
32.2. Unmanaged Extensions ........................................................................................................ 568
32.3. Testing your extension .......................................................................................................... 573
32.4. Installing Plugins and Extensions in Neo4j Desktop ............................................................ 575
33. Using Neo4j embedded in Java applications ................................................................................... 576
33.1. Include Neo4j in your project ............................................................................................... 577
33.2. Hello World ............................................................................................................................ 581
33.3. Property values ..................................................................................................................... 584
33.4. User database with indexes ................................................................................................. 585
33.5. User database with legacy index .......................................................................................... 588
33.6. Managing resources when using long running transactions ............................................... 589
33.7. Controlling logging ................................................................................................................ 590
33.8. Basic unit testing ................................................................................................................... 591
33.9. Traversal ................................................................................................................................ 593
33.10. Domain entities ................................................................................................................... 601
33.11. Graph Algorithm examples ................................................................................................. 602
33.12. Reading a management attribute ....................................................................................... 604
33.13. How to create unique nodes .............................................................................................. 605
33.14. Terminating a running transaction ..................................................................................... 607
33.15. Execute Cypher Queries from Java ..................................................................................... 609
33.16. Query Parameters ............................................................................................................... 611
34. The Traversal Framework ................................................................................................................ 613
34.1. Main concepts ....................................................................................................................... 614
34.2. Traversal Framework Java API ............................................................................................... 615
35. Legacy Indexing ................................................................................................................................ 621
35.1. Introduction ........................................................................................................................... 622
35.2. Create .................................................................................................................................... 623
35.3. Delete .................................................................................................................................... 624
35.4. Add ........................................................................................................................................ 625
35.5. Remove .................................................................................................................................. 627
35.6. Update ................................................................................................................................... 628
35.7. Search .................................................................................................................................... 629
35.8. Relationship indexes ............................................................................................................. 631
35.9. Scores .................................................................................................................................... 632
35.10. Configuration and fulltext indexes ..................................................................................... 633
35.11. Extra features for Lucene indexes ...................................................................................... 634
36. Batch Insertion ................................................................................................................................. 636
36.1. Batch Inserter Examples ....................................................................................................... 637
36.2. Index Batch Insertion ............................................................................................................ 639


Chapter 32. Extending the Neo4j Server
The Neo4j Server can be extended by either plugins or unmanaged extensions.


32.1. Server Plugins
Quick info
• The server’s functionality can be extended by adding plugins.
• Plugins are user-specified code which extend the capabilities of the database, nodes, or
relationships.
• The Neo4j server will then advertise the plugin functionality within representations as clients
interact via HTTP.
Plugins provide an easy way to extend the Neo4j REST API with new functionality, without the need to
invent your own API. Think of plugins as server-side scripts that can add functions for retrieving and
manipulating nodes, relationships, paths, properties or indices.

Tip

If you want to have full control over your API, and are willing to put in the effort, and
understand the risks, then Neo4j server also provides hooks for unmanaged extensions
based on JAX-RS.
The needed classes reside in the org.neo4j:server-api jar file (http://search.maven.org/#search|gav|1|g%3A%22org.neo4j%22%20AND%20a%3A%22server-api%22);
see that page for downloads and instructions on how to include it using dependency management. For
Maven projects, add the Server API dependencies in your pom.xml like this:

<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>server-api</artifactId>
    <version>2.3.12</version>
</dependency>

To create a plugin, your code must inherit from the ServerPlugin class
(http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/server/plugins/ServerPlugin.html). Your plugin should also:
• ensure that it can produce an (Iterable of) Node, Relationship or Path, any Java primitive or String, or an
instance of org.neo4j.server.rest.repr.Representation
• specify parameters,
• specify a point of extension and of course
• contain the application logic.
• make sure that the discovery point type in the @PluginTarget and the @Source parameter are of the
same type.

Note

If your plugin class has any constructors defined it must also have a no-arguments
constructor defined.
An example of a plugin which augments the database (as opposed to nodes or relationships) follows:
Get all nodes or relationships plugin
@Description( "An extension to the Neo4j Server for getting all nodes or relationships" )
public class GetAll extends ServerPlugin
{
@Name( "get_all_nodes" )
@Description( "Get all nodes from the Neo4j graph database" )
@PluginTarget( GraphDatabaseService.class )
public Iterable getAllNodes( @Source GraphDatabaseService graphDb )
1
2

http://search.maven.org/#search|gav|1|g%3A%22org.neo4j%22%20AND%20a%3A%22server-api%22
http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/server/plugins/ServerPlugin.html

564

Extending the Neo4j Server
{
ArrayList nodes = new ArrayList<>();
try (Transaction tx = graphDb.beginTx())
{
for ( Node node : GlobalGraphOperations.at( graphDb ).getAllNodes() )
{
nodes.add( node );
}
tx.success();
}
return nodes;
}
@Description( "Get all relationships from the Neo4j graph database" )
@PluginTarget( GraphDatabaseService.class )
public Iterable getAllRelationships( @Source GraphDatabaseService graphDb )
{
List rels = new ArrayList<>();
try (Transaction tx = graphDb.beginTx())
{
for ( Relationship rel : GlobalGraphOperations.at( graphDb ).getAllRelationships() )
{
rels.add( rel );
}
tx.success();
}
return rels;
}
}
The full source code is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/plugins/GetAll.java

Find the shortest path between two nodes plugin
public class ShortestPath extends ServerPlugin
{
    @Description( "Find the shortest path between two nodes." )
    @PluginTarget( Node.class )
    public Iterable<Path> shortestPath(
            @Source Node source,
            @Description( "The node to find the shortest path to." )
            @Parameter( name = "target" ) Node target,
            @Description( "The relationship types to follow when searching for the shortest path(s). " +
                    "Order is insignificant, if omitted all types are followed." )
            @Parameter( name = "types", optional = true ) String[] types,
            @Description( "The maximum path length to search for, default value (if omitted) is 4." )
            @Parameter( name = "depth", optional = true ) Integer depth )
    {
        PathExpander expander;
        List<Path> paths = new ArrayList<>();
        if ( types == null )
        {
            expander = PathExpanders.allTypesAndDirections();
        }
        else
        {
            PathExpanderBuilder expanderBuilder = PathExpanderBuilder.empty();
            for ( int i = 0; i < types.length; i++ )
            {
                expanderBuilder = expanderBuilder.add( DynamicRelationshipType.withName( types[i] ) );
            }
            expander = expanderBuilder.build();
        }
        try ( Transaction tx = source.getGraphDatabase().beginTx() )
        {
            PathFinder<Path> shortestPath = GraphAlgoFactory.shortestPath( expander,
                    depth == null ? 4 : depth.intValue() );
            for ( Path path : shortestPath.findAllPaths( source, target ) )
            {
                paths.add( path );
            }
            tx.success();
        }
        return paths;
    }
}
The full source code is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/plugins/ShortestPath.java

To deploy the code, simply compile it into a .jar file and place it onto the server classpath (which by
convention is the plugins directory under the Neo4j server home directory).

Caution

Whilst Neo4j is tested to run on JVM 8, Neo4j server is currently compiled using JDK 7, to
ensure compatibility for JVM 7 deployments. When compiling plugins for Neo4j Server, we
strongly recommend using JDK 7 also.

Tip

Make sure the directory listings are retained in the jar file by either building with default
Maven, or with jar -cvf myext.jar *, making sure to jar directories instead of specifying
single files.
The .jar file must include the file META-INF/services/org.neo4j.server.plugins.ServerPlugin with the fully
qualified name of the implementation class. This is an example with multiple entries, each on a
separate line:
org.neo4j.examples.server.plugins.DepthTwo
org.neo4j.examples.server.plugins.GetAll
org.neo4j.examples.server.plugins.ShortestPath

The code above makes an extension visible in the database representation (via the @PluginTarget
annotation) whenever it is served from the Neo4j Server. Simply changing the @PluginTarget parameter
to Node.class or Relationship.class allows us to target those parts of the data model should we wish.
The functionality extensions provided by the plugin are automatically advertised in representations on
the wire. For example, clients can discover the extension implemented by the above plugin easily by
examining the representations they receive as responses from the server, e.g. by performing a GET on
the default database URI:
curl -v http://localhost:7474/db/data/

The response to the GET request will contain (by default) a JSON container that itself contains a
container called "extensions" where the available plugins are listed. In the following case, we only have
the GetAll plugin registered with the server, so only its extension functionality is available. Extension
names will be automatically assigned, based on method names, if not specifically specified using the
@Name annotation.
{
  "extensions-info" : "http://localhost:7474/db/data/ext",
  "node" : "http://localhost:7474/db/data/node",
  "node_index" : "http://localhost:7474/db/data/index/node",
  "relationship_index" : "http://localhost:7474/db/data/index/relationship",
  "reference_node" : "http://localhost:7474/db/data/node/0",
  "extensions_info" : "http://localhost:7474/db/data/ext",
  "extensions" : {
    "GetAll" : {
      "get_all_nodes" : "http://localhost:7474/db/data/ext/GetAll/graphdb/get_all_nodes",
      "get_all_relationships" : "http://localhost:7474/db/data/ext/GetAll/graphdb/getAllRelationships"
    }
  }
}

Performing a GET on one of the two extension URIs gives back the meta information about the service:
curl http://localhost:7474/db/data/ext/GetAll/graphdb/get_all_nodes
{
"extends" : "graphdb",
"description" : "Get all nodes from the Neo4j graph database",
"name" : "get_all_nodes",
"parameters" : [ ]
}

To use it, just POST to this URL, with parameters as specified in the description and encoded as JSON
data content. For example, for calling the shortest path extension (the URI is obtained from a GET to
http://localhost:7474/db/data/node/123):
curl -X POST http://localhost:7474/db/data/ext/ShortestPath/node/123/shortestPath \
-H "Content-Type: application/json" \
-d '{"target":"http://localhost:7474/db/data/node/456", "depth":"5"}'

If everything is OK, a response code 200 and a list of zero or more items will be returned. If nothing is
returned (null returned from the extension), an empty result and response code 204 will be returned. If
the extension throws an exception, response code 500 and a detailed error message are returned.
Extensions that do any kind of database operation will have to manage their own transactions, i.e.
transactions aren’t managed automatically. Note that the results of traversals or execution of graph
algorithms should be exhausted inside the transaction before returning the result.
Through this model, any plugin can naturally fit into the general hypermedia scheme that Neo4j
espouses — meaning that clients can still take advantage of abstractions like Nodes, Relationships and
Paths with a straightforward upgrade path as servers are enriched with plugins (old clients don’t break).


32.2. Unmanaged Extensions
Sometimes you’ll want finer grained control over your application’s interactions with Neo4j than cypher
provides. For these situations you can use the unmanaged extension API.

Caution

This is a sharp tool, allowing users to deploy arbitrary JAX-RS (http://en.wikipedia.org/wiki/JAX-RS)
classes to the server, so be careful when using this. In particular it's easy to consume lots of heap
space on the server and degrade performance. If in doubt, please ask for help via one of the community
channels (see Preface [v]).

Introduction to unmanaged extensions
The first step when writing an unmanaged extension is to create a project which includes dependencies
to the JAX-RS and Neo4j core jars. In Maven this would be achieved by adding the following lines to the
pom file:

<dependency>
    <groupId>javax.ws.rs</groupId>
    <artifactId>javax.ws.rs-api</artifactId>
    <version>2.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j</artifactId>
    <version>2.3.12</version>
    <scope>provided</scope>
</dependency>


Now we’re ready to write our extension.
In our code we’ll interact with the database using GraphDatabaseService which we can get access to by
using the @Context annotation. The following examples serves as a template which you can base your
extension on:
Unmanaged extension example
@Path( "/helloworld" )
public class HelloWorldResource
{
private final GraphDatabaseService database;
public HelloWorldResource( @Context GraphDatabaseService database )
{
this.database = database;
}
@GET
@Produces( MediaType.TEXT_PLAIN )
@Path( "/{nodeId}" )
public Response hello( @PathParam( "nodeId" ) long nodeId )
{
// Do stuff with the database
return Response.status( Status.OK ).entity(
("Hello World, nodeId=" + nodeId).getBytes( Charset.forName("UTF-8") ) ).build();
}
}

The full source code is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/unmanaged/HelloWorldResource.java

Having built your code, the resulting jar file (and any custom dependencies) should be placed in the
$NEO4J_SERVER_HOME/plugins directory. We also need to tell Neo4j where to look for the extension by
adding some configuration in the conf/neo4j-server.properties file:
# Comma separated list of JAXRS packages containing JAXRS Resource, one package name for each mountpoint.
org.neo4j.server.thirdparty_jaxrs_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

Our hello method will now respond to GET requests at the URI: http://{neo4j_server}:{neo4j_port}/
examples/unmanaged/helloworld/{nodeId}. e.g.
curl http://localhost:7474/examples/unmanaged/helloworld/123

which results in
Hello World, nodeId=123

Caution

Whilst Neo4j is tested to run on JVM 8, Neo4j server is currently compiled using JDK 7, to
ensure compatibility for JVM 7 deployments. When compiling plugins for Neo4j Server, we
strongly recommend using JDK 7 also.

Streaming JSON responses
When writing unmanaged extensions we have greater control over the amount of memory that
our Neo4j queries use. If we keep too much state around it can lead to more frequent full Garbage
Collection and subsequent unresponsiveness by the Neo4j server.
A common way that state can creep in is the creation of JSON objects to represent the result of a
query which we then send back to our application. Neo4j’s Transactional Cypher HTTP endpoint (see
Section 21.1, “Transactional Cypher HTTP endpoint” [298]) streams responses back to the client and we
should follow in its footsteps.
For example, the following unmanaged extension streams an array of a person’s colleagues:
Unmanaged extension streaming example
@Path("/colleagues")
public class ColleaguesResource
{
private GraphDatabaseService graphDb;
private final ObjectMapper objectMapper;
private static final DynamicRelationshipType ACTED_IN = DynamicRelationshipType.withName( "ACTED_IN" );
private static final Label PERSON = DynamicLabel.label( "Person" );
public ColleaguesResource( @Context GraphDatabaseService graphDb )
{
this.graphDb = graphDb;
this.objectMapper = new ObjectMapper();
}
@GET
@Path("/{personName}")
public Response findColleagues( final @PathParam("personName") String personName )
{
StreamingOutput stream = new StreamingOutput()
{
@Override
public void write( OutputStream os ) throws IOException, WebApplicationException
{
JsonGenerator jg = objectMapper.getJsonFactory().createJsonGenerator( os, JsonEncoding.UTF8 );
6

https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/
unmanaged/HelloWorldResource.java

569

Extending the Neo4j Server
jg.writeStartObject();
jg.writeFieldName( "colleagues" );
jg.writeStartArray();
try ( Transaction tx = graphDb.beginTx();
ResourceIterator persons = graphDb.findNodes( PERSON, "name", personName ) )
{
while ( persons.hasNext() )
{
Node person = persons.next();
for ( Relationship actedIn : person.getRelationships( ACTED_IN, OUTGOING ) )
{
Node endNode = actedIn.getEndNode();
for ( Relationship colleagueActedIn : endNode.getRelationships( ACTED_IN, INCOMING ) )
{
Node colleague = colleagueActedIn.getStartNode();
if ( !colleague.equals( person ) )
{
jg.writeString( colleague.getProperty( "name" ).toString() );
}
}
}
}
tx.success();
}
jg.writeEndArray();
jg.writeEndObject();
jg.flush();
jg.close();
}
};
return Response.ok().entity( stream ).type( MediaType.APPLICATION_JSON ).build();
}
}
The full source code is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/unmanaged/ColleaguesResource.java

As well as depending on JAX-RS API this example also uses Jackson — a Java JSON library. You’ll need to
add the following dependency to your Maven POM file (or equivalent):

<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>1.9.7</version>
</dependency>


Our findColleagues method will now respond to GET requests at the URI: http://{neo4j_server}:
{neo4j_port}/examples/unmanaged/colleagues/{personName}. For example:
curl http://localhost:7474/examples/unmanaged/colleagues/Keanu%20Reeves

which results in
{"colleagues":["Hugo Weaving","Carrie-Anne Moss","Laurence Fishburne"]}

Using Cypher in an unmanaged extension
You can execute Cypher queries by using the GraphDatabaseService that is injected into the extension.
7

https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/
unmanaged/ColleaguesResource.java

570

Extending the Neo4j Server

Note

In Neo4j versions prior to 2.2 you had to retrieve an ExecutionEngine to execute queries.
This has been deprecated, and we recommend you to update any existing code to use
GraphDatabaseService instead.
For example, the following unmanaged extension retrieves a person’s colleagues using Cypher:
Unmanaged extension Cypher execution example
@Path("/colleagues-cypher-execution")
public class ColleaguesCypherExecutionResource
{
private final ObjectMapper objectMapper;
private GraphDatabaseService graphDb;
public ColleaguesCypherExecutionResource( @Context GraphDatabaseService graphDb )
{
this.graphDb = graphDb;
this.objectMapper = new ObjectMapper();
}
@GET
@Path("/{personName}")
public Response findColleagues( final @PathParam("personName") String personName )
{
final Map params = MapUtil.map( "personName", personName );
StreamingOutput stream = new StreamingOutput()
{
@Override
public void write( OutputStream os ) throws IOException, WebApplicationException
{
JsonGenerator jg = objectMapper.getJsonFactory().createJsonGenerator( os, JsonEncoding.UTF8 );
jg.writeStartObject();
jg.writeFieldName( "colleagues" );
jg.writeStartArray();
try ( Transaction tx = graphDb.beginTx();
Result result = graphDb.execute( colleaguesQuery(), params ) )
{
while ( result.hasNext() )
{
Map row = result.next();
jg.writeString( ((Node) row.get( "colleague" )).getProperty( "name" ).toString() );
}
tx.success();
}
jg.writeEndArray();
jg.writeEndObject();
jg.flush();
jg.close();
}
};
return Response.ok().entity( stream ).type( MediaType.APPLICATION_JSON ).build();
}
private String colleaguesQuery()
{
return "MATCH (p:Person {name: {personName} })-[:ACTED_IN]->()<-[:ACTED_IN]-(colleague) RETURN colleague";
}
8
}

The full source code is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/server-examples/src/main/java/org/neo4j/examples/server/unmanaged/ColleaguesCypherExecutionResource.java

Our findColleagues method will now respond to GET requests at the URI: http://{neo4j_server}:
{neo4j_port}/examples/unmanaged/colleagues-cypher-execution/{personName}. For example:
curl http://localhost:7474/examples/unmanaged/colleagues-cypher-execution/Keanu%20Reeves

which results in
{"colleagues":["Hugo Weaving","Carrie-Anne Moss","Laurence Fishburne"]}


32.3. Testing your extension
Neo4j provides tools to help you write integration tests for your extensions. You can access this toolkit
by adding the following test dependency to your project:

<dependency>
  <groupId>org.neo4j.test</groupId>
  <artifactId>neo4j-harness</artifactId>
  <version>2.3.12</version>
  <scope>test</scope>
</dependency>


The test toolkit provides a mechanism to start a Neo4j instance with custom configuration and with
extensions of your choice. It also provides mechanisms to specify data fixtures to include when starting
Neo4j.
Usage example
@Path("")
public static class MyUnmanagedExtension
{
@GET
public Response myEndpoint()
{
return Response.ok().build();
}
}
@Test
public void testMyExtension() throws Exception
{
// Given
try ( ServerControls server = TestServerBuilders.newInProcessBuilder()
.withExtension( "/myExtension", MyUnmanagedExtension.class )
.newServer() )
{
// When
HTTP.Response response = HTTP.GET( server.httpURI().resolve( "myExtension" ).toString() );
// Then
assertEquals( 200, response.status() );
}
}
@Test
public void testMyExtensionWithFunctionFixture() throws Exception
{
// Given
try ( ServerControls server = TestServerBuilders.newInProcessBuilder()
.withExtension( "/myExtension", MyUnmanagedExtension.class )
.withFixture( new Function<GraphDatabaseService, Void>()
{
@Override
public Void apply( GraphDatabaseService graphDatabaseService ) throws RuntimeException
{
try ( Transaction tx = graphDatabaseService.beginTx() )
{
graphDatabaseService.createNode( DynamicLabel.label( "User" ) );
tx.success();
}
return null;
}
} )
.newServer() )

{
// When
Result result = server.graph().execute( "MATCH (n:User) return n" );
// Then
assertEquals( 1, IteratorUtil.count( result ) );
}
}

The full source code of the example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/manual/neo4j-harness-test/src/test/java/org/neo4j/harness/doc/ExtensionTestingDocTest.java

Note the use of server.httpURI().resolve( "myExtension" ) to ensure that the correct base URI is used.
If you are using the JUnit test framework, there is a JUnit rule available as well.
JUnit example
@Rule
public Neo4jRule neo4j = new Neo4jRule()
.withFixture( "CREATE (admin:Admin)" )
.withFixture( new Function<GraphDatabaseService, Void>()
{
@Override
public Void apply( GraphDatabaseService graphDatabaseService ) throws RuntimeException
{
try (Transaction tx = graphDatabaseService.beginTx())
{
graphDatabaseService.createNode( DynamicLabel.label( "Admin" ));
tx.success();
}
return null;
}
} );
@Test
public void shouldWorkWithServer() throws Exception
{
// Given
URI serverURI = neo4j.httpURI();
// When I access the server
HTTP.Response response = HTTP.GET( serverURI.toString() );
// Then it should reply
assertEquals(200, response.status());
// and we have access to underlying GraphDatabaseService
try (Transaction tx = neo4j.getGraphDatabaseService().beginTx()) {
assertEquals( 2, IteratorUtil.count(
neo4j.getGraphDatabaseService().findNodes( DynamicLabel.label( "Admin" ) )
));
tx.success();
}
}


32.4. Installing Plugins and Extensions in Neo4j Desktop
Neo4j Desktop can also be extended with server plugins and extensions. Neo4j Desktop will add all jars
in %ProgramFiles%\Neo4j Community\plugins to the classpath, but please note that nested directories for
plugins are currently not supported.
Otherwise, server plugins and extensions are subject to the same rules as usual. Please note, when
configuring server extensions, that neo4j-server.properties for Neo4j Desktop lives in %APPDATA%\Neo4j
Community.


Chapter 33. Using Neo4j embedded in Java
applications
It’s easy to use Neo4j embedded in Java applications. In this chapter you will find everything
needed — from setting up the environment to doing something useful with your data.


33.1. Include Neo4j in your project
After selecting the appropriate edition for your platform, embed Neo4j in your Java application by
including the Neo4j library jars in your build. The following sections will show how to do this by either
altering the build path directly or by using dependency management.

Add Neo4j to the build path
Get the Neo4j libraries from one of these sources:

• Extract a Neo4j download zip/tarball (http://neo4j.com/download/), and use the jar files found in the lib/ directory.
• Use the jar files available from Maven Central Repository (http://search.maven.org/#search|ga|1|g%3A%22org.neo4j%22).

Add the jar files to your project:

JDK tools
  Append the jar files to the -classpath option.
Eclipse
  Right-click on the project and then go Build Path → Configure Build Path. In the
  dialog, choose Add External JARs, browse to the Neo4j lib/ directory and select all
  of the jar files. Another option is to use User Libraries (http://help.eclipse.org/indigo/index.jsp?topic=/org.eclipse.jdt.doc.user/reference/preferences/java/buildpath/ref-preferencesuser-libraries.htm).
IntelliJ IDEA
  See Libraries, Global Libraries, and the Configure Library dialog (http://www.jetbrains.com/idea/webhelp/configuring-project-and-global-libraries.html).
NetBeans
  Right-click on the Libraries node of the project, choose Add JAR/Folder, browse to
  the Neo4j lib/ directory and select all of the jar files. You can also handle libraries
  from the project node, see Managing a Project’s Classpath (http://netbeans.org/kb/docs/java/project-setup.html#projects-classpath).

Editions
The following table outlines the available editions and their names for use with dependency
management tools.

Tip

Follow the links in the table for details on dependency configuration with Apache Maven,
Apache Buildr, Apache Ivy, Groovy Grape, Grails, Scala SBT!

Neo4j editions
Edition      Dependency                    Description                                License
Community    org.neo4j:neo4j               a high performance, fully ACID             GPLv3
                                           transactional graph database
Enterprise   org.neo4j:neo4j-enterprise    adding advanced monitoring, online         AGPLv3
                                           backup and High Availability clustering

Note

The listed dependencies do not contain the implementation, but pull it in transitively.

For more information regarding licensing, see the Licensing Guide (http://www.neo4j.org/learn/licensing).

Javadocs can be downloaded packaged in jar files from Maven Central or read online (http://neo4j.com/docs/2.3.12/javadocs/).

Add Neo4j as a dependency
You can either go with the top-level artifact from the table above or include the individual components
directly. The examples included here use the top-level artifact approach.

Maven
Add the dependency to your project along the lines of the snippet below. This is usually done in the
pom.xml file found in the root directory of the project.
Maven dependency

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j</artifactId>
   <version>2.3.12</version>
  </dependency>
  ...
 </dependencies>
...
</project>


Where the artifactId is found in the editions table.

Eclipse and Maven

For development in Eclipse (http://www.eclipse.org), it is recommended to install the m2e plugin (http://www.eclipse.org/m2e/) and let Maven manage the
project build classpath instead, see above. This also adds the possibility to build your project both via
the command line with Maven and have a working Eclipse setup for development.

Ivy
Make sure to resolve dependencies from Maven Central, for example using this configuration in your
ivysettings.xml file:
<ivysettings>
  <settings defaultResolver="main"/>
  <resolvers>
    <chain name="main">
      <filesystem name="local">
        <artifact pattern="${ivy.settings.dir}/repository/[artifact]-[revision].[ext]" />
      </filesystem>
      <ibiblio name="maven_central" root="http://repo1.maven.org/maven2/" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>
With that in place you can add Neo4j to the mix by having something along these lines to your ivy.xml
file:
..
<dependencies>
  ..
  <dependency org="org.neo4j" name="neo4j" rev="2.3.12"/>
  ..
</dependencies>
..

Where the name is found in the editions table above.

Gradle
The example below shows a Gradle build script for including the Neo4j libraries.
def neo4jVersion = "2.3.12"
apply plugin: 'java'
repositories {
mavenCentral()
}
dependencies {
compile "org.neo4j:neo4j:${neo4jVersion}"
}

Where the coordinates (org.neo4j:neo4j in the example) are found in the editions table above.

Starting and stopping
To create a new database or open an existing one you instantiate a GraphDatabaseService (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/GraphDatabaseService.html).
graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
registerShutdownHook( graphDb );

Note

The GraphDatabaseService instance can be shared among multiple threads. Note however
that you can’t create multiple instances pointing to the same database.
To stop the database, call the shutdown() method:
graphDb.shutdown();

To make sure Neo4j is shut down properly you can add a shutdown hook:
private static void registerShutdownHook( final GraphDatabaseService graphDb )
{
// Registers a shutdown hook for the Neo4j instance so that it
// shuts down nicely when the VM exits (even if you "Ctrl-C" the
// running application).
Runtime.getRuntime().addShutdownHook( new Thread()
{
@Override
public void run()
{
graphDb.shutdown();
}
} );
}

Starting an embedded database with configuration settings
To start Neo4j with configuration settings, a Neo4j properties file can be loaded like this:
GraphDatabaseService graphDb = new GraphDatabaseFactory()
.newEmbeddedDatabaseBuilder( testDirectory.graphDbDir() )
.loadPropertiesFromFile( pathToConfig + "neo4j.properties" )
.newGraphDatabase();

Configuration settings can also be applied programmatically, like so:

GraphDatabaseService graphDb = new GraphDatabaseFactory()
.newEmbeddedDatabaseBuilder( testDirectory.graphDbDir() )
.setConfig( GraphDatabaseSettings.pagecache_memory, "512M" )
.setConfig( GraphDatabaseSettings.string_block_size, "60" )
.setConfig( GraphDatabaseSettings.array_block_size, "300" )
.newGraphDatabase();

For configuration settings, see Chapter 24, Configuration & Performance [448].

Starting an embedded read-only instance
If you want a read-only view of the database, create an instance this way:
graphDb = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder(
"target/read-only-db/location" )
.setConfig( GraphDatabaseSettings.read_only, "true" )
.newGraphDatabase();

Obviously the database has to already exist in this case.

Note

Concurrent access to the same database files by multiple (read-only or write) instances is
not supported.


33.2. Hello World
Learn how to create and access nodes and relationships. For information on project setup, see
Section 33.1, “Include Neo4j in your project” [577].
Remember, from Section 2.1, “The Neo4j Graph Database” [5], that a Neo4j graph consists of:
• Nodes that are connected by
• Relationships, with
• Properties on both nodes and relationships.
All relationships have a type. For example, if the graph represents a social network, a relationship type
could be KNOWS. If a relationship of the type KNOWS connects two nodes, that probably represents two
people that know each other. A lot of the semantics (that is, the meaning) of a graph is encoded in the
relationship types of the application. And although relationships are directed, they can be traversed
equally well in either direction.

Tip

The source code of this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/EmbeddedNeo4j.java

Prepare the database
Relationship types can be created by using an enum. In this example we only need a single relationship
type. This is how to define it:
private static enum RelTypes implements RelationshipType
{
KNOWS
}

We also prepare some variables to use:
GraphDatabaseService graphDb;
Node firstNode;
Node secondNode;
Relationship relationship;

The next step is to start the database server. Note that if the directory given for the database doesn’t
already exist, it will be created.
graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
registerShutdownHook( graphDb );

Note that starting a database server is an expensive operation, so don’t start up a new instance
every time you need to interact with the database! The instance can be shared by multiple threads.
Transactions are thread confined.
As seen, we register a shutdown hook that will make sure the database shuts down when the JVM exits.
Now it’s time to interact with the database.

Wrap operations in a transaction
All operations have to be performed in a transaction. This is a conscious design decision, since we
believe transaction demarcation to be an important part of working with a real enterprise database.
Now, transaction handling in Neo4j is very easy:
try ( Transaction tx = graphDb.beginTx() )
{
// Database operations go here
tx.success();
}

For more information on transactions, see Chapter 18, Transaction Management [285] and the Java API for
Transaction (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/Transaction.html).

Note

For brevity, we do not spell out wrapping of operations in a transaction throughout the
manual.

Create a small graph

Now, let’s create a few nodes. The API is very intuitive. Feel free to have a look at the Neo4j Javadocs (http://neo4j.com/docs/2.3.12/javadocs/).
They’re included in the distribution, as well. Here’s how to create a small graph consisting of two nodes,
connected with one relationship and some properties:
firstNode = graphDb.createNode();
firstNode.setProperty( "message", "Hello, " );
secondNode = graphDb.createNode();
secondNode.setProperty( "message", "World!" );
relationship = firstNode.createRelationshipTo( secondNode, RelTypes.KNOWS );
relationship.setProperty( "message", "brave Neo4j " );

We now have a graph that looks like this:
Figure 33.1. Hello World Graph
[Figure: (message = 'Hello, ') -[KNOWS, message = 'brave Neo4j ']-> (message = 'World!')]

Print the result
After we’ve created our graph, let’s read from it and print the result.
System.out.print( firstNode.getProperty( "message" ) );
System.out.print( relationship.getProperty( "message" ) );
System.out.print( secondNode.getProperty( "message" ) );

Which will output:
Hello, brave Neo4j World!

Remove the data
In this case we’ll remove the data before committing:
// let's remove the data
firstNode.getSingleRelationship( RelTypes.KNOWS, Direction.OUTGOING ).delete();
firstNode.delete();
secondNode.delete();

Note that deleting a node which still has relationships when the transaction commits will fail. This is to
make sure relationships always have a start node and an end node.

Shut down the database server
Finally, shut down the database server when the application finishes:
graphDb.shutdown();


33.3. Property values
Both nodes and relationships can have properties.
Properties are named values where the name is a string. Property values can be either a primitive or an
array of one primitive type. For example String, int and int[] values are valid for properties.
NULL is not a valid property value.
NULLs can instead be modeled by the absence of a key.
Property value types
Type      Description                                                Value range
boolean                                                              true/false
byte      8-bit integer                                              -128 to 127, inclusive
short     16-bit integer                                             -32768 to 32767, inclusive
int       32-bit integer                                             -2147483648 to 2147483647, inclusive
long      64-bit integer                                             -9223372036854775808 to
                                                                     9223372036854775807, inclusive
float     32-bit IEEE 754 floating-point number
double    64-bit IEEE 754 floating-point number
char      16-bit unsigned integers representing Unicode characters   u0000 to uffff (0 to 65535)
String    sequence of Unicode characters

For further details on float/double values, see the Java Language Specification (http://docs.oracle.com/javase/specs/jls/se5.0/html/typesValues.html#4.2.3).
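To make the value rules concrete, here is a minimal sketch (assuming a graphDb instance and an open transaction; the property keys are made up for illustration):

Node n = graphDb.createNode();
n.setProperty( "age", 42 );                      // int
n.setProperty( "height", 1.87d );                // double
n.setProperty( "active", true );                 // boolean
n.setProperty( "nickname", "Neo" );              // String
n.setProperty( "scores", new int[]{ 1, 2, 3 } ); // array of one primitive type
// NULL is not a valid property value; model it by removing the key instead:
n.removeProperty( "nickname" );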


33.4. User database with indexes
You have a user database, and want to retrieve users by name using indexes.

Tip

The source code used in this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/EmbeddedNeo4jWithNewIndexing.java

To begin with, we start the database server:
GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );

Then we have to configure the database to index users by name. This only needs to be done once.

Note

Schema changes and data changes are not allowed in the same transaction. Each
transaction must either change the schema or the data, but not both.
IndexDefinition indexDefinition;
try ( Transaction tx = graphDb.beginTx() )
{
Schema schema = graphDb.schema();
indexDefinition = schema.indexFor( DynamicLabel.label( "User" ) )
.on( "username" )
.create();
tx.success();
}

Indexes are populated asynchronously when they are first created. It is possible to use the core API to
wait for index population to complete:
try ( Transaction tx = graphDb.beginTx() )
{
Schema schema = graphDb.schema();
schema.awaitIndexOnline( indexDefinition, 10, TimeUnit.SECONDS );
}

It’s time to add the users:
try ( Transaction tx = graphDb.beginTx() )
{
Label label = DynamicLabel.label( "User" );
// Create some users
for ( int id = 0; id < 100; id++ )
{
Node userNode = graphDb.createNode( label );
userNode.setProperty( "username", "user" + id + "@neo4j.org" );
}
System.out.println( "Users created" );
tx.success();
}

Note

Please read Section 33.6, “Managing resources when using long running
transactions” [589] on how to properly close ResourceIterators returned from index
lookups.
And here’s how to find a user by id:
Label label = DynamicLabel.label( "User" );
int idToFind = 45;
String nameToFind = "user" + idToFind + "@neo4j.org";
try ( Transaction tx = graphDb.beginTx() )
{
try ( ResourceIterator<Node> users =
graphDb.findNodes( label, "username", nameToFind ) )
{
ArrayList<Node> userNodes = new ArrayList<>();
while ( users.hasNext() )
{
userNodes.add( users.next() );
}
for ( Node node : userNodes )
{
System.out.println( "The username of user " + idToFind + " is " + node.getProperty( "username" ) );
}
}
}

When updating the name of a user, the index is updated as well:
try ( Transaction tx = graphDb.beginTx() )
{
Label label = DynamicLabel.label( "User" );
int idToFind = 45;
String nameToFind = "user" + idToFind + "@neo4j.org";
for ( Node node : loop( graphDb.findNodes( label, "username", nameToFind ) ) )
{
node.setProperty( "username", "user" + ( idToFind + 1 ) + "@neo4j.org" );
}
tx.success();
}

When deleting a user, it is automatically removed from the index:
try ( Transaction tx = graphDb.beginTx() )
{
Label label = DynamicLabel.label( "User" );
int idToFind = 46;
String nameToFind = "user" + idToFind + "@neo4j.org";
for ( Node node : loop( graphDb.findNodes( label, "username", nameToFind ) ) )
{
node.delete();
}
tx.success();
}

In case we change our data model, we can drop the index as well:
try ( Transaction tx = graphDb.beginTx() )
{
Label label = DynamicLabel.label( "User" );
for ( IndexDefinition indexDefinition : graphDb.schema()
.getIndexes( label ) )
{
// There is only one index
indexDefinition.drop();
}

tx.success();
}


33.5. User database with legacy index
Unless you have specific reasons to use the legacy indexing, see Section 33.4, “User database with
indexes” [585] instead.

Note

Please read Section 33.6, “Managing resources when using long running
transactions” [589] on how to properly close ResourceIterators returned from index
lookups.
You have a user database, and want to retrieve users by name using the legacy indexing system.

Tip

The source code used in this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/EmbeddedNeo4jWithIndexing.java

We have created two helper methods to handle user names and adding users to the database:
private static String idToUserName( final int id )
{
return "user" + id + "@neo4j.org";
}
private static Node createAndIndexUser( final String username )
{
Node node = graphDb.createNode();
node.setProperty( USERNAME_KEY, username );
nodeIndex.add( node, USERNAME_KEY, username );
return node;
}

The next step is to start the database server:
graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
registerShutdownHook();

It’s time to add the users:
try ( Transaction tx = graphDb.beginTx() )
{
nodeIndex = graphDb.index().forNodes( "nodes" );
// Create some users and index their names with the IndexService
for ( int id = 0; id < 100; id++ )
{
createAndIndexUser( idToUserName( id ) );
}

And here’s how to find a user by id:
int idToFind = 45;
String userName = idToUserName( idToFind );
Node foundUser = nodeIndex.get( USERNAME_KEY, userName ).getSingle();
System.out.println( "The username of user " + idToFind + " is "
+ foundUser.getProperty( USERNAME_KEY ) );


33.6. Managing resources when using long running
transactions
It is necessary to always open a transaction when accessing the database. Inside a long running
transaction it is good practice to ensure that any ResourceIterators (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/ResourceIterator.html) obtained inside the transaction are
closed as early as possible. This is achieved either by exhausting the iterator or by explicitly calling
its close method.
What follows is an example of how to work with a ResourceIterator. As we don’t exhaust the iterator, we
will close it explicitly using the close() method.
Label label = DynamicLabel.label( "User" );
int idToFind = 45;
String nameToFind = "user" + idToFind + "@neo4j.org";
try ( Transaction tx = graphDb.beginTx();
ResourceIterator<Node> users = graphDb.findNodes( label, "username", nameToFind ) )
{
Node firstUserNode;
if ( users.hasNext() )
{
firstUserNode = users.next();
}
users.close();
}


33.7. Controlling logging
To control logging in Neo4j embedded, use the Neo4j embedded logging framework.
Neo4j embedded provides logging via its own org.neo4j.logging.Log (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/logging/Log.html) layer, and does not natively use
any existing Java logging framework. All logging events produced by Neo4j have a name, a level and a
message. The name is a FQCN (fully qualified class name).
Neo4j uses the following log levels:

ERROR   For serious errors that are almost always fatal
WARN    For events that are serious, but not fatal
INFO    Informational events
DEBUG   Debugging events

To enable logging, an implementation of org.neo4j.logging.LogProvider (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/logging/LogProvider.html) must be provided to the
GraphDatabaseFactory, as follows:
LogProvider logProvider = new MyCustomLogProvider( output );
graphDb = new GraphDatabaseFactory().setUserLogProvider( logProvider ).newEmbeddedDatabase( DB_PATH );
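
MyCustomLogProvider above stands in for your own implementation. If you only need simple formatted output, a minimal sketch (assuming the FormattedLogProvider class bundled in the org.neo4j.logging package is available) could be:

// Sketch: route Neo4j's user log to stdout using the bundled FormattedLogProvider.
LogProvider logProvider = FormattedLogProvider.toOutputStream( System.out );
GraphDatabaseService graphDb = new GraphDatabaseFactory()
        .setUserLogProvider( logProvider )
        .newEmbeddedDatabase( DB_PATH );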

Neo4j also includes a binding for SLF4J, which is available in the neo4j-slf4j library jar. This can be
obtained via Maven:

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-slf4j</artifactId>
   <version>2.3.12</version>
  </dependency>
  <dependency>
   <groupId>org.slf4j</groupId>
   <artifactId>slf4j-api</artifactId>
  </dependency>
  ...
 </dependencies>
...
</project>


To use this binding, simply pass an instance of org.neo4j.logging.slf4j.Slf4jLogProvider (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/logging/slf4j/Slf4jLogProvider.html) to the
GraphDatabaseFactory, as follows:
graphDb = new GraphDatabaseFactory().setUserLogProvider( new Slf4jLogProvider() ).newEmbeddedDatabase( DB_PATH );

All log output can then be controlled via SLF4J configuration.


33.8. Basic unit testing
The basic pattern of unit testing with Neo4j is illustrated by the following example.
To access the Neo4j testing facilities you should have the neo4j-kernel tests jar together with the
neo4j-io tests jar on the classpath during tests. You can download them from Maven Central:
org.neo4j:neo4j-kernel (http://search.maven.org/#search|ga|1|g%3A%22org.neo4j%22%20AND%20a%3A%22neo4j-kernel%22) and org.neo4j:neo4j-io (http://search.maven.org/#search|ga|1|g%3A%22org.neo4j%22%20AND%20a%3A%22neo4j-io%22).
Using Maven as a dependency manager you would typically add this dependency together with JUnit
and Hamcrest like so:
Maven dependency

<project>
...
 <dependencies>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-kernel</artifactId>
   <version>2.3.12</version>
   <type>test-jar</type>
   <scope>test</scope>
  </dependency>
  <dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-io</artifactId>
   <version>2.3.12</version>
   <type>test-jar</type>
   <scope>test</scope>
  </dependency>
  <dependency>
   <groupId>junit</groupId>
   <artifactId>junit</artifactId>
   <version>4.12</version>
   <scope>test</scope>
  </dependency>
  <dependency>
   <groupId>org.hamcrest</groupId>
   <artifactId>hamcrest-all</artifactId>
   <version>1.3</version>
   <scope>test</scope>
  </dependency>
  ...
 </dependencies>
...
</project>


Observe that the test-jar is crucial. Without it you would get the common neo4j-kernel
jar, not the one containing the testing facilities.
With that in place, we’re ready to code our tests.

Tip

For the full source code of this example see: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/test/java/org/neo4j/examples/Neo4jBasicDocTest.java

Before each test, create a fresh database:
@Before
public void prepareTestDatabase()
{
graphDb = new TestGraphDatabaseFactory().newImpermanentDatabase();
}

After the test has executed, the database should be shut down:
@After
public void destroyTestDatabase()
{
graphDb.shutdown();
}

During a test, create nodes and check to see that they are there, while enclosing write operations in a
transaction.
Node n = null;
try ( Transaction tx = graphDb.beginTx() )
{
n = graphDb.createNode();
n.setProperty( "name", "Nancy" );
tx.success();
}
// The node should have a valid id
assertThat( n.getId(), is( greaterThan( -1L ) ) );
// Retrieve a node by using the id of the created node. The id's and
// property should match.
try ( Transaction tx = graphDb.beginTx() )
{
Node foundNode = graphDb.getNodeById( n.getId() );
assertThat( foundNode.getId(), is( n.getId() ) );
assertThat( (String) foundNode.getProperty( "name" ), is( "Nancy" ) );
}

If you want to set configuration parameters at database creation, it’s done like this:
GraphDatabaseService db = new TestGraphDatabaseFactory()
.newImpermanentDatabaseBuilder()
.setConfig( GraphDatabaseSettings.pagecache_memory, "512M" )
.setConfig( GraphDatabaseSettings.string_block_size, "60" )
.setConfig( GraphDatabaseSettings.array_block_size, "300" )
.newGraphDatabase();


33.9. Traversal
For reading about traversals, see Chapter 34, The Traversal Framework [613].
For more examples of traversals, see Chapter 5, Basic Data Modeling Examples [47].

The Matrix
This is the first graph we want to traverse into:
Figure 33.2. Matrix node space view

Tip

The source code of this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/NewMatrix.java

Friends and friends of friends
private Traverser getFriends(
final Node person )
{
TraversalDescription td = graphDb.traversalDescription()
.breadthFirst()
.relationships( RelTypes.KNOWS, Direction.OUTGOING )
.evaluator( Evaluators.excludeStartPosition() );
return td.traverse( person );
}

Let’s perform the actual traversal and print the results:
int numberOfFriends = 0;
String output = neoNode.getProperty( "name" ) + "'s friends:\n";
Traverser friendsTraverser = getFriends( neoNode );
for ( Path friendPath : friendsTraverser )
{
output += "At depth " + friendPath.length() + " => "
+ friendPath.endNode()
.getProperty( "name" ) + "\n";
numberOfFriends++;
}
output += "Number of friends found: " + numberOfFriends + "\n";

Which will give us the following output:
Thomas Anderson's friends:
At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
Number of friends found: 4

Who coded the Matrix?
private Traverser findHackers( final Node startNode )
{
TraversalDescription td = graphDb.traversalDescription()
.breadthFirst()
.relationships( RelTypes.CODED_BY, Direction.OUTGOING )
.relationships( RelTypes.KNOWS, Direction.OUTGOING )
.evaluator(
Evaluators.includeWhereLastRelationshipTypeIs( RelTypes.CODED_BY ) );
return td.traverse( startNode );
}

Print out the result:
String output = "Hackers:\n";
int numberOfHackers = 0;
Traverser traverser = findHackers( getNeoNode() );
for ( Path hackerPath : traverser )
{
output += "At depth " + hackerPath.length() + " => "
+ hackerPath.endNode()
.getProperty( "name" ) + "\n";
numberOfHackers++;
}
output += "Number of hackers found: " + numberOfHackers + "\n";

Now we know who coded the Matrix:
Hackers:
At depth 4 => The Architect
Number of hackers found: 1

Walking an ordered path
This example shows how to use a path context holding a representation of a path.

Tip

The source code of this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/orderedpath/OrderedPath.java

Create a toy graph
Node A = db.createNode();
Node B = db.createNode();
Node C = db.createNode();
Node D = db.createNode();
A.createRelationshipTo( C, REL2 );
C.createRelationshipTo( D, REL3 );
A.createRelationshipTo( B, REL1 );
B.createRelationshipTo( C, REL2 );

[Figure: the toy graph. A connects to B via REL1 and to C via REL2; B connects to C via REL2; C connects to D via REL3.]

Now, the order of relationships (REL1 → REL2 → REL3) is stored in an ArrayList. Upon traversal, the
Evaluator can check against it to ensure that only paths with the predefined order of relationships are
included and returned:
Define how to walk the path
final ArrayList<RelationshipType> orderedPathContext = new ArrayList<>();
orderedPathContext.add( REL1 );
orderedPathContext.add( withName( "REL2" ) );
orderedPathContext.add( withName( "REL3" ) );
TraversalDescription td = db.traversalDescription()
.evaluator( new Evaluator()
{
@Override
public Evaluation evaluate( final Path path )
{
if ( path.length() == 0 )
{
return Evaluation.EXCLUDE_AND_CONTINUE;
}
RelationshipType expectedType = orderedPathContext.get( path.length() - 1 );
boolean isExpectedType = path.lastRelationship()
.isType( expectedType );
boolean included = path.length() == orderedPathContext.size() && isExpectedType;
boolean continued = path.length() < orderedPathContext.size() && isExpectedType;
return Evaluation.of( included, continued );
}
} )
.uniqueness( Uniqueness.NODE_PATH );

Note that we set the uniqueness to Uniqueness.NODE_PATH (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/traversal/Uniqueness.html#NODE_PATH) as we want to be able to revisit the same
node during the traversal, but not the same path.
Perform the traversal and print the result
Traverser traverser = td.traverse( A );
PathPrinter pathPrinter = new PathPrinter( "name" );
for ( Path path : traverser )
{
output += Paths.pathToString( path, pathPrinter );
}

Which will output:
(A)--[REL1]-->(B)--[REL2]-->(C)--[REL3]-->(D)

In this case we use a custom class to format the path output. This is how it’s done:
static class PathPrinter implements Paths.PathDescriptor<Path>
{
private final String nodePropertyKey;
public PathPrinter( String nodePropertyKey )
{
this.nodePropertyKey = nodePropertyKey;
}
@Override
public String nodeRepresentation( Path path, Node node )
{
return "(" + node.getProperty( nodePropertyKey, "" ) + ")";
}
@Override
public String relationshipRepresentation( Path path, Node from, Relationship relationship )
{
String prefix = "--", suffix = "--";
if ( from.equals( relationship.getEndNode() ) )
{
prefix = "<--";
}
else
{
suffix = "-->";
}
return prefix + "[" + relationship.getType().name() + "]" + suffix;
}
}

Uniqueness of Paths in traversals
This example demonstrates the use of node uniqueness. Below is an imaginary domain graph with
Principals that own pets that are descendants of other pets.

Figure 33.3. Descendants Example Graph
[Figure: Pet0 (Node[2]) has descendants Pet1 (Node[0]), Pet2 (Node[1]) and Pet3 (Node[3]). Principal1 (Node[4]) owns Pet1 and Pet3; Principal2 (Node[5]) owns Pet2.]

In order to return all descendants of Pet0 which have the relation owns to Principal1 (Pet1 and Pet3), the
Uniqueness of the traversal needs to be set to NODE_PATH rather than the default NODE_GLOBAL, so that
nodes can be traversed more than once, and paths that have different nodes but can have some nodes
in common (like the start and end node) can be returned.
final Node target = data.get().get( "Principal1" );
TraversalDescription td = db.traversalDescription()
.uniqueness( Uniqueness.NODE_PATH )
.evaluator( new Evaluator()
{
@Override
public Evaluation evaluate( Path path )
{
boolean endNodeIsTarget = path.endNode().equals( target );
return Evaluation.of( endNodeIsTarget, !endNodeIsTarget );
}
} );
Traverser results = td.traverse( start );

This will return the following paths:
(2)--[descendant,2]-->(3)<--[owns,5]--(4)
(2)--[descendant,0]-->(0)<--[owns,3]--(4)

In the default path.toString() implementation, (1)--[knows,2]-->(4) denotes a node with ID=1 having a
relationship with ID=2 of type knows to a node with ID=4.
Let’s create a new TraversalDescription from the old one, having NODE_GLOBAL uniqueness to see the
difference.

Tip

The TraversalDescription object is immutable, so we have to use the new instance returned
with the new uniqueness setting.

TraversalDescription nodeGlobalTd = td.uniqueness( Uniqueness.NODE_GLOBAL );
results = nodeGlobalTd.traverse( start );

Now only one path is returned:
(2)--[descendant,2]-->(3)<--[owns,5]--(4)

Social network
Note

The following example uses the new enhanced traversal API.

Social networks (known as social graphs out on the web) are natural to model with a graph. This example
shows a very simple social model that connects friends and keeps track of status updates.

Tip

The source code of the example is found here: https://github.com/neo4j/neo4j/tree/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/socnet

Simple social model
Figure 33.4. Social network data model

The data model for a social network is pretty simple: Persons with names and StatusUpdates with
timestamped text. These entities are then connected by specific relationships.
• Person
  • friend: relates two distinct Person instances (no self-reference)
  • status: connects to the most recent StatusUpdate
• StatusUpdate
  • next: points to the next StatusUpdate in the chain, which was posted before the current one

Status graph instance
The StatusUpdate list for a Person is a linked list. The head of the list (the most recent status) is found by
following status. Each subsequent StatusUpdate is connected by next.
Here’s an example where Andreas Kollegger micro-blogged his way to work in the morning:


[Figure: Andreas Kollegger --status--> "started designing this graph model" (9:30 am) --next--> "rode my awesome Skeppshult to work" (8:45 am) --next--> "is getting used to muesli for breakfast" (8:00 am)]

To read the status updates, we can create a traversal, like so:
TraversalDescription traversal = graphDb().traversalDescription()
.depthFirst()
.relationships( NEXT );

This gives us a traverser that will start at one StatusUpdate, and will follow the chain of updates until
they run out. Traversers are lazy loading, so it’s performant even when dealing with thousands of
statuses — they are not loaded until we actually consume them.

Activity stream
Once we have friends, and they have status messages, we might want to read our friends’ status
messages, in reverse time order — latest first. To do this, we go through these steps:

1. Gather all friends’ status update iterators in a list — latest date first.
2. Sort the list.
3. Return the first item in the list.
4. If the first iterator is exhausted, remove it from the list. Otherwise, get the next item in that iterator.
5. Go to step 2 until there are no iterators left in the list.

Animated, the sequence looks like this: http://www.slideshare.net/systay/pattern-activity-stream.
The code looks like:
PositionedIterator<StatusUpdate> first = statuses.get( 0 );
StatusUpdate returnVal = first.current();
if ( !first.hasNext() )
{
statuses.remove( 0 );
}
else
{
first.next();
sort();
}
return returnVal;


33.10. Domain entities
This page demonstrates one way to handle domain entities when using Neo4j. The principle at use is to
wrap the entities around a node (the same approach can be used with relationships as well).

Tip

The source code of the examples is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/socnet/Person.java

First off, store the node and make it accessible inside the package:
private final Node underlyingNode;
Person( Node personNode )
{
this.underlyingNode = personNode;
}
protected Node getUnderlyingNode()
{
return underlyingNode;
}

Delegate attributes to the node:
public String getName()
{
return (String)underlyingNode.getProperty( NAME );
}

Make sure to override these methods:
@Override
public int hashCode()
{
return underlyingNode.hashCode();
}
@Override
public boolean equals( Object o )
{
return o instanceof Person &&
underlyingNode.equals( ( (Person)o ).getUnderlyingNode() );
}
@Override
public String toString()
{
return "Person[" + getName() + "]";
}


33.11. Graph Algorithm examples
For details on the graph algorithm usage, see the Javadocs (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphalgo/GraphAlgoFactory.html).

Tip

The source code used in the example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/test/java/org/neo4j/examples/PathFindingDocTest.java

Calculating the shortest path (least number of relationships) between two nodes:
Node startNode = graphDb.createNode();
Node middleNode1 = graphDb.createNode();
Node middleNode2 = graphDb.createNode();
Node middleNode3 = graphDb.createNode();
Node endNode = graphDb.createNode();
createRelationshipsBetween( startNode, middleNode1, endNode );
createRelationshipsBetween( startNode, middleNode2, middleNode3, endNode );
// Will find the shortest path between startNode and endNode via
// "MY_TYPE" relationships (in OUTGOING direction), like f.ex:
//
// (startNode)-->(middleNode1)-->(endNode)
//
PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
PathExpanders.forTypeAndDirection( ExampleTypes.MY_TYPE, Direction.OUTGOING ), 15 );
Iterable<Path> paths = finder.findAllPaths( startNode, endNode );
Using Dijkstra’s algorithm (http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) to calculate the cheapest path between node A and B, where each relationship
can have a weight (i.e. cost), and find the path(s) with least cost:
PathFinder<WeightedPath> finder = GraphAlgoFactory.dijkstra(
PathExpanders.forTypeAndDirection( ExampleTypes.MY_TYPE, Direction.BOTH ), "cost" );
WeightedPath path = finder.findSinglePath( nodeA, nodeB );
// Get the weight for the found path
path.weight();
Using A* (http://en.wikipedia.org/wiki/A*_search_algorithm) to calculate the cheapest path between node A and B, where cheapest is for example the
path in a network of roads which has the shortest length between node A and B. Here’s our example
graph:


Node nodeA = createNode( "name", "A", "x", 0d, "y", 0d );
Node nodeB = createNode( "name", "B", "x", 7d, "y", 0d );
Node nodeC = createNode( "name", "C", "x", 2d, "y", 1d );
Relationship relAB = createRelationship( nodeA, nodeC, "length", 2d );
Relationship relBC = createRelationship( nodeC, nodeB, "length", 3d );
Relationship relAC = createRelationship( nodeA, nodeB, "length", 10d );

EstimateEvaluator<Double> estimateEvaluator = new EstimateEvaluator<Double>()
{
@Override
public Double getCost( final Node node, final Node goal )
{
double dx = (Double) node.getProperty( "x" ) - (Double) goal.getProperty( "x" );
double dy = (Double) node.getProperty( "y" ) - (Double) goal.getProperty( "y" );
double result = Math.sqrt( Math.pow( dx, 2 ) + Math.pow( dy, 2 ) );
return result;
}
};
PathFinder<WeightedPath> astar = GraphAlgoFactory.aStar(
PathExpanders.allTypesAndDirections(),
CommonEvaluators.doubleCostEvaluator( "length" ), estimateEvaluator );
WeightedPath path = astar.findSinglePath( nodeA, nodeB );


33.12. Reading a management attribute
The JmxUtils (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/jmx/JmxUtils.html) class includes methods to access Neo4j management beans. The common JMX service
can be used as well, but from your code you probably want to use the approach outlined here.

Tip

The source code of the example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/test/java/org/neo4j/examples/JmxDocTest.java

This example shows how to get the start time of a database:
private static Date getStartTimeFromManagementBean(
GraphDatabaseService graphDbService )
{
ObjectName objectName = JmxUtils.getObjectName( graphDbService, "Kernel" );
Date date = JmxUtils.getAttribute( objectName, "KernelStartTime" );
return date;
}

Depending on which Neo4j edition you are using, different sets of management beans are available.

• For all editions, see the org.neo4j.jmx package (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/jmx/package-summary.html).
• For the Enterprise edition, see the org.neo4j.management package (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/management/package-summary.html) as well.


33.13. How to create unique nodes
This section is about how to ensure uniqueness of a property when creating nodes. For an overview of
the topic, see Section 18.6, “Creating unique nodes” [293].

Get or create unique node using Cypher and unique constraints
Create a unique constraint
try ( Transaction tx = graphdb.beginTx() )
{
graphdb.schema()
.constraintFor( DynamicLabel.label( "User" ) )
.assertPropertyIsUnique( "name" )
.create();
tx.success();
}

Use MERGE to create a unique node
Node result = null;
ResourceIterator<Node> resultIterator = null;
try ( Transaction tx = graphDb.beginTx() )
{
String queryString = "MERGE (n:User {name: {name}}) RETURN n";
Map<String, Object> parameters = new HashMap<>();
parameters.put( "name", username );
resultIterator = graphDb.execute( queryString, parameters ).columnAs( "n" );
result = resultIterator.next();
tx.success();
return result;
}

Get or create unique node using a legacy index
Important

While this is a working solution, please consider using the preferred solution at the section
called “Get or create unique node using Cypher and unique constraints” [605] instead.
By using put-if-absent (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/index/Index.html#putIfAbsent%28T,%20java.lang.String,%20java.lang.Object%29) functionality, entity uniqueness can be guaranteed using an index.

Here the index acts as the lock and will only lock the smallest part needed to guarantee uniqueness
across threads and transactions. To get the more high-level get-or-create functionality, make use of
UniqueFactory (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/index/UniqueFactory.html) as seen in the example below.
Create a factory for unique nodes at application start
try ( Transaction tx = graphDb.beginTx() )
{
UniqueFactory.UniqueNodeFactory result = new UniqueFactory.UniqueNodeFactory( graphDb, "users" )
{
@Override
protected void initialize( Node created, Map<String, Object> properties )
{
created.addLabel( DynamicLabel.label( "User" ) );
created.setProperty( "name", properties.get( "name" ) );
}
};
tx.success();
return result;
}

Use the unique node factory to get or create a node
try ( Transaction tx = graphDb.beginTx() )
{
Node node = factory.getOrCreate( "name", username );
tx.success();
return node;
}

Pessimistic locking for node creation
Important

While this is a working solution, please consider using the preferred solution at the section
called “Get or create unique node using Cypher and unique constraints” [605] instead.
One might be tempted to use Java synchronization for pessimistic locking, but this is dangerous. By
mixing locks in Neo4j and in the Java runtime, it is easy to produce deadlocks that are not detectable
by Neo4j. As long as all locking is done by Neo4j, all deadlocks will be detected and avoided. Also, a
solution using manual synchronization doesn’t ensure uniqueness in an HA environment.
This example uses a single “lock node” for locking. We create it only as a place to put locks, nothing else.
Create a lock node at application start
try ( Transaction tx = graphDb.beginTx() )
{
final Node lockNode = graphDb.createNode();
tx.success();
return lockNode;
}

Use the lock node to ensure nodes are not created concurrently
try ( Transaction tx = graphDb.beginTx() )
{
Index<Node> usersIndex = graphDb.index().forNodes( "users" );
Node userNode = usersIndex.get( "name", username ).getSingle();
if ( userNode != null )
{
return userNode;
}
tx.acquireWriteLock( lockNode );
userNode = usersIndex.get( "name", username ).getSingle();
if ( userNode == null )
{
userNode = graphDb.createNode( DynamicLabel.label( "User" ) );
usersIndex.add( userNode, "name", username );
userNode.setProperty( "name", username );
}
tx.success();
return userNode;
}

Note that finishing the transaction will release the lock on the lock node.


33.14. Terminating a running transaction
Sometimes you may want to terminate (abort) a long running transaction from another thread.

Tip

The source code used in this example is found here: https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/TerminateTransactions.java

To begin with, we start the database server:
GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );

Now we start creating an infinite binary tree of nodes in the database, as an example of a long running
transaction.
RelationshipType relType = DynamicRelationshipType.withName( "CHILD" );
Queue<Node> nodes = new LinkedList<>();
int depth = 1;
try ( Transaction tx = graphDb.beginTx() )
{
Node rootNode = graphDb.createNode();
nodes.add( rootNode );
for (; true; depth++) {
    int nodesToExpand = nodes.size();
    for (int i = 0; i < nodesToExpand; ++i) {
        Node parent = nodes.remove();

        Node left = graphDb.createNode();
        Node right = graphDb.createNode();
        parent.createRelationshipTo( left, relType );
        parent.createRelationshipTo( right, relType );
        nodes.add( left );
        nodes.add( right );
    }
}
}
catch ( TransactionTerminatedException ignored )
{
return String.format( "Created tree up to depth %s in 1 sec", depth );
}

After waiting for some time, we decide to terminate the transaction. This is done from a separate
thread.
tx.terminate();
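
As an illustration, a minimal sketch (our own addition, assuming the Transaction tx is final and visible to the second thread) of how such a termination could be scheduled:

ExecutorService executor = Executors.newSingleThreadExecutor();
executor.submit( new Runnable()
{
    @Override
    public void run()
    {
        try
        {
            Thread.sleep( 1000 ); // let the transaction run for about a second
        }
        catch ( InterruptedException e )
        {
            Thread.currentThread().interrupt();
        }
        tx.terminate(); // may be called from a different thread
    }
} );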

Running this will execute the long running transaction for about one second and prints the maximum
depth of the tree that was created before the transaction was terminated. No changes are actually
made to the data — because the transaction has been terminated, the end result is as if no operations
were performed.
Example output
Created tree up to depth 15 in 1 sec
Finally, let’s shut down the database again.
graphDb.shutdown();


33.15. Execute Cypher Queries from Java
Tip

The full source code of the example: https://github.com/neo4j/neo4j/blob/2.3.12/community/cypher/docs/cypher-docs/src/test/java/org/neo4j/cypher/example/JavaQuery.java

In Java, you can use the Cypher query language as per the example below. First, let’s add some data.
GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
try ( Transaction tx = db.beginTx())
{
Node myNode = db.createNode();
myNode.setProperty( "name", "my node" );
tx.success();
}

Execute a query:
try ( Transaction ignored = db.beginTx();
Result result = db.execute( "match (n {name: 'my node'}) return n, n.name" ) )
{
while ( result.hasNext() )
{
Map<String, Object> row = result.next();
for ( Entry<String, Object> column : row.entrySet() )
{
rows += column.getKey() + ": " + column.getValue() + "; ";
}
rows += "\n";
}
}

In the above example, we also show how to iterate over the rows of the Result (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/Result.html).
The code will generate:
n.name: my node; n: Node[0];

Caution

When using a Result, you should consume the entire result (iterate over all rows using
next(), iterate over the iterator from columnAs(), or call, for example, resultAsString()).
Failing to do so will not properly clean up resources used by the Result object, leading to
unwanted behavior, such as leaking transactions. In case you don’t want to iterate over all
of the results, make sure to invoke close() as soon as you are done, to release the resources
tied to the result.
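
As an illustration, a minimal sketch (the query is made up) of releasing the resources when only the first row is needed:

Result result = db.execute( "MATCH (n) RETURN n.name" );
try
{
    if ( result.hasNext() )
    {
        Map<String, Object> firstRow = result.next();
        // use firstRow here, deliberately not consuming the rest
    }
}
finally
{
    result.close(); // releases the resources tied to the result
}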

Tip

Using a try-with-resources statement (http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html) will make sure that the result is closed at the end of
the statement. This is the recommended way to handle results.

You can also get a list of the columns in the result like this:
List<String> columns = result.columns();
45

https://github.com/neo4j/neo4j/blob/2.3.12/community/cypher/docs/cypher-docs/src/test/java/org/neo4j/cypher/example/
JavaQuery.java
46
http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/Result.html
47
http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html

609

Using Neo4j embedded in Java applications
This gives us:
[n, n.name]

To fetch the result items from a single column, do as follows. In this case we’ll have to read the property
from the node and not from the result.
Iterator<Node> n_column = result.columnAs( "n" );
for ( Node node : IteratorUtil.asIterable( n_column ) )
{
nodeResult = node + ": " + node.getProperty( "name" );
}

In this case there’s only one node in the result:
Node[0]: my node

Only use this if the result only contains a single column, or you are only interested in a single column of
the result.

Note

resultAsString(), writeAsStringTo(), and columnAs() cannot be called more than once on the
same Result object, as they consume the result. In the same way, part of the result gets
consumed for every call to next(). Use only one of these methods; if you need the facilities
of the other methods on the same query result, execute the query again and use the new Result.
For more information on the Java interface to Cypher, see the Java API (http://neo4j.com/docs/2.3.12/javadocs/index.html).
For more information and examples for Cypher, see Part III, “Cypher Query Language” [102] and
Chapter 5, Basic Data Modeling Examples [47].


33.16. Query Parameters
For more information on parameters see Section 8.5, “Parameters” [113].
Below follow examples of how to use parameters when executing Cypher queries from Java.
Node id
Map<String, Object> params = new HashMap<>();
params.put( "id", 0 );
String query = "MATCH n WHERE id(n) = {id} RETURN n.name";
Result result = db.execute( query, params );

Node object
Map<String, Object> params = new HashMap<>();
params.put( "node", andreasNode );
String query = "MATCH n WHERE n = {node} RETURN n.name";
Result result = db.execute( query, params );

Multiple node ids
Map<String, Object> params = new HashMap<>();
params.put( "ids", Arrays.asList( 0, 1, 2 ) );
String query = "MATCH n WHERE id(n) in {ids} RETURN n.name";
Result result = db.execute( query, params );

String literal
Map<String, Object> params = new HashMap<>();
params.put( "name", "Johan" );
String query = "MATCH (n) WHERE n.name = {name} RETURN n";
Result result = db.execute( query, params );

Index value
Map<String, Object> params = new HashMap<>();
params.put( "value", "Michaela" );
String query = "START n=node:people(name = {value}) RETURN n";
Result result = db.execute( query, params );

Index query
Map<String, Object> params = new HashMap<>();
params.put( "query", "name:Andreas" );
String query = "START n=node:people({query}) RETURN n";
Result result = db.execute( query, params );

Numeric parameters for SKIP and LIMIT
Map<String, Object> params = new HashMap<>();
params.put( "s", 1 );
params.put( "l", 1 );
String query = "MATCH (n) RETURN n.name SKIP {s} LIMIT {l}";
Result result = db.execute( query, params );

Regular expression
Map<String, Object> params = new HashMap<>();
params.put( "regex", ".*h.*" );
String query = "MATCH (n) WHERE n.name =~ {regex} RETURN n.name";
Result result = db.execute( query, params );

Create node with properties


Map<String, Object> props = new HashMap<>();
props.put( "name", "Andres" );
props.put( "position", "Developer" );
Map<String, Object> params = new HashMap<>();
params.put( "props", props );
String query = "CREATE ({props})";
db.execute( query, params );

Create multiple nodes with properties
Map<String, Object> n1 = new HashMap<>();
n1.put( "name", "Andres" );
n1.put( "position", "Developer" );
n1.put( "awesome", true );
Map<String, Object> n2 = new HashMap<>();
n2.put( "name", "Michael" );
n2.put( "position", "Developer" );
n2.put( "children", 3 );
Map<String, Object> params = new HashMap<>();
List<Map<String, Object>> maps = Arrays.asList( n1, n2 );
params.put( "props", maps );
String query = "CREATE (n:Person {props}) RETURN n";
db.execute( query, params );

Setting all properties on node
Map<String, Object> n1 = new HashMap<>();
n1.put( "name", "Andres" );
n1.put( "position", "Developer" );
Map<String, Object> params = new HashMap<>();
params.put( "props", n1 );
String query = "MATCH (n) WHERE n.name='Michaela' SET n = {props}";
db.execute( query, params );


Chapter 34. The Traversal Framework
The Neo4j Traversal API (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/traversal/package-summary.html) is a callback-based, lazily executed way of specifying desired movements
through a graph in Java. Some traversal examples are collected under Section 33.9, “Traversal” [593].
You can also use the Cypher Query Language as a powerful declarative way to query the graph.

34.1. Main concepts
Here follows a short explanation of the different methods that can modify or add to a traversal
description.

• PathExpanders — define what to traverse, typically in terms of relationship direction and type.
• Order — for example depth-first or breadth-first.
• Uniqueness — visit nodes (relationships, paths) only once.
• Evaluator — decide what to return and whether to stop or continue traversal beyond the current position.
• Starting nodes where the traversal will begin.

[Figure: Traversal Description overview. A PathExpander (relationship type and direction) defines what to traverse; Order (depth first or breadth first) defines where to go next; Uniqueness (unique nodes, relationships or paths, or none) avoids duplicates; an Evaluator (include/exclude, prune/continue) defines the return and prune policy. Applied to a starting node, the traversal description yields a Traverser whose result can be read as paths, nodes or relationships.]

See Section 34.2, “Traversal Framework Java API” [615] for more details.

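To make the concepts concrete before diving into the API, here is a minimal sketch (assuming a graphDb instance, a startNode, and a made-up relationship type KNOWS) that combines them:

TraversalDescription td = graphDb.traversalDescription()
    .breadthFirst()                                          // order
    .relationships( DynamicRelationshipType.withName( "KNOWS" ),
        Direction.OUTGOING )                                 // what to traverse
    .uniqueness( Uniqueness.NODE_GLOBAL )                    // avoid duplicates
    .evaluator( Evaluators.excludeStartPosition() );         // return policy
for ( Path path : td.traverse( startNode ) )                 // starting node
{
    System.out.println( path.endNode().getProperty( "name", "?" ) );
}
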
34.2. Traversal Framework Java API
The traversal framework consists of a few main interfaces in addition to Node and Relationship:
TraversalDescription, Evaluator, Traverser and Uniqueness are the main ones. The Path interface
also has a special purpose in traversals, since it is used to represent a position in the graph when
evaluating that position. Furthermore, the PathExpander (replacing RelationshipExpander and Expander)
interface is central to traversals, but users of the API rarely need to implement it. There is also a set of
interfaces for advanced use, when explicit control over the traversal order is required: BranchSelector,
BranchOrderingPolicy and TraversalBranch.

TraversalDescription

The TraversalDescription (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/traversal/TraversalDescription.html) is the main interface used for defining and initializing traversals. It is
not meant to be implemented by users of the traversal framework, but rather to be provided
by the implementation of the traversal framework as a way for the user to describe traversals.
TraversalDescription instances are immutable, and their methods return a new TraversalDescription that
is modified compared to the object the method was invoked on, with the arguments of the method applied.

Relationships
Adds a relationship type to the list of relationship types to traverse. By default that list is empty, which means that all relationships are traversed, regardless of type. If one or more relationship types are added to this list, only the added types will be traversed. There are two relationships methods, one including direction and one excluding direction, where the latter traverses relationships in both directions.
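A minimal sketch of both variants (Rels is the RelationshipType enum defined later in this chapter; the combination shown is illustrative):

TraversalDescription td = db.traversalDescription()
    .relationships( Rels.KNOWS )                       // both directions
    .relationships( Rels.LIKES, Direction.INCOMING );  // only incoming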

Evaluator

Evaluators are used for deciding, at each position (represented as a Path): should the traversal continue, and/or should the node be included in the result. Given a Path, an evaluator asks for one of four actions for that branch of the traversal:

• Evaluation.INCLUDE_AND_CONTINUE: Include this node in the result and continue the traversal.
• Evaluation.INCLUDE_AND_PRUNE: Include this node in the result, but don’t continue the traversal.
• Evaluation.EXCLUDE_AND_CONTINUE: Exclude this node from the result, but continue the traversal.
• Evaluation.EXCLUDE_AND_PRUNE: Exclude this node from the result and don’t continue the traversal.

More than one evaluator can be added. Note that evaluators will be called for all positions the traverser
encounters, even for the start node.
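As an illustration, here is a minimal sketch of a custom evaluator; the name property check is hypothetical and not part of the original text:

Evaluator includeJoes = new Evaluator()
{
    @Override
    public Evaluation evaluate( Path path )
    {
        // include nodes named "Joe" and stop there, otherwise keep traversing
        if ( "Joe".equals( path.endNode().getProperty( "name", null ) ) )
        {
            return Evaluation.INCLUDE_AND_PRUNE;
        }
        return Evaluation.EXCLUDE_AND_CONTINUE;
    }
};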

Traverser

The Traverser object is the result of invoking traverse() on a TraversalDescription object. It represents a traversal positioned in the graph, and a specification of the format of the result. The actual traversal is performed lazily each time the next()-method of the iterator of the Traverser is invoked.

Uniqueness

Sets the rules for how positions can be revisited during a traversal, as stated in Uniqueness. The default, if not set, is NODE_GLOBAL.
A Uniqueness can be supplied to the TraversalDescription to dictate under what circumstances a
traversal may revisit the same position in the graph. The various uniqueness levels that can be used in
Neo4j are:
• NONE: Any position in the graph may be revisited.
• NODE_GLOBAL uniqueness: No node in the entire graph may be visited more than once. This could
potentially consume a lot of memory since it requires keeping an in-memory data structure
remembering all the visited nodes.
• RELATIONSHIP_GLOBAL uniqueness: no relationship in the entire graph may be visited more than once.
For the same reasons as NODE_GLOBAL uniqueness, this could use up a lot of memory. But since graphs
typically have a larger number of relationships than nodes, the memory overhead of this uniqueness
level could grow even quicker.
• NODE_PATH uniqueness: A node may not occur previously in the path reaching up to it.
• RELATIONSHIP_PATH uniqueness: A relationship may not occur previously in the path reaching up to it.
• NODE_RECENT uniqueness: Similar to NODE_GLOBAL uniqueness in that there is a global collection of visited
nodes each position is checked against. This uniqueness level does however have a cap on how
much memory it may consume in the form of a collection that only contains the most recently visited
nodes. The size of this collection can be specified by providing a number as the second argument to
the TraversalDescription.uniqueness()-method along with the uniqueness level.
• RELATIONSHIP_RECENT uniqueness: Works like NODE_RECENT uniqueness, but with relationships instead of
nodes.
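For example, a recent-nodes uniqueness with a cap of 1000 could be set up like this (a minimal sketch):

TraversalDescription td = db.traversalDescription()
    .uniqueness( Uniqueness.NODE_RECENT, 1000 );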

Depth First / Breadth First

These are convenience methods for setting preorder depth-first (http://en.wikipedia.org/wiki/Depth-first_search) or breadth-first (http://en.wikipedia.org/wiki/Breadth-first_search) BranchSelector/ordering policies. The same result can be achieved by calling the order method with ordering policies from BranchOrderingPolicies, or by writing your own BranchSelector/BranchOrderingPolicy and passing it in.

Order — How to move through branches?
A more generic version of the depthFirst/breadthFirst methods, in that it allows an arbitrary BranchOrderingPolicy to be injected into the description.

BranchSelector
A BranchSelector/BranchOrderingPolicy is used for selecting which branch of the traversal to attempt
next. This is used for implementing traversal orderings. The traversal framework provides a few basic
ordering implementations:
• BranchOrderingPolicies.PREORDER_DEPTH_FIRST: Traversing depth first, visiting each node before visiting
its child nodes.
• BranchOrderingPolicies.POSTORDER_DEPTH_FIRST: Traversing depth first, visiting each node after visiting
its child nodes.
• BranchOrderingPolicies.PREORDER_BREADTH_FIRST: Traversing breadth first, visiting each node before
visiting its child nodes.
• BranchOrderingPolicies.POSTORDER_BREADTH_FIRST: Traversing breadth first, visiting each node after
visiting its child nodes.

Note

Please note that breadth first traversals have a higher memory overhead than depth first
traversals.
A BranchSelector carries state and hence needs to be uniquely instantiated for each traversal. It is therefore supplied to the TraversalDescription through a BranchOrderingPolicy interface, which is a factory of BranchSelector instances.

A user of the traversal framework rarely needs to implement their own BranchSelector or BranchOrderingPolicy; this extension point exists to let graph algorithm implementors provide their own traversal orders. The Neo4j Graph Algorithms package contains, for example, a BestFirst-order BranchSelector/BranchOrderingPolicy that is used in best-first search algorithms such as A* and Dijkstra.

BranchOrderingPolicy
A factory for creating BranchSelectors to decide in what order branches are returned (where a branch’s position is represented as a Path from the start node to the current node). Common policies are depth-first and breadth-first, which is why there are convenience methods for those. For example, calling TraversalDescription.depthFirst() is equivalent to:
description.order( BranchOrderingPolicies.PREORDER_DEPTH_FIRST );

TraversalBranch
An object used by the BranchSelector to get more branches from a certain branch. In essence these are a composite of a Path and a RelationshipExpander that can be used to get new TraversalBranches from the current one.

Path
A Path is a general interface that is part of the Neo4j API. In the traversal API of Neo4j the use of Paths is twofold. Traversers can return their results in the form of the Paths of the visited positions in the graph that are marked for being returned. Path objects are also used in the evaluation of positions in the graph, for determining if the traversal should continue from a certain point or not, and whether a certain position should be included in the result set or not.

PathExpander/RelationshipExpander
The traversal framework uses PathExpanders (replacing RelationshipExpander) to discover the relationships that should be followed from a particular path to further branches in the traversal.

Expander
A more generic version of relationships where a RelationshipExpander is injected, defining all relationships to be traversed for any given node.

The Expander interface is an extension of the RelationshipExpander interface that makes it possible to build customized versions of an Expander. The implementation of TraversalDescription uses this to provide methods for defining which relationship types to traverse; this is the usual way a user of the API would define a RelationshipExpander — by building it internally in the TraversalDescription.

All the RelationshipExpanders provided by the Neo4j traversal framework also implement the Expander interface. For a user of the traversal API it is easier to implement the PathExpander/RelationshipExpander interface, since it only contains one method — the method for getting the relationships from a path/node; the methods that the Expander interface adds are just for building new Expanders.
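When a custom expansion is needed, a PathExpander can often be obtained from the built-in PathExpanders factory instead of being implemented by hand. A minimal sketch (assuming the Rels enum from the example below):

TraversalDescription td = db.traversalDescription()
    .expand( PathExpanders.forTypeAndDirection( Rels.KNOWS, Direction.OUTGOING ) );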


How to use the Traversal framework
A traversal description is built using a fluent interface, and such a description can then spawn traversers.
Figure 34.1. Traversal Example Graph
The example graph contains seven nodes: Joe (node 6) KNOWS Sara (node 1); Lisa (node 3) LIKES Joe and KNOWS Lars (node 0); Ed (node 2) KNOWS Lars; Lars KNOWS Dirk (node 5); Dirk KNOWS Peter (node 4); and Peter KNOWS Sara.

With the RelationshipTypes defined as:
private enum Rels implements RelationshipType
{
LIKES, KNOWS
}

The graph can be traversed with, for example, the following traverser, starting at the “Joe” node:
for ( Path position : db.traversalDescription()
.depthFirst()
.relationships( Rels.KNOWS )
.relationships( Rels.LIKES, Direction.INCOMING )
.evaluator( Evaluators.toDepth( 5 ) )
.traverse( node ) )
{
output += position + "\n";
}
The traversal will output:
(6)
(6)<--[LIKES,1]--(3)
(6)<--[LIKES,1]--(3)--[KNOWS,6]-->(0)
(6)<--[LIKES,1]--(3)--[KNOWS,6]-->(0)<--[KNOWS,5]--(2)
(6)<--[LIKES,1]--(3)--[KNOWS,6]-->(0)--[KNOWS,4]-->(5)
(6)<--[LIKES,1]--(3)--[KNOWS,6]-->(0)--[KNOWS,4]-->(5)--[KNOWS,3]-->(4)
(6)<--[LIKES,1]--(3)--[KNOWS,6]-->(0)--[KNOWS,4]-->(5)--[KNOWS,3]-->(4)--[KNOWS,2]-->(1)
Since TraversalDescriptions are immutable, it is also useful to create template descriptions which hold common settings shared by different traversals. For example, let’s start with this traverser:
TraversalDescription friendsTraversal = db.traversalDescription()
.depthFirst()
.relationships( Rels.KNOWS )
.uniqueness( Uniqueness.RELATIONSHIP_GLOBAL );

This traverser would yield the following output (we will keep starting from the “Joe” node):
(6)
(6)--[KNOWS,0]-->(1)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)<--[KNOWS,4]--(0)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)<--[KNOWS,4]--(0)<--[KNOWS,6]--(3)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)<--[KNOWS,4]--(0)<--[KNOWS,5]--(2)

Now let’s create a new traverser from it, restricting depth to three:
for ( Path path : friendsTraversal
.evaluator( Evaluators.toDepth( 3 ) )
.traverse( node ) )
{
output += path + "\n";
}

This will give us the following result:
(6)
(6)--[KNOWS,0]-->(1)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)

Or how about from depth two to four? That’s done like this:
for ( Path path : friendsTraversal
.evaluator( Evaluators.fromDepth( 2 ) )
.evaluator( Evaluators.toDepth( 4 ) )
.traverse( node ) )
{
output += path + "\n";
}

This traversal gives us:
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)
(6)--[KNOWS,0]-->(1)<--[KNOWS,2]--(4)<--[KNOWS,3]--(5)<--[KNOWS,4]--(0)
For various useful evaluators, see the Evaluators Java API (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/traversal/Evaluators.html) or simply implement the Evaluator interface yourself.

If you’re not interested in the Paths, but the Nodes, you can transform the traverser into an iterable of nodes like this:
for ( Node currentNode : friendsTraversal
.traverse( node )
.nodes() )
{
output += currentNode.getProperty( "name" ) + "\n";
}

In this case we use it to retrieve the names:
Joe
Sara
Peter
Dirk
Lars
Lisa
Ed
Relationships are fine as well; here’s how to get them:

for ( Relationship relationship : friendsTraversal
.traverse( node )
.relationships() )
{
output += relationship.getType().name() + "\n";
}

Here the relationship types are written, and we get:
KNOWS
KNOWS
KNOWS
KNOWS
KNOWS
KNOWS

Tip

The source code for the traversers in this example is available at:
https://github.com/neo4j/neo4j/blob/2.3.12/community/embedded-examples/src/main/java/org/neo4j/examples/TraversalExample.java


Chapter 35. Legacy Indexing
Note

This is not the same as indexes defined in the schema; the documentation below is for the legacy indexing in Neo4j.

This chapter focuses on how to use the Manual Indexes. As of Neo4j 2.0, this is not the favored method of indexing data in Neo4j; instead, we recommend defining indexes in the database schema.

However, support for legacy indexes remains, because certain features, such as full-text search, are not yet handled by the new indexes.


35.1. Introduction

Legacy indexing operations are part of the Neo4j index API (javadocs: http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/graphdb/index/package-summary.html).

Each index is tied to a unique, user-specified name (for example "first_name" or "books") and can index either nodes or relationships.

The default index implementation is provided by the neo4j-lucene-index component, which is included in the standard Neo4j download. It can also be downloaded separately from http://repo1.maven.org/maven2/org/neo4j/neo4j-lucene-index/. For Maven users, the neo4j-lucene-index component has the coordinates org.neo4j:neo4j-lucene-index and should be used with the same version of org.neo4j:neo4j-kernel. Different versions of the index and kernel components are not compatible in the general case. Both components are included transitively by the org.neo4j:neo4j:pom artifact, which makes it simple to keep the versions in sync.
For initial import of data using indexes, see Section 36.2, “Index Batch Insertion” [639].

Note

All modifying index operations must be performed inside a transaction, as with any
modifying operation in Neo4j.


35.2. Create
An index is created if it doesn’t exist when you ask for it. Unless you give it a custom configuration, it
will be created with default configuration and backend.
To set the stage for our examples, let’s create some indexes to begin with:
IndexManager index = graphDb.index();
Index<Node> actors = index.forNodes( "actors" );
Index<Node> movies = index.forNodes( "movies" );
RelationshipIndex roles = index.forRelationships( "roles" );

This will create two node indexes and one relationship index with default configuration. See
Section 35.8, “Relationship indexes” [631] for more information specific to relationship indexes.
See Section 35.10, “Configuration and fulltext indexes” [633] for how to create fulltext indexes.
You can also check if an index exists like this:
IndexManager index = graphDb.index();
boolean indexExists = index.existsForNodes( "actors" );


35.3. Delete
Indexes can be deleted. When deleting, the entire contents of the index will be removed as well as its
associated configuration. An index can be created with the same name at a later point in time.
IndexManager index = graphDb.index();
Index actors = index.forNodes( "actors" );
actors.delete();

Note that the actual deletion of the index happens during the commit of the surrounding transaction. Calls made to such an index instance after delete() has been called are invalid inside that transaction as well as outside (if the transaction is successful), but will become valid again if the transaction is rolled back.
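A minimal sketch of deleting an index inside a transaction:

try ( Transaction tx = graphDb.beginTx() )
{
    graphDb.index().forNodes( "actors" ).delete();
    tx.success();
}
// the index is actually removed when the transaction commits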


35.4. Add
Each index supports associating any number of key-value pairs with any number of entities (nodes or
relationships), where each association between entity and key-value pair is performed individually. To
begin with, let’s add a few nodes to the indexes:
// Actors
Node reeves = graphDb.createNode();
reeves.setProperty( "name", "Keanu Reeves" );
actors.add( reeves, "name", reeves.getProperty( "name" ) );
Node bellucci = graphDb.createNode();
bellucci.setProperty( "name", "Monica Bellucci" );
actors.add( bellucci, "name", bellucci.getProperty( "name" ) );
// multiple values for a field, in this case for search only
// and not stored as a property.
actors.add( bellucci, "name", "La Bellucci" );
// Movies
Node theMatrix = graphDb.createNode();
theMatrix.setProperty( "title", "The Matrix" );
theMatrix.setProperty( "year", 1999 );
movies.add( theMatrix, "title", theMatrix.getProperty( "title" ) );
movies.add( theMatrix, "year", theMatrix.getProperty( "year" ) );
Node theMatrixReloaded = graphDb.createNode();
theMatrixReloaded.setProperty( "title", "The Matrix Reloaded" );
theMatrixReloaded.setProperty( "year", 2003 );
movies.add( theMatrixReloaded, "title", theMatrixReloaded.getProperty( "title" ) );
movies.add( theMatrixReloaded, "year", 2003 );
Node malena = graphDb.createNode();
malena.setProperty( "title", "Malèna" );
malena.setProperty( "year", 2000 );
movies.add( malena, "title", malena.getProperty( "title" ) );
movies.add( malena, "year", malena.getProperty( "year" ) );

Note that there can be multiple values associated with the same entity and key.
Next up, we’ll create relationships and index them as well:
// we need a relationship type
DynamicRelationshipType ACTS_IN = DynamicRelationshipType.withName( "ACTS_IN" );
// create relationships
Relationship role1 = reeves.createRelationshipTo( theMatrix, ACTS_IN );
role1.setProperty( "name", "Neo" );
roles.add( role1, "name", role1.getProperty( "name" ) );
Relationship role2 = reeves.createRelationshipTo( theMatrixReloaded, ACTS_IN );
role2.setProperty( "name", "Neo" );
roles.add( role2, "name", role2.getProperty( "name" ) );
Relationship role3 = bellucci.createRelationshipTo( theMatrixReloaded, ACTS_IN );
role3.setProperty( "name", "Persephone" );
roles.add( role3, "name", role3.getProperty( "name" ) );
Relationship role4 = bellucci.createRelationshipTo( malena, ACTS_IN );
role4.setProperty( "name", "Malèna Scordia" );
roles.add( role4, "name", role4.getProperty( "name" ) );

After these operations, our example graph looks like this:

Figure 35.1. Movie and Actor Graph
The graph now contains: Keanu Reeves, who ACTS_IN The Matrix (1999) and The Matrix Reloaded (2003), both times as 'Neo'; and Monica Bellucci, who ACTS_IN The Matrix Reloaded as 'Persephone' and Malèna (2000) as 'Malèna Scordia'.

35.5. Remove
Removing from an index is similar to adding, but can be done by supplying one of the following combinations of arguments:
• entity
• entity, key
• entity, key, value
// completely remove bellucci from the actors index
actors.remove( bellucci );
// remove any "name" entry of bellucci from the actors index
actors.remove( bellucci, "name" );
// remove the "name" -> "La Bellucci" entry of bellucci
actors.remove( bellucci, "name", "La Bellucci" );


35.6. Update
Important

To update an index entry, the old one must be removed and a new one added. For details on
removing index entries, see Section 35.5, “Remove” [627].
Remember that a node or relationship can be associated with any number of key-value pairs in an
index. This means that you can index a node or relationship with many key-value pairs that have
the same key. In the case where a property value changes and you’d like to update the index, it’s not
enough to just index the new value — you’ll have to remove the old value as well.
Here’s a code example that demonstrates how it’s done:
// create a node with a property
// so we have something to update later on
Node fishburn = graphDb.createNode();
fishburn.setProperty( "name", "Fishburn" );
// index it
actors.add( fishburn, "name", fishburn.getProperty( "name" ) );
// update the index entry
// when the property value changes
actors.remove( fishburn, "name", fishburn.getProperty( "name" ) );
fishburn.setProperty( "name", "Laurence Fishburn" );
actors.add( fishburn, "name", fishburn.getProperty( "name" ) );


35.7. Search

An index can be searched in two ways, get and query. The get method will return exact matches to the given key-value pair, whereas query exposes querying capabilities directly from the backend used by the index. For example, the Lucene query syntax (http://lucene.apache.org/core/3_6_2/queryparsersyntax.html) can be used directly with the default indexing backend.

Get
This is how to search for a single exact match:
IndexHits<Node> hits = actors.get( "name", "Keanu Reeves" );
Node reeves = hits.getSingle();
IndexHits is an Iterable with some additional useful methods. For example, getSingle() returns the first and only item from the result iterator, or null if there isn’t any hit.

Here’s how to get a single relationship by exact matching and retrieve its start and end nodes:
Relationship persephone = roles.get( "name", "Persephone" ).getSingle();
Node actor = persephone.getStartNode();
Node movie = persephone.getEndNode();

Finally, we can iterate over all exact matches from a relationship index:
for ( Relationship role : roles.get( "name", "Neo" ) )
{
// this will give us Reeves twice
Node reeves = role.getStartNode();
}

Important

In case you don’t iterate through all the hits, IndexHits.close() must be called explicitly.
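A minimal sketch of guaranteeing the close (the try/finally arrangement is illustrative):

IndexHits<Node> hits = actors.get( "name", "Keanu Reeves" );
try
{
    Node reeves = hits.getSingle();
    // use reeves here ...
}
finally
{
    hits.close();
}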

Query
There are two query methods: one uses a key-value signature, where the value represents a query for values with the given key only; the other method is more generic and supports querying for more than one key-value pair in the same query.
Here’s an example using the key-query option:
for ( Node actor : actors.query( "name", "*e*" ) )
{
// This will return Reeves and Bellucci
}

In the following example the query uses multiple keys:
for ( Node movie : movies.query( "title:*Matrix* AND year:1999" ) )
{
// This will return "The Matrix" from 1999 only.
}

Note

Beginning a wildcard search with "*" or "?" is discouraged by Lucene, but will nevertheless
work.
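For example, a query with a leading wildcard like the following will still return Bellucci, at a potential performance cost (a minimal sketch):

for ( Node actor : actors.query( "name", "*ellucci" ) )
{
    // leading wildcards force the backend to scan more terms
}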

Caution

You can’t have any whitespace in the search term with this syntax. See the section called
“Querying with Lucene Query objects” [635] for how to do that.


35.8. Relationship indexes
An index for relationships is just like an index for nodes, extended by providing support to constrain a search to relationships with a specific start and/or end node. These extra methods reside in the RelationshipIndex interface, which extends Index.
Example of querying a relationship index:
// find relationships filtering on start node
// using exact matches
IndexHits<Relationship> reevesAsNeoHits;
reevesAsNeoHits = roles.get( "name", "Neo", reeves, null );
Relationship reevesAsNeo = reevesAsNeoHits.iterator().next();
reevesAsNeoHits.close();
// find relationships filtering on end node
// using a query
IndexHits<Relationship> matrixNeoHits;
matrixNeoHits = roles.query( "name", "*eo", null, theMatrix );
Relationship matrixNeo = matrixNeoHits.iterator().next();
matrixNeoHits.close();

And here’s an example for the special case of searching for a specific relationship type:
// find relationships filtering on end node
// using a relationship type.
// this is how to add it to the index:
roles.add( reevesAsNeo, "type", reevesAsNeo.getType().name() );
// Note that to use a compound query, we can't combine committed
// and uncommitted index entries, so we'll commit before querying:
tx.success();
tx.close();
// and now we can search for it:
try ( Transaction tx = graphDb.beginTx() )
{
IndexHits<Relationship> typeHits = roles.query( "type:ACTS_IN AND name:Neo", null, theMatrix );
Relationship typeNeo = typeHits.iterator().next();
typeHits.close();
}

Such an index can be useful if your domain has nodes with a very large number of relationships
between them, since it reduces the search time for a relationship between two nodes. A good example
where this approach pays dividends is in time series data, where we have readings represented as a
relationship per occurrence.


35.9. Scores

The IndexHits interface exposes scoring, so that the index can communicate scores for the hits. Note that the result is not sorted by the score unless you explicitly specify that. See the section called “Sorting” [634] for how to sort by score.

IndexHits<Node> hits = movies.query( "title", "The*" );
for ( Node movie : hits )
{
System.out.println( movie.getProperty( "title" ) + " " + hits.currentScore() );
}


35.10. Configuration and fulltext indexes
At the time of creation, extra configuration can be specified to control the behavior of the index and which backend to use. For example, to create a Lucene fulltext index:
IndexManager index = graphDb.index();
Index<Node> fulltextMovies = index.forNodes( "movies-fulltext",
MapUtil.stringMap( IndexManager.PROVIDER, "lucene", "type", "fulltext" ) );
fulltextMovies.add( theMatrix, "title", "The Matrix" );
fulltextMovies.add( theMatrixReloaded, "title", "The Matrix Reloaded" );
// search in the fulltext index
Node found = fulltextMovies.query( "title", "reloAdEd" ).getSingle();

Here’s an example of how to create an exact index which is case-insensitive:
Index<Node> index = graphDb.index().forNodes( "exact-case-insensitive",
stringMap( "type", "exact", "to_lower_case", "true" ) );
Node node = graphDb.createNode();
index.add( node, "name", "Thomas Anderson" );
assertContains( index.query( "name", "\"Thomas Anderson\"" ), node );
assertContains( index.query( "name", "\"thoMas ANDerson\"" ), node );

Tip

In order to search for tokenized words, the query method has to be used. The get method
will only match the full string value, not the tokens.
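To illustrate with the fulltext index above (a minimal sketch):

// query matches individual tokens, so this finds "The Matrix Reloaded"
Node reloaded = fulltextMovies.query( "title", "reloaded" ).getSingle();
// get compares against the full value, so a single token finds nothing here
Node nothing = fulltextMovies.get( "title", "reloaded" ).getSingle();  // null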

The configuration of the index is persisted once the index has been created. The provider configuration
key is interpreted by Neo4j, but any other configuration is passed onto the backend index (e.g. Lucene)
to interpret.
Lucene indexing configuration parameters

Parameter: type
Possible values: exact, fulltext
Effect: exact is the default and uses a Lucene keyword analyzer (http://lucene.apache.org/core/3_6_2/api/core/org/apache/lucene/analysis/KeywordAnalyzer.html). fulltext uses a white-space tokenizer in its analyzer.

Parameter: to_lower_case
Possible values: true, false
Effect: This parameter goes together with type: fulltext and converts values to lower case during both additions and querying, making the index case insensitive. Defaults to true.

Parameter: analyzer
Possible values: the full class name of an Analyzer (http://lucene.apache.org/core/3_6_2/api/core/org/apache/lucene/analysis/Analyzer.html)
Effect: Overrides the type so that a custom analyzer can be used. Note: to_lower_case still affects lowercasing of string queries. If the custom analyzer uppercases the indexed tokens, string queries will not match as expected.


35.11. Extra features for Lucene indexes
Numeric ranges
Lucene supports smart indexing of numbers, querying for ranges and sorting such results, and so does its backend for Neo4j. To mark a value so that it is indexed as a numeric value, we can make use of the ValueContext class, like this:
movies.add( theMatrix, "year-numeric", new ValueContext( 1999 ).indexNumeric() );
movies.add( theMatrixReloaded, "year-numeric", new ValueContext( 2003 ).indexNumeric() );
movies.add( malena, "year-numeric", new ValueContext( 2000 ).indexNumeric() );
int from = 1997;
int to = 1999;
hits = movies.query( QueryContext.numericRange( "year-numeric", from, to ) );

Note

The same type must be used for indexing and querying. That is, you can’t index a value as a
Long and then query the index using an Integer.
By giving null as from/to argument, an open ended query is created. In the following example we are
doing that, and have added sorting to the query as well:
hits = movies.query(
QueryContext.numericRange( "year-numeric", from, null )
.sortNumeric( "year-numeric", false ) );

The from/to values in ranges default to inclusive, but you can change this behavior by using two extra parameters:
movies.add( theMatrix, "score", new ValueContext( 8.7 ).indexNumeric() );
movies.add( theMatrixReloaded, "score", new ValueContext( 7.1 ).indexNumeric() );
movies.add( malena, "score", new ValueContext( 7.4 ).indexNumeric() );
// include 8.0, exclude 9.0
hits = movies.query( QueryContext.numericRange( "score", 8.0, 9.0, true, false ) );

Sorting
Lucene performs sorting very well, and that is also exposed in the index backend, through the QueryContext class:
hits = movies.query( "title", new QueryContext( "*" ).sort( "title" ) );
for ( Node hit : hits )
{
// all movies with a title in the index, ordered by title
}
// or
hits = movies.query( new QueryContext( "title:*" ).sort( "year", "title" ) );
for ( Node hit : hits )
{
// all movies with a title in the index, ordered by year, then title
}

We sort the results by relevance (score) like this:
hits = movies.query( "title", new QueryContext( "The*" ).sortByScore() );
for ( Node movie : hits )
{
// hits sorted by relevance (score)
}

Querying with Lucene Query objects
Instead of passing in Lucene query syntax queries, you can instantiate such queries programmatically and pass them in as argument, for example:
// a TermQuery will give exact matches
Node actor = actors.query( new TermQuery( new Term( "name", "Keanu Reeves" ) ) ).getSingle();
Note that the TermQuery is basically the same thing as using the get method on the index.

This is how to perform wildcard searches using Lucene Query Objects:
hits = movies.query( new WildcardQuery( new Term( "title", "The Matrix*" ) ) );
for ( Node movie : hits )
{
System.out.println( movie.getProperty( "title" ) );
}

Note that this allows for whitespace in the search string.

Compound queries
Lucene supports querying for multiple terms in the same query, like so:
hits = movies.query( "title:*Matrix* AND year:1999" );

Caution

Compound queries can’t search across committed index entries and those that haven’t been committed yet at the same time.

Default operator
The default operator (that is, whether AND or OR is used between different terms) in a query is OR. Changing that behavior is also done via the QueryContext class:
QueryContext query = new QueryContext( "title:*Matrix* year:1999" )
.defaultOperator( Operator.AND );
hits = movies.query( query );


Chapter 36. Batch Insertion
Neo4j has a batch insertion facility intended for initial imports, which bypasses transactions and other
checks in favor of performance. This is useful when you have a big dataset that needs to be loaded
once.
Batch insertion is included in the neo4j-kernel component (http://search.maven.org/#search|ga|1|neo4j-kernel), which is part of all Neo4j distributions and editions.
Be aware of the following points when using batch insertion:
• The intended use is for initial import of data, but you can use it on an existing database if the existing database is shut down first.
• Batch insertion is not thread safe.
• Batch insertion is non-transactional.
• Batch insertion does not enforce constraints on the inserted data.
• Batch insertion will re-populate all existing indexes and indexes created during batch insertion on
shutdown.
• Batch insertion will verify all existing constraints and constraints created during batch insertion on
shutdown.
• Unless shutdown is successfully invoked at the end of the import, the database files will be corrupt.

Warning

Always perform batch insertion in a single thread (or use synchronization to make only one
thread at a time access the batch inserter) and invoke shutdown when finished.

Warning

Since batch insertion doesn’t enforce constraints during data loading, if the inserted data violates any constraint the batch inserter will fail on shutdown and the database will be inconsistent.


36.1. Batch Inserter Examples
Initial import
To bulk load data using the batch inserter you’ll need to write a Java application which makes use of the low-level BatchInserter interface (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/unsafe/batchinsert/BatchInserter.html).

Tip

You can’t have multiple threads using the batch inserter concurrently without external synchronization.

You can get hold of an instance of BatchInserter by using BatchInserters (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/unsafe/batchinsert/BatchInserters.html). Here’s an example of the batch inserter in use:
batch inserter in use:
BatchInserter inserter = null;
try
{
inserter = BatchInserters.inserter(
new File( "target/batchinserter-example" ).getAbsolutePath() );
Label personLabel = DynamicLabel.label( "Person" );
inserter.createDeferredSchemaIndex( personLabel ).on( "name" ).create();
Map<String, Object> properties = new HashMap<>();
properties.put( "name", "Mattias" );
long mattiasNode = inserter.createNode( properties, personLabel );
properties.put( "name", "Chris" );
long chrisNode = inserter.createNode( properties, personLabel );
RelationshipType knows = DynamicRelationshipType.withName( "KNOWS" );
inserter.createRelationship( mattiasNode, chrisNode, knows, null );
}
finally
{
if ( inserter != null )
{
inserter.shutdown();
}
}

When creating a relationship you can set properties on the relationship by passing in a map containing
properties rather than null as the last parameter to createRelationship.
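For example (a minimal sketch, reusing the nodes and relationship type from above; the since property is illustrative):

Map<String, Object> relProps = new HashMap<>();
relProps.put( "since", 2010 );
inserter.createRelationship( mattiasNode, chrisNode, knows, relProps );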
It’s important that the call to shutdown is inside a finally block to ensure that it gets called even if exceptions are thrown. If the batch inserter isn’t cleanly shut down, the consistency of the store is not guaranteed.

Tip

The source code for the examples on this page can be found here:
https://github.com/neo4j/neo4j/blob/2.3.12/community/kernel/src/test/java/examples/BatchInsertDocTest.java


Setting configuration options
You can pass custom configuration options to the BatchInserter. (See the section called “Batch insert example” [463] for information on the available options.) For example:

Map<String, String> config = new HashMap<>();
config.put( "dbms.pagecache.memory", "512m" );
BatchInserter inserter = BatchInserters.inserter(
new File( "target/batchinserter-example-config" ).getAbsolutePath(), config );
// Insert data here ... and then shut down:
inserter.shutdown();

Alternatively you could store the configuration in a file:
batchinsert-config
dbms.pagecache.memory=8m

You can then refer to that file when initializing BatchInserter:
try ( FileReader input = new FileReader( new File( "target/docs/batchinsert-config" ).getAbsoluteFile() ) )
{
Map<String, String> config = MapUtil.load( input );
BatchInserter inserter = BatchInserters.inserter(
"target/docs/batchinserter-example-config", config );
// Insert data here ... and then shut down:
inserter.shutdown();
}

Importing into an existing database
Although it’s a less common use case, the batch inserter can also be used to import data into an
existing database. However, you will need to ensure that the existing database is shut down before you
write to it.

Warning

Since the batch importer bypasses transactions there is a possibility of data inconsistency if
the import process crashes midway. We would strongly suggest you take a backup of your
existing database before using the batch inserter against it.


36.2. Index Batch Insertion
For general notes on batch insertion, see Chapter 36, Batch Insertion [636].
Indexing during batch insertion is done using BatchInserterIndex (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/unsafe/batchinsert/BatchInserterIndex.html) instances, provided via a BatchInserterIndexProvider (http://neo4j.com/docs/2.3.12/javadocs/org/neo4j/unsafe/batchinsert/BatchInserterIndexProvider.html). An example:
BatchInserter inserter = BatchInserters.inserter( "target/neo4jdb-batchinsert" );
BatchInserterIndexProvider indexProvider =
new LuceneBatchInserterIndexProvider( inserter );
BatchInserterIndex actors =
indexProvider.nodeIndex( "actors", MapUtil.stringMap( "type", "exact" ) );
actors.setCacheCapacity( "name", 100000 );
Map<String, Object> properties = MapUtil.map( "name", "Keanu Reeves" );
long node = inserter.createNode( properties );
actors.add( node, properties );
//make the changes visible for reading, use this sparsely, requires IO!
actors.flush();
// Make sure to shut down the index provider as well
indexProvider.shutdown();
inserter.shutdown();

The configuration parameters are the same as mentioned in Section 35.10, “Configuration and fulltext
indexes” [633].

Best practices

Here are some pointers to get the most performance out of BatchInserterIndex:

• Try to avoid flushing too often, because each flush makes all additions since the last flush visible to the querying methods, and publishing those changes can be a performance penalty.
• Have (as big as possible) phases where one phase is either only writes or only reads, and don’t forget to flush after a write phase so that those changes become visible to the querying methods.
• Enable caching for keys you know you’re going to do lookups for later on, to increase performance significantly (though insertion performance may degrade slightly).

Note

Changes to the index only become available for reading after they have been flushed to disk. Thus, for optimal performance, read and lookup operations should be kept to a minimum during batch insertion, since they involve IO and impact speed negatively.


Terminology
The terminology used for Cypher and Neo4j is drawn from the worlds of database design and graph
theory. This section provides cross-linked summaries of common terms.
In some cases, multiple terms (e.g., arc, edge, relationship) may be used for the same or similar
concept. An asterisk (*) to the right of a term indicates that the term is commonly used for Neo4j and
Cypher.
acyclic

for a graph or subgraph: when there is no way to start at some node n and
follow a sequence of adjacent relationships that eventually loops back to n
again. The opposite of cyclic.

adjacent

nodes sharing an incident (that is, directly-connected) relationship or
relationships sharing an incident node.

aggregating expression

expression that summarizes a set of values, like computing their sum or
their maximum.

arc

graph theory: a synonym for a directed relationship.

array

container that holds a number of elements. The element types can be the
types supported by the underlying graph storage layer, but all elements
must be of the same type.

attribute

Synonym for property.

clause

component of a Cypher query or command; starts with an identifying
keyword (for example CREATE). The following clauses currently exist in
Cypher: CREATE, CREATE UNIQUE, DELETE, FOREACH, LOAD CSV, MATCH, MERGE,
OPTIONAL MATCH, REMOVE, RETURN, SET, START, UNION, and WITH.

co-incident

alternative term for adjacent relationships, which share a common node.

collection

container that holds a number of values. The values can have mixed
types.

command

a statement that operates on the database without affecting the data
graph or returning content from it.

commit

successful completion of a transaction, ensuring durability of any changes
made.

constraint

part of a database schema: defines a contract that the database will never
break (for example, uniqueness of a property on all nodes that have a
specific label).

cyclic

The opposite of acyclic.

Cypher

a special-purpose programming language for describing queries and
operations on a graph database, with accompanying natural language
concepts.

DAG

a directed, acyclic graph: there are no cyclic paths and all the relationships
are directed.

data graph

graph stored in the database. See also property graph.
data record

a unit of storage containing an arbitrary unordered collection of
properties.

degree

of a node: is the number of relationships leaving or entering (if directed)
the node; loops are counted twice.

directed relationship

a relationship that has a direction; that is the relationship has a source
node and a destination node. The opposite of an undirected relationship.
All relationships in a Neo4j graph are directed.

edge

graph theory: a synonym for undirected relationship.

execution plan

parsed and compiled statement that is ready for Neo4j to execute.
An execution plan consists of the physical operations that need to be
performed in order to achieve the intent of the statement.

execution result

all statements return an execution result. For queries, this can contain an
iterator of result rows.

expression

produces values; may be used in projections, as a predicate, or when
setting properties on graph elements.

graph

1. data graph,
2. property graph,
3. graph theory: set of vertices and edges.

graph database

a database that uses graph-based structures (for example, nodes,
relationships, properties) to represent and store data.

graph element

node or relationship that is part of a graph.

identifier

identifiers are named bindings to values (for example, collections, scalars)
in a statement. For example, in MATCH n RETURN n, n is an identifier.

incident

adjacent relationship attached to a node or a node attached to a
relationship.

incoming relationship

pertaining to a directed relationship: from the point of view of a node n,
this is any relationship r arriving at n, exemplified by ()-[:r]->(n). The
opposite of outgoing.

index

data structure that improves performance of a database by redundantly
storing the same information in a way that is faster to read.

intermediate result

set of identifiers and values (record) passed from one clause to another
during query execution. This is internal to the execution of a given query.

label

marks a node as a member of a named subset. A node may be assigned
zero or more labels. Labels are written as :label in Cypher (the actual
label is prefixed by a colon). Note: graph theory: This differs from
mathematical graphs, where a label applies uniquely to a single vertex.

loop

a relationship that connects a node to itself.

neighbor

of node: another node, connected by a common relationship; of
relationship: another relationship, connected to a common node.

node*

data record within a data graph; contains an arbitrary collection of
properties. Nodes may have zero, one, or more labels and are optionally
connected by relationships. Similar to vertex.
null

NULL is a special marker, used to indicate that a data item does not exist in the graph or that the value of an expression is unknown or inapplicable.

operator

there are three categories of operators in Cypher:
1. Arithmetic, such as +, /, % etc.;
2. Logical, such as OR, AND, NOT etc.; and
3. Comparison, such as <, >, = etc.
outgoing relationship

pertaining to a directed relationship: from the point of view of a node n, this
is any relationship r leaving n, exemplified by (n)-[:r]->(). The opposite
of incoming relationship.

parameter

named value provided when running a statement. Parameters allow
Cypher to efficiently re-use execution plans without having to parse and
recompile every statement when only a literal value changes.

path

collection of alternating nodes and relationships that corresponds to a
walk in the data graph.

pattern graph

graph used to express the shape (that is, connectivity pattern) of the
data being searched for in the data graph. This is what MATCH and WHERE
describe in a Cypher query.

predicate

expression that returns TRUE, FALSE or NULL. When used in WHERE, NULL is
treated as FALSE.

projection

an operation taking result rows as both input and output data. This may
be a subset of the identifiers provided in the input, a calculation based on
single or multiple identifiers in the input, or both. The relevant clauses are
WITH and RETURN.

property graph

a graph having directed, typed relationships. Each node or relationship
may have zero or more associated properties.

property*

named value stored in a node or relationship. Synonym for attribute.

query

statement that reads or writes data from the database.

relationship type

marks a relationship as a member of a named subset. A relationship must
be assigned one and only one type. For example, in the Cypher pattern
(start)-[:TYPE]->(to), TYPE is the relationship type.

relationship*

data record in a property graph that associates an ordered pair of nodes.
Similar to arc and edge.

result row

each query returns an iterator of result rows, which represents the result
of executing the query. Each result row is a set of key-value pairs (a
record).

rollback

abort of the containing transaction, effectively undoing any changes
defined inside the transaction.

schema

persistent database state that describes available indexes and enabled
constraints for the data graph.

schema command

statement that updates the schema.

statement

text string containing a Cypher query or command.
transaction

A transaction comprises a unit of work performed against a database. It is
treated in a coherent and reliable way, independent of other transactions.
A transaction, by definition, must be atomic, consistent, isolated, and
durable.

transitive closure

of a graph: is a graph which contains a relationship from node x to node
y whenever there is a directed path from x to y; For example, if there is
a relationship from a to b, and another from b to c, then the transitive
closure includes a relationship from a to c.

type

types classify values. Each value in Cypher has a concrete type. Supported types are:

• string,
• boolean,
• the number types (double, integer, long),
• the map types (plain maps, nodes, and relationships),
• and collections of any concrete type.

The type hierarchy supports several other types (for example, any, scalar,
derived map, collection). These are used to classify values and collections
of values having different concrete types.
undirected relationship

a relationship that doesn’t have a direction. The opposite of directed
relationship.

vertex

graph theory: the fundamental unit used to form a mathematical graph
(plural: vertices). See node.


Appendix A. Resources
• Neo4j Cypher Refcard: http://neo4j.com/docs/2.3.12/cypher-refcard/.
• Neo4j Javadocs: http://neo4j.com/docs/2.3.12/javadocs/. You can also download a javadocs.jar file from Maven Central, see org.neo4j.doc:neo4j-javadocs (http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.neo4j.doc%22%20AND%20a%3A%22neo4j-javadocs%22), or download neo4j-javadocs-2.3.12-javadoc.jar directly from http://central.maven.org/maven2/org/neo4j/doc/neo4j-javadocs/2.3.12/neo4j-javadocs-2.3.12-javadoc.jar.
• Neo4j GraphGist, an online tool for creating interactive documents with executable Cypher
statements: http://gist.neo4j.org/.
• The main Neo4j site at http://neo4j.com/ is a good starting point to learn about Neo4j.
• See http://neo4j.com/developer/language-guides/ for how to use Neo4j from different programming
languages.
Below are some starting points within this manual:
• Section 2.1, “The Neo4j Graph Database” [5]
• Part III, “Cypher Query Language” [102]
• Chapter 3, Introduction to Cypher [16]
• Chapter 21, REST API [297]
• Chapter 23, Installation & Deployment [437]
• Section 23.4, “Upgrading” [445]
• Chapter 27, Security [499]


Appendix B. Manpages
The Neo4j Unix manual pages are included on the following pages.
•
•
•
•
•

neo4j
neo4j-shell
neo4j-import
neo4j-backup
neo4j-arbiter


Name
neo4j — Neo4j Server control

Synopsis
neo4j <command>

DESCRIPTION
Neo4j is a graph database, perfect for working with highly connected data. The neo4j command is used
to control the Neo4j Server.
The preferred way to install Neo4j on Linux systems is by using prebuilt installation packages. For
information regarding Windows, see http://neo4j.com/docs/stable/powershell.html.

COMMANDS
console
start
stop
restart
status
info

Start the server as an application, running as a foreground process. Stop the server using
CTRL-C.
Start server as daemon, running as a background process.
Stops a running daemonized server.
Restarts the server.
Current running state of the server.
Displays configuration information, such as the current NEO4J_HOME and CLASSPATH.

FILES
conf/neo4j-server.properties
conf/neo4j-wrapper.conf
conf/neo4j.properties

Server configuration.
Configuration for service wrapper.
Tuning configuration for the database.


Name
neo4j-shell — a command-line tool for exploring and manipulating a graph database

Synopsis
neo4j-shell [REMOTE OPTIONS]
neo4j-shell [LOCAL OPTIONS]

DESCRIPTION
Neo4j shell is a command-line shell for running Cypher queries. There are also commands to get information about the database. In addition, you can browse the graph much like the Unix shell, where commands like cd, ls and pwd can be used to browse your local file system. The shell can connect directly to a graph database on the file system. To access a local database used by other processes, use the read-only mode.

REMOTE OPTIONS
-port PORT
    Port of host to connect to (default: 1337).
-host HOST
    Domain name or IP of host to connect to (default: localhost).
-name NAME
    RMI name, i.e. rmi://<host>:<port>/<name> (default: shell).
-readonly
    Access the database in read-only mode.

LOCAL OPTIONS
-path PATH
    The path to the database directory. If there is no database at the location, a new one will be created.
-pid PID
    Process ID to connect to.
-readonly
    Access the database in read-only mode.
-c COMMAND
    Command line to execute. After executing it the shell exits.
-file FILE
    File to read and execute. After executing it the shell exits. If - is supplied as filename, data is read from stdin instead.
-config CONFIG
    The path to the Neo4j configuration file to be used.

EXAMPLES
Examples for remote:
neo4j-shell
neo4j-shell -port 1337
neo4j-shell -host 192.168.1.234 -port 1337 -name shell
neo4j-shell -host localhost -readonly

Examples for local:
neo4j-shell -path /path/to/db
neo4j-shell -path /path/to/db -config /path/to/neo4j.properties
neo4j-shell -path /path/to/db -readonly


Name
neo4j-import — Neo4j Import Tool

Synopsis
neo4j-import [options]

DESCRIPTION
neo4j-import is used to create a new Neo4j database from data in CSV files. See the chapter "Import
Tool" in the Neo4j Manual for details on the CSV file format — a special kind of header is required.
For information regarding Windows, see http://neo4j.com/docs/stable/powershell.html.

OPTIONS

--into <store-dir>
    Database directory to import into. Must not contain an existing database.
--nodes[:Label1:Label2] "<file1>,<file2>,…"
    Node CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.
--relationships[:RELATIONSHIP_TYPE] "<file1>,<file2>,…"
    Relationship CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.
--delimiter <delimiter-character>
    Delimiter character between values in CSV data. The default option is ,.
--array-delimiter <array-delimiter-character>
    Delimiter character between array elements within a value in CSV data. The default option is ;.
--quote <quotation-character>
    Character to treat as quotation character for values in CSV data. The default option is ". Quotes inside quotes escaped like """Go away"", he said." and "\"Go away\", he said." are supported. If you have set "'" to be used as the quotation character, you could write the previous example like this instead: '"Go away", he said.'
--multiline-fields <true/false>
    Whether or not fields from the input source can span multiple lines, i.e. contain newline characters. Default value: false
--input-encoding <character-set>
    Character set that input data is encoded in. The provided value must be one out of the available character sets in the JVM, as provided by Charset#availableCharsets(). If no input encoding is provided, the default character set of the JVM will be used.
--ignore-empty-strings <true/false>
    Whether or not empty string fields, i.e. "" from input source are ignored, i.e. treated as null. Default value: false
--id-type <id-type>
    One out of [STRING, INTEGER, ACTUAL]; specifies how ids in node/relationship input files are treated. STRING: arbitrary strings for identifying nodes. INTEGER: arbitrary integer values for identifying nodes. ACTUAL: (advanced) actual node ids. The default option is STRING. Default value: STRING
--processors <max-processor-count>
    Max number of processors used by the importer. Defaults to the number of available processors reported by the JVM. There is a certain amount of minimum threads needed, so for that reason there is no lower bound for this value. For optimal performance this value shouldn’t be greater than the number of available processors.
--stacktrace <true/false>
    Enable printing of error stack traces.
--bad-tolerance <max-number-of-bad-entries>
    Number of bad entries before the import is considered failed. This tolerance threshold is about relationships referring to missing nodes. Format errors in input data are still treated as errors. Default value: 1000
--skip-bad-relationships <true/false>
    Whether or not to skip importing relationships that refer to missing node ids, i.e. either start or end node id/group referring to a node that wasn’t specified by the node input data. Skipped relationships will be logged, containing at most the number of entities specified by bad-tolerance. Default value: true
--skip-duplicate-nodes <true/false>
    Whether or not to skip importing nodes that have the same id/group. In the event of multiple nodes within the same group having the same id, the first encountered will be imported whereas consecutive such nodes will be skipped. Skipped nodes will be logged, containing at most the number of entities specified by bad-tolerance. Default value: false
--ignore-extra-columns <true/false>
    Whether or not to ignore extra columns in the data not specified by the header. Skipped columns will be logged, containing at most the number of entities specified by bad-tolerance. Default value: false
--db-config <path/to/neo4j.properties>
    (advanced) File specifying database-specific configuration. For more information, consult the manual about available configuration options for a Neo4j configuration file. Only configuration affecting the store at the time of creation will be read. Examples of supported config are: dense_node_threshold, string_block_size, array_block_size

EXAMPLES
Below is a basic example, where we import movies, actors and roles from three files.
movies.csv
movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors.csv
personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

roles.csv
:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN

The command will look like this:
neo4j-import --into path_to_target_directory --nodes movies.csv --nodes actors.csv --relationships roles.csv
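
As a further, hypothetical sketch (the target directory and tab-separated input are assumptions, not from the basic example), several of the options described above can be combined in one invocation:
neo4j-import --into /tmp/movies-db --nodes movies.csv --nodes actors.csv --relationships roles.csv --delimiter TAB --id-type STRING --bad-tolerance 5000 --skip-bad-relationships true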

See the Neo4j Manual for further examples.


Name
neo4j-backup — Neo4j Backup Tool

Synopsis
neo4j-backup -host <host> [-port <port>] -to <target_directory>

DESCRIPTION
A tool to perform live backups over the network from a running Neo4j graph database onto a local
filesystem. Backups can be either full or incremental. The first backup must be a full backup, after that
incremental backups can be performed.
The source(s) are given as host:port pairs, the target is a filesystem location.
For information regarding Windows, see http://neo4j.com/docs/stable/powershell.html.

BACKUP TYPE
-full
    copies the entire database to a directory.
-incremental
    copies the changes that have taken place since the last full or incremental backup to an existing backup store.

The backup tool will automatically detect whether it needs to do a full or an incremental backup.

SOURCE ADDRESS
Backup sources are given in the following format:

-host <host> [-port <port>]

host
    In single mode, the host of a source database; in HA mode, the cluster address of a cluster member.
port
    In single mode, the port of a source database backup service; in HA mode, the port of a cluster instance. If not given, the default value 6362 will be used for single mode, 5001 for HA.

IMPORTANT
Backups can only be performed on databases which have the configuration parameter
enable_online_backup=true set. That will make the backup service available on the default port (6362). To
enable the backup service on a different port use for example enable_online_backup=port=9999 instead.
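
A minimal sketch of that setting in the database properties file, using the parameter exactly as named above (the file location conf/neo4j.properties is an assumption based on this manual's conventions):
# conf/neo4j.properties — make the backup service available on the default port (6362)
enable_online_backup=true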

EXAMPLES
• Performing a backup the first time: create a blank directory and run the backup tool
neo4j-backup -host 192.168.1.34 -to /mnt/backup/neo4j-backup
• Subsequent backups using the same target directory will be incremental and therefore quick
neo4j-backup -host freja -to /mnt/backup/neo4j-backup
• Performing a backup where the service is registered on a custom port
neo4j-backup -host freja -port 9999 -to /mnt/backup/neo4j-backup
• Performing a backup from an HA cluster, specifying a cluster member
./neo4j-backup -host oden -to /mnt/backup/neo4j-backup
• Performing a backup from an HA cluster, specifying a cluster member registered on a custom port
./neo4j-backup -host oden -port 9191 -to /mnt/backup/neo4j-backup

RESTORE FROM BACKUP
Neo4j backups are fully functional databases. To use a backup, replace your database directory with the backup.
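
For illustration, a restore might look like the following sketch, assuming a standard layout where the database lives in data/graph.db under the Neo4j home (all paths here are hypothetical):
./bin/neo4j stop
mv data/graph.db data/graph.db.old
cp -R /mnt/backup/neo4j-backup data/graph.db
./bin/neo4j start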

Name
neo4j-arbiter — Neo4j Arbiter for High-Availability clusters

Synopsis
neo4j-arbiter <command>

DESCRIPTION
Neo4j Arbiter is a service that can help break ties in Neo4j clusters that have an even number of cluster
members.

COMMANDS
console
    Start the server as an application, running as a foreground process. Stop the server using CTRL-C.
start
    Start the server as a daemon, running as a background process.
stop
    Stops a running daemonized server.
restart
    Restarts a running server.
status
    Current running state of the server.
install
    Installs the server as a platform-appropriate system service.
remove
    Uninstalls the system service.

FILES
conf/arbiter.cfg
    Arbiter server configuration.
conf/arbiter-wrapper.cfg
    Configuration for the service wrapper.
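
For example, using the commands listed above, the arbiter can be tried out in the foreground and then installed and started as a system service:
neo4j-arbiter console
neo4j-arbiter install
neo4j-arbiter start
neo4j-arbiter status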
