Table of Contents
Introduction
1.1
Getting Started
1.2
Installation
1.2.1
Run the server
1.2.2
Run the console
1.2.3
Run the Studio
1.2.4
Classes
1.2.5
Clusters
1.2.6
Record ID
1.2.7
Relationships
1.2.8
Basic SQL
1.2.9
Working with Graphs
1.2.10
Using Schema with Graphs
1.2.11
Setup a Distributed Database
1.2.12
Working with Distributed Graphs
1.2.13
Data Modeling
1.3
Basic Concepts
1.3.1
Supported Types
1.3.1.1
Inheritance
1.3.1.2
Concurrency
1.3.1.3
Schema
1.3.1.4
Graph or Document API?
1.3.1.5
Cluster Selection
1.3.1.6
Managing Dates
1.3.1.7
Graph Consistency
1.3.2
Fetching Strategies
1.3.3
Use Cases
1.3.4
Time Series
1.3.4.1
Chat
1.3.4.2
Key Value
1.3.4.3
Queue system
1.3.4.4
Administration
1.4
Console
1.4.1
Backup
1.4.1.1
Begin
1.4.1.2
Browse Class
1.4.1.3
Browse Cluster
1.4.1.4
List Classes
1.4.1.5
Cluster Status
1.4.1.6
List Clusters
1.4.1.7
List Servers
1.4.1.8
List Server Users
1.4.1.9
Commit
1.4.1.10
Config
1.4.1.11
Config Get
1.4.1.12
Config Set
1.4.1.13
Connect
1.4.1.14
Create Cluster
1.4.1.15
Create Database
1.4.1.16
Create Index
1.4.1.17
Create Link
1.4.1.18
Create Property
1.4.1.19
Declare Intent
1.4.1.20
Delete
1.4.1.21
Dictionary Get
1.4.1.22
Dictionary Keys
1.4.1.23
Dictionary Put
1.4.1.24
Dictionary Remove
1.4.1.25
Disconnect
1.4.1.26
Display Record
1.4.1.27
Display Raw Record
1.4.1.28
Drop Cluster
1.4.1.29
Drop Database
1.4.1.30
Drop Server User
1.4.1.31
Export Database
1.4.1.32
Export Record
1.4.1.33
Freeze DB
1.4.1.34
Get
1.4.1.35
Gremlin
1.4.1.36
Import Database
1.4.1.37
Indexes
1.4.1.38
Info
1.4.1.39
Info Class
1.4.1.40
Info Property
1.4.1.41
Insert
1.4.1.42
Js
1.4.1.43
Jss
1.4.1.44
List Databases
1.4.1.45
List Connections
1.4.1.46
Load Record
1.4.1.47
Load Script
1.4.1.48
Profiler
1.4.1.49
Properties
1.4.1.50
Release DB
1.4.1.51
Reload Record
1.4.1.52
Repair Database
1.4.1.53
Restore
1.4.1.54
Rollback
1.4.1.55
Set
1.4.1.56
Set Server User
1.4.1.57
Sleep
1.4.1.58
Upgrading
1.4.2
Backward compatibility
1.4.2.1
From 2.1.x to 2.2.x
1.4.2.2
From 2.0.x to 2.1.x
1.4.2.3
From 1.7.x to 2.0.x
1.4.2.4
From 1.6.x to 1.7.x
1.4.2.5
From 1.5.x to 1.6.x
1.4.2.6
From 1.4.x to 1.5.x
1.4.2.7
From 1.3.x to 1.4.x
1.4.2.8
Backup and Restore
1.4.3
Incremental Backup and Restore
1.4.3.1
Export and Import
1.4.4
Export format
1.4.4.1
Import From RDBMS
1.4.4.2
To Document Model
1.4.4.2.1
To Graph Model
1.4.4.2.2
Import From Neo4j
1.4.4.3
Neo4j to OrientDB Importer
1.4.4.3.1
Tutorial: Importing the northwind Database from Neo4j
1.4.4.3.1.1
Import from Neo4j using GraphML
1.4.4.3.2
Tutorial: Importing the movie Database from Neo4j
1.4.4.3.2.1
ETL
1.4.5
Configuration
1.4.5.1
Blocks
1.4.5.2
Sources
1.4.5.3
Extractors
1.4.5.4
Transformers
1.4.5.5
Loaders
1.4.5.6
Tutorial: Importing the Open Beer Database into OrientDB
1.4.5.7
Import from CSV to a Graph
1.4.5.8
Import a tree structure
1.4.5.9
Import from JSON
1.4.5.10
Import from RDBMS
1.4.5.11
Import from DB-Pedia
1.4.5.12
Import from Parse (Facebook)
1.4.5.13
Logging
1.4.6
Scheduler
1.4.7
Studio
1.4.8
Query
1.4.8.1
Edit Document
1.4.8.2
Edit Vertex
1.4.8.3
Schema
1.4.8.4
Class
1.4.8.5
Graph Editor
1.4.8.6
Functions
1.4.8.7
Security
1.4.8.8
Database Management
1.4.8.9
Dashboard
1.4.8.10
Server Management
1.4.8.11
Cluster Management
1.4.8.12
Data Centers
1.4.8.13
Query Profiler
1.4.8.14
Studio Auditing
1.4.8.15
Studio
1.4.8.16
Teleporter
1.4.8.17
Teleporter
1.4.9
Installation and configuration
1.4.9.1
Execution strategies
1.4.9.2
Sequential executions and One-Way Synchronizer
1.4.9.3
Import filters
1.4.9.4
Inheritance
1.4.9.5
Single Table Inheritance
1.4.9.5.1
Table Per Class Inheritance
1.4.9.5.2
Table Per Concrete Class Inheritance
1.4.9.5.3
Import Configuration
1.4.9.6
Troubleshooting
1.4.10
Java
1.4.10.1
Query Examples
1.4.10.2
Performance Tuning
1.4.11
Setting Configuration
1.4.11.1
Graph API
1.4.11.2
Document API
1.4.11.3
Object API
1.4.11.4
Profiler
1.4.11.5
Leak Detector
1.4.11.6
Distributed tuning
1.4.11.7
Security
1.4.12
Database security
1.4.12.1
Server security
1.4.12.2
Database encryption
1.4.12.3
Secure SSL connections
1.4.12.4
Security Configuration
1.4.12.5
Kerberos Example
1.4.12.6
Security v2.2 Code Changes
1.4.12.7
Security v2.2 New Features
1.4.12.8
Symmetric Key Authentication
1.4.12.9
Server Management
1.4.13
Install as Service on Unix
1.4.13.1
Install as Service on Windows
1.4.13.2
Install with Docker
1.4.13.3
Stress Test Tool
1.4.14
APIs and Drivers
1.5
Functions
1.5.1
Creating Functions
1.5.1.1
Using Functions
1.5.1.2
Accessing the Database
1.5.1.3
Server-side Functions
1.5.1.4
Available Plugins and Tools
1.5.2
Java API
1.5.3
Java API Introduction
1.5.3.1
Graph API
1.5.3.2
Vertices and Edges
1.5.3.2.1
Blueprints Extension
1.5.3.2.2
Factory
1.5.3.2.3
Schema
1.5.3.2.4
Class
1.5.3.2.4.1
Property
1.5.3.2.4.2
Partitioned
1.5.3.2.5
Comparison
1.5.3.2.6
Lightweight Edges
1.5.3.2.7
Graph Batch Insert
1.5.3.2.8
Document API
1.5.3.3
Database
1.5.3.3.1
Documents
1.5.3.3.2
Schema
1.5.3.3.3
Classes
1.5.3.3.3.1
Property
1.5.3.3.3.2
Field Part
1.5.3.3.4
Comparison
1.5.3.3.5
Object API
1.5.3.4
Binding
1.5.3.4.1
Traverse
1.5.3.5
Live Query
1.5.3.6
Multi-Threading
1.5.3.7
Transactions
1.5.3.8
Binary Data
1.5.3.9
Web Apps
1.5.3.10
JDBC Driver
1.5.3.11
JPA
1.5.3.12
JMX
1.5.4
Gremlin API
1.5.5
Javascript
1.5.6
Javascript API
1.5.6.1
OrientJS (Node.js)
1.5.7
Server API
1.5.7.1
Database API
1.5.7.2
Record API
1.5.7.3
Class API
1.5.7.4
Class
1.5.7.4.1
Property
1.5.7.4.2
Records
1.5.7.4.3
Index API
1.5.7.5
Function API
1.5.7.6
Queries
1.5.7.7
create()
1.5.7.7.1
delete()
1.5.7.7.2
fetch()
1.5.7.7.3
insert()
1.5.7.7.4
liveQuery()
1.5.7.7.5
select()
1.5.7.7.6
transform()
1.5.7.7.7
traverse()
1.5.7.7.8
update()
1.5.7.7.9
Transactions
1.5.7.8
Events
1.5.7.9
PyOrient
1.5.8
Client
1.5.8.1
command()
1.5.8.1.1
batch()
1.5.8.1.2
data_cluster_add()
1.5.8.1.3
data_cluster_count()
1.5.8.1.4
data_cluster_data_range()
1.5.8.1.5
data_cluster_drop()
1.5.8.1.6
db_count_records()
1.5.8.1.7
db_create()
1.5.8.1.8
db_drop()
1.5.8.1.9
db_exists()
1.5.8.1.10
db_list()
1.5.8.1.11
db_open()
1.5.8.1.12
db_reload()
1.5.8.1.13
db_size()
1.5.8.1.14
get_session_token()
1.5.8.1.15
query()
1.5.8.1.16
query_async()
1.5.8.1.17
record_create()
1.5.8.1.18
record_delete()
1.5.8.1.19
record_load()
1.5.8.1.20
record_update()
1.5.8.1.21
set_session_token()
1.5.8.1.22
tx_commit()
1.5.8.1.23
attach()
1.5.8.1.23.1
begin()
1.5.8.1.23.2
commit()
1.5.8.1.23.3
rollback()
1.5.8.1.23.4
OGM
1.5.8.2
Connection
1.5.8.2.1
Schemas
1.5.8.2.2
Brokers
1.5.8.2.3
Batch
1.5.8.2.4
Scripts
1.5.8.2.5
C#/.NET
1.5.9
Server
1.5.9.1
ConfigGet()
1.5.9.1.1
ConfigList()
1.5.9.1.2
ConfigSet()
1.5.9.1.3
CreateDatabase()
1.5.9.1.4
DatabaseExists()
1.5.9.1.5
Databases()
1.5.9.1.6
DropDatabase()
1.5.9.1.7
Database
1.5.9.2
Clusters()
1.5.9.2.1
Command()
1.5.9.2.2
GetClusterIdFor()
1.5.9.2.3
GetClusterNameFor()
1.5.9.2.4
GetClusters()
1.5.9.2.5
Gremlin()
1.5.9.2.6
Insert()
1.5.9.2.7
JavaScript()
1.5.9.2.8
Query()
1.5.9.2.9
Select()
1.5.9.2.10
SqlBatch()
1.5.9.2.11
Update()
1.5.9.2.12
Query
1.5.9.3
Conditionals
1.5.9.3.1
Limiters
1.5.9.3.2
Sort
1.5.9.3.3
Transaction
1.5.9.4
Add()
1.5.9.4.1
AddEdge()
1.5.9.4.2
AddOrUpdate()
1.5.9.4.3
Delete()
1.5.9.4.4
GetPendingObject()
1.5.9.4.5
Update()
1.5.9.4.6
PHP
1.5.10
Client
1.5.10.1
Server
1.5.10.2
dbCreate()
1.5.10.2.1
dbDrop()
1.5.10.2.2
dbExists()
1.5.10.2.3
dbList()
1.5.10.2.4
Database
1.5.10.3
command()
1.5.10.3.1
dataClusterAdd()
1.5.10.3.2
dataClusterCount()
1.5.10.3.3
dataClusterDrop()
1.5.10.3.4
dataClusterDataRange()
1.5.10.3.5
dbCountRecords()
1.5.10.3.6
dbReload()
1.5.10.3.7
dbSize()
1.5.10.3.8
query()
1.5.10.3.9
queryAsync()
1.5.10.3.10
recordCreate()
1.5.10.3.11
recordLoad()
1.5.10.3.12
recordUpdate()
1.5.10.3.13
sqlBatch()
1.5.10.3.14
ClusterMap
1.5.10.4
dropClusterID()
1.5.10.4.1
getClusterID()
1.5.10.4.2
getIdList()
1.5.10.4.3
Record
1.5.10.5
getOClass()
1.5.10.5.1
getOData()
1.5.10.5.2
getRid()
1.5.10.5.3
jsonSerialize()
1.5.10.5.4
recordSerialize()
1.5.10.5.5
setOClass()
1.5.10.5.6
setOData()
1.5.10.5.7
setRid()
1.5.10.5.8
ID
1.5.10.6
Transaction
1.5.10.7
attach()
1.5.10.7.1
begin()
1.5.10.7.2
commit()
1.5.10.7.3
rollback()
1.5.10.7.4
Elixir
1.5.11
Server
1.5.11.1
create_db()
1.5.11.1.1
db_exists?()
1.5.11.1.2
distrib-config()
1.5.11.1.3
drop_db()
1.5.11.1.4
Database
1.5.11.2
command()
1.5.11.2.1
create_record()
1.5.11.2.2
db_countrecords()
1.5.11.2.3
db_reload()
1.5.11.2.4
db_size()
1.5.11.2.5
delete_record()
1.5.11.2.6
live_query()
1.5.11.2.7
live_query_unsubscribe()
1.5.11.2.8
load_record()
1.5.11.2.9
script()
1.5.11.2.10
update_record()
1.5.11.2.11
Types
1.5.11.3
Structs
1.5.11.4
BinaryRecord
1.5.11.4.1
Date
1.5.11.4.2
DateTime
1.5.11.4.3
Document
1.5.11.4.4
RID
1.5.11.4.5
Scala API
1.5.12
HTTP API
1.5.13
Binary Protocol
1.5.14
CSV Serialization
1.5.14.1
Schemaless Serialization
1.5.14.2
Commands
1.5.14.3
SQL Reference
1.6
Commands
1.6.1
Alter Class
1.6.1.1
Alter Cluster
1.6.1.2
Alter Database
1.6.1.3
Alter Property
1.6.1.4
Alter Sequence
1.6.1.5
Create Class
1.6.1.6
Create Cluster
1.6.1.7
Create Edge
1.6.1.8
Create Function
1.6.1.9
Create Index
1.6.1.10
Create Link
1.6.1.11
Create Property
1.6.1.12
Create Sequence
1.6.1.13
Create User
1.6.1.14
Create Vertex
1.6.1.15
Delete
1.6.1.16
Delete Edge
1.6.1.17
Delete Vertex
1.6.1.18
Drop Class
1.6.1.19
Drop Cluster
1.6.1.20
Drop Index
1.6.1.21
Drop Property
1.6.1.22
Drop Sequence
1.6.1.23
Drop User
1.6.1.24
Explain
1.6.1.25
Find References
1.6.1.26
Grant
1.6.1.27
HA Remove Server
1.6.1.28
HA Status
1.6.1.29
HA Sync Cluster
1.6.1.30
HA Sync Database
1.6.1.31
HA Set
1.6.1.32
Insert
1.6.1.33
Live Select
1.6.1.34
Live Unsubscribe
1.6.1.35
Match
1.6.1.36
Move Vertex
1.6.1.37
Optimize Database
1.6.1.38
Rebuild Index
1.6.1.39
Revoke
1.6.1.40
Select
1.6.1.41
Traverse
1.6.1.42
Truncate Class
1.6.1.43
Truncate Cluster
1.6.1.44
Truncate Record
1.6.1.45
Update
1.6.1.46
Update Edge
1.6.1.47
Filtering
1.6.2
Functions
1.6.3
Methods
1.6.4
Batch
1.6.5
Pagination
1.6.6
Sequences and auto increment
1.6.7
Pivoting with Query
1.6.8
Command Cache
1.6.9
Query Optimization
1.6.10
Indexing
1.7
SB-Tree
1.7.1
Hash
1.7.2
Auto-Sharding
1.7.3
Full Text
1.7.4
Lucene Full Text
1.7.5
Lucene Spatial Index
1.7.6
Scaling
1.8
Lifecycle
1.8.1
Configuration
1.8.2
Server Manager
1.8.2.1
Runtime Configuration
1.8.2.2
Replication
1.8.3
Sharding
1.8.4
Data Centers
1.8.5
Tuning
1.8.6
HA SQL Commands
1.8.7
HA Remove Server
1.8.7.1
HA Status
1.8.7.2
HA Sync Cluster
1.8.7.3
HA Sync Database
1.8.7.4
HA Set
1.8.7.5
Internals
1.9
Storages
1.9.1
Memory storage
1.9.1.1
PLocal storage
1.9.1.2
Engine
1.9.1.2.1
Disk-Cache
1.9.1.2.2
WAL (Journal)
1.9.1.2.3
Local storage (deprecated)
1.9.1.3
Clusters
1.9.2
Limits
1.9.3
RidBag
1.9.4
SQL Syntax
1.9.5
Custom Index Engine
1.9.6
Caching
1.9.7
Transaction
1.9.8
Hooks - Triggers
1.9.9
Dynamic Hooks
1.9.10
Java (Native) Hooks
1.9.10.1
Java Hook Tutorial
1.9.10.2
Server
1.9.11
Embed the Server
1.9.11.1
Web Server
1.9.11.2
System Database
1.9.12
System Users
1.9.13
Implementation
1.9.13.1
Multi Tenant
1.9.14
Plugins
1.9.15
Automatic Backup
1.9.15.1
SysLog
1.9.15.2
Mail
1.9.15.3
JMX
1.9.15.4
Rexster
1.9.15.5
Gephi Graph Render
1.9.15.6
spider-box
1.9.15.7
Script Interpreter Plugin
1.9.15.8
Contribute to OrientDB
1.10
Hackaton
1.10.1
Report an issue
1.10.2
Get in touch
1.10.2.1
More Tutorials
1.10.2.2
Presentations
1.10.3
Roadmap
1.10.3.1
Enterprise Edition
1.11
Auditing
1.11.1
Tutorials
1.12
Tutorial: Importing the Open Beer Database into OrientDB
1.12.1
Tutorial: Importing the movie Database from Neo4j
1.12.2
Tutorial: Importing the northwind Database from Neo4j
1.12.3
Java Hook Tutorial
1.12.4
Release Notes
1.13
Introduction
OrientDB Manual - version 2.2.x
Quick Navigation
Getting Started:
Introduction to OrientDB
Installation
First Steps
Troubleshooting
Enterprise Edition
Main Topics:
Basic Concepts
Supported Data Types
Inheritance
Security
Indexes
ACID Transactions
Functions
Caching Levels
Common Use Cases
Developers:
SQL
Gremlin
HTTP API
Java API
NodeJS
PHP
Python
.NET
Other Drivers
Network Binary Protocol
Javadocs
Operations
Installation
3rd party Plugins
Upgrade
Configuration
Distributed Architecture (replication, sharding and high-availability)
Performance Tuning
ETL to Import any kind of data into OrientDB
Import from Relational DB
Backup and Restore
Export and Import
Quick References
Console
Studio web tool
Workbench (Enterprise Edition)
OrientDB Server
Network-Binary-Protocol
Gephi Graph Analysis Visual tool
Rexster Support and configuration
Continuous integration
Resources
User Group - Have questions, trouble, or problems?
#orientdb IRC channel on freenode
Professional Support
Training - Training and classes.
Events - Follow OrientDB at the next event!
Team - Meet the team behind OrientDB
Contribute - Contribute to the project.
Who is using OrientDB? - Clients using OrientDB in production.
Questions or Need Help?
Check out our Get in Touch page for different ways of getting in touch with us.
PDF
This documentation is also available in PDF format.
Past releases
v1.7.8
v2.0.x
v2.1.x
Welcome to OrientDB - the first Multi-Model Open Source NoSQL DBMS that brings together the power of graphs and the flexibility
of documents into one scalable high-performance operational database.
Every effort has been made to ensure the accuracy of this manual. However, OrientDB, LTD. makes no warranties with respect
to this documentation and disclaims any implied warranties of merchantability and fitness for a particular purpose. The
information in this document is subject to change without notice.
Getting Started
Over the past few years, there has been an explosion of NoSQL database solutions and products. The word "NoSQL" does not
signal a campaign against the SQL language; in fact, OrientDB allows for SQL syntax. NoSQL is probably best described by
the following:
NoSQL, meaning "not only SQL", is a movement encouraging developers and business people to open their minds and consider
new possibilities beyond the classic relational approach to data persistence.
Alternatives to relational database management systems have existed for many years, but they have been relegated primarily to niche use
cases such as telecommunications, medicine, CAD and others. Interest in NoSQL alternatives like OrientDB is increasing dramatically.
Not surprisingly, many of the largest web companies like Google, Amazon, Facebook, Foursquare and Twitter are using NoSQL based
solutions in their production environments.
What motivates companies to leave the comfort of a well established relational database world? It is basically the great need to better
solve today's data problems. Specifically, there are a few key areas:
Performance
Scalability (often huge)
Smaller footprint
Developer productivity and friendliness
Schema flexibility
Most of these areas also happen to be the requirements of modern web applications. A few years ago, developers designed systems that
could handle hundreds of concurrent users. Today it is not uncommon to have a potential target of thousands or millions of users
connected and served at the same time.
Changing technology requirements have been taken into account on the application front by creating frameworks, introducing standards
and leveraging best practices. However, in the database world, the situation has remained more or less the same for over 30 years. From
the 1970s until recently, relational DBMSs have played the dominant role. Programming languages and methodologies have evolved, but
the concept of data persistence and the DBMS have remained unchanged for the most part: it is all still tables, records and joins.
NoSQL Models
NoSQL-based solutions in general provide a powerful, scalable, and flexible way to solve data needs and use cases, which have
previously been managed by relational databases. To summarize the NoSQL options, we'll examine the most common models or
categories:
Key / Value databases: where the data model is reduced to a simple hash table, which consists of key / value pairs. It is often
easily distributed across multiple servers. The most recognized products of this group include Redis, Dynamo, and Riak.
Column-oriented databases: where the data is stored in sections of columns offering more flexibility and easy aggregation.
Facebook's Cassandra, Google's BigTable, and Amazon's SimpleDB are some examples of column-oriented databases.
Document databases: where the data model consists of document collections, in which each individual document can have
multiple fields without necessarily having a defined schema. The best known products of this group are MongoDB and CouchDB.
Graph databases: where the domain model consists of vertices interconnected by edges creating rich graph structures. The best
known products of this group are OrientDB, Neo4j and Titan.
OrientDB is a document-graph database, meaning it has full native graph capabilities coupled with features normally only found
in document databases.
Each of these categories or models has its own peculiarities, strengths and limitations. No single category or model is
better than the others. However, certain types of databases are better at solving specific problems. This leads to the motto of NoSQL:
choose the best tool for your specific use case.
The goal of Orient Technologies in building OrientDB was to create a robust, highly scalable database that can perform optimally in the
widest possible set of use cases. Our product is designed to be a fantastic "go to" solution for practically all of your data persistence
needs. In the following parts of this tutorial, we will look closely at OrientDB, one of the best open-source, multi-model, next
generation NoSQL products on the market today.
Installation
OrientDB is available in two editions:
Community Edition is released as an open source project under the Apache 2 license. This license allows unrestricted free usage
for both open source and commercial projects.
Enterprise Edition is commercial software built on top of the Community Edition. Enterprise is developed by the same team that
developed the OrientDB engine. It serves as an extension of the Community Edition, providing Enterprise features, such as:
Non-Stop Backup and Restore
Scheduled FULL and Incremental Backups
Query Profiler
Distributed Clustering configuration
Metrics Recording
Live Monitoring with configurable Alerts
The Community Edition is available as a binary package for download or as source code on GitHub. The Enterprise Edition license is
included with Support purchases.
Use Docker
If you have Docker installed on your computer, this is the easiest way to run OrientDB. From the command line, type:
$ docker run -d --name orientdb -p 2424:2424 -p 2480:2480 \
    -e ORIENTDB_ROOT_PASSWORD=root orientdb:latest
Replace "root" with the root password you want to use.
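As a rough illustration, the docker command above can be assembled from a script so that the root password is not hard-coded. The sketch below is in Python. The function name docker_run_args and its parameters are assumptions made for this example; the image name, the two port mappings, and the ORIENTDB_ROOT_PASSWORD variable are taken from the command above.

```python
import shlex


def docker_run_args(root_password, name="orientdb", image="orientdb:latest"):
    """Build the argument list for the `docker run` command shown above.

    Hypothetical helper for illustration; it only assembles arguments and
    does not start a container.
    """
    if not root_password:
        raise ValueError("a non-empty root password is required")
    return [
        "docker", "run", "-d",
        "--name", name,
        "-p", "2424:2424",  # binary protocol port
        "-p", "2480:2480",  # HTTP port
        "-e", f"ORIENTDB_ROOT_PASSWORD={root_password}",
        image,
    ]


if __name__ == "__main__":
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(docker_run_args("change-me")))
```

The resulting list can be passed to subprocess.run as-is, which avoids shell-quoting problems when the password contains special characters.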
Use Ansible
If you manage your servers through Ansible, you can use the following role: https://galaxy.ansible.com/migibert/orientdb. It is highly
customizable and allows you to deploy OrientDB as a standalone instance or as multiple clustered instances.
To use it, follow these steps:
Install the role
ansible-galaxy install migibert.orientdb
Create an Ansible inventory
Assuming you have two servers with fixed IPs 192.168.20.5 and 192.168.20.6, both accessed as the ubuntu user:
[orientdb-servers]
192.168.20.5 ansible_ssh_user=ubuntu
192.168.20.6 ansible_ssh_user=ubuntu
Create an Ansible playbook
In this example, we provision a two-node cluster using multicast discovery mode. OrientDB requires Java on each machine, so
make sure a step that installs Java 8 runs on the servers before the role is applied.
- hosts: orientdb-servers
  become: yes
  vars:
    orientdb_version: 2.0.5
    orientdb_enable_distributed: true
    orientdb_distributed:
      hazelcast_network_port: 2434
      hazelcast_group: orientdb
      hazelcast_password: orientdb
      multicast_enabled: True
      multicast_group: 235.1.1.1
      multicast_port: 2434
      tcp_enabled: False
      tcp_members: []
    orientdb_users:
      - name: root
        password: root
  tasks:
    - apt:
        name: openjdk-8-jdk
        state: present
  roles:
    - role: orientdb-role
Run the playbook
ansible-playbook -i inventory playbook.yml
Prerequisites
Both editions of OrientDB run on any operating system that implements the Java Virtual Machine (JVM). Examples of these include:
Linux, all distributions, including ARM (Raspberry Pi, etc.)
Mac OS X
Microsoft Windows, 95/NT and later
Solaris
HP-UX
IBM AIX
OrientDB requires Java, version 1.7 or higher.
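A script can check this minimum before starting the server. The following is a minimal Python sketch that compares a Java version string against 1.7. The function names are hypothetical, and the sketch assumes the version string has already been obtained (for example from the output of java -version). It only checks the lower bound stated above and says nothing about which newer Java releases a given OrientDB version supports.

```python
import re


def parse_java_version(version_string):
    """Return (major, minor) from a Java version string.

    Handles both the legacy "1.x" scheme (e.g. "1.8.0_151") and the
    newer scheme (e.g. "11.0.2"), which is reported here as (11, 0).
    """
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version: {version_string!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor


def meets_minimum(version_string, minimum=(1, 7)):
    """Check a version string against the 1.7 minimum stated above."""
    return parse_java_version(version_string) >= minimum
```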
Note: In OSGi containers, OrientDB uses a ConcurrentLinkedHashMap implementation provided by concurrentlinkedhashmap to
create the LRU based cache. This library actively uses the sun.misc package which is usually not exposed as a system package.
To overcome this limitation you should add property org.osgi.framework.system.packages.extra with value sun.misc to your
list of framework properties.
It may be as simple as passing an argument to the VM starting the platform:
$ java -Dorg.osgi.framework.system.packages.extra=sun.misc
Binary Installation
OrientDB provides a pre-compiled binary package to install the database on your system. Depending on your operating system, this is
a tarred or zipped package that contains all the relevant files you need to run OrientDB. For desktop installations, go to OrientDB
Downloads and select the package that best suits your system.
On server installations, you can use the wget utility:
$ wget http://bit.ly/orientdb-ce-tele-2-2-23 -O orientdb-community-2.2.23.zip
Whether you use your web browser or wget, unzip or extract the downloaded file into a directory convenient for your use (for
example, /opt/orientdb/ on Linux). This creates a directory called orientdb-community-2.2.23 with relevant files and scripts, which
you will need to run OrientDB on your system.
Source Code Installation
In addition to downloading the binary packages, you also have the option of compiling OrientDB from the Community Edition source
code, available on GitHub. This process requires that you install Git and Apache Maven on your system.
To compile OrientDB from source code, clone the Community Edition repository, then run Maven (mvn) in the newly created
directory:
$ git clone https://github.com/orientechnologies/orientdb
$ cd orientdb
$ git checkout develop
$ mvn clean install
It is possible to skip tests:
$ mvn clean install -DskipTests
The develop branch contains code for the next version of OrientDB. Stable versions are tagged on the master branch. Each maintained
version of OrientDB has its own hotfix branch. At the time of writing, the state of the branches is:
develop: work in progress for next 3.0.x release (3.0.x-SNAPSHOT)
2.2.x: hot fix for next 2.2.x stable release (2.2.x-SNAPSHOT)
2.1.x: hot fix for next 2.1.x stable release (2.1.x-SNAPSHOT)
2.0.x: hot fix for next 2.0.x stable release (2.0.x-SNAPSHOT)
last tag on master is 2.2.0
The build process installs all jars in the local Maven repository and creates archives under the distribution module inside the
target directory. At the time of writing, building from branch 2.2.x gave:
$ ls -l distribution/target/
total 199920
1088 26 Jan 09:57 archive-tmp
102 26 Jan 09:57 databases
102 26 Jan 09:57 orientdb-community-2.2.1-SNAPSHOT.dir
48814386 26 Jan 09:57 orientdb-community-2.2.1-SNAPSHOT.tar.gz
53542231 26 Jan 09:58 orientdb-community-2.2.1-SNAPSHOT.zip
$
The directory orientdb-community-2.2.1-SNAPSHOT.dir contains the uncompressed OrientDB distribution. Take a look at Contribute to
OrientDB if you want to be involved.
Update Permissions
For Linux, Mac OS X and UNIX-based operating systems, you need to change the permissions on some of the files after compiling from
source.
$ chmod 755 bin/*.sh
$ chmod -R 777 config
These commands update the execute permissions on files in the config/ directory and shell scripts in bin/, ensuring that you can
run the scripts or programs that you've compiled.
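If the permission update is part of a larger build script, the same two chmod calls can be expressed in Python. This is only a sketch under the assumption that the script receives the OrientDB directory as an argument; the function name update_permissions is hypothetical, and the modes are the ones used in the commands above.

```python
from pathlib import Path


def update_permissions(orientdb_home):
    """Mirror `chmod 755 bin/*.sh` and `chmod -R 777 config`."""
    home = Path(orientdb_home)
    # Shell scripts directly under bin/ become rwxr-xr-x.
    for script in (home / "bin").glob("*.sh"):
        script.chmod(0o755)
    # The config directory and everything below it become rwxrwxrwx.
    config = home / "config"
    config.chmod(0o777)
    for path in config.rglob("*"):
        path.chmod(0o777)
```

Mode 777 is what the original commands use; on shared machines a tighter mode may be preferable for the configuration directory, since it holds the root password.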
Post-installation Tasks
For desktop users installing the binary, OrientDB is now installed and can be run through shell scripts found in the bin
directory of the installation. For servers, there are some additional steps that you need to take in order to manage the database server for
OrientDB as a service. The procedure for this varies, depending on your operating system.
Install as Service on Unix, Linux and Mac OS X
Install as Service on Microsoft Windows
Upgrading
When the time comes to upgrade to a newer version of OrientDB, the methods vary depending on how you chose to install it in the first
place. If you installed from binary downloads, repeat the download process above and update any symbolic links or shortcuts to point
to the new directory.
For systems where OrientDB was built from source, pull down the latest source code and compile from source.
$ git pull origin master
$ mvn clean install
Bear in mind that when you build from source, you can switch branches to build different versions of OrientDB using Git. For example,
$ git checkout 2.2.x
$ mvn clean install
builds the 2.2.x branch instead of master.
Building a single executable jar with OrientDB
OrientDB uses the Java Service Provider Interface (SPI) for internal components such as engines, operators, and factories. This means
that the OrientDB jars ship with files in META-INF/services that list the implementations of those components. Bear in mind that when
building a single executable jar, you have to concatenate the contents of files with the same name across the different orientdb-*.jar
files. If you are using the Maven Shade Plugin, you can use the Service Resource Transformer to do that.
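To make the concatenation step concrete, here is a minimal Python sketch that merges same-named META-INF/services entries from several jars. It illustrates the idea only and is not how the Maven Shade Plugin is implemented; the function name merge_service_files is hypothetical, and de-duplicating providers is an assumption of this sketch.

```python
import zipfile
from collections import defaultdict

SERVICES_PREFIX = "META-INF/services/"


def merge_service_files(jar_paths):
    """Concatenate same-named META-INF/services entries across jars.

    Returns a dict mapping each service file name to the merged list of
    provider class names, in first-seen order and without duplicates.
    """
    merged = defaultdict(list)
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            for entry in jar.namelist():
                if not entry.startswith(SERVICES_PREFIX) or entry.endswith("/"):
                    continue
                text = jar.read(entry).decode("utf-8")
                for line in text.splitlines():
                    provider = line.split("#", 1)[0].strip()  # drop comments
                    if provider and provider not in merged[entry]:
                        merged[entry].append(provider)
    return dict(merged)
```

Writing each merged list back as one service file in the combined jar gives the Java ServiceLoader a single entry per interface that names every implementation.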
Other Resources
To learn more about how to install OrientDB on specific environments, please refer to the guides below:
Install with Docker
Install with Ansible
Install on Linux Ubuntu
Install on JBoss AS
Install on GlassFish
Install on Ubuntu 12.04 VPS (DigitalOcean)
Install on Vagrant
Run the server
Running the OrientDB Server
When you finish installing OrientDB, whether you build it from source or download the binary package, you are ready to launch the
database server. You can either start it through the system daemon or through the provided server script. This article only covers the
latter.
Note: If you would like to run OrientDB as a service on your system, there are some additional steps that you need to take. This
provides alternate methods for starting the server and allows you to launch it as a daemon when your system boots. For more
information on this process see:
Install OrientDB as a Service on Unix, Linux and Mac OS X
Install OrientDB as a Service on Microsoft Windows
Starting the Database Server
While you can run the database server as a system daemon, you also have the option of starting it directly. In the OrientDB installation
directory (that is, $ORIENTDB_HOME), under bin, there is a file named server.sh on Unix-based systems and server.bat on
Windows. Executing this file starts the server.
To launch the OrientDB database server, run the following commands:
$ cd $ORIENTDB_HOME/bin
$ ./server.sh
[OrientDB ASCII-art masthead, labeled S E R V E R]
2012-12-28 01:25:46:319 INFO Loading configuration from: config/orientdb-server-config.xml... [OServerConfigurationLoaderXml]
2012-12-28 01:25:46:625 INFO OrientDB Server v1.6 is starting up... [OServer]
2012-12-28 01:25:47:142 INFO -> Loaded memory database 'temp' [OServer]
2012-12-28 01:25:47:289 INFO Listening binary connections on 0.0.0.0:2424 [OServerNetworkListener]
2012-12-28 01:25:47:290 INFO Listening http connections on 0.0.0.0:2480 [OServerNetworkListener]
2012-12-28 01:25:47:317 INFO OrientDB Server v1.6 is active. [OServer]
The database server is now running. It is accessible on your system through ports 2424 and 2480. At the first startup the server will
ask for the root user password. The password is stored in the config file.
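One quick way to confirm that the server is reachable is to try a TCP connection to each of the two ports. The Python sketch below does this with the standard socket module; the function names are hypothetical, and a successful connection only shows that something is listening on the port, not that it is OrientDB.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_orientdb_ports(host="localhost"):
    """Probe the binary (2424) and HTTP (2480) ports named above."""
    return {port: port_is_open(host, port) for port in (2424, 2480)}
```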
Stop the Server
On the console where the server is running, a simple CTRL+C will shut down the server.
The shutdown.sh (shutdown.bat) script can also be used to stop the server:
$ cd $ORIENTDB_HOME/bin
$ ./shutdown.sh -p ROOT_PASSWORD
On *nix systems a simple call to shutdown.sh will stop the server running on localhost:
$ cd $ORIENTDB_HOME/bin
$ ./shutdown.sh
It is possible to stop servers running on remote hosts or even on different ports on localhost:
$ cd $ORIENTDB_HOME/bin
$ ./shutdown.sh -h odb1.mydomain.com -P 2424-2430 -u root -p ROOT_PASSWORD
List of params
-h | --host HOSTNAME or IP ADDRESS : the host or IP where OrientDB is running; defaults to localhost
-P | --ports PORT or PORT RANGE : single port value or range of ports; defaults to 2424-2430
-u | --user ROOT USERNAME : root's username; defaults to root
-p | --password ROOT PASSWORD : root's user password; mandatory
NOTE: On Windows systems the password is always mandatory because the script isn't able to discover the PID of the OrientDB
process.
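When the shutdown is triggered from an automation script, the options above can be assembled programmatically. The Python sketch below builds the argument list for shutdown.sh from the documented short options and their defaults. The function name and the port-format check are assumptions of this example, and the password is left optional here only to mirror the plain *nix call shown earlier.

```python
import re

# A single port ("2424") or a range ("2424-2430").
PORT_SPEC = re.compile(r"^\d+(-\d+)?$")


def build_shutdown_command(password=None, host="localhost",
                           ports="2424-2430", user="root",
                           script="./shutdown.sh"):
    """Assemble arguments for shutdown.sh from the options listed above."""
    if not PORT_SPEC.match(ports):
        raise ValueError(f"expected PORT or PORT-PORT, got {ports!r}")
    command = [script, "-h", host, "-P", ports, "-u", user]
    if password is not None:
        command += ["-p", password]
    return command
```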
Server Log Messages
Following the masthead, the database server begins to print log messages to standard output. This provides you with a guide to what
OrientDB does as it starts up on your system.
1. The database server loads its configuration file from the file $ORIENTDB_HOME/config/orientdb-server-config.xml.
For more information on this step, see OrientDB Server.
2. The database server loads the temp database into memory. You can use this database for storing temporary data.
3. The database server begins listening for binary connections on port 2424 for all configured networks (0.0.0.0).
4. The database server begins listening for HTTP connections on port 2480 for all configured networks (0.0.0.0).
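For monitoring scripts, it can help to split each startup message into its parts. The Python sketch below parses the layout seen in the sample output (timestamp, level, message, and a bracketed component name). The pattern is inferred from those sample lines only, so treat it as an assumption rather than a specification of the log format; the names LogEntry and parse_log_line are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    message: str
    component: Optional[str]


# Layout inferred from the sample lines: "<date> <time:ms> <LEVEL> <text> [<Component>]"
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) (\w+)\s+(.*?)(?: \[(\w+)\])?$"
)


def parse_log_line(line):
    """Return a LogEntry for a server log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    stamp, level, message, component = match.groups()
    timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S:%f")
    return LogEntry(timestamp, level, message, component)
```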
Accessing the Database Server
By default, OrientDB listens on two different ports for external connections.
Binary: OrientDB listens on port 2424 for binary connections from the console and for clients and drivers that support the
Network Binary Protocol.
HTTP: OrientDB listens on port 2480 for HTTP connections from OrientDB Studio Web Tool and clients and drivers that
support the HTTP/REST protocol, or similar tools, such as cURL.
If you would like the database server to listen on different ports or IP addresses, you can define these values in the configuration file
config/orientdb-server-config.xml.
Run the console
Running the OrientDB Console
Once the server is running, there are various methods you can use to connect to your database server and to individual databases. Two
such methods are the Network Binary and HTTP/REST protocols. In addition to these OrientDB provides a command-line interface for
connecting to and working with the database server.
Starting the OrientDB Console
In the OrientDB installation directory (that is, $ORIENTDB_HOME, where you installed the database) under bin, there is a file called
console.sh for Unix-based systems or console.bat for Windows users.
To launch the OrientDB console, run the following command after you start the database server:
$ cd $ORIENTDB_HOME/bin
$ ./console.sh
OrientDB console v.X.X.X (build 0) www.orientdb.com
Type 'HELP' to display all the commands supported.
Installing extensions for GREMLIN language v.X.X.X
orientdb>
The OrientDB console is now running. From this prompt you can connect to and manage any remote or local databases available to you.
Using the HELP Command
In the event that you are unfamiliar with OrientDB and the available commands, or if you need help at any time, you can use the HELP
command, or type ? into the console prompt.
orientdb> HELP
AVAILABLE COMMANDS:
* alter class   Alter a class in the database schema
* alter cluster Alter class in the database schema
...
...
* help          Print this help
* exit          Close the console
For each console command available to you, HELP documents its basic use and what it does. If you know the particular command and
need details on its use, you can provide arguments to HELP for further clarification.
orientdb> HELP SELECT
COMMAND: SELECT
- Execute a query against the database and display the results.
SYNTAX: select
WHERE:
- : The query to execute
Connecting to Server Instances
There are some console commands, such as LIST DATABASES or CREATE DATABASE, which you can only run while connected to a server
instance. For other commands, however, you must also connect to a database before they run without error.
Before you can connect to a fresh server instance and fully control it, you need to know the root password for the database. The
root password is located in the configuration file at config/orientdb-server-config.xml. You can find it by searching for the
element. If you want to change it, edit the configuration file and restart the server.
...
...
With the required credentials, you can connect to the database server instance on your system, or establish a remote connection to one
running on a different machine.
orientdb> CONNECT remote:localhost root my_root_password
Connecting to remote Server instance [remote:localhost] with user 'root'...OK
Once you have established a connection to the database server, you can begin to execute commands on that server, such as
LIST DATABASES and CREATE DATABASE.
orientdb> LIST DATABASES
Found 1 databases:
* GratefulDeadConcerts (plocal)
To connect to this database or to a different one, use the CONNECT command from the console and specify the server URL, username,
and password. By default, each database has an admin user with a password of admin.
Warning: Always change the default password on production databases.
The above LIST DATABASES command shows a GratefulDeadConcerts database installed on the local server. To connect to this database,
run the following command:
orientdb> CONNECT remote:localhost/GratefulDeadConcerts admin admin
Connecting to database [remote:localhost/GratefulDeadConcerts] with user 'admin'...OK
The CONNECT command takes a specific syntax for its URL. That is, remote:localhost/GratefulDeadConcerts in the example. It has
three parts:
Protocol: The first part of the database address is the protocol the console should use in the connection. In the example, this is
remote, indicating that it should use the TCP/IP protocol.
Address: The second part of the database address is the hostname or IP address of the database server that you want the console to
connect to. In the example, this is localhost, since the connection is made to a server instance running on the local machine.
Database: The third part of the address is the name of the database that you want to use. In the case of the example, this is
GratefulDeadConcerts.
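The three parts of the URL can be illustrated with a short Python sketch. This is not part of OrientDB or any of its drivers: the helper name parse_connection_url and the ConnectionUrl type are hypothetical, and the sketch only covers the protocol:address/database form shown in the example above.

```python
from typing import NamedTuple, Optional


class ConnectionUrl(NamedTuple):
    protocol: str            # e.g. "remote"
    address: str             # hostname or IP address of the server
    database: Optional[str]  # None when connecting to the server only


def parse_connection_url(url: str) -> ConnectionUrl:
    """Split 'protocol:address[/database]' into its parts (illustrative helper)."""
    protocol, sep, rest = url.partition(":")
    if not sep or not protocol or not rest:
        raise ValueError(f"expected 'protocol:address[/database]', got {url!r}")
    address, _, database = rest.partition("/")
    return ConnectionUrl(protocol, address, database or None)


print(parse_connection_url("remote:localhost/GratefulDeadConcerts"))
print(parse_connection_url("remote:localhost"))
```

The second call mirrors the server-level CONNECT remote:localhost form, where no database name is given.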
For more detailed information about the commands, see Console Commands.
Note: The OrientDB distribution comes with the bundled database GratefulDeadConcerts, which represents the Graph of the
Grateful Dead's concerts. This database can be used by anyone to start exploring the features and characteristics of OrientDB.
Run the Studio
Run the Studio
If you're more comfortable interacting with database systems through a graphical interface then you can accomplish the most common
database tasks with OrientDB Studio, the web interface.
Connecting to Studio
By default, there are no additional steps that you need to take to start OrientDB Studio. When you launch the Server, whether through
the start-up script server.sh or as a system daemon, the Studio web interface opens automatically with it.
$ firefox http://localhost:2480
From here you can create a new database, connect to or drop an existing database, import a public database and navigate to the Server
management interface.
For more information on the OrientDB Studio, see Studio.
Classes
Classes
Multi-model support in the OrientDB engine provides a number of ways of approaching and understanding its basic concepts. These
concepts are clearest when viewed from the perspective of the Document Database API. Like many database management systems,
OrientDB uses the Record as an element of storage. There are many types of records, but with the Document Database API, records
always use the Document type. Documents are formed by a set of key/value pairs, referred to as fields and properties, and can belong to
a class.
The Class is a concept drawn from the Object-oriented programming paradigm. It is a type of data model that allows you to define
certain rules for records that belong to it. In the traditional Document database model, it is comparable to the collection, while in the
Relational database model it is comparable to the table.
For more information on classes in general, see Wikipedia.
To list all the configured classes on your system, use the CLASSES command in the console:
orientdb> CLASSES
CLASSES:
-------------------+------------+----------+-----------+
 NAME              | SUPERCLASS | CLUSTERS | RECORDS   |
-------------------+------------+----------+-----------+
 AbstractPerson    |            | -1       |         0 |
 Account           |            | 11       |      1126 |
 Actor             |            | 91       |         3 |
 Address           |            | 19       |       166 |
 Animal            |            | 17       |         0 |
 ....              | ....       | ....     |      .... |
 Whiz              |            | 14       |      1001 |
-------------------+------------+----------+-----------+
 TOTAL                                           22775 |
-------------------------------------------------------+
Working with Classes
In order to start using classes with your own applications, you need to understand how to create and configure a class for use. The class
in OrientDB is similar to the table in relational databases, but unlike tables, classes can be schema-less, schema-full or mixed. A class can
inherit properties from other classes, thereby creating trees of classes (through the super-class relationship).
Each class has its own cluster or clusters (created by default, if none are defined). For now, it is enough to know that a cluster is a place
where a group of records is stored. We'll soon see how clustering improves the performance of querying the database.
For more information on classes in OrientDB, see Class.
To create a new class, use the CREATE CLASS command:
orientdb> CREATE CLASS Student
Class created successfully. Total classes in database now: 92
This creates a class called Student. Given that no cluster was defined in the CREATE CLASS command, OrientDB creates a default
cluster called student, to contain records assigned to this class. For the moment, the class has no records or properties tied to it. It is
now displayed in the CLASSES listings.
Adding Properties to a Class
As mentioned above, OrientDB does allow you to work in a schema-less mode. That is, it allows you to create classes without defining
their properties. However, in the event that you would like to define indexes or constraints for your class, properties are mandatory.
Following the comparison to relational databases, if classes in OrientDB are similar to tables, properties are the columns on those tables.
To create new properties on Student, use the CREATE PROPERTY command in the console:
orientdb> CREATE PROPERTY Student.name STRING
Property created successfully with id=1
orientdb> CREATE PROPERTY Student.surname STRING
Property created successfully with id=2
orientdb> CREATE PROPERTY Student.birthDate DATE
Property created successfully with id=3
These commands create three new properties on the Student class to provide you with areas to define the individual student's name,
surname and date of birth.
Displaying Class Information
On occasion, you may need to reference a particular class to see what clusters it belongs to and any properties configured for its use.
Using the INFO CLASS command, you can display information on the current configuration and properties of a class.
To display information on the class Student, use the INFO CLASS command:
orientdb> INFO CLASS Student
Class................: Student
Default cluster......: student (id=96)
Supported cluster ids: [96]
Properties:
-----------+--------+--------------+-----------+----------+----------+-----+-----+
 NAME      | TYPE   | LINKED TYPE/ | MANDATORY | READONLY | NOT NULL | MIN | MAX |
           |        | CLASS        |           |          |          |     |     |
-----------+--------+--------------+-----------+----------+----------+-----+-----+
 birthDate | DATE   | null         | false     | false    | false    |     |     |
 name      | STRING | null         | false     | false    | false    |     |     |
 surname   | STRING | null         | false     | false    | false    |     |     |
-----------+--------+--------------+-----------+----------+----------+-----+-----+
Adding Constraints to Properties
Constraints create limits on the data values assigned to properties. For instance, a constraint can set the type, the minimum or
maximum size, whether or not a value is mandatory, or whether null values are permitted for the property.
To add a constraint, use the ALTER PROPERTY command:
orientdb> ALTER PROPERTY Student.name MIN 3
Property updated successfully
This command adds a constraint to Student on the name property. It sets it so that any value given to this class and property must
have a minimum of three characters.
Viewing Records in a Class
Classes contain and define records in OrientDB. You can view all records that belong to a class using the BROWSE CLASS command and
data belonging to a particular record with the DISPLAY RECORD command.
In the above examples, you created a Student class and defined the schema for records that belong to that class, but you did not create
these records or add any data. As a result, running these commands on the Student class returns no results. Instead, for the examples
below, consider the OUser class.
orientdb> INFO CLASS OUser
CLASS 'OUser'
Super classes........: [OIdentity]
Default cluster......: ouser (id=5)
Supported cluster ids: [5]
Cluster selection....: round-robin
Oversize.............: 0.0
PROPERTIES
----------+---------+--------------+-----------+----------+----------+-----+-----+
 NAME     | TYPE    | LINKED TYPE/ | MANDATORY | READONLY | NOT NULL | MIN | MAX |
          |         | CLASS        |           |          |          |     |     |
----------+---------+--------------+-----------+----------+----------+-----+-----+
 password | STRING  | null         | true      | false    | true     |     |     |
 roles    | LINKSET | ORole        | false     | false    | false    |     |     |
 name     | STRING  | null         | true      | false    | true     |     |     |
 status   | STRING  | null         | true      | false    | true     |     |     |
----------+---------+--------------+-----------+----------+----------+-----+-----+
INDEXES (1 altogether)
-------------------------------+----------------+
 NAME                          | PROPERTIES     |
-------------------------------+----------------+
 OUser.name                    | name           |
-------------------------------+----------------+
OrientDB ships with a number of default classes, which it uses in configuration and in managing data on your system (the classes with
the O prefix shown in the CLASSES command output). The OUser class defines the users on your database.
To see records assigned to the OUser class, run the BROWSE CLASS command:
orientdb> BROWSE CLASS OUser
---+------+-------+--------+-----------------------------------+--------+-------+
 # | @RID | @Class| name   | password                          | status | roles |
---+------+-------+--------+-----------------------------------+--------+-------+
 0 | #5:0 | OUser | admin  | {SHA-256}8C6976E5B5410415BDE90... | ACTIVE | [1]   |
 1 | #5:1 | OUser | reader | {SHA-256}3D0941964AA3EBDCB00EF... | ACTIVE | [1]   |
 2 | #5:2 | OUser | writer | {SHA-256}B93006774CBDD4B299389... | ACTIVE | [1]   |
---+------+-------+--------+-----------------------------------+--------+-------+
In the example, you are listing all of the users of the database. While this is fine for your initial setup and as an
example, it is not particularly secure. To further improve security in production environments, see Security.
When you run BROWSE CLASS, the first column in the output provides the identifier number, which you can use to display detailed
information on that particular record.
To show the first record browsed from the OUser class, run the DISPLAY RECORD command:
orientdb> DISPLAY RECORD 0
------------------------------------------------------------------------------+
 Document - @class: OUser                      @rid: #5:0      @version: 1    |
----------+-------------------------------------------------------------------+
 Name     | Value                                                             |
----------+-------------------------------------------------------------------+
 name     | admin                                                             |
 password | {SHA-256}8C6976E5B5410415BDE908BD4DEE15DFB167A9C873F8A81F6F2AB... |
 status   | ACTIVE                                                            |
 roles    | [#4:0=#4:0]                                                       |
----------+-------------------------------------------------------------------+
Bear in mind that this command references the last call of BROWSE CLASS. You can continue to display other records, but you cannot
display records from another class until you browse that particular class.
Clusters
Clusters
The Cluster is a place where a group of records are stored. Like the Class, it is comparable with the collection in traditional document
databases, and in relational databases with the table. However, this is a loose comparison given that unlike a table, clusters allow you to
store the data of a class in different physical locations.
To list all the configured clusters on your system, use the CLUSTERS command in the console:
orientdb> CLUSTERS
CLUSTERS:
-------------+------+-----------+-----------+
 NAME        | ID   | TYPE      | RECORDS   |
-------------+------+-----------+-----------+
 account     | 11   | PHYSICAL  |      1107 |
 actor       | 91   | PHYSICAL  |         3 |
 address     | 19   | PHYSICAL  |       166 |
 animal      | 17   | PHYSICAL  |         0 |
 animalrace  | 16   | PHYSICAL  |         2 |
 ....        | .... | ....      |      .... |
-------------+------+-----------+-----------+
 TOTAL                                23481 |
--------------------------------------------+
Understanding Clusters
By default, OrientDB creates one cluster for each Class, and all records of a class are stored in that cluster, which has the same name
as the class. Starting from v2.2, OrientDB automatically creates multiple clusters for each class (the number of clusters created is equal
to the number of CPU cores available on the server) to make better use of parallelism. You can create up to 32,767 (or, 2^15 - 1) clusters
in a database. Understanding the concepts of classes and clusters allows you to take advantage of the power of clusters in designing new
databases.
While the default strategy is that each class maps to one cluster, a class can rely on multiple clusters. For instance, you can spawn
records physically in multiple locations, thereby creating multiple clusters.
Consider, for example, a class Customer that relies on two clusters:
USA_customers, which is a cluster that contains all customers in the United States.
China_customers, which is a cluster that contains all customers in China.
In this deployment, the default cluster is USA_customers. Whenever commands are run on the Customer class, such as INSERT
statements, OrientDB assigns this new data to the default cluster.
The new entry from the INSERT statement is added to the USA_customers cluster, given that it's the default. Inserting data into a
non-default cluster would require that you specify the cluster you want to insert the data into in your statement.
When you run a query on the Customer class, such as a SELECT query, OrientDB scans all clusters associated with the class in looking
for matches.
In the event that you know the cluster in which the data is stored, you can query that cluster directly to avoid scanning all others and
optimize the query.
For example, if you query the China_customers cluster directly, OrientDB scans only that cluster of the Customer class in looking
for matches.
Note: The method OrientDB uses to select the cluster, where it inserts new records, is configurable and extensible. For more
information, see Cluster Selection.
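The default-cluster and scanning behavior described above can be modeled with a small in-memory Python sketch. It is a toy illustration only, not OrientDB code: the ClassModel type and its insert and select methods are hypothetical names, and records are plain dictionaries.

```python
from typing import Callable, Dict, List, Optional


class ClassModel:
    """Toy in-memory model of a class backed by several clusters."""

    def __init__(self, name: str, clusters: List[str], default_cluster: str):
        self.name = name
        self.default_cluster = default_cluster
        self.clusters: Dict[str, List[dict]] = {c: [] for c in clusters}

    def insert(self, record: dict, cluster: Optional[str] = None) -> str:
        """Store the record in the named cluster, or in the default one."""
        target = cluster or self.default_cluster
        self.clusters[target].append(record)
        return target

    def select(self, predicate: Callable[[dict], bool],
               cluster: Optional[str] = None) -> List[dict]:
        """Scan one cluster if given, otherwise every cluster of the class."""
        scanned = [cluster] if cluster else list(self.clusters)
        return [r for c in scanned for r in self.clusters[c] if predicate(r)]


customer = ClassModel("Customer", ["USA_customers", "China_customers"], "USA_customers")
customer.insert({"name": "Jay"})                            # goes to the default cluster
customer.insert({"name": "Li"}, cluster="China_customers")  # explicit non-default cluster
print(customer.select(lambda r: True))                      # scans both clusters
print(customer.select(lambda r: True, cluster="China_customers"))
```

Querying a single cluster touches only that cluster's list, which is the reason the text recommends it when you already know where the data lives.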
Working with Clusters
While running in HA mode, upon the creation of a new record (document, vertex, edge, etc.) the coordinator server automatically assigns
the cluster among the list of local clusters for the current server. For more information look at HA: Cluster Ownership.
You may also find it beneficial to locate different clusters on different servers, physically separating where you store records in your
database. The advantages of this include:
Optimization: Faster query execution against clusters, given that you need only search a subset of the clusters in a class.
Indexes: With good partitioning, you can reduce or remove the use of indexes.
Parallel Queries: Queries can be run in parallel when made to data on multiple disks.
Sharding: You can shard large data-sets across multiple instances.
Adding Clusters
When you create a class, OrientDB creates a default cluster of the same name. In order for you to take advantage of the power of
clusters, you need to create additional clusters on the class. This is done with the ALTER CLASS statement in conjunction with the
ADDCLUSTER parameter.
To add a cluster to the Customer class, use an ALTER CLASS statement in the console:
orientdb> ALTER CLASS Customer ADDCLUSTER UK_Customers
Class updated successfully
You now have a third cluster for the Customer class, covering those customers located in the United Kingdom.
Viewing Records in a Cluster
Clusters store the records contained by a class in OrientDB. You can view all records that belong to a cluster using the
BROWSE CLUSTER command and the data belonging to a particular record with the DISPLAY RECORD command.
In the above example, you added a cluster to a class for storing customer records based on their locations around the world,
but you did not create these records or add any data. As a result, running these commands on the Customer class returns no results.
Instead, for the examples below, consider the ouser cluster.
OrientDB ships with a number of default clusters to store data from its default classes. You can see these using the CLUSTERS
command. Among these, there is the ouser cluster, which stores data of the users on your database.
To see records stored in the ouser cluster, run the BROWSE CLUSTER command:
orientdb> BROWSE CLUSTER OUser
---+------+--------+--------+-----------------------------------+--------+-------+
 # | @RID | @CLASS | name   | password                          | status | roles |
---+------+--------+--------+-----------------------------------+--------+-------+
 0 | #5:0 | OUser  | admin  | {SHA-256}8C6976E5B5410415BDE90... | ACTIVE | [1]   |
 1 | #5:1 | OUser  | reader | {SHA-256}3D0941964AA3EBDCB00CC... | ACTIVE | [1]   |
 2 | #5:2 | OUser  | writer | {SHA-256}B93006774CBDD4B299389... | ACTIVE | [1]   |
---+------+--------+--------+-----------------------------------+--------+-------+
The results are identical to executing BROWSE CLASS on the OUser class, given that there is only one cluster for the OUser class in this
example.
In the example, you are listing all of the users of the database. While this is fine for your initial setup and as an
example, it is not particularly secure. To further improve security in production environments, see Security.
When you run BROWSE CLUSTER, the first column in the output provides the identifier number, which you can use to display detailed
information on that particular record.
To show the first record browsed from the ouser cluster, run the DISPLAY RECORD command:
orientdb> DISPLAY RECORD 0
------------------------------------------------------------------------------+
 Document - @class: OUser                      @rid: #5:0      @version: 1    |
----------+-------------------------------------------------------------------+
 Name     | Value                                                             |
----------+-------------------------------------------------------------------+
 name     | admin                                                             |
 password | {SHA-256}8C6976E5B5410415BDE908BD4DEE15DFB167A9C873F8A81F6F2AB... |
 status   | ACTIVE                                                            |
 roles    | [#4:0=#4:0]                                                       |
----------+-------------------------------------------------------------------+
Bear in mind that this command references the last call of BROWSE CLUSTER. You can continue to display other records, but you cannot
display records from another cluster until you browse that particular cluster.
Record ID
Record ID
In OrientDB, each record has its own self-assigned unique ID within the database called Record ID or RID. It is composed of two parts:
#<cluster-id>:<cluster-position>
That is,
<cluster-id>: The cluster identifier.
<cluster-position>: The position of the data within the cluster.
Each database can have a maximum of 32,767 clusters, or 2^15 - 1. Each cluster can handle up to 9,223,372,036,854,775,808 records, or 2^63,
namely about 9,223,372 trillion records.
The maximum size of a database is 2^78 records, or about 302,231,454,903 trillion records. Due to limitations in hardware resources,
OrientDB has not been tested at such high numbers, but there are users working with OrientDB in the billions of records range.
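The two-part structure of a Record ID can be sketched in Python as follows. The RecordId type and parse_rid helper are hypothetical names for illustration, not an OrientDB API, and the range check is a simplification that only accepts non-negative values up to the cluster limit quoted above.

```python
from typing import NamedTuple

MAX_CLUSTER_ID = 2**15 - 1  # 32,767, the cluster limit quoted in the text


class RecordId(NamedTuple):
    cluster_id: int
    position: int

    def __str__(self) -> str:
        return f"#{self.cluster_id}:{self.position}"


def parse_rid(text: str) -> RecordId:
    """Parse a string such as '#12:4' into (cluster_id, position)."""
    if not text.startswith("#"):
        raise ValueError(f"a Record ID starts with '#': {text!r}")
    cluster_part, sep, position_part = text[1:].partition(":")
    if not sep:
        raise ValueError(f"missing ':' separator: {text!r}")
    cluster_id, position = int(cluster_part), int(position_part)
    # Simplified range check (an assumption of this sketch).
    if not 0 <= cluster_id <= MAX_CLUSTER_ID or position < 0:
        raise ValueError(f"out of range: {text!r}")
    return RecordId(cluster_id, position)


rid = parse_rid("#12:4")
print(rid.cluster_id, rid.position, str(rid))
```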
Loading Records
Each record has a Record ID, which notes the physical position of the record inside the database. What this means is that when you load
a record by its RID, the load is significantly faster than it would be otherwise.
In document and relational databases, the more data that you have, the slower the database responds. OrientDB handles relationships as
physical links to the records. The relationship is assigned only once, when the edge is created, so following it costs O(1). You can
compare this to relational databases, which compute the relationship every time the database is queried, at a cost of O(log N). In
OrientDB, the size of a database does not affect the traverse speed. The speed remains constant, whether for one record or one hundred
billion records. This is a critical feature in the age of Big Data.
To directly load a record, use the LOAD RECORD command in the console.
orientdb> LOAD RECORD #12:4
--------------------------------------------------------
 ODocument - @class: Company   @rid: #12:4   @version: 8
-------------+------------------------------------------
 Name        | Value
-------------+------------------------------------------
 addresses   | [NOT LOADED: #19:159]
 salary      | 0.0
 employees   | 100004
 id          | 4
 name        | Microsoft4
 initialized | false
 salary2     | 0.0
 checkpoint  | true
 created     | Sat Dec 29 23:13:49 CET 2012
-------------+------------------------------------------
The LOAD RECORD command returns some useful information about this record. It shows:
that it is a document. OrientDB supports different types of records, but document is the only type covered in this chapter.
that it belongs to the Company class.
that its current version is 8. OrientDB uses an MVCC system. Every time you update a record, its version increments by one.
that we have different field types: floats in salary and salary2, integers for employees and id, string for name, booleans
for initialized and checkpoint, and date-time for created.
that the field addresses has been NOT LOADED. It is also a LINK to another record, #19:159. This is a relationship. For more
information on this concept, see Relationships.
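The version counter mentioned in the list above can be illustrated with a minimal Python sketch of version checking. This is an assumption-laden illustration, not OrientDB's implementation: the VersionedRecord and VersionConflict names are hypothetical, the starting version of 1 is chosen for the example, and the rule that a stale update is rejected is the usual optimistic pattern rather than something stated in this chapter.

```python
class VersionConflict(Exception):
    """Raised when an update is based on a stale version (hypothetical)."""


class VersionedRecord:
    def __init__(self, fields: dict):
        self.fields = dict(fields)
        self.version = 1  # starting value assumed for this sketch

    def update(self, changes: dict, expected_version: int) -> int:
        """Apply changes only if the caller saw the current version."""
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, record is at v{self.version}"
            )
        self.fields.update(changes)
        self.version += 1  # every successful update increments the version
        return self.version


record = VersionedRecord({"name": "Microsoft4"})
seen = record.version
record.update({"employees": 100004}, expected_version=seen)
print(record.version)
```

A second writer still holding the old value of seen would get a VersionConflict instead of silently overwriting the first update.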
Relationships
Relationships
One of the most important features of Graph databases lies in how they manage relationships. Many users come to OrientDB from
MongoDB due to OrientDB having more efficient support for relationships.
Relations in Relational Databases
Most database developers are familiar with the Relational model of databases and with relational database management systems, such as
MySQL and MS-SQL. Given its more than thirty years of dominance, this has long been thought the best way to handle relationships.
By contrast, Graph databases suggest a more modern approach to this concept.
Consider, as an example, a database where you need to establish relationships between Customer and Address tables.
1-to-1 Relationship
Relational databases store the value of the target record in the address column of the Customer table. This is the Foreign Key. The
foreign key points to the Primary Key of the related record in the Address table.
Consider a case where you want to view the address of a customer named Luca. In a Relational database, like MySQL, this is how you
would query the table:
mysql> SELECT B.location FROM Customer A, Address B
       WHERE A.name='Luca' AND A.address=B.id;
What happens here is a JOIN. That is, the contents of two tables are joined to form the results. The database executes the JOIN every
time you retrieve the relationship.
1-to-Many Relationship
Given that Relational databases have no concept of collections, the Customer table cannot have multiple foreign keys. The only way
to manage a 1-to-Many Relationship in databases of this kind is to move the Foreign Key to the Address table.
For example, consider a case where you want to return all addresses connected to the customer Luca. This is how you would query the
table:
mysql> SELECT B.location FROM Customer A, Address B
       WHERE A.name='Luca' AND B.customer=A.id;
Many-to-Many relationship
The most complicated case is the Many-to-Many relationship. To handle associations of this kind, Relational databases require a
separate, intermediary table that matches rows from both Customer and Address tables in all required combinations. This results in a
double JOIN per record at runtime.
For example, consider a case where you want to return all addresses for the customer Luca. This is how you would query the table:
mysql> SELECT C.location FROM Customer A, CustomerAddress B, Address C
       WHERE A.name='Luca' AND B.id=A.id AND B.address=C.id;
Understanding JOIN
In document and relational database systems, the more data that you have, the slower the database responds and JOIN operations have
a heavy runtime cost.
For relational database systems, the database computes the relationship every time you query the server. That translates to
O(log N / block_size). OrientDB handles relationships as physical links to the records and assigns them only once, when the edge is
created. That is, O(1).
In OrientDB, the speed of traversal is not affected by the size of the database. It is always constant regardless of whether it has one
record or one hundred billion records. This is a critical feature in the age of Big Data.
Searching for an identifier at runtime for every record, each time you execute a query, grows very expensive. The first optimization
with relational databases is the use of indexing. Indexes speed up searches, but they slow down INSERT, UPDATE, and DELETE
operations. Additionally, they occupy a substantial amount of space on the disk and in memory.
Consider also whether searching an index is actually fast.
Indexes and JOIN
In the database industry, there are a number of indexing algorithms available. The most common in both relational and NoSQL database
systems is the B+ Tree.
Balanced trees all work in a similar manner. For example, consider a case where you're looking for an entry with the name Luca: after
only five hops, the record is found.
While this is fine on a small database, consider what would happen if there were millions or billions of records. The database would have
to go through many, many more hops to find Luca. And, the database would execute this operation on every JOIN per record.
Picture: joining four tables with thousands of records. The number of JOIN operations could run in the millions.
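To get a feel for how the hop count grows, the following Python sketch estimates the depth of a balanced tree for a given number of records. It is a back-of-the-envelope model, not a B+ Tree implementation: the estimated_hops function and its fanout parameter (children per node) are assumptions made for illustration.

```python
def estimated_hops(record_count: int, fanout: int) -> int:
    """Smallest number of levels such that fanout**levels >= record_count."""
    if record_count < 1 or fanout < 2:
        raise ValueError("record_count must be >= 1 and fanout >= 2")
    hops, capacity = 1, fanout  # a lookup visits at least one node
    while capacity < record_count:
        capacity *= fanout
        hops += 1
    return hops


# Hops per lookup for a binary tree (fanout 2) at different sizes.
for count in (32, 1_000_000, 1_000_000_000):
    print(count, estimated_hops(count, fanout=2))

# A JOIN repeats the lookup for every row it has to match.
rows_to_join = 10_000
print(rows_to_join * estimated_hops(1_000_000, fanout=2))
```

With a fanout of 2, 32 entries need five hops, which matches the small example in the text. Real indexes use a much larger fanout, so the tree is shallower, but the lookup is still repeated for each joined record.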
Relations in OrientDB
There is no JOIN in OrientDB. Instead, it uses LINK. LINK is a relationship managed by storing the target Record ID in the source
record. It is similar to storing the pointer between two objects in memory.
When you have Invoice linked to Customer, then you have a pointer to Customer inside Invoice as an attribute. They are exactly
the same. In this way, it's as though your database was kept in memory: a memory of several exabytes.
Types of Relationships
In 1-to-N relationships, OrientDB handles the relationship as a collection of Record IDs, as you would when managing objects in
memory.
OrientDB supports several different kinds of relationships:
LINK: Relationship that points to one record only.
LINKSET: Relationship that points to several records. It is similar to Java sets: the same Record ID can only be included once. The
pointers have no order.
LINKLIST: Relationship that points to several records. It is similar to Java lists: they are ordered and can contain duplicates.
LINKMAP: Relationship that points to several records with a key stored in the source record. The Map values are the Record IDs.
It is similar to Java Map<?,Record>.
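The four relationship types map naturally onto in-memory collections. The sketch below uses Python's built-in types as stand-ins for the Java collections named above; the Rid alias and the sample Record IDs and keys are made up for illustration.

```python
from typing import Dict, List, Set

Rid = str  # a Record ID such as "#19:159", kept as a plain string here

# LINK: points to exactly one record.
link: Rid = "#19:159"

# LINKSET: unordered, each Record ID appears at most once.
linkset: Set[Rid] = {"#19:1", "#19:2"}
linkset.add("#19:1")      # adding a duplicate changes nothing

# LINKLIST: ordered, duplicates allowed.
linklist: List[Rid] = ["#19:1", "#19:2"]
linklist.append("#19:1")  # the duplicate is kept, in insertion order

# LINKMAP: keys stored in the source record, values are Record IDs.
linkmap: Dict[str, Rid] = {"home": "#19:1", "office": "#19:2"}

print(len(linkset), len(linklist), linkmap["home"])
```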
Basic SQL
SQL
Most NoSQL products employ a custom query language. In this, OrientDB differs by focusing on standards in query languages. That is,
instead of inventing "Yet Another Query Language," it begins with the widely used and well-understood language of SQL. It then
extends SQL to support more complex graphing concepts, such as Trees and Graphs.
Why SQL? Because SQL is ubiquitous in the database development world. It is familiar and more readable and concise than its
competitors, such as Map Reduce scripts or JSON based querying.
SELECT
The SELECT statement queries the database and returns results that match the given parameters. For instance, earlier in Getting Started,
two queries were presented that gave the same results: BROWSE CLUSTER ouser and BROWSE CLASS OUser. Here is a third option,
available through a SELECT statement.
orientdb> SELECT FROM OUser
Notice that the query has no projections. This means that you do not need to enter a character to indicate that the query should return
the entire record, such as the asterisk in the Relational model (that is, SELECT * FROM OUser).
Additionally, OUser is a class. By default, OrientDB executes queries against classes. Targets can also be:
Clusters: To execute against a cluster, rather than a class, prefix CLUSTER to the target name.
orientdb> SELECT FROM CLUSTER:Ouser
Record ID: To execute against one or more Record IDs, use the identifier(s) as your target. For example:
orientdb> SELECT FROM #10:3
orientdb> SELECT FROM [#10:1, #10:30, #10:5]
Indexes: To execute a query against an index, prefix INDEX to the target name.
orientdb> SELECT VALUE FROM INDEX:dictionary WHERE key='Jay'
WHERE
Much like the standard implementation of SQL, OrientDB supports WHERE conditions to filter the returning records too. For example,
orientdb> SELECT FROM OUser WHERE name LIKE 'l%'
This returns all OUser records where the name begins with l. For more information on supported operators and functions, see
WHERE.
ORDER BY
In addition to WHERE, OrientDB also supports ORDER BY clauses. This allows you to order the results returned by the query according
to one or more fields, in either ascending or descending order.
orientdb> SELECT FROM Employee WHERE city='Rome' ORDER BY surname ASC, name ASC
The example queries the Employee class. It returns a listing of all employees in that class who live in Rome and orders the results by
surname and name, in ascending order.
GROUP BY
In the event that you need results of the query grouped together according to the values of certain fields, you can manage this using the
GROUP BY clause.
orientdb> SELECT SUM(salary) FROM Employee WHERE age < 40 GROUP BY job
In the example, you query the Employee class for the sum of the salaries of all employees under the age of forty, grouped by their job
types.
LIMIT
In the event that your query returns too many results, making it difficult to read or manage, you can use the LIMIT clause to reduce it
to the topmost of the returned values.
orientdb> SELECT FROM Employee WHERE gender='male' LIMIT 20
In the example, you query the Employee class for a list of male employees. Given that there are likely to be a number of these, you
limit the return to the first twenty entries.
SKIP
When using the
LIMIT
clause with queries, you can only view the topmost of the return results. In the event that you would like to
view certain results further down the list, for instance the values from twenty to forty, you can paginate your results using the
keyword in the
LIMIT
SKIP
clause.
orientdb>
SELECT FROM Employee WHERE gender='male' LIMIT 20
orientdb>
SELECT FROM Employee WHERE gender='male' SKIP 20 LIMIT 20
orientdb>
SELECT FROM Employee WHERE gender='male' SKIP 40 LIMIT 20
The first query returns the first twenty results, the second returns the next twenty results, and the third returns results forty-one
through sixty. You can use these queries to manage pages at the application layer.
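As a minimal sketch of managing pages at the application layer, the helper below builds the three queries shown above from a zero-based page number. The function name page_query and its parameters are assumptions made for illustration; it only produces query strings and does not talk to OrientDB.

```python
def page_query(base_query, page, page_size=20):
    """Return base_query with SKIP/LIMIT appended for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    skip = page * page_size
    if skip == 0:
        # The first page needs no SKIP keyword.
        return f"{base_query} LIMIT {page_size}"
    return f"{base_query} SKIP {skip} LIMIT {page_size}"


base = "SELECT FROM Employee WHERE gender='male'"
for page in range(3):
    print(page_query(base, page))
```

Without an ORDER BY clause the order of the results is not guaranteed to be stable between queries, so in practice you would likely add one to the base query before paginating.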
INSERT
The INSERT statement adds new data to a class and cluster. OrientDB supports three forms of syntax used to insert new data into your
database.
The standard ANSI-92 syntax:
orientdb>
INSERT INTO
Employee(name, surname, gender)
VALUES('Jay', 'Miner', 'M')
The simplified ANSI-92 syntax:
orientdb>
INSERT INTO Employee SET name='Jay', surname='Miner', gender='M'
The JSON syntax:
orientdb>
INSERT INTO Employee CONTENT {name : 'Jay', surname : 'Miner',
gender : 'M'}
Each of these queries adds Jay Miner to the Employee class. You can choose whichever syntax works best with your application.
UPDATE
The UPDATE statement changes the values of existing data in a class and cluster. In OrientDB there are two forms of syntax used to
update data on your database.
The standard ANSI-92 syntax:
orientdb>
UPDATE Employee SET local=TRUE WHERE city='London'
The JSON syntax, used with the MERGE keyword, which merges the changes with the current record:
orientdb>
UPDATE Employee MERGE { local : TRUE } WHERE city='London'
Each of these statements updates the Employee class, changing the local property to TRUE when the employee is based in London.
DELETE
The DELETE statement removes existing values from your class and cluster. OrientDB supports the standard ANSI-92 compliant
syntax for these statements:
orientdb>
DELETE FROM Employee WHERE city <> 'London'
Here, entries are removed from the Employee class where the employee in question is not based in London.
See also:
The SQL Reference
The Console Command Reference
Working with Graphs
In graph databases, the database system graphs data into network-like structures consisting of vertices and edges. In the OrientDB
Graph model, the database represents data through the concept of a property graph, which defines a vertex as an entity linked with
other vertices and an edge, as an entity that links two vertices.
OrientDB ships with a generic vertex persistent class, called V, as well as a class for edges, called E. As an example, you can create a
new vertex using the INSERT command with V.
orientdb>
INSERT INTO V SET name='Jay'
Created record with RID #9:0
In effect, the Graph model database works on top of the underlying document model. But, in order to simplify this process, OrientDB
introduces a new set of commands for managing graphs from the console. Instead of INSERT, use CREATE VERTEX:
orientdb>
CREATE VERTEX V SET name='Jay'
Created vertex with RID #9:1
By using the graph commands over the standard SQL syntax, OrientDB ensures that your graphs remain consistent. For more
information on the particular commands, see the following pages:
CREATE VERTEX
DELETE VERTEX
CREATE EDGE
UPDATE EDGE
DELETE EDGE
Use Case: Social Network for Restaurant Patrons
While you have the option of working with vertices and edges in your database as they are, you can also extend the standard
V and E classes to suit the particular needs of your application. The advantages of this approach are:
It grants better understanding about the meaning of these entities.
It allows for optional constraints at the class level.
It improves performance through better partitioning of entities.
It allows for object-oriented inheritance among the graph elements.
For example, consider a social network based on restaurants. You need to start with a class for individual customers and another for the
restaurants they patronize. Create these classes to extend the V class.
orientdb>
CREATE CLASS Person EXTENDS V
orientdb>
CREATE CLASS Restaurant EXTENDS V
Doing this creates the schema for your social network. Now that the schema is ready, populate the graph with data.
orientdb>
CREATE VERTEX Person SET name='Luca'
Created record with RID #11:0
orientdb>
CREATE VERTEX Person SET name='Bill'
Created record with RID #11:1
orientdb>
CREATE VERTEX Person SET name='Jay'
Created record with RID #11:2
orientdb>
CREATE VERTEX Restaurant SET name='Dante', type='Pizza'
Created record with RID #12:0
orientdb>
CREATE VERTEX Restaurant SET name='Charlie', type='French'
Created record with RID #12:1
This adds three vertices to the Person class, representing individual users in the social network. It also adds two vertices to the
Restaurant class, representing the restaurants that they patronize.
Creating Edges
For the moment, these vertices are independent of one another, tied together only by the classes to which they belong. That is, they are
not yet connected by edges. Before you can make these connections, you first need to create a class that extends E.
orientdb>
CREATE CLASS Eat EXTENDS E
This creates the class Eat, which extends the class E. Eat represents the relationship between the vertex Person and the vertex
Restaurant.
When you create the edge from this class, note that the orientation of the vertices is important, because it gives the relationship its
meaning. For instance, creating an edge in the opposite direction, (from Restaurant to Person), would call for a separate class, such
as Attendee.
The user Luca eats at the pizza joint Dante. Create an edge that represents this connection:
orientdb>
CREATE EDGE Eat FROM ( SELECT FROM Person WHERE name='Luca' )
TO ( SELECT FROM Restaurant WHERE name='Dante' )
Creating Edges from Record ID
In the event that you know the Record ID of the vertices, you can connect them directly with a shorter and faster command. For
example, the person Bill also eats at the restaurant Dante and the person Jay eats at the restaurant Charlie. Create edges in the class
Eat to represent these connections.
orientdb>
CREATE EDGE Eat FROM #11:1 TO #12:0
orientdb>
CREATE EDGE Eat FROM #11:2 TO #12:1
Querying Graphs
In the above example you created and populated a small graph of a social network of individual users and the restaurants at which they
eat. You can now begin to experiment with queries on a graph database.
To cross edges, you can use special graph functions, such as:
OUT() To retrieve the adjacent outgoing vertices
IN() To retrieve the adjacent incoming vertices
BOTH() To retrieve the adjacent incoming and outgoing vertices
For example, to know all of the people who eat in the restaurant Dante, which has a Record ID of #12:0, you can access the record for
that restaurant and traverse the incoming edges to discover which entries in the Person class connect to it.
orientdb>
SELECT IN() FROM Restaurant WHERE name='Dante'
-------+----------------+
 @RID  | in             |
-------+----------------+
 #-2:1 | [#11:0, #11:1] |
-------+----------------+
This query displays the Record IDs from the Person class that connect to the restaurant Dante. In cases such as this, you can use the
EXPAND() special function to transform the vertex collection in the result-set by expanding it.
orientdb>
SELECT EXPAND( IN() ) FROM Restaurant WHERE name='Dante'
-------+-------------+-------------+---------+
 @RID  | @CLASS      | Name        | out_Eat |
-------+-------------+-------------+---------+
 #11:0 | Person      | Luca        | #12:0   |
 #11:1 | Person      | Bill        | #12:0   |
-------+-------------+-------------+---------+
Creating Edge to Connect Users
Your application at this point shows connections between individual users and the restaurants they patronize. While this is interesting,
it does not yet function as a social network. To do so, you need to establish edges that connect the users to one another.
To begin, as before, create a new class that extends E:
orientdb>
CREATE CLASS Friend EXTENDS E
The users Luca and Jay are friends. They have Record IDs of #11:0 and #11:2. Create an edge that connects them.
orientdb>
CREATE EDGE Friend FROM #11:0 TO #11:2
In the Friend relationship, orientation is not important. That is, if Luca is a friend of Jay's then Jay is a friend of Luca's. Therefore,
you should use the BOTH() function.
orientdb>
SELECT EXPAND( BOTH( 'Friend' ) ) FROM Person WHERE name = 'Luca'
-------+-------------+-------------+---------+-----------+
 @RID  | @CLASS      | Name        | out_Eat | in_Friend |
-------+-------------+-------------+---------+-----------+
 #11:2 | Person      | Jay         | #12:1   | #11:0     |
-------+-------------+-------------+---------+-----------+
Here, the BOTH() function takes the edge class Friend as an argument, crossing only relationships of the Friend kind, (that is, it skips
the Eat class, at this time). Note in the result-set that the relationship with Luca, with a Record ID of #11:0, appears in the
in_Friend field.
You can also now view all the restaurants patronized by friends of Luca.
orientdb>
SELECT EXPAND( BOTH('Friend').out('Eat') ) FROM Person
WHERE name='Luca'
-------+-------------+-------------+-------------+--------+
 @RID  | @CLASS      | Name        | Type        | in_Eat |
-------+-------------+-------------+-------------+--------+
 #12:1 | Restaurant  | Charlie     | French      | #11:2  |
-------+-------------+-------------+-------------+--------+
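To see why these traversals return what they do, it can help to model the example graph in a few lines of Python. The sketch below is an in-memory illustration only: the edge list mirrors the edges created in this section, while the function names out, in_ and both are stand-ins for the OUT(), IN() and BOTH() graph functions and are not part of any OrientDB API.

```python
# Edges of the example graph as (edge class, from RID, to RID) tuples.
EDGES = [
    ("Eat", "#11:0", "#12:0"),     # Luca -> Dante
    ("Eat", "#11:1", "#12:0"),     # Bill -> Dante
    ("Eat", "#11:2", "#12:1"),     # Jay -> Charlie
    ("Friend", "#11:0", "#11:2"),  # Luca -> Jay
]


def out(rid, label=None):
    """Adjacent vertices reached over outgoing edges, like OUT()."""
    return [dst for cls, src, dst in EDGES if src == rid and label in (None, cls)]


def in_(rid, label=None):
    """Adjacent vertices reached over incoming edges, like IN()."""
    return [src for cls, src, dst in EDGES if dst == rid and label in (None, cls)]


def both(rid, label=None):
    """Adjacent vertices in either direction, like BOTH()."""
    return in_(rid, label) + out(rid, label)


print(in_("#12:0"))                # who eats at Dante
friends = both("#11:0", "Friend")  # Luca's friends, ignoring Eat edges
print(friends)
print([r for f in friends for r in out(f, "Eat")])  # where Luca's friends eat
```

Passing an edge class as the label restricts the traversal to edges of that class, which is why the friend lookup skips the Eat edge between Luca and Dante.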
Lightweight Edges
Beginning with version 1.4.x, OrientDB manages some edges as Lightweight Edges. Lightweight Edges do not have Record IDs, but are
physically stored as links within vertices. Note that OrientDB uses a Lightweight Edge only when the edge has no properties;
otherwise it uses the standard Edge.
From a logical point of view, Lightweight Edges are edges in all respects, so all graph functions work with them. Their purpose is to
improve performance and reduce disk space.
Because Lightweight Edges don't exist as separate records in the database, some queries won't work as expected. For instance,
orientdb>
SELECT FROM E
In most cases, edges are used to connect vertices, so this query would not cause any particular problems. However, it would not return
Lightweight Edges in the result-set. In the event that you need to query edges directly, including those with no properties, disable the
Lightweight Edge feature.
To disable the Lightweight Edge feature, execute the following command.
orientdb>
ALTER DATABASE CUSTOM useLightweightEdges=FALSE
You only need to execute this command once. OrientDB now generates new edges as the standard Edge, rather than the Lightweight
Edge. Note that this does not affect existing edges.
For troubleshooting information on Lightweight Edges, see Why I can't see all the edges. For more information on the Graph model in
OrientDB, see Graph API.
Using Schema with Graphs
OrientDB, through the Graph API, offers a number of features above and beyond the traditional Graph Databases given that it supports
concepts drawn from both the Document Database and the Object Oriented worlds. For instance, consider the power of graphs, when
used in conjunction with schemas and constraints.
Use Case: Car Database
For this example, consider a graph database that maps the relationship between individual users and their cars. First, create the graph
schema for the Person and Car vertex classes, as well as the Owns edge class to connect the two:
orientdb>
CREATE CLASS Person EXTENDS V
orientdb>
CREATE CLASS Car EXTENDS V
orientdb>
CREATE CLASS Owns EXTENDS E
These commands lay out the schema for your graph database. That is, they define two vertex classes and an edge class to indicate the
relationship between the two. With that, you can begin to populate the database with vertices and edges.
orientdb>
CREATE VERTEX Person SET name = 'Luca'
Created vertex 'Person#11:0{name:Luca} v1' in 0,012000 sec(s).
orientdb>
CREATE VERTEX Car SET name = 'Ferrari Modena'
Created vertex 'Car#12:0{name:Ferrari Modena} v1' in 0,001000 sec(s).
orientdb>
CREATE EDGE Owns FROM ( SELECT FROM Person ) TO ( SELECT FROM Car )
Created edge '[e[#11:0->#12:0][#11:0-Owns->#12:0]]' in 0,005000 sec(s).
Querying the Car Database
In the above section, you created a car database and populated it with vertices and edges to map out the relationship between drivers and
their cars. Now you can begin to query this database, showing what those connections are. For example, what is Luca's car? You can find
out by traversing from the vertex Luca to the outgoing vertices following the Owns relationship.
orientdb>
SELECT name FROM ( SELECT EXPAND( OUT('Owns') ) FROM Person
WHERE name='Luca' )
----+-------+-----------------+
 #  | @RID  | name            |
----+-------+-----------------+
 0  | #-2:1 | Ferrari Modena  |
----+-------+-----------------+
As you can see, the query returns that Luca owns a Ferrari Modena. Now consider expanding your database to track where each person
lives.
Adding a Location Vertex
Consider a situation, in which you might want to keep track of the countries in which each person lives. In practice, there are a number
of reasons why you might want to do this, for instance, for the purposes of promotional material or in a larger database to analyze the
connections to see how residence affects car ownership.
To begin, create a vertex class for the country in which the person lives, and an edge class that connects the individual to the place.
orientdb>
CREATE CLASS Country EXTENDS V
orientdb>
CREATE CLASS Lives EXTENDS E
This creates the schema for the feature you're adding to the cars database. It adds the vertex class Country, which records the countries
in which people live, and the edge class Lives, which connects individuals in the vertex class Person to entries in Country.
With the schema laid out, create a vertex for the United Kingdom and connect it to the person Luca.
orientdb>
CREATE VERTEX Country SET name='UK'
Created vertex 'Country#14:0{name:UK} v1' in 0,004000 sec(s).
orientdb>
CREATE EDGE Lives FROM ( SELECT FROM Person ) TO ( SELECT FROM Country )
Created edge '[e[#11:0->#14:0][#11:0-Lives->#14:0]]' in 0,006000 sec(s).
The second command creates an edge connecting the person Luca to the country United Kingdom. Now that your cars database is
defined and populated, you can query it, such as a search that shows the countries where there are users that own a Ferrari.
orientdb>
SELECT name FROM ( SELECT EXPAND( IN('Owns').OUT('Lives') )
FROM Car WHERE name LIKE '%Ferrari%' )
---+-------+--------+
 # | @RID  | name   |
---+-------+--------+
 0 | #-2:1 | UK     |
---+-------+--------+
Using in and out Constraints on Edges
In the above sections, you modeled the graph using a schema without any constraints, but you might find it useful to use some. For
instance, it would be good to require that an Owns relationship only exist between the vertex Person and the vertex Car.
orientdb>
CREATE PROPERTY Owns.out LINK Person
orientdb>
CREATE PROPERTY Owns.in LINK Car
These commands link outgoing vertices of the Person class to incoming vertices of the Car class. That is, they configure your database
so that a user can own a car, but a car cannot own a user.
Using MANDATORY Constraints on Edges
By default, when OrientDB creates an edge that lacks properties, it creates it as a Lightweight Edge. That is, it creates an edge that has
no physical record in the database. Using the MANDATORY setting, you can stop this behavior, forcing it to create the standard Edge,
without outright disabling Lightweight Edges.
orientdb>
ALTER PROPERTY Owns.out MANDATORY TRUE
orientdb>
ALTER PROPERTY Owns.in MANDATORY TRUE
Using UNIQUE with Edges
For the sake of simplicity, consider a case where you want to limit the way people are connected to cars to where the user can only
match to the car once. That is, if Luca owns a Ferrari Modena, you might prefer not to have a double entry for that car in the event that
he buys a new one a few years later. This is particularly important given that our database covers make and model, but not year.
To manage this, you need to define a UNIQUE index against both the out and in properties.
orientdb>
CREATE INDEX UniqueOwns ON Owns(out,in) UNIQUE
Created index successfully with 0 entries in 0,023000 sec(s).
The output tells us that no entries are indexed. You have already created the Owns relationship between Luca and the Ferrari
Modena. In that case, however, OrientDB had created a Lightweight Edge before you set the rule to force the creation of documents for
Owns instances. To fix this, you need to drop and recreate the edge.
orientdb>
DELETE EDGE FROM #11:0 TO #12:0
orientdb>
CREATE EDGE Owns FROM ( SELECT FROM Person ) TO ( SELECT FROM Car )
To confirm that this was successful, run a query to check that a record was created:
orientdb>
SELECT FROM Owns
---+-------+-------+--------+
 # | @RID  | out   | in     |
---+-------+-------+--------+
 0 | #13:0 | #11:0 | #12:0  |
---+-------+-------+--------+
This shows that a record was indeed created. To confirm that the constraints work, attempt to create an edge in Owns that connects
Luca to the United Kingdom.
orientdb>
CREATE EDGE Owns FROM ( SELECT FROM Person ) TO ( SELECT FROM Country )
Error: com.orientechnologies.orient.core.exception.OCommandExecutionException:
Error on execution of command: sql.create edge Owns from (select from Person)...
Error: com.orientechnologies.orient.core.exception.OValidationException: The
field 'Owns.in' has been declared as LINK of type 'Car' but the value is the
document #14:0 of class 'Country'
This shows that the constraints effectively blocked the creation, generating a set of errors to explain why it was blocked.
You now have a typed graph with constraints. For more information, see Graph Schema.
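The combined effect of the LINK constraints and the UNIQUE index can be pictured with a small Python model. This is a sketch of the rules described in this section, not of how OrientDB implements them: the names ValidationError, OWNS_LINKS, owns_index and create_owns are all hypothetical, and the Record IDs are the ones used in the examples above.

```python
class ValidationError(Exception):
    """Raised when an edge would violate one of the modeled constraints."""


# Record ID -> class name for the vertices created in this section.
VERTEX_CLASS = {"#11:0": "Person", "#12:0": "Car", "#14:0": "Country"}

# Mirrors CREATE PROPERTY Owns.out LINK Person / Owns.in LINK Car.
OWNS_LINKS = {"out": "Person", "in": "Car"}

# Plays the role of the UNIQUE index on Owns(out, in).
owns_index = set()


def create_owns(out_rid, in_rid):
    """Create an Owns edge if it satisfies the class and uniqueness rules."""
    for field, rid in (("out", out_rid), ("in", in_rid)):
        expected, actual = OWNS_LINKS[field], VERTEX_CLASS[rid]
        if actual != expected:
            raise ValidationError(
                f"Owns.{field} must link to {expected}, got {rid} of class {actual}")
    if (out_rid, in_rid) in owns_index:
        raise ValidationError(f"duplicate Owns edge {out_rid} -> {in_rid}")
    owns_index.add((out_rid, in_rid))
    return (out_rid, in_rid)


create_owns("#11:0", "#12:0")      # Luca owns the Ferrari: accepted
for target in ("#14:0", "#12:0"):  # wrong class, then a duplicate
    try:
        create_owns("#11:0", target)
    except ValidationError as err:
        print("rejected:", err)
```

The first rejected call corresponds to the failed attempt to connect Luca to the United Kingdom, and the second to the double entry that the UNIQUE index is meant to prevent.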
Setting up a Distributed Graph Database
In addition to the standard deployment architecture, where it runs as a single, standalone database instance, you can also deploy
OrientDB using Distributed Architecture. In this environment, it shares the database across multiple server instances.
Launching Distributed Server Cluster
There are two ways to share a database across multiple server nodes:
Prior to startup, copy the specific database directory, under $ORIENTDB_HOME/databases, to all servers.
Keep the database on the first running server node, then start every other server node. Under the default configurations, OrientDB
automatically shares the database with the new servers that join.
This tutorial assumes that you want to start a distributed database using the second method.
NOTE: When you run in distributed mode, OrientDB needs more RAM. The minimum is 2GB of heap, but we suggest using at least 4GB
of heap memory. To change the heap, modify the Java memory settings in the file bin/dserver.sh (or dserver.bat on Windows).
Starting the First Server Node
Unlike the standard standalone deployment of OrientDB, there is a different script that you need to use when launching a distributed
server instance. Instead of server.sh, you use dserver.sh. In the case of Windows, use dserver.bat. Whichever you need, you can
find it in the bin of your installation directory.
$
./bin/dserver.sh
Bear in mind that OrientDB uses the same orientdb-server-config.xml configuration file, regardless of whether it's running as a server
or distributed server. For more information, see Distributed Configuration.
The first time you start OrientDB as a distributed server, it generates the following output:
+---------------------------------------------------------------+
|         WARNING: FIRST DISTRIBUTED RUN CONFIGURATION          |
+---------------------------------------------------------------+
| This is the first time that the server is running as          |
| distributed. Please type the name you want to assign to the   |
| current server node.                                          |
|                                                               |
| To avoid this message set the environment variable or JVM     |
| setting ORIENTDB_NODE_NAME to the server node name to use.    |
+---------------------------------------------------------------+
Node name [BLANK=auto generate it]:
You need to give the node a name here. OrientDB stores it in the nodeName parameter of OHazelcastPlugin. It adds the variable to
your orientdb-server-config.xml configuration file.
Distributed Startup Process
When OrientDB starts as a distributed server instance, it loads all databases in the databases directory and configures them to run in
distributed mode. For this reason, on the first load, OrientDB copies the default distributed configuration (that is, the
default-distributed-db-config.json configuration file) into each database's directory, renaming it distributed-config.json. On subsequent
starts, each database uses this file instead of the default configuration file. Since the shape of the cluster changes every time nodes join or
leave, the configuration is kept up to date by each distributed server instance.
For more information on working with the default-distributed-db-config.json configuration file, see Distributed Configuration.
Starting Additional Server Nodes
When you have the first server node running, you can begin to start the other server nodes. Each server requires the same Hazelcast
credentials in order to join the same cluster. You can define these in the hazelcast.xml configuration file.
The fastest way to initialize multiple server nodes is to copy the OrientDB installation directory from the first node to each of the
subsequent nodes. For instance,
$
scp -r $ORIENTDB_HOME user@ip_address:$ORIENTDB_HOME
This copies both the databases and their configuration files onto the new distributed server node.
Bear in mind, if you run multiple server instances on the same host, such as when testing, you need to change the port entry in
the hazelcast.xml configuration file.
For the other server nodes in the cluster, use the same dserver.sh command as you used in starting the first node. When the other
server nodes come online, they begin to establish network connectivity with each other. Monitoring the logs, you can see where they
establish connections from messages such as this:
WARN [node1384014656983] added new node id=Member [192.168.1.179]:2435 name=null
[OHazelcastPlugin]
INFO [192.168.1.179]:2434 [orientdb] Re-partitioning cluster data... Migration
queue size: 135 [PartitionService]
INFO [192.168.1.179]:2434 [orientdb] All migration tasks has been completed,
queues are empty. [PartitionService]
INFO [node1384014656983] added node configuration id=Member [192.168.1.179]:2435
name=node1384015873680, now 2 nodes are configured [OHazelcastPlugin]
INFO [node1384014656983] update configuration db=GratefulDeadConcerts
from=node1384015873680 [OHazelcastPlugin]
WARN [node1383734730415]->[node1384015873680] deploying database
GratefulDeadConcerts...[ODeployDatabaseTask]
WARN [node1383734730415]->[node1384015873680] sending the compressed database
GratefulDeadConcerts over the network, total 339,66Kb [ODeployDatabaseTask]
In the example, two server nodes were started on the same machine. It has an IP address of 192.168.1.179, and runs OrientDB on two
different ports: 2434 and 2435, where the current node is called this. The remainder of the log is relative to the distribution of the
database to the second server.
On the second server node output, OrientDB dumps messages like this:
WARN [node1384015873680]<-[node1383734730415] installing database
GratefulDeadConcerts in databases/GratefulDeadConcerts... [OHazelcastPlugin]
WARN [node1384015873680] installed database GratefulDeadConcerts in
databases/GratefulDeadConcerts, setting it online... [OHazelcastPlugin]
WARN [node1384015873680] database GratefulDeadConcerts is online [OHazelcastPlugin]
WARN [node1384015873680] updated node status to 'ONLINE' [OHazelcastPlugin]
INFO OrientDB Server v2.2.11-SNAPSHOT is active. [OServer]
What these messages mean is that the database GratefulDeadConcerts was correctly installed from the first node, that is
node1383734730415, through the network.
Migrating from standalone server to a cluster
If you have a standalone instance of OrientDB and you want to move to a cluster you should follow these steps:
Install OrientDB on all the servers of the cluster and configure it (according to the sections above)
Stop the standalone server
Copy the specific database directories under $ORIENTDB_HOME/databases to all the servers of the cluster
Start all the servers in the cluster using the script dserver.sh (or dserver.bat if on Windows)
If the standalone server will be part of the cluster, you can use the existing installation of OrientDB; you don't need to copy the
database directories since they're already in place and you just have to start it before all the other servers with dserver.sh.
Working with Distributed Graphs
When OrientDB joins a distributed cluster, all clients connecting to the server node are constantly notified about this state. This ensures
that, in the event that a server node fails, the clients can switch transparently to the next available server.
You can check this through the console. When OrientDB runs in a distributed configuration, the current cluster shape is visible through
the INFO command.
$
$ORIENTDB_HOME/bin/console.sh
OrientDB console v.1.6 www.orientechnologies.com
Type 'help' to display all the commands supported.
Installing extensions for GREMLIN language v.2.5.0-SNAPSHOT
orientdb>
CONNECT remote:localhost/GratefulDeadConcerts admin admin
Connecting to database [remote:localhost/GratefulDeadConcerts] with user 'admin'...OK
orientdb>
INFO
Current database: GratefulDeadConcerts (url=remote:localhost/GratefulDeadConcerts)
For reference purposes, the server nodes in the example have the following configurations. As you can see, it is a two node cluster
running on a single server host. The first node listens on HTTP port 2480 while the second listens on port 2481.
+---------+------+-----------------------------------------+-----+---------+--------------+--------------+-----------------------+
|Name     |Status|Databases                                |Conns|StartedOn|Binary        |HTTP          |UsedMemory             |
+---------+------+-----------------------------------------+-----+---------+--------------+--------------+-----------------------+
|europe-0 |ONLINE|distributed-node-deadlock=ONLINE (MASTER)|5    |16:53:59 |127.0.0.1:2424|127.0.0.1:2480|269.32MB/3.56GB (7.40%)|
|europe-1 |ONLINE|distributed-node-deadlock=ONLINE (MASTER)|4    |16:54:03 |127.0.0.1:2425|127.0.0.1:2481|268.89MB/3.56GB (7.38%)|
+---------+------+-----------------------------------------+-----+---------+--------------+--------------+-----------------------+
Testing Distributed Architecture
Once you have a distributed database up and running, you can begin to test its operations on a running environment. For example, begin
by creating a vertex, setting the node property to 1.
orientdb>
CREATE VERTEX V SET node = 1
Created vertex 'V#9:815{node:1} v1' in 0,013000 sec(s).
From another console, connect to the second node and execute the following command:
orientdb>
SELECT FROM V WHERE node = 1
----+--------+-------+
 #  | @RID   | node  |
----+--------+-------+
 0  | #9:815 | 1     |
----+--------+-------+
1 item(s) found. Query executed in 0.19 sec(s).
This shows that the vertex created on the first node has successfully replicated to the second node.
Logs in Distributed Architecture
From time to time server nodes go down. This does not necessarily relate to problems in OrientDB, (for instance, it could originate from
limitations in system resources).
To test this out, kill the first node. For example, assuming the first node has a process identifier, (that is, a PID), of 1254 on your
system, run the following command:
$
kill -9 1254
This command kills the process on PID 1254. Now, check the log messages for the second node:
$
less orientdb.log
INFO [127.0.0.1]:2435 [orientdb] Removing Member [127.0.0.1]:2434
[ClusterService]
INFO [127.0.0.1]:2435 [orientdb]
Members [1] {
Member [127.0.0.1]:2435 this
}
[ClusterService]
WARN [europe-0] node removed id=Member [127.0.0.1]:2434
name=europe-1 [OHazelcastPlugin]
INFO [127.0.0.1]:2435 [orientdb] Partition balance is ok, no need to
re-partition cluster data...
[PartitionService]
What the logs show you is that the second node is now aware that it cannot reach the first node. You can further test this by running the
console connected to the first node.
orientdb>
SELECT FROM V LIMIT 2
WARN Caught I/O errors from /127.0.0.1:2425 (local
socket=0.0.0.0/0.0.0.0:51512), trying to reconnect (error:
java.io.IOException: Stream closed) [OStorageRemote]
WARN Connection re-acquired transparently after 30ms and 1 retries: no errors
will be thrown at application level [OStorageRemote]
---+------+----------------+-----------+--------------+------+-----------------+-----
 # | @RID | name           | song_type | performances | type | out_followed_by | ...
---+------+----------------+-----------+--------------+------+-----------------+-----
 1 | #9:1 | HEY BO DIDDLEY | cover     | 5            | song | [5]             | ...
 2 | #9:2 | IM A MAN       | cover     | 1            | song | [2]             | ...
---+------+----------------+-----------+--------------+------+-----------------+-----
This shows that the console auto-switched to the next available node. That is, it switched to the second node upon noticing that the first
was no longer functional. The warnings show what happened, but the switch itself is transparent, so the application doesn't need to
manage the issue.
From the console connected to the second node, create a new vertex.
orientdb>
CREATE VERTEX V SET node=2
Created vertex 'V#9:816{node:2} v1' in 0,014000 sec(s).
Given that the first node remains nonfunctional, OrientDB journals the operation. Once the first node comes back online, the second
node synchronizes the changes into it.
Restart the first node and check that it successfully auto-realigns. Reconnect the console to the first node and run the following
command:
orientdb>
SELECT FROM V WHERE node=2
---+--------+-------+
 # | @RID   | node  |
---+--------+-------+
 0 | #9:816 | 2     |
---+--------+-------+
1 item(s) found. Query executed in 0.209 sec(s).
This shows that the first node has realigned itself with the second node.
This process is repeatable with N server nodes, where every server is a master. There is no limit to the number of running servers. With
many servers spread across a slow network, you can tune the network timeouts to be more permissive and let a large, distributed cluster
of servers work properly.
For more information, see Distributed Architecture.
Multi-Model
The OrientDB engine supports Graph, Document, Key/Value, and Object models, so you can use OrientDB as a replacement for a
product in any of these categories. However, the main reason why users choose OrientDB is its true Multi-Model DBMS
abilities, which combine all the features of the four models into the core. These abilities are not just interfaces to the database engine, but
rather the engine itself was built to support all four models. This is also the main difference to other multi-model DBMSs, as they
implement an additional layer with an API, which mimics additional models. However, under the hood, they're truly only one model,
therefore they are limited in speed and scalability.
The Document Model
The data in this model is stored inside documents. A document is a set of key/value pairs (also referred to as fields or properties), where
the key allows access to its value. Values can hold primitive data types, embedded documents, or arrays of other values. Documents are
not typically forced to have a schema, which can be advantageous, because they remain flexible and easy to modify. Documents are
stored in collections, enabling developers to group data as they decide. OrientDB uses the concepts of "classes" and "clusters" as its
form of "collections" for grouping documents. This provides several benefits, which we will discuss in further sections of the
documentation.
OrientDB's Document model also adds the concept of a "LINK" as a relationship between documents. With OrientDB, you can decide
whether to embed documents or link to them directly. When you fetch a document, all the links are automatically resolved by OrientDB.
This is a major difference from other Document Databases, like MongoDB or CouchDB, where developers must handle any and all
relationships between the documents themselves.
The table below illustrates the comparison between the relational model, the document model, and the OrientDB document model:
| Relational Model | Document Model | OrientDB Document Model |
|---|---|---|
| Table | Collection | Class or Cluster |
| Row | Document | Document |
| Column | Key/value pair | Document field |
| Relationship | not available | Link |
The Graph Model
A graph represents a network-like structure consisting of Vertices (also known as Nodes) interconnected by Edges (also known as Arcs).
OrientDB's graph model is represented by the concept of a property graph, which defines the following:
Vertex - an entity that can be linked with other Vertices and has the following mandatory properties:
unique identifier
set of incoming Edges
set of outgoing Edges
Edge - an entity that links two Vertices and has the following mandatory properties:
unique identifier
link to an incoming Vertex (also known as head)
link to an outgoing Vertex (also known as tail)
label that defines the type of connection/relationship between head and tail vertex
In addition to mandatory properties, each vertex or edge can also hold a set of custom properties. These properties can be defined by
users, which can make vertices and edges appear similar to documents. In the table below, you can find a comparison between the graph
model, the relational data model, and the OrientDB graph model:
| Relational Model | Graph Model | OrientDB Graph Model |
|---|---|---|
| Table | Vertex and Edge Class | Class that extends "V" (for Vertex) and "E" (for Edges) |
| Row | Vertex | Vertex |
| Column | Vertex and Edge property | Vertex and Edge property |
| Relationship | Edge | Edge |
The Key/Value Model
This is the simplest of the models. Everything in the database can be reached by a key, where the values can be simple or complex
types. OrientDB supports Documents and Graph Elements as values, allowing for a richer model than what you would normally find in
the classic Key/Value model. The classic Key/Value model provides "buckets" to group key/value pairs in different containers. The most
classic use cases of the Key/Value Model are:
POST the value as payload of the HTTP call -> /<bucket>/<key>
GET the value as payload from the HTTP call -> /<bucket>/<key>
DELETE the value by Key, by calling the HTTP call -> /<bucket>/<key>
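The bucket-and-key addressing described above can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the class BucketStoreSketch and its methods are hypothetical names, not part of any OrientDB API, and the values are plain strings rather than Documents or Graph Elements.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory model of a bucketed key/value store (not an OrientDB API). */
final class BucketStoreSketch {
    // bucket name -> (key -> value)
    private final Map<String, Map<String, String>> buckets = new HashMap<>();

    /** Mirrors POST: store the payload under bucket/key, replacing any old value. */
    void post(String bucket, String key, String value) {
        buckets.computeIfAbsent(bucket, name -> new HashMap<>()).put(key, value);
    }

    /** Mirrors GET: return the value, or empty if the bucket or key is unknown. */
    Optional<String> get(String bucket, String key) {
        Map<String, String> entries = buckets.get(bucket);
        return entries == null ? Optional.empty() : Optional.ofNullable(entries.get(key));
    }

    /** Mirrors DELETE: remove the key; returns true if something was removed. */
    boolean delete(String bucket, String key) {
        Map<String, String> entries = buckets.get(bucket);
        return entries != null && entries.remove(key) != null;
    }
}
```

Each bucket is an independent container, so the same key can exist in two buckets without conflict.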
The table below illustrates the comparison between the relational model, the Key/Value model, and the OrientDB Key/Value model:
| Relational Model | Key/Value Model | OrientDB Key/Value Model |
|---|---|---|
| Table | Bucket | Class or Cluster |
| Row | Key/Value pair | Document |
| Column | not available | Document field or Vertex/Edge property |
| Relationship | not available | Link |
The Object Model
This model is inherited from Object-Oriented programming and supports Inheritance between types (sub-types extend the
super-types), Polymorphism when you refer to a base class, and Direct binding from/to Objects used in programming languages.
The table below illustrates the comparison between the relational model, the Object model, and the OrientDB Object model:
| Relational Model | Object Model | OrientDB Object Model |
|---|---|---|
| Table | Class | Class or Cluster |
| Row | Object | Document or Vertex |
| Column | Object property | Document field or Vertex/Edge property |
| Relationship | Pointer | Link |
Basic Concepts
Record
A Record is the smallest unit that you can load from and store in the database. Records come in four types:
Document
RecordBytes (BLOB)
Vertex
Edge
Document
The Document is the most flexible record type available in OrientDB. Documents are softly typed and are defined by schema classes
with defined constraints, but you can also use them in schema-less mode.
Documents handle fields in a flexible manner. You can easily import and export them in JSON format. For example,
{
  "name": "Jay",
  "surname": "Miner",
  "job": "Developer",
  "creations": [
    {
      "name": "Amiga 1000",
      "company": "Commodore Inc."
    }, {
      "name": "Amiga 500",
      "company": "Commodore Inc."
    }
  ]
}
For Documents, OrientDB also supports complex relationships. From the perspective of developers, this can be understood as a
persistent Map.
BLOB
In addition to the Document record type, OrientDB can also load and store binary data. The BLOB record type was called
RecordBytes before OrientDB v2.2.
Vertex
In Graph databases, the most basic unit of data is the node, which in OrientDB is called a vertex. The Vertex stores information for the
database. There is a separate record type called the Edge that connects one vertex to another.
Vertices are also documents. This means they can contain embedded records and arbitrary properties.
Edge
In Graph databases, an arc is the connection between two nodes, which in OrientDB is called an edge. Edges are bidirectional and can
only connect two vertices.
Edges can be regular or lightweight. The Regular Edge saves as a Document, while the Lightweight Edge does not. For an understanding
of the differences between these, see Lightweight Edges.
For more information on connecting vertices in general, see Relationships, below.
Record ID
When OrientDB generates a record, it auto-assigns a unique identifier, called a Record ID, or RID. The syntax for the Record ID is
the pound sign followed by the cluster identifier and the position, in the format #<cluster>:<position>.
Cluster Identifier: This number indicates the cluster to which the record belongs. Positive numbers in the cluster identifier
indicate persistent records. Negative numbers indicate temporary records, such as those that appear in result-sets for queries that
use projections.
Position: This number defines the absolute position of the record in the cluster.
NOTE: The prefix character # is mandatory to recognize a Record ID.
Records never lose their identifiers unless they are deleted. When a record is deleted, OrientDB never recycles its identifier, except
with local storage. Additionally, you can access records directly through their Record IDs. For this reason, you don't need to create a
field to serve as the primary key, as you do in Relational databases.
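To make the Record ID layout concrete, here is a minimal parsing sketch. It assumes only what the text states: a leading #, a cluster identifier, a colon, and a position, with negative cluster identifiers marking temporary records. The class RecordIdSketch is a hypothetical illustration and is not the identifier class that OrientDB itself uses.

```java
/** Hypothetical value type for a Record ID; not an OrientDB class. */
final class RecordIdSketch {
    final int clusterId;
    final long position;

    RecordIdSketch(int clusterId, long position) {
        this.clusterId = clusterId;
        this.position = position;
    }

    /** Parses text of the form "#cluster:position", for example "#5:23". */
    static RecordIdSketch parse(String text) {
        if (text == null || !text.startsWith("#")) {
            throw new IllegalArgumentException("A Record ID must start with '#': " + text);
        }
        int colon = text.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing ':' separator: " + text);
        }
        int cluster = Integer.parseInt(text.substring(1, colon));
        long pos = Long.parseLong(text.substring(colon + 1));
        return new RecordIdSketch(cluster, pos);
    }

    /** Per the text, a negative cluster identifier marks a temporary record. */
    boolean isTemporary() {
        return clusterId < 0;
    }

    @Override
    public String toString() {
        return "#" + clusterId + ":" + position;
    }
}
```

For instance, RecordIdSketch.parse("#5:23") yields cluster 5 and position 23, while a string without the # prefix is rejected.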
Record Version
Records maintain their own version number, which increments on each update. In optimistic transactions, OrientDB checks the version
in order to avoid conflicts at commit time.
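The version check can be sketched as a compare-then-update step. The following is a simplified in-memory illustration, not OrientDB's transaction code: the class VersionedStoreSketch, its method names, and the starting version of 1 are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store illustrating a version check; not OrientDB code. */
final class VersionedStoreSketch {
    private static final class Entry {
        String value;
        int version;

        Entry(String value) {
            this.value = value;
            this.version = 1; // starting version is an arbitrary choice for this sketch
        }
    }

    private final Map<String, Entry> records = new HashMap<>();

    void create(String id, String value) {
        records.put(id, new Entry(value));
    }

    int versionOf(String id) {
        return records.get(id).version;
    }

    String valueOf(String id) {
        return records.get(id).value;
    }

    /**
     * Applies the update only if the caller read the current version.
     * Returns false (a conflict) when another writer updated the record first.
     */
    boolean update(String id, int expectedVersion, String newValue) {
        Entry entry = records.get(id);
        if (entry == null || entry.version != expectedVersion) {
            return false;
        }
        entry.value = newValue;
        entry.version++; // every successful update increments the version
        return true;
    }
}
```

If two writers both read version 1, the first update succeeds and moves the record to version 2, and the second is reported as a conflict so the caller can reload and retry.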
Class
The concept of the Class is taken from the Object Oriented Programming paradigm. In OrientDB, classes define records. A class is
closest to the concept of a table in Relational databases.
Classes can be schema-less, schema-full or a mix. They can inherit from other classes, creating a tree of classes. Inheritance, in this
context, means that a sub-class extends a parent class, inheriting all of its attributes.
Each class has its own cluster. A class must have at least one cluster defined, which functions as its default cluster. But, a class can
support multiple clusters. When you execute a query against a class, it automatically propagates to all clusters that are part of the class.
When you create a new record, OrientDB selects the cluster to store it in using a configurable strategy.
When you create a new class, by default, OrientDB creates a new persistent cluster with the same name as the class, in lowercase.
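The relationship between a class and its clusters can be modeled in a few lines. This sketch assumes a round-robin selection strategy purely for illustration; the text only says the strategy is configurable. The class ClassClustersSketch and the cluster names are hypothetical and do not correspond to an OrientDB API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical model of a class and its clusters; not an OrientDB API. */
final class ClassClustersSketch {
    private final List<String> clusters = new ArrayList<>();
    private int next = 0;

    /** A new class starts with one default cluster named after it, in lowercase. */
    ClassClustersSketch(String className) {
        clusters.add(className.toLowerCase(Locale.ROOT));
    }

    void addCluster(String clusterName) {
        clusters.add(clusterName);
    }

    String defaultCluster() {
        return clusters.get(0);
    }

    /** Round-robin (assumed strategy): each new record goes to the next cluster in turn. */
    String selectClusterForNewRecord() {
        String chosen = clusters.get(next);
        next = (next + 1) % clusters.size();
        return chosen;
    }

    /** A query against the class covers every cluster that belongs to it. */
    List<String> clustersToScan() {
        return new ArrayList<>(clusters);
    }
}
```

A different strategy would only change selectClusterForNewRecord; the set of clusters that a class-level query covers stays the same.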
Abstract Class
The concept of an Abstract Class is one familiar to Object-Oriented programming. In OrientDB, this feature has been available since
version 1.2.0. Abstract classes are classes used as the foundation for defining other classes. They are also classes that cannot have
instances. For more information on how to create an abstract class, see CREATE CLASS.
This concept is essential to Object Orientation, and it avoids cluttering the database with auto-created clusters that would always be empty.
For more information on Abstract Class as a concept, see Abstract Type and Abstract Methods and Classes.
Class vs. Cluster in Queries
The combination of classes and clusters is very powerful and has a number of use cases. Consider an example where you create a class
Invoice, with two clusters invoice2016 and invoice2017. You can query all invoices using the class as a target with SELECT.
orientdb> SELECT FROM Invoice
In addition to this, you can filter the result-set by year. If the class Invoice includes a year field, you can filter on it through the
WHERE clause.
orientdb> SELECT FROM Invoice WHERE year = 2016
You can also query specific objects from a single cluster. By splitting the class Invoice across multiple clusters (that is, one per year),
you can optimize the query by narrowing the potential result-set.
orientdb> SELECT FROM CLUSTER:invoice2016
Because OrientDB can narrow the search to the targeted cluster, this query can run significantly faster than a query against the whole class.
Cluster
Where classes provide you with a logical framework for organizing data, clusters provide the physical or in-memory space in which
OrientDB actually stores the data. A cluster is comparable to the collection in Document databases and the table in Relational databases.
When you create a new class, the CREATE CLASS process also creates a physical cluster that serves as the default location in which to
store data for that class. OrientDB forms the cluster name from the class name, in all lowercase letters. Beginning with version 2.2,
OrientDB creates additional clusters for each class (one for each CPU core on the server) to improve parallel performance.
For more information, see the Clusters Tutorial.
Relationships
OrientDB supports two kinds of relationships: referenced and embedded. It can manage relationships in a schema-full or schema-less
scenario.
Referenced Relationships
In Relational databases, tables are linked through JOIN commands, which can prove costly on computing resources. OrientDB manages
relationships natively without computing JOINs. Instead, it stores direct links to the target objects of the relationship. This boosts the
load speed for the entire graph of connected objects, such as in Graph and Object database systems.
For example:
                  customer
  Record A     ------------->    Record B
CLASS=Invoice                 CLASS=Customer
  RID=5:23                       RID=10:2
Here, record A contains the reference to record B in the property customer. Note that both records are reachable by other records,
given that they have a Record ID.
With the Graph API, Edges are represented with two links stored on both vertices to handle the bidirectional relationship.
1:1 and n:1 Referenced Relationships
OrientDB expresses relationships of these kinds using links of the LINK type.
1:n and n:n Referenced Relationships
OrientDB expresses relationships of these kinds using a collection of links, such as:
LINKLIST: An ordered list of links.
LINKSET: An unordered set of links, which does not accept duplicates.
LINKMAP: An ordered map of links, with String as the key type. Duplicate keys are not accepted.
With the Graph API, Edges connect only two vertices. This means that 1:n relationships are not allowed. To specify a 1:n relationship
with graphs, create multiple edges.
Embedded Relationships
When using Embedded relationships, OrientDB stores the relationship within the record that embeds it. These relationships are stronger
than Referenced relationships. You can represent them as a UML Composition relationship.
Embedded records do not have their own Record ID, so other records cannot reference them directly. They are only accessible
through the container record.
In the event that you delete the container record, the embedded record is also deleted. For example,
                  address
  Record A     <>---------->   Record B
CLASS=Account               CLASS=Address
  RID=5:23                     NO RID!
Here, record A contains the entirety of record B in the property address. You can reach record B only by traversing the container
record. For example,
orientdb> SELECT FROM Account WHERE address.city = 'Rome'
1:1 and n:1 Embedded Relationships
OrientDB expresses relationships of these kinds using the EMBEDDED type.
1:n and n:n Embedded Relationships
OrientDB expresses relationships of these kinds using a collection of embedded records, such as:
EMBEDDEDLIST: An ordered list of records.
EMBEDDEDSET: An unordered set of records, which doesn't accept duplicates.
EMBEDDEDMAP: An ordered map with records as the values and strings as the keys. It doesn't accept duplicate keys.
Inverse Relationships
In OrientDB, all Edges in the Graph model are bidirectional. This differs from the Document model, where relationships are always
unidirectional, requiring the developer to maintain data integrity. In addition, OrientDB automatically maintains the consistency of all
bidirectional relationships.
Database
The database is an interface to access the real Storage. It understands high-level concepts such as queries, schemas, metadata, indices
and so on. OrientDB also provides multiple database types. For more information on these types, see Database Types.
Each server or Java VM can handle multiple database instances, but the database name must be unique. You can't manage two databases
with the same name at the same time, even if they are in different directories. To handle this case, use the $ dollar character as a
separator instead of the / slash character. OrientDB binds the entire name, so it becomes unique, but at the file system level it converts
$ to /, allowing multiple databases with the same name in different paths. For example,
test$customers -> test/customers
production$customers -> production/customers
To open the database, use the following code:
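import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
// Two distinct databases whose directories are both named "customers".
ODatabaseDocumentTx test = new ODatabaseDocumentTx("remote:localhost/test$customers");
ODatabaseDocumentTx production = new ODatabaseDocumentTx("remote:localhost/production$customers");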
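The name-to-path conversion described above amounts to a character substitution. The helper below is a hypothetical illustration of that mapping, not a function OrientDB exposes; it shows why two names stay unique even though they end in the same directory name.

```java
/** Hypothetical helper showing the name-to-path mapping; not an OrientDB API. */
final class DatabaseNameSketch {
    /** Converts the '$' separator in a database name into '/' for the file system. */
    static String toPath(String databaseName) {
        return databaseName.replace('$', '/');
    }

    /** Returns the last path segment, which may be shared by several databases. */
    static String directoryName(String databaseName) {
        String path = toPath(databaseName);
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

Both test$customers and production$customers end in the directory customers, yet the full names differ, so they can be handled side by side.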
Database URL
OrientDB uses its own URL format, made of the engine and the database name, as <engine>:<db-name>.
| Engine | Description | Example |
|---|---|---|
| plocal | This engine writes to the file system to store data. There is a LOG of changes to restore the storage in case of a crash. | plocal:/temp/databases/petshop/petshop |
| memory | Open a database completely in memory | memory:petshop |
| remote | The storage will be opened via a remote network connection. It requires an OrientDB Server up and running. In this mode, the database is shared among multiple clients. Syntax: remote:<server>:[<port>]/db-name. The port is optional and defaults to 2424. | remote:localhost/petshop |
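As a rough illustration of how these URLs break down, the sketch below splits a URL into its engine and location, and for remote URLs into server, optional port, and database name. The class DatabaseUrlSketch is a hypothetical parser written for this example; it is not how OrientDB itself resolves URLs, and it only relies on the format and the default port of 2424 given in the table.

```java
/** Hypothetical parser for "engine:location" database URLs; not an OrientDB API. */
final class DatabaseUrlSketch {
    static final int DEFAULT_REMOTE_PORT = 2424; // default stated in the table above

    final String engine;   // plocal, memory or remote
    final String host;     // set only for remote URLs, otherwise null
    final int port;        // meaningful only for remote URLs, otherwise -1
    final String database; // path or name that follows the engine (and server) part

    private DatabaseUrlSketch(String engine, String host, int port, String database) {
        this.engine = engine;
        this.host = host;
        this.port = port;
        this.database = database;
    }

    static DatabaseUrlSketch parse(String url) {
        int colon = url.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Expected engine:location, got: " + url);
        }
        String engine = url.substring(0, colon);
        String rest = url.substring(colon + 1);
        if (!engine.equals("remote")) {
            return new DatabaseUrlSketch(engine, null, -1, rest);
        }
        int slash = rest.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Remote URL needs /db-name: " + url);
        }
        String server = rest.substring(0, slash);
        String database = rest.substring(slash + 1);
        int portSeparator = server.indexOf(':');
        if (portSeparator < 0) {
            // No explicit port: fall back to the documented default.
            return new DatabaseUrlSketch(engine, server, DEFAULT_REMOTE_PORT, database);
        }
        int port = Integer.parseInt(server.substring(portSeparator + 1));
        return new DatabaseUrlSketch(engine, server.substring(0, portSeparator), port, database);
    }
}
```

Parsing remote:localhost/petshop with this sketch gives host localhost, port 2424, and database petshop, while plocal and memory URLs keep everything after the engine as the database location.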
Database Usage
You must always close the database once you finish working on it.
NOTE: OrientDB automatically closes all opened databases, when the process dies gracefully (not by killing it by force). This is
assured if the Operating System allows a graceful shutdown.
Supported Types
OrientDB supports several types natively. Below is the complete table.
| #id | Type | Description | Java type | Minimum / Maximum | Auto-conversion from/to |
|---|---|---|---|---|---|
| 0 | Boolean | Handles only the values True or False | java.lang.Boolean or boolean | 0 / 1 | String |
| 1 | Integer | 32-bit signed Integers | java.lang.Integer or int | -2,147,483,648 / +2,147,483,647 | Any Number, String |
| 2 | Short | Small 16-bit signed integers | java.lang.Short or short | -32,768 / 32,767 | Any Number, String |
| 3 | Long | Big 64-bit signed integers | java.lang.Long or long | -2^63 / +2^63 - 1 | Any Number, String |
| 4 | Float | Decimal numbers | java.lang.Float or float | 2^-149 / (2 - 2^-23) * 2^127 | Any Number, String |
| 5 | Double | Decimal numbers with high precision | java.lang.Double or double | 2^-1074 / (2 - 2^-52) * 2^1023 | Any Number, String |
| 6 | Datetime | Any date with the precision up to milliseconds. To know more about it, look at Managing Dates | java.util.Date | 1002020303 | Date, Long, String |
| 7 | String | Any string as alphanumeric sequence of chars | java.lang.String | - | - |
| 8 | Binary | Can contain any value as byte array | byte[] | 0 / 2,147,483,647 | String |
| 9 | Embedded | The Record is contained inside the owner. The contained Record has no RecordId | ORecord | - | ORecord |
Embedded
list
The Records are contained inside
the owner. The contained records
have no RecordIds and are
reachable only by navigating the
owner record
List