H2o Package

2015-02-10

Page Count: 55

"Package ’h2o’"
February 9, 2015

R topics documented:
h2o-package
apply,H2OFrame-method
as.data.frame.H2OFrame
as.h2o
as.matrix.h2o
ASTNode-class
ClassesIntro
colnames<-,H2OFrame,H2OFrame-method
Export intro
h2o.anyFactor
h2o.assign
h2o.cbind
h2o.clusterInfo
h2o.clusterIsUp
h2o.createFrame
h2o.crossValidate
h2o.cut
h2o.ddply
h2o.deeplearning
h2o.dim
h2o.downloadAllLogs
h2o.downloadCSV
h2o.exportFile
h2o.exportHDFS
h2o.gbm
h2o.getFrame
h2o.getModel
h2o.glm
h2o.head
h2o.importFile
h2o.importFolder
h2o.importHDFS
h2o.importURL
h2o.init
h2o.kmeans
h2o.length
h2o.loadModel
h2o.logAndEcho
h2o.ls
h2o.mean
h2o.nrow
h2o.parseRaw
h2o.performance
h2o.rbind
h2o.removeAll
h2o.rm
h2o.saveModel
h2o.scale
h2o.sd
h2o.shutdown
h2o.synonym
h2o.table
h2o.uploadFile
h2o.var
h2o.word2vec
H2OConnection-class
H2OFrame-class
H2OFrame-Extract
H2OModel-class
H2OModelMetrics-class
H2OObject-class
H2ORawData-class
H2OW2V-class
is.factor,H2OFrame-method
LazyEval
MethodsIntro
MethodsMisc-descrip
Node-class
OpsIntro-descrip
print.H2OTable
quantile
summary
transform.H2OFrame

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

25
27
28
29
29
30
31
31
32
32
33
34
35
35
36
37
37
38
39
39
40
40
41
42
44
45
45
46
46
47
47
47
48
49
49
49
50
51
52
52

h2o-package

3

h2o-package

H2O R Interface

Description
This is a package for running H2O via its REST API from within R. To communicate with an H2O
instance, the version of the R package must match the version of H2O. When connecting to a new
H2O cluster, it is necessary to re-run the initializer.
Details
Package: h2o
Type: Package
Version: 0.1.27.1042
Branch: master
Date: Mon Feb 09 23:32:25 PST 2015
License: Apache License (== 2.0)
Depends: R (>= 2.13.0), RCurl, rjson, statmod, tools, methods, utils

This package allows the user to run basic H2O commands using R commands. In order to use it,
you must first have H2O running (see How to Start H2O). To run H2O on your local machine, call
h2o.init without any arguments, and H2O will be automatically launched on http://127.0.0.1:54321,
where the IP is "127.0.0.1" and the port is 54321. If H2O is running on a cluster, you must
provide the IP and port of the remote machine as arguments to the h2o.init() call.
H2O supports a number of standard statistical models, such as GLM, K-means, and Random Forest
classification. For example, to run GLM, call h2o.glm with the H2O parsed data and parameters
(response variable, error distribution, etc.) as arguments. (The operation will be done on the server
associated with the data object where H2O is running, not within the R environment.)
Note that no actual data is stored in the R workspace, and no actual work is carried out by R. R only
saves the named objects, which uniquely identify the data set, model, etc. on the server. When the
user makes a request, R queries the server via the REST API, which returns a JSON file with the
relevant information that R then displays in the console.
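
As a small illustration of the addressing described above, the following sketch builds the base URL of an H2O instance from an IP and a port. The helper name h2o_base_url and the remote address are hypothetical and are not part of the h2o package; only the default address (127.0.0.1, port 54321) comes from the text.

```r
# Hypothetical helper (not part of the h2o package): build the base URL
# that an R client would use to reach an H2O instance.
h2o_base_url <- function(ip = "127.0.0.1", port = 54321) {
  sprintf("http://%s:%d", ip, as.integer(port))
}

local_url  <- h2o_base_url()                       # default local instance
remote_url <- h2o_base_url("192.168.1.10", 54321)  # assumed example address
print(local_url)
print(remote_url)
```

In practice the same two values would be supplied as the ip and port arguments of h2o.init().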
Author(s)
Anqi Fu, Tom Kraljevic and Petr Maj, with contributions from the 0xdata team
Maintainer: Anqi Fu 
References
• 0xdata Homepage
• H2O Documentation
• H2O on Github


Examples
# Check connection with H2O and ensure local H2O R package matches server version.
# Optionally, set startH2O = TRUE to start H2O if it is not already running.
# Note that for startH2O to work, the IP must be 127.0.0.1 or localhost with port 54321.
library(h2o)
localH2O = h2o.init(ip = "127.0.0.1", port = 54321, startH2O = TRUE)
# Import iris dataset into H2O and print summary
irisPath = system.file("extdata", "iris.csv", package = "h2o")
iris.hex = h2o.importFile(localH2O, path = irisPath, key = "iris.hex")
summary(iris.hex)
# Search the H2O help pages, list the available demos, and run the PCA demo
??h2o
demo(package = "h2o")
demo(h2o.prcomp)
# Shutdown local H2O instance when finished
h2o.shutdown(localH2O)

apply,H2OFrame-method Overloaded 'apply' method from base::

Description
'apply' operates on H2OFrames (ASTs or H2OFrame objects) and returns an object of type H2OFrame.
Usage
## S4 method for signature H2OFrame
apply(X, MARGIN, FUN, ...)
Details
Overall plan:
The method passes an AST of the format
(apply $X #MARGIN $FUN a1 a2 ...)
ASTApply will parse the additional arguments into an AST[] _args. The length of this array must be
1 less than the number of args passed to FUN; otherwise, an exception is thrown.
The additional arguments are passed by calling _fun.exec(env, _args).
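
The calling convention (X, MARGIN, FUN, ...) is the one used by base R. The sketch below uses base R apply on an ordinary matrix rather than an H2OFrame, so it runs without an H2O instance. That the H2O method behaves the same way on an H2OFrame is an assumption drawn from the Usage line, not something verified here.

```r
# Base R 'apply' on an ordinary matrix (not an H2OFrame).
m <- matrix(1:6, nrow = 2)            # 2 rows, 3 columns, filled column-wise

row_sums  <- apply(m, 1, sum)         # MARGIN = 1: apply FUN to each row
col_means <- apply(m, 2, mean)        # MARGIN = 2: apply FUN to each column

# Arguments after FUN are forwarded to FUN. FUN takes two arguments here,
# so exactly one additional argument (p = 2) is supplied.
col_sumsq <- apply(m, 2, function(col, p) sum(col^p), 2)

print(row_sums)
print(col_means)
print(col_sumsq)
```

Note how the function takes two arguments while one additional argument is passed, which matches the "1 less" rule stated in the Details.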


as.data.frame.H2OFrame
Converts Parsed H2O Data into a Data Frame

Description
Downloads the H2O data and then scans it into an R data frame.
Usage
## S3 method for class H2OFrame
as.data.frame(x, ...)
Arguments
x

An H2OFrame object.

...

Further arguments to be passed down from other methods.

Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
as.data.frame.H2OFrame(prostate.hex)

as.h2o

R data.frame -> H2OFrame

Description
Import a local R data frame to the H2O cloud.
Usage
as.h2o(object, conn = h2o.getConnection(), key = "")
Arguments
object

An R data frame.

conn

An H2OConnection object containing the IP address and port number of the
H2O server.

key

A string with the desired name for the H2O key.


as.matrix.h2o

Converts H2O Data to an R Matrix

Description
Convert an H2OFrame object to a matrix, which allows subsequent data frame operations within
the R environment.
Usage
## S3 method for class H2OFrame
as.matrix(x, ...)
Arguments
x

An H2OFrame object

...

Additional arguments to be passed to or from other methods.

Value
Returns a matrix in the R environment.
Note
This call establishes the data set in the R environment and subsequent operations on the matrix take
place within R, not H2O. When data are large, users may experience significant slowdown.
See Also
as.matrix for the base R implementation.
Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
prostate.matrix <- as.matrix(prostate.hex)
summary(prostate.matrix)
head(prostate.matrix)

ASTNode-class

The ASTNode class.

Description
This class represents a node in the abstract syntax tree. An ASTNode has a root. The root has
children that either point to another ASTNode, or to a leaf node, which may be of type ASTNumeric
or ASTFrame.
Usage
## S4 method for signature ASTNode
show(object)
Slots
root Object of type Node
children Object of type list
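
To make the two slots concrete, here is a minimal S4 sketch of a node class with a root and a list of children. The class names SketchNode and SketchASTNode, the name slot, and the output format of show are all hypothetical stand-ins. The package's actual class definitions are not reproduced in this section.

```r
library(methods)

# Simplified stand-ins; these are NOT the h2o package's own classes.
setClass("SketchNode", representation(name = "character"))
setClass("SketchASTNode",
         representation(root = "SketchNode", children = "list"))

# A 'show' method, analogous to the S4 'show' method listed under Usage.
setMethod("show", "SketchASTNode", function(object) {
  cat("ASTNode root:", object@root@name,
      "with", length(object@children), "children\n")
})

# Build a tiny tree for (+ 1 2); the numeric leaves are stored as children.
plus <- new("SketchASTNode",
            root = new("SketchNode", name = "+"),
            children = list(1, 2))
plus
```

Children could equally be further SketchASTNode objects, which is how a nested expression would be represented.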

ClassesIntro

Class definitions and their 'show' & 'summary' methods.

Description
To conveniently and safely pass messages between R and H2O, this package relies on S4 objects
to capture and pass state. This R file contains all of the h2o package's classes as well as their
complementary 'show' methods. The end user will typically never have to reason with these objects
directly, as there are S3 accessor methods provided for creating new objects.

colnames<-,H2OFrame,H2OFrame-method
Returns Column Names for a Parsed H2O Data Object.

Description
Returns column names for an H2OFrame object.


Usage
## S4 replacement method for signature H2OFrame,H2OFrame
colnames(x) <- value
## S4 replacement method for signature H2OFrame,character
colnames(x) <- value
## S4 method for signature H2OFrame
names(x)
## S4 replacement method for signature H2OFrame
names(x) <- value
## S4 method for signature H2OFrame
colnames(x)
## S4 method for signature H2OFrame
names(x)
Arguments
x

An H2OFrame object.

See Also
colnames for the base R method.
Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package="h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
summary(iris.hex)
colnames(iris.hex)

Export intro

Data Export

Description
Export data to local disk or HDFS. Save models to local disk or HDFS.

h2o.anyFactor

Check H2OFrame columns for factors

Description
Determines if any column of an H2OFrame object contains categorical data.
Usage
h2o.anyFactor(x)
Arguments
x

An H2OFrame object.

Value
Returns a logical value indicating whether any of the columns in x are factors.
Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris_wheader.csv", package="h2o")
iris.hex <- h2o.importFile(localH2O, path = irisPath)
h2o.anyFactor(iris.hex)
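
For comparison, the same check can be written in base R for an ordinary data frame. The helper name any_factor is hypothetical, and the built-in iris data frame is used in place of an H2OFrame, so no H2O instance is needed.

```r
# Base R analogue on a plain data frame (not an H2OFrame):
# TRUE if at least one column is a factor.
any_factor <- function(df) any(vapply(df, is.factor, logical(1)))

with_species    <- any_factor(iris)          # Species is a factor column
without_species <- any_factor(iris[, 1:4])   # only numeric columns remain
print(with_species)
print(without_species)
```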

h2o.assign

Rename an H2O object.

Description
Makes a copy of the data frame and gives it the desired key.
Usage
h2o.assign(data, key)
Arguments
data

An H2OFrame object

key

The hex key to be associated with the H2O parsed data object

h2o.cbind

Combine H2O Datasets by Columns

Description
Takes a sequence of H2O data sets and combines them by column.
Usage
h2o.cbind(...)
Arguments
...

A sequence of H2OFrame arguments. All datasets must exist on the same H2O
instance (IP and port) and contain the same number of rows.

deparse.level

Integer controlling the construction of column names. (Currently unimplemented.)

Value
An H2OFrame object containing the combined ... arguments column-wise.
See Also
cbind for the base R method.
Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
prostate.cbind <- h2o.cbind(prostate.hex, prostate.hex)
head(prostate.cbind)
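
The equal-row-count requirement has a direct counterpart in base R. The sketch below uses base cbind on ordinary data frames and assumes that h2o.cbind enforces the same rule on H2OFrame objects; how H2O reports a mismatch is not shown here.

```r
# Base R 'cbind' on plain data frames (not H2OFrame objects).
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

combined <- cbind(df, df)   # same number of rows: columns are appended
print(dim(combined))

# Combining frames with different row counts is an error in base R.
mismatch <- tryCatch(cbind(df, df[1:2, ]),
                     error = function(e) "error")
print(mismatch)
```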

h2o.clusterInfo

Print H2O cluster info

Description
Print H2O cluster info
Usage
h2o.clusterInfo(conn = h2o.getConnection())

Arguments
conn

H2O connection object

h2o.clusterIsUp

Determine if an H2O cluster is up or not

Description
Determine if an H2O cluster is up or not
Usage
h2o.clusterIsUp(conn = h2o.getConnection())
Arguments
conn

H2O connection object

Value
TRUE if the cluster is up; FALSE otherwise

h2o.createFrame

Data Frame Creation in H2O

Description
Creates a data frame in H2O with real-valued, categorical, integer, and binary columns specified by
the user.
Usage
h2o.createFrame(conn = h2o.getConnection(), key = "", rows = 10000,
cols = 10, randomize = TRUE, value = 0, real_range = 100,
categorical_fraction = 0.2, factors = 100, integer_fraction = 0.2,
integer_range = 100, binary_fraction = 0.1, binary_ones_fraction = 0.02,
missing_fraction = 0.01, response_factors = 2, has_response = FALSE,
seed)


Arguments
conn
An H2OConnection object.
key
A string indicating the destination key. If empty, this will be auto-generated by
H2O.
rows
The number of rows of data to generate.
cols
The number of columns of data to generate. Excludes the response column if
has_response = TRUE.
randomize
A logical value indicating whether data values should be randomly generated.
This must be TRUE if either categorical_fraction or integer_fraction is
non-zero.
value
If randomize = FALSE, then all real-valued entries will be set to this value.
real_range
The range of randomly generated real values.
categorical_fraction
The fraction of total columns that are categorical.
factors
The number of (unique) factor levels in each categorical column.
integer_fraction
The fraction of total columns that are integer-valued.
integer_range
The range of randomly generated integer values.
binary_fraction
The fraction of total columns that are binary-valued.
binary_ones_fraction
The fraction of values in a binary column that are set to 1.
missing_fraction
The fraction of total entries in the data frame that are set to NA.
response_factors
If has_response = TRUE, then this is the number of factor levels in the response
column.
has_response
A logical value indicating whether an additional response column should be prepended to the final H2O data frame. If set to TRUE, the total number of columns
will be cols+1.
seed
A seed used to generate random values when randomize = TRUE.

Value
Returns an H2OFrame object.
Examples
library(h2o)
localH2O <- h2o.init()
hex <- h2o.createFrame(localH2O, rows = 1000, cols = 100, categorical_fraction = 0.1, factors = 5, integer_fraction
head(hex)
summary(hex)

hex2 <- h2o.createFrame(localH2O, rows = 100, cols = 10, randomize = FALSE, value = 5, categorical_fraction = 0, inte
summary(hex2)

h2o.crossValidate

Cross Validate an H2O Model

Description
Cross Validate an H2O Model
Usage
h2o.crossValidate(model, nfolds, model.type = c("gbm", "glm", "deeplearning"),
params, strategy = c("mod1", "random"), ...)

h2o.cut

Cut H2O Numeric Data to Factor

Description
Divides the range of the H2O data into intervals and codes the values according to which interval
they fall in. The leftmost interval corresponds to the level one, the next is level two, etc.
Usage
## S3 method for class H2OFrame
cut(x, breaks, labels = NULL, include.lowest = FALSE,
right = TRUE, dig.lab = 3, ...)
Arguments
x

An H2OFrame object with numeric columns.

breaks

A numeric vector of two or more unique cut points.

labels

Labels for the levels of the resulting category. By default, labels are constructed
using "(a,b]" interval notation.

include.lowest Logical, indicating if an 'x[i]' equal to the lowest (or highest, for right =
FALSE) 'breaks' value should be included.
right

Logical, indicating if the intervals should be closed on the right (and open
on the left) or vice versa.

dig.lab

Integer which is used when labels are not given; determines the number of digits
used in formatting the break numbers.

...

Further arguments passed to or from other methods.

Value
Returns an H2OFrame object containing the factored data with intervals as levels.
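
The interval coding can be seen with base R cut on a plain numeric vector. The arguments documented above share their names with those of base cut, but whether the H2O method matches base R exactly on an H2OFrame is an assumption. The sample values are made up for illustration.

```r
# Base R 'cut' on a plain numeric vector (not an H2OFrame).
x      <- c(4.3, 4.8, 5.0, 5.8, 6.5)
breaks <- c(4.2, 4.8, 5.8, 6, 8)

# Default right = TRUE: intervals are closed on the right, "(a,b]".
f <- cut(x, breaks)
print(levels(f))
print(as.integer(f))       # level index: leftmost interval is level 1

# right = FALSE: intervals are closed on the left, "[a,b)".
f_left <- cut(x, breaks, right = FALSE)
print(as.integer(f_left))  # values equal to a break move up one interval
```

The values 4.8 and 5.8 sit exactly on break points, so they change interval when right is switched.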


Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris_wheader.csv", package="h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")
summary(iris.hex)
# Cut sepal length column into intervals determined by min/max/quantiles
sepal_len.cut = cut.H2OFrame(iris.hex$sepal_len, c(4.2, 4.8, 5.8, 6, 8))
head(sepal_len.cut)
summary(sepal_len.cut)

h2o.ddply

Split H2O Dataset, Apply Function, and Return Results

Description
For each subset of an H2O data set, apply a user-specified function, then combine the results.
Usage
h2o.ddply(.data, .variables, .fun = NULL, ..., .progress = "none")
Arguments
.data

An H2OFrame object to be processed.

.variables

Variables to split .data by, either the indices or names of a set of columns.

.fun

Function to apply to each subset grouping.

.progress

Name of the progress bar to use. (Currently unimplemented.)

...

Additional arguments passed on to .fun. (Currently unimplemented.)

Value
Returns an H2OFrame object containing the results from the split/apply operation, arranged
See Also
ddply for the plyr library implementation.
Examples
library(h2o)
localH2O <- h2o.init()
# Import iris dataset to H2O
irisPath <- system.file("extdata", "iris_wheader.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")

# Add function taking mean of sepal_len column
fun = function(df) { sum(df[,1], na.rm = TRUE)/nrow(df) }
# Apply function to groups by class of flower
# uses h2o's ddply, since iris.hex is an H2OFrame object
res = h2o.ddply(iris.hex, "class", fun)
head(res)
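
The same split/apply/combine pattern can be tried locally in base R, without an H2O instance. This sketch uses the built-in iris data frame, whose first column is Sepal.Length and whose grouping column is Species; these names differ from the sepal_len and class columns used above. It is only an analogue of what h2o.ddply is described as doing, not its implementation.

```r
# Same per-group function as in the example above.
fun <- function(df) sum(df[, 1], na.rm = TRUE) / nrow(df)

groups <- split(iris, iris$Species)   # split: one data frame per species
res    <- sapply(groups, fun)         # apply 'fun' and combine the results
print(res)
```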

h2o.deeplearning

Build a Deep Learning Neural Network

Description
Builds a Deep Learning neural network on an H2OFrame.
Usage
h2o.deeplearning(x, y, training_frame, destination_key = "",
override_with_best_model, do_classification = TRUE, n_folds = 0,
validation_frame, ..., checkpoint, autoencoder = FALSE,
use_all_factor_levels = TRUE, activation = c("Rectifier", "Tanh",
"TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout"),
hidden = c(200, 200), epochs = 10, train_samples_per_iteration = -2,
seed, adaptive_rate = TRUE, rho = 0.99, epsilon = 1e-08, rate = 0.005,
rate_annealing = 1e-06, rate_decay = 1, momentum_start = 0,
momentum_ramp = 1e+06, momentum_stable = 0,
nesterov_accelerated_gradient = TRUE, input_dropout_ratio = 0,
hidden_dropout_ratios, l1 = 0, l2 = 0, max_w2 = Inf,
initial_weight_distribution = c("UniformAdaptive", "Uniform", "Normal"),
initial_weight_scale = 1, loss, score_interval = 5,
score_training_samples, score_validation_samples, score_duty_cycle,
classification_stop, regression_stop, quiet_mode, max_confusion_matrix_size,
max_hit_ratio_k, balance_classes = FALSE, max_after_balance_size,
score_validation_sampling, diagnostics, variable_importances, fast_mode,
ignore_const_cols, force_load_balance, replicate_training_data,
single_node_mode, shuffle_training_data, sparse, col_major)
Arguments
x

A vector containing the character names of the predictors in the model.

y
The name of the response variable in the model.
override_with_best_model
Logical. If TRUE, override the final model with the best model found during
training. Defaults to TRUE.
checkpoint

Model checkpoint (either key or H2ODeepLearningModel) to resume training
with.

autoencoder

Enable auto-encoder for model building.

use_all_factor_levels
Logical. Use all factor levels of categorical variables. Otherwise the first factor
level is omitted (without loss of accuracy). Useful for variable importances
and auto-enabled for autoencoder.
activation

A string indicating the activation function to use. Must be either "Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout", "Maxout", or "MaxoutWithDropout".

hidden

Hidden layer sizes (e.g. c(100,100))

epochs
How many times the dataset should be iterated (streamed); can be fractional
train_samples_per_iteration
Number of training samples (globally) per MapReduce iteration. Special values
are: 0 for one epoch; -1 for all available data (e.g., replicated training data); or -2 for autotuning (default)
seed

Seed for random numbers (affects sampling) - Note: only reproducible when
running single threaded

adaptive_rate

Logical. Adaptive learning rate (ADADELTA)

rho

Adaptive learning rate time decay factor (similarity to prior updates)

rate

Learning rate (higher => less stable, lower => slower convergence)

rate_annealing Learning rate annealing: rate / (1 + rate_annealing * samples)
rate_decay

Learning rate decay factor between layers (N-th layer: rate * α^(N-1), where α is rate_decay)

momentum_start Initial momentum at the beginning of training (try 0.5)
momentum_ramp Number of training samples for which momentum increases
momentum_stable
Final momentum after the ramp is over (try 0.99)
l1

L1 regularization (can add stability and improve generalization, causes many
weights to become 0)

l2

L2 regularization (can add stability and improve generalization, causes many
weights to be small)

max_w2
Constraint for squared sum of incoming weights per unit (e.g. Rectifier)
initial_weight_distribution
Can be "Uniform", "UniformAdaptive", or "Normal"
initial_weight_scale
Uniform: -value ... value, Normal: stddev
loss

Loss function. Can be "Automatic", "MeanSquare", or "CrossEntropy"

score_interval Shortest time interval (in secs) between model scoring
score_training_samples
Number of training set samples for scoring (0 for all)
score_validation_samples
Number of validation set samples for scoring (0 for all)
score_duty_cycle
Maximum duty cycle fraction for scoring (lower: more training, higher: more
scoring)


classification_stop
Stopping criterion for classification error fraction on training data (-1 to disable)
regression_stop
Stopping criterion for regression error (MSE) on training data (-1 to disable)
quiet_mode
Enable quiet mode for less output to standard output
max_confusion_matrix_size
Max. size (number of classes) for confusion matrices to be shown
max_hit_ratio_k
Max number (top K) of predictions to use for hit ratio computation (for multiclass only, 0 to disable)
balance_classes
Balance training data class counts via over/under-sampling (for imbalanced data)
max_after_balance_size
Maximum relative size of the training data after balancing class counts (can be
less than 1.0)
score_validation_sampling
Method used to sample validation dataset for scoring
diagnostics
Enable diagnostics for hidden layers
variable_importances
Compute variable importances for input features (Gedeon method); can be slow
for large networks
fast_mode
Enable fast mode (minor approximations in back-propagation)
ignore_const_cols
Ignore constant training columns (no information can be gained anyway)
force_load_balance
Force extra load balancing to increase training speed for small datasets (to keep
all cores busy)
replicate_training_data
Replicate the entire training dataset onto every node for faster training
single_node_mode
Run on a single node for fine-tuning of model parameters
shuffle_training_data
Enable shuffling of training data (recommended if training data is replicated and
train_samples_per_iteration is close to numRows * numNodes)
sparse

Sparse data handling (Experimental)

col_major

Use a column major weight matrix for input layer. Can speed up forward propagation, but might slow down backpropagation (Experimental)

data

An H2OFrame object containing the variables in the model.

key

(Optional) The unique character hex key assigned to the resulting model. If
none is given, a key will automatically be generated.

classification Logical. Indicates whether the algorithm should conduct classification.
nfolds

(Optional) Number of folds for cross-validation. If nfolds >= 2, then validation
must remain empty.

validation

(Optional) An H2OFrame object indicating the validation dataset used to construct the confusion matrix. If left blank, this defaults to the training data when
nfolds = 0
nesterov_accelerated_gradient
Logical. Use Nesterov accelerated gradient (recommended)
input_dropout_ratios
Input layer dropout ratio (can improve generalization); specify one value per
hidden layer, defaults to 0.5

See Also
predict.H2ODeepLearningModel for prediction.
Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
indep <- names(iris.hex)[1:4]
dep <- names(iris.hex)[5]
iris.dl <- h2o.deeplearning(x = indep, y = dep, training_frame = iris.hex, activation = "Tanh", epochs = 5)
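
The two learning-rate formulas given in the rate_annealing and rate_decay descriptions can be evaluated directly in plain R. The function names annealed_rate and layer_rate are hypothetical helpers, and the decay factor 0.5 is chosen only for illustration (the documented default is 1). This sketches the arithmetic only and does not call H2O.

```r
# rate_annealing: rate / (1 + rate_annealing * samples)
annealed_rate <- function(rate, rate_annealing, samples) {
  rate / (1 + rate_annealing * samples)
}

# rate_decay: learning rate of the N-th layer is rate * rate_decay^(N - 1)
layer_rate <- function(rate, rate_decay, layer) {
  rate * rate_decay^(layer - 1)
}

# With the documented defaults rate = 0.005 and rate_annealing = 1e-06,
# the rate is roughly halved after one million training samples.
r_start <- annealed_rate(0.005, 1e-06, 0)
r_later <- annealed_rate(0.005, 1e-06, 1e6)

# Illustrative decay factor of 0.5 across three layers.
per_layer <- layer_rate(0.005, 0.5, 1:3)

print(c(r_start, r_later))
print(per_layer)
```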

h2o.dim

Returns the Dimensions of a Parsed H2O Data Object.

Description
Returns the number of rows and columns for an H2OFrame object.
Usage
## S4 method for signature H2OFrame
dim(x)
Arguments
x

An H2OFrame object.

See Also
dim for the base R method.
Examples
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package="h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
dim(iris.hex)

h2o.downloadAllLogs

Download H2O Log Files to Disk

Description
h2o.downloadAllLogs downloads all H2O log files to local disk. Generally used for debugging
purposes.
Usage
h2o.downloadAllLogs(conn = h2o.getConnection(), dirname = ".",
filename = NULL)
Arguments
conn

An H2OConnection object pointing to a running H2O cluster.

dirname

(Optional) A character string indicating the directory that the log file should be
saved in.

filename

(Optional) A character string indicating the name that the log file should be
saved to.

See Also
H2OConnection

h2o.downloadCSV

Download H2O Data to Disk

Description
Download an H2O data set to a CSV file on the local disk
Usage
h2o.downloadCSV(data, filename)
Arguments
data

An H2OFrame object to be downloaded.

filename

A string indicating the name that the CSV file should be saved to.

Warning
Files located on the H2O server may be very large! Make sure you have enough hard drive space to
accommodate the entire file.


Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris_wheader.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
myFile <- paste(getwd(), "my_iris_file.csv", sep = .Platform$file.sep)
h2o.downloadCSV(iris.hex, myFile)
file.info(myFile)
file.remove(myFile)

h2o.exportFile

Export an H2O Data Frame to a File

Description
Exports an H2OFrame (which can be either VA or FV) to a file. This file may be on the H2O
instance's local filesystem, on HDFS (preface the path with hdfs://), or on S3N (preface the path
with s3n://).
Usage
h2o.exportFile(data, path, force = FALSE)
Arguments
data

An H2OFrame data frame.

path

The path to write the file to. Must include the directory and filename. May be
prefaced with hdfs:// or s3n://. Each row of data appears as a line of the file.

force

Logical; indicates how to deal with files that already exist.

Details
If the file already exists, force = TRUE will overwrite it. Otherwise, the operation will fail.
Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
h2o.exportFile(iris.hex, path = "/path/on/h2o/server/filesystem/iris.csv")
h2o.exportFile(iris.hex, path = "hdfs://path/in/hdfs/iris.csv")
h2o.exportFile(iris.hex, path = "s3n://path/in/s3/iris.csv")
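
The force behaviour described under Details can be mimicked locally with a small wrapper around write.csv. The function export_csv is a hypothetical helper written for this illustration. It writes a plain R data frame to a temporary file and is not how h2o.exportFile is implemented.

```r
# Hypothetical local helper illustrating the documented 'force' semantics.
export_csv <- function(df, path, force = FALSE) {
  if (file.exists(path) && !force) {
    stop("File already exists; use force = TRUE to overwrite: ", path)
  }
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

target <- tempfile(fileext = ".csv")

export_csv(iris, target)                        # first write succeeds
second <- tryCatch(export_csv(iris, target),    # file exists, force = FALSE
                   error = function(e) "refused")
export_csv(head(iris), target, force = TRUE)    # overwrites with 6 rows
rows_after <- nrow(read.csv(target))

print(second)
print(rows_after)
```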

h2o.exportHDFS

Export a Model to HDFS

Description
Exports an H2OModel to HDFS.
Usage
h2o.exportHDFS(object, path)
Arguments
object

an H2OModel class object.

path

The path to write the model to. Must include the directory and filename.

h2o.gbm

Gradient Boosted Machines

Description
Builds gradient boosted classification trees, and gradient boosted regression trees on a parsed data
set.
Usage
h2o.gbm(x, y, training_frame, do_classification, ..., destination_key,
loss = c("AUTO", "bernoulli", "multinomial", "gaussian"), ntrees = 50,
max_depth = 5, min_rows = 10, learn_rate = 0.1, nbins = 20,
group_split = TRUE, variable_importance = FALSE,
validation_frame = FALSE, balance_classes = FALSE,
max_after_balance_size = 1, seed)
Arguments
x

A vector containing the names or indices of the predictor variables to use in
building the GBM model.

y

The name or index of the response variable. If the data does not contain a header,
this is the column index number starting at 0, and increasing from left to right.
(The response must be either an integer or a categorical variable).

training_frame An H2OFrame object containing the variables in the model.
loss

A character string. The loss function to be implemented. Must be one of "AUTO", "bernoulli", "multinomial", or "gaussian". Defaults to "AUTO".

ntrees

A nonnegative integer that determines the number of trees to grow. Defaults to
50.

max_depth

Maximum depth to grow the tree. Defaults to 5.

min_rows

Minimum number of rows to assign to terminal nodes. Defaults to 10.

learn_rate

A number from 0.0 to 1.0. Defaults to 0.1.

nbins

Number of bins to use in building histogram. Defaults to 20.

group_split
#TODO NEED TO FINISH
variable_importance
#TODO: NEED TO FINISH
validation_frame
An H2OFrame object indicating the validation dataset used to construct the confusion matrix. If left blank, this defaults to the training data when nfolds = 0.
balance_classes
logical, indicates whether or not to balance training data class counts via
over/under-sampling (for imbalanced data). Defaults to FALSE.
max_after_balance_size
Maximum relative size of the training data after balancing class counts (can
be less than 1.0). Defaults to 1.
seed

Seed for random numbers (affects sampling) - Note: only reproducible when
running single threaded

key

(Optional) The unique hex key assigned to the resulting model. If none is given,
a key will automatically be generated.

nfolds

(Optional) Number of folds for cross-validation. If nfolds >= 2, then validation
must remain empty.

See Also
predict.H2OGBMModel for prediction.
Examples
#TODO GBM wasnt working example needs to be redone, maybe
library(h2o)
localH2O = h2o.init()
# Run regression GBM on australia.hex data
ausPath <- system.file("extdata", "australia.csv", package="h2o")
australia.hex <- h2o.uploadFile(localH2O, path = ausPath)
independent <- c("premax", "salmax","minairtemp", "maxairtemp", "maxsst",
"maxsoilmoist", "Max_czcs")
dependent <- "runoffnew"
h2o.gbm(y = dependent, x = independent, training_frame = australia.hex, ntrees = 3,
max_depth = 3, min_rows = 2)

h2o.getFrame

Get an R Reference to an H2O Dataset

Description
Get the reference to a frame with the given key in the H2O instance.
Usage
h2o.getFrame(key, conn = h2o.getConnection(), linkToGC = FALSE)
Arguments
key

A string indicating the unique hex key of the dataset to retrieve.

conn

H2OConnection object containing the IP address and port of the server running
H2O.

linkToGC

a logical value indicating whether to remove the underlying key from the H2O
cluster when the R proxy object is garbage collected.

h2o.getModel

Get an R reference to an H2O model

Description
Returns a reference to an existing model in the H2O instance.
Usage
h2o.getModel(key, conn = h2o.getConnection(), linkToGC = FALSE)
Arguments
key

A string indicating the unique hex key of the model to retrieve.

conn

H2OConnection object containing the IP address and port of the server running
H2O.

linkToGC

a logical value indicating whether to remove the underlying key from the H2O
cluster when the R proxy object is garbage collected.

Value
Returns an object that is a subclass of H2OModel.

Examples
library(h2o)
localH2O <- h2o.init()
iris.hex <- as.h2o(iris, localH2O, "iris.hex")
key <- h2o.gbm(x = 1:4, y = 5, training_frame = iris.hex)@key
model.retrieved <- h2o.getModel(key, localH2O)

h2o.glm

H2O Generalized Linear Models

Description
Fit a generalized linear model, specified by a response variable, a set of predictors, and a description
of the error distribution.
Usage
h2o.glm(x, y, training_frame, destination_key, validation_frame, ...,
score_each_iteration = FALSE, do_classification = FALSE,
balance_classes = FALSE, class_sampling_factors,
max_after_balance_size = 5, solver = c("ADMM", "L_BFGS"),
standardize = TRUE, family = c("gaussian", "binomial", "poisson", "gamma",
"tweedie"), link = c("family_default", "identity", "logit", "log",
"inverse", "tweedie"), tweedie_variance_power = NaN,
tweedie_link_power = NaN, alpha = 0.5, prior1 = 0, lambda = 1e-05,
lambda_search = FALSE, nlambdas = -1, lambda_min_ratio = 1,
higher_accuracy = FALSE, use_all_factor_levels = FALSE, n_folds = 0)
Arguments
x
y
training_frame
destination_key
validation_frame
...
score_each_iteration
do_classification
balance_classes
class_sampling_factors
max_after_balance_size
solver
standardize
family
link
tweedie_variance_power
tweedie_link_power
alpha
prior1
lambda
lambda_search
nlambdas
lambda_min_ratio
higher_accuracy
use_all_factor_levels
n_folds

h2o.head

Return the Head or Tail of an H2O Dataset.

Description
Returns the first or last rows of an H2O parsed data object.
Usage
## S4 method for signature H2OFrame
head(x, n = 6L, ...)
## S4 method for signature H2OFrame
tail(x, n = 6L, ...)
Arguments
x

An H2OFrame object.

n

(Optional) A single integer. If positive, the number of rows of x to return. If negative, all but the last (head) or first (tail) |n| rows of x.

...

Further arguments passed to or from other methods.

Value
A data frame containing the first or last n rows of an H2OFrame object.
Examples
library(h2o)
localH2O <- h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
ausPath <- system.file("extdata", "australia.csv", package="h2o")
australia.hex <- h2o.uploadFile(localH2O, path = ausPath)
head(australia.hex, 10)
tail(australia.hex, 10)
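
The meaning of positive and negative n is easiest to see on a small table. The sketch below uses a base R data.frame rather than an H2OFrame, on the assumption that the H2OFrame methods follow the same convention as the base R head and tail they overload.

```r
# Base R analogue: a ten-row data frame with a row identifier.
df <- data.frame(id = 1:10, value = (1:10)^2)

first_three   <- head(df, 3)    # the first three rows
all_but_last  <- head(df, -3)   # everything except the last three rows
last_three    <- tail(df, 3)    # the last three rows
all_but_first <- tail(df, -3)   # everything except the first three rows
```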

h2o.importFile

Import A File

Description
Import a single file. If the given path is relative, then it will be relative to the start location of the
H2O instance. The default behavior is to pass-through to the parse phase automatically.
Usage
h2o.importFile(path, conn = h2o.getConnection(), key = "", parse = TRUE,
header, sep = "", col.names)

h2o.importFolder

Data Import

Description
Importing data is a _lazy_ parse of the data. It adds an extra step so that a user may specify a variety
of options including a header file, separator type, and in the future column type. Additionally,
the import phase provides feedback on whether or not a folder or group of files may be imported
together.
Usage
h2o.importFolder(path, conn = h2o.getConnection(), pattern = "", key = "",
parse = TRUE, header, sep = "", col.names)
Details
Import a Folder of Files
Import an entire directory of files. If the given path is relative, then it will be relative to the start
location of the H2O instance. The default behavior is to pass-through to the parse phase automatically.

h2o.importHDFS

Import HDFS

Description
Import from an HDFS location.
Usage
h2o.importHDFS(path, conn = h2o.getConnection(), pattern = "", key = "",
parse = TRUE, header, sep = "", col.names)

h2o.importURL

Import A URL

Description
Import a data source from a URL.
Usage
h2o.importURL(path, conn = h2o.getConnection(), key = "", parse = TRUE,
header, sep = "", col.names)

h2o.init

Initialize and Connect to H2O

Description
Attempts to start and/or connect to an H2O instance.
Usage
h2o.init(ip = "127.0.0.1", port = 54321, startH2O = TRUE,
forceDL = FALSE, Xmx, beta = FALSE, assertion = TRUE, license = NULL,
nthreads = -2, max_mem_size = NULL, min_mem_size = NULL,
ice_root = tempdir(), strict_version_check = FALSE)

Arguments
ip
Object of class character representing the IP address of the server where H2O
is running.
port
Object of class numeric representing the port number of the H2O server.
startH2O
(Optional) A logical value indicating whether to try to start H2O from R if no
connection with H2O is detected. This is only possible if ip = "localhost" or
ip = "127.0.0.1". If an existing connection is detected, R does not start H2O.
forceDL
(Optional) A logical value indicating whether to force download of the H2O
executable. Defaults to FALSE, so the executable will only be downloaded if it
does not already exist in the h2o R library resources directory h2o/java/h2o.jar.
This value is only used when R starts H2O.
Xmx
(Optional) (DEPRECATED) A character string specifying the maximum size,
in bytes, of the memory allocation pool to H2O. This value must be a multiple of
1024 greater than 2MB. Append the letter m or M to indicate megabytes, or g or
G to indicate gigabytes. This value is only used when R starts H2O.
beta
(Optional) A logical value indicating whether H2O should launch in beta
mode. This value is only used when R starts H2O.
assertion
(Optional) A logical value indicating whether H2O should be launched with
assertions enabled. Used mainly for error checking and debugging purposes.
This value is only used when R starts H2O.
license
(Optional) A character string value specifying the full path of the license file.
This value is only used when R starts H2O.
nthreads
(Optional) Number of threads in the thread pool. This relates very closely to the
number of CPUs used. -2 means use the CRAN default of 2 CPUs. -1 means use
all CPUs on the host. A positive integer specifies the number of CPUs directly.
This value is only used when R starts H2O.
max_mem_size
(Optional) A character string specifying the maximum size, in bytes, of the
memory allocation pool to H2O. This value must be a multiple of 1024 greater
than 2MB. Append the letter m or M to indicate megabytes, or g or G to indicate
gigabytes. This value is only used when R starts H2O.
min_mem_size
(Optional) A character string specifying the minimum size, in bytes, of the
memory allocation pool to H2O. This value must be a multiple of 1024 greater
than 2MB. Append the letter m or M to indicate megabytes, or g or G to indicate
gigabytes. This value is only used when R starts H2O.
strict_version_check
(Optional) Setting this to FALSE is unsupported and should only be done when
advised by technical support.

Details
By default, this method first checks if an H2O instance is connectible. If it cannot connect and
startH2O = TRUE with ip = "localhost", it will attempt to start an instance of H2O at localhost:54321. Otherwise it stops with an error.

When initializing H2O locally, this method searches for h2o.jar in the R library resources (system.file("java", "h2o.jar"
and if the file does not exist, it will automatically attempt to download the correct version from
Amazon S3. The user must have Internet access for this process to be successful.

Once connected, the method checks to see if the local H2O R package version matches the version
of H2O running on the server. If there is a mismatch and the user indicates she wishes to upgrade,
it will remove the local H2O R package and download/install the H2O R package from the server.
Value
Returns an H2OConnection object containing the IP address and port number of the H2O server.
Note
Users may wish to manually upgrade their package (rather than waiting until being prompted),
which requires that they fully uninstall and reinstall the H2O package, and the H2O client package.
You must unload packages running in the environment before upgrading. It’s recommended that
users restart R or RStudio after upgrading.
See Also
H2O R package documentation for more details, or type h2o in the R console. h2o.shutdown for
shutting down from R.
Examples
# Try to connect to a local H2O instance that is already running.
# If not found, start a local H2O instance from R with the default settings.
localH2O = h2o.init()
# Try to connect to a local H2O instance.
# If not found, raise an error.
localH2O = h2o.init(startH2O = FALSE)
# Try to connect to a local H2O instance that is already running.
# If not found, start a local H2O instance from R with 5 gigabytes of memory.
localH2O = h2o.init(max_mem_size = "5g")
# Try to connect to a local H2O instance that is already running.
# If not found, start a local H2O instance from R that uses 5 gigabytes of memory.
localH2O = h2o.init(max_mem_size = "5g")
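
The format expected for max_mem_size and min_mem_size (a byte count, optionally suffixed with m, M, g, or G, that is a multiple of 1024 and greater than 2MB) can be checked before calling h2o.init. The helper below is a hypothetical sketch written for this illustration, not a function of the h2o package, and it only encodes the rule as stated in the argument descriptions.

```r
# Hypothetical helper (not part of h2o): converts a memory-size string such as
# "5g" or "512M" to bytes and enforces the documented constraints.
parse_mem_size <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  m <- regmatches(x, regexec("^([0-9]+)([mMgG]?)$", x))[[1]]
  if (length(m) == 0L) stop("Unrecognized memory size: ", x)
  unit <- tolower(m[3])
  multiplier <- if (unit == "m") 1024^2 else if (unit == "g") 1024^3 else 1
  bytes <- as.numeric(m[2]) * multiplier
  if (bytes %% 1024 != 0) stop("Size must be a multiple of 1024: ", x)
  if (bytes <= 2 * 1024^2) stop("Size must be greater than 2MB: ", x)
  bytes
}

five_gb   <- parse_mem_size("5g")
half_gb   <- parse_mem_size("512M")
too_small <- try(parse_mem_size("1m"), silent = TRUE)  # rejected: not above 2MB
```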

h2o.kmeans

KMeans Model in H2O

Description
Performs k-means clustering on an H2O dataset.

Usage
h2o.kmeans(training_frame, x, k, destination_key, max_iterations = 1000,
standardize = TRUE, init = c("Furthest", "Random", "PlusPlus"), seed)
Arguments
training_frame An H2OFrame object containing the variables in the model.
x

(Optional) A vector containing the data columns on which k-means operates.

k

The number of clusters. Must be between 1 and 1e7 inclusive. k may be omitted
if the user specifies the initial centers in the init parameter. If k is not omitted,
in this case, then it should be equal to the number of user-specified centers.

destination_key
(Optional) The unique hex key assigned to the resulting model. Automatically
generated if none is provided.
max_iterations The maximum number of iterations allowed. Must be between 0
standardize

Logical, indicates whether the data should be standardized before running kmeans.

init

A character string that selects the initial set of k cluster centers. Possible values
are "Random": for random initialization, "PlusPlus": for k-means++ initialization, or "Furthest": for initialization at the furthest point from each successive center. Additionally, the user may specify the initial centers as a matrix, data.frame, H2OFrame, or list of vectors. For matrices, data.frames, and
H2OFrames, each row of the respective structure is an initial center. For lists of
vectors, each vector is an initial center.

seed

(Optional) Random seed used to initialize the cluster centroids.

Value
Returns an object of class H2OKMeansModel.
Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
h2o.kmeans(training_frame = prostate.hex, k = 10, x = c("AGE", "RACE", "VOL", "GLEASON"))
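
The init argument also accepts user-specified centers, with one center per row. The sketch below shows that convention with the base R stats::kmeans function on synthetic data; it is an analogue for illustration and does not call h2o.kmeans, whose internals may differ.

```r
set.seed(42)

# Two well-separated synthetic groups of 20 points each, in two dimensions.
points <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# Each row is one initial center, so the number of rows plays the role of k.
initial_centers <- matrix(c(0, 0,
                            5, 5), ncol = 2, byrow = TRUE)

fit <- kmeans(points, centers = initial_centers, iter.max = 1000)
fit$centers
```

Because the centers are supplied as a matrix, the number of clusters is taken from its row count, which matches the requirement that k equal the number of user-specified centers.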

h2o.length

Returns the Length of a Parsed H2O Data Object.

Description
Returns the length of an H2OFrame.

Usage
## S4 method for signature H2OFrame
length(x)
Arguments
x

An H2OFrame object.

See Also
length for the base R method.
Examples
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
length(iris.hex)

h2o.loadModel

Load H2O Model from HDFS or Local Disk

Description
Load a saved H2O model from disk.
Usage
h2o.loadModel(path, conn = h2o.getConnection())

h2o.logAndEcho

Log a message on the server-side logs

Description
This is helpful when running several pieces of work one after the other on a single H2O cluster and
you want to make a notation in the H2O server side log where one piece of work ends and the next
piece of work begins.
Usage
h2o.logAndEcho(message, conn = h2o.getConnection())
Arguments
message

A character string with the message to write to the log.

conn

An H2OConnection object pointing to a running H2O cluster.

Details
h2o.logAndEcho sends a message to H2O for logging. Generally used for debugging purposes.

See Also
H2OConnection

h2o.ls

List Keys on an H2O Cluster

Description
Accesses a list of object keys in the running instance of H2O.

Usage
h2o.ls(conn = h2o.getConnection())

Arguments
conn

An H2OConnection object containing the IP address and port number of the
H2O server.

Value
Returns a list of hex keys in the current H2O instance.

Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
h2o.ls(localH2O)

h2o.mean

Mean of a column

Description
Obtain the mean of a column of a parsed H2O data object.
Usage
## S4 method for signature H2OFrame
mean(x, trim = 0, na.rm = FALSE, ...)
Arguments
x

An H2OFrame object.

trim

The fraction (0 to 0.5) of observations to trim from each end of x before the
mean is computed.

na.rm

A logical value indicating whether NA or missing values should be stripped before the computation.

...

Further arguments to be passed from or to other methods.

See Also
mean for the base R implementation.
Examples
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
mean(prostate.hex$AGE)
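
The effect of trim and na.rm can be seen on a small vector. The sketch below uses the base R mean on made-up values, on the assumption that the H2OFrame method follows the same definitions as the base function it overloads.

```r
# Illustrative values only; the last entry is missing.
ages <- c(40, 52, 61, 65, 68, 70, 72, 75, 79, NA)

with_missing <- mean(ages)                            # NA, missing value kept
plain        <- mean(ages, na.rm = TRUE)              # mean of the 9 known values
trimmed      <- mean(ages, trim = 0.2, na.rm = TRUE)  # drops 1 value from each end
```

With nine known values and trim = 0.2, floor(9 * 0.2) = 1 observation is removed from each end before averaging.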

h2o.nrow

The Number of Rows/Columns of an H2O Dataset

Description
Returns a count of the number of rows or columns in an H2OFrame object.
Usage
## S4 method for signature H2OFrame
nrow(x)
## S4 method for signature H2OFrame
ncol(x)

Arguments
x

An H2OFrame object.

See Also
dim for all the dimensions. nrow for the default R method.
Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris.csv", package="h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath)
nrow(iris.hex)
ncol(iris.hex)

h2o.parseRaw

H2O Data Parsing

Description
The second phase in the data ingestion step.
Usage
h2o.parseRaw(data, key = "", header, sep = "", col.names)
Details
Parse the Raw Data produced by the import phase.

h2o.performance

Model Performance Metrics in H2O

Description
Given a trained h2o model, compute its performance on the given dataset.
Usage
h2o.performance(model, data = NULL)

Arguments
model

An H2OModel object

data

An H2OFrame. The model will make predictions on this dataset, and subsequently score them. The dataset should match the dataset that was used to train
the model, in terms of column names, types, and dimensions.

Value
Returns an object of the H2OModelMetrics subclass.
Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
prostate.hex$CAPSULE <- as.factor(prostate.hex$CAPSULE)
prostate.gbm <- h2o.gbm(3:9, "CAPSULE", prostate.hex)
h2o.performance(model = prostate.gbm, data=prostate.hex)

h2o.rbind

Combine H2O Datasets by Rows

Description
Takes a sequence of H2O data sets and combines them by rows.
Usage
h2o.rbind(...)
Arguments
...

A sequence of H2OFrame arguments. All datasets must exist on the same H2O
instance (IP and port) and contain the same number of columns.

deparse.level

Integer controlling the construction of column names. ##Currently unimplemented.##

Value
An H2OFrame object containing the combined ... arguments row-wise.
See Also
rbind for the base R method.

Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
prostate.rbind <- h2o.rbind(prostate.hex, prostate.hex)
head(prostate.rbind)

h2o.removeAll

Remove All Keys on the H2O Cluster

Description
Removes the data from the h2o cluster, but does not remove the local references.

Usage
h2o.removeAll(conn = h2o.getConnection())

Arguments
conn

An H2OConnection object containing the IP address and port number of the
H2O server.

See Also
h2o.rm

Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package = "h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
h2o.ls(localH2O)
h2o.removeAll(localH2O)
h2o.ls(localH2O)

h2o.rm

Delete Objects In H2O

Description
Remove the h2o Big Data object(s) having the key name(s) from keys.
Usage
h2o.rm(keys, conn = h2o.getConnection())
Arguments
keys

The hex key associated with the object to be removed.

conn

An H2OConnection object containing the IP address and port number of the
H2O server.

See Also
h2o.assign, h2o.ls

h2o.saveModel

Save an H2O Model Object to Disk

Description
Save an H2OModel to disk.
Usage
h2o.saveModel(object, dir = "", name = "", filename = "", force = FALSE)
Arguments
object

an H2OModel object.

dir

string indicating the directory the model will be written to.

name

string name of the file.

force

logical, indicates how to deal with files that already exist.

Details
If the file already exists, force = TRUE will overwrite it. Otherwise, the operation will fail.

See Also
h2o.loadModel for loading a model to H2O from disk.
Examples
## Not run:
library(h2o)
localH2O <- h2o.init()
prostate.hex <- h2o.uploadFile(localH2O, path = paste("https://raw.github.com",
"0xdata/h2o/master/smalldata/logreg/prostate.csv", sep = "/"), key = "prostate.hex")
prostate.glm <- h2o.glm(y = "CAPSULE", x = c("AGE","RACE","PSA","DCAPS"),
training_frame = prostate.hex, family = "binomial", n_folds = 10, alpha = 0.5)
h2o.saveModel(object = prostate.glm, dir = "/Users/UserName/Desktop", force = TRUE)
## End(Not run)

h2o.scale

Scaling and Centering of an H2O Key

Description
Centers and/or scales the columns of an H2O dataset.
Usage
## S3 method for class H2OFrame
scale(x, center = TRUE, scale = TRUE)
Arguments
x

An H2OFrame object.

center

either a logical value or numeric vector of length equal to the number of
columns of x.

scale

either a logical value or numeric vector of length equal to the number of
columns of x.

Examples
library(h2o)
localH2O <- h2o.init()
irisPath <- system.file("extdata", "iris_wheader.csv", package="h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")
summary(iris.hex)
# Scale and center all the numeric columns in iris data set
h2o.scale(iris.hex[, 1:4])
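
The center and scale arguments accept either a logical value or one number per column. The sketch below shows both forms with the base R scale function on the built-in iris data; it is an analogue that assumes the H2OFrame method mirrors the base behaviour, and the custom centers are arbitrary illustrative values.

```r
m <- as.matrix(iris[, 1:4])

# Logical arguments: subtract column means, then divide by column standard deviations.
scaled <- scale(m, center = TRUE, scale = TRUE)

# Centering only.
centered_only <- scale(m, center = TRUE, scale = FALSE)

# Numeric vector: one user-chosen center per column (arbitrary values).
custom <- scale(m, center = c(5, 3, 4, 1), scale = FALSE)

round(colMeans(scaled), 10)
apply(scaled, 2, sd)
```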

h2o.sd

Standard Deviation of a column of data.

Description
Obtain the standard deviation of a column of data.
Usage
## S4 method for signature H2OFrame
sd(x, na.rm = FALSE)
Arguments
x

An H2OFrame object.

na.rm

logical. Should missing values be removed?

See Also
h2o.var for variance, and sd for the base R implementation.
Examples
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
sd(prostate.hex$AGE)

h2o.shutdown

Shut Down H2O Instance

Description
Shut down the specified instance. All data will be lost.
Usage
h2o.shutdown(conn = h2o.getConnection(), prompt = TRUE)
Arguments
conn

An H2OConnection object containing the IP address and port of the server running H2O.

prompt

A logical value indicating whether to prompt the user before shutting down
the H2O server.

Details
This method checks if H2O is running at the specified IP address and port, and if it is, shuts down
that H2O instance.
WARNING
All data, models, and other values stored on the server will be lost! Only call this function if you
and all other clients connected to the H2O server are finished and have saved your work.
Note
Users must call h2o.shutdown explicitly in order to shut down the local H2O instance started by R.
If R is closed before H2O, then an attempt will be made to automatically shut down H2O. This only
applies to local instances started with h2o.init, not remote H2O servers.
See Also
h2o.init
Examples
# Don't run automatically, to prevent accidentally shutting down a cloud
## Not run:
library(h2o)
localH2O = h2o.init()
h2o.shutdown(localH2O)
## End(Not run)

h2o.synonym

Find Synonyms Using an H2OW2V object

Description
Find Synonyms Using an H2OW2V object
Usage
h2o.synonym(word2vec, target, count)
Arguments
word2vec:

An H2OW2V model.

target:

A single word, or a vector of words.

count:

The top ‘count‘ synonyms will be returned.

h2o.table

Cross Tabulation and Table Creation in H2O

Description
Uses the cross-classifying factors to build a table of counts at each combination of factor levels.
Usage
h2o.table(x, y = NULL)
Arguments
x

An H2OFrame object with at most two integer or factor columns.

y

An H2OFrame similar to x, or NULL.

Value
Returns a tabulated H2OFrame object.
Examples
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath, key = "prostate.hex")
summary(prostate.hex)
# Counts of the ages of all patients
head(h2o.table(prostate.hex[,3]))
h2o.table(prostate.hex[,3])
# Two-way table of ages (rows) and race (cols) of all patients
head(h2o.table(prostate.hex[,c(3,4)]))
h2o.table(prostate.hex[,c(3,4)])
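
A one-way and a two-way count table can be previewed locally with the base R table function. The sketch below uses a small made-up data frame (the column names and values are assumptions for illustration, not the prostate data) and does not call h2o.table.

```r
# Made-up values for illustration.
patients <- data.frame(
  age  = c(65, 72, 65, 70, 72, 65),
  race = c(1, 1, 2, 1, 2, 1)
)

# One factor: number of patients at each age.
age_counts <- table(patients$age)

# Two factors: ages in rows, race in columns.
age_by_race <- table(patients$age, patients$race)
age_by_race
```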

h2o.uploadFile

Upload Data

Description
Upload local files to the H2O instance.
Usage
h2o.uploadFile(path, conn = h2o.getConnection(), key = "", parse = TRUE,
header, sep = "", col.names)

h2o.var

Variance of a column.

Description
Obtain the variance of a column of a parsed H2O data object.
Usage
## S4 method for signature H2OFrame
var(x, y = NULL, na.rm = FALSE, use)
Arguments
x

An H2OFrame object.

y

NULL (default) or a column of an H2OFrame object. The default is equivalent to
y = x (but more efficient).

na.rm

logical. Should missing values be removed?

use

An optional character string to be used in the presence of missing values. This
must be one of the following strings: "everything", "all.obs", or "complete.obs".

See Also
var for the base R implementation. h2o.sd for standard deviation.
Examples
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
var(prostate.hex$AGE)
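
The statement that y = NULL is equivalent to y = x, and the handling of missing values, can be checked with the base R var function. This is a base R analogue on made-up numbers, assuming the H2OFrame method follows the same definitions.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

v      <- var(x)        # sample variance, denominator n - 1
same   <- var(x, x)     # y = x gives the same value
s      <- sd(x)         # standard deviation is the square root of v

with_na  <- c(x, NA)
default  <- var(with_na)                        # NA under use = "everything"
complete <- var(with_na, use = "complete.obs")  # missing value dropped first
```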

h2o.word2vec

Word2Vec

Description
Create a word2vec object.
Usage
h2o.word2vec(trainingFrame, minWordFreq, wordModel, normModel,
negExCnt = NULL, vecSize, windowSize, sentSampleRate, initLearningRate,
epochs)

Arguments
wordModel

SkipGram or CBOW

normModel

Hierarchical softmax or Negative sampling

vecSize

Size of word vectors

sentSampleRate Sampling rate in sentences to generate new n-grams
initLearningRate
Starting alpha value. This tempers the effect of progressive information as
learning progresses.
epochs

Number of iterations data is run through.

numNegEx

Number of negative samples used per word

vocabKey

Key pointing to frame of [Word, Cnt] vectors

winSize

Size of word window

Details
Two cases below: 1. Negative Sampling; 2. Hierarchical Softmax
* Constructor used for specifying the number of negative sampling cases.
* Constructor used for hierarchical softmax cases.

H2OConnection-class

The H2OConnection class.

Description
This class represents a connection to an H2O cloud.
Usage
## S4 method for signature H2OConnection
show(object)

Details
Because H2O is not a master-slave architecture, there is no restriction on which H2O node is used
to establish the connection between R (the client) and H2O (the server).
A new H2O connection is established via the h2o.init() function, which takes as parameters the ‘ip‘
and ‘port‘ of the machine running an instance to connect with. The default behavior is to connect
with a local instance of H2O at port 54321, or to boot a new local instance if one is not found at
port 54321.
Slots
ip A character string specifying the IP address of the H2O cloud.
port A numeric value specifying the port number of the H2O cloud.
mutable An H2OConnectionMutableState object to hold the mutable state for the H2O connection.

H2OFrame-class

The H2OFrame class

Description
The H2OFrame class
Usage
## S4 method for signature H2OFrame
show(object)
## S4 method for signature missing,H2OFrame
Ops(e1, e2)
## S4 method for signature H2OFrame,missing
Ops(e1, e2)
## S4 method for signature H2OFrame,H2OFrame
Ops(e1, e2)
## S4 method for signature numeric,H2OFrame
Ops(e1, e2)
## S4 method for signature H2OFrame,numeric
Ops(e1, e2)
## S4 method for signature H2OFrame,character
Ops(e1, e2)
## S4 method for signature character,H2OFrame
Ops(e1, e2)
## S4 method for signature H2OFrame
Math(x)
## S4 method for signature H2OFrame
Math2(x, digits)
## S4 method for signature H2OFrame
Summary(x, ..., na.rm = FALSE)
## S4 method for signature H2OFrame
!x
## S4 method for signature H2OFrame
is.na(x)
## S4 method for signature H2OFrame
t(x)
## S4 method for signature H2OFrame
log(x, ...)
## S4 method for signature H2OFrame
trunc(x, ...)
Methods (by generic)
• Ops: For missing,H2OFrame
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For H2OFrame,missing
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For H2OFrame,H2OFrame
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For numeric,H2OFrame
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For H2OFrame,numeric
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For H2OFrame,character
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Ops: For character,H2OFrame
"+", "-", "*", "^", "%%", "%/%", "/", "==", ">", "<", "!=", "<=", ">=", "&", "|", "**"
• Math: Generics
"abs", "sign", "sqrt", "ceiling", "floor", "trunc", "cummax", "cummin", "cumprod",
"cumsum", "log", "log10", "log2", "log1p", "acos", "acosh", "asin", "asinh", "atan",
"atanh", "exp", "expm1", "cos", "cosh", "cospi", "sin", "sinh", "sinpi", "tan", "tanh",
"tanpi", "gamma", "lgamma", "digamma", "trigamma"
• Math2: Generics
"round", "signif"
• Summary: Generics
"max", "min", "range", "prod", "sum", "any", "all"
• !: Generic "!"
• is.na: Generic "is.na"
• t: Generic "t"
• log: Generic "log"
• trunc: Generic "trunc"

Slots
conn An H2OConnection object specifying the connection to an H2O cloud.
key A character string specifying the key for the frame in the H2O cloud’s key-value store.
finalizers A list object containing environments with finalizers that remove keys from the H2O
key-value store.
mutable An H2OFrameMutableState object to hold the mutable state for the H2O frame.
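
The Ops entries listed under Methods (by generic) rely on R's S4 group generics, where a single method covers a whole family of operators. The sketch below shows that mechanism on a hypothetical toy class named Boxed; it is not the H2OFrame implementation, and how the H2OFrame methods evaluate operations on the cluster is not shown here.

```r
library(methods)

# Hypothetical toy class, used only to illustrate S4 group generics.
setClass("Boxed", representation(values = "numeric"))

# One method on the "Ops" group covers +, -, *, ==, <, and the other operators
# in that group for the signature (Boxed, numeric).
setMethod("Ops", signature(e1 = "Boxed", e2 = "numeric"), function(e1, e2) {
  # callGeneric() re-dispatches whichever operator was actually invoked.
  callGeneric(e1@values, e2)
})

b <- new("Boxed", values = c(1, 2, 3))
shifted <- b + 1   # arithmetic member of the Ops group
flags   <- b > 2   # comparison member of the Ops group
```

Each signature in the list above (for example numeric,H2OFrame versus H2OFrame,numeric) corresponds to a separate setMethod call of this kind.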

H2OFrame-Extract

Extract or Replace Parts of an H2OFrame Object

Description
Operators to extract or replace parts of H2OFrame objects.
Usage
## S4 method for signature H2OFrame
x[i, j, ..., drop = TRUE]
## S4 method for signature H2OFrame
x$name
## S4 method for signature H2OFrame
x[[i, exact = TRUE]]
## S4 replacement method for signature H2OFrame
x[i, j, ...] <- value
## S4 replacement method for signature H2OFrame
x$name <- value
## S4 replacement method for signature H2OFrame
x[[i]] <- value

Arguments
x

object from which to extract element(s) or in which to replace element(s).

i,j,...

indices specifying elements to extract or replace. Indices are numeric or character vectors or empty (missing) or will be matched to the names.

drop
name
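
The extraction and replacement forms in the Usage section follow the same syntax as base R data frames. The sketch below runs them on an ordinary data.frame as an analogue; whether every option (for example drop) behaves identically on an H2OFrame is an assumption not confirmed by this manual.

```r
df <- data.frame(a = 1:5, b = letters[1:5], stringsAsFactors = FALSE)

# Extraction
sub    <- df[2:3, "a"]                 # x[i, j]; a plain vector, since drop = TRUE
sub_df <- df[2:3, "a", drop = FALSE]   # stays a one-column data frame
col_b  <- df$b                         # x$name
col_a  <- df[["a"]]                    # x[[i]]

# Replacement
df[1, "a"] <- 100L                     # x[i, j] <- value
df$c       <- df$a * 2                 # x$name <- value (adds a column)
df[["b"]]  <- toupper(df$b)            # x[[i]] <- value
```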

H2OModel-class

The H2OModel object.

Description
This virtual class represents a model built by H2O.
Usage
## S4 method for signature H2OModel
show(object)
Details
This object has slots for the key, which is a character string that points to the model key existing in
the H2O cloud, and for the data used to build the model (an object of class H2OFrame).
Slots
conn Object of class H2OConnection, which is the client object that was passed into the function
call.
key A character string specifying the key for the model fit in the H2O cloud’s key-value store.
finalizers A list object containing environments with finalizers that remove keys from the H2O
key-value store.
algorithm A character string specifying the algorithm that was used to fit the model.
parameters A list containing the parameter settings that were used to fit the model.
model A list containing the characteristics of the model returned by the algorithm.

H2OModelMetrics-class

The H2OModelMetrics Object.

Description
A class for constructing performance measures of H2O models.
Usage
## S4 method for signature H2OModelMetrics
show(object)

H2OObject-class

The H2OObject class

Description
The H2OObject class
Usage
## S4 method for signature H2OObject
initialize(.Object, ...)
Slots
conn An H2OConnection object specifying the connection to an H2O cloud.
key A character string specifying the key in the H2O cloud’s key-value store.
finalizers A list object containing environments with finalizers that remove keys from the H2O
key-value store.

H2ORawData-class

The H2ORawData class.

Description
This class represents data in a post-import format.
Usage
## S4 method for signature H2ORawData
show(object)
Details
Data ingestion is a two-step process in H2O. First, a given path to a data source is _imported_ for
validation by the user. The user may continue onto _parsing_ all of the data into memory, or the
user may choose to back out and make corrections. Imported data is in a staging area such that H2O
is aware of the data, but the data is not yet in memory.
The H2ORawData is a representation of the imported, not yet parsed, data.
Slots
conn An H2OConnection object containing the IP address and port number of the H2O server.
key An object of class "character", which is the hex key assigned to the imported data.

H2OW2V-class

The H2OW2V object.

Description
This class represents an h2o-word2vec object.

is.factor,H2OFrame-method
Is an H2O Data Frame column an enum?

Description
Returns a Boolean: TRUE if the column is a factor (enum), FALSE otherwise.

Usage
## S4 method for signature H2OFrame
is.factor(x)

LazyEval

The H2OFrame "lazy" evaluators: Evaluate an AST.

Description
The pattern below is necessary in order to swap out S4 objects *in the calling frame*, and the code
re-use is necessary in order to safely assign back to the correct environment (i.e. back to the correct
calling scope).

MethodsIntro

A Mix of H2O-specific and Overloaded R methods.

Description
Below we have a mix of h2o and overloaded R methods according to the following ToC:
Details
H2O Methods:
h2o.ls, h2o.rm, h2o.assign, h2o.createFrame, h2o.splitFrame, h2o.ignoreColumns, h2o.cut, h2o.table
Time & Date (’*’ matches "Frame" and "ParsedData", indicating method dispatch via UseMethod):
year.H2O*, month.H2O*, diff.H2O*
Methods are grouped according to the data types upon which they operate. There is a grouping of
H2O-specific methods and methods that are overloaded from the R language (e.g. summary, head,
tail, dim, nrow).
Important Developer Notes on the Lazy Evaluators:
The H2OFrame "lazy" evaluators: Evaluate an AST.
The pattern below is necessary in order to swap out S4 objects *in the calling frame*, and the code
re-use is necessary in order to safely assign back to the correct environment (i.e. back to the correct
calling scope). If you *absolutely* need to nest calls like this, you _MUST_ correctly track the
names all the way down, and then all the way back up the scopes. Here’s the example pattern:
Number of columns
Num Columns of an AST.
Evaluate the AST and produce the ncol of the eval’ed AST.
ncol.H2OFrame <- function(x) {
  ID <- as.list(match.call())$x                                # try to get the ID from the call
  if(length(as.list(substitute(x))) > 1) ID <- "Last.value"    # get an appropriate ID
  .force.eval(h2o.getConnection(), x, ID = ID, rID = 'x')      # call the force eval
  ID <- ifelse(ID == "Last.value", ID, x@key)                  # bridge the IDs between the force.eval and the parent frame
  assign(ID, x, parent.frame())                                # assign the eval'd frame into the parent env
  ncol(get(ID, parent.frame()))                                # get the object back from the parent and perform the op
}
Take this line-by-line:
Line 1: grab the ID from the arg list; this ID is what we want the key to be in H2O
Line 2: if there is no suitable ID (i.e. we have some object, not a named thing), assign to Last.value
Line 3:
  1. Get a handle to h2o (h2o.getConnection())
  2. x is the AST we want to eval
  3. ID is the identifier we want the eventual object to have at the end of the day
  4. rID is used in .force.eval to assign back into *this* scope (i.e. child scope -> parent scope)
Line 4: The identifier in the parent scope will either be Last.value, or the key of the H2OFrame
  *NB: x is _guaranteed_ to be an H2OFrame object at this point (this is post .force.eval)
Line 5: assign from *this* scope, into the parent scope
Line 6: Do
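The assign-back-to-the-caller pattern described above can be tried in base R without an H2O cloud. The following is a minimal sketch under stated assumptions: make_lazy and force_in_caller are hypothetical names that do not exist in the h2o package, the "lazy" object is just a list holding an unevaluated expression, and is.name() is used in place of the length(as.list(substitute(x))) > 1 test to decide whether the caller passed a named object.

```r
# Hypothetical stand-in for a lazily evaluated object: a list that stores an
# unevaluated expression. None of these names come from the h2o package.
make_lazy <- function(expr) {
  structure(list(expr = substitute(expr), value = NULL), class = "lazy_demo")
}

force_in_caller <- function(x) {
  id <- as.list(match.call())$x            # symbol (or call) supplied by the caller
  if (!is.name(id)) id <- "Last.value"     # not a named thing: fall back to a fixed name
  id <- as.character(id)
  x$value <- eval(x$expr, parent.frame())  # "force" the stored expression
  assign(id, x, envir = parent.frame())    # write the evaluated object back into the caller
  get(id, envir = parent.frame())$value    # read it back from the caller and use it
}

a <- make_lazy(1 + 2)
res <- force_in_caller(a)                  # afterwards `a` in the caller holds the forced value
```

Calling force_in_caller(make_lazy(2 * 5)) instead, with no named object, leaves the forced object in the caller under the name Last.value, which mirrors the fallback described in Line 2.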

MethodsMisc-descrip

Methods that don’t fit into the S4 group generics:

Description
This also handles the cases where the Math ops have multiple args (e.g. 'log' and 'trunc')
Details
"!", "is.na", "t", "trunc"

Node-class

The Node class.

Description
An object of type Node inherits from an H2OFrame, but holds no H2O-aware data. Every node
in the abstract syntax tree has as its ancestor this class.
This class represents an operator between one or more H2O objects. ASTApply nodes are always
root nodes in a tree and are never leaf nodes. Operators are discussed more in depth in ops.R.
Details
Every node in the abstract syntax tree will have a symbol table, which is a dictionary of types and
names for all the relevant variables and functions defined in the current scope. A missing symbol is
therefore discovered by looking up the tree to the nearest symbol table defining that symbol.
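The lookup rule in the Details can be illustrated with a small base R sketch. It assumes a simplified node made of a named list of symbols plus a link to its parent. new_node and lookup_symbol are hypothetical helpers written for this illustration and are not part of the h2o package.

```r
# Hypothetical node: a symbol table (name -> type) and a link to the parent node.
new_node <- function(symbols = list(), parent = NULL) {
  list(symbols = symbols, parent = parent)
}

# Walk from a node towards the root until some symbol table defines the name.
lookup_symbol <- function(node, name) {
  while (!is.null(node)) {
    if (name %in% names(node$symbols)) return(node$symbols[[name]])
    node <- node$parent
  }
  NULL  # the symbol is not defined anywhere on the path to the root
}

root  <- new_node(list(x = "numeric", f = "function"))
inner <- new_node(list(y = "character"), parent = root)
leaf  <- new_node(list(x = "factor"), parent = inner)

nearest_x <- lookup_symbol(leaf, "x")   # found in the leaf's own table
nearest_f <- lookup_symbol(leaf, "f")   # found two levels up, in the root
```

The nearest definition wins: the leaf's own entry for x shadows the one held by the root.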

OpsIntro-descrip

Overview

Description
R operators mixed with H2OFrame objects.

Details
Operating on an object of type H2OFrame triggers the rollup of the expression _to be executed_ :
the expression itself is not evaluated. Instead, an AST is built up from the R expression using R’s
built-in parser (which handles operator precedence), and, in the case of assignment, is stashed into
the variable in the assignment.
The AST is bound to an R variable as a promise to evaluate the expression on demand. When
evaluation is forced, the AST is walked, converted to JSON, and shipped over to H2O. The result
returned by H2O is a key pointing to the newly created frame.
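The idea of recording an expression instead of evaluating it can be sketched in base R with an S3 Ops group method. This is an illustration only, not the package's implementation: the class lazy_frame and the helpers lazy_leaf and render are hypothetical, and the prefix string produced by render merely stands in for the JSON that would be shipped to H2O.

```r
# Hypothetical leaf constructor for a toy expression tree.
lazy_leaf <- function(name) {
  structure(list(op = NULL, args = list(), name = name), class = "lazy_frame")
}

# Group method: any arithmetic or comparison operator applied to a lazy_frame
# records the operation and its operands instead of computing a result.
Ops.lazy_frame <- function(e1, e2) {
  args <- if (missing(e2)) list(e1) else list(e1, e2)
  structure(list(op = .Generic, args = args, name = NULL), class = "lazy_frame")
}

# Walk the tree and render it as a prefix string.
render <- function(node) {
  if (!inherits(node, "lazy_frame")) return(format(node))  # plain R value
  if (is.null(node$op)) return(node$name)                  # leaf
  rendered_args <- vapply(node$args, render, character(1))
  paste0("(", node$op, " ", paste(rendered_args, collapse = " "), ")")
}

a <- lazy_leaf("a")
b <- lazy_leaf("b")
expr <- (a + b) * 2        # builds a tree; nothing is computed
rendered <- render(expr)   # the tree is only walked here
```

Because R's parser has already resolved precedence before the operators dispatch, a + b * 2 produces a tree with * nested under +, without any precedence handling in the sketch itself.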
Methods may have a non-H2OFrame return type. Any extra preprocessing of data returned by H2O
is discussed in each instance, as it varies from method to method.
What’s implemented?
Many of R’s generic S3 methods may be mixed with H2OFrame objects, in which case the result is coerced to the appropriately typed object (typically an H2OFrame object).
A list of R’s generic methods may be found by calling getGenerics(). Likewise, a call to h2o.getGenerics()
will list the operations that are permissible with H2OFrame objects.
S3 methods are divided into four groups: Math, Ops, Complex, and Summary. H2OFrame methods
follow these divisions as well, with the exception of Complex, which is unimplemented.
More precisely, the group divisions follow the S4 divisions: Ops, Math, Math2, Summary.
See also groupGeneric.
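The members of the S4 groups named above can be listed with getGroupMembers() from the methods package that ships with R. This short sketch only inspects base R and says nothing about which members H2OFrame actually implements.

```r
library(methods)  # provides getGroupMembers()

ops_groups      <- getGroupMembers("Ops")      # Ops is itself made up of sub-groups
arith_members   <- getGroupMembers("Arith")    # one of those sub-groups
math_members    <- getGroupMembers("Math")
math2_members   <- getGroupMembers("Math2")
summary_members <- getGroupMembers("Summary")

print(math2_members)
```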

print.H2OTable

Print method for H2OTable objects

Description
Print method for H2OTable objects
Usage
## S3 method for class H2OTable
print(x, ...)
Arguments
x      An H2OTable object

...    Further arguments passed to or from other methods.

Value
The original x object

quantile

Quantiles of H2O Data Frame.

Description
Obtain and display quantiles for H2O parsed data.
Usage
## S3 method for class H2OFrame
quantile(x, probs = c(0.01, 0.05, 0.1, 0.25, 0.333, 0.5,
0.667, 0.75, 0.9, 0.95, 0.99), ...)

Arguments
x        An H2OFrame object with a single numeric column.

probs    Numeric vector of probabilities with values in [0,1].

...      Further arguments passed to or from other methods.

Details
quantile.H2OFrame, a method for the quantile generic. Obtain and return quantiles for an
H2OFrame object.
Value
A vector describing the percentiles at the given cutoffs for the H2OFrame object.
Examples
# Request quantiles for an H2O parsed data set:
library(h2o)
localH2O <- h2o.init()
prosPath <- system.file("extdata", "prostate.csv", package="h2o")
prostate.hex <- h2o.uploadFile(localH2O, path = prosPath)
# Request quantiles for a subset of columns in an H2O parsed data set
quantile(prostate.hex[,3])
for(i in 1:ncol(prostate.hex))
  quantile(prostate.hex[,i])
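To see what the default probs vector requests without starting H2O, the same cutoffs can be passed to base R's quantile on an ordinary numeric vector. This is a sketch for orientation only: base R interpolates with its default method (type 7), and the values H2O returns for the same data are not guaranteed to match.

```r
# The default cutoffs from the Usage section, applied to a plain numeric vector.
probs <- c(0.01, 0.05, 0.1, 0.25, 0.333, 0.5, 0.667, 0.75, 0.9, 0.95, 0.99)
x <- 1:101

q <- quantile(x, probs = probs)        # named numeric vector, one entry per cutoff
median_q <- unname(q[probs == 0.5])    # the entry for the 0.5 cutoff

print(q)
```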

summary

Summarizes the columns of an H2O data frame.

Description
A method for the summary generic. Summarizes the columns of an H2O data frame or subset of
columns and rows using vector notation (e.g. dataset[row, col])
Usage
## S4 method for signature H2OFrame
summary(object, ...)
Arguments
object    An H2OFrame object.

...       Further arguments passed to or from other methods.

Value
A table displaying the minimum, 1st quartile, median, mean, 3rd quartile and maximum for each
numeric column, and the levels and category counts of the levels in each categorical column.
Examples
library(h2o)
localH2O = h2o.init()
prosPath = system.file("extdata", "prostate.csv", package="h2o")
prostate.hex = h2o.importFile(localH2O, path = prosPath)
summary(prostate.hex)
summary(prostate.hex$GLEASON)
summary(prostate.hex[,4:6])

transform.H2OFrame

Transform Columns in an H2OFrame Object.

Description
Functions that facilitate column transformations of an H2OFrame object.
Usage
## S3 method for class H2OFrame
transform(`_data`, ...)
## S3 method for class H2OFrame
within(data, expr, ...)

Arguments
_data,data    An H2OFrame object.

...           For transform method, column transformations in the form tag=value.

expr          For within method, column transformations specified as an expression.

See Also
transform, within for the base R methods.
Examples
library(h2o)
localH2O <- h2o.init()
iris.hex <- as.h2o(iris, localH2O)
transformed1 <- transform(iris.hex,
                          Sepal.Ratio = Sepal.Length / Sepal.Width,
                          Petal.Ratio = Petal.Length / Petal.Width )
transformed1
transformed2 <- within(iris.hex,
                       {Sepal.Product <- Sepal.Length * Sepal.Width
                        Petal.Product <- Petal.Length * Petal.Width
                        Sepal.Petal.Ratio <- Sepal.Product / Petal.Product
                        Sepal.Length <- Sepal.Width <- NULL
                        Petal.Length <- Petal.Width <- NULL
                       })
transformed2
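For comparison, the base R methods listed under See Also can be run on an ordinary data.frame without an H2O cloud. The sketch below uses a small made-up data frame (the column names len and wid are illustrative). It shows the behaviour of base R only, and the H2OFrame methods are assumed, not verified here, to mirror it.

```r
# A tiny made-up data frame; the column names are illustrative only.
df <- data.frame(len = c(2, 4), wid = c(1, 2))

# transform: each tag = value pair is evaluated against the existing columns.
t1 <- transform(df, ratio = len / wid)

# within: the expression runs as a block, so later lines may use variables
# created by earlier ones, and assigning NULL drops a column.
t2 <- within(df, {
  area <- len * wid
  half <- area / 2
  len  <- NULL
})

print(t1)
print(t2)
```

In base R, within is the one to reach for when one derived column depends on another, as with Sepal.Petal.Ratio in the example above.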


