SparkCognition Darwin API User Guide Spark Cognition V1.6

SparkCognition Darwin API User Guide
Contents
About this guide
Darwin overview
Accessing the API
Expectation
Technical routes
analyze
auth
clean
download
job
lookup
run
train
upload
Revision Table
About this guide
This manual describes the Darwin API and its use in automated model building. It is intended for data scientists, software engineers, and analysts who want to use the Darwin API to create and train models, monitor jobs, and perform analysis.
Darwin overview
Darwin is a SparkCognition tool that automates model building processes to solve specific problems. This tool enhances data scientist potential because it automates various tasks that are often performed manually. These tasks include data cleaning, latent relationship extraction, and optimal model determination. Darwin promotes rapid and accurate feature generation through both automated windowing and risk generation. Darwin quickly creates highly accurate, dynamic models using both supervised and unsupervised learning methods.
For additional information on Darwin, contact your local SparkCognition partner for access to the white
paper titled: Darwin - A Neurogenesis Platform.
Accessing the API
The Darwin API can normally be accessed through one of three methods:
the Darwin Python SDK (preferred, recommended)
the https://darwin-api.sparkcognition.com/v1 endpoint
optionally, through user-created curl commands
For additional information on the Darwin SDK, see the SparkCognition Darwin Python SDK Guide.
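For readers who call the endpoint directly instead of using the SDK, the following minimal sketch composes a route URL and the Authorization header using only the Python standard library. The helper names (endpoint, auth_headers) are hypothetical and are not part of the Darwin SDK. The base URL is the one listed above, and the Bearer scheme follows the Headers entries used throughout this guide.

```python
from urllib.parse import quote

# Base URL as listed in this guide.
BASE_URL = "https://darwin-api.sparkcognition.com/v1"


def endpoint(*parts: str) -> str:
    """Join path segments onto the base URL, percent-encoding each one."""
    return "/".join([BASE_URL, *(quote(p, safe="") for p in parts)])


def auth_headers(token: str) -> dict:
    """Build the Authorization header shown under 'Headers' for each route."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    print(endpoint("lookup", "dataset", "my data.csv"))
    print(auth_headers("token_string"))
```

Percent-encoding each segment keeps dataset or model names that contain spaces or slashes from changing the route.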
Expectation
This document assumes that the reader is a data scientist or software engineer who is familiar with data science techniques and the associated programming tasks.
Technical routes
The Darwin API includes the following API operations:
analyze - analyze a model or dataset
auth - register and authenticate
clean - preprocess a dataset
download - download or delete a generated artifact
job - return status on jobs
lookup - get model or dataset metadata
run - run a model on a dataset
train - train a model
upload - upload or delete a dataset
analyze
Request Type: POST
URI: /v1/analyze/model/{model_name}
Headers:
Authorization: Bearer token
Form Data:
model_name: The name of the model to be analyzed
job_name: (optional) If not specified, a uuid is created as the job_name.
artifact_name: (optional) If not specified, a uuid is created as the artifact_name.
category_name: (optional) The name of the class (for supervised models) or cluster (for unsupervised models) for which to get feature importances. If this is not specified, the feature importances are computed over all classes/clusters.
model_type: (optional) Model type from the population. Possible values include: DeepNeuralNetwork, RandomForest, GradientBoosted.
Description: Analyze the universal feature importances for a particular model given the model name.
Note: This API is capable of returning the structure of the model in the form of a pandas Series.
Response Codes: 201, 400, 401, 403, 422
Successful Response:
{
"job_name": "string",
"artifact_name": "string"
}
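As an illustration of this route, the sketch below prepares (but does not send) the POST request with the Python standard library. It assumes that the form data is sent as a URL-encoded body, which this guide does not state explicitly. The function name build_analyze_model_request is hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://darwin-api.sparkcognition.com/v1"


def build_analyze_model_request(token, model_name, **optional):
    """Prepare a POST to /v1/analyze/model/{model_name} without sending it.

    Optional keys documented above: job_name, artifact_name,
    category_name, model_type. Keys whose value is None are skipped.
    """
    form = {"model_name": model_name}
    form.update({k: v for k, v in optional.items() if v is not None})
    return Request(
        url=f"{BASE_URL}/analyze/model/{quote(model_name, safe='')}",
        data=urlencode(form).encode("utf-8"),  # assumed URL-encoded form body
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_analyze_model_request("token_string", "my_model",
                                      model_type="RandomForest")
    print(req.get_method(), req.full_url)
    print(req.data)
```

Passing the prepared request to urllib.request.urlopen would submit it. Per the response codes above, a 201 response carries the job_name and artifact_name to use with the job and download routes.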
Request Type: POST
URI: /v1/analyze/model/predictions/{model_name}/{dataset_name}
Headers:
Authorization: Bearer token
Form Data:
dataset_name: The name of the dataset containing the data to analyze predictions for. This is a new dataset that was not used during training for which you want feature importance scores for each row of this dataset. This dataset has a limit of 500 rows. There is no limit for columns.
model_name: The name of the model to be analyzed
job_name: (optional) If not specified, a uuid is created as the job_name.
artifact_name: (optional) If not specified, a uuid is created as the artifact_name.
start_index: (optional) Index to start at in the dataset when analyzing model predictions.
end_index: (optional) Index to stop at in the dataset when analyzing model predictions.
model_type: (optional) Model type from the population. Possible values include: DeepNeuralNetwork, RandomForest, GradientBoosted.
Description: Analyze specific feature importances for a particular sample or samples given the model name and sample data. Analyze predictions cannot be used if you trained your model with a dataset that is larger than 500 MB.
Response Codes: 201, 400, 401, 403, 422
Successful Response:
{
"job_name": "string",
"artifact_name": "string"
}
Request Type: POST
URI: /v1/analyze/data/{dataset_name}
Headers:
Authorization: Bearer token
Description: Analyze a dataset and return statistics/metadata concerning designated data.
Parameter Descriptions:
dataset_name: The name of the dataset to analyze and return statistics/metadata for
job_name: The job name
artifact_name: The artifact name
max_unique_values: Threshold for automatic pruning of categorical columns prior to one hot encoding based on the number of unique values.
Note: If a categorical column contains at least max_unique_values unique values, it is dropped during preprocessing prior to one hot encoding.
Payload:
{
"job_name": "string",
"artifact_name": "string",
"max_unique_values": 30
}
Response Codes: 201, 400, 401, 403, 408, 422
Successful Response:
{
"job_name": "string",
"artifact_name": "string"
}
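The pruning rule described for max_unique_values can be sketched in a few lines. This is an illustration of the rule as stated, not Darwin's preprocessing code. The function name columns_to_prune is hypothetical, the columns are assumed to have already been identified as categorical, and the threshold of 30 is taken from the sample payload rather than from a documented default.

```python
def columns_to_prune(categorical_columns, max_unique_values=30):
    """Return the categorical columns that would be dropped before one hot encoding.

    categorical_columns maps a column name to its list of values. A column is
    dropped when it has at least max_unique_values distinct values.
    """
    return [
        name
        for name, values in categorical_columns.items()
        if len(set(values)) >= max_unique_values
    ]


if __name__ == "__main__":
    data = {
        "color": ["red", "green", "blue", "red"],  # 3 distinct values
        "serial": [f"SN{i}" for i in range(40)],   # 40 distinct values
    }
    print(columns_to_prune(data))
```

A high-cardinality column such as a serial number would expand into one encoded column per distinct value, which is the situation the threshold is meant to avoid.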
auth
Request Type: PATCH
URI: /v1/auth/email
Headers:
Authorization: Bearer token
Description: Add or change an email address.
Form Data:
email: Email address
Response Codes: 204, 400, 401, 422
Successful Response:
{
'access_token': 'token_string'
}
Request Type: POST
URI: /v1/auth/login
Headers:
Authorization: Bearer token
Description: Log in as a service.
Form Data:
api_key: The api key of the service
pass1: The service level password
Response Codes: 201, 400, 401
Successful Response:
{
'access_token': 'token_string'
}
Request Type: POST
URI: /v1/auth/login/user
Description: Log in as a user.
Form Data:
username: The end user’s name
pass1: The end user’s password
Response Codes: 201, 400, 401, 422
Successful Response:
{
'access_token': 'token_string'
}
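The login flow can be sketched as two small steps: prepare the POST with the username and pass1 fields, then read access_token from the response body. The sketch assumes a URL-encoded form body and a JSON response body. Both are assumptions, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://darwin-api.sparkcognition.com/v1"


def build_user_login_request(username: str, password: str) -> Request:
    """Prepare (but do not send) a POST to /v1/auth/login/user."""
    form = urlencode({"username": username, "pass1": password}).encode("utf-8")
    return Request(f"{BASE_URL}/auth/login/user", data=form, method="POST")


def extract_token(response_body: bytes) -> str:
    """Pull access_token out of a response body, assumed to be JSON."""
    return json.loads(response_body)["access_token"]


if __name__ == "__main__":
    req = build_user_login_request("alice", "secret")
    print(req.get_method(), req.full_url)
    print(extract_token(b'{"access_token": "token_string"}'))
```

The extracted token is the value that the other routes expect after "Bearer" in the Authorization header.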
Request Type: PATCH
URI: /v1/auth/password
Headers:
Authorization: Bearer token
Description: Change the password.
Form Data:
curpass: Current password
newpass1: New password
newpass2: Confirmation of new password
Response Codes: 204, 400, 401, 422
Successful Response:
{
'access_token': 'token_string'
}
Request Type: PATCH
URI: /v1/auth/password/reset
Headers:
Description: Reset a user’s password. An email will be sent to the user’s email address with a temporary password and instructions for changing it.
Form Data:
username: The username of the user whose password needs resetting
Response Codes: 201, 400, 401, 422
Successful Response:
{
'access_token': 'token_string'
}
Request Type: POST
URI: /v1/auth/register
Headers:
Description: Register as a service.
Form Data:
api_key: The api key of the service
pass1: The service level password
pass2: The service level password confirmation
email: Email address
Response Codes: 201, 400, 401, 403
Successful Response:
{
'access_token': 'token_string'
}
Request Type: POST
URI: /v1/auth/register/user
Headers:
Authorization: Bearer token
Description: Register a user for your service.
Form Data:
username: The end user’s name
pass1: The end user’s password
pass2: The end user’s password confirmation
email: The end user’s email address
Response Codes: 201, 400, 401, 422
Successful Response:
{
'access_token': 'token_string'
}
Request Type: DELETE
URI: /v1/auth/register/user/{username}
Headers:
Authorization: Bearer token
Description: Remove/Unregister a user.
Form Data:
username: The username of the user to remove
Response Codes: 201, 401, 403
Successful Response: None
clean
Request Type: POST
URI: /v1/clean/dataset/{dataset_name}
Headers:
Authorization: Bearer token
Description: Clean a named dataset. The output is the cleaned dataset, which is scaled and one-hot-encoded based on parameters in /analyze/data. Use /download/dataset to retrieve the cleaned dataset. /clean/dataset is only used for visualizing what Darwin would do or for when you want to use the cleaned data outside of Darwin. Do not clean data and then train on the cleaned data with Darwin. Invoking /train/model has its own cleaning function as part of the model creation process.
Form Data:
dataset_name: Name of dataset to clean
job_name: Name of job
artifact_name: Name given to the cleaned dataset
target: (Mandatory for Supervised Model Building) String denoting target prediction column in input data.
impute: String alias that indicates how to fill in missing values in input data.
ALIAS | DESCRIPTION | COMPLEXITY
'ffill' (Default) | Forward Fill: Propagate values forward from one example into the missing cell of the next example. Might be useful for timeseries data, but also applicable for both numerical and categorical data. | Linear, Fast
'bfill' | Backward Fill: Propagate values backward from one example into the missing cell of the previous example. Might be useful for timeseries data, but also applicable for both numerical and categorical data. | Linear, Fast
'mean' | Mean Fill: Computes the mean value of all non-missing examples in a column to fill in missing examples. The result may or may not be interpretable in terms of the input space for categorical variables. | Linear, Fast
max_int_uniques: Expected input/type: integer. Threshold for automatic encoding of categorical variables. If a column contains fewer than max_int_uniques unique values, it is treated as categorical and one hot encoded during preprocessing.
Note: If the target has more numeric values than the max_int_uniques set point, the problem is treated as a regression and will use MSE.
max_unique_values: Expected input/type: integer. Threshold for automatic pruning of categorical columns prior to one hot encoding based on the number of unique values.
Note: If a categorical column contains at least max_unique_values unique values, it is dropped during preprocessing prior to one hot encoding.
Response Codes: 400, 401, 403, 422
Successful Response:
{
"job_name": "string",
"artifact_name": "string"
}
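To make the three impute aliases concrete, the sketch below applies each strategy to a single column in plain Python, with None standing in for a missing value. It illustrates the behavior described for 'ffill', 'bfill', and 'mean' and is not Darwin's implementation. How Darwin treats gaps at the very start (for 'ffill') or end (for 'bfill') of a column is not stated in this guide, so the sketch simply leaves them unfilled.

```python
def impute(values, method="ffill"):
    """Fill None entries in one column using 'ffill', 'bfill', or 'mean'."""
    filled = list(values)
    if method == "ffill":
        # Carry the previous value forward into each missing cell.
        for i in range(1, len(filled)):
            if filled[i] is None:
                filled[i] = filled[i - 1]
    elif method == "bfill":
        # Carry the next value backward into each missing cell.
        for i in range(len(filled) - 2, -1, -1):
            if filled[i] is None:
                filled[i] = filled[i + 1]
    elif method == "mean":
        present = [v for v in filled if v is not None]
        if present:  # leave the column untouched if nothing was observed
            mean = sum(present) / len(present)
            filled = [mean if v is None else v for v in filled]
    else:
        raise ValueError(f"unknown impute alias: {method!r}")
    return filled


if __name__ == "__main__":
    column = [None, 1.0, None, 3.0, None]
    for alias in ("ffill", "bfill", "mean"):
        print(alias, impute(column, alias))
```

Each strategy makes a constant number of passes over the column, which is consistent with the linear complexity listed in the table.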
download
Request Type: GET
URI: /v1/download/artifacts/{artifact_name}
Headers:
Authorization: Bearer token
Description: Download an artifact by name.
Form Data:
artifact_name: Name of the artifact to download
Response Codes: 201, 401, 404, 408, 422
Successful Response:
{
'artifact': 'artifact_name'
}
Request Type: DELETE
URI: /v1/download/artifacts/{artifact_name}
Headers:
Authorization: Bearer token
Description: Delete an artifact.
Form Data:
artifact_name: Name of the artifact to delete
Response Codes: 204, 401, 404, 408, 422
Successful Response: None
Request Type: GET
URI: /v1/download/dataset/{dataset_name}
Headers:
Authorization: Bearer token
Description: Download a dataset by name. It can be an original or cleaned dataset.
Form Data:
dataset_name: Name of the dataset to download. In the case of downloading a cleaned dataset, this would be the name returned by /clean/dataset/{dataset_name}.
file_part: Part number of a multi-part dataset, expressed as an integer.
Response Codes: 401, 404, 408, 422
Successful Response:
{
"dataset": "string",
"part": 1,
"note": "string"
}
Request Type: GET
URI: /v1/download/model/{model_name}
Headers:
Authorization: Bearer token
Description: Download a supervised model by name.
Form Data:
model_name: Name of the model to download
path: (optional) Relative or absolute path of the directory to download the model to. This directory must already exist prior to model download. If no path is specified, the current directory is used. There are two files associated with a model: 'model' and 'data_profiler'.
model_type: (optional) Model type of the model to be downloaded. Possible values include: DeepNeuralNetwork, RandomForest, GradientBoosted.
model_format: (optional) Format in which the model is to be downloaded. Possible values include: json, onnx.
Response Codes: 401, 404, 408, 422
Successful Response:
A successful response returns a .zip file, which contains two files: the supervised model itself and the
data profiler. Downloading unsupervised models is not supported.
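Once the .zip response has been received, it can be unpacked on the client side as sketched below. The function name unpack_model_archive is hypothetical. The check that the target directory already exists mirrors the requirement stated for path, and the member names used in the demonstration ('model' and 'data_profiler') follow the description above but may differ in a real archive.

```python
import io
import zipfile
from pathlib import Path


def unpack_model_archive(zip_bytes: bytes, path: str = ".") -> list:
    """Extract a downloaded model archive into an existing directory.

    Returns the sorted list of member names found in the archive.
    """
    target = Path(path)
    if not target.is_dir():
        # The guide states that the directory must exist beforehand.
        raise FileNotFoundError(f"download directory does not exist: {target}")
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


if __name__ == "__main__":
    import tempfile

    # Build a stand-in archive, since no real download happens here.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("model", b"placeholder model bytes")
        demo.writestr("data_profiler", b"placeholder profiler bytes")
    with tempfile.TemporaryDirectory() as tmp:
        print(unpack_model_archive(buffer.getvalue(), tmp))
```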
job
Request Type: GET
URI: /v1/job/status
Headers:
Authorization: Bearer token
Query Parameters:
age: List jobs that are less than X units old (for example, 3 weeks, 2 days)
status: List jobs with a particular status (for example, Running)
Description: Get the status for all jobs. Note that only 2 jobs can be running concurrently.
Response Codes: 200, 400, 401, 422
Successful Response:
[
{
"job_name": "job1_name",
"status": "Requested",
"starttime": "2018-01-30T13:27:46.449865",
"endtime": "2018-01-30T13:28:46.449865",
"percent_complete": 0,
"job_type": "TrainModel",
"loss": 0,
"generations": 0,
"dataset_names": [
"phone_data"
],
"artifact_names": [
"art1"
],
"model_name": null,
"job_error": "string"
},
{
"job_name": "job2_name",
"status": "Running",
"starttime": "2018-01-30T13:27:46.449865",
"endtime": "2018-01-30T13:28:46.449865",
"percent_complete": 23,
"job_type": "UpdateModel",
"loss": 0.92,
"generations": 50,
"dataset_names": [
"language_data"
],
"artifact_names": null,
"model_name": "test_model",
"job_error": "string"
}
]
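The age and status query parameters can be attached to the route as an ordinary query string, as in the sketch below. The function name job_status_url is hypothetical. The exact text that the service accepts for age is only known from the examples above ("3 weeks", "2 days"), so treat the values here as illustrative.

```python
from urllib.parse import urlencode

BASE_URL = "https://darwin-api.sparkcognition.com/v1"


def job_status_url(age=None, status=None) -> str:
    """Build the GET /v1/job/status URL, adding only the filters that are set."""
    params = {k: v for k, v in (("age", age), ("status", status)) if v is not None}
    url = f"{BASE_URL}/job/status"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    print(job_status_url())
    print(job_status_url(age="2 days", status="Running"))
```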
Request Type: GET
URI: /v1/job/status/{job_name}
Headers:
Authorization: Bearer token
Description: Get the status for a particular job.
Form Data:
job_name: The job name you want status on.
Response Codes: 200, 400, 401, 403, 404, 422
Successful Response:
{
"status": "Requested, Running, Completed",
"starttime": "string",
"endtime": "string",
"percent_complete": 30,
"job_type": "string",
"loss": 0,
"generations": 0,
"dataset_names": [
"string"
],
"artifact_names": [
"string"
],
"model_name": "string",
"job_error": "string"
}
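Because training and analysis run as jobs, a client typically polls this route until the job finishes. The sketch below shows one way to do that with the HTTP call injected as a callable, so that the loop can be tried without network access. It assumes that "Completed" is the terminal success status and that job_error is empty or null when nothing went wrong. Neither assumption is confirmed by this guide, and the function name wait_for_job is hypothetical.

```python
import time


def wait_for_job(fetch_status, job_name, poll_seconds=5.0, max_polls=120,
                 sleep=time.sleep):
    """Poll a job until it reports 'Completed' or a non-empty job_error.

    fetch_status is any callable that takes a job name and returns the decoded
    JSON object from GET /v1/job/status/{job_name}.
    """
    for _ in range(max_polls):
        status = fetch_status(job_name)
        if status.get("job_error"):
            raise RuntimeError(f"job {job_name!r} failed: {status['job_error']}")
        if status.get("status") == "Completed":
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_name!r} not finished after {max_polls} polls")


if __name__ == "__main__":
    # Canned responses stand in for real API calls.
    canned = iter([
        {"status": "Requested", "percent_complete": 0},
        {"status": "Running", "percent_complete": 23},
        {"status": "Completed", "percent_complete": 100},
    ])
    final = wait_for_job(lambda name: next(canned), "job1_name",
                         sleep=lambda seconds: None)
    print(final["status"], final["percent_complete"])
```

Bounding the number of polls keeps a stalled job from blocking the caller indefinitely.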
Request Type: PATCH
URI: /v1/job/status/{job_name}
Headers:
Authorization: Bearer token
Description: Stop a running job.
Form Data:
job_name: The job name you want to stop.
Response Codes: 200, 400, 401, 403, 404, 422
Successful Response:
"Job is scheduled to stop"
Request Type: DELETE
URI: /v1/job/status/{job_name}
Headers:
Authorization: Bearer token
Description: Soft delete a running job
Form Data:
job_name: The job name you want to delete.
Response Codes: 200, 400, 401, 403, 404, 422
Successful Response:
None
lookup
Request Type: GET
URI: /v1/lookup/limits
Headers:
Authorization: Bearer token
Description: Get a client’s usage limit metadata.
Response Codes: 200, 401, 422
Successful Response:
{
"username": "string",
"tier": 0,
"model_limit": 0,
"job_limit": 0,
"upload_limit": 0,
"user_limit": 0
}
Request Type: GET
URI: /v1/lookup/artifact
Headers:
Authorization: Bearer token
Query Parameters:
type: filter on the type of artifact (for example, Model, Dataset, Test, or Run)
Description: Get artifact metadata
Response Codes: 200, 401, 422
Successful Response:
[
{
"id": "string",
"name": "string",
"type": "string",
"created_at": "2018-01-22T19:00:39.863Z",
"mbytes": 0
}
]
Request Type: GET
URI: /v1/lookup/artifact/{artifact_name}
Headers:
Authorization: Bearer token
Description: Get artifact metadata for a single artifact
Form Data:
artifact_name: The artifact name you want to look up.
Response Codes: 200, 401, 404, 422
Successful Response:
{
"name": "string",
"type": "string",
"created_at": "2018-01-22T19:00:39.869Z",
"mbytes": 0
}
Request Type: GET
URI: /v1/lookup/model
Headers:
Authorization: Bearer token
Description: Get the model metadata for a user. This is useful if a user has forgotten certain model names.
Response Codes: 200, 401, 422
Successful Response:
[
{
"id": {},
"name": "model1_name",
"type": "Supervised",
"updated_at": "2017-02-03T073000",
"problem_type": "string",
"trained_on": ["dataset1_id", "dataset2_id"],
"generations": 100,
"loss": 0.8,
"complete": {},
"parameters": {},
"train_time_seconds": 240,
"algorithm": "string",
"running_job_id": "string",
"description": {"best_genome": "RandomForestClassifier", "recurrent": False}
},
{
"id": {},
"name": "model2_name",
"type": "Ensembled",
"updated_at": "2017-08-22T175022",
"trained_on": ["dataset3_id"],
"loss": 0.82,
"complete": {},
"generations": 80,
"parameters": {
"target": "target1"
},
"train_time_seconds": 180,
"algorithm": "string",
"running_job_id": "string",
"description": {"best_genome": "DeepNet(\n (l0): LSTM(20, 18, num_layers=2)\n (l1): Linear(in_features=18, out_features=1, bias=True)\n)", "recurrent": True}
}
]
Note: running_job_id is only returned when complete is False.
Request Type: GET
URI: /v1/lookup/model/{model_name}
Headers:
Authorization: Bearer token
Description: Get all of the model metadata for a particular model.
Form Data:
model_name: The model name you want to look up.
Response Codes: 200, 401, 404, 422
Successful Response:
{
"type": "Unsupervised",
"updated_at": "2017-02-03T073000",
"trained_on": ["dataset1_id", "dataset2_id"],
"generations": 100,
"loss": 0.8,
"parameters": {},
"train_time_seconds": 180,
"algorithm": "string",
"running_job_id": "string",
"description": {"best_genome": "RandomForestClassifier", "recurrent": False}
}
Note: running_job_id is only returned when complete is False.
Request Type: GET
URI: /v1/lookup/model/{model_name}/population
Headers:
Authorization: Bearer token
Description: Get model descriptions of the best genomes for all model types that were trained. The population is displayed for unsupervised models only.
Form Data:
model_name: The model name or identifier.
Response Codes: 201, 401, 404, 422
Successful Response:
{
"population": {
"model_types": {
"DeepNeuralNetwork": {
"model_description": "string",
"loss_function": "string",
"fitness": Double
},
"RandomForest": {
"model_description": "string",
"loss_function": "string",
"fitness": Double
},
"GradientBoosted": {
"model_description": "string",
"loss_function": "string",
"fitness": Double
}
}
}
}
Request Type: GET
URI: /v1/lookup/dataset
Headers:
Authorization: Bearer token
Description: Get the dataset metadata for a user. This is useful if a user has forgotten certain dataset names.
Response Codes: 200, 401, 422
Successful Response:
[
{
"name": "dataset1_name",
"mbytes": 0.2,
"minimum_recommended_train_time": "string",
"updated_at": "20170924T000000",
"categorical": False,
"sequential": True,
"imbalanced": True
},
{
"name": "dataset2_name",
"mbytes": 3.5,
"minimum_recommended_train_time": "string",
"updated_at": "20170902T010101",
"categorical": True,
"sequential": False,
"imbalanced": False
}
]
Request Type: GET
URI: /v1/lookup/dataset/{dataset_name}
Headers:
Authorization: Bearer token
Description: Get all of the metadata for a particular dataset.
Form Data:
dataset_name: The dataset name for which you want the metadata.
Response Codes: 200, 401, 404, 422
Successful Response:
{
"mbytes": 0.2,
"minimum_recommended_train_time": "string",
"updated_at": "20170924T000000",
"categorical": False,
"sequential": True,
"imbalanced": True
}
Request Type: GET
URI: /v1/lookup/tier
Headers:
Authorization: Bearer token
Description: Get all of the tier metadata.
Response Codes: 200, 401, 422
Successful Response:
[
{
"tier": 0,
"model_limit": 0,
"job_limit": 0,
"upload_limit": 0,
"user_limit": 0
}
]
Request Type: GET
URI: /v1/lookup/tier/{tier_num}
Headers:
Authorization: Bearer token
Description: Get the metadata for a particular tier.
Form Data:
tier_num: Tier for which you want metadata.
Response Codes: 200, 401, 404, 422
Successful Response:
{
"tier": 0,
"model_limit": 0,
"job_limit": 0,
"upload_limit": 0,
"user_limit": 0
}
Request Type: GET
URI: /v1/lookup/user
Headers:
Authorization: Bearer token
Description: Get user metadata for all users.
Response Codes: 200, 401, 422
Successful Response:
[
{
"user_id": "string",
"internal_name": "string",
"username": "string",
"tier": 0,
"created_at": "string",
"client_api_key": "string",
"expires_on": "string",
"parent_id": "string"
}
]
Request Type: GET
URI: /v1/lookup/user/{username}
Headers:
Authorization: Bearer token
Description: Get user metadata for a particular user.
Form Data:
username: Username for which you want user metadata.
Response Codes: 200, 401, 404, 422
Successful Response:
{
"user_id": "string",
"internal_name": "string",
"username": "string",
"tier": 0,
"created_at": "string",
"client_api_key": "string",
"expires_on": "string",
"parent_id": "string"
}
run
Request Type: POST
URI: /v1/run/model/{model_name}/{dataset_name}
Headers:
Authorization: Bearer token
Form Data:
model_name: The name of the model.
artifact_name: The name of the artifact.
dataset_name: The name of the dataset.
anomaly: Setting this parameter to True indicates that an isolation forest should be built for anomaly detection. If set to True, clustering will automatically be interpreted as False.
supervised: (Deprecated. This argument exists only for backward compatibility.) A boolean (True/False) indicating whether the model is supervised. For example, set this to False for an unsupervised model.
model_type: (optional) Model type of the model to be run. Possible values include: DeepNeuralNetwork, RandomForest, GradientBoosted.
Description: Run a model on a dataset and return the predictions/classifications/clusters found by the model.
Response Codes: 201, 400, 401, 403, 404, 408, 422
Successful Response:
{
"job_name": "name_of_job",
"artifact_name": "name_of_artifact"
}
train
Request Type: POST
URI: /v1/train/model
Headers:
Authorization: Bearer token
Description: Create a model trained on the dataset identified by dataset_names.
Parameter descriptions:
dataset_names: A list of dataset names to use for training. The maximum file size is 500 MB for unsupervised and NBM and 10 GB for supervised.
Note: Using only 1 dataset is currently supported.
job_name: The job name.
model_name: The string identifier of the model to be trained.
loss_fn_name: Specify the loss function. Possible values include: "CrossEntropy", "MSE", "BCE", "L1", "NLL", "BCEWithLogits", "SmoothL1". "CrossEntropy" can be used for classification data, while all others can be used for regression data. The default value is "CrossEntropy" if this field is left empty.
fitness_fn_name: Specify the fitness function. This represents the name of the fitness function used for evolution of the model population during training. Possible values include: "Accuracy", "F1", "R2", "MSE". "F1" is the default for classification and "R2" is the default for regression problems. "Accuracy" and "F1" are for classification only. "R2" and "MSE" are for regression only.
max_train_time (supervised only): Sets the training time for the model in 'HH:MM' format. Default value is 00:01.
max_epochs (unsupervised only): Expected input/type: numeric. Sets the training time for the model in epochs. Default value is 10.
recurrent: Expected input/type: True/False. Enables recurrent connections to be evolved in the model. This option can be useful for timeseries or sequential data.
Note: This option is automatically enabled if a datetime column is detected in the input data. This may result in slower model evolution.
impute: String alias that indicates how to fill in missing values in input data.
ALIAS | DESCRIPTION | COMPLEXITY
'ffill' (Default) | Forward Fill: Propagate values forward from one example into the missing cell of the next example. Might be useful for timeseries data, but also applicable for both numerical and categorical data. | Linear, Fast
'bfill' | Backward Fill: Propagate values backward from one example into the missing cell of the previous example. Might be useful for timeseries data, but also applicable for both numerical and categorical data. | Linear, Fast
'mean' | Mean Fill: Computes the mean value of all non-missing examples in a column to fill in missing examples. The result may or may not be interpretable in terms of the input space for categorical variables. | Linear, Fast
anomaly: Setting this parameter to True indicates that an isolation forest should be built for anomaly detection. If set to True, clustering will automatically be interpreted as False.
n_clusters (unsupervised only): Specifies the number of clusters to be used.
Note: If this value is not provided, the number of clusters will be heuristically determined.
anomaly_prior (unsupervised only): Expected input/type: between [0,1]. Significance level at which a point is defined as anomalous. This is only used for unsupervised problems if clustering is disabled.
lead_time_days (nbm only): Expected input/type: integer. Default value is 60. The number of days prior to failure when the behavior starts trending toward either abnormal behavior or failure.
nbm_window_size (nbm only): Expected input/type: integer. Default value is 256. The number of sample points to consider for each failure detection.
nbm (nbm only): Expected input/type: True/False. Default value is False. Set value to True for a normal behavioral model (NBM).
failure_dates (nbm only): Expected input/type: string. List of failure dates to use for the calculation. Currently, only a list of one date can be used in the query. Example date format: "07/01/2015"
recovery_dates (nbm only): Expected input/type: string. List of recovery dates to use for the calculation. Currently, only a list of one date can be used in the query. Example date format: "11/01/2015"
Payload:
{
"dataset_names": ["dataset_name1"],
"job_name": "my_job",
"model_name": "string",
"loss_fn_name": "CrossEntropy",
"fitness_fn_name": "Accuracy",
"max_train_time": "00:01",
"max_epochs": 0,
"recurrent": True,
"impute": "mean",
"drop": "no",
"feature_eng": "mi",
"feature_select": 1,
"outlier": "mad",
"imbalance": True,
"anomaly": False,
"n_clusters": 5,
"anomaly_prior": 0.01,
"lead_time_days": 60,
"nbm_window_size": 256,
"nbm": False,
"return_risk": True,
"failure_dates": ["string"],
"recovery_dates": ["string"],
"scaler": "MinMax",
"target_scaler": "MinMax"
}
Response Codes: 201, 400, 401, 403, 404, 408, 422
Successful Response:
{
"job_name": "nameofjob",
"model_name": "nameofmodel"
}
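As a sketch of how a client might assemble a supervised training payload, the function below checks a few fields against the values listed in the parameter descriptions and returns the JSON body. The function name build_train_payload is hypothetical. The regular expression is one assumed reading of the 'HH:MM' format, and only a subset of the payload fields is covered.

```python
import json
import re

# Values listed in the parameter descriptions of this guide.
LOSS_FUNCTIONS = {"CrossEntropy", "MSE", "BCE", "L1", "NLL", "BCEWithLogits", "SmoothL1"}
FITNESS_FUNCTIONS = {"Accuracy", "F1", "R2", "MSE"}


def build_train_payload(dataset_name, model_name, job_name=None,
                        loss_fn_name=None, fitness_fn_name=None,
                        max_train_time="00:01"):
    """Return a JSON string for a supervised POST /v1/train/model request."""
    if loss_fn_name is not None and loss_fn_name not in LOSS_FUNCTIONS:
        raise ValueError(f"unsupported loss_fn_name: {loss_fn_name!r}")
    if fitness_fn_name is not None and fitness_fn_name not in FITNESS_FUNCTIONS:
        raise ValueError(f"unsupported fitness_fn_name: {fitness_fn_name!r}")
    # Assumed reading of 'HH:MM': two digits, a colon, minutes 00-59.
    if not re.fullmatch(r"\d{2}:[0-5]\d", max_train_time):
        raise ValueError(f"max_train_time must look like 'HH:MM': {max_train_time!r}")

    payload = {
        "dataset_names": [dataset_name],  # only one dataset is currently supported
        "model_name": model_name,
        "max_train_time": max_train_time,
    }
    optional = {"job_name": job_name, "loss_fn_name": loss_fn_name,
                "fitness_fn_name": fitness_fn_name}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_train_payload("phone_data", "test_model",
                              loss_fn_name="MSE", fitness_fn_name="R2",
                              max_train_time="00:05"))
```

Omitting loss_fn_name and fitness_fn_name leaves the choice to the documented defaults instead of hard-coding them on the client.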
Request Type: PATCH
URI: /v1/train/model/{model_name}
Headers:
Authorization: Bearer token
Description: Resume training for a model on the dataset identified by dataset_names.
Parameter Descriptions:
dataset_names: A list of dataset names to use for training.
Note: Using only 1 dataset is currently supported.
job_name: The job name
max_train_time (supervised only): Sets the training time for the model in 'HH:MM' format. Default value is 00:01.
max_epochs (unsupervised only): Sets the training time for the model in epochs. Default value is 10.
Payload:
{
"dataset_names": ["dataset_name1"],
"job_name": "my_job",
"max_train_time": "00:01",
"max_epochs": 0
}
Response Codes: 201, 401, 403, 404, 408, 422
Successful Response:
{
"job_name": "nameofjob",
"model_name": "nameofmodel"
}
Request Type: DELETE
URI: /v1/train/model/{model_name}
Headers:
Authorization: Bearer token
Description: Delete a model.
Form Data:
model_name: Name of the model to delete.
Response Codes: 204, 400, 401, 403, 404, 408, 422
Successful Response: None
upload
Request Type: POST
URI: /v1/upload
Headers:
Authorization: Bearer token
Description: Upload a dataset.
Form Data:
dataset: a dataset file in a supported format (csv, h5)
dataset_name: the name for the uploaded dataset
Note: If not set, a guid will be provided
Response Codes: 201, 400, 401, 403, 408, 413, 422
Successful Response:
{
"dataset_name": "name_of_dataset"
}
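Because the upload carries a file, one plausible encoding is multipart/form-data. This guide does not name the encoding, so that choice is an assumption. The sketch below builds such a body by hand with the standard library. The function name encode_upload is hypothetical, and the field names dataset and dataset_name come from the Form Data list above.

```python
import uuid


def encode_upload(dataset_bytes, filename, dataset_name=None, boundary=None):
    """Return (content_type, body) for a multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    if dataset_name is not None:
        # Plain text field carrying the name for the uploaded dataset.
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="dataset_name"\r\n\r\n'
            f"{dataset_name}\r\n".encode("utf-8")
        )
    # File field carrying the raw dataset bytes (for example, a csv file).
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="dataset"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + dataset_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


if __name__ == "__main__":
    content_type, body = encode_upload(b"a,b\n1,2\n", "data.csv", "my_data")
    print(content_type)
    print(len(body), "bytes")
```

The returned content type would be sent as the Content-Type header alongside the Authorization header, with the body as the POST data.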
Request Type: DELETE
URI: /v1/upload/{dataset_name}
Headers:
Authorization: Bearer token
Description: Delete a dataset.
Form Data:
dataset_name: Name or identifier of dataset to delete.
Response Codes: 204, 401, 403, 404, 422
Successful Response: None
Revision Table
Version Date Notes
v 1.0 02-Feb-2018 First Release
v 1.1 15-Feb-2018 added types: supervised and ensembled
v 1.2(pre) 16-Mar-2018 added Status: Type= PATCH
v 1.2 27-Mar-2018 Added or changed:
• /v1/job/status/{job_name}
• /v1/lookup/user
• /v1/lookup/username/{username}
• /v1/train/model
• /v1/run/model/{model_name}/{dataset_name}
Name change: /v1/lookup/client to /v1/lookup/limits
v 1.3 23-May-2018 Added or changed:
• /v1/analyze/model/{model_name}
• /v1/analyze/model/predictions/{model_name}/{dataset_name}
• /v1/auth/email
• /v1/auth/password/reset
• /v1/auth/register
• /v1/train/model
• /v1/train/model/{model_name}
Name change: /v1/lookup/client to /v1/lookup/limits
v 1.3.1 14-Jun-2018 Edits to:
• /v1/job/status/
• /v1/download/artifacts
Model uses example
v 1.4 31-Jul-2018 Edits to:
• /v1/analyze/model/{model_name}
• /v1/analyze/data/{dataset_name}
• /v1/lookup/model
• /v1/lookup/model/{model_name}
• /v1/train/model
• /v1/train/model/{model_name}
v 1.5 15-Oct-2018 Added:
• /v1/clean/dataset/{dataset_name}
• /v1/download/dataset/{dataset_name}
• /v1/download/model/{model_name}
Edits to:
• /v1/analyze/data/{dataset_name}
• /v1/lookup/model
• /v1/train/model
• /v1/download/artifacts/{artifact_name}
v 1.6 16-Jan-2019 Added:
• /v1/lookup/model/{model_name}/population
Edits to:
• /v1/analyze/model/predictions/{model_name}/{dataset_name}
• /v1/analyze/model/{model_name}
• /v1/clean/dataset/{dataset_name}
• /v1/download/model/{model_name}
• /v1/train/model
• /v1/run/model/{model_name}/{dataset_name}