SAP Machine Learning Challenge

Please read the following instructions carefully

Which Novel Do I Belong To?

In this challenge, you are tasked with training a machine learning model that classifies a given line of text as belonging to one of the following 12 novels:

0. alice_in_wonderland

1. dracula

2. dubliners

3. great_expectations

4. hard_times

5. huckleberry_finn

6. les_miserable

7. moby_dick

8. oliver_twist

9. peter_pan

10. tale_of_two_cities

11. tom_sawyer

You are provided with a zip file (offline_challenge.zip) containing three text files:

xtrain.txt

ytrain.txt

xtest.txt

As you can see in the train files, we have applied an encoding to the text, such that each character has a deterministic mapping. Each line in xtrain.txt corresponds to a label in ytrain.txt.
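The line-to-label correspondence can be read into Python as two parallel lists. This is a minimal sketch which assumes that ytrain.txt holds one integer label per line, in the same order as the lines of xtrain.txt; the helper name `load_pairs` is hypothetical.

```python
from pathlib import Path


def load_pairs(x_path, y_path):
    """Return (lines, labels) read from the encoded text file and its label file."""
    # Assumption: one sample per line in x_path, one integer label per line in y_path.
    lines = Path(x_path).read_text(encoding="utf-8").splitlines()
    labels = [int(tok) for tok in Path(y_path).read_text(encoding="utf-8").split()]
    if len(lines) != len(labels):
        raise ValueError(f"{len(lines)} text lines but {len(labels)} labels")
    return lines, labels
```

A call such as `load_pairs("xtrain.txt", "ytrain.txt")` would then give the training inputs and targets; the length check is there to catch a misaligned pair of files early.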

Example:

line:

satwamuluhqgulamlrmvezuhqvkrpmletwulcitwskuhlemvtwamuluhiwiwenuhlrvimvqvkruh

ulenamuluhqgqvtwvimviwuhtwamuluhulqvkrenamcitwuhvipmpmqvuhskiwkrpmdfuhlrvimv

skvikrpmqvuhskmvgzenleuhqvmvamuluhulenamuluhqvletwtwvipmpmgzleenamuhtwamuluh

twletwdfuhiwkrxeleentwxeuhpmqvuhtwiwmvamdfuhpkeztwamuluhvimvuhqvtwmkpmpmlelr

uhgztwtwskuhtwlrkrpmlruhpmuluhqvenuhtwyplepmxeuhenuhamypkrqvuhamulmvdfuhqvsk

entwamletwlrlrpmiwuhtwamul

label: 7

Your Task

You are tasked with developing a deep learning model that predicts the novel id of a given line of text. We prefer Python as the programming language and TensorFlow/Keras as the deep learning framework.
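Before a line can be fed to a deep learning model, it has to be turned into numbers. One possible preprocessing step is sketched below: each character is mapped to an integer id and every line is truncated or padded to a fixed length. The helper names, the use of id 0 for padding and unseen characters, and the choice of character-level tokens are all assumptions for illustration; the challenge does not prescribe any of them.

```python
def build_vocab(lines):
    """Map every character seen in the given lines to an id starting at 1."""
    chars = sorted({ch for line in lines for ch in line})
    return {ch: i + 1 for i, ch in enumerate(chars)}  # id 0 is reserved for padding


def encode(line, vocab, max_len):
    """Convert a line to exactly max_len ids, truncating or right-padding with 0."""
    ids = [vocab.get(ch, 0) for ch in line[:max_len]]  # unseen characters map to 0
    return ids + [0] * (max_len - len(ids))
```

The resulting integer sequences could then serve as input to an embedding layer, with a 12-way output matching the 12 novel ids.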

Submission

As part of your submission, please include:

Your model's predictions on xtest.txt (in the same format as ytrain.txt). This file must be named ytest.txt.

Source code as a .zip file (we prefer Jupyter notebooks; the size limit is 10 MB).
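The prediction file mentioned above could be produced with a small helper like the following. It is a sketch under the assumption that ytrain.txt stores one integer label per line, so ytest.txt is written the same way; the function name `write_predictions` is hypothetical.

```python
def write_predictions(labels, path="ytest.txt"):
    """Write one integer label per line to the given path."""
    # Assumption: ytrain.txt uses one integer label (0-11) per line.
    with open(path, "w", encoding="utf-8") as fh:
        for label in labels:
            if not 0 <= int(label) <= 11:
                raise ValueError(f"label {label} is outside the 12 novel ids")
            fh.write(f"{int(label)}\n")
```

The range check guards against accidentally writing class probabilities or shifted indices instead of novel ids.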

Evaluation

Your submission will be evaluated based on the following criteria:

Test set accuracy (80%)

Explanation/documentation (10%)

Implementation (10%)

Contents of Source Code

In your source code, please include the following:

Implementation of the model

Clear documentation of relevant parts of the code

Training & validation accuracies

Explanation of strategy, methodology, and algorithms employed

The last point is especially important, as we want to assess your reasoning and approach to this problem.
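To report the training and validation accuracies requested above, part of the labelled data has to be held out from training. The following is one simple way to do that, not a prescribed method: the 10% validation fraction, the fixed seed, and both helper names are assumptions.

```python
import random


def holdout_split(n_samples, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx) as two disjoint lists of sample indices."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    n_val = int(n_samples * val_fraction)
    return idx[n_val:], idx[:n_val]


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Computing `accuracy` once on the training indices and once on the held-out indices gives the two numbers to report.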

Good luck!
