CodaLab and Submission Instructions
CS 224N: Programming Assignment #4
February 28, 2017
CodaLab Office Hours: Monday 1–4PM, Gates AI Lounge (2nd floor, outside room 251)
Contents
1 Introduction
2 Set up
3 Run your model
4 Submit your model
A Appendix
A.1 Build a Docker image for your code
A.2 Train your model and manage your experiments on CodaLab
1 Introduction
We will be accepting and evaluating your submissions on CodaLab, an online platform for
computational research built by Percy Liang and his team. With CodaLab, you can run your jobs
on a cluster, document and share your experiments, all while keeping track of full provenance, so
you can be a more efficient researcher. For the purposes of this assignment, using CodaLab to
manage your experiments is optional, but you will need to use CodaLab to submit your models
for evaluation.
To learn more about what CodaLab is and how it works, check out the CodaLab wiki at
https://github.com/codalab/codalab-worksheets/wiki.
2 Set up
Visit https://worksheets.codalab.org/ to sign up for an account on CodaLab.1 It is possible
to use CodaLab entirely from the browser, and in fact the web interface provides a great view of
your data and experiments. However, we also recommend installing the command-line interface
(CLI) on your machine to make uploading your submission easier:
pip install codalab-cli
You should now be able to use the cl command. Go ahead and create the "worksheet" where you
will place all of your code and data, and ensure it has the correct permissions in preparation for
submission. Make sure to replace GROUPNAME with your group name in the following commands.
Worksheets have a global namespace, so this will help avoid naming collisions.
1 Note that your name and username on this account will be public to the world; you are responsible
for your own privacy here.
cl work main::                       # connect and log in with your account
cl new cs224n-GROUPNAME              # create a new worksheet
cl work cs224n-GROUPNAME             # switch to your new worksheet
cl wperm . public none               # make your worksheet private (IMPORTANT)
cl wperm . cs224n-win17-staff read   # give us read access (IMPORTANT)
Since you are working in groups, you can create a group on CodaLab, add each of your members
to it, then give them all full access to the worksheet.
cl gnew cs224n-GROUPNAME             # create the group
cl uadd janedoe cs224n-GROUPNAME     # add janedoe as a member
cl uadd marymajor cs224n-GROUPNAME   # add marymajor as a member
# Give the group full access (i.e. "all") to the worksheet
cl wperm cs224n-GROUPNAME cs224n-GROUPNAME all
You can check out the tutorial on the CodaLab Wiki to familiarize yourself with the CLI:
https://github.com/codalab/codalab-worksheets/wiki/CLI-Basics.
3 Run your model
Note: We assume here that you have been developing and training your model on your local machine
or VM. These instructions go over how to now upload your model, and show us how to run your
code for the leaderboards. If you’d like to use more of CodaLab’s facilities to manage your
experiments from end to end, skip ahead to Section A.2 first.
Since you don’t have access to the test set, you will have to submit your code so that we can run
it for you. Of course, the tricky part is that we have to know how to run your code, to which you
may have made all sorts of modifications. Thankfully, you just need to upload your code (along
with the trained model and any other dependencies) to CodaLab and run qa_answer.py on an
example dataset, which we will call the "tiny dev set". Our leaderboard script will re-execute that
run, substituting the actual test set in for the tiny dev set.
The tiny dev set is available globally on CodaLab, and can be loaded into any of your runs by its
UUID 0x4870af2556994b0687a1927fcec66392.
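If you want to peek at this bundle before wiring it into a run, you can inspect it directly by its
UUID; this is a small sketch reusing the cl info and cl cat commands that appear later in this
section.

# Show metadata for the tiny dev set bundle
cl info 0x4870af2556994b0687a1927fcec66392
# Print its contents (in the run below it is mounted as dev.json)
cl cat 0x4870af2556994b0687a1927fcec66392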
cd path/to/assignment4
# Train your model
python code/train.py
# Make sure you're on your private pa4 worksheet
cl work main::cs224n-GROUPNAME
# Upload your latest code, data, and model parameters
cl upload code
cl upload data
cl upload train
# To see your newly uploaded bundles and inspect their contents (you can also
# go to https://worksheets.codalab.org, click on My Dashboard, and then
# cs224n-GROUPNAME to see your worksheet):
cl ls
cl cat data
# Run your prediction code: this loads your code, model parameters, data,
# and the tiny dev set into a sandbox directory, inside a container based
# on the sckoo/cs224n-squad:v4 Docker image.
cl run --name run-predict --request-docker-image sckoo/cs224n-squad:v4 \
  :code :data :train dev.json:0x4870af2556994b0687a1927fcec66392 \
  'python code/qa_answer.py --dev_path dev.json'
You can check the status and results of the run with one or more of these commands:
# Look at the status of the run
cl info --verbose run-predict
# Blocks until the job is complete, while tailing the output
cl wait --tail run-predict
# Inspect the resulting files
cl cat run-predict                       # list the files
cl cat run-predict/stderr                # inspect stderr
cl cat run-predict/dev-prediction.json   # read a specific file
You may need to modify some of the commands above, in particular the run command, depending
on how you built your model. If you built your own Docker image,2 for example, just replace
sckoo/cs224n-squad:v4 with the tag of your own image. The most important part is that you
create a run of qa_answer.py on the tiny dev set, then tag the resulting run so that our leaderboard
script knows what to look for.
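For example, if you pushed an image of your own under the placeholder tag
YOUR_DOCKERHUB_ID/squad:v1 (see Section A.1), the run would look the same with only the image
swapped:

cl run --name run-predict --request-docker-image YOUR_DOCKERHUB_ID/squad:v1 \
  :code :data :train dev.json:0x4870af2556994b0687a1927fcec66392 \
  'python code/qa_answer.py --dev_path dev.json'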
4 Submit your model
Submitting your project for the leaderboards will simply involve tagging your prediction run-
bundles with the appropriate tag. Our leaderboard script will then be able to find your bundle
and re-run it with the corresponding dataset.
To make sure everything works, you can test your submission on the sanity-check leaderboard,
which simply re-runs your prediction run-bundle on the tiny dev set. Just tag the bundle with
cs224n-win17-submit-sanity-check:
cl edit run-predict -T cs224n-win17-submit-sanity-check
After you submit, go to the leaderboard to check out your results. (We will release the leaderboard
URL on Piazza when it’s ready.)
The tags for the other leaderboards will be made available on Piazza when they are ready. A
leaderboard may be throttled (e.g. the final leaderboard for the test set will only allow one
submission total), so make sure that everything worked as expected with the sanity-check before
you submit to other leaderboards.
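Before tagging a run for a throttled leaderboard, it is worth double-checking that the run you are
about to tag actually succeeded and produced sensible output, for example by reusing the inspection
commands from Section 3:

# Confirm the run finished and inspect its predictions before tagging it
cl info --verbose run-predict
cl cat run-predict/dev-prediction.json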
If you have any problems, submit a post on Piazza and tag it with “hw4”. Make sure to include the
UUID of the bundle you’re having problems with (e.g., 0x4870af2556994b0687a1927fcec66392),
or else we won’t be able to help you. An even better option is to come to CodaLab office hours,
which are normally held Mondays from 1:00PM to 4:00PM in the AI Lounge (the area outside
Professor Chris Manning’s office on the second floor of Gates).
A Appendix
A.1 Build a Docker image for your code
Docker images provide a convenient way to package all of the dependencies that you need for
running your code. When you submit a job on CodaLab, you can specify a Docker image to
use. CodaLab will download that Docker image, then start a new Docker container based on that
image, and execute your code inside the new container. We’ve already built a basic Docker image
for you that contains all the dependencies required by the starter code, including TensorFlow.
It is available on DockerHub as sckoo/cs224n-squad:v4. You can find the specification for the
image in your starter code at assignment4/code/docker/Dockerfile. A Dockerfile is a simple
text-based specification that describes how a Docker image should be built.3
However, as your model increases in complexity, you may want to include other dependencies in
the image, so that we can still run your code. If you want to learn more about Docker and have
more control over how your images are built, we recommend modifying the Dockerfile we gave you
and building your own images from scratch. Otherwise, CodaLab has some basic facilities to let
you easily build and test your own images.
First, you will need to install Docker. Docker runs natively on Linux (installation instructions:
https://docs.docker.com/engine/installation/linux/), but they have recently released an
excellent set of tools called Docker for Mac that runs a Linux kernel in the native macOS
hypervisor (installation instructions: https://docs.docker.com/docker-for-mac/install/), if
you're developing on a Mac for some reason.
2 See Section A.1.
3 You can learn more about how Docker images and Dockerfiles work at
https://www.digitalocean.com/community/tutorials/docker-explained-using-dockerfiles-to-automate-building-of-images
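Once Docker is installed, a quick way to check that it is working (a generic sanity check, not
specific to this assignment) is:

docker --version        # prints the installed Docker version
docker run hello-world  # pulls a tiny test image and prints a greeting on success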
Second, you will need to head over to https://hub.docker.com/ and create a DockerHub account.
DockerHub, as its name suggests, is a public repository for Docker images. When you “push” an
image that you’ve created to DockerHub, it can be shared and used by other people, such as the
CodaLab servers.
Now let’s build our image, based on the sckoo/cs224n-squad:v4 image we gave you. We will
start a container from that image, loading your code, data, and model directories into the
container.
cd path/to/assignment4
docker pull sckoo/cs224n-squad:v4   # download the image
cl edit-image --request-docker-image sckoo/cs224n-squad:v4 :code :data :train
You will now be given a Bash shell inside the new container. You can poke around it to see that
your files have been loaded into the working directory. Now let’s try running your code inside the
container.
====
Entering container 701c0bfa
Once you are happy with the changes, please exit the container (ctrl-D)
and commit your changes to a new image by running:
    cl commit-image 701c0bfa [image-tag]
====
root@701c0bfa8a9d:~# python code/qa_answer.py
Traceback (most recent call last):
  File "dev/qa_answer.py", line 30, in <module>
    import marshmallow
ImportError: No module named marshmallow
If anything fails due to a missing dependency, go ahead and install it inside the container,
whether that's with apt-get or pip.
root@701c0bfa8a9d:~# pip install marshmallow
Collecting marshmallow
  Downloading marshmallow-2.13.0-py2.py3-none-any.whl (45kB)
    100% |################################| 51kB 499kB/s
Installing collected packages: marshmallow
Successfully installed marshmallow-2.13.0
Just repeat this until your code finally works, then exit out of the container.
root@701c0bfa8a9d:~# exit
exit
====
Exited from container 701c0bfa
If you are happy with the changes, please commit your changes to a new
image by running:
    cl commit-image 701c0bfa [image-tag]
====
Now you can commit your image, tag it with a name of your choice, then push it to Docker
Hub.
cl commit-image 701c0bfa YOUR_DOCKERHUB_ID/squad:v1
cl push-image YOUR_DOCKERHUB_ID/squad:v1
You can now refer to this Docker image in CodaLab by its tag.
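Alternatively, if you prefer to modify the provided Dockerfile directly and build outside of
CodaLab, a rough sketch of the standard Docker workflow looks like the following;
YOUR_DOCKERHUB_ID/squad:v1 is again just a placeholder tag, and the build assumes the Dockerfile
in assignment4/code/docker is self-contained:

cd path/to/assignment4/code/docker
docker login                                   # authorize pushes to your DockerHub account
docker build -t YOUR_DOCKERHUB_ID/squad:v1 .   # build an image from the (modified) Dockerfile
docker push YOUR_DOCKERHUB_ID/squad:v1         # publish it so CodaLab workers can pull it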
A.2 Train your model and manage your experiments on CodaLab
Note: The public CodaLab worker nodes that execute your jobs by default do not have GPUs. As
you know, training a deep model involves a huge amount of matrix computation that could take a
prohibitive amount of time to complete on CPUs alone. Luckily, you can actually run a worker
from your own VMs! If you choose to manage your training on CodaLab, you must set up a worker
daemon on your GPU instance by following the instructions at
https://github.com/codalab/codalab-worksheets/wiki/Execution#running-your-own-worker.
When you run your programs on CodaLab, it keeps track of full provenance so that you always
know how you got any set of results. For example, you can train your model with many different
hyperparameter settings all at the same time, then easily compare the results without ever losing
track of the hyperparameters that gave you the best score.
First, set things up and upload your code.
cd path/to/assignment4
cl work main::cs224n-GROUPNAME   # make sure you're on your private pa4 worksheet
cl upload code
Then prepare the preprocessed data.
# A) Run data preprocessing on CodaLab, using a globally-available
#    download directory (with UUID 0x2afcbeeda5074afa9606d5139e656d20)
#    that we've already prepared for you on the server.
cl run --name run-preprocess --request-docker-image sckoo/cs224n-squad:v4 \
  :code download:0x2afcbeeda5074afa9606d5139e656d20 \
  'code/get_started.sh'
# B) OR just run the data preprocessing on your own machine, and
#    upload the resulting data directory.
code/get_started.sh
cl upload data
Now run your model training!
# A) If you ran preprocessing on CodaLab.
#    Note that we reference a directory inside the previous run-bundle.
cl run --name run-train --request-docker-image sckoo/cs224n-squad:v4 \
  :code data:run-preprocess/data \
  'python code/train.py --epochs 5 --awesomeness 2.7'
# B) Or if you just uploaded your data directory.
cl run --name run-train --request-docker-image sckoo/cs224n-squad:v4 \
  :code :data \
  'python code/train.py --epochs 5 --awesomeness 2.7'
You can run this last run command multiple times, each with a different set of command-line
arguments, and they will all be scheduled to run. You can then visit your worksheet on the website4
to track the progress of the runs. You can modify the tables to display your hyperparameters as
custom columns. By modifying your code to output a TSV or JSON file containing the training loss
and accuracy, you can even make your worksheet display plots of your training progress. Learn more
about the worksheet markdown syntax at https://github.com/codalab/codalab-worksheets/
wiki/Worksheet-Markdown.
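For example, returning to the idea of launching several runs at once, here is a rough sketch of a
small hyperparameter sweep; the --epochs and --awesomeness arguments are just the placeholder
flags used above:

# Launch one training run per hyperparameter setting; CodaLab schedules them all.
for epochs in 3 5 10; do
  cl run --name run-train-ep$epochs --request-docker-image sckoo/cs224n-squad:v4 \
    :code :data \
    "python code/train.py --epochs $epochs --awesomeness 2.7"
done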
There are many other features in CodaLab that are not covered here. You can even build up a
pipeline of chained dependent jobs, then easily re-run that pipeline while just substituting one
of the components. If you want to learn more, you can find more detailed documentation at
https://github.com/codalab/codalab-worksheets/wiki/CLI-Reference.
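As a concrete illustration of that re-running idea, CodaLab's mimic command re-executes the runs
downstream of a bundle while substituting in a new input. The exact invocation below is a sketch
from memory, and run-preprocess-v2 is a hypothetical replacement bundle, so check cl mimic --help
or the CLI reference above for the precise arguments:

# Hypothetical: re-run everything that depended on run-preprocess,
# with run-preprocess-v2 substituted in its place.
cl mimic run-preprocess run-preprocess-v2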
4 You can find your worksheet by looking under your worksheets list on your dashboard:
https://worksheets.codalab.org/rest/worksheets/?name=dashboard