CodaLab and Submission Instructions
CS 224N: Programming Assignment #4
February 28, 2017
CodaLab Office Hours: Monday 1–4PM, Gates AI Lounge (2nd floor, outside room 251)
Contents
1 Introduction
2 Set up
3 Run your model
4 Submit your model
A Appendix
A.1 Build a Docker image for your code
A.2 Train your model and manage your experiments on CodaLab
1 Introduction
We will be accepting and evaluating your submissions on CodaLab, an online platform for
computational research built by Percy Liang and his team. With CodaLab, you can run your jobs
on a cluster, document and share your experiments, all while keeping track of full provenance, so
you can be a more efficient researcher. For the purposes of this assignment, using CodaLab to
manage your experiments is optional, but you will need to use CodaLab to submit your models
for evaluation.
To learn more about what CodaLab is and how it works, check out the CodaLab wiki at
https://github.com/codalab/codalab-worksheets/wiki.
2 Set up
Visit https://worksheets.codalab.org/ to sign up for an account on CodaLab.1 It is possible
to use CodaLab entirely from the browser, and in fact the web interface provides a great view of
your data and experiments. However, we also recommend installing the command-line interface
(CLI) on your machine to make uploading your submission easier:
pip install codalab-cli
You should now be able to use the cl command. Go ahead and create the "worksheet" where you
will place all of your code and data, and ensure it has the correct permissions in preparation for
submission. Make sure to replace GROUPNAME with your group name in the following commands.
Worksheets have a global namespace, so this will help avoid naming collisions.
1 Note that your name and username on this account will be public to the world; you are responsible
for your own privacy here.
cl work main::                       # connect and log in with your account
cl new cs224n-GROUPNAME              # create a new worksheet
cl work cs224n-GROUPNAME             # switch to your new worksheet
cl wperm . public none               # make your worksheet private (IMPORTANT)
cl wperm . cs224n-win17-staff read   # give us read access (IMPORTANT)
Since you are working in groups, you can create a group on CodaLab, add each of your members
to it, then give them all full access to the worksheet.
cl gnew cs224n-GROUPNAME             # create the group
cl uadd janedoe cs224n-GROUPNAME     # add janedoe as a member
cl uadd marymajor cs224n-GROUPNAME   # add marymajor as a member
# Give the group full access (i.e. "all") to the worksheet
cl wperm cs224n-GROUPNAME cs224n-GROUPNAME all
You can check out the tutorial on the CodaLab Wiki to familiarize yourself with the CLI:
https://github.com/codalab/codalab-worksheets/wiki/CLI-Basics.
3 Run your model
Note: We assume here that you have been developing and training your model on your local machine
or VM. These instructions go over how to now upload your model, and show us how to run your
code for the leaderboards. If you’d like to use more of CodaLab’s facilities to manage your
experiments from end to end, skip ahead to Section A.2 first.
Since you don’t have access to the test set, you will have to submit your code so that we can run
it for you. Of course, the tricky part is that we have to know how to run your code, to which you
may have made all sorts of modifications. Thankfully, you just need to upload your code (along
with the trained model and any other dependencies) to CodaLab and run qa_answer.py on an
example dataset, which we will call the "tiny dev set". Our leaderboard script will re-execute that
run, substituting the actual test set in for the tiny dev set.
The tiny dev set is available globally on CodaLab, and can be loaded into any of your runs by its
UUID 0x4870af2556994b0687a1927fcec66392.
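If you want to peek at this bundle before wiring it into a run, you can inspect it directly by its
UUID; this is a small sketch reusing the cl info and cl cat commands that appear later in this
section.

# Show metadata for the tiny dev set bundle
cl info 0x4870af2556994b0687a1927fcec66392
# Print its contents (in the run below it is mounted as dev.json)
cl cat 0x4870af2556994b0687a1927fcec66392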
cd path/to/assignment4
# Train your model
python code/train.py
# Make sure you're on your private pa4 worksheet
cl work main::cs224n-GROUPNAME
# Upload your latest code, data, and model parameters
cl upload code
cl upload data
cl upload train
# To see your newly uploaded bundles and inspect their contents (you can also
# go to https://worksheets.codalab.org, click on My Dashboard, and then
# cs224n-GROUPNAME to see your worksheet):
cl ls
cl cat data
# Run your prediction code: this loads your code, model parameters, data,
# and the tiny dev set into a sandbox directory, inside a container based
# on the sckoo/cs224n-squad:v4 Docker image.
cl run --name run-predict --request-docker-image sckoo/cs224n-squad:v4 \
  :code :data :train dev.json:0x4870af2556994b0687a1927fcec66392 \
  'python code/qa_answer.py --dev_path dev.json'
You can check the status and results of the run with one or more of these commands:
# Look at the status of the run
cl info --verbose run-predict
# Blocks until the job is complete, while tailing the output
cl wait --tail run-predict
# Inspect the resulting files
cl cat run-predict                       # list the files
cl cat run-predict/stderr                # inspect stderr
cl cat run-predict/dev-prediction.json   # read a specific file
You may need to modify some of the commands above, in particular the run command, depending
on how you built your model. If you built your own Docker image,2 for example, just replace
sckoo/cs224n-squad:v4 with the tag of your own image. The most important part is that you
create a run of qa_answer.py on the tiny dev set, then tag the resulting run so that our leaderboard
script knows what to look for.
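For example, if you pushed an image of your own under the placeholder tag
YOUR_DOCKERHUB_ID/squad:v1 (see Section A.1), the run would look the same with only the image
swapped:

cl run --name run-predict --request-docker-image YOUR_DOCKERHUB_ID/squad:v1 \
  :code :data :train dev.json:0x4870af2556994b0687a1927fcec66392 \
  'python code/qa_answer.py --dev_path dev.json'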
4 Submit your model
Submitting your project for the leaderboards will simply involve tagging your prediction run-
bundles with the appropriate tag. Our leaderboard script will then be able to find your bundle
and re-run it with the corresponding dataset.
To make sure everything works, you can test your submission on the sanity-check leaderboard,
which simply re-runs your prediction run-bundle on the tiny dev set. Just tag the bundle with
cs224n-win17-submit-sanity-check:
cl edit run-predict -T cs224n-win17-submit-sanity-check
After you submit, go to the leaderboard to check out your results. (We will release the leaderboard
URL on Piazza when it’s ready.)
The tags for the other leaderboards will be made available on Piazza when they are ready. A
leaderboard may be throttled (e.g. the final leaderboard for the test set will only allow one
submission total), so make sure that everything worked as expected with the sanity-check before
you submit to other leaderboards.
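Before tagging a run for a throttled leaderboard, it is worth double-checking that the run you are
about to tag actually succeeded and produced sensible output, for example by reusing the inspection
commands from Section 3:

# Confirm the run finished and inspect its predictions before tagging it
cl info --verbose run-predict
cl cat run-predict/dev-prediction.json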
If you have any problems, submit a post on Piazza and tag it with “hw4”. Make sure to include the
UUID of the bundle you’re having problems with (e.g., 0x4870af2556994b0687a1927fcec66392),
or else we won’t be able to help you. An even better option is to come to CodaLab office hours,
which are normally held Mondays from 1:00PM to 4:00PM in the AI Lounge (the area outside
Professor Chris Manning’s office on the second floor of Gates).
A Appendix
A.1 Build a Docker image for your code
Docker images provide a convenient way to package all of the dependencies that you need for
running your code. When you submit a job on CodaLab, you can specify a Docker image to
use. CodaLab will download that Docker image, then start a new Docker container based on that
image, and execute your code inside the new container. We’ve already built a basic Docker image
for you that contains all the dependencies required by the starter code, including TensorFlow.
It is available on DockerHub as sckoo/cs224n-squad:v4. You can find the specification for the
image in your starter code at assignment4/code/docker/Dockerfile. A Dockerfile is a simple
text-based specification that describes how a Docker image should be built.3
However, as your model increases in complexity, you may want to include other dependencies in
the image, so that we can still run your code. If you want to learn more about Docker and have
more control over how your images are built, we recommend modifying the Dockerfile we gave you
and building your own images from scratch. Otherwise, CodaLab has some basic facilities to let
you easily build and test your own images.
First, you will need to install Docker. Docker runs natively on Linux (installation instructions:
https://docs.docker.com/engine/installation/linux/), but they have recently released an
excellent set of tools called Docker for Mac that runs a Linux kernel in the native macOS
hypervisor (installation instructions: https://docs.docker.com/docker-for-mac/install/), if
you're developing on a Mac for some reason.
2 See Section A.1.
3 You can learn more about how Docker images and Dockerfiles work at
https://www.digitalocean.com/community/tutorials/docker-explained-using-dockerfiles-to-automate-building-of-images
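Once Docker is installed, a quick way to check that it is working (a generic sanity check, not
specific to this assignment) is:

docker --version        # prints the installed Docker version
docker run hello-world  # pulls a tiny test image and prints a greeting on success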
Second, you will need to head over to https://hub.docker.com/ and create a DockerHub account.
DockerHub, as its name suggests, is a public repository for Docker images. When you “push” an
image that you’ve created to DockerHub, it can be shared and used by other people, such as the
CodaLab servers.
Now let’s build our image, based on the sckoo/cs224n-squad:v4 image we gave you. We will
start a container from that image, loading your code, data, and model directories into the
container.
cd path/to/assignment4
docker pull sckoo/cs224n-squad:v4   # download the image
cl edit-image --request-docker-image sckoo/cs224n-squad:v4 :code :data :train
You will now be given a Bash shell inside the new container. You can poke around it to see that
your files have been loaded into the working directory. Now let’s try running your code inside the
container.
====
Entering container 701c0bfa
Once you are happy with the changes, please exit the container (ctrl-D)
and commit your changes to a new image by running:
    cl commit-image 701c0bfa [image-tag]
====
root@701c0bfa8a9d:~# python code/qa_answer.py
Traceback (most recent call last):
  File "dev/qa_answer.py", line 30, in <module>
    import marshmallow
ImportError: No module named marshmallow
If anything fails due to a missing dependency, go ahead and install it inside the container,
whether that's with apt-get or pip.
root@701c0bfa8a9d:~# pip install marshmallow
Collecting marshmallow
  Downloading marshmallow-2.13.0-py2.py3-none-any.whl (45kB)
    100% |################################| 51kB 499kB/s
Installing collected packages: marshmallow
Successfully installed marshmallow-2.13.0
Just repeat this until your code finally works, then exit out of the container.
root@701c0bfa8a9d:~# exit
exit
====
Exited from container 701c0bfa
If you are happy with the changes, please commit your changes to a new
image by running:
    cl commit-image 701c0bfa [image-tag]
====
Now you can commit your image, tag it with a name of your choice, then push it to Docker
Hub.
cl commit-image 701c0bfa YOUR_DOCKERHUB_ID/squad:v1
cl push-image YOUR_DOCKERHUB_ID/squad:v1
You can now refer to this Docker image in CodaLab by its tag.
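Alternatively, if you prefer to modify the provided Dockerfile directly and build outside of
CodaLab, a rough sketch of the standard Docker workflow looks like the following;
YOUR_DOCKERHUB_ID/squad:v1 is again just a placeholder tag, and the build assumes the Dockerfile
in assignment4/code/docker is self-contained:

cd path/to/assignment4/code/docker
docker login                                   # authorize pushes to your DockerHub account
docker build -t YOUR_DOCKERHUB_ID/squad:v1 .   # build an image from the (modified) Dockerfile
docker push YOUR_DOCKERHUB_ID/squad:v1         # publish it so CodaLab workers can pull it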
A.2 Train your model and manage your experiments on CodaLab
Note: The public CodaLab worker nodes that execute your jobs by default do not have GPUs. As
you know, training a deep model involves a huge amount of matrix computation that could take a
prohibitive amount of time to complete on CPUs alone. Luckily, you can actually run a worker
from your own VMs! If you choose to manage your training on CodaLab, you must set up a worker
daemon on your GPU instance by following the instructions at
https://github.com/codalab/codalab-worksheets/wiki/Execution#running-your-own-worker.
When you run your programs on CodaLab, it keeps track of full provenance so that you always
know how you got any set of results. For example, you can train your model with many different
hyperparameter settings all at the same time, then easily compare the results without ever losing
track of the hyperparameters that gave you the best score.
First, set things up and upload your code.
cd path/to/assignment4
cl work main::cs224n-GROUPNAME   # make sure you're on your private pa4 worksheet
cl upload code
Then prepare the preprocessed data.
# A) Run data preprocessing on CodaLab, using a globally-available
#    download directory (with UUID 0x2afcbeeda5074afa9606d5139e656d20)
#    that we've already prepared for you on the server.
cl run --name run-preprocess --request-docker-image sckoo/cs224n-squad:v4 \
  :code download:0x2afcbeeda5074afa9606d5139e656d20 \
  'code/get_started.sh'
# B) OR just run the data preprocessing on your own machine, and
#    upload the resulting data directory.
code/get_started.sh
cl upload data
Now run your model training!
# A) If you ran preprocessing on CodaLab.
#    Note that we reference a directory inside the previous run-bundle.
cl run --name run-train --request-docker-image sckoo/cs224n-squad:v4 \
  :code data:run-preprocess/data \
  'python code/train.py --epochs 5 --awesomeness 2.7'
# B) Or if you just uploaded your data directory.
cl run --name run-train --request-docker-image sckoo/cs224n-squad:v4 \
  :code :data \
  'python code/train.py --epochs 5 --awesomeness 2.7'
You can run this last run command multiple times, each with a different set of command-line
arguments, and they will all be scheduled to run. You can then visit your worksheet on the website4
to track the progress of the runs. You can modify the tables to display your hyperparameters as
custom columns. By modifying your code to output a TSV or JSON file containing the training loss
and accuracy, you can even make your worksheet display plots of your training progress. Learn more
about the worksheet markdown syntax at https://github.com/codalab/codalab-worksheets/
wiki/Worksheet-Markdown.
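For example, returning to the idea of launching several runs at once, here is a rough sketch of a
small hyperparameter sweep; the --epochs and --awesomeness arguments are just the placeholder
flags used above:

# Launch one training run per hyperparameter setting; CodaLab schedules them all.
for epochs in 3 5 10; do
  cl run --name run-train-ep$epochs --request-docker-image sckoo/cs224n-squad:v4 \
    :code :data \
    "python code/train.py --epochs $epochs --awesomeness 2.7"
done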
There are many other features in CodaLab that are not covered here. You can even build up a
pipeline of chained dependent jobs, then easily re-run that pipeline while just substituting one
of the components. If you want to learn more, you can find more detailed documentation at
https://github.com/codalab/codalab-worksheets/wiki/CLI-Reference.
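As a concrete illustration of that re-running idea, CodaLab's mimic command re-executes the runs
downstream of a bundle while substituting in a new input. The exact invocation below is a sketch
from memory, and run-preprocess-v2 is a hypothetical replacement bundle, so check cl mimic --help
or the CLI reference above for the precise arguments:

# Hypothetical: re-run everything that depended on run-preprocess,
# with run-preprocess-v2 substituted in its place.
cl mimic run-preprocess run-preprocess-v2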
4 You can find your worksheet by looking under your worksheets list on your dashboard:
https://worksheets.codalab.org/rest/worksheets/?name=dashboard