Instructions

User Manual:

Open the PDF directly: View PDF .
Page Count: 19

Judgmental Eye

In this lengthy exercise, you’ll tie together a number of important concepts:

HTML

CSS

Forms

Python processing of data files

Flask servers

Relational databases

ORMs / SQLAlchemy

In addition, we’ll introduce a machine learning algorithm in the final further study.

The overall goal of the exercise is to build a website where users can login and add or update ratings of

films. If time allows, in the further study we will add another feature, which predicts how the users will

rate films. This prediction is based on analysis of how other users rated that film, depending on how

similar those other users are to the user. Lastly, we’ll add

The Eye

, a system user with terrible taste in

movies who will berate users for their movie ratings – again, using machine learning to decide how

The

Eye

would rate things.

Object Relational Mappers

If you worked on the

project-tracker

exercise, you saw how we used classes to model objects that

ultimately were stored in a database. In this project, we’ll take this idea further, and introduce a tool

called an

ORM

https://en.wikipedia.org/wiki/Object-relational_mapping

In 2003,

Martin Fowler

https://en.wikipedia.org/wiki/Martin_Fowler

documented a technique called

“Active Record,” a scheme which can be diagrammed like so:

Database Concept

Object-Oriented Concept

Table

Class

Column

Attribute

Row

Instance

In essence, a table definition is more or less equivalent to a class definition. Each column can be thought

of as an attribute or property of that class. Each row is also analogous to an instantiation of the class.

The analogy isn’t perfect, but it serves us well enough. The analogy allows us to construct so!ware that,

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

1 of 19 10/29/18, 2:25 PM

through the magic of “introspection” (self-examination), can automatically write and execute SQL

queries for us

without

the programmer having to stop and think about the SQL required to accomplish a

task.

Here’s an example. Given the following SQL:

CREATE

TABLE

users

(

user_id

SERIAL

PRIMARY

KEY

NOT

NULL

VARCHAR

(

password

VARCHAR

(

));

Here is how you might set up the class:

class

User

(

object

"""A user of our website; stored in a database."""

__tablename__

"users"

def

__init__

(

self

user_id

password

"""Create a user, given id, email, and password."""

self

user_id

self

password

@classmethod

def

get_by_id

(

cls

user_id

"""Get a user from database by ID and return instance."""

QUERY

"""SELECT user_id, email, password

FROM users WHERE user_id = :id"""

cursor

session

execute

(

QUERY

{

'id'

user_id

})

user_id

password

cursor

fetchone

()

return

cls

(

user_id

password

)

def

change_password

(

self

password

"""Change password for the user."""

QUERY

(

"UPDATE users SET password = :password "

"WHERE user_id = :id"

)

session

execute

(

QUERY

{

'password'

password

'id'

self

user_id

})

session

commit

()

And, with that class, here’s how you would update a user’s password:

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

2 of 19 10/29/18, 2:25 PM

jessica

User

get_by_id

(

)

jessica

change_password

(

"my-new-password"

)

There’s nothing wrong here yet, but if we wanted to be able to update the user’s email address as well,

we start having to write significant amounts of repetitive code.

An ORM provides us with a slightly di"erent workflow. Instead of writing a bunch of code to handle SQL,

it instead

peeks

at your class definitions (“introspects it”) and uses the information gathered to

generate

appropriate SQL. Our

User

class definition changes:

Given this, our ORM can

deduce

the original schema we generated earlier. The generation of the SQL

schema can be le! to the so!ware. Furthermore, our previous example of changing a user’s password

can be done as follows:

jessica

User

query

get

(

)

jessica

password

"my-new-password"

session

commit

()

The

change_password()

function no longer exists in that form; its existence is obviated as we can access

columns directly as if they were object attributes, as long as we

commit

the database a!er every

modification to an object. Overall, the amount of overhead code required to store data for an app

dropped precipitously when ORMs first appeared, allowing lone programmers to single-handedly build a

full-stack application in shorter time periods.

For this project, we will continue using

SQLAlchemy

http://sqlalchemy.org

as our ORM.

There are other ORMs out there, each implementing the ideas in Fowler’s book slightly di"erently.

Note:

Other ORMs

The other primary competing Python-based ORM is the one that is bundled with

Django

https://docs.djangoproject.com/en/dev/topics/db/queries/

. The one restriction for that one is

that it cannot easily be used outside of Django, so we won’t consider it here.

Setup

There are a number of Python libraries we’ll need for this project.

We’ll create a virtual environment and install them, using a

requirements.txt

file that has the names and

exact versions of products we’d like to use:

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

3 of 19 10/29/18, 2:25 PM

vv ii rr tt uu aa ll ee nn vv

ee nn vv

New python executable in env/bin/python

Installing setuptools, pip...done.

ss oo uu rr cc ee ee nn vv // bb ii nn // aa cc tt ii vv aa tt ee

(env) $

pp ii pp 33

ii nn ss tt aa ll ll -- rr rr ee qq uu ii rr ee mm ee nn tt ss .. tt xx tt

Downloading/unpacking Flask (from -r requirements.txt (line 1))

Downloading Flask-0.10.1.tar.gz (544kB): 544kB downloaded

.. .. ..

Successfully installed Flask Flask-SQLAlchemy Jinja2

.. .. ..

Cleaning up...

(env) $

Awesome.

Investigating Our Data

The dataset we’ll be using is something called the

MovieLens 100k

http://grouplens.org/datasets

/movielens/

dataset. It consists of many ratings of movies from users. We’ll mine this data for

correlations, but first we need to know what it looks like.

The data has already been unpacked for you in the

seed_data/

directory. Take a look:

u.data

u.item

u.user

These files have been extracted from the MovieLens 100k; in the original documentation that came with

the data set was the following information:

u.item

Information about the items (movies); this is a

-separated list of:

movie

title

release

date

video

release

date

IMDb

URL

unknown

Action

Adventure

Animation

Children

Film

Noir

Horror

Musical

Mystery

Romance

Sci

Thriller

War

Western

The last 19 fields are the genres, a 1 indicates the movie is of that genre, a 0 indicates it is not; movies can

be in several genres at once (however, we won’t be using genre information at all in this exercise).

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

4 of 19 10/29/18, 2:25 PM

The movie ids are the ones used in the

u.data

data set.

u.user

Demographic information about the users; this is a

- separated list of:

user

age

gender

occupation

zip

code

The user ids are the ones used in the

u.data

data set.

u.data

The full user data set, 100,000 ratings by 943 users on 1682 items. Each user has rated at least 20 movies.

Users and items are numbered consecutively from 1. The data is randomly ordered. This is a tab (

)

separated list of:

user_id

movie_id

score

timestamp

The score column is an integer between 1 and 5 that will form the basis for our rating system throughout

this exercise.

We don’t use them for our data set, but it’s good to understand the formatting for the time stamp here–

they’re integers, and measured in what UNIX calls “epoch time”–seconds since 1/1/1970 UTC.

What’s the Model?

Spend a little bit trying to decipher the files before moving on. It will make it easier to remember that in

this data set, “items” are movies.

Stop here and think about how these three files constitute a “model” – that is, how they cooperate

together to provide a set of information. It might help to imagine how, given a movie name, you could

find a list of the users who rated that movie.

Thinking About the Model

If you caught the copious hints, you should be thinking that each of these files is a table in a database.

Every one of the columns in the file is the same as a database column. To reconstruct an entire record

(

who

rated

what

movie) we first go to the

u.data

table to get a

user_id

and a

movie_id

. We take those

numbers and search their respective files for the row id that matches, then glue all three rows together.

Building the Database

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

5 of 19 10/29/18, 2:25 PM

Okay, so our data is in files and we need to put them into database tables. Great, we’ll start writing a

schema. Identify the tables we’ll need to make, and sketch out the schema.

We’re not going to use all the genre data for movies, nor are we going to keep track of users’ genders or

occupations.

However, we’ll add authentication data to the user schema, adding both an email and password, while

making the remaining user data optional. Sketch out a rough schema as well as any relationships

between the tables (has many, belongs to, etc).

Going by our files, we can come up with the following skeleton:

Model

User

Name

Type

user_id

integer, primary key

optional string

password

optional string

age

optional integer

zipcode

optional string (technically, these aren’t numeric)

Movie

Name

Type

movie_id

integer, primary key

title

string

released_at

datetime

imdb_url

string

Rating

Name

Type

rating_id

integer, primary key

movie_id

integer

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

6 of 19 10/29/18, 2:25 PM

Name

Type

user_id

integer

score

integer

Relationships

A user has many ratings

A rating belongs to a user

A movie has many ratings

A rating belongs to a movie

A user has many movies through ratings

A movie has many users through ratings

We can draw that model as a diagram as such:

Model

Users

user_id*

password

age

zipcode

Ratings

rating_id*

movie_id

user_id

score

Movies

movie_id*

title

released_at

imdb_url

Now, to write the SQL.

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

7 of 19 10/29/18, 2:25 PM

Well, not so fast. Writing SQL can be tedious work. It needs to be done, but it gets tricky remembering

the syntax. We’ll write

code

that writes our schemas for us.

First, look in your project directory. You should see a

model.py

file. We’ll begin the alchemy.

SQLAlchemy is powerful so!ware, and the process it uses to transmute python into SQL and back is

indeed alchemical. While it would be most excellent for you to

understand

exactly what’s happening, at

this stage we just need to be able use it reliably. Trust the incantations, then open your

model.py

file,

and we’ll take a look at the

User

model sample that has been provided.

class

User

(

Model

"""User of ratings website."""

So far, the only thing out of the ordinary is the inheritance from a class named

Model

, which we find on

the

object. This is how we declare a class to be managed by SQLAlchemy.

This

db.Model

class is required for SQLAlchemy’s magic to work.

The next few lines are how we define our table and its columns:

class

User

(

Model

"""User of ratings website."""

__tablename__

"users"

user_id

Column

(

Integer

autoincrement

True

primary_key

True

)

Column

(

String

(

nullable

True

)

We’ll go over it line by line, but try not to find the deeper reasons for this syntax: this is a fairly non-

standard use of python class attributes. It’s allowed by the language definition, but ultimately, these

lines we just added are SQLAlchemy specific and only make sense in that context. It’s good to remember

them, but it’s equally good to remember how to look them up.

The first line:

__tablename__

"users"

Simply informs SQLAlchemy that instances of this class will be stored in a table named

users

The next:

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

8 of 19 10/29/18, 2:25 PM

user_id

Column

(

Integer

autoincrement

True

primary_key

True

)

Tells SQLAlchemy to add a column to the table named

user_id

. It will contain an integer, and be the

primary key for our table: it will uniquely identify every row in the table.

The next line contains something slightly di"erent:

Column

(

String

(

nullable

True

)

This behaves as you’d expect, with the exception of the

nullable=True

part. That tells SQLAlchemy (and

thus, PostgreSQL) that this column is optional. It’s allowed to be

NULL

. Since our MovieLens 100k data

set is anonymized, we won’t have any email addresses for any of the users we’re given. However, to

simplify things, we’ll be using the same table to store

new

users who can log in via email, so we need to

make the field available for them.

The remaining columns follow in a similar fashion.

The word

String

is not a built-in python class (that one is

str

), nor is

Integer

(

int

, respectively). These are

the SQLAlchemy-managed versions of the same data types. You’ll find that they’re imported from

, at

the top of the file. SQLAlchemy has a number of other data types as well, including datetimes, booleans,

floats, etc., all imported from the same place.

The last thing of note is that there is no

__init__()

method. SQLAlchemy’s

Model

superclass provides

one for you that uses keyword arguments when initializing objects.

In our terminal window with the activated virtual environment, run your

model.py

with the

-i

option:

(env) $

pp yy tt hh oo nn 33

-- ii mm oo dd ee ll .. pp yy

Connected to DB.

>>>

We’ve put some code at the bottom of the

model.py

file already that connects to the database. Open a

second terminal window and type:

(env) $

pp ss qq ll rr aa tt ii nn gg ss

to verify that the

ratings

database does not exist. You should see:

psql

FATAL

database

"ratings"

does

nn oo tt

exist

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

9 of 19 10/29/18, 2:25 PM

Now in your terminal create your empty database by typing:

(env) $

cc rr ee aa tt ee dd bb rr aa tt ii nn gg ss

Then, back in the same python console execute the method on the database connection that creates the

tables:

>>>

dd bb .. cc rr ee aa tt ee __ aa ll ll (( ))

Back in your second window type:

(env) $

pp ss qq ll rr aa tt ii nn gg ss

This time you should see something like:

psql

(

9.4

)

Type

"help"

ff oo rr

help

ratings

Enter

(the “d” stands for “describe”) to see what tables are in the ratings database, and then

users

and you should see something like this:

ratings=# \d

List of relations

Schema | Name | Type | Owner

--------+-----------------------+----------+--------------

public | users | table | hackbright

public | users_user_id_seq | sequence | hackbright

(2 rows)

ratings=# \d users

Table "public.users"

Column | Type | Modifiers

----------+-----------------------+---------------------------------------------

user_id | integer | not null default nextval ...

email | character varying(64) |

password | character varying(64) |

age | integer |

zipcode | character varying(15) |

Indexes:

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

10 of 19 10/29/18, 2:25 PM

"users_pkey" PRIMARY KEY, btree (user_id)

Mind =

blown

To Do

Quit both

psql

and

python

Drop your

ratings

database (

dropdb

ratings

Create two additional classes,

Movie

and

Rating

, to hold the information that we want to keep from

those files (refer to the

SQLAlchemy tutorial

http://docs.sqlalchemy.org/en/latest/orm/tutorial.html

necessary.)

Movie

Name

Type

movie_id

integer, primary key

title

string

released_at

datetime

imdb_url

string

Rating

Name

Type

rating_id

integer, primary key

movie_id

integer

user_id

integer

score

integer

Note:

DateTime

Be sure to define your data type as

db.DateTime

for the

released_at

column in your Movie table –

this will become important later.

When you’re done, repeat the process of running

db.create_all()

to create your tables. Reopen your

PostgreSQL database in your second window.

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

11 of 19 10/29/18, 2:25 PM

STOP. Get a code review. Seriously: a mistake here will be a pain to fix later, and we’d love to give

you good advice on design.

Populating Our Tables with Data

In your PostgreSQL window, insert a new row into your users table:

ratings=#

II NN SS EE RR TT II NN TT OO uu ss ee rr ss (( ee mm aa ii ll ,, pp aa ss ss ww oo rr dd ,, aa gg ee ,, zz ii pp cc oo dd ee ))

ratings=#

VV AA LL UU EE SS (( '' jj ee ss ss ii cc aa @@ gg mm aa ii ll .. cc oo mm '' ,, '' mm yy pp aa ss ss '' ,, 22 99 ,, '' 99 44 11 11 44 '' )) ;;

Now, we will transmute SQL into Python. First, query to see your shiny new record in PostgreSQL:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

1 |

jessica

gmail

com

mailto:jessica%40gmail.com

| mypass | 29 | 94114

Switch to another terminal window and do the following:

pp yy tt hh oo nn 33

-- ii mm oo dd ee ll .. pp yy

Connected to DB.

>>>

jj ee ss ss ii cc aa == UU ss ee rr .. qq uu ee rr yy .. gg ee tt (( 11 ))

>>>

pp rr ii nn tt (( jj ee ss ss ii cc aa .. ee mm aa ii ll ))

jessica

gmail

com

mailto:jessica%40gmail.com

The

in the

get()

method is the id of the User we want to get from our table. If your database has a

di"erent id for the user you want to find, use that instead.

Before we go further, let’s see what happens if we print out the user directly:

>>>

pp rr ii nn tt (( jj ee ss ss ii cc aa ))

<__main__.User object at 0x10741f850>

When you print out an object in Python (either in a script or in the Python console), Python normally

prints something like

<User

object

0x11244222>

, which just tells us the location in memory of that

object. That’s not very helpful; it would be more helpful to get some useful information for debugging

what user we’re dealing with.

Therefore, we’ll add a magic method to

User

__repr__()

. If you define this method on a class, when

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

12 of 19 10/29/18, 2:25 PM

Python tries to “represent” an instance of this class, it will use this instead.

def

__repr__

(

self

"""Provide helpful representation when printed."""

return

"<User user_id={self.user_id} email={self.email}>"

To get this new Python code, you’ll need to quit the Python console and re-start:

pp yy tt hh oo nn 33

-- ii mm oo dd ee ll .. pp yy

Connected to DB.

>>>

jj ee ss ss ii cc aa == UU ss ee rr .. qq uu ee rr yy .. gg ee tt (( 11 ))

>>>

pp rr ii nn tt (( jj ee ss ss ii cc aa ))

<User user_id=1

email=jessica

gmail

com

mailto:email=jessica%40gmail.com

That’s helpful.

Let’s update her password to be something more secure:

>>>

jj ee ss ss ii cc aa .. pp aa ss ss ww oo rr dd == "" qq ff uu jj ff 33 ""

Now, let’s query the database to see if that worked:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

1 |

jessica

gmail

com

mailto:jessica%40gmail.com

| mypass | 29 | 94114

The new password isn’t there! What gives? Well, like when we did raw SQL (and not dissimilar to Git), we

need to commit data a!er we’ve modified it.

In Python:

>>>

dd bb .. ss ee ss ss ii oo nn .. cc oo mm mm ii tt (( ))

And query again:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss ;;

user_id | email | password | age | zipcode

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

13 of 19 10/29/18, 2:25 PM

----------+-----------------------+----------+----------+----------

1 |

jessica

gmail

com

mailto:jessica%40gmail.com

| qfujf3 | 29 | 94114

SQLAlchemy took our python and wrote the appropriate SQL update query for us behind the scenes.

This is a powerful idea, because now we can write programs, only worrying about the classes and data

we’re interested in, and not how to write the SQL we need to save it somewhere.

Once more, for e!ect:

Reversing Direction

We inserted data in SQL, then got it back out on the python end, where we could update it. Now, let’s do

the reverse, where we insert data in from python. Let’s make a record for Jada.

>>>

jj aa dd aa == UU ss ee rr (( ee mm aa ii ll == "" jj aa dd aa @@ gg mm aa ii ll .. cc oo mm "" ,, pp aa ss ss ww oo rr dd == "" aa bb cc 11 22 33 "" ,, aa gg ee == 22 55 ,,

...

zz ii pp cc oo dd ee == "" 99 44 11 00 33 "" ))

If we query the database, we get nothing:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

1 |

jessica

gmail

com

mailto:jessica%40gmail.com

| qfujf3 | 29 | 94114

Right, we have to commit first. Actually, we have to do more than commit. Right now, we have a

User

object that we created in Python, but that isn’t reflected in the database immediately. There are times

when we want to do exactly this, so SQLAlchemy forces us to be

explicit

when we want to insert

something into the database as well. We do this by

adding

an object to our session. Here, the Git parallel

is particularly strong.

>>>

dd bb .. ss ee ss ss ii oo nn .. aa dd dd (( jj aa dd aa ))

>>>

dd bb .. ss ee ss ss ii oo nn .. cc oo mm mm ii tt (( ))

Now, in PostgreSQL, one more time:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

1 |

jessica

gmail

com

mailto:jessica%40gmail.com

| qfujf3 | 29 | 94114

2 |

jada

gmail

com

mailto:jada%40gmail.com

| abc123 | 25 | 94103

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

14 of 19 10/29/18, 2:25 PM

Now that our object has been “added” to the database, it is being tracked, and if we need to update it,

we only need to commit a!er modifying it:

>>>

jj aa dd aa .. pp aa ss ss ww oo rr dd == "" bb uu nn nn ii ee ss ""

>>>

dd bb .. ss ee ss ss ii oo nn .. cc oo mm mm ii tt (( ))

and to confirm:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss WW HH EE RR EE uu ss ee rr __ ii dd == 22 ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

2 |

jada

gmail

com

mailto:jada%40gmail.com

| bunnies | 25 | 94103

Let’s do one more thing. So far, we’ve relied on PostgreSQL to assign unique ids to our users. We can

specify an id when creating a user.

>>>

jj uu aa nn ii tt aa == UU ss ee rr (( uu ss ee rr __ ii dd == 55 ,, ee mm aa ii ll == "" jj uu aa nn ii tt aa @@ gg mm aa ii ll .. cc oo mm "" ,,

...

pp aa ss ss ww oo rr dd == "" aa bb cc 11 22 33 "" ,, aa gg ee == 44 22 ,, zz ii pp cc oo dd ee == "" 99 44 11 00 33 "" ))

>>>

dd bb .. ss ee ss ss ii oo nn .. aa dd dd (( jj uu aa nn ii tt aa ))

>>>

dd bb .. ss ee ss ss ii oo nn .. cc oo mm mm ii tt (( ))

If we query the database, we should see this:

ratings=#

SS EE LL EE CC TT ** FF RR OO MM uu ss ee rr ss WW HH EE RR EE ee mm aa ii ll == '' jj uu aa nn ii tt aa @@ gg mm aa ii ll .. cc oo mm '' ;;

user_id | email | password | age | zipcode

----------+-----------------------+----------+----------+----------

5 |

juanita

gmail

com

mailto:juanita%40gmail.com

| abc123 | 42 | 94103

Experiment with adding, committing, and querying to make sure you understand how data goes into

PostgreSQL through Python, and how to get it back out. Add new records on both the PostgreSQL and

Python sides, and use

.get()

to get them back out. Change some fields, then commit them back and

see how the columns get updated. Do this for all three tables, then get ready to wipe them out.

First, drop your

ratings

database using the

dropdb

command. Then, recreate your database using the

createdb

command. Reconnect to your

model.py

file and run

db.create_all()

in order to recreate

your

ratings

database.

(env) $

dd rr oo pp dd bb rr aa tt ii nn gg ss

(env) $

cc rr ee aa tt ee dd bb rr aa tt ii nn gg ss

(env) $

pp yy tt hh oo nn 33

-- ii mm oo dd ee ll .. pp yy

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

15 of 19 10/29/18, 2:25 PM

Connected to DB.

>>>

dd bb .. cc rr ee aa tt ee __ aa ll ll (( ))

Reading Data From Seed Files

Now that we know how to insert single rows into the database, we have to

bulk insert

a bunch of our

movie data. You’ll find a file,

seed.py

, which contains a rough outline of what needs to happen. You’ll

need to open up the seed files corresponding to each table, read each row in, parse it, then insert it into

the database using our SQLAlchemy object interface.

The general steps are:

Open and read a file

Parse a line

Create an object

Add the object to the

db.session

Repeat until all objects are added

Commit

We’ve supplied the first,

load_users

, to give you a sense of what to do for the others. Read it carefully.

Each of the files is formatted slightly di"erently, so you’ll have to modify the second two functions to

account for those changes.

There are two particular challenges to pay attention to as you write your own

load

functions:

In the

u.item

file, the dates are given as strings like “31-Oct-2015”. We need to store this in the

database as an actual date object, not as a string that just looks like a date. To do this, you’ll need to

research the Python

datetime

library to find the function that can parse a string into a datetime

object.

The movies include the year of release at the end of the title, like “Balloonicorn’s Big Adventure

(2010)” – but we don’t want to include that parenthetical date in the database (we’re already storing

the release date as a separate field, so it’s duplicative, plus it’s ugly). Find a way to remove it from the

title.

Feeling stumped on your datetime conversion? Here’s a hint:

Datetime Conversion

You will need to use

strptime

from the datetime library. Here’s some example code:

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

16 of 19 10/29/18, 2:25 PM

released_str

released_at

datetime

strptime

(

released_str

-%b-%Y"

)

else

released_at

None

Autoincrement

Much like the Take-A-Number dispenser at the deli, Postgres uses sequences to set the values for

autoincrement columns. By default, the sequence starts at 1. Since we loaded in the user_ids from the

data file when we seeded our database, we haven’t used the sequence yet. When we create a new user

later on, Postgres will attempt to assign it id 1 (the first number in the sequence), and this will generate

an error, because there already is a user with user_id 1.

To prevent this error, we’ve included a function

set_val_user_id

seed.py

. This function queries users

to find the maximum id, and then sets our sequence next value to be one more than that max.

If you want to better understand how this function works, enter psql from the terminal. Just as

\dt

shows a list of the tables,

lists all tables, views, and sequences. A!er you seed the database, you can

see that you have:

ratings=# d

List of relations

Schema | Name | Type | Owner

--------+-----------------------+----------+-----------

public | movies | table | user

public | movies_movie_id_seq | sequence | user

public | ratings | table | user

public | ratings_rating_id_seq | sequence | user

public | users | table | user

public | users_user_id_seq | sequence | user

We are adjusting the users_user_id_seq, using the Postgres function

setval

. You can read more about

Postgres functions on sequences

here

http://www.postgresql.org/docs/8.2/static/functions-

sequence.html

Once you have written your importing code and satisfied yourself that it is importing all of the data

properly,

ask for a code review

Once you’ve finished your code review, you can either, depending on your schedule:

Further Study

further-study.html

Exercise: Judgmental Eye Part 2

index-2.html

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

17 of 19 10/29/18, 2:25 PM

Solution

solution/index.html

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

18 of 19 10/29/18, 2:25 PM

Judgmental Eye https://fellowship.hackbrightacademy.com/materials/ft24a/exercises/...

19 of 19 10/29/18, 2:25 PM

Instructions

Navigation menu

Versions of this User Manual:

Views

Navigation