Python James R. Parker An Introduction To Programming
User Manual: James-R.-Parker-Python-An-Introduction-to-Programming
Open the PDF directly: View PDF .
Page Count: 534 [warning: Documents this large are best viewed by clicking the View PDF Link!]
- Title
- Copyrights
- Contents
- Preface
- Chapter 0 Modern Computers
- Chapter 1 Computers and Programming
- Chapter 2 Repetition
- Chapter 3 Sequences: Strings, Tuples, and Lists
- Chapter 4 Functions
- Chapter 5 Files: Input and Output
- Chapter 6 Classes
- Chapter 7 Graphics
- Chapter 8 Manipulating Data
- Chapter 9 Multimedia
- Chapter 10 Basic Algorithms
- Chapter 11 Programming for the Sciences
- Chapter 12 How to Write Good Programs
- Chapter 13 Communicating with the Outside World
- Chapter 14 A Brief Glib Reference
- Index
PYTHON
LICENSE,DISCLAIMEROFLIABILITY,ANDLIMITEDWARRANTY
By purchasing or using this book (the “Work”), you agree that this license grants
permission to use the contents contained herein, but does not give you the right of
ownership to any of the textual content in the book or ownership to any of the
informationorproductscontainedinit.Thislicensedoesnotpermituploadingofthe
WorkontotheInternetoronanetwork(ofanykind)withoutthewrittenconsentofthe
Publisher. Duplication or dissemination of any text, code, simulations, images, etc.
contained herein is limited to and subject to licensing terms for the respective
products, and permission must be obtained from the Publisher or the owner of the
content,etc.,inordertoreproduceornetworkanyportionofthetextualmaterial(in
anymedia)thatiscontainedintheWork.
MERCURYLEARNINGANDINFORMATION(“MLI”or“thePublisher”)andanyoneinvolved
in the creation, writing, or production of the companion disc, accompanying
algorithms,code,orcomputerprograms(“thesoftware”),andanyaccompanyingWeb
siteorsoftwareoftheWork,cannotanddonotwarranttheperformanceorresultsthat
mightbeobtainedbyusingthecontentsoftheWork.Theauthor,developers,andthe
Publisherhave usedtheir besteffortstoinsure theaccuracy andfunctionality ofthe
textual material and/or programs contained in this package; we, however, make no
warrantyofanykind,expressorimplied,regardingtheperformanceofthesecontents
orprograms.TheWorkissold“asis”withoutwarranty(exceptfordefectivematerials
usedinmanufacturingthebookorduetofaultyworkmanship).
Theauthor,developers,andthepublisherofany accompanyingcontent,andanyone
involvedinthecomposition,production,andmanufacturingofthisworkwillnotbe
liable for damages of any kind arising out of the use of (or the inability to use) the
algorithms, source code, computer programs, or textual material contained in this
publication. This includes, but is not limited to, loss of revenue or profit, or other
incidental,physical,orconsequentialdamagesarisingoutoftheuseofthisWork.
Thesoleremedyintheeventofaclaimofanykindisexpresslylimitedtoreplacement
ofthebook,andonlyatthediscretionofthePublisher.Theuseof“impliedwarranty”
andcertain“exclusions”varyfromstatetostate,andmightnotapplytothepurchaser
ofthisproduct.
Companion disc files are available for download from the publisher by writing to
info@merclearning.com.
Copyright©2017byMERCURYLEARNINGANDINFORMATIONLLC.Allrightsreserved.
Thispublication,portionsofit,oranyaccompanyingsoftwaremaynotbereproduced
in any way, stored in a retrieval system of any type, or transmitted by any means,
media, electronic display or mechanical display, including, but not limited to,
photocopy,recording,Internet
postings,orscanning,withoutpriorpermissioninwritingfromthepublisher.
Publisher:DavidPallai
MERCURYLEARNINGANDINFORMATION
22841QuicksilverDrive
Dulles,VA20166
info@merclearning.com
www.merclearning.com
(800)232-0223
JamesR.Parker.PYTHON:AnIntroductiontoProgramming.
ISBN:978-1-9445346-5-3
The publisher recognizes and respects all marks used by companies, manufacturers,
and
developers as a means to distinguish their products. All brand names and product
names
mentionedinthisbookaretrademarksorservicemarksoftheirrespectivecompanies.
Anyomissionormisuse(ofanykind)ofservicemarksortrademarks,etc.isnotan
attempttoinfringeonthepropertyofothers.
LibraryofCongressControlNumber:2016915244
161718321PrintedintheUnitedStatesofAmerica
Thisbookisprintedonacid-freepaper.
Our titles are available for adoption, license, or bulk purchase by institutions,
corporations,etc.
For additional information, please contact the Customer Service Dept. at 800-232-
0223 (toll free).Digital versions of our titles are available at:
www.authorcloudware.comandothere-vendors.Allcompanionfilesareavailableby
writingtothepublisheratinfo@merclearning.com.
The sole obligation of MERCURY LEARNINGAND INFORMATION to the purchaser is to
replacethebookand/ordisc,basedondefectivematerialsorfaultyworkmanship,but
notbasedontheoperationorfunctionalityoftheproduct.
Contents
Prefacexv
Chapter0ModernComputers
0.1CalculationsbyMachine
0.2HowComputersWorkandWhyWeMadeThem
0.2.1Numbers
Example:Base
ConvertBinaryNumberstoDecimal
ConvertDecimalNumberstoBinary
ArithmeticinBinary
0.2.2Memory
0.2.3StoredPrograms
0.3ComputerSystemsAreBuiltinLayers
0.3.1AssemblersandCompilers
0.3.2GraphicalUserInterfaces(GUIs)
Widgets
0.4ComputerNetworks
0.4.1Internet
0.4.2WorldWideWeb
0.5Representation
0.6Summary
Chapter1ComputersandProgramming
1.1SolvingaProblemUsingaComputer
1.2ExecutingPython
1.3GuessaNumber
1.4Rock-Paper-Scissors
1.5SolvingtheGuessaNumberProblem
1.6SolvingtheRock-Paper-ScissorsProblem
1.6.1VariablesandValues–ExperimentingwiththeGraphicalUserInterface
1.6.2ExchangingInformationwiththeComputer
1.6.3Example1:DrawaCircleUsingCharacters
1.6.4Strings,Integers,andRealNumbers
1.6.5NumberBases
1.6.6Example2:ComputetheCircumferenceofanyCircle
1.6.7GuessaNumberAgain
1.7IFStatements
1.7.1Else
1.8Documentation
1.9Rock-Paper-ScissorsAgain
1.10TypesAreDynamic(Advanced)
1.11Summary
Chapter2Repetition
2.1TheWHILEStatement
2.1.1TheGuess-A-NumberProgramYetAgain
2.1.2ModifyingtheGame
2.2Rock-Paper-ScissorsYetAgain
2.2.1RandomNumbers
2.3CountingLoops
2.4PrimeorNon-Prime
2.4.1ExitingfromaLoop
2.4.2Else
2.5LoopsThatareNested
2.6DrawaHistogram
2.7LoopsinGeneral
2.8ExceptionsandErrors
2.8.1Problem:AFinalLookatGuessaNumber
2.9Summary
Chapter3Sequences:Strings,Tuples,andLists
3.1Strings
3.1.1ComparingStrings
Problem:DoesaCityName,EnteredattheConsole,Comebeforeorafterthe
NameDenver?
3.1.2Slicing–ExtractingPartsofStrings
Problem:Identifya“Print”StatementinaString
3.1.3EditingStrings
Problem:CreateaJPEGFileNamefromaBasicString
Problem:ChangetheSuffixofaFileName
Problem:ReversetheOrderofCharactersinaString
Problem:IsaGivenFileNameThatofaPythonProgram?
3.1.4StringMethods
3.1.5SpanningMultipleLines
3.1.6ForLoopsAgain
3.2TheTypeBytes
3.3Tuples
3.3.1TuplesinForLoops
Problem:PrinttheNumberofNeutronsinanAtomicNucleus
3.3.2Membership
Problem:WhatEvenNumbersLessthanorEqualto100areAlsoPerfect
Squares?
3.3.3Delete
Problem:DeletetheElementLithiumfromtheTupleAtoms,alongwithIts
AtomicNumber.
3.3.4Update
Problem:ChangetheEntryforLithiumtoanEntryforOxygen
3.3.5TupleAssignment
3.3.6Built-InFunctionsforTuples
3.4Lists
Problem:ComputetheAverage(Mean)ofaListofNumbers
3.4.1EditingLists
3.4.2Insert
3.4.3Append
3.4.4Extend
3.4.5Remove
3.4.6Index
3.4.7Pop
3.4.8Sort
3.4.9Reverse
3.4.10Count
3.4.11ListComprehension
3.4.12ListsandTuples
3.4.13Exceptions
Problem:DeletetheElementHeliumfromaList
Problem:DeleteaSpecifiedElementfromaList
3.5SetTypes
3.5.1Example:Craps
3.6Summary
Chapter4Functions
4.1FunctionDefinition:SyntaxandSemantics
4.1.1Problem:UsepoundntoDrawaHistogram
4.1.2Problem:GeneralizetheHistogramCodeforOtherYears
4.2FunctionExecution
4.2.1ReturningaValue
Problem:WriteaFunctiontoCalculatetheSquareRootofitsParameter
4.2.2Parameters
4.2.3DefaultParameters
4.2.4None
4.2.5Example:TheGameofSticks
4.2.6Scope
4.2.7VariableParameterLists
4.2.8VariablesasFunctions
Example:FindtheMaximumValueofaFunction
4.2.9FunctionsasReturnValues
4.3Recursion
4.3.1AvoidingInfiniteRecursion
4.4CreatingPythonModules
4.5ProgramDesignUsingFunctions–Example:TheGameofNim
4.5.1TheDevelopmentProcessExposed
4.6Summary
Chapter5Files:InputandOutput
5.1WhatIsaFile?ALittle“Theory”
5.1.1HowAreFilesStoredonaDisk?
5.1.2FileAccessisSlow
5.2KeyboardInput
5.2.1Problem:ReadaNumberfromtheKeyboardandDivideItby2
5.3UsingFilesinPython:LessTheory,MorePractice
5.3.1OpenaFile
FileNotFoundExceptions
5.3.2ReadingfromFiles
EndofFile
CommonFileInputOperations
CSVFiles
Problem:PrinttheNamesofPlanetsHavingFewerThanTenMoons
Problem:PlayJeopardyUsingaCSVDataSet
TheWithStatement
5.4WritingToFiles
Example:WriteaTableofSquarestoaFile
5.4.1AppendingDatatoaFile
Example:AppendAnother20SquarestotheTableofSquaresFile
5.5Summary
Chapter6Classes
6.1ClassesandTypes
6.1.1ThePythonClass–SyntaxandSemantics
6.1.2AReallySimpleClass
6.1.3Encapsulation
6.2ClassesandDataTypes
6.2.1Example:ADeckofCards
6.2.2ABouncingBall
6.2.3Cat-A-Pult
BasicDesign
DetailedDesign
6.3SubclassesandInheritance
6.3.1Non-TrivialExample:ObjectsinaVideoGame
6.4DuckTyping
6.5Summary
Chapter7Graphics
7.1IntroductiontoGraphicsProgramming
7.1.1Essentials:TheGraphicsWindowandColors
7.1.2PixelLevelGraphics
Example:CreateaPageofNotepaper
Example:CreatingaColorGradient
7.1.3LinesandCurves
Example:NotepaperAgain
7.1.4Polygons
7.1.5Text
7.1.6Example:AHistogram
7.1.7Example:APieChart
7.1.8Images
Pixels
Example:IdentifyingaGreenCar
Example:Thresholding
Transparency
7.1.9GenerativeArt
7.2Summary
Chapter8ManipulatingData
8.1Dictionaries
8.1.1Example:ANaiveLatin–EnglishTranslation
8.1.2FunctionsforDictionaries
8.1.3DictionariesandLoops
8.2Arrays
8.3FormattedText,FormattedI/O
8.3.1Example:NASAMeteoriteLandingData
8.4AdvancedDataFiles
8.4.1BinaryFile
Example:CreateaFileofIntegers
8.4.2TheStructModule
Example:AVideoGameHighScoreFile
8.4.3RandomAccess
Example:MaintainingtheHighScoreFileinOrder
8.5StandardFileTypes
8.5.1ImageFiles
8.5.2GIF
8.5.3JPEG
8.5.4TIFF
8.5.5PNG
8.5.6SoundFiles
WAV
8.5.7OtherFiles
HTML
EXE
8.6Summary
Chapter9Multimedia
9.1MouseInteraction
Example:DrawaCircleattheMouseCursor
Example:ChangeBackgroundColorUsingtheMouse
9.1.1MouseButtons
Example:DrawLinesUsingtheMouse
Example:AButton
9.2TheKeyboard
Example:Pressinga“+”CreatesaRandomCircle
Example:ReadingaCharacterString
9.3Animation
9.3.1ObjectAnimation
Example:ABallinaBox
Example:ManyBallsinaBox
9.3.2FrameAnimation
Example:ReadFramesandPlayThemBackasanAnimation
Example:SimulationoftheSpaceShuttleControlConsole(AClassThatWill
DrawanAnimationataSpecificLocation)
9.4RGBAColors–Transparency
9.5Sound
Example:PlayaSound
Example:ControlVolumeUsingtheKeyboard.PauseandUnpause
Example:PlayaSoundEffectattheRightMoment:Bounces
9.6Video
Example:Carclub–DisplaytheVideocarclub2.mpg(Annotated)
Exercise:ThresholdaVideo(ProcessingPixels)
9.7Summary
Chapter10BasicAlgorithms
10.1Sorting
10.1.1SelectionSort
10.1.2MergeSort
10.2Searching
10.2.1Timings
10.2.2LinearSearch
10.2.3BinarySearch
10.3RandomNumberGeneration
10.3.1LinearCongruentialMethod
10.4Cryptography
10.4.1One-TimePad
10.4.2PublicKeyEncryption(RSA)
Example:EncrypttheMessage“DepartatDawn”UsingRSA
10.5Compression
10.5.1HuffmanEncoding
10.5.2LZWCompression
10.6Hashing
djb2
10.6.1sdbm
10.7Summary
Chapter11ProgrammingfortheSciences
11.1FindingRootsofEquations
11.2Differentiation
11.3Integration
11.4Optimization:FindingMaximaandMinima
11.4.1NewtonAgain
11.4.2FittingDatatoCurves–Regression
11.4.3EvolutionaryMethods
11.5LongestCommonSubsequence(EditDistance)
11.5.1DeterminingLongestCommonSubsequence(LCS)
11.6Summary
Chapter12HowtoWriteGoodPrograms
12.1ProceduralProgramming–WordProcessing
12.1.1Top-Down
12.1.2Centering
12.1.3RightJustification
12.1.4OtherCommands
12.2ObjectOrientedProgramming–Breakout
12.3DescribingtheProblemasaProcess
12.3.1InitialCodingforaTile
12.3.2InitialCodingforthePaddle
12.3.3InitialCodingfortheBall
12.3.4CollectingtheClasses
12.3.5DevelopingthePaddle
12.3.6BallandTileCollisions
12.3.7BallandPaddleCollisions
12.3.8FinishingtheGame
12.4RulesforProgrammers
12.5Summary
Chapter13CommunicatingwiththeOutsideWorld
13.1Email
Example:SendanEmail
13.1.1ReadingEmail
Example:DisplaytheSubjectHeadersforEmailsinInbox
13.2FTP
Example:DownloadandDisplaytheREADMEFilefromanFTPSite
13.3CommunicationBetweenProcesses
Example:AServerThatCalculatesSquares
13.4Twitter
Example:ConnecttotheTwitterStreamandPrintSpecificMessages
13.5CommunicatingwithOtherLanguages
Example:FindTwoLargeRelativelyPrimeNumbers
13.6Summary
Chapter14ABriefGlibReference
14.1Glibtkinter
14.2Images
Preface
This book is intended to teach introductory programming. Material is included for the
introductory computer science course, but also for students and readers in science and
other disciplines. I firmly believe that programming is an essential skill for all
professionalsandespeciallyacademicsinthe21stcenturyandhaveemphasizedthatinthe
contentdiscussedinthebook.
Thebookusesa“just-in-time”approach,meaningthatItrytopresentnewinformation
just before or just after the reader needs it. As a result, there are numerous examples,
carefullyselectedtofitintotheirproperplacesinthetext.Nottoosoon,andnottoolate.
Ibelieveinobject-orientedprogramming.Mymaster’sthesisinthelate1970swason
that subject, cut my teeth on Simula, was there when C++ was created, and knew the
creator of Java. I do not believe that object-oriented programming is the only solution,
though, and realized early that good objects can only be devised by someone who can
alreadyprogram. Iam thereforenot an“objects first”instructor, but a “whateverworks
best”instructor.
Manyoftheexamplesinvolvecomputergamesandgamedevelopment.Asweknow,
themajorityofundergraduatestudentsplaygames.Theyunderstandthembetterthan,say,
accountingorinventorysystems,whichhavebeenthetypicalearlyassignments.Ibelieve
inpresentingstudentsassignmentsthatareinteresting.
Idon’tthinkthatcateringtoanyparticularlanguageforminanintroductorytextserves
thestudentorthelanguage.Thestudent,ifsensible,willlearnotherlanguages.Bringing
Python idioms into play too soon may interfere with the generality of the ideas being
presentedandwillnotassistthestudentwhenlearningJava,C++,orRuby.
This book introduces a multimedia code module Glib that can assist the programmer
withgraphics,animation,sound,interaction,andvideo.Glibisincludedonthecompanion
discorcanbedownloadedfromthebook’swebsite.Thebasiclibrary,staticGlib,needs
nothingbutastandard3.4orbetterinstallationofPython.Itusestkinterasabasis,which
is distributed with the language. The expanded library uses pygame, and that is easily
downloaded and installed. The extended Glib, called dynamic Glib, allows exactly the
same interface as does static Glib, but extends it to also include sound, interface, and
video.Thus,ifstaticGlibcompilesandrunsaprogram,thendynamicGlibshouldtoo.
Thereisawikiconcerningthebookathttps://sites.google.com/site/pythonparker/andI
am happy to receive comments, code fixes, extensions, extra teaching material, and
general suggestions. I see a good textbook as a community, and encourage everyone –
especially first year students, the target audience of this book - to send me their
experiencesandideas.
Software(anycomputerprogram)isubiquitous.Cars,phones,refrigerators,television,
(and almost everything in our society) are computerized. Decisions made about how a
programistobebuilttendtosurvive,andevenaftermanymodifications,theycanaffect
how people use that device or system. Creating efficient software helps in achieving a
productiveandhappycivilization.
Python is a great language for beginning programmers. It is easy to write the first
programs,becausetheconceptualoverheadissmall.Thatis,there’snoneedtounderstand
what“void”or“public”meansattheoutset.Pythondoesmanythingsforaprogrammer.
Do you want something sorted? It’s a part of the language. Lists and hash tables
(dictionaries)areapartofthelanguage.Youcanwriteclasses,butdonothaveto,soitcan
be taught objectsfirst or not. The required indentation means that it is much harder to
placecodeincorrectlyinloopsorifstatements.TherearehundredsofreasonswhyPython
isagreatidea.
Anditisfree.Thisbookwaswrittenusingversion3.4,andwiththePyCharmAPI.The
modulesusedthatrequiredownloadarefew,butincludePyGameandtweepy.Allfree.
OverviewofChapters
Here’s a brief outline of the book. It can be used to teach computer science majors or
sciencestudentswhowishtohaveacompetencyinprogramming.
Chapter 0: Historical and technological material on computers. Binary numbers, the
fetch-excutecycle.Thischaptercanbeskippedinsomesyllabi.
Chapter 1: Problem solving witha computer; breaking a problem down so itcan be
solved. The Python system. Some simple programs involving games that introduce
variables,expressions,print,types,andtheifstatement.
Chapter 2: Repetition in programming: while and for statements. Random numbers.
Countingloops,nestedloops.Drawingahistogram.Exceptions(try-except).
Chapter3:Stringsandstringoperations.Tuples,theirdefinitionanduse.Listsandlist
comprehension. Editing, slices. The bytes type. And set types. Example: the game of
craps.
Chapter4:Functions:modularprogramming.Definingafunction,callingafunction.
Parameters,includingdefaultparameters,andscope.Returnvalues.Recursion.TheGame
ofSticks.Variableparameterlists,assigningafunctiontoavariable.Findthemaximumof
amathematicalfunction.Modules.GameofNim.
Chapter5:Files.Whatisafileandhowarefilesrepresented.Propertiesoffiles.File
exceptions.Input,output,append,open,close.Commaseparatedvalue(CSV)files.Game
ofJeopardy.Thewithstatement.
Chapter6:Classesandobjectorientation.Whatisanobjectandwhatisaclass?Types
andclasses.Pythonclassstructure.Creatinginstances,__init__andself.Encapsulation.
Examples: deck of playing cards; a bouncing ball; Cat-a-pult. Designing with classes.
Subclassesandinheritance.Videogameobjects.Ducktyping.
Chapter7:Graphics.TheGlibmodule.Drawingwindow;colorrepresentation,pixels.
Drawing lines, curves, and polygons. Filling. Drawing text. Example: Histogram, Pie
chart.Imagesandimagedisplay,gettingandsettingpixels.Thresholding.Generativeart.
Chapter 8: Data and information. Python dictionaries. Latin to English translator.
Arrays,formattedtext,formattedinput/output.Meteoritelandingdata.Non-textfilesand
thestructmodule.Highscorefileexample.Randomaccess.Imageandsoundfiletypes.
Chapter9:Digitalmedia:dynamicGlibmodule.Usingthemouseandthekeyboard.
Animation. Space shuttle control console example. Transparent colors. Sound: playing
soundfiles,volume,pause.Video:playandpositionavideo,accessingframesandpixels
inavideo.
Chapter 10: Basic algorithms in computer science. Sorting (selection, merge) and
searching (linear, binary). Timing code execution. Generating random numbers;
cryptography;datacompression(includingHuffmancodesandRLE);hashing.
Chapter 11: Programming for Science. Roots of equations; differentiation and
integration. Optimization (minimum and maximum) and curve fitting (regression).
Evolutionaryalgorithms.Longestcommonsubsequence,oreditdistance.
Chapter12:Writinggoodcode.Awalkthroughtwomajorprojects:awordprocessor
written as procedural code and a breakout game written as object oriented code. A
collectionofeffectiverulesforwritinggoodcode.
Chapter 13: Dealing with real world interfaces, which tend to defined for you.
ExamplesareEmail(sendandreceive),FTP,inter-processcommunication(client-server),
Twitter,callingotherlanguageslikeC++.
Chapter14:AreferenceforbothversionsofGlib.
ChapterCoverageforDifferentMajors
Acomputerscienceintroductioncouldusemostchapters,dependingonthebackground
ofthestudents,butChapters0,7,9,and/or11couldbeomitted.
Anintroductiontoprogrammingforsciencecouldomitchapters0,10,12.
Chapter13isalwaysoptional,butisinterestingasitexplainshowsocial
mediasoftwareworksundertheinterface.
Basicintroductiontoprogrammingfornon-scienceshouldinclude
Chapters0,1,2,3,4,5,and7.
CompanionFiles(Discincludedinphysicalbookorfiles
availablefordownloading)
Thecompanionfilescontainusefulmaterialforeachchapter:
•Selectedexercisesaresolved,includingworkingcodewhenthatisapartofthe
solution.
•AllsignificantprogrammingexamplesareprovidedasPythoncodefiles(over100),
thatcanbecompiledandexecuted,andthatcanbemodifiedasexercisesorclass
projects.Thisincludessampledatafileswhenappropriate.
•AnimportantaspectofthisbookistheuseofagraphicslibrarynamedGlib.Source
codeforthismoduleisprovidedonthediscandonline.Therearetwoversions:one
thatworkswiththebuilt-inmoduletkinterwhichallowsgraphics,andasecondthat
extendsthepreviousmoduleusing
pyGameandallowsvideos,interaction,andsound.
•Allfiguresareavailableasimages,infullcolor.
InstructorAncillaries
•Solutionstoalmostalloftheprogrammingexercisesgiveninthetext.
•MSPowerPointlecturesprovidedforanentiresemester(35files)includingsome
newexamplesandshortvideos.
•Likelythemostimportantaspectofthisbook,asidefromtheverypractical
viewpoint,istheprovisionoftheGlibgraphicsandmultimedialibrary.Thiscomes
intwoversions:auniversalversionthathandlesbasicgraphicsandthatcanexecute
withoutanyextrainstallationstep;andthefullmultimediaextensionthathandles
sound,video,andinteraction,butthatrequiresthatpyGamebeinstalled,whichisa
simpleprocess.
•AllofthePythoncodethatappearsinthebookshasbeenexecuted,andallcomplete
programsareprovidedas.pyfiles.Someofthenumerousprogrammingexamples
(over100)thatareexploredinthebookandforwhichworkingcodeisincluded:
oAninteractivebreakoutgame
oAtextformattingsystem
oPlottinghistogramsandpiecharts
oReadingTwitterfeeds
oPlayJeopardyUsingaCSVDataSet
oSendingandreceivingEmail
oAsimpleLatintoEnglishtranslator
oRock-Paper-Scissors
•HundredsofansweredmultiplechoicequizandexaminationquestionsinMSWord
filesthatcanbeeditedandusedinvariousways.
DedicatedWebSite
An online community has been started at https://sites.google.com/site/pythonparker/ for
comments, new exam questions and exercises, extra code, and as a place to report
problems.
Pleaseconsidercontributingmaterialtotheon-linecommunity,anddohavefun.Ifyou
don’t,thenyou’redoingitwrong.
J.Parker
October2016
CHAPTER0
MODERNCOMPUTERS
0.1CalculationsbyMachine
0.2HowComputersWorkandWhyweMadeThem
0.3ComputerSystemsareBuiltinLayers
0.4ComputerNetworks
0.5Representation
0.6Summary
Inthischapter
Humansaretoolmakersandtoolusers.Thisisnotuniqueintheanimalkingdom,butthe
facilitythathumanshavewithtoolsandthevarietyofapplicationswehaveforthemdoes
make us unique. Starting with mechanical tools (machines) like levers and wheels that
could lighten the physical effort of everyday life, more and more complex and specific
deviceshavebeencreatedtoassistwithallfacetsofourlives.Thiswasextendedinthe
twentiethcenturytoassistingwithmentalefforts,specificallycalculation.
Computers are devices that humans have built in order to facilitate complex
calculations.Earlycomputerswereusedtodosomeofthecomputationsneededtodesign
thefirstnuclearbombs,butnowcomputersseemtobeeverywhere,evenembeddedwithin
carsandkitchenappliances,andevenwithinourownbodies.Thesuccessofthesedevices
insuchawiderangeofapplicationareasisaresultoftheirabilitytobeprogrammed—
thatis,thedeviceitselfisonlyapotentialwhenfirstbuiltandhasnospecificfunction.It
isdesignedtobeconfiguredtodoanytaskthatrequirescalculations,andtheconfiguring
processiswhatwecallprogramming.
Tosomeextentthishastakentheplaceofalotofothertooldevelopmentthatusedto
be done by engineers. When designing a complex machine like an automobile, for
example,there usedto be alot of mechanicalwork involved.The careful timingof the
currenttothesparkplugwasaccomplishedbyrotatingshaftswithsensors,andresultedin
thefiringofeachcylinderatthecorrectmoment.Theairtogasolinemixturefedintothe
enginewascontrolledbytubesandcablesandsprings.Nowallofthesethingsandmany
morearedoneusingcomputersthatsenseelectricand magneticevents,docalculations,
andsendelectricalcontrolsignalstoactuatorsintheengine.Thesamecomputercanbe
usedtocontrolarefrigerator,maketelephonecallsonacellularphone,changechannels
onatelevision,andwakeyouupinthemorning.Itistheflexibilityofthecomputerthat
has led to them becoming a dominant technology in human society, and the flexibility
comeslargelyfromtheirabilitytobeprogrammed.
0.1 CALCULATIONSBYMACHINE
People have been calculating things for thousands of years, and have always had
mechanicalaidstohelp.
Whensomeoneprogramsacomputer,theyarereallycommunicatingwithit.Itisavery
imperativeandprecisecommunicationtobesure.Imperativebecausethecomputerhasno
choice;itisbeingtoldwhattodo,andwilldoexactlythat.Precisebecauseacomputer
doesnotapplyanyinterpretationtowhatitisbeingtold.Humanlanguagesarevagueand
subject to interpretation and ambiguity. There are sentences that are legal in terms of
syntax that have no real meaning: “Which is faster, to Boston or by bus?” is a legal
sentence in English that has no meaning. Such things are not possible in a computer
language. Also, computers do not think and so can’t evaluate a command that would
amount to “expose the patient to a fatal dose of radiation” with any skepticism. As a
result,we,asprogrammers,mustbecarefulandpreciseinwhatweinstructthemachineto
do.
Whenhumanscommunicatewitheachotherweusealanguage.Similarly,humansuse
languagestocommunicatewithcomputers;itiseasyforus.Suchlanguagesareartificial
(humansinventedthemforthispurpose,allatonce),terse(therearefewifanymodifiers,
no way to express emotions or graduations of any feeling), precise (each item in the
language means one thing), and written (we do not speak to the computer in a
programminglanguage.Notyet,perhapsnever).
Computerlanguagesoperateatahighlevel,anddonotrepresentthewaythecomputer
actually works. For the purposes of learning to program there are a few fundamental
thingsthatneedtobeknownaboutcomputers.It’snotrequiredtoknowhowtheyoperate
electronically,buttherearebasicprinciplesthatshouldbeunderstoodinordertoputthe
processofusingcomputersinpracticalcontexts.
0.2 HOWCOMPUTERSWORKANDWHYWEMADE
THEM
Thereasonpeopleusecomputersisdifferentdependingonthepointinhistoryinwhich
one looks, but the military always seems to be involved. There have been many
calculating devices built and used throughout history, but the first one that would have
been programmable was designed by Charles Babbage. The military, as well as the
mathematiciansoftheday,wereinterestedinmoreaccuratemathematicaltables,suchas
those for logarithms. At the time these were calculated by hand, but the idea that a
machine could be built to compute more digits of accuracy was appealing. This would
havebeenamechanicaldeviceofgearsandshafts,butitwasnotcompletedduetobudget
andcontractingissues.
Babbage continued his work in design and created, on paper, a programmable
mechanicaldevicecalledtheanalyticalenginein1837.Whatdoesprogrammablemean?
Acalculationdeviceismanipulatedbytheoperatortoperformasequenceofoperations:
addthistothat,thensubtractthisanddividebysomethingelse.Onamoderncalculator
thiswouldbedoneusingasequenceofkeypresses,butonolderdevicesitmayinvolve
movingbeadsalongwiresorrotatinggearsalongshafts.Nowimaginethatthesequence
ofkeypressescanbeencodedonsomeothermedia:asetofcams,orplugsintosockets,
orholespunchedintocards.Thisisaprogram.
Suchasetofpunchedcardsorcamswouldbesimilartoasetofinstructionswrittenin
English and given to some human to calculate, but would instead be coded in a form
(language)thatthecomputingdevicecoulduseimmediately.Thedirectionsonthecards
could be changed so that something new could be computed as needed. The difference
enginewouldonlyfindlogarithmsandtrigonometricfunctions,butadevicethatcouldbe
programmedin this way could, intheory,calculateanything. The analytical enginewas
programmedbypunchingholesinstiffcards,anideathatwasderivedfromtheJacquard
loom of the day. The location of holes would indicate either an operation (e.g., add,
subtract,etc.)ordata(anumber).Asequenceofsuchcardswouldbeexecutedoneata
timeandyieldavalueattheend.
Althoughtheanalyticalenginewasnevercompleted,aprogramwaswrittenforit,but
notbyhim.Theworld’sfirstprogrammermayhavebeenawoman,AugustaAdaKing,
CountessofLovelace.SheworkedwithBabbageforafewyearsandwroteaprogramto
computeBernoullinumbers.Thisisthefirstalgorithmeverdesignedforacomputerandis
often claimed to be the first computer program ever written, although it was never
executed.
The very concept of programmability is a more important development than is the
developmentofthedifferenceoranalyticalengines.Theideathatamachinecanbemade
tododifferentthingsdependingonauser-definedsetofinstructionsistheverybasisofall
moderncomputers,whiletheuseofmechanicalcalculationhasbecomeobsolete;itistoo
slow,expensive,andcumbersome.Thisiswhereitbegan,though,andtheprogramming
conceptisthesametoday.
During World War II computers made the leap to being electrical. Work on breaking
codesandbuildingtheatomicbombrequiredlargeamountsofcomputing.Initiallysome
ofthiswasprovidedbyroomsfullofhumansoperatingmechanicalcalculators,butthey
couldnotkeepupwiththedemand,soelectroniccomputersweredesignedandbuilt.The
firstwasColossus,designedandbuiltbyTommyFlowersin1943.Itwascreatedtohelp
breakGermanmilitarycodes,andanupdatedversion(MarkII)wasbuiltin1944.
IntheUnitedStatestherewasaneedforcomputationalpowerinLosAlamoswhenthe
firstnuclearweapons werebeing built.Electro-mechanicalcalculators werereplaced by
IBMpunched-cardcalculators,originallydesignedforaccounting,andthesewerealittle
faster than the humans running calculators, but could run twenty-four hours a day and
made fewer errors. This computer was programmed by plugging wires into sockets to
createnewconnectionsbetweencomponents.
0.2.1 Numbers
Theelectroniccomputersdescribedsofar,andthoseofthe1940sgenerally,hadalmost
no storage for numbers. Input was through devices like cards, and they could have
numbersonthem.Theycouldbetransferredtothecomputationunit,thenmovedaheador
back,andperhapsreadagain.Memorywasaprimitivething,andvariousmethodswere
devisedtostorejustafewdigits.Asignificantadvancecamewhenengineersdecidedto
usebinarynumbers.Thiswillrequiresomeexplanation.
Electronicdevicesusecurrentandvoltagetorepresentinformation,suchassoundsor
pictures(radioand television).One ofthe simplestdevices isa switch,which canopen
and close a circuit and turn things like lights on and off. Electricity needs a complete
circuitorroutefromthesourceofelectrons,thenegativepoleofabatteryperhaps,tothe
sink,whichcouldbethepositivepole.Electrons,whichiswhatelectricityis,inasimple
sense,flowfromthenegativetothepositivepolesofabattery.Electricitycanbemadeto
do work by putting devices in the way of the flow of electrons. Putting a lamp in the
circuitcancausethelamptolightup,forexample.
A switch makes a break in the circuit, which stops the electrons from flowing; they
cannotjumpthegap.Thiscausesthelamptogodark.Thisseemsobvioustoanyonewith
electriclightsintheirhouse,butwhatmaynotbesoobviousisthatthiscreatestwostates
of the circuit, on and off. These states can be assigned numbers. Off could be 0, for
example,andoncouldbe1.Thisishowmostcomputersrepresentnumbers:ason/offor
1/0states. Tobe more clearabout this way ofrepresenting numbers, considerthe usual
way,whichiscalledpositionalnumbering.
Mosthumansocietiesnowuseasystemthatusestendigits:0,1,2,3,4,5,6,7,8,and
9.Thenumber123isacombinationofdigitsandpowersoften.Itisashorthandnotation
for100+20+3,or1×102+2*101+3*100.Eachdigitismultipliedbyapoweroften
andsummedtogetthevalueofthenumber.Anyonewhohasbeentoschoolacceptsthis
anddoesnotthinkaboutit,butreallythevalueusedasthebasisofthesystem,ten,isnot
magical.Itsimplyhappenstobethenumberofdigitshumanshaveontheirhands.Any
basewouldworkalmostaswell.
Example:Base
Numbersthatuse4asabasecanonlyhavethedigits0,1,2,and3.Eachpositioninthe
numberrepresentsapowerof4.Thusthenumber123is,inbase4,1×42+2*41+3*40,
whichis1×16+2*4+3=16+8+3=27intraditionalbase10representation.
This could get confusing, what with various bases and such, so numbers will be
considered to be in base 10 unless specific by a suffix. 1234 is 123 in base 4, whereas
1238is123inbase8,andsoon.
Binary numbers can have digits that are 1 or 0. The numbers are in base 2, and can
thereforeonly have the digits 0 and 1. These numbers can be representedby the on/off
stateofaswitchortransistor,whichisanelectronicswitch,whichiswhytheyareusedin
electroniccomputers.Moderncomputersrepresentalldataasbinarynumbersbecauseitis
easytorepresentthosenumbersinelectronicform;avoltageisarbitrarilyassignedto“0”
and to “1.” When a device detects a particular voltage, it can then be converted into a
digit,andviceversa.If2voltsisassignedtoa0and5voltsisassignedtoa1,thenthe
followingcircuitcouldsignala0or1dependingonwhatswitchwasselected:
ConvertBinaryNumberstoDecimal
Considerthebinarynumber110112.Itcanbeconvertedinbase10bymultiplyingeach
digitbyitscorrespondingpoweroftwoandthensummingtheresults.
Someobservations:
•Terminology:Adigitinabinarynumberiscalledabit(forbinarydigit)
•Anyevennumberhas0asthelowdigit,whichmeansthatoddnumbershave1asthe
lowdigit.
•Anyexactpoweroftwo,suchas16,32,64,andsoon,willhaveexactlyonedigit
thatisa1,andallotherswillbe0.
•Terminology:Abinarydigitorbitthatis1issaidtobeset.Abitthatis0issaidtobe
clear.
ConvertDecimalNumberstoBinary
Going from base 10 to base 2 is more complicated than the reverse. There are a few
waystodothecalculation,buthere’sonethatmanypeoplefindeasytounderstand.Ifthe
lowest digit (rightmost) is 1 then the number is odd, and otherwise it is even. If the
number 7310 is to be converted into binary the rightmost digit will be 1, because the
numberisodd.
Thenextstepistodividethenumberby2,eliminatingtherightmostbinarydigit,the
one that was just identified, from the number. 7310 / 210 = 3610, and there can be no
fractionalpart,soanysuchpartistobediscarded.Nowtheproblemistoconvert3610to
binaryandthenappendthepartalreadyconvertedtothat.Is3610evenorodd?Itiseven,
sothenextdigitis0.Thefinaltwodigitsof7310inbinaryare01.
Theprocessisrepeated:
Divide36by2toget18,whichiseven,sothenextdigitis0.
Divide18by2toget9,whichisodd,sothenextdigitis1.
Divide9by2toget4,whichiseven,sothenextdigitis0.
Divide4by2toget2,whichiseven,sothenextdigitis0.
Divide2by2toget1,whichisodd,sothenextdigitis1.
Divide1by2toget0.Whenthenumberbecomes0,theprocessiscomplete.
Theconversionprocessgivesthebinarynumbersinreverseorder(righttoleft)sothe
resultisthat7310=10010012.
Isthiscorrect?Convertthisbinarynumberintodecimalagain:
10010012=1×20+1*23+1*26=1+8+64=7310.
Asummaryoftheprocessforconvertingxintobinaryis:
Startatdigitn=0(rightmost)
repeat
Ifxiseven,thecurrentdigitnis0,otherwiseitis1
Dividexby2
Add1ton
Ifxiszero,thenendtherepetition
ArithmeticinBinary
Computers do all operations on data as binary numbers, so when two numbers are
added,forexample,thecalculationisperformedinbase2.Itturnsoutthatbase2iseasier
thanbase10forsomethings,andaddingisoneofthosethings.It’sdoneinthesameway
asinbase10butthereareonly2digits,andtwosarecarriedinsteadoftens.Forexample:
add010112to011102:
010112
011102
Startingthesumontherightasusual,thereisa0addedtoa1andthesumis1,justas
inbase10.
010112
011102
–––—
12
The next column in the sum contains two 1s. 1 + 1 is two, but in binary that is
representedas102.So,theresultof1+1is0withacarryof1:
1
010112
011102
–––—
012
Thenextcolumnhas1+0,butthereisacarryof1soitis1+0+1.That’s0witha1
carryagain:
1
010112
011102
–––—
0012
Nowthecolumnis1+1witha1carry,or1+1+1.Thisis1withacarryof1:
1
010112
011102
–––—
10012
Finally, the leading digits are 0 + 0 with a carry of 1, or 0 + 0 + 1. The answer is
110012.Isthiscorrect?Well,010112is1110and011102is142,and1110+1410=2510.
Theanswer110012is,infact,2510(confirmthis!)soitallworksout.
Binarynumberscanbesubjectedtothesameoperationsasanyotherformofnumber
(i.e.,multiplication,subtraction,division).Inaddition,theseoperationscanbeperformed
byelectroniccircuitsoperatingonvoltagesthatrepresentthedigits1and0.
0.2.2 Memory
Adding memory to computers was another huge step forward. A computer memory
mustholdsteadyacollectionofvoltagesthatrepresentdigits,andthedigitsarecollected
intosets,each ofwhich is anumber.A switchcan hold abinary digit,butswitches are
activated by people. Computer memory must store and recall (retrieve) numbers when
theyarerequiredbyacalculationwithouthumanintervention.
The first memories were rather odd things: acoustic delay lines store numbers as a
soundpassingthroughmercuryinatube.Thespeedofsoundallowsasmallnumberof
digits,around500,tobestoredintransitfromaspeakerononeendtoareceiveronthe
other. A phosphor screen can be built that is activated by an electric pulse and draws a
brightspotonascreenthatneedsnopowertomaintainit.Numberscanbesavedasbright
anddarkspots(1and0)andretrievedusinglightsensitivedevices.
Other devices were used in the early years, such as relays and vacuum tubes, but in
1947 the magnetic core memory was patented, in which bits were stored as magnetic
fieldsinsmalldonut-shapedelements.Thiskindofmemorywasfasterandmorereliable
than anything used before, and even held the data in memory without power being
applied,ahandythinginapowerfailure.Itwasalsoexpensive,ofcourse.
This kind of memory is almost never used anymore, but its legacy remains in
terminology:memoryisstillfrequentlyreferredtoascore,andacoredumpisstillwhat
manypeoplecallalistingofthecontentsofacomputermemory.
Currentcomputers use transistors tostore bitsand solidstate memoriesthat canhold
billionsofbits(Gigabits),butthewaytheyareusedinthecomputerisstillthesameasit
was.Bitsarecollectedintogroupsof8(abyte)andthengroupsofmultiplebytestofora
word. Words are collected into a linear sequence, each numbered starting at 0. These
numbersarecalledaddresses,andeachword,andsometimeseachbyte,canbeaccessed
by specifying the address of the data that is wanted. Acquiring the data element at a
particular location is called a fetch, and placing a number into a particular location is a
store.Acomputerprogramtoaddtwonumbersmightbespecifiedas:
Fetchthenumberatlocation21
Fetchthenumberatlocation433
Addthosetwonumbers
Storetheresultinlocation22
Thismayseemlikeaverbosewaytoaddtwonumbers,butrememberthatthiscanbe
accomplishedinatinyfractionofasecond.
Memoryisoftenpresentedtobeginningprogrammersasacollectionofmailboxes.The
addressisanumberidentifyingthemailbox,whichalsocontainsanumber.Thereissome
specialmemoryinthecomputerthathasnospecificaddress,andisreferredtoinvarious
ways.Whenafetchisperformed,thereisaquestionconcerningwherethevaluethatwas
fetchedgoes.Itcangotoanothermemorylocation,whichisamoveoperation,oritcango
intooneofthesespeciallocations,calledregisters.
Acomputercanhavemanyregistersorveryfew,buttheyareveryfastmemoryunits
that are used to keep intermediate results of computations. The simple program above
wouldnormallyhavetobemodifiedtogiveregistersthatareinvolvedintheoperations:
Fetchthenumberatlocation21intoregisterR0
Fetchthenumberatlocation433intoregisterR1
AddR1andR0andputtheresultintoR3
StoreR3(theresult)inlocation22
Thisisstillverbose,butmorecorrect.
0.2.3 StoredPrograms
The final critical step in creating the modern computer occurred in 1936 with Alan
Turing’s theoretical paperon the subject, butan actual computer toemploy the concept
was not built until 1948 when the Manchester Small-Scale Experimental Machine ran
whatisconsideredtobethefirststoredprogram.Ithasbeenthebasicmethodbywhich
computersoperateeversince.
Theideaistostoreacomputerprograminmemorylocationsinsteadofoncardsorin
some other way. Programs and data now coexist in memory, and this also means that
computerprogramshavetobeencodedasnumbers;everythinginacomputerisanumber.
Therearemanydifferentwaystodothis,andmanypossibledifferentinstructionsetsthat
have been implemented and various different configurations of registers, memory, and
instructions.Thecomputerhardwarealwaysdoesthesamebasicthing:firstitfetchesthe
next instruction to be executed, and then it decodes it and executes it. Executing an
instructioncouldinvolvemoreaccessestomemoryorregisters.
This repeated fetch then execute process is called, not surprisingly, the fetch-execute
cycle,andisattheheartofallcomputers.Thelocationoraddressofthenextinstruction
resides in a register called the program counter, and this register is incremented every
time an instruction is executed, meaning that instructions will be placed in consecutive
memorylocationsandwillbefetchedandexecutednaturallyinthatorder.Sometimesthe
instructionisfetchedintoaspecialregistertoo,calledtheinstructionregister, so that it
canbeexaminedquicklyforimportantcomponentslikedatavaluesoraddresses.Finally,
acomputerwillneedatleastoneregistertostoredata;thiswillbecalledtheaccumulator,
becausethat’susuallywhatsucharegisteriscalled.
Thestoredprogramconceptisactuallyprettydifficulttograsp,soadetailedexampleis
inorder.Imagineacomputerthathas12bitwordsasmemorylocationsandthatpossesses
theregistersdescribedabove.Thisisafictionalmachine,butitturnsouttohavesomeof
thepropertiesofanoldcomputerfromthe1960scalledthePDP-8.
Todemonstratetheexecutionofaprogramonastoredprogramcomputeraverysimple
program will be used: add 21 and 433, placing the answer in location 11. As an initial
assumption, assume that the value 21 is in location 9 and 433 is in location 10. The
programitselfwillresideinconsecutivememorylocationsbeginningataddress0.
The program should be described in English first. Note that it is very much like the
previous two examples, but in this case there is only one register to put data into, the
accumulator.Theprogramcouldperhapslooklikethis:
Fetchthecontentsofmemorylocation9intotheaccumulator
Addthecontentsofmemorylocation10totheaccumulator
Storethecontentsoftheaccumulatorintomemorylocation11
Theprogramisnowcomplete,andtheresult21+433shouldbefoundinlocation11.
Computerprogramsarenormallyexpressedintermsthatthe computercanimmediately
use,normallyasfairlyterseandprecisecommands.Thenextstageinthedevelopmentof
thisprogram is to use a symbolicform of the actual instructionsthat the computer will
use.
Thefirststepistomovethecontentsoflocation9totheaccumulator.Theinstruction
thatdoesthiskindofthingiscalledLoadAccumulator,shortedasthemnemonicLDA.
Theinstructionwouldbeinlocation0:
0:LDA9#Loadaccumulatorwithlocation9
The text following the “#” character is ignored by the computer, and is really a
commenttoremindtheprogrammerwhatishappening.Thenextinstructionistoaddthe
contents of location 10 to the accumulator; the instruction is ADD and it is placed in
address1:
1:ADD10#Addcontentsofaddress10totheaccumulator
Finally,the result, current in the accumulator register,will be savedinto the memory
locationataddress11.ThisisaStoreinstruction:
2:STO11#Answerintolocation11
Theprogramiscomplete.ThereisaHaltinstruction:
3:HLT#Endofprogram
If this program starts executing at address 0, and if the correct data is in the correct
locations,thentheresult454shouldendupinlocation11.Buttheseinstructionsarenot
yetinaformthecomputercanuse.Theyarecharacters,textthatahumancanread.Ina
stored program computer these instructions must be encoded as numbers, and those
numbersmustagreewiththeonesthecomputerwasbuilttoimplement.
Aninstructionmustbeabinarynumber,soallofthepossibleinstructionshavenumeric
codes.Aninstructioncanalsocontainamemoryaddress;theLDAinstructionspecifiesa
memorylocationfromwhichtoloadtheaccumulator.Boththeinstructioncodeandthe
addresshavetobeplacedintoonecomputerword.Thedesignersofthecomputerdecide
howthatwillbedone,andtheprogrammershavetolivewiththeresult.
This computer has 12 bit words. Imagine that the upper 3 bits indicate what the
instructionis.Thatis,atypicalinstructionisformattedlikethis:
Thereare9bitsatthelower(right)endoftheinstructionforanaddress,and3atthetop
endforthecodethatrepresentstheinstruction.ThecodeforLDAis3;thecodeforADD
is5andthecodeforSTOis6.HLTonmostcomputersthathavesuchaninstructionis
code0.Hereiswhattheprogramlookslikeasnumbers:
Code3Address9
Code5Address10
Code6Address11
Code0Address0
Thesehaveto bemade intobinary numbersto bestored in memory, butthat’spretty
easy.FortheLDAinstructionthecode310is0112andtheaddressis910=0000010012,
so the instruction as a binary number is 011 0000010012, where the space between the
codeandtheaddressisonlypresenttomakeitobvioustoapersonreadingit.
The ADD instruction has code 510, which is 1012, and the address is 10, which in
binaryis00010102.Theinstructionis1010000010102.
The STO instruction has code 6, which is 1102, and the address is 11, which is
0010112.Theinstructionis1100000010112.
TheHLTinstructioniscode0,orin12bitbinary0000000000002.
Thecodes aremade up bythe designersof the computer. Whenmemory is setup to
containthisprogramhere’swhatitlookslike:
Thisishowmemorylookswhentheprogrambegins.Theactofsettingupthememory
likethissothattheprogramcanexecuteiscalledloading.Thebinarynumbersinmemory
locations9and10are21and433respectively(checkthis!),whicharethenumberstobe
summed.
Of course there are more instructions than these in a useful computer. There is not
alwaysasubtractinstruction,butsubtractioncanbedonebymakinganumbernegative
andthenadding,sothereisoftenaNEGateinstruction.Settingtheaccumulatortozerois
acommonthingtodo,sothereisaCLA(ClearAccumulator)instruction;andthereare
manymore.
The fetch-execute cycle involves fetching the memory location addressed by the
programcounterintotheinstructionregister,incrementingtheprogramcounter,andthen
executingtheinstruction.Executioninvolvesfiguringoutwhatinstructionisrepresented
bythecodeandthensendingtheaddressordatathroughthecorrectelectroniccircuits,a
processbeyondanythingthischapterwilladdress.
Averyimportantinstructionthatthisprogramdoesnotuseisabranch.Theinstruction
BRA0willcausethenextinstructiontobeexecutedstartingatmemorylocation0.This
allows a program to skip over some instructions or to repeat some many times. A
conditional branch will change the current instruction if a certain condition is true. An
example would be Branch if Accumulator is Zero (BAZ), which will only perform a
branch if, as the instruction indicates, there is a value of zero in the accumulator. The
combinationofarithmeticandcontrolinstructionsmakesitpossibleforaprogrammerto
describeacalculationtobeperformedveryprecisely.
0.3 COMPUTERSYSTEMSAREBUILTINLAYERS
Enteringaprogramasbinarynumbersusingswitchesisaverytedious,time-consuming
process. Lacking a disk drive, the early computers depended on other kinds of storage:
punched cards again, or paper tape. It should be understood that because there was no
permanent storage, booting one of these machines often meant toggling a small “boot
loader”program,thenreadingapapertape.Nowthecomputerwouldrespondsensiblyto
itsperipheraldevices,likeaprinterorcardreader.Thepapertapecontainedaprimitive
“operating system” that would control the few devices available. That’s what operating
systemsdo:allocateresourcesandcontroldevices.
Thebootloader(bootstrapprogram)isthelowestlayerofsoftware.Itwasprovidedby
thecomputermanufacturer,buthastobeenteredbytheuser.Thepapertapesystemwas
the second layer, and the user did not have to write this program. Gradually more and
morelayerswerewrittensoastoprovidetheuserwithahighlevelofabstractionrather
than having to understand the entire machine. After all, physicists and engineers have
otherthingstodoratherthantendtothecomputer.
Whendiskdrivesbecameavailable,theoperatingsystemwouldbestoredonthem,and
abootstraploaderwouldbesavedinaspecialsectionofmemorythatcouldnotbeerased
(read only memory) so that when the computer was turned on it would run the loader,
which would load the operating system. Very convenient, and it is essentially what
happenstodayonWindows.
Thisoperating systemon the diskdrive is athird layerof software. Itprovides basic
hardwareallocationfunctionalityandalsogivestheuseraccesstosomeprogramstouse
forprintingandsavingthingsondisk—afilesystem.
0.3.1 AssemblersandCompilers
Programming a computer could still be a daunting task if done in binary, so the first
thing that was provided was an assembler. This was a program that would permit a
programmertoentera textprogramthatcould beconvertedinto abinaryexecutable.It
would allow memory locations to be named instead of using an absolute number as an
address,andwouldconverttextoperationcodesandaddressesintobinaryprograms.The
additionprogramfromtheprevioussectioncouldbewritteninassembleras:
LDAData1
ADDData2
STORes
HLT
Data1:21
Data2:433:
Res:0
Usuallyonelineoftextinanassemblercorrespondstoasingleinstructionormemory
location.It’sthesameprogrambutiseasierforaprogrammertounderstand because of
thenamedmemorylocationsandmnemonicinstructionnames.
Itismuchhardertodescribehowacompilerworks,butrelativelyeasytoexplainwhat
itdoes.Acompilertranslateshighlevellanguagestatementsintoassembler,whichinturn
convertsitintobinarycode.Compilerstranslatestatementslike:
A=21
B=433
C=A+B
into executable code. It is a very complex process, but essentially it allows the
programmer to declare that certain names represent integers, that values are to be
assigned,and that arithmeticcan be done. Thereare also more complexstatements like
conditionalexecutionofcodeandfunctioncallswithparameters,aswillbeseeninlater
chapters.
Compilersalsoimplementinputandoutputfromtheuser(readingfromakeyboardand
writing to the video screen), sophisticated data types, and mathematical functions. An
interpreter,whichiswhatthelanguagePythonis,doesapartofthecompilationprocess
but does not produce executable code. Instead, it simulates the execution of the code,
doingmostoftheworkinsoftware.TheJavalanguagedoesasimilarthinginmanycases.
Theprogramsthatsomeonewrites(software)createanotherlayerforsomeonetouse.
An example might be a database management system that gives a user access to a
computer that can query data for certain kinds of values. A graphics system gives a
programmeraccesstoasetofoperationsthatcandrawpictures.
0.3.2 GraphicalUserInterfaces(GUIs)
Mostcomputers now interfacewith their owners througha keyboard, one of thefirst
devicestobeinterfacedtoacomputer;amouse,thefirstdevicetopermit2Dnavigation
on a screen; and windows, a graphical construction that allows many independent
connectionstoacomputertoshareasinglevideoscreen.GUIsarepopularbecausethey
improve the user’s perception of what is happening on a computer. Previous computer
interfaceswerecompletelytextbased,soifsomethingwasgoingwronginaplacewhere
theuserwasnotlooking,thenitwouldprobablynotbenoticed.
Ontheotherhand,GUIsaremoredifficulttoprogram.Justopeninganewwindowina
Microsoft-basedoperatingsystemcanrequirescoresoflinesofC++codethatwouldtake
agreatdealoftimetounderstand.Naturally,itisthejobofaprogrammertobeabletodo
this, but it means that the average user could not create their own software that
manipulated the interface in any reasonable way. So, what is a window, and what’s
involvedinaGUI?
Awindow,intheoperatingsystemsense,isarectangleonthecomputerscreenwithin
which an exchange of information takes place between the user and the system. The
rectangle can generally be resized, removed from the screen temporarily (minimized),
moved,andclosed.Itcanbethoughtofasavirtualcomputerterminalinthateachonecan
dowhattheentirevideoscreenwasneededtodoinearlysystems.Whenthewindowis
active,ausercantypeinformationtobereceivedbytheprogramcontrollingit,andcan
manipulategraphicalobjectswithinthewindowusingamouseor,morerecently,byusing
their fingers on a touch screen. Without a mouse or something like it, a window-based
systemisprettymuchcrippled,sothetwoarealmostalwaysusedtogether.
The mouse is a variation on the tracker ball, and it is agreed that the German
engineeringcompanyTelefunkendevisedaworkingversionandwasthefirsttosellit.A
mouseislinkedthroughsoftwaretoacursoronthescreen,andleft-rightmotionsofthe
mousecauseleft-rightmotionsofthecursor;forwardandbackwardmotionsofthemouse
causethecursortomoveupanddownthescreen.Whenthecursorisinsideofawindow,
thenthatwindowisactive.Amousehasbuttons,andpressingamousebuttonactivates
whatever software object is related to the cursor position on the screen. This describes
thingsthatareobvioustoanyoneusedtocomputersbuiltsincethe1980s.
Widgets
Awidgetisagraphicalobjectdrawninawindoworotherwiseonacomputerscreen
thatcanbeselectedand/oroperatedusingthemouseandmousebuttons.Itisconnectedto
asoftwareelementthatwillbesentacontrolsignalornumericalparameterbyvirtueof
the widget being manipulated. That’s a pretty formal description, but a widget is
exemplifiedby the button, a very commonly used widget on web pages and interfaces.
Buttonscanbeusedtodisplayinformationaswellastocontrolaprogram.Somepopular
widgetsare:
Button:Whenthemousecursoriswithintheboundariesofthebuttononthescreen,
thebuttonissaidtobeactivated.Pressingamousebuttonwhenthebuttonwidgetis
activatedwillcausethesoftwareconnectedtothebuttontoperformitsfunction.
Radio Button: A set of two or more buttons used to select from a set of discrete
options.Onlyoneofthebuttonscanbeselectedatatime,meanthattheoptionsare
mutuallyexclusive.
CheckBox:Awaytoselectasetofoptionsfromalargerset.Thiswidgetconsistsof
acollectionofboxesorbuttonsthatcanbechosenbyclickingonthem.Whenchosen,
they indicate that fact by using a graphical change, sometimes a check mark but
sometimesacolororothervisualeffect.
Slider:Ahorizontalorverticalcontrolwithaselectiontoolthatcanbeslidalongthe
control.Therelativepositionofthecontroldictatesthevaluethatthewidgetprovides.
Thisvalueisoftendisplayedinatextbox,andtherangeisalsocommonlydisplayed.
Drop-downList: A box containing text that displays a complete set of options that
can be displayed when the mouse button is clicked within it. Then any one of the
optionscanbeselectedusingthemouseandthemousebutton.
Icon: An icon is a small graphical representation (pictogram) that represents the
functionofaprogramorfile.Whenselected,theprogramwillexecuteorthefilewill
beopened.
There are many other widgets and variations on the ones shown here. There are two
basicprinciplesatplay:
1.Thewidgetrepresentsanactivityusingacommonlyunderstoodsymbol,and
performsthatactivity,oronerelatedtothesymbol,whenselectedusingthemouse.
Thisisagraphicalandtactileoperationthatreplacesthetypingofacommandin
previouscomputersystems.
2.Thesoftwarethatimplementsthewidgetisamodule,apieceofsoftwarethatcan
bereusedandreconfiguredforvariouscircumstances.Abuttoncanbequickly
createdtoperformanynumberoftasksbecausetheprogramthatimplementsitis
designedforthatdegreeofflexibility.
0.4 COMPUTERNETWORKS
Schools, offices, and some homes are equipped with computer networks, which are
wiresthatconnectcomputerstogetherandsoftwareandspecialhardwarethatallowsthe
computers to communicate with each other. This allows people to send information to
eachotherthroughtheircomputers;alotofworkisdoneinacomputerreadableformin
anycase,anditisconvenienttoallowcomputerstoshareinformation.Buthowdoesthis
reallywork?
Computersuseelectricitytoperformcalculationsonbinarynumbers.Arbitraryvoltages
have been selected to represent 0 and 1, and so long as everyone agrees on that
representation, those voltages can be sent along a wire no matter how long and still be
numbersatthereceivingend.Aslongastwocomputersarebeingconnectedthisworks
fine,butiftwowiresareneededtoconnectanytwocomputersthensixareneededtofully
connectthreecomputerstoeachotherandtwelvetoconnectfourcomputers.Aroomwith
thirtynetworkedcomputerswouldbefullofwires(870toeachcomputer)!Theremustbe
abetterway.
Hawaiihasanunusualproblemwhenitcomestocomputernetworkcommunication.It
isacollectionofislands.Linkingthembycablesisanexpensiveproposition.Intheearly
1970sthefolksattheUniversityofHawaiihadagoodidea—tolinkthecomputersusing
radio. Radio transmission is really similar to wire transmission in many practical ways,
andallocating35radiofrequenciestoconnectonecomputeroneachislandtoallofthe
otherswouldhavebeenpossible,buttheirideawasbetter.Theyusedasingleradiolink
for all computers. When a computer wanted to send information along the network, it
wouldlistentoseeifanothercomputerwasalreadydoingso.Ifso,itwouldwait.Ifnot,it
would begin to send data to all of the other computers and would include in the
transmissionacodeforwhichcomputerwassupposedtoreceiveit.Allcouldhearit,but
allwouldknowwhichcomputerwasthecorrectdestinationsotheotherswouldignoreit.
ThissystemwascalledAlohanet.
There is a problem with this scheme. Two or more computers could try to send at
almost the same time, having noted that no other computer was sending when they
checked. This is called a collision, and is relatively easy to detect; the data received is
nonsense.Whenthathappenseachcomputerwaitsforarandomtime,checksagain,and
triesagaintosendthedata.Ananalogywouldbeameetingwheremanypeoplearetrying
tospeakatonce.
Obviously the busier the network is the more likely a collision will be, and the
retransmissions will make things worse. Still, this scheme works very well and is
functioningtodayintheformofthemostcommonnetworkingsysteminearth—Ethernet.
Ethernet is essentially Alohanet along a wire. It means that each computer has one
connectiontoit,ratherthanconnectionstoeachofthepossibledestinations,andcollisions
arepossible.Thereisanotherconsiderationthatmakesthisschemeworkbetter,andthatit
isuseofpackets.Informationalongthesenetworksissentinfixed-sizepackagesofafew
thousand bytes. In this way, the time needed to send a packet should be more or less
constant,andit’smoreefficientthansendingabitorabyteatatime.
Eachpacketcontainsasetofdatabytesintendedforanothercomputer,sowithinthat
packetshouldbesomeinformationaboutthedestination,thesender,andotherimportant
stuff.Forinstance,ifadatafileisbiggerthanapacket,thenithastobesplitupintoparts
tobesent.Thus,apartofthepacketisasequencenumberindicatingwhichpacketitis
(e.g.,number3of5).Ifaparticularpacketnevergetsreceivedforsomereason,thenthe
missing one is known, and the receiver can ask the sender for that packet to be resent.
Therearealsocodesthancanbeusedtodeterminewhetheranerrorhasoccurred.
0.4.1 Internet
The Internet is a computer network designed to communicate reliably over long
distances. It was originally created to be a reliable communications system that could
surviveanuclearattack,andwasfundedbythemilitary.Itisdistributed,inthatdatacan
besentfromonecomputertoanotherinachainuntilitreachesitsdestination.
Imagine a collection of a few dozen computers, and that each one is connected to
multiple others, but not directly to all others. Computer A wishes to send a message to
computerB,anddoessousingapacketthatincludesthedestination.ComputerAsends
themessagetoallcomputersthatitisconnectedto.Eachofthosecomputerssendsittoall
ofthecomputersthattheyareconnectedto,andsoonuntilthedestinationisreached.All
of the computers will receive every message, which is pretty inefficient, but so long as
thereexistssomepathfromAtoBthemessagewillbedelivered.
Itwouldbehardtotellwhentostopsendingamessageinthisscheme.Anotherwayto
do it is to have a table in each computer saying which computers in the network are
connectedtowhichothers.Amessagecanbesenttoacomputerknowntobeashortpath
to the destination, one computer at a time, and in this case not all computers see the
message,onlythe onesalong theroute do.Anewcomputer added to the networkmust
sendaspecialmessagetoalloftheotherstellingthemwhichoftheexistingcomputersit
isdirectlyconnectedto,andthismessagewillpropagatetoallmachines,allowingthemto
updatetheirmap.Thisisessentiallytheschemeusedtoday.
The Internet has a hierarchy of communication links and processors. First, all
computersontheInternethaveauniqueIP(InternetProtocol)addressthroughwhichthey
are reached. Because there are a lot of computers in the world, an IP address is a large
number.Anexamplewouldbe172.16.254.1(obtainedfromWikipedia).Whenacomputer
in,say,Portlandwantstosendamessageto,forexample,London,thePortlandcomputer
composes a packet that contains the message, its address, and the recipient’s address in
London.ThismessageissentalongtheconnectiontoitsInternetserviceprovider,which
isalocalcomputer,atarelativelowspeed,10megabitspersecondperhaps.Theservice
provider operates a collection of computers designed to handle network traffic. This is
calledaPointofPresence(POP)andcollectsmessagesfromalocalareaandconcentrates
themfortransmissionfurtherdowntheline.
Multiple POP sites connect to a Network Access Point (NAP) using much faster
connectionsthanusershavetoconnectwiththePOP.The NAPconcentratesevenmore
users, and provides a layer of addressing that can be used to send the data to the
destination. The NAP for the Portland user would have the message delivered to a
relativelylocalNAP,whichwouldsendittothenextNAPalongapathtothedestination
in London using an exceptionally fast (high bandwidth) data connection. The London
NAPwouldsendthemessagetotheappropriatelocalPOP,whichwouldinturnsenditto
thecorrectuser.
AnimportantconsiderationisthatthemessagecanbereadbyanyPOPnorNAPserver
alongtheroute.DatasentalongtheInternetispublicunlessitisproperlyencryptedbythe
users.
0.4.2 WorldWideWeb
The World Wide Web, or simply the Web, is in fact a layer of software above the
Internetprotocols.Itisawaytoaccessfilesanddataremotelythroughavisualinterface
provided by a program that runs on the user’s computer, a browser. When someone
accesses a web page, a file that describes that page is downloaded to the user and
displayed.Thatfileistextinaparticularformat,andthefilenameusuallyendsin“.html”
or“.htm.”Thefileholds adescriptionofhow todisplaythepage: whattexttodisplay,
whereimagescanbefoundthatarepartofthepage,howthepageisformatted,andwhere
otherconnectedpages(links)arefoundontheInternet.Oncethefileisdownloaded,allof
thehardworkconcernedwiththedisplayofthefile,suchasplayingsoundsandvideos
anddrawinggraphicsandtext,isdonebythelocal(receiving)computer.
TheWebisthebasisformostofthemodernadvancesinsocialnetworkingandpublic
dataaccess.TheInternetprovidestheunderlyingnetworkcommunicationsfacilitywhile
theWebusesthattofetchanddisplayinformationrequestedbytheuserinavisualand
auditory fashion. Podcasts, blogs, and wikis are simple extensions of the basic
functionality.
The Web demands the ability for a user in Portland to request a file from a user in
Londonandtohavethatfiledeliveredandmadeintoagraphicaldisplay,allwithasingle
click of a mouse button. Web pages are files that reside on a computer that has an IP
address, but the IP address is often hidden by a symbolic name called the Universal
Resource Locator (URL). Almost everyone has seen one of these:
“http://www.facebook.com”isoneexample.Webpageseachhaveauniquepathoraddress
basedonaURL.Itisaprettyamazingfactthatanyonecancreateanewwebpagethat
usesitsveryownunambiguousURLatanytime,andthatmostoftheworldwouldbeable
toviewit.
TheWebisanexampleofwhatprogrammerscallaclient-serversystem.Theclientis
where the person requesting the web page lives, and is making a request. The server is
where the web page itself lives, and it satisfies the request. Other examples of such
systemswouldbeonlinecomputergames,Email,Skype,andSecondLife.
0.5 REPRESENTATION
Whenapplyingacomputertoataskorwritingaprogramtodealwithatypeofdata
thatseemstobenon-numeric,theissueofhowtorepresentthedataonthecomputerwill
invariably arise. Everything stored and manipulated on a computer has to be a number.
Whatifthedataisnotnumeric?
A fundamental example of this is character data. When a user types at the computer
keyboard,whatactuallyhappens?Eachkey,and some keycombinations(e.g.,shiftkey
and“1”helddownatthesametime),whenpressedwillresultinelectricalsignalsbeing
sent along a set of wires that connect to an input device on the computer, a USB port
perhaps. While knowing the details of USB and the keyboard hardware is beyond the
scopeofthisbook,itiseasytounderstandthatpressingakeycanresultinanidentifiable
combination of wires being given a voltage. This is in fact a representation of the
character, and one that underlies the one that will be used on the computer itself. As
describedpreviously,voltagescanbeusedtorepresentbinarynumbers.
Therepresentationofcharactersonacomputeramountstoanassignmentofanumber
toeachpossiblecharacter.Thisassignmentcouldbearbitrary,andforsomedataitis.The
valueoftheletter“a”couldbe1,“b”couldbe12,and“c”couldbe6.Thiswouldwork,
butitwouldbeapoorrepresentationbecausecharactersarenotinanarbitraryorder.The
letter“b”shouldbebetween“a”and“c”invaluebecauseitispositionedthereinthedata
set,thesetofcharacters.Inanycase,whencreatinganumericrepresentation,thefirstrule
is:
1.Iftherearearelativelysmallnumberofindividualdataitems,assignthem
consecutivevaluesstartingat0.Ifthereisapracticalreasontostartatsomeother
number,thendoso.
Thesecondruleconsiderstheexistingorderingoftheelements:
2.Incaseswheredataitemsareassignedconsecutivevalues,assigntheminamanner
thatmaintainsanypredefinedorderoftheelements.
Thismeansthatinadefinitionofcharacters,theletters“a,”“b,”and“c”should
appearinthatorder.
3.Incaseswheredataitemsareassignedconsecutivevalues,assigntheminamanner
thatmaintainsanypreexistingdistancebetweentheelements.
Thismeansthattheletters“a,”“b,”and“c”wouldbeadjacenttoeachotherinthe
numericrepresentationbecausetheyarenexttoeachotherinthealphabet.Italso
meansthatcharacterclasseswillstaytogether;theuppercaseletterswillbe
consecutive,thedigitswillalsohaveconsecutivecodessothatthecodefor“0”will
beadjacenttoandsmallerthanthecodefor“1”,andsoon.Thissetofthreerules
actuallycreatesaprettygoodmappingofcharacterstonumbers.However,thereare
morerulesformakingrepresentations.
4.Incaseswheredataitemsareassignedconsecutivevalues,assigntheminamanner
thatsimplifiestheoperationsthatarelikelytobeperformedonthedata.
Inthepresentexampleofcharacterdata,therearerelativelyfewplaceswherethis
rulewouldbeinvoked,butonewouldbewhencomparingcharacterstoeachother.
Acharacter“A”isusuallythoughttocomebefore“a,”sothismeansthatallofthe
uppercaseletterswillcomebeforealllowercaseones,inanumericalsense.
Similarly,“0”comesbefore“A,”soalldigitscomebeforealllettersinthe
representation.Aspacewouldcomebefore(i.e.,haveasmallervaluethan)any
characterthatprints.
Oneofthemostcommoncharacterrepresentations,namedtheAmericanStandard
CodeforInformationInterchangeorASCII,hasallofthesepropertiesandafew
others.ThestandardASCIIcharactersetlists128characterswithnumericalcodes
from0to127.Inthetablebelow,eachcharacterislistedwiththecodethat
representsit.Theyappearinnumericalorder.Thecharactersinorangeare
telecommunicationscharactersthatareneverusedbyatypicalcomputeruser;green
charactersarenon-printingcharactersthatareusedforformattingtextonapage;
lettersandnumbersforEnglisharered;specialcharacterslikepunctuationareblue.
Thespacecharacterisinsomesenseunique,andisblack.
Furtheronthesubjectofrepresentation,ifthereareaverylargenumberofpossible
datavalues,thenenumeratingthemwouldseemunreasonable.Thereareusually
otherwaystoattackthatsortofproblem.
5.Ifthedatacanbebrokenupintoenumerableparts,thentrytodothat.
Datescanbeanexampleofthiskindofdata.Therearetoomanydatestostoreas
discretevalues,asthereisnoactualday0andthereisnopracticalfinaldayinthe
generalcase.However,acommonwaytostateadateistogiveayear,amonth,and
aday.Thisisawkwardfromacomputerperspectivebecauseofthevariablenumber
ofdaysineachmonth,butworksprettywellforhumans.Eachcomponentis
enumerable,soapossiblerepresentationforadatewouldbeasthreenumbers:year,
month,day;itwouldbeYYYYMMDD,whereYYYYisafour-digityear,MMisa
numberbetween0(January)and11(December),andDDisanumberbetween0
and30,whichisthedayofthemonth.
Thisrepresentationshouldkeepthedatesinthecorrectsequence,soDec9,1957
(19571108)willcomeafterAug24,1955(19550723).However,anothercommon
operationondatesistofindthenumberofdaysbetweentwospecifieddates.Thisis
difficult,andtheonlyrepresentationthatwouldsimplifyitwouldbetostart
countingdaysatazeropoint.IfthatzeropointwereJan1,1900,thenthe
representationforthedateOct31,2017wouldbe43037.Thenumberofdays
betweentwodateswouldbefoundbysubtraction.However,printingthedateina
formforhumanstoreadisdifficult.Whenselectingarepresentation,themost
commonoperationsonthedatashouldbetheeasiestonestoperform.
Anotherexampleofthissortorrepresentationiscolor,whichwillbediscussedin
detailinalaterchapter.
6.Whenthedataispartofacontinuousstreamofrealvalues,thenitmaybepossible
tosamplethemand/orquantizethem.
Samplingmeanstorepresentasequencebyusingasubsetofthevalues.Imagineaset
ofnumberscomingfromaseismometer.Thenumbersequencerepresentsmeasurements
ofthemotionofthegroundcapturedcontinuouslybyamechanicaldevice.Itisnormally
OKtoignoresomeofthesevalues,knowingthatbetweenavalueof5.1(whateverthat
means)andavalueof6.3,thenumberswouldhavetakenonallpossiblevaluesbetween
thosetwo;that’swhatcontinuousmeans.
So instead of capturing an infinite number of values, which is not possible, why not
captureavalueeverysecond,ortenthofasecond,oratwhateverintervalmakessensefor
the data concerned. Some data will be lost. The important thing is not to lose anything
valuable.
The samething can bedone spatially. If someone is building a road, then it must be
surveyed.Aset ofheight values forpoints alongthearea tobe occupiedbythe roadis
collectedsothatamodelofthe3Dregioncanbebuilt.Butbetweenanytwopointsthat
can be sampled there is another point that could be sampled, on to infinity. Again, a
decisionismadetolimitthenumberofsamplessothatthemeasurementsaremadeevery
fewyards.Thislimitstheaccuracy,butnotinapracticalway.Theheightatsomespecific
pointmaynothavebeenmeasured,butitcanbeestimatedfromthenumbersaroundit.
The distance between two sample points is referred to casually as the resolution. In
spatial sampling it will be expressed in distance units and says something about the
smallestthingthatcanbepreciselyknown.Intimesamplingitisexpressedinseconds.
Quantization means how accurately each measurement is known. In high school
science,numbersthataremeasurementsaregiventosomenumberofsignificantfigures.
Measuringaweightas110.9881poundswouldseemimpossiblyaccurate,and111would
be a more reasonable number. Quantization in computer terms would be restricting the
numberofbitsusedtorepresentthevalue.Somethingthatisstoredasan8-bitnumbercan
have256distinctvalues,forexample.Iftheworld’stallestpersonisunder8feettall,then
using8bitstorepresentheightwouldmeanthat8feetwouldbebrokenupinto256parts,
whichis0.375inches;thatis,8feetx12inches/foot=96inches,anddividingthisinto
256parts=0.375.Thesmallestdifferenceinheightthatcouldbeexpressedwouldbethis
value,alittleoverathirdofaninch.
Quantization is reflected in the representation as a possible error in each value. The
greaterthenumberofbitspersamplethemoreaccuratelyeachoneisrepresented.Theuse
of sampling and quantization is very common, and is used when saving sounds (MP3),
images(JPEG),andvideos(AVI).
Thereareotherpossibleoptionsforcreatingarepresentationfordata,butthesixbasic
ideasherewillworkmostofthetime,aloneorincombination.Aprogrammerwillspend
mostofhisorhertimelivingwiththeconsequencesoftherepresentationstheychosefor
theirdata.Apoorchoicewillresultinmorecomplexcode,whichgeneratesmoreerrors
andlessoverallsatisfactionwiththeresult.Spendingalittleextratimeatthebeginning
analyzingthepossibilitiescansavealotofeffortlater.
0.6 SUMMARY
Computersaredevicesthathumanshavebuiltinordertofacilitatecomplexcalculations
and are tools for rapidly and accurately manipulating numbers. When humans
communicate with each other, we use a language. Similarly, humans use languages to
communicate with computers. A computer program can be thought of as a sequence of
operationsthatacomputercanperforminordertoaccomplishacalculation.Thekeyis
thatitmustbeexpressedintermsthatthecomputercando.
Early computers were mechanical, using gears to represent numbers. Electronic
computers usually use two electrical states or voltages to represent numbers, and those
numbersareinbinaryorbase2form.Electroniccomputershavememoriesthatcanstore
numbers, and everything stored in memory must be in numeric form. That includes the
instructionsthatthecomputercanexecute.
Computers have been around long enough to provide many layers of computer
programs that can assist in their effective use: graphical user interfaces, assemblers,
compilers for programming languages, web browsers, and accounting packages each
provideauserwithadifferentviewofacomputerandadifferentwaytouseit.Computers
can exchange data between each other using wires over short distances (computer
network) and long ones (Internet). The World Wide Web sits atop the Internet and
provides an easy and effective way for computers all over the world to exchange
informationinanyform.
Everythingstoredandmanipulatedonacomputerhastobeanumber.Whatifthedata
is not numeric? In that case a numeric representation has to be devised that effectively
characterizestheinformationwhilepermittingitsefficientmanipulation.
Exercises
1.Convertthefollowingbinarynumbersintodecimal:
a)0100000
b)0000100
c)0000111
d)0101010
e)0110100101
f)0111111
g)110110110
2.Convertthefollowingdecimalnumbersintobinary:
a)10
b)100
c)64
d)128
e)254
f)5
g)999
3.Corememorywouldnoteraseitselfwhenitspowersourcewasremoved.Give
reasonswhythisisavaluableproperty.
___________________________________________________
___________________________________________________
___________________________________________________
4.Specifyadevicethatisusedfor:
a)Outputonly
b)Inputonly
c)Bothinputandoutput
5.AdaCountessofLovelaceisgenerallyconsideredtobethefirstprogrammer,but
somecontraryinformationhascometolightrecently.Searchtheliteraturefortwo
articlesoneachsideoftheargumentandformulateaconclusion.Wasshe?
6.Whatisthedifferencebetweenacompilerandaninterpreter?Giveanexampleof
each.
7.IdentifyaGUIwidgetthatwasnotdiscussedinthischapter.Sketchitsappearanceand
describeitsoperation.Giveanexampleofasituationwhereitmightbeused.
8.GivetheASCIIcodesforthefollowingcharacters:
a)ꞌPꞌ
b)ꞌ;ꞌ
c)ꞌrꞌ
d)ꞌ,ꞌ
e)ꞌ=ꞌ
9.WhatisthevalueoftheASCIIcodeforthecharacter“1”minusthecodeforthe
character“0”?Whatis“2”-“0”?Whatdoesthissayaboutconvertingfromthe
characterformofanumberintoitsnumericvalueingeneral?
10.Considertheimaginarycomputerdevisedinthischapter.Ithasamemoryinwhich
eachlocationhas12binarydigits(bits)tostoreanumber.Inoneofthememory
locationsthevalue101000000000isseen.Whatisthis?Isitaninstruction,anumber,
acharacter,anaddress,orsomethingelse?Howcanthisbedetermined?
NotesandOtherResources
1.L.Carlitz.(1968).Bernoullinumbers,FibonacciQuarterly,6,71–85.
2.DigitalEquipmentCorporation.(1972).IntroductiontoProgramming,PDP-8
handbookseries,onlineversion
http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/dec/pdp8/handbooks/IntroToProgramming1969.pdf
3.JamesEssinger.(2004).Jacquard’sWeb,OxfordUniversityPress,Oxford,ISBN
978-0-19-280578-2.
4.TonySale.TheColossusComputer1943–1996:HowItHelpedtoBreakthe
GermanLorenzCipherinWWII,M.&M.Baldwin,Kidderminster,2004,ISBN0-
947712-36-4.
5.StephenStephenson.(2013).AncientComputers,PartI-Rediscovery,2nd
Edition,ISBN1-4909-6437-1.
6.A.M.Turing.(1936).OnComputableNumbers,withanApplicationtothe
Entscheidungsproblem.
7.MichaelR.Williams.(1998).The“LastWord”onCharlesBabbage,IEEEAnnals
oftheHistoryofComputing,20(4),10–4,doi:10.1109/85.728225
8.JavierYanes.(2015).AdaLovelace:OriginalandVisionary,butNoProgrammer,
OpenMind,December9,2015,https://www.bbvaopenmind.com/en/ada-lovelace-
original-and-visionary-but-no-programmer/
CHAPTER1
COMPUTERSANDPROGRAMMING
1.1SolvingaProblemUsingaComputer
1.2ExecutingPython
1.3GuessaNumber
1.4Rock-Paper-Scissors
1.5SolvingtheGuessaNumberProblem
1.6SolvingtheRock-Paper-ScissorsProblem
1.7IFStatements
1.8Documentation
1.9Rock-Paper-ScissorsAgain
1.10TypesAreDynamic(Advanced)
1.11Summary
Inthischapter
The vast majority of computers that most people encounter are referred to as digital
computers. This refers to the fact that the computer works on numbers. Other kinds of
computersdoexistbutarenotascommon;analogcomputersoperateinanumberofother
ways, but are usually electrical—they manipulate electrical voltages and currents—or
mechanical—theyusegearsandshaftstocalculateamechanicalresponse.
Thefactthatanyproblemmustbeexpressedinnumericalformhaspresentedaproblem
to some potential programmers. I’m not good at math is a common complaint, and the
beliefthatcomputerprogrammingrequiresaknowledgeof
advancedmathematicsisusedasareasontonotstudyprogramming.Infact,thekindof
mathcommonlyneededwouldmoreproperlybecalledarithmetic,notmath.
Inorderforaproblemtobesolvedusingacomputer,theproblemmustbeexpressedin
a way that manipulates numbers, and the data involved must be numeric. This is often
accomplishedbysomekindofencodingofthedata.Itissocommonthattheprocessis
invisibleonmoderncomputers.Mostdatahaveavarietyofencodingsthathavebeenused
for years and are taken for granted: images in JPEG format or sounds in MP3 are
examplesofcommonlyusedencodingofdataintonumbers.
What can computers do with numbers? Addition, subtraction, multiplication, and
divisionarethebasicoperations,butcomputerscancomparethevalueofnumberstoo.
1.1 SOLVINGAPROBLEMUSINGACOMPUTER
Theprocessbeginswithaproblemtobesolved,andthefirststepistostatetheproblem
asclearlyaspossible.Thisfirststepiscriticallyimportantbecauseunlesstheproblemis
completely understood, its solution on a computer is impossible. Then the problem is
analyzedtodeterminemethodsbywhichitmaybesolved.Ascomputerscanonlydirectly
manipulatenumbers,itiscommonforsolutionsdiscussedatthisstagetobenumericalor
mathematicalandforthemtoinvolvedecidinguponrepresentationsforthedatathatwill
facilitatesolvingtheproblem.Thenasketchofthesolution,perhapsonpaperinahuman
languageandmath,iscreated.Thisistranslatedintocomputerlanguageandthentyped
intocomputerformusingakeyboard.Theresultingtextfileiscalledascript,sourcecode,
or more commonly the computer program. A program called a compiler takes this
programandconvertsitintoaformthatcanbeexecutedonthecomputer.Basically,all
programsareconvertedinto asetof numberscalledmachine code,whichthe computer
canexecute.
WearegoingtolearnalanguagecalledPython.Itwasdevelopedasageneralpurpose
programming language and is a good language for teaching because it makes a lot of
thingseasy.QuiteafewapplicationsarebuiltusingPython;forexample:thegamesEve
OnlineandCivilizationIV,BitTorrent,andDropboxtonameonlyafew.Itisabitlikea
lot of other languages in use these days in terms of structure (syntax) but has some
simplifyingideasthatwillbediscussedinlaterchapters.
Inordertouseaprogramminglanguage,therearesomebasicconceptsandstructures
thatneedtobeunderstoodatabasiclevel.Someoftheseconceptswillbeintroducedin
thischapter,andtherestofthebookwillteachyoutoprogrambyexample;inallcases,
codingexampleswillbeintroducedbystatingaproblemtobesolved.Theproblemstobe
solvedinthischapterinclude:asimpleguess-a-numbergameandthegameofrock-paper-
scissors.TheseproblemswillbethemotivationforlearningmoreabouteitherthePython
language itself or about methods of solving problems. Any computer programs in this
bookwillexecuteonacomputerrunninganymajoroperatingsystemoncethefreePython
languagedownloadhasbeeninstalled.
1.2 EXECUTINGPYTHON
InstallingPythonisnottoodifficult,andinvolvesdownloadingtheinstaller,runningit,
andperhapsconfiguringafewspecificdetails.Thisprocesscanbefoundonthenet.Once
installedthereareafewvariationsthatcanbeusedwithit,thesimplestprobablybeingthe
PythonGraphicalUserInterfaceorGUI.IfrunningPythononaWindowsPC,lookatthe
Start menu for Python and click; a link named “IDLE (Python GUI)” will be seen, as
showninFigure1.1.Clickonthisandtheuserinterfacewillopen.Clickthemouseinthe
GUIwindowsothatyoucanstarttypingcharactersthere.
PythoncanberuninteractivelyintheGUIwindow.Thecharacters“>>>”arecalleda
prompt, and indicate that Python is waiting for something to be typed at the keyboard.
AnythingtypedherewillbepresumedtobeaPythonprogram,oratleastpartofone.Asa
demonstration, type “1” followed by “Enter.” Python responds by printing “1.” Why?
When“1”wastypeditwasaPythonexpression,somethingtobeevaluated.Thevalueof
“1”issimply“1,”sothatwastheanswerPythoncomputed.
Nowtype“1+1.”Pythonrespondswith“2.”Pythoninputswhattheuser/programmer
types,evaluatesitasamathematical(inPythonform)expression,andprintstheanswer.
Thisisnotreallyprogrammingyet,becauseabasictwo-dollarcalculatorcandothis,butit
iscertainlyastart.
IDLEisgoodformanythings,buteventuallyamoresophisticatedenvironmentwillbe
needed,onethatcanindentautomatically,detectsomekindsoferrors,andallowprograms
toberunanddebuggedandsavedasprojects.Thiskindofsystemiscalledanintegrated
development environment, or IDE. There are many of these available for Python, some
costing quite a lot and some freely downloadable. The code in this book has been
compiledandtestedusingonecalledPyCharm,butmostIDEs outtherewouldbe fine,
anditislargelyamatterofpersonalpreference.BasicPyCharmisfree,andithasabigger
brotherthatcostsasmallamount.
AnadvantageofanIDEisthatitiseasytotypeinawholeprogram,runit,findthe
errors, fix them, and run it again. This process is repeated until the program works as
desired. Multiple parts of a large program can be saved as separate files and collected
togetherbytheIDE,andtheycanbeworkedonindividuallyandtestedtogether.Anda
good IDE uses color to indicate syntax features that Python understands and can show
somekindsoferrorwhilethecodeisbeingentered.
Tobeginprogrammingitmustbeunderstoodthatalanguagehasasyntaxorstructure,
andthatforcomputerlanguagesthisstructurecannotbevaried.Thecomputerwillalways
bethearbiterofwhatiscorrect,andifanyprogramhasasyntaxerrorinitorproduces
erroneousresults,thenitistheprogramandnotthecomputerthatisatfault.
Next, one should appreciate that the syntax is arbitrary. It was designed by a human
withattitudesand biasesand newideas, andwhile thesyntax mightsometimes beugly
andhardtorecall,itiswhatitis.Partsofitmightnotbeunderstoodatfirst,butaftera
whileandafterreadingandexecutingthefirstprogramsinthisbook,mostofitwillmake
sense.
A program, just like any sentence or paragraph in English, consists of symbols, and
ordermatters.Somesymbolsarespecialcharacterswithadefinedmeaning.Forexample,
“+”usuallymeansadd,and“-”usuallymeanssubtract.Somesymbolsarewords.Words
definedbythelanguage,likeif,while,andtrue,cannotalsobedefinedbyaprogrammer
—they mean what the language says they mean, and are called reserved words. Some
names have a definition given by the system but can be reused by a programmer as
needed.Thesearecalledpredefinednamesorsystemvariables.However,somewordscan
bedefinedbytheprogrammer,andarethenamesforthingstheprogrammerwantstouse
intheprogram:variablesandfunctionsareexamples.
1.3 GUESSANUMBER
Games that involve guessing are common, and are sometimes used to resolve minor
conflictssuchaswhogetsthenextpieceofcakeorwhogetsthefirstkickatafootball.
It’s also sometimes a way to occupy time, and can simply be fun. How can we write a
programtohavetheuserguessanumberthattheprogramhaschosen?
There are many variations on this simple game. In one the number is to be guessed
precisely.Oneperson(thechooser)hasselectedanumber,aninteger,inaspecifiedrange.
“Pickanumberbetweenoneandten”wouldbeatypicalproblem.Theotherperson,the
guesser,must choose anumberinthatrange.Iftheyselectthecorrectnumber,thenthe
guesserwins.Thisisaboringgameandisbiasedinfavorofthechooser.
Amoreinterestingvariationwouldbetostartwithoneguessandhavethechooserthen
saywhetherthetargetnumberisgreaterthanorlessthantheguessednumber.Theguesser
thenguessesagain,andtheprocesscontinuesuntilthenumberisguessedcorrectly.The
rolesofguesserandchoosercannowswitchandthegamestartsagain.Thebestguesseris
theonewhousesthefewestguesses.
Athirdalternativeisto havemultipleguessers.Allguessersmaketheirselectionand
theonewhohaschosenanumbernearestthecorrectnumberisthewinner.Thisisthebest
game for solving disputes, because it involves one guess from each person. Ties are
possible,inwhichcasethegamecanbeplayedagain.
1.4 ROCK-PAPER-SCISSORS
Althoughthisgameisusedbychildrentosettledisputes and makerandomdecisions
suchas “who goes first,” ithas beentaken moreseriously byadults. There areactually
competitions where money is at stake. A televised contest in Las Vegas had a prize of
$50,000.Thisgameisnotastrivialasitoncewas.
Inthisgameeachoftwoplayersselectsoneitemfromthelist[rock,paper,scissors]in
secret,andthenbothdisplaytheirchoicesimultaneously.Ifbothplayersselectedthesame
item,thentheytryagain.Otherwise,rockbeatsscissors,scissorsbeatspaper,andpaper
beatsrock.Thiscontestcanberepeatedfora“bestoutofN”competition.
Bothofthesegamesformthefirstproblemset,andserveas
themotivationforlearningtheelementsofthePythonlanguage.
1.5 SOLVINGTHEGUESSANUMBERPROBLEM
The simple version of the guessing program has two versions, depending on who is
guessing.Thecomputershouldpickthenumberandthehumanusershouldguess,because
theotherwayaroundcanusesomecomplexprogramming.Inthatcasehere’swhathasto
happen:
1.Thecomputerselectsanumber.
2.Thecomputeraskstheplayertoguess.
3.Theplayertypesanumberonthekeyboardandthecomputerreadsitin.
4.Thecomputercomparestheinputnumberagainsttheonethatitselected,andifthe
twoagree,thentheplayerwins.Otherwisethecomputerwins.
ThePythonfeaturesneededtodothisinclude:printingamessage,readinginanumber,
havingaplacetostoreanumber(avariable),havingawaytoselectanumber,andhaving
awaytocomparethetwonumbersandactdifferentlydependingontheresult.
Thesecondversionrequirestheabove,plusawaytorepeattheprocessincaseswhen
theguessiswronganduntilitiscorrect.Inthiscasethemethodbecomes:
1.Thecomputerselectsanumber.
2.Thecomputeraskstheplayertoguess.
3.Theplayertypesanumberonthekeyboardandthecomputerreadsitin.
4.Thecomputercomparestheinputnumberagainsttheonethatitselected,andifthe
twoagree,thentheplayerhasguessedcorrectly.ExittoStep7.
5.Thecomputerdetermineswhethertheguessishigherorlowerthantheactual
numberandprintsanappropriatemessage.
6.RepeatfromStep2.
7.Gameover.
Therepetitionmechanismistheonlynewaspecttothissolution,butitisanessential
componentofPythonandeveryotherprogramminglanguage.
1.6 SOLVINGTHEROCK-PAPER-SCISSORS
PROBLEM
The solution to this problem has no new requirements, but re-enforces the language
featuresoftheprevioussolutions.Onesolutiontothisproblemis:
1.Selectarandomchoiceformthethreeitemsrock,paper,orscissors.Savethis
choiceinavariablenamedchoice
2.Asktheplayerfortheirchoice.Useanintegervalue,where1=rock,2=paper,and
3=scissors
3.Readtheplayer’sselectionintoavariablenamedplayer
4.Ifplayerisequaltochoice
5.Printthemessage“Tie.We’lltryagain.”
6.RepeatfromStep1
7.Ifplayerisequaltorock
8.Ifchoiceisequaltoscissors,gotoStep17
9.ElsegotoStep18
10.Ifplayerisequaltopaper
11.Ifchoiceisequaltoscissors,gotoStep17
12.ElsegotoStep18
13.Ifplayerisequaltoscissors
14.Ifchoiceisequaltorock,gotoStep17
15.ElsegotoStep18
16.Printerrormessageandterminate
17.Print“Computerwins”andterminate
18.Print“Youwin”andterminate
Foreachplayerselection,oneofthealternateitemswillbeatitandonewilllosetoit.
Eachchoiceischeckedandthewin/losedecisionismadebasedontheknownoutcomes.
The solutions to both problems require similar language elements: a way to store a
value(avariable),awaytoexecutespecificpartsoftheprogramdependingonthevalue
ofavariableorexpression(anifstatement),awaytoreadavaluefromthekeyboard,a
waytoprintamessageonthescreen,andawaytoexecutecoderepeatedly(aloop).
1.6.1 VariablesandValues–ExperimentingwiththeGraphicalUser
Interface
Avariableisanamethattheprogrammercandefinetorepresentsomevalue,anumber
oratextstringgenerally.Itrepresentstheplacewherethecomputerstoresthatvalue;itis
a symbol in text form, something humans like, representing a value. Everything that a
computerdoesisultimatelydonewithnumbers,sothelocationofanythingisanumber
thatrepresentstheplaceincomputermemorywherethatthingisstored.It’slikeofficesin
a building. Each office has a number (its address) and usually has a name too (the
occupantorbusinessfoundthere).Additionally,theofficehascontents,andthosecontents
areoftendescribedbythenamegiven.InFigure1.2acollectionofofficesinaspecific
buildingcanbeseen.Inthismetaphortheofficenumbercorrespondstotheaddress,and
thename(variablename),beingmorehumanfriendly,ishowitisoftenreferredtobya
person(programmer).Inallcases,though,itisthecontentsoftheoffice(location)thatare
important.Thenumberandnamearewaystoaccessit.So,someonemightsay“Bringme
thePythonmanualfromtheServerRoom”or“BringmethePythonmanualfrom607”and
bothwouldbethesamething.Thecontentsoflocation607wouldbethePythonmanual.
Nowsomeonecouldsay“PutthisPythonmanualintheDigitalMediaLab,”whichwould
changethecontentsoflocation611.InactualPythontheactofretrievingavaluefroma
locationdoesnotchangethecontentsofthatlocation,butinsteadmakesacopy,butthe
basicmetaphorissound.
Notallstringsorcharacterscanbevariablenames.Avariablecannotbeginwithadigit,
for example, or with most non-alphabetic characters like “&” or “!,” although in some
casesbeginningwith“_”isacceptable.Avariablenamecancontainupper-orlowercase
letters,digits,and“_”.Uppercaseandlowercasearedistinct,sothevariablesHelloand
helloaredifferent.
Soavariablecanchangevaluesbut,unlikearealoffice,asimplevariablecanholdonly
onevalueatatime.Thenamechosendoesnothavetobesignificant.Programsoftenhave
variablesnamediorx.However,itisagoodideatoselectnamesthatrepresentthekind
of value that the variable it to contain so as to communicate that meaning to another
person,aprogrammerprobably.Forexample,thevalue3.1415926shouldbestoredina
variablenamedpi,becausethat’sthenameeveryoneelsegivestothisvalue.
IntheGUItypepi=3.1415926.Pythonrespondswithaprompt,whichindicatesthatit
is content with this statement, and that it has no value to print. If you now type pi,the
responsewillbe3.1415926;thevariablenamedpithatwasjustcreatednowhasavalue.
InthesyntaxofPython,thenamepiisavariable,thenumber3.1415926isaconstant,
but is also an expression, and the symbol = means assign to. In the precise domain of
computer language, pi = 3.1415926 is an assignment statement and gives the variable
namedpithespecifiedvalue.
Continuingwiththisexample,defineanewvariablenamedradiustobe10.0usingan
assignmentstatementradius=10.0.Ifyoutyperadiusand“enter,”Pythonrespondswith
10.0.Finally,weknowthatthecircumferenceofacircleis2prinmathterms,or2times
pi times the radius in English. Type 2*pi*radius into the Python GUI, and it responds
with62.831852, which is the correct answer. Now type circumference = 2*pi*radius,
andPythonassignsthevalueofthecomputationtothevariablecircumference.
Python defines a variable when it is given a value for the first time. The type of the
variableisdefinedatthatmomenttoo;thatis,ifanumberisassignedtoaname,thenthat
nameisexpectedtorepresentanumberfromthenon.Ifastringisassignedtoaname,
thenthatnamewillbeexpectedtobeastringfromthenon.Tryingtouseavariablebefore
ithasbeengivenavalueandatypeisanerror.Attemptingthecalculation:
area=side*side
is not allowed unless there is a variable named side already defined at this point. The
followingisOKbecauseitdefinessidefirst,andtheninturnisusedto
definearea:
side=12.0
area=side*side
Thetwolinesabovearecalledstatementsinaprogramminglanguage,andinPythona
statementusuallyendsattheendoftheline(the“enter”keywaspressed).Thisisabit
unusualinacomputerlanguage,andpeoplewhoalreadyknowJava orC++havesome
difficultywiththisideaatfirst.Inothercomputerlanguagesstatementsareseparatedby
semicolons,notbytheendoftheline.Infact,inmostlanguagestheindentingoflinesin
theprogramdoesnothaveanymeaningexcepttotheprogrammer.InPythonthat’snotthe
caseeither,aswillbeseenshortly.
Theexpressionsweuseinassignmentscanbepretty complicated,but arereallyonly
thingsthatwelearnedinhighschool.Add,subtract,multiply,anddivide.Multiplication
anddivisionareperformedbeforeadditionandsubtraction,whichiscalledaprecedence
rule,so3*2+1is7,not9;otherwise,evaluationisdonelefttoright,so6/3*2is4(dothe
divisionfirst)asopposedto1(ifthemultiplicationwasdonefirst).Thesearerulesthat
shouldbefamiliarbecauseitishowpeoplearetaughttodoarithmetic.Thesymbol“**”
meansexponentortothepowerof,so2**3is23whichis8,andthisoperatorhasahigher
precedence (i.e., is done before) than the others. Parentheses can be used to specify the
order of things. So, for example, (2+3)**2 is 25, because the expression within the
parenthesisisdonefirst,thentheexponent.
1.6.2 ExchangingInformationwiththeComputer
When using most programming languages, it is necessary to design communication
with the computer program. This goes two ways: the program will inform the user of
things,suchasthecircumferenceofacirclegivenaspecificradius,andtheusermaywant
totelltheprogramcertainthings,likethevalueoftheradiuswithwhichtocomputethe
circumference. We communicate with a program using text, which is to say characters
typedintoakeyboard.Whenacomputerispresentingresults,thattextisoftenintheform
ofhuman language, messages as sentences.“The circumference is62.831852” could be
suchamessage.Thesentenceisactuallycomposedbyaprogrammerandhasanumberor
collectionofnumbersembeddedwithinit.
Python allows a programmer to send a message to the screen, and hence to the user,
using a print directive. This is the word print followed by a characterstring, which is
oftenasetofcharactersinquotes.Anexample:
print(“Theanswerisyes.”)
Theparenthesesareusedtoencloseeverythingthatistobeprinted;suchastatement
canprintmanystringsiftheyareseparatedbycommas.Numberswillbeconvertedinto
stringsforprinting.Sothefollowingiscorrect:
print(“Thecircumferenceis“,62.831852)
Ifavariableappearsinthelistfollowingprint,thenthevalueofthatvariablewillbe
printed,notthenameofthevariable.So:
print(“Thecircumferenceis“,circumference)
isalsocorrect.
1.6.3 Example1:DrawaCircleUsingCharacters
Assumingthatitisdesiredtoprintacirclehavingaconstantpredefinedradius,thiscan
bedonewithafewprintstatements.Theplanningofthegraphicitself(thecircle)canbe
doneusinggraphpaper.Assumingthateachcharacterusesthesameamountofspace,a
circlecanbe approximatedusingsome skillfullyplaced“*”characters. Thenprinteach
rowofcharactersusingaprintstatement.Asamplesolutionis:
print(”***“)
print(”*********“)
print(”*************“)
print(”***************“)
print(”***************“)
print(”***************“)
print(”*************“)
print(”*********“)
print(”***“)
1.6.4 Strings,Integers,andRealNumbers
Computerprogramsdealmainlywithnumbers.Integers,orwholenumbers,andrealsor
floating point numbers, which represent fractions, are represented differently, and
arithmetic works differently on the two types of numbers. A Python variable can hold
eithertype,butifavariablecontainsanintegerthenitistreatedasaninteger,andifit’s
holdingafloatingpointnumberthenitistreatedasoneofthose.What’sthedifference?
First,there’sadifferenceinhowtheyareprintedout.Ifwemaketheassignmentvar=1
andthenprintthevalueofvar,itprintssimplyas1.Ifwemaketheassignmentvar=1.0
andthenprintvar,itprintsas1.0.Inbothcasesvarisarealorfloatingpointnumberand
willbetreatedassuch.Numericconstantswillbethoughtofasrealnumbers.However,a
variablecanbefirstonethingandthenanother.Itwillbethelastthingitwasassigned.
Arithmeticdiffersbetweenintegersandreals,buttheonlytimethatdifferenceisreally
apparentiswhendoingdivision.Integersarealwayswhole,non-fractionalnumbers.Ifwe
divide3by2,both3and2areintegersandsothedivisionmustresultinaninteger:the
resultis1.Thisisbecausethereisexactlyasingle2in3,orifyoulike,2goesinto3just
once,witharemainderof1.Thereisaspecificoperatorfordoingintegerdivision:“//.”
So,3//2 isequalto1. Theremainderpartcan’tbe handledandis discarded,butcanbe
foundseparatelyusingthe“%”operator.Forexample,8//5is1,and8%5istheremainder,
3.Thisexplanationisanapproximationtothetruth,andonethatcanbecleareduplater,
butitworksperfectlywellforpositivenumbers.
Ofcoursefractionsworkfineforrealnumbers,andwillbeprintedasdecimalfractions:
8.0/5.0 is 1.6, for example. What happens if we mix reals and integers? In those cases
things get converted into real, but now things get more complicated, because order can
mattera greatdeal.Theexpression7//2*2.0doesthedivision7//2first,whichis3, and
thenmultipliesthatby2.0,yieldingtheresult6.0;theresultof8/3*3.0wouldbe5.333.
Mixing integers and reals is not a good idea, but if done, the expressions should use
parenthesestospecifyhowtheexpressionshouldbeevaluated.
Arealcanbeusedinplaceofanintegerinmostplaces,buttheresultwillbereal.Thus,
2.0* 3 =6.0, not 6,and 6.0//2is 3.0, not3. There aresome exceptions. To convert an
integertoareal,thereisaspecialoperationnamedfloat:float(3)yields3.0.Ofcourseit’s
possibletosimplymultiplyby1.0andtheresultwillbefloattoo.Convertingfloatvalues
tointegersismorecomplicated,becauseofthefractionissue:whathappenstothedigits
totherightofthedecimal?Theoperationintwilltakeafloatingpointvalueandthrow
away the fraction. The value of int(3.5) will be 3, as a result. It is normal in human
calculations to round to the nearest integer, and the operation round(3.5) does that,
resultingin4.
1.6.5 NumberBases
In elementary school, perhaps grade 3 or 4, the idea of positional number systems is
taught. The number 216 is a way to write the value of 6 + 1*10 + 2*100. Not all
civilizations use such a scheme; Roman numerals are not positional,for example. Still,
most people are comfortable with the idea. What people are not as comfortable with is
changingthenumberbaseawayfrom10.InChapter0,thebinarysystem,orbase2,was
discussed,butanybasethatisapowerof2isofsomeinterest,especiallybase8andbase
16.
Humansuseabase10schemeprobablybecausewehave10fingers.Whatitmeansis
thatwehaveasymbolforeachofthe10digits,0through9,andeachdigitpositiontothe
leftofthefirstdigitismultipliedbythenextpowerof10.Thenumber216isreally2*102
+1*101+6*100.Thebaseis10,andeachdigitrepresentsapowerofthebasemultiplied
by a digit. What if the base is 8? In that case 216 is really 2*82 + 1*81 + 6. If the
arithmeticiscarriedout,thisnumberturnsouttobe128+8+6=142.
If multiple number bases are used, it is common to give the base as a subscript. The
number216inbase8iswrittenas2168.Thedefaultwouldbebase10.Inbase8thereare
only8digits,0through7.Thedigits8and9cannotappear.Inbaseslargerthan10more
symbolsareneeded.Acommonbasetobeusedoncomputersis16,orhexadecimal(hex
forshort).Inahexnumber16digitsareneeded,sotheregularonesareusedandthen“A”
represents10,“B”is11,“C”is12,“D”is13,“E”is14,and“F”is15.Thehexnumber
1216is1*16+2,or1810.Thenumber1A16is1*16+10=2610.
In Python numbers are given in decimal (base 10) by default. However, if a number
constantbegins with“0o” (zero followedby the letter“o”) Pythonassumes it isbase 8
(octal).Thenumber0o21,forexample,is218=1710.Anumberthatbeginswith“0x”is
hexadecimal.0x21is2116=3310.Thisappliesonlytointegers.
There is a number base that is the most important, because it lies under all of the
numbersonacomputer.Thatwouldbebase2.Allnumbersonamoderndigitalcomputer
arerepresentedinbase2,orbinary,intheirinternalrepresentation.Abinarynumberhas
onlytwodigits,0and1,andeachrepresentsapowerof2.Thus,11012is1*23+1*22+
0*21+1=8+4+1=1310.InPythonabinarynumberbeginswith“0b,”sothenumber
0b10101represents2110.
These number bases are important for many reasons, but base 2 is fundamental, and
bases8and16areimportantbecausetheyarepowersof2andsoconvertveryeasilyto
binarybuthavefewerdigits.Oneexampleoftheuseofhexisforcolors.InPythonthey
can represent a color, and on web pages they are certainly used that way. The number
0xFF0000isthecolorred,forexample,ifusedonawebpage.Butmoreofthatlater.
1.6.6 Example2:ComputetheCircumferenceofanyCircle
When humans send information into a computer program, the text tends to be in the
formofnumbers.ThePythoncodethatwaswrittentocalculatetheradiusofacircleonly
didthecalculationforasingleradius:10.That’snotasusefulasaprogramthatcomputes
thecircumferenceofanycircle,andthatwouldmeanallowingtheusertotelltheprogram
what radius to use. This should be easy to do, because it is something that is needed
frequently. Frequently needed things should always be easy. In the case of sending a
number into a program in Python, the word input can be used within a program. For
example:
radius=input()
willacceptanumberfromthekeyboard,typedbytheuser,andwillreturnitasastringof
characters.Thismakessensebecausetheusertypeditasastringofcharacters,butitcan’t
beusedinacalculationinthisform.Toconvertitintotheinternalformofanumber,we
mustspecificallyaskforthistobedone:
radius=input()
radius=float(radius)
willreadastringintoradius,thenconvertitintoafloatingpoint(real)numberandassign
ittothevariableradiusagain.Thiscanbedoneallinonestatement:
radius=float(input())
Now the variable radius can be used to calculate a circumference. This is a whole
computerprogramthatdoesausefulthing.Ifthevalueofradiuswastobeaninteger,the
codewouldread:
radius=int(input())
If the conversion to a number is not done, then Python will
giveanerrormessagewhenthecalculationisperformed,like:
Traceback(mostrecentcalllast):
File“<pyshell#13>”,line1,in<module>
circumference=2*pi*radius
TypeError:can'tmultiplysequencebynon-intof
type'float'
Thisisprettyuninformativetoabeginningprogrammer.WhatisaTraceback?What’s
pyshell?Therearecluesastowhatthismeans,though.Thelineofcodeatwhichtheerror
occursisgivenandthetermTypeErrorisdescriptive.Thiserrormeansthatsomethingthat
can’tbemultiplied(astring)wasusedinanexpressioninvolvingamultiplication.That
thing is the variable radius in this instance because it was a text string and was not
convertedtoanumber.
Alsonotethatint(input())canpresentproblemswhentheinputstringisnotinfactan
integer. If it is a floating point number, this results in an error. The expression
int(“3.14159”) could be interpreted as an attempt convert pi into an integer, and would
havethevalue3;infact,itisanerror.Thefunctionintwaspassedastringandthestring
containedafloat,notanint.ThisissomethingofaquirkofPython.Itisbettertoconvert
inputnumbersintofloats.
1.6.7 GuessaNumberAgain
The simple version of the guessing program can now nearly be written in Python.
Examiningthemethodofsolution,here’swhatcanbecodedsofar;versionsdependon
whoisguessing.Thecomputershouldpickthenumberandthehumanusershouldguess,
becausetheotherwayaroundcaninvolvesomecomplexprogramming.Inthatcasehere’s
whathastohappen:
1.Thecomputerselectsanumber.
choice=7
2.Thecomputeraskstheplayertoguess.
print(“Pleaseguessanumberbetween1and10:“)
3.Theplayertypesanumberonthekeyboardandthecomputerreadsitin.
playerchoice=input()
4.Thecomputercomparestheinputnumberagainsttheonethatitselected,
andifthetwoagree,thentheplayerwins.Otherwisethecomputerwins.
Itisthe finalstepthat isstillnot possiblewithwhat isknown.It isnecessary in this
program,asit isin mostcomputer programs,to makeadecision andto executecertain
code(i.e.,dospecificthings)conditionallybasedontheoutcomeofthatdecision.People
dothatsortofthingallofthetimeinreallife.Examplesinclude:
“Ifthelightisredthenstop,otherwisecontinuethroughtheintersection.”
“Ifalltellersarebusywhenyouarriveatthebank,thenstandinlineandwaitfor
thenextonetobecomeavailable.”
“Ifyouneedbreadormilk,thenstopatthegrocerystoreonthewayhome.”
“Ifitrains,thepicnicwillbecancelled.”
Notice that all of these examples use the word “if.” This word indicates a standard
conditionalsentenceinEnglish.Theconditioninthefirstcaseisthephrase“ifthelightis
red” (called in English the protaxis or antecedent) and the consequence to that is the
phrase“thenstop”(theapodosisorconsequent).Terminologyaside,theintentisclearto
an English speaker: on the condition that or in the event that the light is red, then the
necessary action is that the driver is to stop their car. The action is conditional on the
antecedent, which in Python will be called an expression or, more precisely, a logical
expression,whichhasthevalueTrueorFalse.
ThestructureorsyntaxofthissortofthinginPythonwouldbe:
ifthelightisred:
stop
ormoreexactly:
iflight==red:
#executewhatevercodemakesthecarstop
Thisiscalledanifstatement,andismoreprofoundwithamorecomplexsyntaxthan
canbeinferredfromthisexample.
1.7 IFSTATEMENTS
In Python an if statement begins with the word if, followed by an expression that
evaluatestoTrueorFalse,followedby acolon (:), then a series of statements that are
executed if the expression is true. The names True and False are constants having the
obvious meaning, and a variable that can take on these values is a logical or Boolean
(namedafterthemanwhoinventedtwostateorlogicalalgebra)variable.Theexpression
istheonlytrickypart.ItcanbeaconstantlikeTrue,oravariablethathasaTrueorFalse
value,orarelationalexpression(onethatcomparestwothings)oralogicalcombination
ofanyofthese—anythingthathasaresultthatistrueorfalse.
ifTrue: #Constant
ifflag: #Logicalvariable
ifa<b: #relationalexpression
ifa<bandc>d: #logicalcombination
Alogicalexpressioncanbeanyarithmeticexpressionsbeingcomparedusinganyofthe
followingoperators:
<Lessthan
>Greaterthan
<=Lessthanorequalto
>=Greaterthanorequalto
==Equalto
!=Notequalto
Logicalcombinationscanbe:
andEG:a==bandb==c
orEG:a==bora==c
notEG:not(a==b)#sameas!=
Thesyntaxissimpleandyetallowsahugenumberofcombinations.Forexample:
ifp==qandnotp==zandnotz==p:
ifpi**2<12:
if(a**b)**(c-d)/3<=z**3:
Theconsequent,ortheactionstobetakenifthelogicalexpressionistrue,followsthe
colon on the following lines. The next statement is indented more than the if, and all
statements that follow immediately that have the same indentation are a part of the
consequentandareexecutedifthecondition istrue, otherwisenone ofthemare. Asan
example,consider:
ifa<b:
a=a+1
b=b–1
c=a–b
Inthiscasethetwostatementsfollowingthe“:”areindentedby4morespacesthanis
theif.ThistellsPythonthattheyarebothapartoftheifstatement,andthatifthevalueof
aissmallerthanthevalueofb,then bothof thosestatements willbe executed.Python
callssuchagroupofstatementsasuite.Theassignmenttothevariablecisindentedtothe
samelevelastheif,soitwillbeexecutedinanycaseandisnotconditional.
The use of indentation to connect statements into groups is unusual in programming
languages. Most languages in use pretty much ignore spaces and line breaks altogether,
anduseastatementseparatorsuchasasemicolontodemarkstatements.So,intheJava
languagetheabovecodewouldlooklikethis:
if(a<b){
a=a+1;
b=b–1;
}
c=a–b;
Thebraces{…}enclosethesuite,whichwouldprobablybecalledablockinJavaor
C++. Notice that this code is also indented, but in Java this means nothing to the
computer.Indentationisusedforclarity,sothatsomeonereadingthecodelatercansee
moreclearlywhatishappening.
SemicolonsareusedinPythontoo,butmuchmorerarely.Ifitisdesiredtoplacemore
thanone statement on a singleline, then semicolons can be used to separate them. The
Pythonifstatementunderconsiderationherecouldbewrittenas:
ifa<b:
a=a+1;
b=b-1
c=a-b
Thisishardertocomprehendquicklyandisthereforelessdesirable.Therearetoomany
symbolsallgroupedtogether.Aprogramthatiseasytoreadisalsoeasiertomodifyand
maintain.Codeiswrittenforcomputerstoexecute,butitisalsoforhumanstoread.
There are some special assignment operators that can be used for incrementing and
decrementingvariables.Intheabovecodethestatementa=a+1couldbewrittenasa+=
1,andb=b–1canbewrittenasb-= 1.Thereisnorealadvantagetodoingthis,but
other languages permit it so Python adopted it too. There is another syntax that can be
used to simplify certain code in languages like Java and C, and that is the increment
operator“++”andthedecrementoperator“—.”Pythondoesnothavethese.However,an
effectofthewaythatPythondealswithvariablesandexpressionsisthat“++x”islegal;so
is“++++x.”Thevalueissimplyx.Theexpression“x++”isnotcorrect.
1.7.1 Else
An if statement is a two-way or binary decision. If the expression is true, then the
indicatedstatementsareexecuted.Ifitisnottrue,thenitispossibletoexecuteadistinct
setofstatements.Thisisneededforthepickanumberprogram.Inonecasethecomputer
wins,andintheotherthehumanwins.Anelseclauseiswhatwillallowthis.
Theelseisnotreallyastatementonitsown,becauseithastobeprecededbyanif,so
it’spartoftheifstatement.Itmarksthepartofthestatementthatisexecutedonlywhen
theconditionintheifstatementisfalse.Itconsistsofthewordelsefollowedbyacolon,
followedbyasuite(sequenceofindentedstatements).Soatrivialexampleis:
ifTrue:
print(“Theconditionwastrue”)
else:
print(“theconditionwasfalse”)
Theelseasaclauseisnotrequiredtoaccomplishanyspecificprogramminggoals,and
itcanbeimplementedusinganotherif.Thecode:
ifa<b:
print(“a<b”)
else:
print(“a>=b”)
couldalsobewrittenas:
ifa<b:
print(“a<b”)
ifnot(a<b):
print(“a>=b”)
Theelseisexpressive,efficient, and syntacticallyconvenient. It is expressive because it
representsawaythathumansactuallycommunicate.Thewordelsemeansprettymuchthe
samethinginPythonasitdoesinEnglish.Itisefficientbecauseitavoidsevaluatingthe
same expression twice, which costs something in terms of execution speed. And it is
syntactically convenient because it expresses an important element of the language in
fewersymbolsthanwhentwoifsareused.
ThefinalPythoncodeforthesimplesolutionoftheguessanumberprogramcannow
bewritten.Itis:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
ifchoice==playerchoice:
print(“Youwin!”)
else:
print(“Sorry,Youlose.”)
1.8 DOCUMENTATION
Therearesomeproblemswiththisprogram,butisdoeswork.Alargeproblemisthatit
alwayschoosesthesamenumbereverytimeitisexecuted(thatnumberbeing7).Thiswill
befixedlateron.Alesscriticalproblemisthatitisundocumented;thatis,thereareno
instructionstoaplayerconcerninghowtousetheprogramandthereisnodescriptionof
howtheprogramworksthatanotherprogrammermightuseifmodifyingthiscode.This
canbefixedbyprovidinginternalandexternaldocumentation.
Externaldocumentationislikeamanualfortheuser.Mostprogramshavesuchathing,
andeventhoughthisprogramisquitesimple,somedegreeof
documentation can be provided. In fact, it is brief enough that it could be printed
whenevertheprogramstartstorun.Forexample:
print(“Pick-a-numberisasimpleguessinggame.The”)
print(“computerwillselectanumberbetween1and10”).
print(“andyouareexpectedtoguesswhatitis.”)
print(“Whentheprogramdisplays'Pleaseguess”)
print(“anumberbetween1and10:'youtypein”)
print(“yourguessfollowedbythe<enter>key.Your“)
print(“guessmustbeanintegerintherange1to10.”)
print(“Thecomputerwilltellyouifyouwinorlose.)
For many more sophisticated programs, such as PowerPoint for example, the
documentationismanypagesandformsasmallbook.Itwouldbedistributedasabooklet
alongwiththesoftwareorprovidedasawebsite.
Internaldocumentationisintendedforprogrammerswhohaveaccesstothesourcecode
oftheprogram.Itcantaketheformofwrittendocumentstoo,butiscommonlyasetof
commentsthatappearsalongwiththecodeitself.High-levellanguageslikePythonallow
theprogrammertoaddhumanlanguagetexttothecodethatwillbecompletelyignoredby
the computer but that can be read by anyone looking at the code. These comments
describetheactionoftheprogram,themeaningofthevariables,detailsofcomputational
methodsused,andmanyotheritemsofinterest.
InPythonacommentbeginswiththecharacter“#”andendsattheendoftheline.
There are no rules for what can appear typed in a comment, but there are some
guidelines developed through years of programming practice. A comment should not
simplyrepeatwhatappearsinthecode;acommentshouldshedsomelightonanaspectof
theprogramthatmightnotbecleartoeveryonelookingatit,anditshouldbewrittenin
plain language. As an example, here is the guess-a-number program with comments
included:
#Thisprogramselectsanumberbetween1and
#10andallowsauser(player)toguesswhat
#itis.
choice=7#Thenumberselectedbythecomputer
#Prompttheuser,indicatingwhatisexpected
print(“Pleaseguessanumberbetween1and10:“)
#Readtheplayer'sinputfromthekeyboard
playerchoice=int(input())#convertfromstring
#Printtheoutcomeofthegame.
ifchoice==playerchoice:#Istheplayer'sguess
print(“Youwin!”)#correct?Playerwins!
else:#Otherwisethecomputerwins
print(“Sorry,Youlose.”)
All programsshould be documented, not after the factbut as they are being written.
Why?Becauserelativelyfewprogramsarewrittenallinonesitting.Thecommentsinthe
codeserve as reminders tothe programmer about what the variables representand why
particular code segments read the way they do. It also indicates the current state of
thinking about the design of the code. When the program is looked at again at the
beginningofanewworking(orschool)day,thecommentscanbeessentialinresuming
thework.
There is also something called a docstring that seems to do the same things as a
comment,butcoversmultiplelinesandisnotreallyacomment.Adocstringbeginsand
endswithatriplequote:
print(“Thiscodewillexecute”)
”””
print(“Thiscodeiswithinadocstring”)
”””
Adocstringisactuallyastring,notacomment,butbehaveslikeacommentandcanbe
usedinthatway.Itcanbeespeciallyusefulfortemporarilycommentingoutsmallsections
ofcodewhiletryingtofindoutwhereerrorsare.Therearealsoprogramsthatwillcollect
thedocstringsintoaseparatedocumentthatcanbeusedasadescriptionoftheprogram.
Forthat reasontheir intended useis to allowthe programmerto explain thepurpose of
certainsectionsofcode.
1.9 ROCK-PAPER-SCISSORSAGAIN
With what is now known about Python, it is time to look at the rock-paper-scissors
problem and see if it can be coded. It takes more steps, but it is really no more
complicatedthantheguess-a-numberprogram.Thecodeisthesame.
1.Selectachoicefromthethreeitemsrock,paper,orscissors.Savethischoiceina
variablenamedchoice.
Arepresentationforthethreeitemswaswhenthesolutionwasfirst
described,whereeachchoicewasaninteger.However,inputreadsstrings,soit
shouldbepossibletoavoidtheconversiontonumbersandusethestringsdirectly.
choice=“paper”#Computerchoosespaper.
2.Asktheplayerfortheirchoice.
Printaspromptmessage.
print(“Rock-paper-scissors:typeinyourchoice:“)
3.Readtheplayer’sselectionintoavariablenamedplayer.
Useinputaswedidbefore,butthistimereadastringandkeepitthatway.The
playermusttypeoneof“rock,”“paper,”or“scissors,”orelseanerrorwillbe
reported.
player=input()
4.Ifplayerisequaltochoice:
5.Printthemessage“Tie.We’lltryagain.”
Stringscanbecomparedagainsteachotherforequality,sothisstepisquitesimple:
ifplayer==choice:
print(“Gameisatie.Pleasetryagain.”)
6.Ifplayerisequaltorock
7.Ifchoiceisequaltoscissors,gotoStep17
Thewillbeno“gotoStep17,”butthatstepsimplysaysthattheplayerwins.Just
printthatmessagehere.
ifplayer==“rock”:
ifchoice==“scissors”:
print(“Congratulations.Youwin.”)
else:
print(“Sorry-computerwins.”)
8.Ifplayerisequaltopaper
9.Ifchoiceisequaltoscissors,gotoStep17
ifplayer==“paper”:
ifchoice==“scissors”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
10.Ifplayerisequaltoscissors
11.Ifchoiceisequaltorock,gotoStep17
ifplayer==“scissors”:
ifchoice==“rock”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
Thiscodeillustratesanewconcept,ifnotanewlanguagefeature.Ithasifstatements
thatarenestedonewithintheother.Again,it’snotnecessarytodothisbecausenon-nested
statementscanimplementthesamedecision.Forexample:
NestedIFs Non-nestedIFs
ifplayer==“scissors”: ifplayer==“scissorsand
choice==“rock”
ifchoice==“rock”: print(“Computerwins”)
print(“Computerwins.”) ifplayer==“scissors”and
choice!=“rock”
else: print(“Youwin”)
print(“Youwin.”)
Nested if statements seem more expressive, and communicate the flow of the program
bettertoahumanprogrammerthandoesthenon-nestedcode.
ThereisanotherPythonlanguageelementthatcanbeusedhere.Lookingatthecode,
there is no indication when the user makes an error. For example, if the user enters
“ROCK”(i.e.,uppercase),thenitwillnotmatchanyofthechoices,andtheprogramwill
notindicatethis.Infact,itwon’tprintanythingatall.Whatisreallywantedisasequence
ofif-else-if-elsestatementssuchas:
ifplayer==“scissors”:
ifchoice==“rock”:
else:
ifplayer==“rock”:
ifchoice==paper:
else:
ifplayer==“scissors”:
##andsoon…
Pythonhasaspecialfeaturethatimplementsthisnestingofifandelse:theelif.Theelif
constructcombinesanelseandanif,andthisreducestheamountofindentingthathasto
bedone.Thefollowingcodesnippetsdothesamething:
ifa<b:ifa<b:
print(“a<b”)print(“a<b”)
elifa>b:else:
print(“a>b”)if(a>b):
else:print(“a>b”)
print(“a=b”)else:
else:print(“a=b”)
If too many nested if-else statements exist, then the indenting gets to be too much,
whereas the elif allows the same indent level and has the same meaning. In some
programs this is essential, and in general is easy to read. Using the elif statement the
programfortherock-paper-scissorsproblemlookslikethis:
choice=“paper”#Computerchoosespaper.
print(“Rock-paper-scissors:typeinyourchoice:“)
player=input()
ifplayer==choice:
print(“Gameisatie.Pleasetryagain.”)
ifplayer==“rock”:
ifchoice==“scissors”:
print(“Congratulations.Youwin.”)
else:
print(“Sorry-computerwins.”)
elifplayer==“paper”:
ifchoice==“scissors”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
elifplayer==“scissors”:
ifchoice==“rock”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
else:
print(“Error:Selectoneof:rock,paper,scissors”)
Nowallofthepossibleoutcomesarehandledbythecode.
1.10 TYPESAREDYNAMIC(ADVANCED)
ToprogrammerswhoonlyprogramusingPython,itwouldseemoddthataparticular
variablecouldhaveonly one typeandthatit wouldhavetobe initiallydefinedtohave
that type, but it is true. In Python the type associated with a variable can change. For
example,considerthestatements:
x=10#Xisaninteger
x=x*0.1#Xisfloatingpointnow
x=(x*10==10)#XisBoolean
Somefindthisperfectlylogical,andothersfinditconfusing.Thefactisthatsolongasthe
variableisusedaccordingtoitscurrenttype,allwillbewell.
ItisalsotruethatevenapparentlysimplePythontypescanbequitecomplexintermsof
their implementation. The point is that the programmer rarely needs to know about the
underlying details of types like integers. In many programming languages an integer is
simplyaoneortwo-wordnumber,andthelanguagesbuildoperationslike“+”fromthe
instructionsetofthecomputer.If,forexample,aone-wordintegerAisaddedtoanother
oneB,itcanbedoneusingasinglecomputerinstructionlikeADDA,B.Thisisveryfast
atexecutiontime.
Python was designed to be convenient for the programmer, not fast. An integer is
actuallyacomplexobjectthathasattributesandoperations.Thiswillbecomecleareras
morePython examples arewritten andunderstood, butas asimple casethink aboutthe
way that C++ represents an integer. It is a 32-bit (4 byte) memory location, which is a
fixedsizespaceinmemory.Thelargestnumberthatcanbestoredthereis232-1.Isthat
trueinPython?
Here’s a program that will answer that question, although it uses more advanced
features:
foriinrange(0,65):
print(i,2**i)
Evenanespeciallylongintegerwouldbelessthan65bits.Thefactisthatthisprogram
runssuccessfully,andevenratherquickly.IntegersinPythonhaveanarbitrarilylargesize.
So calculating 264 * 264 is possible and results in
340282366920938463463374607431768211456. This is very handy indeed from a
programmer’sperspective.
Thetypeofavariablecanbedeterminedbytheprogrammerastheprogramexecutes.
Thefunctiontype()willreturnthetypeofitsparameterasastring,andcanbeprintedor
tested.So,thecode:
z=1
print(type(z))
z=1.0
print(type(z))
willresultin:
<class'int'>
<class'float'>
Ifoneneededtoknowifzwasafloatataparticularmoment,then:
iftype(z)isfloat:
woulddothetrick.Type(z)doesnotreturnastring,itreturnsatype.Theprint()function
recognizesthatandprintsastring,justasitdoesforTrueandFalse.So:
iftype(z)==“<class'float'>”:
wouldbeincorrect.
In future chapters this malleability of types will be further described, and practical
methodsfortakingadvantageofitinPythonwillbeexamined.
1.11 SUMMARY
Acomputerisatoolforrapidlyandaccuratelymanipulatingnumbers.Itcanperform
tediousrepetitivetasksaccuratelyandquickly,butmustbetoldwhattodoandfollowsits
instructions very literally. A computer program is a set of instructions for performing a
task using a computer, and Python is one language that can be used for this purpose.
Pythonallowsaprogrammertodefinevariablesbysimplyusingthem, and associatesa
typewithavariablebasedonwhatitisgiven.Anifstatementallowspartsofaprogramto
be executed when a certain condition becomes true, and it can have an else part that is
executedwhentheconditionisfalse.Ifstatementscanbenested,andsometimestheelif
structureisagoodwaytoexpressasetofnestedconditionalcode.
Inthischapterthemainexamplesweretwoprograms,oneofwhichallowedauserto
guessanumber,whiletheotherwasthewell-knowngameofrock-paper-scissors.
Exercises
Inthefollowingexercisessomeof the expressionsmayresultin anerror.If so,explain
whytheerroroccurs.WhenaskedtowritecodeitshouldbePython3code.
1.Evaluatethefollowingexpressions:
a)3*3/2
b)3*3//2
c)3*3%2
d)(3*3)%2
e)3**3/3
f)(3+2)-(2-4)
g)(3+2)/(2-4)
2.Ifthestatements:
x=3
y=9
z=“2.4”
havebeenexecutedthenevaluatethefollowingexpressions.Ifanerroroccurs,state
why:
a)x/y
b)x//y
d)x%y
e)y/x*z
f)float(x)/float(z)
g)float(x)//float(z)
h)int(x)//int(z)
3.Giventhevariabledefinitionspresented,evaluatethefollowingexpressionsasbeing
TrueorFalse.
x=12
y=14
a)x>3
b)x>=12
c)x<y
d)x<yandy>14
e)x<yory>14
f)not(x==y)
g)not(x<y)andnot(y>14)
4.Whatwillbeprintedbythefollowingstatements?
a)print(int(“23”))
b)if3**2+4**2==5**2:
print(“345”)
elif3**2<4**2:
print(“34”)
else:
print(“5”)
c)if“toast”<“jam”:
print(“toast”)
else:
print(“jam”)
d)if“12”<“5”:
print(“12”)
else:
print(“5”)
e)a=12.3
b=100
c=0
ifa<b:a=a+1;b=b-1
c=c–b
print(a)
print(c)
f)a=100
b=200
c=300
ab=a<b
cd=(c==a+b)
ifabandcd:
print(“ABandCD”)
elifab:
print(“AB”)
else:
print(“Nope”)
5.TheUnitedStatesmeasurestemperatureinFahrenheitdegrees,whereasCanadauses
Celsius.Acompanyisdevelopinganapptoconvertbetweenthetwoforpeople
wantingtoskiinBanfforWhistler.TheformulatoconvertfromCelsiusdegreesCto
FahrenheitdegreesFis:
F=C*9/5+32
Writeaprogramthatwillbethebasisofthisapp:itwillreadatemperatureinCelsius,
convertittoFahrenheit,andprinttheresult.
6.Thenumericalvaluesofcoinshavebeenarrangedsothatthegreedyalgorithmwill
resultinthesmallestnumberofcoinswhenmakingchange.Thismeansthatthe
largestvaluedcoinistriedfirst,andasmanyofthosecoinsareusedaspossible.Then
thenextsmallerdenominationcoinisused,andsoonuntilthepenniesaredealtout.
Sofor84centsinchange,a50-centpiececouldbeused(leaving34cents),thena25-
centpiece(leaving9cents),a5centpiece(leaving4cents),and4pennies.Ifno50-
centpiecewasavailable,then25-centpieceswouldbeusedinitsplace:3quarters,
followedbyanickelandfourpennies.Writeaprogramthatreadsanumberbetween1
and99thatisanamountofchangetobegivenandprintsthecoinvaluesthatwouldbe
used.
7.Threefloatingpointvariablesa,b,andchavebeenreadinfromtheconsole.Writea
setofifstatementsthatprintstheseindescendingorder.
8.Ifthevalueof1.0/7.0isprinted,therearemanynumberstotherightofthedecimal
place.DeviseawaytoprintonlythreeplacesandwritesomePythoncodetotestthe
idea.
9.Calculateanapproximationtopi.ThereisaninfiniteseriescalledtheGregory-Leibniz
seriesthatsumstopi.Ofcourseitcanneverreachtheexactvaluebecausethereisno
suchthing,butitcancomputeasmanydigitsasaredesired.Theseriesis:
Π=4/1–4/3+4/5–4/7+4/9–4/11….
Writeaprogramthatcalculatestheresultofthefirst15termsofthisseries.Howmany
digitsofpiarecorrect?Addsixmoreterms.Howmanydigitsarecorrectnow?
10.AnotherseriesthatcancalculatepiistheNilakanthaseries.Itisalittlemore
complicatedtocalculate,butgetsclosetopimuchfasterthandoestheGregory-
LeibnizseriesofExercise9.TheNilakanthaseriesis:
Π=3+4/(2*3*4)–4/(4*5*6)+4/(6*7*8)–4/(8*9*10)…
Calculatethefirst15termsofthisseries.Howmanydigitsofpiarecorrect?
NotesandOtherResources
ManyteachingresourcesforPythonexist,bothinprintandontheInternet.
Hereisthedevelopmentenvironmentusedtotestthecodeforthisbook.
PyCharm.https://www.jetbrains.com/pycharm/
1.DavidBeazleyandBrianK.Jones.PythonCookbook,3rdEdition:Recipesfor
MasteringPython3,http://www.onlineprogrammingbooks.com/python-cookbook-
third-edition/
2.CodyJackson.LearningtoProgramUsingPython,
http://www.onlineprogrammingbooks.com/learning-program-using-python/
3.BradMiller.ProblemSolvingwithAlgorithmsandDataStructuresUsingPython,
http://www.onlineprogrammingbooks.com/problem-solving-with-algorithms-and-data-
structures/
4.HarryPercival.Test-DrivenDevelopmentwithPython,
http://www.onlineprogrammingbooks.com/test-driven-development-with-python/
5.LennartRegebro.PortingtoPython3:AnIn-DepthGuide,
http://www.onlineprogrammingbooks.com/porting-to-python-3-an-in-depth-guide/
6.ZedA.Shaw.LearnPythontheHardWay,http://learnpythonthehardway.org/book/
CHAPTER2
REPETITION
2.1TheWHILEStatement
2.2Rock-Paper-ScissorsYetAgain
2.3CountingLoops
2.4PrimeorNon-Prime
2.5LoopsThatareNested
2.6DrawaHistogram
2.7LoopsinGeneral
2.8ExceptionsandErrors
2.9Summary
Inthischapter
Oneofthethingsthatmakescomputersattractivetohumansistheirabilitytodotedious,
repetitivetasksaccurately and at high speedwithout gettingbored. Itis somethingthey
were designed to do. Humans have to do things repeatedly, and not all of them can be
doneforusbycomputers.Brushingourteeth,drivingtowork,cleaningthecarpet—allare
repeatedactions,andmanywouldbecalledchores.Inprogrammingtermssomemightbe
referredtoasloops.
Considerafactoryjobonanassemblyline.AccordingtoHenryFord,oneofthepeople
principallyconnectedwithdevisingtheassemblylineconcept,itismoreefficienttohave
eachworkerdoonejobwellandrepeatitmanytimesadaythantoteachworkershowto
buildentirethings,inhiscaseautomobiles.Eachworkerdoesonerelativelyshortjob,and
then the piece they are working on goes to the next station where the next person does
theirrelativelyshortjob.Onesuchjobcouldbetheinstallationoftheelectronicignition
modulebracket:
1.Acquireabracketandplaceoverattachmentholeswithwideendbelowthe
smallerend.
2.Placeatwo-inchboltintheupperleftboltholeandscrewintotwopoundsof
torque.
3.Placeafour-inchboltintheupperrightboltholeandscrewintotwopoundsof
torque.
4.Placeatwo-inchboltinthelowerleftboltholeandscrewintotwopoundsof
torque.
5.Placeaten-millimeternutovertheboltatthelowerrightandtightentotenpounds.
6.Re-tightentheboltstotenpoundsinthefollowingorder:upperleft,
upperright,lowerleft.
BeforeStep1aboveanewworkpiece(anengine,probably)isplacedinfrontofthe
worker, and after Step 6 the piece is moved to the next station. So from the worker’s
perspective,solongasorwhilethereisanengineattheirstationthatneedsabracket,they
repeat the steps. In a form that a computer might be able to understand this might be
writtenas:
whilethereisanengineattheirstationthatneedsabracket:
Acquireabracketandplaceoverattachmentholeswithwideendbelowthesmaller
end.
Place a two-inch bolt in the upper left bolt hole and screw in to two pounds of
torque.
Place a four-inch bolt in the upper right bolt hole and screw in to two pounds of
torque.
Place a two-inch bolt in the lower left bolt hole and screw in to two pounds of
torque.
Placeaten-millimeternutovertheboltatthelowerrightandtightentotenpounds.
Re-tighten the bolts to ten pounds in the following order: upper left, upper right,
lowerleft.
Alloftheactionsthatfollowthewhileareindentedtoindicatethattheyareapartof
theactivitiestoberepeated,justaswasdoneinaPythonifstatementtomarkthethings
thatweretobedoneiftheconditionwastrue.Thisexample
illustratesoneofthePythonrepetitionstructuresquiteaccurately:thewhilestatement.
2.1 THEWHILESTATEMENT
Whenusingthisrepetitionstatement,theconditionistestedatthetoporbeginningof
the loop. If upon that initial test the condition is true, then the body of the loop is
executed;otherwiseitisnot,andthestatementfollowingtheloopisexecuted.Thismeans
thatitispossiblethatthecodeintheloopisnotexecutedatall.Theconditiontestedisthe
samekindofexpressionthatisevaluatedinanifstatement:onethatevaluatestoTrueor
False.Itcouldbe,andoftenis,acomparisonbetweentwonumericorstringvalues,asitis
intheexampleofFigure2.1.
Whenthecodeinthebodyofthewhilestatementhasbeenexecuted,thenthecondition
istestedagain.Ifitisstilltrue,thenthebodyoftheloopisexecutedagain,otherwisethe
loopisexitedandthestatementfollowingtheloopisexecuted.Thereisanimplicationin
this description that the body of the loop must change something that is used in the
evaluationoftheloopcondition,otherwisetheconditionwillalwaysbethesameandthe
loopwillneverterminate.So,hereisanexampleofaloopthatisenteredandterminates:
a=0
b=0
whilea<10:
a=a+1
print(a)
Theconditiona<10istrueattheoutsetbecauseahasthevalue0,sothecodeinthe
loopisexecuted.Thelonestatementinthisloopincrementsa,sothatafterthefirsttime
theloopisexecuted,thevalueofais1.Nowtheconditionistestedand,again,a<10so
theloopexecutesagain.Inthefinaliterationoftheloop,thevalueofastartsoutas9,is
incrementedandbecomes10.Whentheconditionistesteditfails,becauseaisnolonger
lessthan10(itisequal)andsotheloopends.Thestatementfollowingtheloopisprint
(a)andthevalueprintedis10.Thisloopexplicitlymodifiesoneofthevariablesinthe
loopcondition,anditiseasytoseethattheloopwillendandwhatthevalueofawillbe
atthattime.
Hereisanexampleofaloopthatisenteredanddoesnotterminate:
a=0
b=0
whileb<10:
a=a+1
print(a)
Inthiscasethevalueofbislessthan10attheoutset,sotheloopisentered.Thebody
oftheloopincrementsaas before,butdoes notchangeb. The loop condition does not
dependona,onlyonb,sowhentheloopconditionistestedagainthevalueofbisstill0,
andtheloopexecutesagain.Thevalueofbwillalwaysbe0eachtimeitistested,sothe
loopconditionwillalwaysbetrueandtheloopwillneverend.Theprintstatementwill
neverbeexecuted.
When this program is executed, the computer will seem to become unresponsive. As
longastheloopisexecutingtheprogramcandonothingelse,andsotheonlyindication
that something is wrong is that nothing is happening. There are many reasons why a
programcanappeartobedoingnothing:whenwaitingfortheusertotypesomeinput,for
instance, or when performing an especially difficult calculation. However, in this case,
whichiscalledaninfiniteloop,theonlythingtodoistoterminatetheprogramandfixthe
loop.
Hereisanexampleofaloopthatisnotentered:
a=100
b=0
whilea<10:
a=a+1
print(a)
Theconditiona<10isfalseattheoutsetbecauseahasthevalue100,sothecodeinthe
loopisnotexecuted.Thestatementfollowingtheloopisexecutednext,whichistheprint
statement,andthevalueprintedis100.
Theseloopsaremerelyexamplesthatillustratethethreepossibilitiesforawhileloop
and do not calculate anything useful. The two examples from the previous chapter can
makepracticaluseofawhileloop,anditwouldbeusefultolookatthoseagain.
2.1.1 TheGuess-A-NumberProgramYetAgain
TheprogramasitwaswritteninChapter1is:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
ifchoice==playerchoice:
print(“Youwin!”)
else:
print(“Sorry,Youlose.”)
Thegamewouldbebetterifitallowedtheplayertoguessagain,perhapsuntilacorrect
guesswasachieved.Awhileloopcouldbeusedtoaccomplishthis.Thinkaboutwhatthe
condition might be. The loop should end when the player guesses the answer. Another
waytosaythisisthattheloopshouldcontinuesolongastheplayerhasnotguessedthe
answer. The condition is one for continuation of the loop, not termination, so the loop
mustbeconstructedinsuchawaythatitcontinueswhentheconditionistrue.Theloop
willbeginwiththis:
whilechoice!=playerchoice:
At the beginning of the loop the variables choice and playerchoice must be defined.
This means that before the while statement, there must be code that does this. The
programnowlookslikethis:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
whilechoice!=playerchoice:
If the player has guessed incorrectly, then the body of the loop will execute. What
shouldbedone?Oneofthevariablesintheconditionhastobechanged,firstofall,and
thegoaloftheprogrammustbekeptinmind.Inthiscase,becausetheplayerhasguessed
incorrectly,twothingsshouldhappen.First,theplayermustbetoldthattheyarewrong
and to make another guess. Next, the new guess must be read into the variable
playerchoice,thussatisfyingtherulethattheloopconditionmusthaveanopportunityto
becomeFalse.Theprogramisnow:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
whilechoice!=playerchoice:
print(“Sorry,notcorrect.Guessagain:“)
playerchoice=int(input())
When the player finally guesses the number the loop will exit; if the first guess is
correctthentheconditionfailsatthebeginning,andthisamountstothesamethinginthis
case.Thelastthingtodoistoprintamessagetotheplayer:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
whilechoice!=playerchoice:
print(“Sorry,notcorrect.Guessagain:“)
playerchoice=int(input())
print(“Youhaveguessedcorrectly.”)
Note that, as was true with the if statement and as is always true in Python, the
indentation indicates which statements are a part of the loop (the suite) and which are
outside.
2.1.2 ModifyingtheGame
Asimplemodificationofthegameinvolvestellingtheplayerwhethertheirguesswas
toolargeortoo small.This will helpthem shrinkthe possiblerange of valuesand thus
guess the right answer more quickly. A modification to the body of the loop will
accomplish this. If the value that the player guessed is smaller than the target, then a
messagetothateffectisprinted,andsimilarlyiftheplayerguessesavaluelargerthanthe
target.Theuseofanifstatementherewouldbeappropriate,andthatifstatementwould
benestedinsideofthewhileloop:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
whilechoice!=playerchoice:
if(playerchoice<choice):
print(“Sorry,yourguesswastoosmall.
Guessagain:“)
else:
print(“Sorry,yourguesswastoolarge.
Guessagain.”)
playerchoice=int(input())
print(“Youhaveguessedcorrectly.”)
This program illustrates a second level of indentation. The if-else are indented to
indicatetheyarepartofthewhilestatement.Theprintstatementsareindentedfurther,to
showthattheyarealsopartoftheifstatement.
Doing some printing inside of the loop is useful because an infinite loop will be
obvious.It willprint awhole lotof stuffand never stop.It’snotalways practicalto do
that,soadegreeofcarefulanalysisshouldalwaysbedonetoensurethattheloopcanand
willterminate.
2.2 ROCK-PAPER-SCISSORSYETAGAIN
Thisgamereallyneedsaloop,andthepreviousimplementationwasnotcomplete.If
there is a tie then the game has to be repeated, and a winner must be determined. This
meansthattheloopinthiscasewillbesomethinglike:
whilethereisnowinner:
Thishappensonlywhentheplayerandthecomputerselectthesameobject,andinthe
originalcodewashandledbythestatements:
ifplayer==choice:
print(“Gameisatie.Pleasetryagain.”)
Thecondition“nowinner”becomesplayer==choice.Thecompletesolutioninvolves
the while loop and another input from the user within the loop. Here is one possible
answer:
choice=“paper”#Computerchoosespaper.
print(“Rock-paper-scissors:typeinyourchoice:“)
player=input()
#–––Thenewsectionofcode–––––––
whileplayer==choice:#Repeatinputuntilthereisa
winner
print(“Gameisatie.Pleasetryagain.”)
player=input()
#––––––––––––––––––-
ifplayer==“rock”:
ifchoice==“scissors”:
print(“Congratulations.Youwin.”)
else:
print(“Sorry-computerwins.”)
elifplayer==“paper”:
ifchoice==“scissors”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
elifplayer==“scissors”:
ifchoice==“rock”:
print(“Sorry-computerwins.”)
else:
print(“Congratulations.Youwin.”)
else:
print(“Error:Selectoneof:rock,paper,scissors”)
Theterminationoftheloopdependsontheuser’sinputandthevalueofthecomputer’s
choice,whichcouldalso(andshould)changeinsidetheloop.Theprobabilityoftheloop
continuingafteroneiterationis1in3,andtheprobabilitythatitwillstillbeloopingafter
Niterationsis(1/3)N,sothereisaverysmallchanceofthelooprepeatingmorethan2or
3times.
2.2.1 RandomNumbers
Most games depend on an element of unpredictability or chance. Those that do not
might be more properly called puzzles (or sports—football and hockey ought to have a
certain degree of skill involved). Given that computers do calculations, and that
calculationsshouldhavethesameresulteverytime,howdoesoneproduceanythingthat
israndomusingacomputer?Theanswerispartlyinhowthetermrandomisdefined.The
discussion involves some mathematics or at least some basic ideas in probability and
statistics.
If integers in the range 1 through 10 inclusive are considered, what is the likelihood
(chance,probability)thatthenumber5willbeselectedatrandom?Theansweris1in10,
or 0.1. This is true each time that question is asked. So if the number 5 has just been
chosenandanothernumberistobechosen,whatisthechancethatitwillbea5?Same
answer:1in10.Theprincipleisthatthenextchoicedoesnotdependonthepreviousone;
it’sapartofwhatmakesthemrandom.
Perhapsthewrongquestionwasbeingasked.So,whatisthelikelihoodthatthenumber
5 will be selected twice in a row at random? The answer is 1 in 100, or 0.01. Why?
Becauseitdependsonthequestionasked.Togettwoinarow,thefirstonemustbea5(1
in10)andthesecondonemustalsobea5(also1in10)sotheresultinglikelihoodis1in
10*10or1in100.Buteachtimeanumberischosen,thenumber5hasa1in10chanceof
beingselected.Amathematicaldiscussionofrandomnessdependsontheaskingtheright
question,andonprobabilities.Ifsomeeventiscompletelyrandom,thenitshouldhavethe
sameprobabilityofhappeningastheotherpossibleevents,buteventscanbecollectedto
formmore complex events. Each cardin a deck of playing cardsshould have the same
probabilityofturningup,butifthequestionis“What’sthechanceofaflush,”thenthe
differentwaysthataflushcanbecomprisedhavetobetakenintoaccount.Itcanbeavery
complicatedandinterestingsubject.
Numbers,inparticular,arerandomonlywithrespecttoeachother.Isthenumber“6”
random?That’snotreallyagoodquestion.Isthesequence87394random?Perhapsatest
couldbedevisedtoanswerthat.Isthesequence66666random?Mostwouldsaynot,but
ithasthesameprobabilityofbeinggeneratedatrandomasdoes87354.Tocreategood
gamesandsimulations,itisnecessarytodevisewaystogeneratearandomnumberusing
a computer, and to test numbers to see if they are in fact random. Then it would be
possibletosimulatetheflippingofacoin,ortherollingofadie.
What is the 100th digit of pi? It can be found easily. Are consecutive digits of pi
effectively random? As it happens, the answer is not known, but it is a good question.
Whatis108763dividedby98581?Whatistheremainder?Calltheremainderx:whatis
108763dividedbyx?Arethesenumbersrandom?Thesearchforamethodforgenerating
really good random numbers continues, but there are some pretty good methods (See:
Chapter 10). In Python a random number created by a computer algorithm can be
requestedbyusingabuilt-infunction.
A built-in function is like a mathematical function, and is provided by the language
itself (and so is “built-in”). The language element print that has been used so much is
really a built-in function. So are int() and float(). The functions sine and square root
shouldbewellunderstoodbyanyonewhohasgraduatedfromhighschool,andarealso
built-infunctions.SuchfunctionsbelongtomodulesinPythonandhavetobespecifically
requested by the program so that they can be used. This means that the name of the
module has to be known as well as the names of the built-in functions within it. The
commonmathematicalfunctionsarelocatedwithinthemodulemathandcanbeusedby
requestingthemathmodulewiththestatement:
importmath
Using a function in the math module involves using the name math followed by a
period(“.”)followedbythenameofthefunction.The“.”opensthemodulesothatthe
nameswithincanbeused,becausetheremaybeotherbuilt-infunctionsorevenvariables
thathavethesamename.So,ifthestatements:
x=math.sqrt(64)
print(x)
are executed, the program will print the number 8, which is the square root of 64. The
expressionsqrt(64)iscalledafunctioncall,andexecutesthecodeneededtocalculatethe
squarerootof64.Thenamesqrtisthenameofthefunction,whichiscodeprovidedby
the Python language. This particular call will always return the value 8, because 8 is
alwaysthesquarerootof64.Itisverymuchlikethefunctionsthatarestudiedingrade
school mathematics, such as sine and cosine. A module can be thought of as a bag of
programs. Each bag contains a set of programs that do a particular class of things, like
mathematics or drawing. By specifying the name of the module, access to all of the
functionswithinis granted,and byspecifying thespecific nameof afunction, thecode
thatwewantisspecificallymadeavailable.
Bytheway,theimportstatementshouldbeattheverybeginningoftheprogram.
Nowimaginethatitispossibletohaveafunctionthatproducesarandomnumberasa
value.Itisinthemodulenamedrandom,andthefunctioncouldbecalledrandomtoo.
Forexample:
importrandom
print(random.random())
Everytime(wellalmosteverytime,becauseitisrandom,afterall)thefunctionisused
itwillgiveadifferentvalue,arandomvalue.Thisvaluecanbeusedtomakegamesmore
realistic,becausegameshavearandomaspect.
This code prints the value 0.07229650795715237. Why? Because random.random()
producesarandomnumberbetween0.0and1.0.Thisisthemostcommonexampleofa
randomnumberfunction,anditisreallyverygeneral.Increasingtherangeisdonesimply
by multiplying by the maximum value desired; random.random()*100 gives a random
numberbetween0and100,forinstance.
Whatiftheproblemistosimulatetherollofadie?Thebagofcodethatistherandom
modulecontainsotherfunctionsrelatedtothegenerationofrandomnumbers,andoneof
themisespeciallysuitedtothisproblem.Adierollwouldbeimplementedas:
random.randint(1,6)
Therandint function accepts two numbers, called parameters. The first is the lower
limit of the range of random integers to be produced and the second is the upper limit.
Specifying1asthelowerlimitand6astheupper,asintheexampleabove,meansthatit
willgeneratenumbersbetween1and6inclusive,whichiswhatwouldbeexpectedfrom
rollingadie.Theresultofrollingtwodicewouldbeanumberbetween2and12,foundby
random.randint(2,12).
Flipping a coin is a two-level choice, and could be done with random.randint(1,2).
Morecompletely:
if(random.randint(1,2)==1):
print(“Heads”)
else:
print(“Tails”)
Going back again to the number guessing game, a random choice for the computer’s
numberisnowpossible.Insteadofthefirstlineofcodebeing:
choice=7
itshouldnowbe:
choice=random.randint(1,10)
Every time the program executes, the program will select a new random number, as
opposedtothechoicealwaysbeing7.
The introduction of a random choice is a little more complicated for the rock-paper-
scissors program because the variable holding the player’s choice is a string. There are
threepossiblechoices,sotoselectoneatrandommightlooklikethis:
i=random.randint(1,3)
ifi==1:
choice=“rock”
elifi==2:
choice=“paper”
else:
choice=“scissors”
Many of the examples that will be developed will involvea game or puzzle of some
kind,sotheuseofrandomnumberswillbeaconsistentfeatureofthecodeshown.
2.3 COUNTINGLOOPS
Featuresofprogramminglanguagesareprovidedbecausethedesignersknowtheywill
beuseful.Thewhileloopisobviouslyuseful,andisinfacttheonlykindofloopthatis
required in order to implement any program. However, loops that involve counting a
certainnumberofiterationsareprettycommon,andaddingsyntaxforthiskindofthing
wouldbecertaintobevaluableforaprogrammer.Theideaisthatsometimesaloopthat
executes,forexample,tentimesoraloopthatiteratedNtimesforsomevariableNwould
bewanted.InPythonandmanyotherlanguages,thisiscalledaforloop.
Insomelanguagesaforloopinvolvesaspecialsyntax,butinPythonitinvolvesanew
type(aclassoftypes,really):atuple.Hereisanexampleofaforloop:
foriin(1,2,3,4,5):
printi
Thiswillprintthenumbers12345eachonaseparateline.Thevariableitakeson
eachofthevaluesinthecollectionprovidedinparentheses,andtheloopexecutesoncefor
eachvalue of i. Thecollection (1,2,3,4,5) iscalled a tuple, and can contain any Python
objectsinanyorder.It’sbasicallyjustasetofobjects.Thefollowingarelegaltuples:
(3,6,9,12)
(2.1,3.5,9.1,0,12)
(“green”,“yellow”,“red”)
(“red”,3,4.5,2,“blue”,i)#whereiisavariablewitha
value
Theforloophastheloopcontrolvariable(inthecaseaboveitisi)takeoneachofthe
valuesinthetuple,inlefttorightorder,andthenexecutestheconnectedsuite.Theloop
willthereforeexecutethesamethenumberoftimesasthereareelementsinthetuple.
Sometimesitmaybenecessarytohavetheloopexecuteagreatmanytimes.Iftheloop
wastoexecuteamilliontimes,itwouldbemorethanawkwardtorequireaprogramtolist
a million integers in a tuple. Python provides a function to make this more convenient:
range().Itreturnsatuplethatconsistsofalloftheintegersbetweenthetwoparametersit
isgiven,includingthelowerendpoint.So:
range(1,10)is(1,2,3,4,5,6,7,8,9)
range(-1,2)is(-1,0,1)
range(-1,-3)isnotaproperrange.
range(1,1000000)ifthesetofallintegersfrom1to
9999999
Ranges involving strings are not allowed, although tuples having strings in them are
certainlyallowed.Theoriginalexampleforloopcannowbewritten:
foriinrange(1,6):
printi
andtheloopthatistoexecuteamilliontimescouldbespecifiedas:
foriin(0,1000000):
printi
This would print the integers from 0 to 999999. If range() is passed only a single
argument,thentherangeisassumedtostartat0;thismeansthatrange(0,10)andrange
(10)arethesame.
2.4 PRIMEORNON-PRIME
Here’sagamethatcanillustratetheuseofaforloop,andsomeotherideasaswell.The
computerpresentstheplayerwithlargenumbers,oneatatime.Theplayerhastoguess
whethereachnumberisprimeornon-prime.Aprimenumberdoesnothaveanydivisors
except 1 and itself; 3, 5, 11, and 17 are prime numbers. The game ends either when a
specificnumberofguesseshavebeenmade,orwhentheplayermakesaspecificnumber
ofmistakes.
A key problem to solve in this game is to determine when a number is prime. The
computermust beable to determinewhether theplayer is correct,and so forany given
numbertheremustbeawaytofigureoutwhetheritisprime.
Otherwise,theprogramforthisgameisnotverycomplicated:
whilegameisnotover:
selectarandomintegerk
printkandasktheplayerifitisprime
readtheplayerꞌsanswer
ifplayerꞌsansweriscorrect:
print“Youareright”
else:
print“Youarewrong.”
The mysterious portion of this program is the if statement that asks if the player’s
answeriscorrect.Thisreallymeansthattheprogrammustdeterminewhetherornotthe
number K is prime and then see if the player agrees. How can it be determined that a
numberisprime?Aprimenumberhasnodivisors,soifonecanbefoundthenthenumber
isnotprime.Themodulooperator%canbeusedtotellifadivisionhasaremainder:ifk
%n=0thenthenumberndividesevenlyintok,andkisnotprime.
Sotofindoutwhetheranumberisprime,trydividingitbyallnumberssmallerthanit,
andifanyofthemhaveazeroremainderthenthenumberisnotprime.Thisisajobfora
forloop.Here’safirstdraft:
isprime=True
forninrange(1,K):
ifk%n==0:
isprime=False
Aftertheloophascompleted,thevariableisprimeindicateswhetherKisprimeornot.
Thisseemsprettysimple,iftedious.Itdoesalotofdivisions.Toomany,infact,becauseit
isnotpossibleforanynumberlargerthanK/2todivideevenlyintoK.Soaslightlybetter
programwouldbe:
isprime=True#IsthenumberKprime?
forninrange(1,int(k/2))#DivideKbyallnumbers<K/2
ifk%n==0:#Iftheremainderis0thenn
isprime=False#dividesevenlyintoK:not
#prime
#Ifisprimeisstilltrueherethenthenumberisprime.
Next,thissectionofprogramshouldbeincorporatedintoacompleteprogramthatplays
thegame.Ifthegameissupposedtoallow10guesses,thenthefirststepistorepeatthe
wholething10times:
importrandom
correct=0#Thenumberofcorrectguesses
foriterationinrange(0,10):#10guesses
Now select a number at random. It should be large enough so that it is hard to see
immediatelyifitisprime,althoughevennumbersareagiveaway:
K=random.randint(10000,1000000)#Generateanewnumber
Nextprintamessagetotheuseraskingfortheirguess,andreadit:
print(“PrimeorNot:Isthenumber“,K,”prime?(yesor
no)”)
answer=input()#Readtheuserꞌschoice
Theusertypesinastring,“yes”or“no,”astheirresponse.Thevariableisprimethat
was used in the program that determines whether K is prime is logical, being True or
False.Itcouldbemadeintoastringtoosothatitwasthesameaswhattheusertyped,and
thenitcouldbecompareddirectlyagainsttheuser’sinput:
isprime=“yes”
Nowcomesthecodefordeterminingprimalityascodedabove,exceptwithisprimeas
astring:
isprime=True#IsthenumberKprime?
forninrange(1,int(k/2))#DivideKbyallnumbers<K/2
ifk%n==0:#Iftheremainderis0thenn
isprime=“no”#dividesevenlyintoK:notprime
#Ifisprimeisstilltrueherethenthenumberisprime.
Atthispointthevariableisprimeiseither“yes”or“no,”dependingonwhetherKis
actually prime. The user’s guess is also “yes” or “no.” If they are equal then the user
guessedcorrectly.
ifisprime==answer:
print(“Youarecorrect!”)
correct=correct+1
else:
print(“Youareincorrect.”)
Finally, the outer loop is ended and the result is printed. The value of the variable
correctisthenumberofcorrectguessestheusermade,becauseitwasincrementedevery
timeacorrectanswerwasdetected.Thelaststatementis:
print(“Yougave“,correct,“rightanswersoutof10.”)
ThisprogramcanbefoundontheCDinthedirectory“primegame.”
2.4.1 ExitingfromaLoop
Acleverprogrammerwouldnoticeaprettyseriousinefficiencywiththeprimenumber
program.Whenithasbeendeterminedthatthenumberisnotprime,theloopcontinuesto
divide more numbers into k until k/2 of them have been tried. If k= 999992 then it is
knownafterthefirstiterationthatthenumberifnotprime;itiseven,soitcan’tbeprime.
But the program continues to try nearly another half million numbers anyway. What is
neededisawaytotelltheprogramthattheloopisover.Thereisawaytodothis.
Aloopcanbeexitedusingthebreakstatement.Itissimplythewordbreakbyitself.
Thecorrectwaytousethisintheprogramabovewouldbe:
forninrange(1,int(k/2))#DivideKbyallnumbers<K/2
ifk%n==0:#Iftheremainderis0thenn
isprime=“no”#dividesevenlyintoK:notprime
break
This loop terminates when the number k is known to be not prime. The statement
followingtheloopwillbeexecutednext.Thiscansavealotofcomputercycles,butdoes
notmaketheprogrammorecorrect—justfaster.
Avariationonthisisthecontinuestatement.Thiswillresultinthenextiterationofthe
loop being started without executing any more statements in the current iteration. This
avoids doing a lot of work in a loop after it is known it’s not necessary. For example,
doing some task for a bunch of names except for people named “Smith” could use a
continuestatement:
fornamein(ꞌJonesꞌ,ꞌSmithꞌ,ꞌPetersꞌ,ꞌSinatraꞌ,ꞌBohrꞌ,
ꞌConradꞌ):
print(name);
ifname==ꞌSmithꞌ:
continue
#Nowdoabunchofstuff…
Boththebreakandcontinuedothesamethinginbothwhileandforloops.
Modifying the loop variable will not change the number of iterations the loop will
execute.Infact,ithasnoeffect.Thisloopdemonstratesthat:
foriinrange(0,10):
print(“Before“,i)
i=i+1000
print(“After“,i)
Itprints:
Before0
After1000
Before1
After1001
...
andsoon.Itseemsthatthevalueofichangesaftertheassignmentfortheremainderof
the loop and then is set to what it should be for the next iteration. This makes sense if
Pythonistreatingtherangeasasetofelements(itis),anditassignsthenextonetoiat
thebeginningofeachiteration.Unlikeawhileloop,thereisnotatestforcontinuation.In
anycase,changingiheredoesnotalterthenumberofiterationsandcan’tbeusedinplace
ofabreak.
2.4.2 Else
Theideathattheloopcanbeexitedexplicitlymakesthenormalterminationoftheloop
something that should be detectable too. When a while or for loop exits normally by
exhaustingtheiterationsorhavingtheexpressionbecomeFalse,itissaidtohavefallen
through.Whentheforloopintheprimenumberprogramdetectsafactor,itnowexecutes
abreakstatement,thusexitingtheloop.
What if it never does that? In that case no factor exists, and the number isprime. The
programasitstandshasaflagthatindicatesthis,butitcouldbedonewithanelseclause
ontheloop.
Theelsepartofawhileorforloopisexecutedonlyiftheloopfallsthrough;thatis,
whenitisnotexitedthroughabreak.Thiscanbequiteuseful,especiallywhentheloopis
involvedinasearch,aswillbediscussedlater.Inthecaseoftheprimenumberprogram,
anelsecouldbeusedwhenthenumberisinfactprime,asfollows:
forninrange(1,int(k/2))#DivideKbyallnumbers<K/2
ifk%n==0:#Iftheremainderis0thenn
isprime=“no”#dividesevenlyintoK:not
#prime
break
else:
isprime=“yes”#Loopnotexited:itisprime
Anelseinawhileloopoccurswhentheconditionbecomesfalse.Consideraloopthat
readsfrominputuntiltheusertypes“end”andissearchingforthename“Smith”:
inp=input()
while(inp!=“Smith”):
s=input()
ifs==“end”:
break
else:
print(“Smithwasfound”)
#Whentheprogramreachesthispointitisno
#longerknownwhetherSmithwasfound.
Ofcourse,theelseisnotrequired,andsomeprogrammersbelieveitisevenharmful.
Therearealwaysotherwaystoaccomplishthesamething.
2.5 LOOPSTHATARENESTED
Just as it is possible to have if statements nested within other if statements, it is
possible,andevenlikely,tohavealoopnestedwithinanotherloop.Anexampleofnested
forloopswouldbe:
foriinrange(0,10)
forjinrange(0,10)
print(i,j)
Theprintinthisexamplewouldexecute100times.Eachtimetheouterloopexecutes
oncetheinneroneisexecuted10times,foratotalof10*10or100iterations.Loopscan
be nested to a greater depth if necessary, and while and for loops can be nested
interchangeably.Isiteverusefultodothis?Yes,veryoften.
Sincetherehasbeenarecentdiscussionofprimenumbersandfactoring,considerthe
problemoffindthenumberwithinagivenrangethathasthegreatestnumberofdifferent
factors.Leavingout1andthenumberitself,2hasnofactors,nordoes3;4hasone(=2),5
hasnone,and6hastwo(2and3).Whichnumberbetween0and1000hasthemost?
Fromtheprimenumbergameitisclearthatthefactorscanbefoundusingaloop.Ifthe
loopisnotexitedwhenoneisfound,allofthemcanbeidentifiedand,moreimportantly
forthisproblem,counted.Foragivennumberk,thefactors can beidentifiedusing the
followingloop:
count=0;
forninrange(1,int(k/2)):#DivideKbyallnumbers<K/2
ifk%n==0:#Iftheremainderis0thenn
count=count+1
#Thenumberkhascountnumbersthatdivideevenlyintoit.
Thestatementcount=count+1hasreplacedtheisprime=“no”statementfromthe
prime number game. When the loop ends the value of count will be the number of
divisors it has. If this number is 0, then the number k is prime, by the way. Okay, the
problemhasbeensolvedforanynumberk.Nowsolveitforallnumbersbetween1and
1000andidentifythenumberwiththelargestvalueofcount(i.e.,thelargestnumberof
divisors).Thisinvolvesanotherloopenclosingthisonethatcountsfrom1to1000.
Defineavariablemaxvwhichisatanygivenmomentthenumberthathasthegreatest
numberofdivisors,andanothervariablemaxcountwhichisthenumberofdivisorsthat
maxvhas.Initiallymaxvis1andmaxcountis0(i.e.,thenumber1hasnodivisors).Now
loop between 1 and 1000 and replace maxv and maxcount whenever a new number is
foundforwhichthenumberofdivisorsisgreaterthanmaxcount.Specifically:
maxv=1
maxcount=0
forkinrange(1,1000):#Countthedivisorsforarange
count=0;
forninrange(2,int(k/2)):#DivideKbyallnumbers
#<K/2
ifk%n==0:#Iftheremainderis0
#thenn
count=count+1#Countthisdivisor
ifcount>maxcount:#Anewmaximum
maxcount=count#Savethecount
maxv=k#andthevalueitself
print(“Themostdivisorsis“,maxv,”with“,maxcount)
Theresultfor1to1000is:
Themostdivisorsis840with30
Theresultfor1to10000:
Themostdivisorsis7560with62
Bytheway,thislastversionneeded10secondstoexecute.
2.6 DRAWAHISTOGRAM
A histogram is a kind of graph. It usually represents the frequency of occurrence of
certaindiscretevalues.Commonexamplesincludetemperatureasafunctionofmonthof
theyear,orhistogramsofincomeasafunctionofyear,age,race,orgender.Drawingone
involvesknowing howmany categoriesthere areand whatthe numericalvalues arefor
each category. Then the numbers are scaled so they fit in a particular area, and the
rectanglesaredrawnsothattheheightsreflectthe relative numerical values. Figure 2.3
showssometypicalexamples.
A company wishes to plota histogram oftheir income for each quarterof 2016. The
numerical values are stored in variables Q1,Q2,Q3, and Q4 and range between 0 and 1
million.Usingfancygraphicsisnotpossibleyet,sohowcansimplehistogramsbedrawn?
Byusingtext.Ifthehistogramisdrawnsothatthebarsarehorizontalinsteadofvertical,
thenthenumberofcharactersdrawninarowcanbeusedtorepresentthe“height”ofthe
histogrambar.Usingthe“#”character,avalueof20couldbedrawnas:
Q1:####################20
Thisisanothersituationwherealoopisnecessary.
Therearethreepartstothehistogrambarabove:thelabel,thebar,andthedatavalue.
Thelabeliseasytoprint,andintheexampletherearefourpossibilities;thesearesimply
printedatthebeginningofeachlinebeingdrawn.Thedatavalueisnotnecessary,butitis
useful for people looking at the graph to know what the exact number is. Each “#”
character drawn could represent a range of values. The histogram bar is the trick. If
numbersuptoamillionmustberepresented,thenthebarmustbescaledsothatitfitsona
line.If50charactersfitonaline,theneach“#”printedneedstorepresent1000000/50,or
20000dollars.Anotherwaytosaythisisthatevery$20000ofincomeresultsinone“#”
character being printed. How many “#” are printed for the first quarter? Q1/20000 of
them.
Aproblem:theprintfunctionprintsoutalineeverytimeitiscalled.Howcanmultiple
thingsbeprintedonaline?Happily,printhasaspecialparametertoallowthat.Thecall:
print(i,end=ꞌ!′)
willprintthevariableiandthenprintthe“!”stringfollowingthat,everytime.Normally
theprintwillplaceanendoflinecharacter(representedas“\n”)attheendofeveryline,
buttheend= clause allows the programmer to change this to whatever they like. If the
string provided is empty (contains no characters) then the print will not print anything
extra after each call, meaning specifically that no end of line will be printed. Thus, the
statement:
print(“#”,end=””)
printsone“#”characterbutnoendofline.Ifanother“#”isprinted,thenitwillcomeright
aftertheonejustprinted.Thisisexactlywhatisneededforthehistogramprogram.Aloop
thatwouldprintten“#”charactersononelinecannowbewrittenas:
foriinrange(0,10):
print(“#”,end=””)
Given that the value of the variable Q1 is between 0 and 1000000 and each 20000
shouldresultinasingle“#”characterbeingprinted,thefirstquarterhistogrambarcould
bedrawnbythefollowing:
print(“Q1:“,end=””)
forkinrange(0,int(Q1/20000)):
print(ꞌ#ꞌ,end=ꞌꞌ)
print(”“,q1)
Thisincludesallofthelabels,andtheoutputwouldlooklikethis:
Q1:########190000
A complete solution to the problem would draw the histogram for all four quarters
alongwithaheadingforthegraph.Theoutputmightlooklikethis:
EarningsforWidgetCorpfor2016
Dollarsforeachquarter
==============================
Q1:#########190000
Q2:#################340000
Q3:###########################################873000
Q4:#####################439833
Exercise5attheendofthechapterinvolvesfinishingthisprogram.
2.7 LoopsinGeneral
Theconceptofaloopinaprogramminglanguagehasbeendiscussedformanyyears
andhasalargedegreeofboththeoryandpracticeunderlyingit.Theoriginal“loop”wasa
branchorgoto,wherethetopoftheloopwasidentifiedwithanaddressorlabelandatthe
bottom there was a statement that said to “go to” or transfer control to that location.
Examplesofthisare:
label1:add1tox12x=x+1
subtract2fromminmin=min-2
branchtolabel1goto12
Branchesweretypicalofassemblylanguageprogramming,whereeachlineofcodewas
one actual computer instruction. The goto statement was introduced in the first real
programminglanguageFORTRAN,butwas quicklysupplementedbyamorestructured
loopconstruct,thedostatement.Bothbranchandgotostatementscanbeconditional.
Variouskindsofloophavebeendevelopedovertheyears,andthemostcommonlyused
variationisthewhileloop.Theorysaysthattheonlykindthatisneeded,andprobablythe
most general, is the loop statement as defined in the Ada language. It is essentially an
infinite loop that allows escapes at multiple and various pointson specified conditions.
Thebasicsyntaxis:
loop
exitwhencondition1;
Statements…
…
exitwhencondition2;
endloop;
Anexitatthetopoftheloopisawhileloop.Anexitattheendcouldbearepeat…
until as in Pascal or C++, and it is a simple matter to declare and initialize a control
variableandtesttheconditiontoimplementaforloop.Everythingispossiblewiththis
loopsyntax.
When specifically using Python, a while loop is all that is needed. If the range is an
integerone,thentheloop:
foriinrange(a..b):
isthesameastheloop:
i=a
whilei<b:
…
i=i+1
This loop has an initialization, a condition, and an increment. As individual entities
thesearesomewhathiddeninPython,beingmaskedbythesyntax, but theloopcontrol
variabletakesonthefirstvaluethefirsttimetheloopisexecuted(initialization),iterates
throughtheselections(increment),andterminatesafteritselectsthefinalone(condition).
The loop control variable is not really what gets incremented; what is incremented is a
countthatindicateswhichoftheitemsinthetupleiscurrentlybeingused.Intheloop:
foriin(“red”,“yellow”,“green”):
thevariableitakesonthevalues“red,”“yellow,”and“green,”butwhatgetsincremented
eachtimethroughtheloopisanindicationofwhichpositioninthetupleisrepresentedby
i.Thevalue“red”is0,“yellow”is1,and“green”is2andacountimplicitlystartsat0and
stepsuntil2assigningvaluestoi.Thiskindofloopissimilartothatfoundinthelanguage
PHP,andisalevelofabstractionabovethoseinJavaandC++.
2.8 EXCEPTIONSANDERRORS
Computersdonot,asageneralrule,makemistakes.Likeotherhuman-designedand-
constructeddevicessuchascarsandstoves,computerscanbeawkwardtouse,canhave
designfeaturesthatdon’tturnoutasexpected,andcanevenbreakdowntooquickly.But
theydonotmakemistakes.Acomputerprogram,ontheotherhand,almostcertainlyhas
mistakesorbugscodedwithinit.Consumersdon’tusuallymakeadistinctionbetweenthe
computerandthesoftwarethatrunsonit,butprogrammersandengineersmust.Whena
computer program does not work properly, a programmer must exhaust all ways the
programcouldbewrongbeforelookingatanerrorinthecomputeritself.
Creatingacorrectprogramisdifficultformanyreasons.Firstofall,beforeanycodeis
written,theproblemtobesolvedmustbeclearlyunderstood,anditmustbethecorrect
problem.Solvingthe wrongproblem isa prettycommon error,but can’tbe detectedor
correctedbythecomputer.Commonexamplesofthissortoferrorcomefromstatingthe
problem in English (or a human language of any description) where errors in
understanding occur. “Find the average of the first ten integers,” for example, is a little
ambiguous. Is the first integer 0 or 1? What is meant by “average”—the mean or the
median? Computer programmers tend to be quite literal, and so what they think is the
answer will be written into the code, and then they will argue for that answer as being
correct.Itisveryimportanttorealizethat,whatevertheliterallycorrectansweris,thereal
correctanswerisbasedonthecorrectunderstandingoftheproblem.Sometimesitisstated
badly, but no matter whose fault the problem is, the job of fixing it will lie with the
programmer. Sometimes a little time at the beginning clarifying the question can save
moretimelater,andstickingwithanoverlypedanticinterpretationwillcauseproblemsin
thelongrun.
Acorrectprogramalsodependsontheprogrammerbeingabletoidentifyallpossible
circumstances that can occur and knowing how to deal with each of them. Failing to
handleonepossiblesituationisanerror,andtheprogramwillbehaveunpredictablyifthat
situation occurs in practice. Statements that handle errors appear all though real (in the
fieldorcommercial)code.Infact,itiscommonthattherearemorestatementsthatdetect
anddealwitherrorsthancodethatactuallycomputesananswer.Onethingthatshouldbe
remembered:alllinesofthecodeneedtobetested.Inverylargeprogramsthismaybe
impossible,buteverylineofcodethathasneverbeenexecutedisapotentialerror.Testas
manyaspossible,includingtheerrordetectioncode.
User input is a frequent cause of mistakes in programs. It’s not that the user is the
problem; the programmer must anticipate all possible ways that a user can enter data.
Thereisusuallyonecorrectwaybutmanyerroneousones,anditisimpossibletopredict
whatauserwillenterfromakeyboardinresponsetoanyrequest.Similarly,thecontents
of a file may not be what the programmer expects. File formats are standards, but
sometimes there are variations and at other times a user may have entered the data
improperly.Whilethemistakeisonthepartoftheuser,itisalsoaprogrammingmistake
if the error is not detected and is allowed to have an impact on the execution of the
program.
Programmerstendtomakeassumptionsabouttheproblem.Itisacommonmistaketo
think“thissituationcanneverhappen”andthenignoreit,howeverunlikelythesituation
seems. Testing every statement for everything that could possibly go wrong may be
impossible,buttestingforthegeneralsituationmaybepossible.Itwouldbegreattobe
abletosay“ifanystatementinthissectionofcodedividesbyzero,”or“ifanyvariables
inthiscodehavethewrongtype,”thendosomeparticularthing.
Sinceitisimpossibletowriteaprogramofanylengthwithouttherebeingcodingerrors
of some kind included, a step towards a solution may be to check all data before it is
operated on to ensure the pending operation is going to succeed. For instance, before
performingthedivisiona/b,testtomakesurethatbisnotzero.Thisdependsontheerror
being at least in principle predictable. Most modern languages, Python included, have
implemented a way to catch errors and permit the programmer to handle them without
havingtestsbeforeeachstatementorexpression.Thisfacilityiscalledtheexception.
The word exception communicates a way to think about how errors will be handled.
Somecodeislegalandcalculatesadesiredvalueexceptundercertaincircumstances,or
unless some particular thing happens. The way it works is that the program tries to
perform some operation and errors are allowed to occur. If one does, the computer
hardwareoroperatingsystemdetectsitandtellsPython.Theprogramcannotcontinuein
thewaythatwasplanned,whichiswhythisiscalledanexception.Theprogrammercan
tellPython whatto do if specific errorsoccur bywriting some codethat dealswith the
problem.Iftheprogrammerdidnotdothis,thenthedefaultisforPythontoprintanerror
messagethatdescribestheerrorandthenstopexecutingtheprogram.Errormessagescan
beseenasafailureonthepartoftheprogrammertohandleerrorscorrectly.
A simple exampleis the divide byzero error mentioned previously.If the expression
a/bistobeevaluated,thevalueofbcanbecheckedtomakesureitisnotzerobeforethe
divisionisdone:
ifb!=0:
c=a/b
Thiscanbetediousfortheprogrammerifalotofcalculationsarebeingdone,andcan
beerrorprone.Theprogrammermayforgettotestoneortwoexpressions,especiallyif
engagedinmodificationsortesting.Usingexceptionsisamatterofallowingtheerrorto
happenandlettingthesystemtestfortheproblem.Thesyntaxisasfollows:
try:
c=a/b
except:
c=1000000
The try statement begins a section of code within which certain errors are being
handledbytheprogrammer’scode.Afterthatstatement,codeisindentedtoshowthatitis
partof the try region. Nearly any code can appear here, but the try statement must be
endedbeforetheprogramends.
Theexceptstatementconsistsofthekeywordexceptand,optionally,thenameofan
error.TheerrorsarenamedbythePythonsystem,andthecorrectnamehastobeused,but
if no error name is given as in this example, then any error will cause the code in the
exceptstatementtobeexecuted.Notspecifyinganamehereisanimplicitassumptionthat
eitheronlyonekindoferrorcouldpossiblyoccurorthatnomatterwhaterrorhappens,the
samecodewillbeusedtodealwithit.Specifyinganunrecognizednameisitselfanerror.
Thenamecanbeavariable,butthatvariablemusthavebeenassignedarecognizederror
namebeforetheerroroccurs.Thecodefollowingtheexceptkeywordisindentedtoo,to
showthatitispartoftheexceptstatement.Thisisreferredtobyprogrammersasanerror
handler,andisexecutedonlyifthespecifiederroroccurs.
Thisappearstobeevenmoreverbosethantestingb,butanynumberofstatementscan
appearbetweenthetryandtheexcept.Thissectionofcodeisnowprotectedfromdivide
byzeroerrors.Ifanyoccur,thencodefollowingtheexceptstatementwillbeexecuted,
otherwisethatcodewillnotexecute.Ifothererrorsoccur,thenthedefaultactionwilltake
place—anerrormessagewillbeprinted.
Testingspecificallyforthedividebyzeroerrorcanbedonebyspecifyingthecorrect
errornameintheexceptstatement:
try:
c=a/b
exceptZeroDivisionError:
c=1000000
Morethanonespecificerrorcanbecaughtinoneexceptstatement:
try:
c=a/b
except(ValueError,ZeroDivisionError):
c=1000000
Clearly (ValueError, ZeroDivisionError) is a tuple, and could be made longer and
couldbeassignedtoavariable.
Also,therecanbemanyexceptstatementsassociatedwithasingletry:
try:
c=a/b
exceptValueError:
c=0
exceptZeroDivisionError:
c=1000000
And,aswasmentioned,avariablecanholdthevalueoftheerrortobecaught:
k=ZeroDivisionError
try:
c=a/b
exceptk:
c=1000000
Finally,the exceptionnamecan be left outaltogether.Inthat caseany exceptionthat
occurswillbecaughtandtheexceptioncodewillbeexecuted:
try:
c=a/b
except:
c=0
2.8.1 Problem:AFinalLookatGuessaNumber
Thefinalversionoftheprograminvolvingguessinganumberlookslikethis:
choice=7
print(“Pleaseguessanumberbetween1and10:“)
playerchoice=int(input())
ifchoice==playerchoice:
print(“Youwin!”)
else:
print(“Sorry,youlose.”)
Usingexceptionsandwhathasbeendiscussedabouterrorchecking,thisprogramcan
beimproved.First,iftheuserenterssomethingthatisnotaninteger,itisanerror.This
should be caught using an exception. Also, rather than forcing the player to run the
programagain,aloopcanbeusedtoaskforanotherguess.Theinputshouldbewithin
the try statement. The except statement should print an error message, and the entire
collectionshouldbewithinaloopthatcontinuestoasktheusertoguessanumber.Hereis
abetterversion:
choice=7
guessed=False#Hastheuserguessedareasonable
number?
whilenotguessed:#Keeptryinguntiltheyhave
print(“Pleaseguessanumberbetween1and10:“)
try:#Catchpotentialinputerrors
playerchoice=int(input())
guessed=True#Successsofar
except:#Anerroroccurred.
print(“Sorry,yourguessmustbeaninteger.”)
ifchoice==playerchoice:#Correctguess?
print(“Youwin!”)
else:
print(“Sorry,youlose.”)
ThevariableguessedissettoTruewhenasuccessfulguessismade,andthisstopsthe
loopfromrepeating.Iftheuserentersarealnumberorastring,theexceptioniscaught
beforethathappens, theerror messageis printed,and theuser is askedto enteranother
guess.
Whatelseiswrongwiththiscode?Well,theuserisaskedtoenteranumberbetween1
and10,butthatvalueisnevercheckedtoseeifitisOK.True,ifitfallsoutsidetherange,
then it will always be an incorrect guess and the player will lose. It’s a penalty for not
payingattentiontotherules.Aprogramshouldwheneverpossiblegivetheuserasmuch
information as is reasonable, so it would be better to check the value of the variable
playerchoiceandgiveanerrormessageifitisoutofrange.Thebestwaytodothisisto
placethecheckaftertheexceptstatementatthebottomoftheloop,andsetthevariable
guessedtoFalseiftheguessisanimproperone.Thentheloopwillrepeatandtheplayer
willgetanotherguess.
Thisversionoftheprogramis:
choice=7
guessed=False
whilenotguessed:
print(“Pleaseguessanumberbetween1and10:“)
try:
playerchoice=int(input())
guessed=True
except:
print(“Sorry,yourguessmustbeaninteger.”)
ifplayerchoice<10orplayerchoice>10:#Istheguess
#in1..10?
print(“Yourguesswas”,playerchoice,”whichisoutofrange.”)
guessed=False#Nope.Guessagain
ifchoice==playerchoice:
print(“Youwin!”)
else:
print(“Sorry,youlose.”)
2.9 SUMMARY
Theabilitytorepeatacollectionofoperationsisanessentialpartofanyprogramming
language.Thewhileloophasaconditionatthebeginning,andsolongasthatconditionis
true,thestatementscomprisingtheloopwillbeexecutedrepeatedly.Theforloophasan
explicitlistofitemsforwhichtheloopwillbeexecuted,orarangeofnumericalvalues
thatdefinehowmanytimesthecodewillberepeated.
Most problems that are solved using a computer program have some degree of
repetitionimplicitintheimplementation,andsomecomputeralgorithmsarequiteexplicit
abouthowtheiterationsaretobesetupandhowmanyareneededtosolvetheproblem
(See:Exercises3and4).
CertainerrorsthatcanoccurinprogramcanbedetectedautomaticallybyPython.Ifthe
programmerdoesnotspecifyotherwise,thentheseerrorscauseamessageandpremature
program termination. The try-except statement allows the programmerto handle errors
withoutendingtheprogram, andpermitsbetter communicationofthekind oferrorthat
occurred,inthecontextoftheprogram,totheprogrammeroruser.
Exercises
1.Giventhefollowingdefinitions:
var1=12
var2=100
var3=-2
var4=0
Whatisprintedbythefollowingwhileloops:
a.whilevar1<var2:
print(var1)
var1=var1+30
b.whilevar1<var2:
print(var1)
var1=var1*2
c.whilevar1>0:
var4=var4+1
var1=var1–1
print(var1,var2)
d.whilevar1>0:
var4=var4+1
var1=var1–var4
print(var1,var2)
e.whilevar1<var3:
print(”*”,end=””)
var3=var3+2
f.whilevar2>var1*var4:
var1=var1+1
var4=var4+1
print(var1,var2)
2.Whatwouldbeprintedbythefollowingforloops:
a.foriinrange(1,10):
print(i)
b.foriin(1,10):
print(i)
c.foriin(“red”,“green”,“blue”):
print(i)
d.foriinrange(0,10):
forjinrange(1,10):
ifi==j:
print(i)
e.foriinrange(0,10):
forjinrange(0,50):
ifi*i==j:
print(i)
f.foriin(0,10):
i=i*2
print(i)
g.foriinrange(1,10):
forjinrange(1,i):
print(j,end=””)
print()
h.oriinrange(0,10):
i=i+1
forjinrange(1,i):
print(j,end=””)
print()
3.TheGreekmathematicianZeno(c.450BCE)iscreditedwithcreatingtheparadoxof
theTortoiseandAchilles.AtortoisechallengedthegreatheroandathleteAchillestoa
footrace.Allthetortoiseaskedwasaten-yardheadstart.Theideawasthatoncethe
racebegan,Achillescouldruntheten-yardheadstartinasmalltime;however,inthat
sametimethetortoisewouldmoveforwardasmallamount,perhapsayard.When
Achillesmadeupthatyard,thetortoisewouldhavemovedaheadagainasmall
distance;andsoon.ThelogicwasthatAchillescouldnevercatchup.The
misunderstandinghereisthataninfinitelylongseriesofnumberscanadduptoafinite
value.WriteasmallPythonprogramthatsumsthenumbers½,¼,⅛,1/16,andsoon
for20iterationsandsuggestwhatthesumwouldbeifitwerecarriedtoaninfinite
numberofiterations.
4.OnewaytocalculatethesquarerootofanumberistouseNewton’smethod.This
startswithaninitialguess:ifthesquarerootofxisbeingcomputed,thenafairinitial
guessgwouldbex/2.Successiveestimatesaregivenbytheexpression:
newg=(g+x/g)/2
Successiveestimatesarenearerandnearertotheactualsquareroot.Writeaprogramto
computethesquarerootofanumberthatisenteredfromthekeyboard.
5.CompletetheprogramthatdrawsahistogramfortheearningsofWidgetCorpforfour
quartersof2016.Earningsare:
a.190000
b.340000
d.873000
d.439833
6.ModifytheprograminExercise5sothatthedataforthefourquartersisreadfromthe
terminal(i.e.,enteredbytheuserfromthekeyboard).Testitforthevalues:
a.900000
b.874000
d.200000
d.439000
7.ModifythesolutiontoExercise6inChapter1(makingchange)sothatitmakes
effectiveuseofaforloop.Theprogramshouldstillreadanumberbetween1and99,
whichisanamountofchangetobegiven,andprintthecoinvaluesthatwouldbe
used.Modifyittonotuse50-centpieces,becausenobodyhasthoseanymore.
8.Convertthefollowingforloopsintotheequivalentwhileloop:
a.foriinrange(1,10):
print(i,i*i)
b.sum=0
foriin(range(10,0,-1):
sum=sum+i
print(i,sum)
9.AgoodsolutiontoExercise4above(squareroot)woulddetectnegativenumbersand
printamessagetotheeffectthatsquarerootsofnegativenumbersdonotexist(notas
realnumbers,anyway).ModifythesolutiontoExercise4touseanexceptiontodeal
withthatsituation,andhandleotherpotentialerrors.
NotesandOtherResources
Online tutorial on Python loops:
http://www.tutorialspoint.com/python/python_loops.htm
Cornell University summary of if statements and loops:
http://www.cs.cornell.edu/courses/cs1130/2012sp/1130selfpaced/module2/module2part1/ifloop.html
Sthurlow.com,Loops,loops,loops…,http://sthurlow.com/python/lesson04/
1.HenryFordandSamuelCrowther.(1922).MyLifeandWork,GardenCity
Publishing,GardenCity,NY,http://www.gutenberg.org/ebooks/7213
2.DavidBeazleyandBrianK.Jones.PythonCookbook,3rdEdition:Recipesfor
MasteringPython3,http://www.onlineprogrammingbooks.com/python-cookbook-
third-edition/
CHAPTER3
SEQUENCES:STRINGS,TUPLES,AND
LISTS
3.1Strings
3.2TheTypeBytes
3.3Tuples
3.4Lists
3.5SetTypes
3.6Summary
Inthischapter
ItwasmentionedinChapter2thatforloopsinPythonaredifferentfromthosefoundin
manyotherlanguages.InJavaandC++,aforloophas a veryexplicitincrement;a for
statementlookslikethisinJava:
Fromthisitcanbeinferredthatthevariableiwillstartoutas0,andsolongasiisless
than10theloopwillcontinue.Aftereachiterationthevalueofiwillbeincreasedby1,
andthentheconditionistestedagain.
InPythontheiterationismoreimplicit,withtheloopcontrolvariabletakingononeofa
setofvaluesinturn.Thereisanimplicationhere,too,thatthereisakindofthing,atype
thatavariablecanhave,thatamountstoalistorsequenceofother,simplerthings.Thisis
true, and using variables having these types is an essential part of writing useful and
effectivecode. Pythonoffersstrings,tuples, and lists as objects that consist of multiple
parts.Theyarecalledsequencetypes.Anintegerorafloatisasinglenumber,whereasa
sequencetypeconsistsofacollectionofitems,eachofwhichisanumberoracharacter.
Eachmemberofasequenceisgivenanumberbasedonitsposition:thefirstelementin
thesequenceisgiven0,thesecondis1,andsoon.Thisisafundamentaldatastructurein
Pythonandhasinfluencedthesyntaxofthelanguage.
Stringsarefamiliarobjectsandhavebeenusedinprogramsalready,sothediscussion
willbeginthere.
3.1 Strings
Astringis asequenceofcharacters. Thewordsequenceimpliesthattheorderofthe
characterswithinthestringmatters,andthatiscertainlytrue.Stringsmostoftenrepresent
the way that communication between a computer and a human takes place. Human
languageconsistsofwordsandphrases,andeachwordorphrasewouldbeastringwithin
a program. The order of the characters within a word matters a great deal to a human
because some sequences are words and others are not. The string “last” is a word, but
“astl” is not. Also, the strings “salt” and “slat” are words and use exactly the same
charactersas“last”butinadifferentorder.
Becauseordermatters,therepresentationofastringonacomputerwillimposeanorder
onthecharacterswithin,andsotherewillbeafirstcharacter,asecond,andsoon,andit
shouldbepossibletoaccesseachcharacterindividually.Astringwillalsohavealength,
whichisthenumberofcharacterswithinit.
Acomputerlanguagewillprovidespecificthingsthatcanbedonetosomethingthatisa
string:thesearecalledoperations,andatypeisdefinedatleastpartlybywhatoperations
canbedonetosomethingofthattype.Becauseastringrepresentstextinthehumansense,
theoperationsonstringsshouldrepresentthekindsofthingsthatwouldbedonetotext.
This would include printing and reading, accessing any character, linking strings into
longerstrings,searchingastringforaparticularword,andsoon.
The examples of code written so far use only string constants. These are simply
characters enclosed in either single or double quotes. Assigning a string constant to a
variablecausesthatvariabletohavethestringtypeandgivesitavalue.Sothestatements:
name=“JohnDoe”
address='121SecondStreet'
causethevariablesnamednameandaddresstobestringswiththeassignedvalue.Note
thateithertypeofquotecanbeused,butastringthatbeginswithadoublequotemustend
withone.
Astringbehavesasifitscharactersarestoredasconsecutivecharactersinmemory.The
first character in a string is at location or index 0, and can be accessed using square
bracketsafterthestringname.Usingthedefinitionsabove,name[0]is“J”andname[5]=
“D.”Ifanindexisspecifiedthatistoolarge,itresultsinanerror,becauseitamountstoan
attempttolookpasttheendofthestring.
How many characters are there in the string name? The built-in function len() will
returnthelengthofthestring.Thelargestlegalindexisonelessthanthisvalue:thefirst
characterofastringnamehasindex0,andthefinalonehasindex7;thelengthis8.Thus,
any index between 0 and len(name)-1 is legal. The following code prints all of the
charactersofnameandcanbethoughtofasthebasicpatternforcodethatscansthrough
thecharactersinstrings:
foriinrange(0,len(name)):
print(name[i],end=””)
Thismay be a little confusing, but rememberthat the range(0,n)does notinclude n.
Thislooprunsthroughvaluesofifrom0tolen(name)-1.
Somelanguageshaveacharactertype,butPythondoesnot.Astringoflengthoneis
whatPython uses instead.Acomponent ofa string istherefore another string.The first
character of the string name, which is name[0], is “J,” the string containing only one
character.
3.1.1 ComparingStrings
Twostringscanbecomparedinthesamemannerasaretwointegersorrealnumbers,
by using one of the relational operators ==, !=, <, >, <= or >=. What it means for two
stringstobeequalissimpleandreasonable:ifeachcorrespondingcharacterintwostrings
isthesame,thenthestringsareequal. That is,forstringsaandb,ifa[0]==b[0], and
a[1]==b[1]andsoontothefinalcharactern,anda[n]==b[n],thenthetwostringsaand
bareequalanda==b.Otherwise,a!=b.Bytheway,thisimpliesthatequalstringshave
thesamelength.
What about inequalities? Strings in real life are often sorted in alphabetical order.
Namesinatelephonebook,filesinadoctor’soffice,andbooksinastore:thesetendto
appear in a logical order basedon the alphabet. This is also true in Python. The string
“abc” is less than the string “def,” for example. Why? Because the first letter in “abc”
comesbeforethefirstletterin“def”;inotherwords,“abc”[0]<“def”[0].Yes,characters
instringconstantscanbeaccessedusingtheirindex.
Astrings1islessthanstrings2ifallcharactersfrom0throughkinthetwostringsare
equal,ands1[k+1]<s2[k+1].Sothefollowingstatementsaretrue:
“abcd”<“abce”
“123”<“345”
“ab”<“abc”
Inthelastexample,thespacecharacter“.”issmallerthan(i.e.,comesbefore)theletter
“c.”Whatifthestringsarenotthesamelength?Thestring“ab”<“abc”,soiftwostrings
areequaltotheendofoneofthem,thentheshorteroneisconsideredtobesmaller.These
rulesareconsistentsofarwiththosetaughtingradeschoolforalphabetization.Trailing
spaces do not matter. Leading spaces can matter, because a space comes before any
alphabeticcharacter;thatis,
““<“a”.Thus“ab”>“z”.
Digitscomebeforelowercaseletters.“1”<“a”,and“1a”<“a1”.Mostimportantly,
uppercase comes before lowercase, so “John” < “john”. All of these rules are
consistent with those that secretaries understand when filing paper documents. As an
examplethatcomparesstrings,considerthefollowing:
a=“J”
b=“j”
c=“1”
ifb<c:
print(“Lcase<numbers”)
else:
print(“Lcase>numbers”)
ifa<c:
print(“Ucase<numbers”)
else:
print(“Ucase>numbers”)
Thisresultsintheoutput:
Lcase>numbers
Ucase>numbers
Problem:DoesaCityName,EnteredattheConsole,Come
beforeoraftertheNameDenver?
This involves reading a string and comparing it against the constant string “Denver.”
Lettheinputstringbereadintoavariablenamedcity.Thentheansweris:
city=input()
ifcity<“Denver”:
print(“ThenamegivencomesbeforeDenverinan
alphabeticlist”)
elifcity>“Denver”:
print(“ThenamegivencomesafterDenverinan
alphabeticlist”)
else:
print(“ThenamegivenwasDenver”)
If“Chicago”istypedattheconsoleasinput,theresultis:
Chicago
ThenamegivencomesbeforeDenverinanalphabeticlist
However,ifcaseisignoredand“chicago”istypedinstead,thentheresultis:
chicago
ThenamegivencomesafterDenverinanalphabeticlist
because, of course, the lowercase “c” comes (as do all lowercase letters) after the
uppercase“D”atthebeginningof“Denver.”
3.1.2 Slicing–ExtractingPartsofStrings
Toaperson,astringusuallycontainswordsandphrases,whicharesmallerpartsofa
string. Identifying individual words is important. To Python this is true also. A Python
programconsistsofstatementsthatcontainindividualwordsandcharactersequencesthat
each have a particular meaning. The words “if,” “while,” and “for” are good examples.
Individualcharacterscanbereferencedthroughindexing,butcanwordsorcollectionsof
charactersbeaccessed?Yes,ifthe
location(index)ofthewordisknown.
Problem:Identifya“Print”StatementinaString
Thestatement:
print(“Lcase<numbers”)
appearsintheexampleprogramabove.Thiscanbethoughtofasastring,andassignedto
avariable:
statement='print(“Lcase<numbers”)'
Question:isthisaprintstatement?Itisifthefirstfivecharactersaretheword“print.”
Eachofthosecharacterscouldbetestedindividuallyusing:
ifstatement[0]=='p':
ifstatement[1]=='r':
ifstatement[2]=='i':
ifstatement[3]=='n':
ifstatement[4]=='t':
ifstatement[5]=='':
#Thisisaprintstatement.
Thisisprettyugly,andissomethingthatisneededoftenenoughthatPythonoffersa
nicerway to do it. A slice is a set of continuous characters within a string. This means
their indices are consecutive, and they can be accessed as a sequence by specifying the
rangeofindiceswithinbrackets.Thesituationaboveconcerningtheprintstatementcould
bedonelikethis:
ifstatement[0:5]==“print”:
Thesliceheredoesnotincludecharacter5,butis5characterslongincludingcharacters
0through 4 inclusive.Aslice fromitoj(i.e.,x[i:j]) does not include character j.This
meansthatthefollowingstatementsproducethesameresult:
fname[0]
fname[0:1]
Ifthefirstindexisomitted,thenthestartindexisassumed,sothestatement:
ifstatement[0:5]==“print”:
isthesameas:
ifstatement[:5]==“print”:
Ifthesecondindexisomitted,thenthelastlegalindexisassumed,whichistosaythe
indexofthefinalcharacter.Sotheassignment:
str=statement[6:]
results in the value of str being “(“Lcase < numbers”).” Both indices can be omitted,
which does sound silly, but really just means from the first to the last character, or the
entirestring.
3.1.3 EditingStrings
Pythondoesnotallow themodificationofindividual partsofa string.Thatis,things
like:
str[3]=“#”
str[2:3]=“..”
are not allowed. So how can strings be modified? For example, consider the string
variable:
fname=“image”
IfthisissupposedtobethenameofaJPEGimagefile,thenitmustendwiththesuffix
“.jpg.”
Problem:CreateaJPEGFileNamefromaBasicString
Thestringfnamecanbeeditedtoendwith“.jpg”inafewways,buttheeasiestoneto
useistheconcatenationoperator“+’.
Toconcatenatemeans“tolinkorjointogether.”Ifthevariablesaandbarestrings,then
a+b is the string consisting of all characters in a followed by all characters in b; the
operator “+” in this context means to concatenate, rather than to add numerically. The
designers of Python and many other languages that implement this operator think of
concatenationasstringaddition.
Tousethistocreatetheimagefilename,simplyconcatenate“.jpg”tothestringfname:
fname=fname+“.jpg”
Theresultisthatfnamecontains“image.jpg.”
File suffixes are very often the subject of string manipulations and provide a good
example of string editing. For instance, given a file name stored as a string variable
fname,isthesuffix“.jpg”?Basedontheprecedingdiscussion,thequestion
canbeansweredusingasimpleifstatement:
iffname[len(fname)-4:len(fname)]=='.jpg':
Usingasliceitcouldalsotaketheform:
iffname[len(fname)-4:]==“.jpg”
Avaluablethingtoknowisthatnegativeindicesindexfromtheright-handsideofthe
string;thatis,fromtheend.Sofname[-1]isthefinalcharacterinthestring,fname[-2]is
theoneprevioustothat,andsoon.Thelast4characters,thesuffix,wouldbecapturedby
usingfilename[-4:].
Problem:ChangetheSuffixofaFileName
Some individuals use the suffix “.jpeg” instead of “.jpg.” Some programs allow this,
othersdonot.Somecodethatwoulddetectandchangethissuffixwouldbe:
iffname[len(fname)-5:]==“.jpeg”:#identfythejpeg
#suffix
fname=fname[0:len(fname)-5]#removethelastfive
#characters
fname=fname+“.jpg”#appendthecorrect
#suffix
Problem:ReversetheOrderofCharactersinaString
There are things about any programming language that could be considered to be
“idioms.” These are things that a programmer experienced in the use of that language
wouldconsidernormaluse,butthatothersmightconsiderodd. This problemexposesa
Python idiom. Given what is known so far about Python, the logical approach to string
reversalmightbeasfollows:
#cityhasalegalvalueatthispoint
k=len(city)
foriinrange(0,len(city)):
city=city+city[k-i-1]
city=city[len(city)//2:]
Thisreversesthestringnamedcitythatexistspriortotheloopandcreatesthereversed
string.Itdoessointhefollowingway:
1.Letibeanindexintothestringcity,startingat0andrunningtothefinalcharacter.
2.Indexacharacterfromtheendofthestring,startingatthefinalcharacterand
steppingbackwardsto0.Sincethelastcharacterislen(city)andthecurrentindexis
i,thecharactertobeusedinthecurrentiterationwouldbek-i-1wherekisthe
lengthoftheoriginalstring.
3.Appendcity[k-i-1]tothenendofthestring.Alternatively,anewstringrscouldbe
createdandthischaracterappendedtoitduringeachiteration.
4.Afterallcharactershavebeenexamined,thestringcitycontainstheoriginalstring
at the beginning and the reversed string at the end. The first characters can be
removed,leavingthereversedstringonly.
AnexperiencedPythonprogrammerwoulddothisdifferently.Thesyntaxfortakinga
slicehasavariationthathasnotbeendiscussed;athirdparameterexists.Astringslicecan
beexpressedas:
myString[a:b:c]
whereaisthestartingindex,bisthefinalindex+1,andcistheincrement.
Increment?If:
str=“Thisstringhas30characters.”
Thenstr[0:30:2] is“Ti tighs3hrces,” which isevery secondcharacter.The increment
representsthewaythestringissampled,thatis,everyincrementcharacteriscopiedinto
theresult.Mostrelevanttothecurrentexample,theincrementcanbenegative.Theidiom
forreversingastringis:
print(str[::-1])
Ashasbeenexplained,thevalueofstr[:]isthewholestring.Specifyinganincrementof
-1impliesthatthestringisscannedfrom0totheend,butinreverseorder.Thisisfarfrom
intuitive,butisprobablythewaythatanexperiencedPythonprogrammerwouldreversea
string.Anyprogrammershouldusethepartsofanylanguagethattheycomprehendvery
well,andshouldkeepinmindthelikelyskillsetofthepeoplelikelytoreadthecode.
Problem:IsaGivenFileNameThatofaPythonProgram?
APythonprogramterminateswiththesuffix“.py.”Anobvioussolutiontothisproblem
istosimplylookatthelast3charactersinthestringstoseeiftheymatchthatsuffix:
ifs[len(s)-3:len(s)]=='.py':
print(“ThisisaPythonprogram.”)
Perhaps.Butis“PROGRAM.PY”alegalPythonprogram?Ithappensthatitis.Sois
“program.Py”and“program.pY.”Whatcanbedonehere?
3.1.4 StringMethods
A good way to do the test in this case is to convert the suffix to all upper- or all
lowercase before doing the comparison. Comparing against “.py” means converted to
lowercase,whichisdonebyusingabuilt-inmethodnamedlower:
s1=s[len(s)-3:len(s)]
ifs1.lower()=='.py':
print(“ThisisaPythonprogram.”)
Thevariables1isastringthatwillcontainthefinal3charactersofs.Theexpression
s1.lower()createsacopyofs1inwhichallcharactersarelowercase.It’scalledamethod
todistinguishitfromafunction,buttheyareverysimilarthings.Youshouldrecallthata
method is simply a function that belongs to one type or class of objects. In this case
lower() belongs to the type (or class) string. There could be another method named
lower() that belongs to another class and that did a completely different thing. The dot
notationindicatesthatitisamethod,andwhatclassitbelongsto:thesameclassofthings
thatthevariablebelongsto.Inaddition,thevariableitselfisreallythefirstparameter;if
lowerwereafunction,thenitmightbecalledbylower(s1)insteadofs1.lower().Inthe
lattercase,the“.”isprecededbythefirstparameter.
Stringsallhavemanymethods.Inthetablebelowthevariablesisthe
targetstring,theonebeingoperatedupon.Thismeansthatthemethodnamesbelowwill
appearfollowing“s.,”asins.lower().Letthevalueofsbegivenby
s=“hello toyou all.” Thesemethodsareintendedto providetheoperationsneededto
makethestringtypeinPythonfunctionasamajorcommunicationdevicefromhumansto
aprogram.
3.1.5 SpanningMultipleLines
Textasseeninhumandocumentsmaycontainmanycharacters,evenmultiplelinesand
paragraphs.Aspecialdelimiter,thetriplequote,isusedwhenastringconstantistospan
manylines.Thishasbeenmentionedpreviouslyinthecontextofmultilinecomments.The
regularstringdelimiterswillterminatethe stringattheend oftheline. The triplequote
consists of either of the two existing delimiters repeated three times. For example, to
assign the first stanza of Byron’s poem “She Walks in Beauty” to the string variable
poem:
poem='''Shewalksinbeautylikethenight
Ofcloudlessclimesandstarryskies,
Andallthat'sbestofdarkandbright
Meetsinheraspectandhereyes;
Thusmellow'dtothattenderlight
WhichHeaventogaudydaydenies.'''
Whenpoemisprintedthelineendingsappearwheretheywereplacedintheconstant.
Thisexampleisaparticularlygoodoneinthatmostpoemsrequirethatlinesendprecisely
wherethepoetintended.
AnotherexampleofastringthatmustbepresentedjustastypedisaPythonprogram.A
programcanbeplacedinastringvariableusingatriplequote:
program=“““list=[1,2,4,7,12,15,21]
foriinlist:
print(i,i*2)”””
When printed this string has the correct form to be executed by Python. In fact, the
followingstatementwillactuallyexecutethecodeinthestring:
exec(program)
3.1.6 ForLoopsAgain
Earlierinthissectionaforloopwaswrittentoprinteachcharacterinthestring.That
loopwas:
foriinrange(0,len(name)):
print(name[i],end=””)
Obviouslythestringcouldhavebeenprintedusing:
print(name)
butitwasbeingusedasanexampleofindexingindividualcomponentswithinthestring.
Thecharacters donot need tobe indexed explicitlyin Python;the loop variablecan be
assignedthevalueofeachcomponent:
foriinname:
print(i,end=””)
Inthiscasethevalueofiisthevalueofthecomponent,notitsindex.Eachcomponent
ofthestringisassignedtoiinturn,andthereisnoneedtotestfortheendofthestringor
toknowitslength.Thisisabetterwaytoaccesscomponentsinastringand,asithappens,
canbeusedwithallsequencetypes.Whetheran
indexisusedorthecomponentsarepulledoutoneatatimedependsontheproblembeing
solved;sometimestheindexisneeded,othertimesitisnot.
3.2 THETYPEBYTES
Astring is a sequence of characters, a sequence being defined as a collection within
whichordermatters.Stringsare commonlyusedfor communicationbetweencomputers
andhumans:toprintheadingsandvaluesonthescreen,andtoreadobjectsincharacter
stringform.Humansdealwithcharactersverywell.Thetypebytesrepresentsasequence
of integers, albeit small ones. A bytes object of length 1 is an 8-bit integer, or a value
between0and255.Abytesobjectoflengthgreaterthan1isasequenceofsmallintegers.
Tobeclear,ifsisastringandbisabytesthen:
s[i]isacharacter
b[i]isasmallinteger
Astringconstant(literal)isasequenceofcharactersenclosedinquotes.Abytesliteral
isasequenceofcharactersenclosedinquotesandprecededbytheletter“b.”Thus:
'thisisastring'
isastring,whereas:
b'thisisastring'
hastypebytes.Anymethodthatappliestoastringalsoappliestoabytesobject,butbytes
objects have some new ones. In particular, to convert a bytes object to a string, the
decode()methodisusedandacharacterencodingshouldbegivenastheparameter.Ifno
parameterisgiven,thenthedecodingmethodistheonecurrentlybeingused.Therearea
fewpossibledecodingmethods(e.g.,“utf-8”).Sotoconvertabytesobjectbtoacharacter
strings,thefollowingwouldwork:
s=b.decode(“utf-8”)
Aquestionremains:“whyisthebytestypeneeded?”Whatisitusedfor?Because(and
this is a little ahead of what is needed) it implements the buffer interface. Certain file
operationsrequireabufferinterfacetoaccomplishtheirtasks.Anythingreadfromsome
specifictypesoffilewillbeofthetypebytes,forexample,asithasthatinterface.This
will be discussed further in Chapters 5 and 8, but for the moment it simply serves to
explainwhythetypeexistsatall.Otherthanthebufferinterface,thebytestypeisvery
muchlikeastring,andcanbeconvertedbackandforth.
3.3 TUPLES
Atupleisalmostidenticaltoastringinbasicstructure,exceptthatitiscomposedof
arbitrary components instead of characters. The quotes can’t be used to delimit a tuple
becauseastringcanbeacomponent,soatupleisgenerallyenclosedinparentheses.The
followingaretuples:
tup1=(2,3,5,7,11,13,17,19)#Primenumbersunder20
tup2=(“Hydrogen”,“Helium”,“Lithium”,”Beryllium”,“Boron”,
“Carbon”)
tup3=“hi”,“ohio”,“salut”
Ifthereisonlyoneelementinatuple,thereshouldbeacommaattheend:
tup4=(“one”,)
tup5=“two”,
That’sbecauseitwouldnotbepossibleotherwisetotellthedifferencebetweenatuple
andastringenclosedinparentheses.Is(1)atuple?Orisitsimplythenumber1?
Atuplecanbeempty:
tup=()
Becausetheyarelikestrings,eachelementinatuplehasanindex,andtheybeginat0.
Tuplescanbeindexedandsliced,justlikestrings.So:
tup1[2:4]is(5,7)
Concatenationislikethatofstringstoo:
tup4=tup4+tup5#yieldstup4=('one','two')
Asisthecasewithstrings,theindex-1givesthelastvalueinthetuple,-2givesthe
secondlast,andsoon.Sointheexampleabove,tup2[-1]is“Carbon.”Also,likestrings,
thetupletypeisimmutable;thismeansthatelementsinthetuplecan’tbealtered.Thus,
statementssuchas:
tup1[2]=6
tup3[1:]“bonjour”
arenotallowedandwillgenerateanerror.
Tuplesareanintermediateformbetweenstrings,whichhavejustbeendiscussed,and
lists, which will be discussed next. They are simpler to implement than list (are
lightweight)andaremoregeneralthanstrings.
Are tuples useful? Yes, it turns out, and part of their use is that they underlie other
aspectsofPython.
3.3.1 TuplesinForLoops
Sequences can be used in a for loop to control the iteration and assign to the loop
controlvariable.Tuplesareinterestinginthiscontextbecausetheycanconsistofstrings,
orintegersorfloats.Theloop:
foriin(“Hydrogen”,“Helium”,“Lithium”,”Beryllium”,“Boron”,
“Carbon”):
will iterate 6 times, and the variable i will take on the values in the tuple in the order
specified.Thevariableiisastringinthiscase.Incaseswherethetypesinthetupleare
mixed,thingsaremorecomplicated.
Problem: Print the Number of Neutrons in an Atomic
Nucleus
Considerthetuple:
atoms=(“Hydrogen”,1,“Helium”,2,“Lithium”,3,“Beryllium”,4,
“Boron”,5,“Carbon”,6)
andtheloop
foriinatoms:
print(i)
Thisprints:
Hydrogen
1
Helium
2
Lithium
3
Beryllium
4
Boron
5
Carbon
6
Thenumberfollowingthenameoftheelementistheatomicnumberofthatelement,
the number of protons in the nucleus. In this case the type of the variable i alternates
betweenstringandinteger.Forelementswithalowatomicnumber(lessthan21),agood
guess for the number of neutrons in the nucleus is twice the number of protons. The
problemisthatsomeofthecomponentsarestringsandsomeareintegers.Theprogram
shouldonly do the calculationwhen it is inan iteration having aninteger value for the
loopvariable,becauseastringcan’tbemultipliedbytwo.
Abuilt-infunctionthatcanbeofassistanceisisinstance.Ittakesavariableandatype
name and returns True if the variable is of that type and False otherwise. Using this
function,hereisafirststabataprogramthatmakestheneutronguess:
atoms=(“Hydrogen”,1,“Helium”,2,“Lithium”,3,“Beryllium”,4,
“Boron”,5,“Carbon”,6)
foriinatoms:
ifisinstance(i,int):
j=i*2
print(“has“,i,“protonsand“,j,”neutrons.”)
else:
print(“Element“,i)
Inotherwords,initerationswhereiisanintegerasdeterminedbyisinstance,thenican
legallybemultipliedby2andtheguessaboutnumberofneutronscanbeprinted.
Anotherwaytosolve thesameproblemwould betoindexthe elementsofthe tuple.
Elements0,2,4,etc.(evenindices)refertoelementnames,whiletheothersrefertoatomic
numbers.Thiscodewouldlookasfollows:
atoms=(“Hydrogen”,1,“Helium”,2,“Lithium”,3,“Beryllium”,4,
“Boron”,5,“Carbon”,6)
foriinrange(0,len(atoms)):
ifi%2==1:
j=atoms[i]*2
print(“has“,atoms[i],“protonsand“,j,
”neutrons.”)
else:
print(“Element“,atoms[i])
Notethatinthiscasetheloopvariableisalwaysaninteger,andisnotanelementofthe
tuplebutisanindexatwhichtofindanelement.That’swhytheexpressionatoms[i]is
usedinsidetheloopinsteadofsimplyiasbefore.
3.3.2 Membership
Tuplesarenotsetsinthemathematicalsense,becauseanelementcanbelongtoatuple
morethanonce,andthereisanordertotheelements.However,somesetoperationscould
beimplementedusingtuplesbylookingatindividualelements.Setunionandintersection,
forexample.TheintersectionoftwosetsAandBisthesetofelementsthataremembers
ofAandalsomembersofB.Themembershipoperatorfortuplesisthekeywordin:
If1intuple1:
The intersection of A and B, where A and B are tuples, could be found using the
followingcode:
foriinA:
ifiinB:
C=C+i
ThetupleCwillbetheintersectionofAandB.Itworksbytakingeachknownelement
ofAandtestingtoseeifitisamemberofB;ifso,itisaddedtoC.
Problem: What Even Numbers Less than or Equal to 100
areAlso
PerfectSquares?
Thiscouldbeexpressedasasetintersectionproblem.Thesetofevennumberslessthan
100couldbeenumerated(thisisnotactualcode):
A=(2,4,6,8,10…andsoon
Orcouldbegeneratedwithinaloop:
A=()#Startwithanemptytuple
foriinrange(0,51):#forappropriateintegers
A=A+(i*2,)#addthenextevennumbertothe
#tuple
#Can'tsimplyuseA+ibecauseiisinteger,notatuple.
Similarly,theperfectsquarescouldbeenumerated:
B=(4,9,16,25,36,49,64,81,100)
Or,again,createdinaloop:
B=()
foriinrange(0,11):
B=B+((i*i),)
NowthesetAcanbeexamined,elementbyelement,toseewhichmembersalsobelong
toB:
C=()
foriinA:
ifiinB:
C=C+(i,)
Theresultis:(0,4,16,36,64,100).
Twoimportantthingsarelearnedfromthis.First,whenconstructinganewtuplefrom
components,onecan beginwithan emptytuple.Second, individualcomponentscan be
addedtoatupleusingtheconcatenationoperator“+,”buttheelementshouldbemadeinto
atuplewithonecomponentbeforedoingtheconcatenation.
3.3.3 Delete
A tuple is immutable, meaning that it cannot be altered. Individual elements can be
indexedbutnot changedor deleted. Whatcan bedoneis tocreate anewtuple thathas
newelements;inparticular,deletinganelementmeanscreatinganewtuplethathasallof
theotherelementsexcepttheonebeingdeleted.
Problem:DeletetheElementLithiumfromtheTupleAtoms,
alongwithItsAtomicNumber.
Goingbacktothetupleatoms,deletingoneofthecomponents—inparticular,Lithium
—beginswithdeterminingwhichcomponentLithiumis;thatis,whatisitsindex?Sostart
atthefirstelementofthetupleandlookforthestring
“Lithium,”stoppingwhenitisfound.
foriinrange(0,len(atoms)):
ifatoms[i]==“Lithium”:#Founditatlocationi
break;
else:
i=-1#notfound
Knowing the index of the element to be deleted, it is also known that all elements
before that one belong to the new tuple and all elements after it do too. The elements
beforeelementi can be written as atoms[0:i]. Each element consists of a string and an
integer, and assuming that both are to be deleted means that the elements following
element i are atoms[i+2:]. In general to delete one element the second half would be
atoms[i+1:].Finishingthecodethatdeletes“Lithium”:
ifi>=0:
atoms=atoms[0:i]+atoms[i+2:]
So the tuple atoms has not been altered so much as it has been replaced completely
withanewtuplethathasnoLithiumcomponent.
3.3.4 Update
Again, because a tuple is immutable, individual elements cannot be changed. A new
tuple can be created that has new elements; in particular, updating an element means
creatinganewtuplethathasalloftheotherelementsexcepttheonebeingupdated,and
thatincludesthenewvalueinthecorrectposition.
Problem: Change the Entry for Lithium to an Entry for
Oxygen
An update is usually a deletion followed by the insertion or addition of a new
component.Adeletionwasdoneintheprevioussection,sowhatremainsistoaddanew
component where the old one was deleted. Inserting the element Oxygen in place of
Lithiumwouldbegininthesamewayasthesimpledeletionalreadyimplemented:
foriinrange(0,len(atoms)):
ifatoms[i]==“Lithium”:#Founditatlocationi
break;
else:
i=-1#notfound
Next,anewtupleforOxygeniscreated:
newtuple=(“Oxygen”,8)
AndfinallythisnewtupleisplacedatlocationiwhileLithiumisremoved:
ifi>=0:
atoms=atoms[0:i]+newtuple+atoms[i+2:]
However,anupdatemaynotalwaysinvolveadeletion.IfLithiumisnotacomponent
ofthetupleatoms,thenperhapsOxygenshouldbeaddedtoatomsanyway.Where?How
aboutattheend?
else:#Ifiis-1thenthenewtuplegoesattheend
atoms=atoms+newtuple
3.3.5 TupleAssignment
One of the unique aspects of Python is so-called tuple assignment. When a tuple is
assignedtoavariable,thecomponentsareconvertedintoaninternalformthatistheone
tuplesalwaysuse.Thisiscalledtuplepacking,andishasalreadybeenencountered:
atoms=(“Hydrogen”,1,”Helium”,2,“Lithium”,3,“Beryllium”,4,
“Boron”,5,“Carbon”,6)
Whatisreallyinterestingisthattupleunpackingcanalsobeused.Considerthetuple:
srec=('Parker','Jim',1980,'Math550','C+','Cpsc302','A+')
whichisatuplepackingofastudentrecord.Itcanbeunpackedintoindividualvariables
inthefollowingway:
(fname,lname,year,cmin,gmin,cmax,gmax)=srec
Whichisthesameas:
fname=srec[0]
lname=srec[1]
year=srec[2]
cmin=srec[4]
gmin=srec[5]
cmax=srec[6]
gmax=srec[7]
Ofcourse,theimplicationisthatNvariablescanbeassignedthevalueofNexpressions
orvariables“simultaneously”ifbotharewrittenastuples.Exampleswouldbe:
(a,b,c,d,e)=(1,2,3,4,5)
(f,g,h,i,j)=(a,b,c,d,e)
Theexpression
(f,g,h,i,j)=2**(a,b,c,d,e)
isinvalidbecausetheleftsideof“**”isnotatuple,andPythonwon’tconvert2intoa
tuple.Also:
(f,g,h,i,j)=(2,2,2,2,2)**(a,b,c,d,e)
isalsoinvalidbecause“**”isnotdefinedontuples,norareotherarithmeticoperations.
Aswithstrings,“+”meansconcatenation,though,so(1,2,3)+(4,5,6)yields(1,2,3,4,5,6).
Exchangingvaluesbetweentwovariablesisacommonthingtodo.It’sanessentialpart
of a sorting program, for example. The exchange in many languages requires three
statements:atemporarycopyofoneofthevariableshastobemadeduringtheswap:
temp=a
a=b
b=temp
Because of the way that tuples are implemented, this can be performed in one tuple
assignment:
(a,b)=(b,a)
This is a little obscure, not to an experienced Python programmer but certainly to a
beginner,andoftentoexperiencedprogrammersinotherlanguages.AJavaprogrammer
couldseewhatwasmeant,butinitiallytherationalewouldnotbeobvious.Thisstatement
deservesacommentsuchas“performanexchangeofvaluesusingatupleassignment.’
3.3.6 Built-InFunctionsforTuples
Asexamplesforthetablebelow,usethefollowing:
T1=(1,2,3,4,5)
T2=(-1,2,4,5,7)
Inaddition,tuplescanbecomparedusingthesameoperatorsasforintegersandstrings.
Comparison is done on an element-by-element basis, just as it is with strings. In the
exampleabove,T1>T2becauseatthefirstlocationwherethetwotuplesdiffer(theinitial
component), the element in T1 is greater than the corresponding element in T2. It is
necessaryforthecorrespondingelementsofthetupletobecomparable;thatis,theyneed
tobeofthesametype.Soifthetuplest1andt2aredefinedas:
t1=(1,2,3,“4”,“5”)
t2=(-1,2,4,5,7)
thentheexpressiont1>t2isnot allowed.A stringcan’tbecomparedagainstaninteger,
andtheelementindexedby3oft1isastring,whereastheelementindexedby3oft2is
anint.
3.4 Lists
OnewaytothinkofaPythonlististhatitisatupleinwhichthecomponentscanbe
modified.TheyhavemanypropertiesofanarrayofthesortonemightfindinJavaorC,
inthattheycanbeusedasaplacetostorethingsandhaverandomaccesstothem;any
elementcanbereadorwritten.Theyareoftenusedasonemightuseanarray,buthavea
greaternaturalfunctionality.
Initiallyalistlookslikeatuple,butusessquarebracketstodelimitit.
list1=[2,3,5,7,11,13,17,19]#Primenumbersunder20
list2=[“Hydrogen”,“Helium”,“Lithium”,”Beryllium”,“Boron”,
“Carbon”]
list3=[“hi”,“ohio”,“salut”]
Alistcanbeempty:
list4=[]
andbecausetheyareliketuplesandstrings,eachelementinalisthasanindex,andthey
begin(asusual)at0.Listscanbeindexedandsliced,asbefore:
list1[2:4]is[5,7]
Concatenationislikethatofstringstoo:
list6=list1+[23,31]
yields[2,3,5,7,11,13,17,19,23,31]
Negativevaluesindexfrom theendof thestring.However, unlikestringsand tuples,
individualelementscanbemodified.So:
list1[2]=6
resultsinlist1being[2,3,6,7,11,13,17,19].Also:
list3[1:]=“bonjour”
resultsinlist3takingthevalue—oops,itbecomes:
[‘hi’,“b’,“o’,“n’,“j’,“o’,“u’,“r’].
That’sbecauseastringisasequencetoo,andthisstringconsistsofsevencomponents.
Eachcomponentofthestringbecomesacomponentofthelist.Ifthestring“bonjour”is
supposedtobecomeasinglecomponentofthelist,thenitneedstobedonethisway:
list3[1:]=[“bonjour”]
The other components of list3 are sequences, and now so is the new one. However,
integersarenotsequences,andtheassignment:
list1[2]=[6,8,9]
resultsinthevalueoflist2being:
[2,3,[6,8,9],7,11,13,17,19]
Thereisalistwithinthislist;thatis,thethirdcomponentoflist1isnotaninteger,butis
alistofintegers.That’slegitimate,andworksfortuplesaswell,butmaynotbewhatis
intended.
Problem:ComputetheAverage(Mean)ofaListofNumbers
Themeanisthesumofallnumbersinacollectiondividedbythenumberofnumbers.
Ifasetofnumbersalreadyexistsasalist,calculatingthemeanmightinvolvealoopthat
sumsthemfollowedbyadivision.Forexample,assumingthatlist1=[2,3,5,7,11,13,
17,19]:
mean=0.0
foriinlist1:
mean=mean+i
mean=mean/len(list1)
Itcanbeseenthatalistcanbeusedinalooptodefinethevaluesthattheloopvariablei
will take on, a similar situation to that of a tuple. A second way to do the same thing
wouldbe:
mean=0.0
foriinrange(0,len(list1)):
mean=mean+list1[i]
mean=mean/len(list1)
Inthiscasetheloopvariableiisanindexintothe list andnotalistelement,butthe
result is the same. Python lists are more powerful than this, and making use of the
extensivepowerofthelistsimplifiesthecalculationgreatly:
mean=sum(list1)/len(list1)
Thebuilt-infunctionsumwillcalculateandreturnthesumofalloftheelementsinthe
list.Thatwasthepurposeoftheloop,sotheloopisnotneededatall.Thefunctionsthat
workfortuplesalsoworkforlists(min,max,len),butsomeofthepoweroflistsisinthe
methodsitprovides.
3.4.1 EditingLists
Editingalistmeanstochangethevalueswithinit,usuallytoreflectanewsituationto
behandledbytheprogram.Themostobviouswaytoeditalististosimplyassignanew
valuetooneofthecomponents.Forexample:
list2=[“Hydrogen”,“Helium”,“Lithium”,”Beryllium”,“Boron”,
“Carbon”]
list2[0]=“Nitrogen”
print(list2)
resultsinthefollowingoutput:
[‘Nitrogen’,“Helium’,“Lithium’,“Beryllium’,“Boron’,“Carbon’]
Thissubstitutionofacomponentisnotpossiblewithstringsortuples.Itispossibleto
replaceasinglecomponentwithanotherlist:
list2=[“Hydrogen”,“Helium”,“Lithium”,”Beryllium”,“Boron”
“Carbon”]
list2[0]=[“Hydrogen”,“Nitrogen”]
resultsin:
list2=[['Hydrogen','Nitrogen'],'Helium','Lithium',
'Beryllium','Boron','Carbon']
3.4.2 Insert
This is not normally what is thought of as an insertion, though. To place new
componentswithinalist,theinsertmethodisprovided.Thismethodplacesacomponent
ataspecifiedindex;thatis,theindexofthenewelementwillbetheonegiven.Toplace
“Nitrogen”atthebeginningoflist2,whichisindex0:
list2.insert(0,“Nitrogen”)
The first value given to insert, 0 in this case, is the index at which to place the
component,andthesecond valueisthe thingtobe inserted. Inserting“Nitrogen”atthe
endofthelistwouldbeaccomplishedby:
list2.insert(len(list2),“Nitrogen)
However,considerthis:
list2.insert(-1,“Nitrogen)
Willthisinsert“Nitrogen”attheend?No.Atthebeginningofthestatement,thevalue
oflist2[-1]is“Carbon.”Thisisthevalueatindex5.Therefore,the
insertof“Nitrogen”willbeatindex5,resultingin:
[ꞌHydrogenꞌ,ꞌHeliumꞌ,ꞌLithiumꞌ,ꞌBerylliumꞌ,ꞌBoronꞌ,
ꞌNitrogenꞌ,ꞌCarbonꞌ]
3.4.3 Append
Anotherwaytoaddsomethingtotheendofalististousetheappendmethod:
list2.append(“Nitrogen”)
Resultsin:
[ꞌHydrogenꞌ,ꞌHeliumꞌ,ꞌLithiumꞌ,ꞌBerylliumꞌ,ꞌBoronꞌ,
ꞌCarbon’,ꞌNitrogen’]
Remember, the “+” operation will only concatenate a list to a list, so the equivalent
expressioninvolving“+”wouldbe:
list2=list2+[“Nitrogen”]
3.4.4 Extend
The extend method does pretty much the same things as the “+” operator. With the
definitions:
a=[1,2,3,4,5]
b=[6,7,8,9,10]
print(a+b)
a.extend(b)
print(a)
theoutputis:
[1,2,3,4,5,6,7,8,9,10]
[1,2,3,4,5,6,7,8,9,10]
However,ifappendhasbeenusedinsteadofextendabove:
a=[1,2,3,4,5]
b=[6,7,8,9,10]
print(a+b)
a.append(b)
print(a)
theresultwouldhavebeen:
[1,2,3,4,5,6,7,8,9,10]
[1,2,3,4,5,[6,7,8,9,10]]
3.4.5 Remove
The remove method does what is expected: it removes an element from the list. But
unlike insert, for example, it does not do it using an index; the value to be removed is
specified.So:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
list1.remove(“Helium”)
results in the list1 being [‘Hydrogen’, “Lithium’, “Beryllium’, “Boron’, “Carbon’].
Unfortunately,ifthecomponent being deletedisnota memberofthe list,thenanerror
occurs.Therearewaystodealwiththat,oratestcanbemadefortryingtodeleteanitem:
if“Nitrogen”inlist1:
list1.remove(“Nitrogen”)
Ifthereismorethanasingleinstanceoftheitembeingremoved,thenonlythefirstone
willberemoved.
3.4.6 Index
Whendiscussingtuplesitwaslearnedthattheindexmethodlookedthroughthetuple
andfoundtheindexatwhichaspecifieditemoccurred.Theindexmethodforlistsworks
inthesameway.So:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
print(list1.index(“Boron”))
prints“4,”becausethestring“Boron”appearsatindex4inthislist(startingfrom0,of
course).Ifthereismorethanoneoccurrenceof“Boron”inthelist,thentheindexofthe
firstone(i.e.,smallestindex)isreturned.Ifthevalueisnotfoundinthestring,thenan
erroroccurs.Again,itmightbeappropriatetocheck:
if“Boron”inlist1:
print(list1.index(“Boron”))
3.4.7 Pop
Thepopmethodiseffectivelythereverseorinverseofappend.Itremovesthelastitem
(i.e., the one having the largest index) from the list. If the list is empty, then an error
occurs.Foranexample:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
list1.pop()
print(list1)
printstheresult:
[ꞌHydrogenꞌ,ꞌHeliumꞌ,ꞌLithiumꞌ,ꞌBerylliumꞌ,ꞌBoronꞌ]
Toavoidtheerrorthatcanoccurifthelistisempty,simplychecktoseethatthelength
ofthelistisgreaterthanzerobeforeusingpop:
iflen(list1)>0:
list1.pop()
Themethodiscalledpopbecauseitrepresentsawaytoimplementtheoperationofthe
samenameonadatastructurecalledastack.
3.4.8 Sort
This method places the components of a list into ascending order. Using the list1
variablethathasbeenusedsooften:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
list1.sort()
print(list1)
theresultis:
[ꞌBerylliumꞌ,ꞌBoronꞌ,ꞌCarbonꞌ,ꞌHeliumꞌ,ꞌHydrogenꞌ,
ꞌLithiumꞌ]
whichisinalphabeticorder.Themethodwillsortintegersandfloatingpointnumbersas
well.Stringsandnumberscannotbemixed,though,becausetheycan’tbecompared.So:
list2=[“Hydrogen”,1,“Helium”,2,“Lithium”,3,“Beryllium”,4,
“Boron”,5]
list2.sort()
resultsinanerrorthatwillbesomethinglike:
list2.sort()
TypeError:unorderabletypes:int()<str()
The meaning of this error should be clear. Things of type int(integer) and things of
typestr(string)can’tbecomparedagainsteachotherandsocan’tbeplacedinasensible
order if mixed. For sort to work properly, all of the elements of the list must be of the
sametype.Itisalwayspossibletoconvertonetypeofthingintoanother,andinPython
convertinganintegertoastringisaccomplishedwiththestr()function;stringtointegeris
convertedusingint(). So str(3) would result in “3,” and int(“12”)is 12. An error will
occurifitisnotpossible,soint(12.2)willfail.
Ifeachelementofalistisitselfalist,itcanstillbesorted.Considerthelist:
z=[[“Hydrogen”,3],[“Hydrogen”,2],[“Lithium”,3],
[“Beryllium”,4],[“Boron”,5]]
Whensortedthisbecomes:
[['Beryllium',4],['Boron',5],['Hydrogen',2],['Hydrogen',3],
['Lithium',3]]
Eachcomponentofthislistiscompatiblewiththeothers,consistingofastringandan
integer.Thustheycanbecomparedagainsteachother.Noticethattherearetwoentriesfor
hydrogen:onewithanumber2andonewithanumber3.Thesortmethodarrangesthem
correctly.Alistissortedbyindividualelementsinsequenceorder,sothefirstthingtested
would be the string. If those are the same, then the next element is checked. That’s an
integer,sothecomponentwiththesmallestintegercomponentwillcomefirst.
3.4.9 Reverse
Inanysequencetheorderofthecomponentswithinitisimportant.Reversingthatorder
isalogicaloperationtoprovide,butmaynotbeusedveryoften.Oneinstancewhereit
canbeimportantisafterasort.Thesortmethodalwaysplacescomponentsintoascending
order.Iftheyaresupposedtobeindescendingorder,thenthereversemethodbecomes
valuable.Asanexampleconsidersortingthelistq:
q=[5,6,1,5,4,9,9,1,6,3]
q.sort()
Thevalueofqatthispointis
[1,1,3,4,5,5,6,6,9,9]
Toplacethislistisdescendingorderthereversemethodisused:
q.reverse()
andtheresultis
[9,9,6,6,5,5,4,3,1,1]
Itishardtosaywhetherascendingorderisneededmoreoftenthandescendingorder.
Namesareoftensortedsmallestfirst(ascending),butdatesaremorelikelytorequiremore
recentdatesbeforelaterones(descending).
3.4.10 Count
This method is used to determine how many times a potential component of a list
actuallyoccurs.Itdoesnotreturnthenumberofelementsinthelist—thatjobisdoneby
thelenfunction.Usingthelistqasanexample:
q=[5,6,1,5,4,9,9,1,6,3]
print(1,q.count(1),2,q.count(2),3,q.count(3),99,
q.count(99))
willresultintheoutput:
122031990
where the spacing is enhanced for emphasis. This says that there are 2 instances of the
number1(1,2)inthelist,zeroinstancesof2(2,0),oneinstanceofthenumber3(3,1)and
noneof99(99,0).
3.4.11 ListComprehension
Whencreatingalistofitems,twomechanismshavebeendiscussed.Thefirstistouse
constants,asinthelistqintheprevioussection.Thesecondappendsitemstoalist,and
thiscouldbedonewithinaloop.Makingalistofperfectsquarescouldbedonelikethis:
t=[]
foriinrange(0,10):
t=t+[i*i]
whichcreates thelist [0, 1,4, 9, 16,25, 36, 49,64, 81]. Thiskind of thingis common
enoughthataspecialsyntaxhasbeencreatedforitinPython—thelistcomprehension.
Thebasicideaissimpleenough,althoughsomespecificcasesarecomplicated.Inthe
situationaboveinvolvingperfectsquares,theelementsinthelistaresomefunctionofthe
index. When that is true the loop, index, and function can be given within the square
bracketsasadefinitionofthelist.Sothelisttcouldbedefinedas:
tt=[i**2foriinrange(10)]
The for loop is within the square brackets, indicating that the purpose is to define
componentsofthelist.Thevariableihereistheloopvariable,andi**2 is thefunction
thatcreatestheelementsfromtheindex.Thisisasimpleexampleofalistcomprehension.
Creatingrandomintegervalues?Noproblem:
tt=[random.randint(0,100)foriinrange(10)]
Thefirstsixelementsinalluppercase?
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,
“Boron”,“Carbon”]
ss=[i.upper()foriinlist1]
This is a very effective way to create lists, but it does depend on having a known
connectionbetweentheindexandtheelement.
3.4.12 ListsandTuples
Atuplecanbeconvertedintoalist.Listshaveagreaterfunctionalitythantuples;that
is,theyprovidemoreoperationsandgreaterabilitytorepresentdata.Ontheotherhand,
they are more complicated and require more computer resources. If something can be
representedasatuple,thenitislikelybesttodoso.Atupleisdesignedtobeacollection
ofelementsthatasawholerepresentsomemorecomplicatedobject,butthatindividually
areperhapsofdifferenttypes.ThisisratherlikeaCstructorPascalrecord.Alistismore
oftenusedtoholdasetofelementsthatallhavethesametype,morelikeanarray.Thisis
a good way to think of the two types when deciding what to use to solve a specific
problem.
Python provides tools for conversion. The built-in function list takes a tuple and
convertsitintoalist;thefunctiontupledoesthereverse,takingalistandturningitintoa
tuple.Forexample,convertinglist1intoatuple:
tuple1=tuple(list1)
print(tuple1)
yields
(ꞌHydrogenꞌ,ꞌHeliumꞌ,ꞌLithiumꞌ,ꞌBerylliumꞌ,ꞌBoronꞌ,
ꞌCarbonꞌ)
Thisisseentobeatuplebecauseofthe“(‘and’)”delimiters.Thereverse:
v=list(tuple1)
print(v)
printsthetextline:
[ꞌHydrogenꞌ,ꞌHeliumꞌ,ꞌLithiumꞌ,ꞌBerylliumꞌ,ꞌBoronꞌ,
ꞌCarbonꞌ]
andthesquarebracketsindicatethisisalist.
3.4.13 Exceptions
Exceptionsaretheusualwaytocheckforerrorsofindexingandmembershipinlists.
Theerrorisallowedtooccur,buranexceptionistestedandhandledinthecasewhere,for
example,anitembeingdeletedisnotinthelist.
Problem:DeletetheElementHeliumfromaList
Earlier,asanexampleoftheremovemethod,aprogramsnippetwaswrittentodelete
theelementHeliumfromalistofelements:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
if“Helium”inlist1:
list1.remove(“Helium”)
Becauselist1maynothaveHeliumasoneofthecomponents,acheckwasmadebefore
anattempttodeleteit.Anattempttodeleteanelementfromalistwheretheelementdoes
notappearinthatlistresultsinanAttributeError.Ratherthanperformanexplicittest,a
Pythonprogrammerwouldmorelikelyuseanexceptionhere.Theerrorcanbecaughtas
follows:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
try:
list1.remove(“Helium”)
except:
print('Can'tfindHelium')
Theadvantageofthisoverallowingtheerrortooccuristhattheprogramcancontinue
toexecute.
Problem:DeleteaSpecifiedElementfromaList
Giventhesamelist,readanelementfromthekeyboardanddeletethatelementfromthe
list.Thebasiccodeisthesame,butnowthestringisenteredandcouldbeanythingatall.
It’seasiertotestaprogramwhenitcanbemadetofailonpurpose.Thenameisentered
usingtheinputfunctionandisusedastheparametertoremove.Nowitispossibletotest
allofthecodeinthisprogramwithoutchangingit.First,hereistheprogram:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
s=input(“Enter:”)
try:
list1.remove(s)
except:
print('Can'tfind',s)
print(list1)
Properlytestingaprogrammeansexecutingallofthestatementsthatcompriseitand
ensuringthattheanswergiveniscorrect.Sointhiscase,firstdeleteanelementthatisa
partofthelist.TryLithium.Hereistheoutput:
Enter:Lithium
[“Hydrogen”,“Helium”,“Beryllium”,“Boron”,“Carbon”]
Thisiscorrect.Thesearethestatementsthatwereexecutedinthisinstance:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
s=input(“Enter:”)
try:
list1.remove(s)#Thiswassuccessful
print(list1)
NowtrytodeleteOxygen.Outputis:
Enter:Oxygen
Can’tfindOxygen
[‘Hydrogen’,“Helium’,“Lithium’,“Beryllium’,“Boron’,“Carbon’]
Thisiscorrect.Thesestatementswereexecuted:
list1=[“Hydrogen”,“Helium”,“Lithium”,“Beryllium”,“Boron”,
“Carbon”]
s=input(“Enter:”)
try:
list1.remove(s)#thiswasnotsuccessful
except:
print('Can'tfind',s)
print(list1)
Allofthecodeintheprogramhasbeenexecutedandtheresultscheckedforbothmajor
situations. For any major piece of software this kind of testing is exhausting, but it is
reallytheonlywaytominimizetheerrorsthatremaininthefinalprogram.
3.5 SETTYPES
Somethingoftypesetisanunorderedcollectionofobjects.Anelementcanonlybea
memberofagivensetonce,sointhatsenseitismuchlikeamathematicalset.Infact,
that’sthepoint.Becauseasetisunordered,operationssuchasindexingandslicingarenot
provided.Itdoessupportmembership(is),size(len()),andloopingonmembership(fori
inset).
Anyone (probably an older person) who knows the Pascal language has some
familiaritywiththesettypeinPython.
Mathematicalsetshavecertainspecific,well-definedoperations,andthoseareavailable
onaPythonsetalso.
Subsetset1<set2meansset1isatruesubsetofs2.
Intersectionset1 & set2 creates a new set containing members in common with
both.
Unionset1|set2createsanewsetwithallelementsofboth.
Differenceset1-set2createsanewsetwithmembersthatarenotinboth.
Equalityset1==set2istrueifbothsetscontainonlythesameelements.
Creatinganewobjectoftypesetisamatterofspecifyingeitherthatitisasetorwhat
theelementsare.So,onewayistousethe{}syntax:
set1={1,3,5,7,9}
ortousetheconstructor:
set2=set(range(1,10))
whichgivestheset{1,2,3,4,5,6,7,8,9}.So:
set1<set2isTrue
set1&set2is{9,1,3,5,7}Note:orderdoesnotmattertoaset.
set1|set2is{1,2,3,4,5,6,7,8,9}
set2–set1is{8,2,4,6}
Anewelementcanbeaddedtoasetusingadd():
set1.add(11)
andremovedusingremove():
set1.remove(11)
ordiscard():
set1.discard(11)
Iftheelementbeingremovedisnotintheset,thenanerrorwilloccur(KeyError)when
remove()iscalled,butnotwithdiscard().Thisshouldbetestedfirstorbeplacedinan
exceptstatement.
All of the examples so far involve integers belonging to a set, but other types can
belongaswell:floatingpointnumbers,strings,andeventuples(notlists).Forexample,
thefollowingarelegalsets:
{“a”,“e”,“i”,“o”,“u”}
{“cyan”,“yellow”,“magenta”}
{(2,4),(3,9),(4,16),(5,25),(6,36),(7,49)}
3.5.1 Example:Craps
Crapsisadicegame,forthoseunfamiliarwithit,andcommonlyinvolvesbettingon
theoutcome.Theplayer(shooter)rollstwodice.If,onthefirstroll(pass),atotalof7or
11isobtained,thentheshooterwins.Ontheotherhand,aninitialrollof2,3,or12loses
immediately.Anyotherrolliscalledthepoint.Inthatcasetheshootercontinuestoroll
thedice.Ifa7isobtainedthentheshooterloses,andifthepointnumberisrolledthenthe
shooter wins. The shooter continues to roll until one or the other occurs. One way to
implementthisgameinPythonistousesets.
Elementsofthesetswillbevaluesoneachdie,whichistosayoneroll.Therearetwo
dice so a total of 36 combinations exist. A single roll is a tuple, such as (1,1) or (3,4).
There are only 12 distinct sums of two dice, and multiple ways to achieve them. A
sequencenamedrollwillbecreatedthatcontainsasetforeachpossiblevalue,andthatset
containsallofthewaysthatthevaluecanbeobtained.Forinstance,therearetwowaysto
rolla3,so:
roll[3]={(1,2),(2,1)}
Initiallyasetiscreatedforeachpossiblerollofapairofdiceandthenisinitializedas
described:
fromrandomimport*
roll=list(range(0,13))#Createtheemptylist
foriinrange(1,13):#andfillwithemptysets.
roll[i]=set()
foriinrange(1,7):#Nowforeachpossibleroll
forjinrange(1,7):#oftwodice,addthatroll
k=i+j#totheelementofrollfor
roll[k].add((i,j))#thatvalue(sumofthe
#dice)
Nowroll[i]containsallofthewaystorollavalueofi,Inparticular,roll[7]containsall
waystorolla7androll[11]containsallwaystorollan11.Thus,alloftherollsthatwill
winonthefirstpasscanbeplacedinasingleset,theunionofroll[7]androll[11]:
winner=roll[7]|roll[11]
Similarly,therollsthatwilllosefortheshooteronthefirstpassare:
loser=roll[2]|roll[3]|roll[12]
Ifanyotherrollisthrown,thenthatbecomesthepoint.Rollingadieamountstogetting
arandomnumberbetween1and6inclusive,or:
die1=randrange(1,7)
die2=randrange(1,7)
Rememberthatrandrange()producesanumberlessthanthesecondparameter.
Giventhisroll,thepointisthesetroll[die1+die2].Continuingtheprogramfromthedie
rolls:
val=(die1,die2)#Atuple,thecurrentroll
print(“Shooterrolls“,val)
ifvalinwinner:#Isthistupleawinner?
print(“Theshooterwins!”)
elifvalinloser:#Isitaloser?
print(“Theshooterloses”)
else:
point=roll[die1+die2]#Definethepointset
print(die1+die2,”isyourpoint.”)
Nowthedicearerolledrepeatedly.Iftherollisinthepointset,thentheshooterwins.If
therollisa7(inthesetroll[7])thentheplayerloses.Otherwisetheshooterrollsagain.
whileTrue:#Repeatuntilawinor
#losshappens
die1=randrange(1,7)#Rollthedice
die2=randrange(1,7)
val=(die1,die2)#valisatuple
print(“Rolls“,val)
ifvalinroll[7]:#Any7rollloses
print(“Theshooterloses!”)
break
ifvalinpoint:#Rollingthe'point'wins.
print(“Theshootermakesthepoint.Awinner!”)
break
Andthat’sthegame.Inarealcrapsgamethisentireprocessisrepeated,andbetsare
placedoneachindividualgameastowhethertheplayerwillwinorlose.
3.6 SUMMARY
Thereisakindtypethatavariablecanhavethatamountstoalistorsequenceofother,
simpler,things.Thisistrue,andusingvariableshavingthesetypesisanessentialpartof
writing useful and effective code. Python offers strings,tuples, and lists as objects that
consistofmultipleparts.Theyarecalledsequencetypes.
Astringis asequenceofcharacters. Thewordsequenceimpliesthattheorderofthe
characters within the string matters, and that is true of a string. Strings most often
represent the way that communication between a computer and a human takes place.
Stringscanbeindexedtoseewhatcharacterisinanyposition(e.g.,s[i]),canbesearched
for a string that occurs with it, can have characters concatenated to it, and many other
usefuloperations.Ifastringscontainsaninteger,thenint(s)willyieldthatintegerand
str(i)willcreateastringfromanintegeri.
Atupleisalmostidenticaltoastringinbasicstructure,exceptthatitiscomposedof
arbitrary components instead of characters. Examples are tup1 = (2, 3, 5) and tup2 =
(“Hydrogen”,“Helium”,“Carbon”). A tuple can contain mixed type, such as integers
andstrings:tup3=(“star”,1,“planet”,2).Anelementofatuplecannotbealtered,soit
issaidtobeimmutable,althoughconcatenationispossible.
Alistislikeatuplebutisnotimmutable,soindividualelementscanbemodified.Alist
usessquarebracketsasadelimiter,insteadofparenthesesasusedforatuple.Changingan
elementinvolvesindexingit,soiflist1isalistthenlist1[2]=
6modifieselement2ofthatlist.
Asetisanunorderedcollectionofobjects.Anelementcanbealmostanytype,butcan
only occur in a set once. This mimics a mathematical set. Elements can be added and
removed,andthesetoperationsunion,intersection,anddifferencecanbeperformed.
Exercises
Fortheexercisesbelow,assumethefollowingdefinitions:
str1=“okraistheclosestthingtonyloni'veevereaten.”
str2=“pullthestring,anditwillfollowwhereveryou
wish.”
str3=“letoutalittlemorestringonyourkite.”
str4=“everystringisadifferentcolor,adifferentvoice.”
vowels='aeiou'
atoms=(“Hydrogen”,1,“Helium”,2,“Lithium”,3,“Boron”,5,“Carbon”,6,
“Oxygen”,8)
1.Whatisprintedbythefollowingcodesnippets:
a)foriinrange(0,len(str3)):
print(str3[i],end=ꞌꞌ)
b)foriinrange(0,len(str3)):
print(i,end='')
c)foriinrange(0,len(str3)):
print(str2[i],end='')
d)foriinstr3:
print(i,end=ꞌꞌ)
e)foriinstr3:
ifiinvowels:
print(i,end=ꞌꞌ)
f)foriinstr1:
ifnot(iinvowels):
print(i,end=ꞌꞌ)
2.Constructaloopthatprintsoutallcharactersofstr4thatcorrespondtoavowelin
str3.Note:thetwostringsaredifferentlengths.
3.ACaesarcypherisawaytotransmitasecretmessage.Whenencodingamessage
eachcharacterisreplacedbyonethatisafixeddistancefurtheralongthealphabet.If
thatdistanceis6,forexample,theletter“a”wouldbereplacedby“g,”whichis6
positionsfurtheralong.Thecharactersattheendwillwraparoundtothebeginning,so
“z”willbecome“f.”WritesomePythoncodethatwillencodestr1inthisway.Ensure
thatitworksbydecryptingthefollowingstring:
“varrznkyzxotm,gtjozcorrlurruccnkxkbkxeuacoyn.”
Ans:uqxgoyznkiruykyzznotmzuteruto’bkkbkxkgzkt.
4.WriteaPythonsnippetthatwillcreatetwotuplesfromthesingletupleatoms:one
namedelementsthatcontainsonlythenames,andonecallednumbersthatcontains
theatomicnumbersoftheelementsinthetupleatoms.
5.WriteaPythonprogramthatreadsnumbersfromthekeyboardandappendsthemtoa
tuple.Stoptheprocesswhenanegativenumberisenteredandthenprintthetuplethat
wascreated.
6.Adeckofplayingcardsconsistsof52items:eachonehasoneoffoursuits(clubs,
diamonds,hearts,andspades)andwithineachsuitarevaluesfrom1–10and“Jack,”
“Queen,”and“King.”WriteaPythonprogramthatcreatesadeckofcards,shuffles
them,andprintsouttheresult.
7.WriteaPythonprogramthatreadsnames(singlewords)oneatatimefroma
keyboardanddeletesthemfromalistnamednamesiftheyarealreadyelementsof
thatlist.Ifthenameisnotalreadyamemberofthelist,thenitwillbeadded.Typing
thename“quit”terminatestheprogram.
8.Assumethatastringnamedtempexistsandhasavalue.WritePythoncodethatwill
printtempbackwards.
9.Apalindromeisaphrase(astring)thatreadsthesameforwardsandbackwards.The
name“hannah”isapalindrome;sois“Ogopogo,”thenameofamonsterwholivesin
lakeOkanogan,andtheword“redivider.”WriteaPythonprogramthatdetermines
whetheragivenstringisapalindrome.
10.Mostexamplesofpalindromescontainspacesandpunctuation,andthesecharacters
areignoredwhendecidingwhetherornotthephraseispalindromic.Soiscase.Thus,
thephrase“Ipreferpi”isapalindrome.Withtheseconsiderationsinmind,writea
Pythonprogramthatdetermineswhetherastringisapalindromeornot.
NotesandOtherResources
Built-intypes:https://docs.python.org/3.4/library/stdtypes.html?highlight=set#set
Pythonstrings:https://docs.python.org/3/library/string.html
RulesofCraps:http://www.bigmcasino.com/learn-more/learn-to-play-craps/what-are-
the-basic-rules-of-craps/
1.DavidMertz.(2003).TextProcessinginPython,AddisonWesleyProfessional,
ISBN-13:978-0321112545.
2.DavidMakinson.(2012).Sets,Logic,andMathsforComputing,2ndedition,
Springer,ISBN-13:978-1447124993.
3.J.D.Oldham.(2005).WhathappensafterPythoninCS1?JournalofComputing
SciencesinColleges,20(6),7–13.
CHAPTER4
FUNCTIONS
4.1FunctionDefinition:SyntaxandSemantics
4.2FunctionExecution
4.3Recursion
4.4CreatingPythonModules
4.5ProgramDesignUsingFunctions–Example:TheGameofNim
4.6Summary
Inthischapter
ThereisalargeandusefulsetoffunctionsbuiltintoPython.Thesearesometimessimply
therefortheusing,likeprintandinput,andsometimesarepartofamodulethatmustbe
imported, like random. However, as large as this collection of functions is, it is
impossible that it will include everything that every programmer needs. At some point
there will be a need to create a function that does something new, and Python should
permitthis.
Why would a programmer want to create a function of their own? It is partly out of
convenience; if some section of code can be invoked as a function instead of being
repeated many times, then there will be less typing involved. It is also to support more
correct programs: a small code unit like a function can be very thoroughly tested and
nearlyguaranteedtobecorrect.Anditisalsotosupportreuseofworkingcode:oncea
functionis tested,it canbe placed ina collectionof code (module)and usedagain and
againinsteadofbeingrewrittenmanytimes.
A function is really just some code that has a name, and can be executed simply by
invokingthatname.Itusuallyrepresentssometaskthathastobedonefairlyfrequently,
but that’s not a requirement. Some functions are invoked (or called) only once. In this
context a function is a way to break up a long piece of code into many shorter pieces
which,ashasbeenpointedout,areeasiertotestandmaintain.
Afunctionshouldalsohaveonesingletask,oratleastonemaintask.Thattaskshould
berepresentedinthefunctionname.Afunctionnamedmaximumshouldhavethetaskof
locatingthemaximumofsomething;afunctionnamedcosineshouldcalculatethecosine
ofanangle.Ifafunctionisnamedwilmaittellsanotherprogrammerwhoisreadingthe
code nothing about the what the program is doing, and if a function named cosine
computes the square root of a number then it is not just uninformative but misleading.
NevermindthatthePythonlanguagedoesnotinsistthatnamesbeinformative;thereisa
social compact between programmers that says that you should be as clear as possible
aboutwhatyourcodeisdoing.
Thefactthatmanyfunctionsreturnavaluehasbeenskippedover,butitisakeypartof
thefunctionconstruct.Thecodewithinthefunctionhasapurpose,andoftenthatpurpose
isconcentratedinthereturnvalue.Howeveritworks,andwhateverthecodelookslike,
the purpose of the cosine function is to return a single value that is the mathematical
cosineofagivenangle.Thenatureofthefunctionisencapsulatedinthatvalue.Thereare
some functions that do not explicitly return a value; such a function might be called to
printanerrormessageordrawagraphicalobjectinawindow.Evenifitisnotspecifically
declaredinthedefinition,allfunctionsreturnsomething.Ifnotdefined,thenitreturnsa
valuecalledNone.
Enoughexposition—howcanfunctionsbedeclaredandusedinPython?
4.1 FUNCTIONDEFINITION:SYNTAXAND
SEMANTICS
Unlike in the cases of if statements or for statements, a function definition does not
involvetheword“function.”AsanexampleofasimpledefinitioninPython,imaginea
programthatneedsafunctiontoprinttwenty“#”charactersonaline.Itcouldbedefined
as:
defpound20():
foriinrange(0,20):
print(“#”,end=””)
TheworddefisknowntoPythonandalwaysbeginsthedefinitionofafunction.Thisis
followedbythenameofthefunction,inthiscasepound20becausethefunctionprints20
poundcharacters(alsoknownasahashcharactersoroctothorpe).Thencomesthelistof
parameters,whichcanbethoughtofasatupleofvariablenames.Inthiscasethetupleis
empty,meaningthatnothingispassedtothefunction.Finallycomesthe“:”characterthat
definesanewsuitethatcomprisesthecodebelongingtothefunction.Fromhereonthe
codeisindentedonemorelevel,andwhentheindentationrevertstotheoriginallevelthe
functiondefinitioniscomplete.
Callingthis functionisamatterofusingitsnameasastatementorinanexpression,
beingcarefultoalwaysincludethetupleofparameters.Evenwhenthetupleisempty,it
helpsdistinguishafunctionfromavariable.Acalltothisfunctionwouldbe:
pound20()
andtheresultwouldbethat20“#”characterswouldbeprintedononelineoftheoutput
console.
Afunctioncanbegivenorpassedoneormorevaluesthatwilldeterminetheresultof
the function. A function cosine, for example, would be passed an angle, and that angle
would be used to compute the cosine. Each call to cosine passing a different value can
yieldadifferentresult.Inthecaseofthefunctionthatprintspoundcharacters,itmightbe
usefultopassitthenumberofpoundcharacterstoprint.Itshouldnotbecalledpound20
anymorebecauseitdoesnotalwaysprint20characters.Itwillbecalledpoundnthistime:
defpoundn(ncharacters):
foriinrange(0,ncharacters):
print(“#”,end=””)
Thevariablencharactersthatisgiveninparenthesesafterthefunctionnameiscalleda
parameteroranargument,andindicatesthenamebywhichthefunctionwillrefertothe
value passed to it. This name is known only inside of the function, and while it can be
modified within the function, this modification will not have any bearing on anything
outside.Thecalltopoundnmustnowincludeavaluetobepassedtothefunction:
poundn(3)
Whenthiscallisperformed,thecodewithinpoundnbeginsexecuting,andthevalueof
ncharacters is 3, the value that was passed. It prints 3 characters and returns. A
subsequent call to poundn could be passed a different number, perhaps 8, and then
ncharacterswouldtakeonthevalue8andthefunctionwouldprint8characters.Itwill
printasmanycharactersasrequestedthroughtheparameter.
4.1.1 Problem:UsepoundntoDrawaHistogram
InChapter2, a simple histogram was created from some print statements and loops.
Thesamecodewasrepeatedmanytimes,oneforeachhistogrambar.Asithappensthe
character used to draw the histogram bars was the pound character, so the function
poundn could be used as a basis for a histogram program. As a reminder, here is the
outputthatisdesired:
EarningsforWidgetCorpfor2016
Dollarsforeachquarter
==============================
Q1:########190000
Q2:################340000
Q3:##########################################873000
Q4:####################439833
Each pound character represents $20000, and there are four variables that hold the
profitforeachofthefourquarters:q1,q2,q3,andq4.Giventhese
criteria, a solution using poundn would call the function four times, once for each
quarter:
print(“EarningsforWidgetCorpfor2016”)
print(”Dollarsforeachquarter“)
print(”===============================”)
q1=190000#Thedollaramountsforprofits
q2=340000#ineachofthefourquartersof2016
q3=873000
q4=439833
print(”Q1:“,end=””)
poundn(int(q1/20000))#Rawdollaramountisdividedby
#20000
#toyieldthenumberofcharacters.
print(”“,q1)
print(“Q2:“,end=””)
poundn(int(q2/20000))
print(”“,q2)
print(“Q3:“,end=””)
poundn(int(q3/20000))
print(”“,q3)
print(“Q4:“,end=””)
poundn(int(q4/20000))
print(”“,q4)
Eachprofitvaluemustbescaledbydividingby20000,justashappenedbefore.Inthis
casetheresultingvalueispassedtopoundnindicatingthenumberof“#”’stodraw.
4.1.2 Problem:GeneralizetheHistogramCodeforOtherYears
Anycompanywillneedtodofinancialreportseveryyearatleast.Hiringaprogrammer
todothistaskonacomputerisnotareasonablethingtodo,becausecomputerscanbe
madeto dothis jobin a very general way.For example,given thateach year will have
fourquartersandeachquarterwillhaveaprofit,whynotstorethesedataasalist?Each
yearwillhaveonelistcontainingfouritems,andthenameofthevariablecouldinitially
berelatedtotheyear:
profit2016=[190000,340000,873000,439833]
Theprofitforthefirstquarterisprofit2016[0],thesecondquarterisprofit2016[1],and
soon.Usingthisvariablemeanspassingoneoftheelementsofthelisttopoundninstead
ofasimplevariable,butthatisfine,it’salegalexpression.Sodrawingthecharactersfor
thefirstquarterwouldbedonewiththefollowingcode:
poundn(int(profit2016[0]/20000))
Nowconsiderwhatelsegetsprinted.Toprinteverythingforthefirstquarterthecode
was:
print(“Q1:“,end=””)
poundn(int(profit2016[0]/20000))
print(”“,q1)
Thismeansthatthelabelontheleft,“Q1,”theparameterstopoundn,andtheactual
value of the profit are needed. All of these are available, and can be provided within a
simpleloop.Assumingthattheloopvariableirunsfrom0to3,thecodewithinthatloop
that duplicates the previous example can be constructed one line at a time. In each
iterationthequarterisi+1becauseistartsat0;convertthattoastringandbuildthelabel
“Q1:”fromit:
print(Q1:“,end=””)
print(“Q”+str(i+1)+”:“,end=””)
Thisisprobablythetrickiestpart.Thelabelstringisconstructedfromtheletter“Q,”a
numberbetween1and4indicatingthequarter,and“:”fortheterminalstring.Theyare
simplyconcatenatedtogetherintheprintstatement.
Nowcallpoundnasbefore:
poundn(int(profit2016[i]/20000))
poundn(int(profit2016[i]/20000))
finally,printtherawdollarvalueontheright:
print(”“,q1)
print(”“,profit2016[i])
So,usingthisplantheentirehistogramcanbedrawnusingonlyfourstatements:
foriinrange(0,4):
print(“Q”+str(i+1)+”:“,end=””)
poundn(int(profit2016[i]/20000))
print(”“,profit2016[i])
That’sprettybrief,readable,andgeneral.Still,there’sanotherstep.Sincethiswillbe
doneeveryyear,whynotcreateafunctionthattakesthedataandtheyearasparameters,
anddothewholejob?Itshallbecalledpqhistogram:
defpqhistogram(profit,year):
print(“EarningsforWidgetCorpfor“+str(year))
print(”Dollarsforeachquarter“)
print(”==============================”)
foriinrange(0,4):
print(“Q”+str(i+1)+“:“,end=””)
poundn(int(profit[i]/20000))
print(”“,profit[i])
Thefunctionpqhistogramproducesthesameoutputasdidtheoriginalprogram,and
does so more generally and concisely. This function also brings to light two new ideas.
Oneisthatitispossibletopassmorethanoneparametertoafunction.Thesecondisthat
itispossibletocallafunctionfromwithinanotherfunction;inthiscasepoundniscalled
frominsideofpqhistogram.Thecallismadeafterdefiningthelistthatcontainstheprofit
values:
profit2016=[190000,340000,873000,439833]
pqhistogram(profit2016,2016)
Itisimportanttonotethattheparametersarepositional;thatis,thefirstvaluepassed
willcorrespondtothefirstnameintheparameterlist,andthesecondtothesecond.This
isthedefaultforfunctionswithanynumberofparameters.
NOTEA def statement is not a declaration. Such things are foreign to Python. A def
statementexecutes,andit“creates”anewfunctioneachtimeitisexecuted.This
isanadvancedtopic,andwillbehandledlater.
4.2 FUNCTIONEXECUTION
Whenafunctionis called,thefirststatementof thatfunctionstartsto execute,andit
continuesstatementbystatementthroughthecodeuntilthelaststatementofthatfunction
or until it returns prematurely. When that last statement executes, then execution will
continue from the place where it was called. As a function can be called from many
places, Python has to remember where the function was called so that it can return.
Parameterscanbeexpressionsorvariables,andnormallydiffereachtimethefunctionis
called.Functionscanalsoaccessvariablesdefinedelsewhere.
Mostimportantly,andafactorthathasnotbeendealtwithyet,isthatfunctionsreturn
values.
4.2.1 ReturningaValue
All functions return a value, and as such can be treated within expressionsas if they
werevariableshavingthatvalue.So,assumingtheexistenceofacosinefunction,itcould
beusedinanexpressionintheusualways.Forexample:
x=cosine(x)*r
ifcosine(x)<0.5:
print(cosine(x)*cosine(x))
In these cases the value returned by the function is used by the code to calculate a
furthervalueortocreateoutput.Theexpression“cosine(x)”resolvestoavalueofsome
Pythontype.Themostcommonpurposeofafunctionistocalculateavalue,whichisthen
returned to the calling part of the program and can possibly be used in a further
calculation.Buthowdoesafunctiongetitsvalue?Inareturnstatement.
Thereturnstatementassignsavalueandatypetotheobjectreturnedbythefunction.It
alsostopsexecutingthefunctionandresumesexecutionatthelocationwherethefunction
was called. A simple example would be to return a single value, such as an integer or
floatingpointnumber:
return0
returnsthevalue0fromafunction.Thereturnvaluecouldbeanexpression:
returnx*x+y*y
Afunctionhasonlyonereturnvalue,butitcanbeofanytype,soitcouldbealistor
tuplethatcontainsmultiplecomponents:
return(2,3,5,7,11)
return[“fluorine”,“chlorine”,“bromine”,“iodine”,“astatine”]
Expressionscanincludefunctioncalls,soareturnvaluecanbedefinedinthiswayas
well;forexample:
returncosine(x)
Oneofthesimplestfunctionsthatcanbeusedasanexampleisonethatcalculatesthe
squareofitsparameter.Itnonethelessillustratessomeinterestingthings:
defsquare(x):
returnx*x
Theprintstatement:
print(square(12))
willprint:
144
Interestingly,thestatement:
print(square(12.0))
resultsin:
144.0
Thesamefunctionreturnsanintegerinonecaseandafloatintheother.Why?Because
thefunctionreturnstheresultofanexpressioninvolvingitsparameter,whichinonecase
wasanintegerandintheotherwasreal.Thisimpliesthatafunctionhasnofixedtype,
andcanreturnanytypeatall.Indeed,thesamefunctioncanhavereturnstatementsthat
returnaninteger,afloat,astring,andalistindependentoftypeoftheparameterpassed:
deftest(x):#Returnoneoffourtypesdependingonx
ifx<1:
return1
ifx<2:
return2.0
ifx<3:
return“3”
return[1,2,3,4]
print(test(0))
print(test(1))
print(test(2))
print(test(3))
Theoutput:
1
2.0
3
[1,2,3,4]
Problem:WriteaFunctiontoCalculatetheSquareRootof
its
Parameter
Two thousand years ago the Babylonians had a way to calculate the square root of a
number. They understood the definition of a square root: that if y*y = x then y is the
squarerootofx.Theyfiguredoutthatifywasanoverestimatetothetruevalueofthe
squarerootofx,thenx/ywouldbeanunderestimate.Inthatcase,abetterguesswouldbe
toaveragethosetwovalues:thenextguesswouldbey1=(y+x/y)/2.Theguessafterthat
wouldbey2=(y1+x/y1)/2,andsoon.Atanypointinthecalculationtheerror(difference
betweenthecorrectanswerandtheestimate)canbefoundbysquaringtheguessyiand
subtractingxfromit,knowingthatyi*yiissupposedtoequalx.
Thefunctionwillthereforestartbyguessingwhatthesquarerootmightbe.Itcannotbe
0becausethenx/ywouldbeundefined.xisagoodguess.Thenconstructaloopbasedon
theexpressiony2=(y1+x/y1)/2,ormoregenerallyyi+1=(yi+x/yi)/2foriterationi.At
first,runthisloopafixednumberoftimes,perhaps20.Hereisthefunctionthatresults:
defroot(x):#Computethesquarerootofx
y=x#Firstguess:toobig,probably
foriinrange(1,20):#Iterate20times
y=(y+x/y)/2.0#Averagethepriorguessand
#x/y
returny#Returnthelastguess
This correctly computes the square root of 2 to 15 decimal places. This is probably
morethanisnecessary,meaningthattheloopisexecutingmoretimesthanitneedsto.In
fact,changingthe20iterationstoonly6stillgives15correctplaces.Thisisexceptional
accuracy: if the distance between the Earth and the Sun were known this accurately it
wouldbe within 0.006inches of thecorrect value. TheBabylonians seem tohave been
veryclever.
What’s the square root of 10000? If the number of iterations is kept at 6, then the
answer is avery poor one indeed:323.1. Why? Some numbers (large ones) need more
iterationsthanothers.Toguaranteethatagoodestimateofthesquarerootisreturned,an
estimateoftheerrorshouldbeused.Whentheerrorissmallenough,thenthevaluewill
begoodenough.Theerrorwillbex-yi*yi.Thefunctionshouldnotloopafixednumberof
times,butinsteadshouldrepeatuntiltheerrorislessthan,say,0.0000001.Thisfunction
willbenamedroote,wherethe“e”isfor“error.”
#ComputerthesquarerootofXto7decimalplaces
defroote(x):
y=x#yissupposedtobethesquareroot
#ofx,so
e=abs(x-y*y)#theerrorisx–y*y
whilee>0.0000001:#repeatwhiletheerrorisbigger
#than0.0000001
y=(y+x/y)/2.0#Newestimateforsquareroot
e=abs(x-y*y)Newerrorvalue
returny
Thisfunctionwillreturnthesquarerootofanypositivevalueofxtowithin7decimal
places.Itshouldcheckfornegativevalues,though.
4.2.2 Parameters
Aparametercanbeeitheraname,meaningthatitisaPythonvariable(object)ofsome
kind, or an expression, meaning it has a value but no permanence in that it can’t be
accessedlateron—ithasnoname.Botharepassedtoafunctionasanobjectreference.
The expression is evaluated before being given to the function, and its type does not
matterinsofarasPythonwillalwaysknowwhatitis;itsvalueisassignedanamewhenit
ispassed.Consider,forexample,thefunctionsquareinthefollowingcontext:
…
pi=3.14159
r=2.54
c=square(2*pi*r)
print(“Circumferenceis“,c)
Theassignmentstopiandrareperformed,andwhenthecalltosquare
occurs,theexpression2*pi*risevaluatedfirst.Itsvalueisassignedtoatemporary
variable,whichispassedastheparametertosquare.Insidethefunctionthisparameter
isnamedx,andthefunctioncalculatesxsquaredandreturnsitasavalue.Itisasifthe
followingcodeexecutes:
pi=3.14159
r=2.54
#callsquare(2*pi*r)
parameter1=2*pi*r#settheparametervalue
x=parameter1#Firstparameterisnamedxinside
#SQUARE
returnvalue=x*x#CodewithinSQUARE,returnx*x
c=returnvalue#assignresultoffunction
#calltoc
print(“Circumferenceis“,c)
Thisisnothowafunctionisimplemented,butitshowshowtheparameteriseffectively
passed;acopyismadeoftheparametersandthosearepassed.Iftheexpression2*pi*r
was changed to a simple variable, then the internal location of that variable would be
passed.
Passingmorestructuredobjectsworksthesamewaybutcanbehavedifferently.Ifalist
ispassedtoafunctionthenthelistitselfcannotbemodified,butthecontentsofthelist
canbe.Thelistisassignedanothername,butitisthesamelist.Tobeclear,considera
simplefunctionthateditsalistbyaddinganew
elementtotheend:
defaddend(arg):
arg.append(“End”)
z=[“Start”,“Add”,“Multiply”]
print(1,z)
addend(z)
print(1,z)
Thelistassociatedwiththevariablezischangedbythisfunctioncall.Itnowendswith
thestring“End.”Outputfromthisis:
1[ꞌStartꞌ,ꞌAddꞌ,ꞌMultiplyꞌ]
2[ꞌStartꞌ,ꞌAddꞌ,ꞌMultiplyꞌ,ꞌEndꞌ]
Whyisthis?Becausethenamezreferstoathingthatconsistsofmanyotherparts.The
namezisusedtoaccessthem,andthefunctioncan’tmodifythevalueofzitself.Itcan
modifywhatzindicates;thatis,thecomponents.Thinkofit,ifitmakesitsimpler,asa
levelofindirection.Abookcanbeexchangedbetweentwopeople.Thereceiverwritesa
noteinitandgivesitback.It’sthesamebook,butthecontentsarenowdifferent.
Asmallmodificationtoaddend()illustratessomeconfusingbehavior.Insteadofusing
appendtoadd“End”tothelist,usetheconcatenationoperator“+”:
defaddend(arg):
arg=arg+[“End”]
z=[“Start”,“Add”,“Multiply”]
print(1,z)
addend(z)
print(2,z)
Nowtheoutputis:
1[ꞌStartꞌ,ꞌAddꞌ,ꞌMultiplyꞌ]
2[ꞌStartꞌ,ꞌAddꞌ,ꞌMultiplyꞌ]
The component “End” is not a part of the list zanymore. It was made a component
insideofthe function,but it’s notpresent afterthe functionreturns.This isbecause the
statement:
arg=arg+[“End”]
actuallycreatesanewlistwith“End”asthefinalcomponent,andthenassignsthatnew
list as a value to arg. This represents an attempt to change the value that was passed,
which can’t happen: changing the value of arg will not change the value of the passed
variablez.So,withinthefunctionargis anew listwith“End” asthe finalcomponent.
Outside,thelistzhasnotchanged.
ThewaythatPythonpassesparametersisthesubjectofalotofdiscussiononInternet
blogsandlists.Therearemanynamesgivenforthemethodused,andwhilethetechnique
isunderstood,itdoesdifferfromthewayparametersarepassedinotherlanguagesandis
confusingtopeoplewholearnedanotherlanguagelikeJavaorCbeforePython.Thething
torememberisthattheactualvalueofthething(anobjectreference)beingpassedcan’t
beassignedanewvalueinsidethefunction,butthethingsthatitreferencesorpointsto
canbemodified.
Multipleparameters are passedby position; the firstparameter passed is given to the
firstonelistedinthefunctiondeclaration,thesecondonepassedisgivento the second
onelistedinthedeclaration,andsoon.Theyareallpassedinthesamemanner,though,as
objectreferences.
4.2.3 DefaultParameters
Itispossibletospecifyavalueforaparameterintheinstancethatitisnotgivenoneby
thecaller.Thatmaynotseemtomakesense,buttheimplicationisthatitwillsometimes
be passed explicitly and sometimes not. When debugging code it is common to embed
print statements in specific places to show that the program has reached that point.
Sometimes it is important to print out a variable orvalue there, other times it is just to
showthattheprogramgottothatstatementsafely.Considerafunctionnamedgothere:
defgothere(count,value):
print(“GotHere:“,count,”valueis“,value)
thenthroughouttheprogram,callstogotherewouldbe sprinkledwithadifferentvalue
forcount every time; the value of count indicates the statement that has been reached.
Thisisawayofinstrumentingtheprogram,andcanbeveryusefulforfindingerrors.So
thecodebeingdebuggedmaylooklike:
year=2015#Thecodebelowisnotespecially
#meaningful
a=year%19#andisanexampleonly.
gothere(1,0)
b=year//100
c=year%100
gothere(2,0)
d=(19*a+b-b//4-((b-(b+8)//25+1)
//3)+15)%30
e=(32+2*(b%4)+2*(c//4)-d-(c%4))%7
f=d+e-7*((a+11*d+22*e)//451)+114
gothere(3,f)
month=f//31
day=f%31+1
gothere(4,day)
returndate(year,month,day)
Outputis:
GotHere:1valueis0
GotHere:2valueis0
GotHere:3valueis128
GotHere:4valueis5
201545
Theprogramreacheseachofthefourcheckpointsandprintsapropermessage.Thefirst
two calls to gothere did not need to print a value, only the count number. The second
parametercouldbegivenadefaultvalue,perhapsNone,andthenitwouldnothavetobe
passed.Thedefinitionofthefunctionwouldnowbe:
defgothere(count,value=None):
ifvalue:
print(“GotHere:“,count,”valueis“,value)
else:
print(GotHere:“,count)
andtheoutputthistimeis:
GotHere:1
GotHere:2
GotHere:3valueis128
GotHere:4valueis5
201545
Theassignmentwithintheparameterlistgivesthenamevalueaspecialproperty.Ithas
a default value. If the parameter is not passed, then it takes that value; otherwise, it
behavesnormally.Thisalsomeansthatgotherecanbecalledwithoneortwoparameters,
which can be very handy. It is important to note that the parameters that are given a
defaultvaluemustbedefinedaftertheonesthatarenot.That’sbecauseotherwiseitwould
notbeclearwhatwasbeingpassed.Considerthe(illegal)definition:
defwrong(a=1,b,c=12):
…
Nowcallwrongwithtwoparameters:
wrong(2,5)
Whatparametersarebeingpassed?Isitaandb?Isitaandc?Itisimpossibletotell.A
legaldefinitionwouldbe:
defright(b,a=1,c=12)
Thisfunctioncanbecalledas:
right(19)
inwhichcaseb=19,a=1,andc=12.Itcanbecalledas:
right(19,20)
inwhichcaseb=19,a=19,andc=12.Itcanbecalledas:
right(19,19,19)
inwhichcaseb=19,a=19,andc=19.Buthowcanitbecalledpassingbandcbutnota?
Likethis:
right(19,c=19)
Inthiscaseahasbeenallowedtodefault.Theonlywaytopasscwithoutalsopassinga
istogiveitsnameexplicitlysothatthecallisnotambiguous.
4.2.4 None
Mistakes happen when writing code. They are unavoidable, and much time is spent
gettingridofthem.Onecommonkindofmistakeistoforgettoassignareturnvaluewhen
one is needed. This is especially likely when there are multiple points in the function
whereareturncanoccur.Inmanyprogramminglanguagesthiswillbecaughtasanerror,
butinPythonitisnot.Instead,afunctionthatisnotexplicitlyassignedareturnvaluewill
returnaspecialvaluecalledNone.
None has its own type (NoneType), and is used to indicate something that has no
definedvalueortheabsenceofavalue.Itcanbeexplicitlyassignedtovariables,printed,
returnedfromafunction,andtested.Testingforthisvaluecanbedoneusing:
ifx==None:
orby:
ifxisNone:
4.2.5 Example:TheGameofSticks
Thisis arelatively simplecombinatorial gamethat involvesremoving sticks or chips
fromapile.Therearetwoplayers,andthegamebeginswithapileof
21sticks.Thefirstplayerbeginsbyremoving1,2,or3sticksfromthepile.
Thenthenextplayerremovessomesticks,again1,2,or3ofthem.Playersalternatein
this way. The player who removes the last stick wins the game; in other words, if you
can’tmove,youlose.
Functionsareusefulintheimplementationofthisgamebecausebothplayersdosimilar
things.Theactionconnectedwithmakingamove,displayingthecurrentposition,andso
onarethesameforthehumanplayerandthecomputeropponent. The currentstatusor
stateofthegameissimplyanumber,thenumberofsticksremaininginthepile.When
thatnumberiszero,thenthegameisover,andtheloseriswhateverplayerissupposedto
movenext.Thecodeforapairofmoves,onefromthehumanandonefromthecomputer,
mightbecodedinPythonasfollows:
displayState(val)#Showthegameboard
userMove=getMove()#Askuserfortheirmove
val=val–userMove#Makethemove
print(“Youtook“,userMove,”sticksleaving“,val)
ifgameOver(val):
print(“Youwin!”)
else:
move=makeComputerMove(val)#Calculatethe
#computerꞌsmove
print(“Computertook“,move,”sticksleaving“,val)
ifgameOver(val):
print(“Computerwins!”)
Thecurrentstateofthegameisdisplayedfirst,andthenthehumanplayerisaskedfor
theirmove.Themoveissimplythenumberofstickstoremove.Whenthemovehasbeen
made,iftherearenosticksleftthenthehumanwins.Otherwise,thecomputercalculates
and makes a move; again, if no sticks remain then the game is over, in this case the
computerbeingthewinner.Thisentiresectionofcodeneedstoberepeateduntilthegame
isover,ofcourse.
There are four functions that must be written for this version: displayState(),
getMove(),gameOver(),andmakeComputerMove().
The function displayState() prints the current situation in the game. Specifically, it
printsone“O”characterforeachstickstillinthepile,anddoessoinrowsof6.Atthe
beginningofthegamethisfunctionwouldprint:
OOOOOO
OOOOOO
OOOOOO
OOO
whichis21sticks.Thecodeis:
defdisplayState(val):
k=val#Krepresentsthenumberofsticksnot
#printed
whilek>0:#Solongassomearenotprinted…
ifk>=6:#Ifthereisawholerow,printit.
print(“OOOOOO“,end=””)
k=k–6#Sixfewersticksareunprinted
else:
forjinrange(0,k):#Printtheremainder
print(“O“,end=””)
k=0#Noneremain
print(””)
Thisshouldbeobvious.Alsonotethatthefunctionisnamedforwhatitdoes.Itdoes
onlyonething;itmodifiesnovaluesoutsideofthefunction,anditservesapurposethatis
neededmultipletimes.Theseareallgoodpropertiesofafunction.
ThefunctiongetMove()willprintaprompttotheuser/playeraskingforthenumberof
sticks they wish to remove and reads that value from the keyboard, returning it as the
function value. Again, this function is named for what it does and performs a single,
simpletask.Onepossibilityforthecodeis:
defgetMove():
n=int(input(“Yourmove:Takeawayhowmany?“))
whilen<=0orn>3:
print(“Sorry,youmusttake1,2,or3sticks.”)
n=int(input(“Yourmove:Takeawayhowmany?“))
returnn
ThefunctiongameOver()istrivial,butlendsstructuretotheprogram.Allitdoesistest
toseewhetherthevalueofval,thegamestatevariable,iszero.Itleavesopentheideathat
theremaybeotherendofgameindicatorsthatcouldbetestedhere.
defgameOver(state):
ifstate==0:
returnTrue
returnFalse
Finally, the most complicated function, getComputerMove(), can be attempted.
Naturallyagoodgamepresentsachallengetotheplayer,andsothecomputershouldwin
the game it if can. It should not play randomly if that is possible. In the case of this
particulargame,thewinningstrategyiseasytocode.Theplayertomakethefinalmove
wins,soifthereare1,2,or3sticksleftattheend,thecomputerwouldtakethemalland
win.Forcingthehumanplayertohave4sticksmakesthishappen.Thesameistrueifthe
computercangivethehumanplayer(i.e.,leavethegameinthestateofhaving)8,12,or
16 sticks. This can be demonstrated by playing the game with actual sticks. So, if the
humanmovesfirst(asitdoesinthisimplementation)thecomputertriestoleavethegame
inastatewherethereare16,12,8,or4sticksleftafteritsmove.Thecodecouldbe:
defgetComputerMove(val):
n=val%4
ifn<=0:
return1
else:
returnn
There are a couple of details needed to finish this game properly that are left as an
exercise.
4.2.6 Scope
Avariablethatisdefined(firstused)inthemainprogramiscalledaglobalvariable,
andcanbeaccessedbyallfunctionsiftheyaskforit.Avariablethatisusedinafunction
canbeaccessedby thatfunction,andis notavailableinthemain program.It’s calleda
localvariable.Thisschemeiscalledscoping:thelocationsinaprogramwhereavariable
can be accessed is called its scope. It’s all pretty clear unless a global variable has the
samenameasalocalone,inwhichcasethequestionis:“whatvalueisrepresentedbythis
name?”Ifavariablenamed“x”isglobalandafunctionalsodeclaresavariablehavingthe
samename,thisiscalledaliasing,anditcanbeaproblem.
InPython,avariableisassumedtobelocalunlesstheprogrammerspecificallysaysitis
global.Thisisdoneinastatement;forexample:
globala,b,c
tells Python that the variables named a, b, and c are global variables, and are defined
outsideofthefunction.Thismeansthatafterthefunctionhascompletedexecution,those
variablescanstillbeaccessedbythemainprogramandbyanyotherfunctionsthatdeclare
themtobeglobal.
Globalvariablesarethoughtbysomeprogrammerstobeabadthing,butinfactthey
canbequiteusefulandcanassistinthegeneralityofthefunctionsthatareapartofthe
program.Aglobalvariableshouldrepresentsomethingthatis,infact,global,something
thatshouldbeknowntothewholeprogram.Forinstance,iftheprogramisonethatplays
checkersorchess,thentheboardcanbeglobal.Thereisonlyoneboard,anditisessential
tothewholeprogram.Thesameappliestoanyprogramthathasacentralsetofdatathat
manyofthefunctionsneedtomodify.
Anexampleofcentraldataisgamestateinavideogame.IntheSticksgameprogram
forexample,thefunctiongetComputerMove()takesaparameter—thegamestate.There
isonlyonegamestate,andalthoughforsomegamesitcaninvolvemanyvalues,inthis
casethereisonlyonevalue:thenumberofsticksremaining.Thefunctioncanberewritten
tousethegamestatevariablevalasaglobalinthefollowingway:
defgetComputerMove():
globalval
n=val%4
ifn<=0:
return1
else:
returnn
Similarly, the function that determines whether the game is over could use val as a
globalvariable.OntheotherhanditwouldbepoorstylisticformtohavegetMove()usea
globalfortheuser’smove.Thenamedoesimplythatthefunctionwillgetamove,andso
thatvalueshouldbereturnedasanexplicitfunctionreturnvalue.
Ifavariableisnamedasglobal,thenthatnamecannotbeusedinthefunctionasalocal
variable as well. It would be impossible to access it, and it would be confusing. It is a
commonprogrammingerrortoforgettodeclareavariableasglobal.Whenthishappens
thevariableisanewonelocaltothefunction,andstartsoutwithavalueof0.Thusno
syntaxerrorisdetected,butthecalculationwillalmostcertainlybeincorrect.Itmightbea
goodideatoidentifyglobalvariablesintheirname.Forexample,placethestring“_g”at
the end of the names of all globals. The game state above would be named val_g, for
example.Thiswouldbearemindertodeclarethemproperlywithinfunctions.
Other kinds of data that could be kept globally would include lists of names,
environment or configuration variables, complex data structures that represent a single
underlying process, and other programming objects that are referred to as singletons in
software engineering. In Python, because they have to be explicitly named in a
declaration,thereisaconstantreminderofthevariable’sscope.
4.2.7 VariableParameterLists
Theprint()functionisinterestingbecauseitseemstobeabletoacceptanynumberof
parametersanddealwiththem.Thestatement:
print(i)
printsthevalueofthevariablei,and
print(i,j,k)
prints the value of all three variables i, j, and k. Is this some sort of special thing
reserved for print() because Python knows about it? Nope. Any function can do this.
Considerafunction:
fprint(“formatstring”,variablelist)
where the format string can contain the characters “f” or “i” in any combination. Each
instanceofalettershouldcorrespondtoavariablepassedtothefunctioninthevariable
list,anditwillbeprintedasafloatingpointifthecorrespondingcharacterintheformat
stringis“f”andasanintegerifitis“i.”Thecall:
fprint(“fi”,12,13)
will print the values 12 and 13 as a float and an integer respectively. How can this be
writtenasaPythonfunction?
Thefunctionwouldstartoutwiththefollowingdefinition:
deffprint(fstring,*vlist)
The expression *vlist represents a set of positional parameters, any number of them.
Thisisprecededbyaspecificparameterfstring,whichwillbethe
formatstring.Asimpletestofthiswouldbetojustprintthevariablesinthelisttoseeifit
works:
deffprint(fstring,*vlist)
forvinvlist:
printv
Whencalledasfprint(“”,12,13,14,15)thisprints:
12
13
14
15
It removes some of the magic to point out that what is going on is that the list of
variablesafterthe*characteristurnedintoatuple,whichispassedastheparameter,so
the*vlistactuallycountsasasingleparameterwithmanycomponents.Nomagic.
Tofinishtheoriginalfunction,whathastobedoneistopeelcharactersoffofthefront
of the format string, match them against a variable, and print the result as the format
characterdictates.Soitisthesameloopasabove,butalsoanindexintotheformatstring
increaseseachtimethroughandisusedtoindicatetheformat.Itisalsoimportantthatthe
numberofformatitemsequalsthenumberofvariables:
deffprint(s,*vlist):
i=0
iflen(s)!=len(vlist):#Formatstringand
#variablelistagree?
print(“Theremustbethesamenumberofvariablesasformatitems.”)
return
forvinvlist:#Foreachvariable
ifs[i]==“f”:#Isthecorresponding
#formatꞌfꞌ?
fv=float(v)#Yes.Makeitafloat
print(fv,”“,end=””)#…andprintit
elifs[i]==“i”:#Isthecorresponding
#formatꞌiꞌ?
iv=int(v)#Yes.Makeitan
#integer
print(iv,““,end=””)#…andprintit
else:
print(“?”,end=””)#Donꞌtknowwhatthis
#is.Printit
i=i+1
Alloftheknownpositionalparametersmustcomebeforethevariablelist;otherwisethe
endofthevariablelistcan’tbedetermined.Thereisasecondcomplication,thatbeingthe
existenceofnamedparameters.Thoseareindicatedbyaparametersuchas**nlist.The
two“*” characters indicatea list of named variables. Thisis properly amore advanced
topic.
4.2.8 VariablesasFunctions
BecausePythoniseffectivelyuntypedandvariablescanrepresentanykindofthingat
all, a variable can be made to refer to a function; not the function name itself, which
always refers to a specific function, but a variable that can be made to refer to any
function.Considerthefollowingfunctions,eachofwhichdoesonetrivialthing:
defprint0():
print(“Zero”)
defprint1():
print(“One”)
defprint2():
print(“Two”)
defprint3():
print(“Three”)
Nowmakeavariablereferenceoneofthesefunctionsbymeansofan
assignmentstatement:
printNum=print1#Notethatthereisnoparameterlist
#given
ThevariableprintNum nowrepresentsa function,and wheninvoked, thefunction it
representswillbeinvoked.So:
printNum()
willresultintheoutput:
One
Why did the statement printNum = print1 not result in the function print1 being
called?Becausetheparameterlistwasabsent.Thestatement:
printNum=print1()
resultsinacalltoprint1atthatmoment,andthevalueofthevariableprintNumwillbe
the return value of the function. This is the essential syntactic difference: print1 is a
functionvalue,andprint1()isacalltothefunction.Toemphasizethispoint,hereissome
codethatwouldallowtheEnglishnameofanumberbetween1and3tobeprinted:
ifa==1:
printNum=print1#Assignthefunctionprint1to
#printNum
elifa==2:
printNum=print2#Assignthefunctionprint2to
#printNum
else:
printNum=print3#Assignthefunctionprint3to
#printNum
…
printNum()#Callthefunctionrepresentedby
#printNum
Therearemoresubtleusesinthiscase.Considerthisuseofalist:
a=1
printList=[print0,print1,print2,print3]
printNum=printList[a]
printNum()
willresultintheoutput:
One
Thefinaliterationofthisistocallthefunctiondirectlyfromthelist:
printList[1]()
ThisworksbecauseprintList[1]isafunction,andafunctioncallisafunctionfollowed
by().Seemsoverlycomplicated,doesn’tit?Itisrarelyused.
Forthosewithaninterestorneedformathematics,considerafunctionthatcomputes
thederivativeorintegralofanotherfunction.Passingthefunctiontobedifferentiatedor
integratedasaparametermaybethebestwaytoproceedinthesecases.
Example:FindtheMaximumValueofaFunction
Maximizingafunctioncanhaveimportantconsequencesinreallife.Thefunctionmay
represent how much money will be made by manufacturing various objects, how many
patientscangetthroughanemergencywardinanhour,orhowmuchfoodwillbegrown
withparticularcrops.Ifthefunctioniswellbehavedthentherearemanymathematically
soundwaystofindamaximumorminimumvalue,butifafunctionishardertodealwith,
thenlessanalyticalmethodsmayhavetobeused.Thisproblemproposesasearchforthe
best pair of parameters to a problem that could be solved using a method called linear
programming.
Theproblemgoeslikethis:
A calculator company produces a scientific calculator and a graphing calculator.
Long-term projections indicate an expected demand of at least 100 scientific
and80graphingcalculatorseachday.Becauseoflimitationsonproductioncapacity,
no more than 200 scientific and 170 graphing calculators can be made daily. To
satisfyashippingcontract, atotalofat least200calculators much beshippedeach
day.
If each scientific calculator sold results in a $2 loss, but each graphing calculator
producesa$5profit,howmanyofeachtypeshouldbemadedailytomaximizenet
profits?
Let s be the number of scientific calculators manufactured and g be the number of
graphingcalculators.Fromtheproblemstatement:
100<=s<=200
80<=g<=170
Also:
s+g>200,org>200-s
Finally,theprofit,whichistobemaximized,is:
P=–2s+5g
First,codetheprofitasafunction:
defprofit(s,g):
return-2*s+5*g
Asearchthroughtherangeofpossibilitieswillrunthroughallpossiblevaluesofsand
allpossiblevaluesofg;thatis,sfrom100to200andgfrom80to170.Thefunctionwill
beevaluatedateachpointandthemaximumwillberemembered:
#Rangeforsisx0..x1
#Rangeforgisy0..y1
#s+gmustbe>=sum
defsearchmax(f,x0,y0,x1,y1,sum):
pmax=-1.0e12
ps=-100
pg=-100
forsinrange(x0,x1+1):#Forallpossibles
forginrange(y0,y1+1):#Forallpossibleg
ifs+g>=sum:#Conditionisok?
p=f(s,g)#Calculatetheprofit.
ifp>=pmax:#Bestsofar?
pmax=p#Yes.
ps=s#Saveitand
pg=g#theparameters
return((ps,pg))
Finally,thecallthat doestheoptimization callsthesearch function passingtheprofit
functionasaparameter:
c=searchmax(profit,100,80,200,170,200)
print(c)
Theanswerfoundisthetuple(100,170),ors=100andg=170,whichagreeswiththe
correctanswerasfoundbyothermethods.Thisisonlyoneexampleofthevalueofbeing
abletopassfunctionsasparameters.Mostofthecodethatdoesthisismathematical,but
mayaccomplishpracticaltaskslikeoptimizingperformance,drawinggraphsandcharts,
andsimulatingreal-worldevents.
4.2.9 FunctionsasReturnValues
Justasanyvalue,includingafunction,canbestoredinavariable,anyvalue,including
afunction,canbereturnedbyafunction.IfafunctionthatprintsanEnglishnameofa
numberisdesired,itcouldbereturnedbyafunction:
defprint0():
print(“Zero”)
defprint1():
print(“One”)
defprint2():
print(“Two”)
defprint3():
print(“Three”)
defgetPrintFun(a):#Returnafunctiontoprinta
#numericvalue0..3
ifa==0:
returnprint0#Returnthefunctionprint0as
#theresult
elifa==1:
returnprint1#Returnthefunctionprint1as
#theresult
elifa==2:
returnprint2#Returnthefunctionprint2as
#theresult
else:
returnprint3#Returnthefunctionprint3as
#theresult
Callingthisfunctionandassigningittoavariablemeansreturningafunctionthatcan
printanumericalvalue:
printNum=getPrintFun(2)#AssignafunctiontoprintNum
andthen:
printNum()#CallthefunctionrepresentedbyprintNum
resultsintheoutput:
Two
The function printFun returns, as a value, the function to be called to print that
particular number. Returning the name of the function returns something that can be
called.
Why would any of these seemingly odd aspects of Python be useful? Allowing a
general case, permitting the most liberal interpretation of the language, would permit
unanticipatedapplications,ofcourse.Andtheabilitytouseafunctionasavariablevalue
andareturnresultareanaturalconsequenceofPythonhavingnospecifictypeconnected
withavariableatcompilationtime.Therearemanyspecificreasonstousefunctionsin
thisway,ontheotherhand.Imagineafunctionthatplotsagraph.Beingabletopassthis
functionanotherfunctionto beplottedissurely themostgeneral waytoaccomplishits
task.
4.3 RECURSION
Recursion refers to a way of defining things and a programming technique, not a
languagefeature.Somethingthatisrecursiveisdefinedatleastpartlyintermsofitself.
Thisseemsimpossibleatfirst,butconsiderthecaseofagrocerylist(notaPythonlist)of
itemssuchas:
milk,bread,coffee,sugar,peanutbutter,cheese,jam
Eachelementinthelistcanbecalledanitem,andrepresentssomethingtobepurchased
atagrocerystore.Thesmallestlistisonehavingonlyasingleelement:
milk
Thus,alistcanbesimplyanitem.Whatelsecanitbe?Itappearstobeabunchofitems
separatedbycommas.Onewaytodescribethisistosayitcanbeanitemfollowedbya
commafollowedbyalist.Thecompletedefinitionis,presumingthatthesymbol->means
“canbedefinedas”:
list->item#listcanbedefinedasanitem
list->item,list#listcanbedefinedasanitem,acomma,andalist
Inthiswaythelistmilkisdefinedasalistbythefirstrule.Thelistmilk,breadisalist
becauseitisanitem(milk)followedbyacommafollowedbyalist(bread).Itisplain
that a list is defined here in terms of itself, or at least in terms of a previous partial
definitionofitself.
When talking about functions, a function is recursive if it contains within it a call to
itself.Thisisnormallydoneonlywhenthethingthatitisattemptingtoaccomplishhasa
definitionthatisrecursive.Recursionasaprogrammingtechniqueisanattempttomake
the solution simpler. If it does not, then it is inappropriate to use recursion. A problem
somebeginningprogrammershavewiththeideasofarecursivefunctionisthatitappears
that it does not terminate. Of course, it is essential that a function does return, and a
program that never ends is almost always in error. The problem really is how to make
certainthatachainoffunctioncallsterminateseventually.
Thefollowingfunctionwillneverreturnoncecalled:
defrecur1(i):
recur1(i+1)
print(i)
Itwillnotresultinanyoutput,either.Whynot?Becausethefirstthingitdoesiscall
itself,andalwaysdoesso.Whenitdoes,thenextthingisdoesiscallitselfagain,andthen
again,andsoon.Thefollowingfunction,ontheotherhand,willterminate:
defrecur2(i):
ifi>0:
recur2(i-1)
print(i)
Whencalleditchecksitsparameteri.Ifthatparameterisgreaterthanzero,thenitcalls
itselfwithasmallervalueofi,meaningthateventuallyiwillbecomesmallerthan0and
thechainofcallswillstop.Whatwillbeprinted?Thefirstcalltorecur2thatdoesnotend
upcallingitselfiswheni==0,sothefirstthingprintedwillbe0.Thenthefunctionreturns
tothepreviousrecursivecall,whichhadtobewherei==1.Thesecondthingprintedwill
be1.Andsoonuntilitreturnstotheoriginalcalltothefunctionwiththeoriginalvalueof
i,atwhichpointitprintsi.Thisisatrivialexampleofarecursivefunction,butillustrates
howtoexitfromthechainofcalls:theremustbeaconditionthatdefinestherecursion.
Whenthatconditionfails,therecursionceases.
Eachcalltothefunctioncanbethoughtofasaninstanceofthatfunction,anditwill
createallofthelocalvariablesthataredeclaredwithinit.Eachinstancehasitsowncopy
ofthese, includingits parameters, andeach callreturns to thecaller as occurswith any
otherfunction call. So, when the recursivecallto recur2()returns, thenext thing to be
done will be (in this case) to print the parameter value. A call to recur2() passing the
parameter4willresultinthefollowinginstancesofthatfunctionbeingcreated:
Bytracingthroughthestatementsthatareexecutedinthisway,itcanbeseenthatthe
recursiondoesend,andtheoutputorresultcanbeverified.
One important use of recursion is in reducing a problem into smaller parts, each of
which has a simpler solution than does the whole problem. An example of this is
searchingalistforanitem.Ifnames=[Adams,Alira,Attenbourough,…]isaPython
listofnamesinalphabeticalorder,answerthequestion:“DoesthenameParkerappearin
this list?” Of course there is a built-in function that will do this, but this example is a
pedagogicalmoment,andanywayperhapsthebuilt-infunctionisslowerthanthesolution
thatwillbedevisedhere.
ThefunctionwillreturnTrueorFalsewhenpassedalistandaname.Theobviousway
tosolvetheproblemistoiteratethroughthelist,lookingatalloftheelementsuntilthe
namebeingsearchedforiseitherfoundoritisnotpossibletofinditanymore(i.e.,the
current name in the list is larger than the target name). Another, less obvious way to
conductthesearchistodividethelistinhalf,andonlysearchthehalfthathasthetarget
nameinit.Considerthefollowingnamesinthelist:
…BroadbentButterworthCaitCaraCarlingDeversDillanEberlyFoxworthy…
The name in the middle of this list is Carling. If the name being searched for is
lexicographicallysmallerthanCarling,thenitmustappear inthefirsthalf;otherwiseit
must appear in the second half. That is, if it is there at all. A recursive example of an
implementationofthisis:
#Searchthelistforthegivenname,recursively.
defsearchr(name,nameList):
n=len(nameList)#Howmanyelementsinthis
#list?
m=n/2
ifname<nameList[m]:#targetnameisinthefirst
#half
returnsearchr(name,nameList[0:m])#Searchthe
#firsthalf
elifname>nameList[m]:#targetmustbeinthe
#secondhalf
returnsearchr(name,nameList[m:n]#Searchthe
#secondhalf
else:
returnTrue
Ifthenameisinthelist,thisworksfine.Onewaytothinkofthisisthatthefunction
searchr()will take a string and a list as parameters and find the name in the list if it’s
there.Thewayitworksisnotclearfromoutsidethefunction(withoutbeingabletosee
thesource)andshouldnotmatter.SO:ifthetargetistobefoundinthefirsthalfofthelist,
forexample,thencallsearchr()withthefirsthalfofthelist.
searchr(name,nameList[0:m])
Thefactthatthecallisrecursiveisnotreallytheconcernoftheprogrammer,butisthe
concern of the person who created the Python system. Now, how can the problem of a
namenotbeinginthelistbesolved?
Whenthenameisnotinthelist,theprogramwillcontinueuntilthereisbutoneitemin
thelist.Ifthatitemisnotthetarget,thenitisnottobefound.So,ifn=1(onlyoneitemin
thelist)andnameList[0]isnotequaltothetarget,thenthetargetisnottobefoundinthe
listandthereturnvalueshouldbeFalse.Thefinalprogramwillthereforebe:
defsearchr(name,nameList):
n=len(nameList)#Howmanyelementsinthislist?
m=int(n/2)
ifn==1andnameList[0]!=name:#Endoftherecursive
#calls
returnFalse#Itꞌsnotinthis
#list.
ifname<nameList[m]:#targetnameisinthefirst
#half
returnsearchr(name,nameList[0:m])#Searchthe
#firsthalf
elifname>nameList[m]:#targetmustbeinthe
#secondhalf
returnsearchr(name,nameList[m:n])#Searchthe
#secondhalf
else:
returnTrue
Many algorithms have fundamentally recursive implementations, meaning that the
effectivesolutionincodeinvolvesarecursivefunctioncall.Manystandardexamplesin
beginning programming are not properly implemented recursively. Commonly
encounteredsampleswitharecursivesolutionincludethefactorial,whichhasarecursive
definition but is not best implemented in that manner, and any other basically linear
technique (linear search, counting, min/max finding) that does not do a reasonable
subdivision.Testingthefirstcomponent,forexample,andthenrecursivelylookingatthe
remainingelementsisapoorwaytouserecursion.Itwouldbemuchbettertousealoop.
Here’sanexample:findthemaximumvalueinagivenlist.Thenon-recursivemethod
(reasonable)wouldbe:
defmax(myList):
max=myList[0]
foriinrange(1,len(myList)):
ifmyList[i]>max:
max=myList[i]
returnmax
Thisisaneffectivewaytofindthelargestvalueinalist,andisprettyeasilyunderstood
byaprogrammerreadingthecode.Nowhereisarecursivesolution:
defmaxr(myList):
m1=myList[0]
iflen(myList)>1:
m2=maxr(myList[1:])
else:
returnm1
ifm1>m2:
returnm1
else:
returnm2
This function works by subdividing the list into two parts, as is often done with a
recursivesolution.Theideaistocomparethefirstelementinthelistwiththemaximumof
theremainderofthelisttoseewhichisbigger.Forthisparticularproblemthisisnotan
obvious approach. It is less efficient and less obvious than the iterative version that
preceded it. The use of recursion simplifies some problems, but it is not a universally
applicable technique and should never be used to show off. Examples of very useful
recursivefunctionswillbeexaminedinlaterchapters.
4.3.1 AvoidingInfiniteRecursion
Thereisalimittohowmanytimesafunctioncancallitselfwithoutreturning,because
eachcallusesupsomeamountofmemory,andmemoryisafiniteresource.Usuallywhen
this happens a programming error has occurred and the function has slipped into an
infiniterecursion, in which it will continue to call itself without end. Recursion can be
confusingtovisualizeandthissortofproblemoccursfrequently.Howcanitbeavoided?
Programming the function correctly eliminates the problem, of course, but there are
some basic rules that will avoid the problem at early stages. Assuming that global
variablesarenotbeingreferenced:
1.Afunctionthatbeginswithacalltoitselfisalwaysinfinitelyrecursive.Thefirst
thingthefunctiondoesiscallitself,andnomatterwhattheparametersareitcan
neverend.
2.Everyrecursivecallwithinafunctionmusthaveaconditionuponwhichthatcall
willbeavoided.Thefunctionmayreturnsometimebeforethecallismade,or
perhapsthecallhappenswithinanifstatement,buttheremustbesuchacondition.
IfitexistsitisexpressibleasaBooleanexpression,andthisshouldbeplacedina
commentneartherecursivecall.Thecallissuspectuntilthishappens.
3.Avoidpassingafunctiontoitself.Thecalltoaparameterhidesthefactthat
recursionistakingplace.
4.Itispossibletohaveaglobalvariablethatisacountofthedepthofrecursion.The
functionwillincrementthiscountwheneverarecursivecallismadeanddecreaseit
justbeforereturning.Ifthecountevergetslargerthanareasonableestimateofthe
maximumdepth,thenthefunctioncouldstopanymorecallsandbackout,oran
errormessagecouldbeprinted.
4.4 CREATINGPYTHONMODULES
Insomeoftheexamplesgivensofarthereisastatementatthebeginningthatlookslike
“import name.” The implication is that there are some functions that are needed by the
programthatareprovidedelsewhere,possiblybythePythonsystemitselforperhapsby
some other software developer. The idea of writing functions that can be reused in a
straightforwardwayisveryimportanttothesoftwaredevelopmentprocess.Itmeansthat
no programmer is really alone; that code is available for doing things like generating
randomnumbersorinterfacingwiththeoperatingsystemortheInternet,andthatitdoes
not to be created each time. In addition, there is an assumption that a module works
correctly.Whenaprogrammerbuildsacollectionofcodefortheirownuse,itneedstobe
testedasthoroughlyaspossible,andfromthattimeonitcanbeusedinapackagewith
confidence.Ifaprogramhaserrorsinit,thenlookinthecodeforthatprogramfirstand
notinthemodules.Thismakesdebuggingcodefaster.
Whatisamodule?Itissimplyafunctionorcollectionoffunctionsthatresideinafile
whose name ends in “.py.” Technically, all of the code developed so far qualifies as
modules. Consider as an example the function from the previous section that finds the
maximumvalueinalist.Savethefunctionsmax()andmaxr()inafilenamedmax.py.
NowcreateanewPythonprogramnamedusemax.pyandplaceitinthesamedirectoryas
max.py.Ifthetwofilesareinthesamedirectory,theycan“see”eachotherinsomesense.
Hereissomecodetoplaceinthefileusemax.py:
importmax
d=[12,32,76,45,9,26,84,25,61,66,1,2]
print(“MAXis“,max.max(d),“MAXRis“,max.maxr(d))
ifmax.maxr(d)!=max.max(d):
print(“***NOTEQUAL****”)
Thisprogramisjustatestofthetwofunctionstomakecertainthattheyreturnthesame
valueforthesamelist,thevariabled.Notetwothings:
1.Thestatementimportmaxoccursatthebeginningoftheprogram,meaningthat
thecodeinsidethisfileisavailabletothisprogram.Pythonwilllookinsideofthis
fileforfunctionandvariablenames.
2.Whenthefunctionmax()ormaxr()iscalled,thefunctionnameisprecededbythe
modulename(max)andaperiod.ThissyntaxinformsthePythonsystemthatthe
namemaxr()(forexample)isfoundinthe
modulemaxandnotelsewhere.
ThefirsttimethatthemoduleisloadedintothePythonprogramthecodeinthemodule
isexecuted.Thisallowsanyvariableinitializationstobeperformed.Henceforththatcode
is not executed again, and functions within the module can be called knowing that the
initializationshavebeenperformed.
Themodulecouldresideinthesamedirectoryastheprogramthatusesit,butdoesnot
haveto.ThePythonsystemrecognizesasetofdirectoriesandpathsandmodulescanbe
placedinsomeofthoselocationsaswell,makingiteasierforotherprogramsonthesame
computertotakeadvantageofthem.Onthecomputerusedtocreatetheexamplesinthis
book, the directory C:\Python34\Lib can be used to store modules, and they will be
recognizedbyimportstatements.
Finally, if the syntax max.maxr(list) seems a bit cumbersome, then it is possible to
importspecificnamesfromthemoduleintotheprogram.Considerthefollowingrewrite
ofusemax.py:
frommaximportmax,maxr
d=[12,32,76,45,9,26,84,25,61,66,1,2]
print(“MAXis“,max(d),”MAXRis“,maxr(d))
ifmaxr(d)!=max(d):
print(“***NOTEQUAL****”)
Thestatementfrommaximportmax,maxrinstructsPythontorecognizethenames
maxandmaxrasbelongingtothemodulenamedmax(i.e.,asresidinginthefilenamed
max.py).Inthatcasethefunctioncanbecalledbysimplyreferencingitsname.
There would appear to be a name conflict with the package named max and the
functionnamedmax,butinfactthereisnot.Indeed,it’snotuncommontofindthissortof
namingrelationship(example:random.random()).Themodulenamemaxreferstoafile
name,max.py.Thefunctionnamemaxreferstoafunctionwithinthatfile.
4.5 PROGRAMDESIGNUSINGFUNCTIONS–
EXAMPLE:THEGAMEOFNIM
Nimisagamesooldthatitsoriginshavebeenlost.ItwaslikelyinventedinChina,and
isoneoftheoldestgamesknown.Itwasalsooneofthefirstgamestohaveacomputeror
electronicimplementationandhasbeenthefrequentsubjectofassignmentsincomputer
programming classes. This program will implement the game and will play one side. It
will serve as an example of how to design a computer program using functions and
modularity—itisanexampleofa
top-downdesign.
The game starts with three rows of objects, such as sticks or coins, and there are a
different number of objects in each row. In this version there are 9, 7, and 5 “sticks,”
whicharerepresentedby“|”characters.Aplayermayremoveasmanyobjectsfromone
rowastheychoose,buttheymustremoveatleastoneandmusttakethemonlyfromone
row.Playerstaketurnsremovingobjects,andtheplayertakingthatfinaloneisthewinner.
Playing this game involves asking the user for two numbers: the row from which to
removesticks,andhowmanytoremove.Thehumanplayerwillbepromptedfortherow,
thenthenumber.Thenthecomputerwillremovesomesticks(takeitsturn)andprintthe
newstate.
Alistnamedvalcontainsthenumberofsticksineachrow.Initially:
val=[5,7,9]
Thisisthegamestate,andiscriticaltothegameasitdefineswhatmovesarepossible.
Also,whenthestateis[0,0,0]thenthegameisover.
WhentheuserchosestoremoveNsticksfromrowMtheactionis:
val[M]=val[M]-N
OfcourseNandMmustbetestedtomakecertainthatMisbetween0and2,andMis
aslargeasval[M].Mdefinestherowchosentoremovesticksfrom,andNisthenumber
ofstickstoremove.Amovecanthereforebedefinedasalist[row,sticks].
A program that uses functions should be built from the highest level of abstraction
downwards.Thatis,themainprogramshouldbedevelopedfirst,andshouldbeexpressed
intermsoffunctionsthatdologicalthings,butthatmaynotbedesignedorcodedyet.So,
themainprogramcouldlooksomethinglikethis:
val=[5,7,9]#thegamestate:5,7,and9sticks
done=False#Isthegameover?
userMove=[-1,-1]#Amoveisarowandanumberof
#sticks.
print(“ThegameofNim.”)
rules()#Printtherulesforthegame
whilenotdone:#Rununtilthegameisover
displayState(val)#Showthegameboard
prompt(userMove)#Askuserfortheirmove
ok=legalMove(userMove,val)#Wastheplayerꞌsmove
#OK?
whilenotok:
print(“Thismoveisnotlegal.”)
displayState(val)
prompt(userMove)#Askuserfortheirmove
ok=legalMove(userMove,val)
makeMove(userMove)#Makeit
ifgameOver(val):
print(“Youwin!”)
break;
print(“Stateafteryourmoveis“)#displayit.
displayState(val)
Thisprogramisbuiltusingcomponents(modules)thatarenotwrittenyet,butthathave
apurposethatisdefinedbywhattheprogramneeds.Thosemodules/functionsare:
rules()-Printouttherulesofthegame
displayState(v)-Printthegamestate(howmanysticksineachrow)
prompt()-Asktheuserfortheirmove
legalMove(r,n)-isthemovelegal?
makeMove(r,n)-makethismove
Usingfunctions,thefirstthingthatisneededistodisplaythegamestate.Itprintsthe
numberofsticksineachofthethreerows,anddoessoinagraphicalway(i.e.,ratherthan
justdisplayingthenumbers).Giventhesituationasdescribedsofar,thenon-trivialsuch
functionisdisplayState(),whichprintsthecurrentstateofthegame—howmanysticksin
eachrow.Itwillbepassedalistrepresentingthecurrentstate.
defdisplayState(val):#valisthelistwith
#thestate
forjinrange(0,3):#thereare3rows;
#printeachone
print(j+1,“:“,end=””)#Printtherownumber
foriinrange(0,val[j]):#val[j]isthecurrent
#row
print(“|“,end=””)#printaꞌ|ꞌforeach
#stick
print(””)#printanendofline
When called at the beginning of the game, here’s what the result of a call to this
functionwouldbe:
1:|||||
2:|||||||
3:|||||||||
Thisfunctiondoesasingletask,usesaparametertoguideitandmakeitmoregeneral,
andisnamedforwhatitdoes.Thesearesignsofagoodfunction.Notethatthefirstrowis
labeled“1,”butisinfactelement0ofthelist.Itiscommoninuserinterfacestoadaptto
the standard human numbering scheme that begins with 1 instead of 0. When the user
entersarownumbercaremustbetakentosubtract1fromitbeforeusingitasanindex.
There is no required order for writing these functions, but the next one used in the
programisprompt().Thiswillasktheusertoinputarowandthenreadarownumber,
thenprompttheusertoenteranumberofstickstoremoveandthenreadthatvaluetoo.
The two numbers will be placed into a list that was passed so that the values can be
returnedtothecaller.
defprompt(move):
row=input(“Yourmove:whichrow?“)#Promptforrow&
#readit
sticks=input(”howmanysticks?”)#Promptfor
#sticks&read
#Convertrowtointegeranddecrementtobefrom0to2.
move[0]=int(row)-1#Assigntothelist[0]
move[1]=int(sticks)#Assignvaluetolist[1]
This function again does a simple task, uses a parameter, and is named pretty
appropriately.
Nextisthequestion“Isthismovelegal”?Amoveislegaliftherowisbetween0and2
inclusive,andifthenumberofsticksinthatrowisgreaterthanorequaltothenumberof
stickstoberemoved.ThefunctionreturnsTrueorFalse.
deflegalMove(move,state):
row=move[0]#Whichrowwasrequested?
sticks=move[1]#Howmanysticks
ifrow<0orrow>2:#Legalnumberofrows?
returnFalse#No
ifsticks<=0orsticks>val[row]:#Legalnumberof
#sticks?
returnFalse#No
returnTrue#Bothwereok,sothemoveisOK.
Makingamoveinvolvesdecreasingthespecifiedrowbythespecifiednumberofsticks.
This could have been done in legalMove() if it were thought to be OK to do multiple
thingsinafunction.Eventuallythatwillbenecessary,butfornowanewfunctionwillbe
written,namedmakeMove(),thatwillactuallyimplementaspecifiedplayinthegame.
defmakeMove(move,state):
row=move[0]#Subtractmove[1]sticksfrom
sticks=move[1]#thosethatareinrowmove[0].
#Placethenewnumberofsticksinthestatelist
state[row]=state[row]-sticks
Thereisastrategythatwillpermitaplayertoalwayswin.Itinvolvescomputingwhat
amountstoaparityvalueandmakingamovetoensurethatparityismaintained.Consider
theinitialstateandthestateaftertakingtwosticksfromrow1:
Row1=5=0101row1=3=0011
Row2=7=0111row2=7=0111
Row3=9=1001row3=9=1001
Parity10111101
The parity is determined by looking at each digit in the binary representation of the
values.Ineachcolumn(digitposition)theparitybitforthatcolumnis1ifthenumberof1
bitsinthecolumnisoddand0ifitiseven.Thiscanbecalculatedusingtheexclusive-OR
operator,whichis“^”inprocessing.ThestrategyinNimistomakeamovethatmakesthe
parityvalue0. Itturnsout thatthisis alwayspossibleif parityisnot 0;in thesituation
above,thecomputermightremove5sticksfromrow3givingthestate:
row1=3=0011
row2=7=0111
row3=4=0100
Parity0000
Thisiswhatthesketchdoesaftereverymovetheplayermakes:itmakesallpossible
moves, computing the parity after each one. When the one with zero parity is found it
makes that move. The function eval() calculates the current parity value as
val[0]^val[1]^val[2].
NOTEThe computer always wins because the user always makes the first move.
Alternatingwhomovesfirstwouldmakethegameplayfairer.
4.5.1 TheDevelopmentProcessExposed
IntheintroductiontotheNimprogramitwassaidthatthiswasanexampleoftop-down
design. This means that the larger program, or the main program, is designed first. The
questionshouldbewhatarethestepsinvolvedinsolvingthisproblem?Theanswertothat
questioniswrittendownintermsoffunctionsthathavenotbeenwrittenyet,butthathave
aknownandrequiredpurposewithinthesolution.IntheNimgameitisknownthatthe
user’smovewillhavetobereadfromthekeyboardandthatthecurrentstateofthegame
willhavetobedisplayed,sothosetwofunctionscanbepresumedtobeimportanttothe
solutionwhensketchingthemainprogram.
Oncethehigh-levelpartoftheprogramhasbeendevised,itcanbetypedinandtested.
Thefunctionsthatareneededbutarenotyetwrittencanbecodedasstubs:functionsthat
donotimplementtheirtaskbutthatarepresentandpreventsyntaxerrors.Thefirsttryata
solutionofthissortdoesnot,ofcourse,solvetheproblem,butissimplyasteptowardsthe
solution.InthecaseofNim,theveryfirststepcouldbesomethinglike:
Repeat
Displaythegame
Askuserfortheirmove
Makeuserꞌsmove
Generatecomputerꞌsmove
Makecomputer'smove
Untilsomeonewins
Displaythewinner
Noneofthesesteps arewrittenasproper Python code,andthat’sOK forafirststep.
TranslatingthisintoPythonwouldbeagoodidea:
Done=false
whilenotdone:#Rununtilthegameisover
displayState()#Showthegameboard
prompt()#Askuserfortheirmove
makeMove()#Makeit
ifnotgameOver():#Computermove?
makeComputerMove()#Determinecomputer'smove
done=gameOver()#Isthegameover?
printWinner()
Atthispointinthedesignthedatastructurestobeusedinthesolutionhavenotbeen
devised, nor have the algorithms needed. This is merely a sequence of steps that could
leadtoaprogramthatworks.Thefunctionscannowbewrittenasstubs:
defdisplayState():defprompt():
print(“Displaystate”)print(“Entermove”)
defmakeMove():defgameOver():
print(“Makemove”)ifrandom.random()<0.2:
returnFalse
returnTrue
defmakeComputerMove():defprintWinner():
print(“computeamove”)print(“Thewinneris:”)
Theoutputfromthisprogrammightbe:
Displaystate
Entermove
Makemove
Displaystate
Entermove
Makemove
computeamove
Thewinneris:
Theexactoutputwillberandom,dependingonwhatthereturnvalueofgameOver()is.
Thiscodecanbethoughtofasoneiterationofthesolution,orasaprototype.Thenext
step will be to refine the solution by implementing one of the stubs. Each time that
happensasetofdecisionswilllikelybemadeconcerningthenatureofthedatastructures
used to implement the solution: the use of a list for the game state, for instance. Three
integerscouldhavebeenusedinstead,butoncethedecisionismadeonewayortheother
itshouldbestuckwithunlessitbecomesinfeasible.
Repeatedly implementing the stubs creates new prototypes, each one more functional
than the one before. Some of the functions may require an application of this same
process.Complexfunctionscanbecodedintermsofotherstubs,andsoon.Thesimpler
functions,suchasthosethatcalculatebasedonlyontheirparameter,shouldbecompleted
firstandshouldnotinvolvepermanentdesignchoices.
A programming process of this kind can be thought of as iterativerefinement. At all
times after the first step, a complete program that compiles and runs will exist to be
demonstratedandrefined.Thiscanbeveryuseful,especiallywhendealingwithgraphical
user interfaces and games. The interface might well be complete before any real
functionality is present, and this permits a demonstration of the concept before the
programisdone.
4.6 Summary
Pythonallowsaprogrammertocreateafunctionthatdoessomethingnew.Afunctionis
reallyjustsomecodethathasaname,andcanbeexecutedsimplybyinvokingthatname.
Itusuallyrepresentssometaskthathastobedonefairlyfrequently.Afunctionshouldalso
haveonemaintask,andthattaskshouldberepresentedinthefunctionname:maximum,
square, search, and so on. Many functions return a value, and finding that value is
frequentlythepurposeofthefunction(e.g.,sine,cosine).
Thenameofafunctioncanbeusedtocallthatfunction,butitcanalsobeassignedtoa
variable,passedasaparametertoanotherfunction,orreturnedasavalue.Afunctioncan
have variables that belong to it; they are called local variables and vanish after the
functionreturns.Theycanalsousevariablesdefinedoutsideofthefunctioniftheyappear
inaglobalstatement.
AspecialvaluenamedNoneisusedtorepresentnovalue,andisreturnedbyafunction
thatdoesnotexplicitlyreturnsome othervalue.Amoduleisafunctionorcollectionof
functionsthatresideinafilewhosenameendsin“.py.”
Theuseoffunctionscanorganizeacomputerprograminalogicalway.Aprogramcan
be defined in terms of functions that are desired but not yet written, and then those
functionscanbedefinedascodeorintermsofotherfunctions.Functionsareoftennamed
butareincomplete,andarecalledstubs—theypermittheprogramtobecompiledwhile
stillunderdevelopment.
Afunctionthatcallsitselfissaidtoberecursive.Suchfunctionscanbeveryvaluablein
simplifyingthecodeforsomealgorithms,especiallyonesinwhichsomethingisactually
defined in terms of itself, but care must be taken when programming to ensure that a
recursivefunctionalwaysultimatelyreturns.
Exercises
1.WriteaPythonfunctionthattakesatupleofnumbersasaparameterandreturnsthe
location(index)ofthemaximumvaluefoundinthattuple.
2.Awordprocessingsystemsometimesneedstoshortenawordtomakeitfitonaline.
Writeafunctionthattakesastringcontainingasinglewordanddecideswhereto
hyphenateit.Ahyphencanoccurbeforetheendings:“ing,”“ed,”“ate,”“tion,”or
“ment.”Itcouldalsooccurafteraprefix:“pre,”“post,”“para,”“pro,”“con,”or
“com.”Otherwise,placeahyphensomewhereinthemiddleoftheword.Thefunction
shouldreturnatuplecontainingthefirstandsecondhalfofthewordsplitatthe
hyphen.
3.Pascal’striangleisanarrangementofnumbersinrowsandcolumnssuchthateach
numberinarowisthesumofthetwonumbersaboveit.Anexampleis:
1
11
121
1331
Write a function triangle(n) that prints the first n rows of such a triangle. Useextra
marksforproperindentationsoitlookslikeatriangle.
4.Writeafunctionthatreturnsthevalueofaquadraticfunctionataparticularxvalue.A
quadraticisapolynomialoftheform:
ax2+bx+c
The function quad() is passed values for a, b, c, and x and returns the value of the
polynomial.
5.Aquadraticpolynomialhasarootatanyvaluexforwhichthevalueofthe
polynomialiszero;thatis,anyxsuchthat
ax2+bx+c=0
Therecanonlybeatmosttwosuchvalues(atuple),andtheexpressionforfindingthese
valuesofxis:
Writeafunction(root(a,b,c))thatreturnsthetworootsofaquadraticequationhaving
beenpasseda,b,andc.Theresultisatuple,orifthereisnosolution(i.e.,squareroot
ofanegativenumber,ora=0)thenitreturnsNone.
6.Writeafunction(inputfloat(s))thattakesasingleparameter,astringtobeusedasa
prompt,andreturnsanumberreadfromtheconsole.Thefunctionmustpromptthe
userforthenumberusingthegivenstring,readtheinput,andreturntheresultasa
floatingpointnumber.IfanerroroccursreturnNone.
7.Thegameoftabletennisissometimescalledping-pong.Writefunctionsping()and
pong()thateachtake,asaparameter,aprobabilityofhittingtheball.Aprobabilityis
between0.0and1.0.ThefunctionreturnsTrueiftheballisreturnsandFalse
otherwise.Therearetwosidestothegame,andeachsideserves(playsfirst)twice,
thentheothersideservestwice.Itwillbeassumedherethattheserveralways
succeeds.Ifpingisservingthenpong()getscalledfirst,thenifpongsucceededthen
ping()getscalled,andsoon.Thesidethatmadethelastsuccessfulhitwinsapoint.
Thegamegoesto11points,butmustbewonbya2-pointmargin.Writeaprogram
thatsimulatesping-pongusingtwofunctionsnamedping()andpong().
8.Inmutualrecursiontwofunctionscalleachother,usuallyrepeatedlytosomedepth.
So:AcallsB,whichcallsAagain,whichcallsBagain,andsoon.Recodetheping-
pongexercise(Number7above)sothatping()callspong()andpong()callsping().
Thefunctionsreturnastring,thatofthewinneroftheexchange.
9.Writeafunctionprime(n)thatreturnsTrueifthenumbernisprime,andFalse
otherwise.Howmanyprimenumbersaretherebetween1and1000?
NotesandOtherResources
Tutorial on Python Functions:
http://www.tutorialspoint.com/python/python_functions.htm
Also:http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/functions.html
1.ThomasS.Ferguson.GameTheory,
https://www.math.ucla.edu/~tom/Game_Theory/comb.pdf
2.D.G.Luenberger.(1973).IntroductiontoLinearandNonlinearProgramming
(Vol.28),Reading,MA:Addison-Wesley.
3.MitchellWand.(1980).Induction,Recursion,andProgramming,NorthHolland,
NY,http://tocs.ulb.tu-darmstadt.de/82570701.pdf
CHAPTER5
FILES:INPUTANDOUTPUT
5.1WhatIsaFile?ALittle“Theory”
5.2KeyboardInput
5.3UsingFilesinPython:LessTheory,MorePractice
5.4WritingToFiles
5.5Summary
Inthischapter
Intheearlydaysofcomputing,whenarespectablemachinewouldfillaroom,thefilewas
invented.Itwasanobvious thing,really:apackagein whichtoplacedata onatapeor
disk drive. Storage that was not memory, which was pretty fast, was called secondary
storage,andwasterriblyslowcomparedtohowfastacomputercouldexecuteinstructions
(which is terribly slow compared to how fast a modern computer can execute
instructions).
Filesstillexist,andatypicalPChashundredsofthousandsofthem.Thedetailsofhow
filesareimplementedisinteresting,butunimportanttothediscussionofhowtousethem
inPython.Thefocuswillbeonhowandwhytousethemeffectively.
Thefirstthingtoknowaboutafileisthatitisacollectionofbytesstoredonadiskor
similardevice.Onesetofbytescanlookverymuchlikeanother,andunlesstheformatof
thefile(i.e.,thewaythebytesareordered)anditsbasiccontents(i.e.,whatkindofthing
the bytes represent) is known ahead of time, the information stored there is unusable.
Computer programs are written assuming that the files they will read have a particular
nature; if given a file that does not have that nature, the program will not function
properly.
Whatkindsoffilesarethere?Hereisashortlist:
1.Textfiles.Thesecontaincharactersthatapersoncanread,andcanbethoughtofas
documents.
2.Executablefiles.Theseholdinstructionsthatacomputercanexecute.Suchafileis
aprogramoran“app.”
3.Datafiles.Itcouldalsobeatextfileifitisstoredascharacters,butitcouldbeaset
ofbytesthatrepresentintegersorrealnumbers.
4.Imagefiles.Therearemanytypesofimagefiles,andtheycontainpicturesin
digitalformat.ManydigitalcamerasuseaformatcalledJPEG,butGIForPNGare
twoofmanyothers.Notonlyareimagesstoredinsuchafile,butalsodataabout
howlargetheimageis,whenitwastaken,andotherdetails.
5.Soundfiles.ThemorecommonsoundfileistheMP3,buttherearemanyothers.
6.Video.MPEGandAVIarestandardformatsforvideo,andthereareagreatmany
filesofthissortavailableontheInternet.
7. Web pages. These are a special kind of text file. They can be examined and
modifiedusingbasictexteditors,butcan’tbeviewedproperly(i.e.,asawebpage)
exceptthroughabrowser,whichisreallyaspecialkindofdisplayutilitythatcan
bothdrawimagesandconnecttotheInternettodownloadmoreinformation.
Allofthese files,and indeedall files,have certain thingsin common.Some ofthese
thingscanbeignoredwhenwritingPythonprograms,butotherscannot.
Fileshavenames.Thefirstwaytoaccessafileisusuallybyspecifyingitsname.In
human folklore, knowledge of a true name allows one to affect another person or
being;knowingsomething’struenamegivesthepersonpoweroverthatthing.Soit
iswithfiles.Knowingthenameofafileisthewaytoaccesstheinformationwithin.
Fileshaveasize.Itisusuallyexpressedinbytes,whichistosaysimplecharacters.
One byte is one traditional alphabetic character, although there are now many
standards for characters in German and Swedish and Chinese that break that rule.
Knowinghowlargeafileishelpswhenusingitasinput,andwhenwritingafileits
sizegrows.
Basicoperationsonafilearereadandwrite.Toreadfromafilemeanstoexamine
a byte (at least); usually bytes are read in large blocks for efficiency. This means
movingacopyofthebytesfromthediskintomemory,becauseaprogramcanonly
examinedatathatisinmemory.Writingisthereverseprocess:abyteorbytesare
copiedfrommemoryontodisk.
Filesmustbeopenbeforetheycanbeused.Toopenafileaprogrammustknow
itsname,andtheninvoketheopenfunctionorprogram.Ifthetruenameofthefile
givesyoupoweroverit,thenopenisthespellusedtowieldthatpower.Whethera
filewillbe reador writtenisnormally decidedat the timethe fileis opened. The
openfunctionandmanyotherfile-relatedoperationsbelongtotheoperatingsystem
of the computer, and not normally to the language. It’s one reason why so much
softwareisnotportable.
Onlyone programata timecan writeto afile.Many programscan read afile
simultaneously,butonlyonecanwritetoit,andnotwhileanyoneelseisreadingit.
Many computers can have more than one user accessing a file at a time, and the
Internetcertainlyallows manyusers toaccess a web page atone time,and aweb
pageisafile.However,chaosensuesifmorethanoneusercanchangeafileatthe
samemoment.
Anotherthingtoconsideristhattext,andthereforetextfiles,areaprincipalmeansfor
communicationbetweenhumansandcomputers.Itiscriticalthatanyschemeforwriting
text to a file takes into account the human aspects of text: sentences, lines, paragraphs,
special characters, numbers, and so on. This chapter will be concerned with the way in
whichPythoncanusefiles,withfilesasaconceptingeneral,andwithhowhumansthink
ofdataandfiles.
5.1 WHATISAFILE?ALITTLE“THEORY”
Theclaimintheprevioussectionwasthatafileis“acollectionofbytesstoredonadisk
or similar device.” This is correct, but it does not have enough detail to enable a
programmertotakeadvantageofthewayafileisimplementedtocreateagoodprogram.
Whatisneededisanunderstandingofthedevicesthatcontainfilesandtheiradvantages
and limitations. This information will begin to explain the traditional mechanisms that
have evolved for using files from programming languages generally and Python in
particular.
The file as a data structure was devised for storing information on tapes and disks.
Togetherwithsomeotherdevicesthatareusedrarely(e.g.,cramfiles)thesearereferred
toassecondarystorage,whereprimarystoragewouldbecomputermemory.Memorywas
(andstillis)tooexpensivetostoreeverythingthatisneededonacomputer,sosecondary
storagehastheadvantagesofbeingcheaperthanmemoryandcancontainamuchlarger
amountofdata.ModerndiskscancontainTerabytesofdata,whereoneTerabyte(Tb)is
1012bytes.Ithasbeenestimatedthatahumanbeing’sfunctionalmemoryisabout1.25
Tb.ATerabyteisalotofstorage.
Most secondary storage devices store data magnetically. Since tapes are rarely seen
anymore, the example presented here will be that of a disk. A disk is a circular platter
madeofglassorceramicmaterialandcoatedwithathinlayerofmagneticmaterial,often
acompoundofiron.That’swhytheylookbrown:ironoxideorrustisthatcolor.Thedisk
ismountedonaspindlethatisconnectedtoamotor,whichspinsitatahighrateofspeed.
Adevicecalledaread/writeheadismadetositabovethemovingdiskbutverynearto
it.Thisdeviceisbasicallyasmallpieceofmagnetizablemetalwrappedinafinewire,not
unliketheread/writeheadsinanoldvideotaperecorder(VCR)orcassettemachine.Itisa
propertyofmagnetsandcoilsthatamovingmagnetwillcreate(induce)anelectriccurrent
inanearbycoil,andacoilwithacurrentflowingthroughitcancreateamagneticfield.
So, to write data to the moving disk, a current is sent to the read/write head, which
createsasmallmagneticmarkonthediskbelowthehead.Magnetshavetwoorientations;
theyhaveaNorthPoleandaSouthPole.Currentflowingonewaywillcreateamagnetin
thediskthathasaNorthPoleappearingbeforetheSouthPole,oranN-Smark.Current
flowingtheotherdirectionthroughtheheadwillcreateamagnetonthediskthathasthe
SouthPoleappearingbefore theNorthPole,or anS-Nmark. Oneorientation,say N-S,
willrepresentabinarynumber“1,”andtheother(S-N)willrepresenta“0.”Inthisway,
binarynumberscanbewrittentothesurfaceofthemovingdisk.
Reading numbers involved the magnetic regions of the disk passing quickly past the
read/writeheadandinducingsmallcurrentsinthecoil.Theseareamplifiedandclassified
by a simple electronic circuit that will detect current flow one way as N-S and another
wayasS-N,thusallowingbinarynumberstobereadfromthedisk.
Therearesomeverycomplicatedphysicsinvolvedinadiskdrive.Theread/writehead
mustbeveryclosetothesurfaceofarapidlyrotatingdisk,ascloseas3nanometers.To
accomplishthis,theheadisactuallyaerodynamically
flyingabovethedisk.Ifiteveractuallytouchesthedisksurfacetheresultiscatastrophic.
Atthespeedsinvolvedalargesectionofthemagneticmaterialonthedisk’ssurfacewill
bescrapedaway,andalldatatherewillbelost.Inaddition,theread/writeheadwillalmost
certainlybedamaged.Thiseventiscalledaheadcrash,andnormallyresultsintheentire
diskdrivebeingruined.It’sonereasonthatfrequentbackupcopiesofalldatashouldbe
made.
Thepicturethatisdevelopingisthatofadevicethatreturnsdataasastreamofbits.To
makebestuseoftheareaofthedisk,theread/writeheadcanmovefromtheouteredgeof
thedisktonearlythecenter.Imagineasetofconcentriccirclesonthedisk’ssurface:the
moving read head can position itself over any of them and read the data that had been
writtenthere.
The disk is divided into a set of concentric circles called tracks, each of which
corresponds to one position of the read/write head (Figure 5.2a). The head can move
across the disk surface, but for obvious reasons the positions are quantized: position 0-
Ntrackscanbereachedthroughcommandstoacontrollerthatchangetheheadposition.
Theoutermosttrackisnumbered0,andnumbersincreaseastheheadmovesinwardtothe
center.Thediskisalsodividedintosectors,eachofwhichisawedge-shapedportionof
thedisk(Figure5.2b).Theseareagainnumbered0toNsectors,andcreateanaddressfora
setofbits.Datacanbereadfromsector3track12bypositioningthereadheadovertrack
12andwaitingforsector3torotateintopositionunderthehead.Thedatatakesaslongto
readasthesectortakestopassunderthereadhead.
Thisdescriptionanswerstwoimportantquestions.First,datacanbeaccessedbyusing
the<track,sector>address.Thedatainasingletrackandsectorisablock,andallblocks
are the same size in terms of bits for the sake of convenience, traditionally 512 bytes
(4096bytesforAFdrives).Second,itexplainswhy
accessingdatatakessolongwhenreadingfromadisk.Disksrotateat7200RPMor120
revolutionspersecond;thisisonerotationevery8.3milliseconds.
5.1.1 HowAreFilesStoredonaDisk?
Afilecanbethoughtofasasetofblocks.Ifblocksare512bytesinsizeandsomedata
tobestoredin afileconsistsof Nbytes,then that filewillneed blocks,thenextlarger
integerthanN/512;it’snotpossibletohavetwofilesshareasingleblock.
Itgetsmorecomplicated,though,becauseitwillnotalwaysbepossibletohaveallof
theblocksthatbelongtoafilelienexttoeachother.Afilemightconsistofmanyblocks,
allofwhicharesomedistanceapartintermsoftheirtrackandsector.Thereisaneedfora
datastructuretoconnecttheseblocksinthecorrectordertomakeafile.It’snotveryhard
todobutisanotherstep.Thisdatastructureiswrittentothediskalso.Theresultisthat
readingafilemeansfindingthelocationofthisdatastructureonthedisk,gettingthetrack
andsectorvalues,andthenreadingthedatafromthoseandcopyingitintomemory.The
datastructurecontainingthesectorsisusuallyfoundthroughafilenamethattheuserhas
provided.Thereisalistoffilenamesandthetrack/sectoraddressoftheirindexsectorsin
aspecialfilesomeplaceonthedrive,orinmanyplaces.Filesystemstendtobeorganized
hierarchically,sothatonemainnameisaccessedtofindthefileswithinthatpartofthe
disk(directory),andwithinthatdirectoryarenamesofmorefilesanddirectories.Itisa
significantpartofthefunctionofanoperatingsystemlikeLinuxorWindowstoprovidea
convenientwaytoaccessfiles.
5.1.2 FileAccessisSlow
Howlongdoesittaketoaccessablockofdataonthedisk?Itdependsonwherethe
diskheadisandwherethediskrotationhasplacedthetargetblockatthetimetherequest
ismade.Therewillbeonlyastatisticalanswer,butforarandomblockitcouldtakean
averageof10mStomovetheheadtothecorrecttrack(seektime),andwilltakehalfofa
rotation(4.15mS).Addtothisthetimeneededtoreadtheblock,whichis8.3*1/Nsectors
mS,orabout0.008mSforadiskwith1024sectors.Thiscanbeignored,andthetimeto
accessarandomblockcanbeestimatedas14.15milliseconds.
As a comparison, fast computer memory can access data within 8 nanoseconds. If a
person could write the word “Gigabyte” on a whiteboard in 8 nanoseconds, then what
couldtheydoin14milliseconds?TheycouldcopytheentireBibleontotheboardover16
times.Diskisvastlyslowerthanmemory,andinordertousethedataitmustbecopied
intomemory.Thisisabottleneckinmanycomputersystems.
5.2 KEYBOARDINPUT
Readingdatafrom thekeyboard isvery differentfrom readingdata froma file.Files
exist before being read, and normally have a fixed size that is known in advance. It is
commontoknowtheformatofafile,sothatthefactthatthenextdatumisanintegerand
theonefollowingthatisafloatisoftenknown.Whenauserisenteringdataatakeyboard
thereisnosuchinformationavailable.
Infact,theusermaybemakingupthedataastheygoalong.Beforegettingtoofarinto
fileinputitisimportanttounderstandthekindoferrorsthatcanhappeninteractively.
Theseare usingtype errors, where the userenters data that is thewrong typefor the
programmertouse:astringinsteadofaninteger,forexample.Thiskindoferrorcanarise
infileinputalsoiftheformatisnotknowninadvance.
5.2.1 Problem:ReadaNumberfromtheKeyboardandDivideItby2
Inthisinstancetheproblemisoneoftype:howtotreatintegerslikeintegersandfloats
likefloats.Whenthestringsisreadinit’sjustastring,anditissupposedtocontainan
integer. However, users will be users, and some may type in a float by mistake. The
program should not crash just because of a simple input mistake. How is this situation
handled?
Theproblemisthatwhenthestringisconvertedintoaninteger,ifthereisadecimal
pointorothernon-digitcharacterthatdoesnotbelong,thenanerrorwilloccur.Itseems
thatananswerwouldbetoputtheconversionintoatrystatementblockandifthestring
hasadecimalpointthenconvertthestringtofloatwithintheexceptpart.Somethinglike
this:
s=input(“Inputaninteger:“)
try:
k=int(s)
ks=k//2
except:
z=float(s)
k=int(z/2)
print(k)
Iftheusertypes“12”inresponsetotheprompt“Inputan integer:”thentheprogram
prints“6.”Iftheusertypes“12.5”thentheprogramcatchesaValueError,because12.5
isnotalegalinteger.Theexceptpartisexecuted,convertingthenumbertofloatingpoint,
dividingby2,thenfinallyconvertingtoaninteger.
Oneproblemisthattheexceptpartisnotpartofthetry, soerrors thathappen there
willnotbecaught.Imaginethattheusertypes“one”inresponsetotheprompt.Thecallto
int(s)resultsinaValueError,andtheexceptpartisexecuted.Thestatement:
z=float(s)
willresultinanotherValueError.Thisonewillnotbecaughtandtheprogramwillstop
executing,givingamessagelike:
ValueError:couldnotconvertstringtofloat:'one'
s=input(“Inputaninteger:“)
try:
k=int(s)
k=k//2
exceptValueError:
try:
z=float(s)
k=int(z/2)
exceptValueError:
k=0
print(s,k)
5.3 USINGFILESINPYTHON:LESSTHEORY,MORE
PRACTICE
ThegeneralparadigmforreadingandwritingfilesisthesameinPythonasitisinmost
otherlanguages.Thestepsforreadingorwritingafilearethese:
1.Openthefile.Thisinvolvescallingafunction,usuallynamedopen,andpassing
thenameofthefiletobeused.Sometimesthemodeforopeningispassed;thatis,a
filecanbeopenedforinput,output,update(bothinputandoutput)andinbinary
modes.Thefunctionlocatesthefileusingthenameandreturnsavariablethatkeeps
trackofthecurrentstateofinputfromthefile.Aspecialcaseexistsifthereisno
filehavingthegivenname.
2.Readdatafromthefile.Usingthevariablereturnedbyopen,afunctioniscalled
to read data. The function might read a character, or a number, or a line, or the
wholefile.Thefunctionisoftencalledread,andcanbecalledmultipletimes.The
nextcalltoreadwillreadfromwherethelastcallended.Aspecialcaseexistswhen
allofthedatahasbeenreadfromthefile(calledtheendoffilecondition).
OR
3.Writedatatothefile.Usingthevariablereturnedbyopen,afunctioniscalledto
writedatatothefile.Thefunctionmightwriteacharacter,oranumber,oraline,or
manylines.Thefunctionisoftencalledwrite,andcanbecalledmultipletimes.The
nextcalltowritewillcontinuewritingdatafromwherethelastcallended.Writing
datamostfrequentlyappendsdatatotheendofthefile.
4.Closethefile.Closingafileisalsoaccomplishedusingacalltoafunction(yes,it
isusuallynamedclose).Thisfunctionfreesstorageassociatedwiththeinput
processandinsomecasesunlocksthefilesoitcanbeusedbyotherprograms.A
variablereturnedbyopenispassedtoclose,andafterwardsthatvariablecan’tbe
usedforinputanymore.Thefileisnolongeropen.
5.3.1 OpenaFile
Pythonprovidesafunctionnamedopenthatwillopenafileandreturnavaluethatcan
beusedtoreadfromorwritetothefile.Thatvalueactuallyreferstoacomplexcollection
of values that refers to the file status and is called a handle or a file descriptor in the
computingliterature,althoughknowledgeofthedetailsisnotneededtouseit.Itcanbe
thoughtofassomethingoftypefile,andmustbeassignedtoavariableorthefilecan’tbe
accessed. The open function is given the name of the file to be opened and a flag that
indicates whether the file is to be read from or written to. Both of these are strings. A
simpleexampleofacalltoopenis:
infile=open(“datafile.txt”,“r”)
Thiswillopenafilenamed“datafile.txt”thatresidesinthesamedirectoryasdoesthe
Pythonprogram,andopensitforinput:the“r”flagmeansread.Itreturnsthehandleto
thevariableinfile,whichcannowbeusedtoreaddatafromthefile.
Therearesomedetailsthatarecrucial.Thenameofthefileonmostcomputer
systemscanbeapathname,whichistosaythenameincludingalldirectorynamesthat
are used to find it on your computer. For example, on some computers the name
“datafile.txt” might have the complete path name
“C:/parker/introProgramming/chapter05/datafile.txt.”Ifpathnamesareused,thefilecan
beopenedfromanydirectoryonthecomputer.Thisishandyforlargedatasetsthatare
usedbymultipleprograms,suchasnamesofcustomersorsuppliers.
The read flag “r” that is the second parameter is what was called the mode in the
previousdiscussion. The “r” flag meansthat the file will beopen for reading only, and
startsreadingatthebeginningofthefile.Thedefaultistoreadcharactersfromthefile,
whichispresumedtobeatextfile.Openingwiththemode“rb”opensthefileinbinary
format,andallowsreadingnon-textfiles,suchasMP3andvideofiles.
Passingthemode“w”meansthatthefileistobewrittento.Ifthefileexists,thenitwill
beoverwritten;ifnot,thefilewillbecreated.Using“wb”meansthatabinaryfileistobe
written.
Appendmodeisindicatedbythemodeparameter“a,”anditmeansthatthefilewillbe
openedforwriting;ifthefileexists,thenwritingwillbeginattheendoftheexistingfile.
Inotherwords,thefilewillnotstartoverasbeingemptybutwillbeaddedto,attheend
ofthefile.Themode“ab”appendsdatatoabinaryfile.Thereareafewothermodesthat
willbediscussedwhentheyareneeded.
Ifthefiledoesnotexistanditisbeingopenedforinput,thereisaproblem.It’sanerror,
of course; a nonexistent file can’t be read from. There are ways to tell whether a file
exists,andtheerrorcausedbyanonexistentfilecanbecaughtandhandledfromwithin
Python. This involves an exception. It is always a bad idea to assume that everything
works properly, and when dealing with files it is especially important to check for all
likelyproblems.
FileNotFoundExceptions
Theproperwaytoopenafileiswithinatry-exceptpairofstatements.Thiswillensure
thatnonexistentfilesorpermissionerrorsarecaughtratherthancausing
theprogramtoterminate.Thebasicschemeissimple:
try:
infile=open(“datafile.txt”,“r”)
exceptFileNotFoundError:
print(“Thereisnofilenamedꞌdatafile.txt'.Pleasetryagain”)
return#endprogramorabortthis
#sectionofcode
The exception FileNotFoundError will be thrown if the file name can’t be found.
Whattodointhatcasedependsontheprogram:ifthefilenamewastypedinbytheuser,
thenperhapstheyshouldgetanotherchance.Inanycasethefileisnotopenanddatacan’t
beread.
There are multiple versions of Python on computers around the world, and some
versionshavedifferentnamesforthings.TheexampleshereallusePython3.4.Inother
versionstheFileNotFoundErrorexceptionhasanothername;itmaybeIOErrororeven
OSError. The documentation for the version being used should be consulted if a
compilationerroroccurswhenusingexceptionsandsomebuilt-infunctions.Forthe3.4
compilerversion,allthreeseemtoworkwithamissingfile.
All attempts to open a file should take place while catching the FileNotFoundError
exception.
5.3.2 ReadingfromFiles
Afterafileisopenedwithareadmode,thefiledescriptorreturnedcanbeusedtoread
datafromthefile.Usingthevariableinfilereturnedfromthecalltoopen()above,acall
tothemethodread()cangetacharacterfromthefile:
s=infile.read(1)
Readingonecharacteratatimeisalwaysgoodenough,butisinefficient.Ifablockon
diskis512characters(bytes)thenthatshouldbeagoodnumberofbytestoreadatone
time, or a multiple of that. Reading more data than you need and saving it is called
buffering,andbuffersareusedinmanyinstances:livevideoandaudiostreaming,audio
players,andeveninprogramminglanguagecompilers.Theideaistoreadalargerblock
ofdatathanisneededatthemomentandtohanditoutasneeded.Readingabuffercould
bedoneas:
s=infile.read(512)
and then dealing characters from the strings one at a time as needed. A buffer is a
collection of memory locations that is temporary storage for data that was recently on
secondarystore.
Text files, those that contain printable characters that humans can read, are normally
arrangedaslinesseparatedbyacarriagereturnoralinefeedcharacter,somethingusually
calledanewline.Anentirelinecanbereadusingthereadline()function:
s=infile.readline()
Alineisnotusuallyasentence,somanylinesmightbeneededtoreadonesentence,or
perhaps only half of a line. Computer text files are structured so that humans can read
them, but the structure of human language and convention is not understood by the
computer,noritisbuiltintothefilestructure.However,itisnormalforpeopletomake
datafilesthatcontaindataforaparticularitemoreventononeline,followedbydatafor
thenextitem.Ifthisistruethenonecalltoreadline()willreturnalloftheinformationfor
aparticularthing.
EndofFile
Whentherearenomorecharactersinthefile,read()willreturntheemptystring:“”.
Thisiscalledtheendoffilecondition,and it isimportant thatit bedetected.There are
manywaystoopenandreadfiles,butforreadingcharactersinthiswaytheendoffileis
checkedasfollows:
infile=open(“data.txt”,“r”)
whileTrue:
c=infile.read(1)
ifc=='':
print(“Endoffile”)
exit()
else:
c=infile.read(1)
Whenreadingafileinaforstatement,theendoffileishandledautomatically.Inthis
casethelooprunsfromthefirstlinetothefinallineandthenstops.
forcinf:
print(“'”,c,“'”)
Oddlyanexceptioncan’tbeusedinanobviouswayforhandlingtheendoffileonfile
input.However,whenreadingfromtheconsoleusingtheinput()functiontheexception
EOFErrorcanbecaught:
whileTrue:
try:
c=input()
print(c)
exceptEOFError:
print(“Endfile”)
break
There are many errors that could occur for any set of statements. It is possible to
determinewhatspecificexceptionhasbeenthrowninthefollowingmanner:
whileTrue:
try:
c=input()
print(c)
exceptExceptionasx:
print(x)
break
Thiscodeprints“EOFwhenreadingaline”whentheendoffileisencountered.
CommonFileInputOperations
There are a few common ways to use files that should be mentioned as patterns.
Althoughoneshouldneveruseapatternifitisnotunderstood,it’ssometimeshandyto
haveafewsimplesnippetsofcodethatareknowntoperformbasictaskscorrectly.For
example, one common operation to use with files is to read each line from a file,
followedbysomeprocessingstep.Thislookslike:
f=open(“data.txt”,“r”)
forcinf:
print(“'”,c,“'”)
f.close()
Theexpressioncinfresultsinconsecutivelinesbeingreadfromthefilesintoastring
variablec,andthisstopswhennomoredatacanbereadfromthefile.
Anotherwaytodothesamethingwouldbetousethereadline()function:
f=open(“data.txt”,“r”)
c=f.readline()
whilec!='':
print(“'”,c,“'”)
c=f.readline()
f.close()
Inthiscasetheendoffilehastobedeterminedexplicitly,bycheckingthestringvalue
thatwasreadtoseeifitisnull.
Anothercommonfileoperationistocopyafiletoanother,characterbycharacter.A
fileisopenedforinputandanotherforoutput.Thebasic“readafile”patternisused,with
theadditionofafileoutputaftereachcharacterisread:
f=open(“data.txt”,“r”)
g=open(“copy.txt”,“w”)
c=f.read(1)
whilec!='':
g.write(c)
c=f.readline(1)
f.close()
g.close()
Afilterisaprogramthatreadsdatafromafileandconvertsittosomeotherform,then
writesit out. Thisis often done from standard input andoutput, but can be done in the
middleofafilecopy.Forexample,toconvertatextfiletoalllowercase,thepatternabove
isusedwithasmallmodification:
f=open(“data.txt”,“r”)
g=open(“copy.txt”,“w”)
c=f.read(1)
whilec!='':
g.write(c.lower())
c=f.readline(1)
f.close()
g.close()
Thisfiltercanbedoneinlesscodeiftheentirefilecanbereadinatonce.Theread()
functioncanreadalldataintoastring.
f=open(“data.txt”,“r”)
g=open(“copy.txt”,“w”)
c=f.read()
g.write(c.lower())
f.close()
g.close()
Twofilescanbemergedintoasinglefileinmanyways:onefileafteranother,aline
fromonefilefollowedbyalinefromanother,characterbycharacter,andsoon.Asimple
mergingoftwofileswhereoneiscopiedfirstfollowedbytheotheris:
f=open(“data1.txt”,“r”)
outfile=open(“copy.txt”,“w”)
c=f.read()
outfile.write(c)
f.close()
g=open(“data2.txt”,“r”)
c=g.read()
outfile.write(c)
g.close()
outfile.close()
Amorecomplexproblemoccurswhenbothfilesaresortedandaretoremainsorted
afterthemerge.Ifeachlineisinalphabeticalorderineachfilethenmergingthemmeans
readingalinefromeachandwritingtheonethatissmallest.Whenonefileiscomplete,
theremainderofthesecondfileiswrittenandallfilesareclosed.
f=open(“data1.txt”,“r”)
g=open(“data2.txt”,“r”)
outfile=open(“copy.txt”,“w”)
cf=f.readline()
cg=g.readline()
whilecf!=””andcg!=””:
ifcf<cg:
outfile.write(cf)
cf=f.readline()
else:
outfile.write(cg)
cg=g.readline()
ifcf==””:
outfile.write(cg)
cg=g.read()
outfile.write(cg)
else:
outfile.write(cf)
cf=f.read()
outfile.write(cf)
f.close()
g.close()
outfile.close()
Copying the input from console to a file means reading each line using input() and
writingittothefile.Thiscodeassumesthatanemptyinputlineimpliesthatthecopyingis
complete.
outfile=open(“copy.txt”,“w”)
line=input(“!“)
whilelen(line)>1orline[0]!=”!”:
outfile.write(line)
outfile.write(“\n”)
line=input(“!“)
outfile.close()
Theendofthelineisindicatedbyacharacter,whichisrepresentedbythestring“\n.”
Readingcharactersfromafilewillreadtheendoflinecharacteralso,anddetectingitcan
beveryimportant.
f=open(“data.txt”,“r”)
c=f.read(1)
whilec!='':
print(“'”,c,“'”)
c=f.read(1)
ifc=='\n':
print(“Newline”)
CSVFiles
A very common format for storing data is called Comma Separated Variable (CSV)
format,namedforthefactthateachpairofdataitemshaveacommabetweenthem.CSV
filescanbeuseddirectlybyspreadsheetssuchasExcelandbyalargecollectionofdata
analysistools,soitisimportanttobeabletoreadthemcorrectly.
AsimpleCSVfilenamedplanets.txtisprovidedforexperimentingwithreadingCSV
files.ItcontainssomebasicdatafortheplanetsinEarth’ssolarsystem,andwhilethereis
noactualstandardforhowCSVfilesmustlook,thisoneistypicalofwhatisusuallyseen.
Thefirstlineinthefilecontainsheadingsforeachofthevariablesorcolumns,separated
bycommas.Thisisfollowedbyninelinesofdata,oneforeachplanet.It’sasmalldata
fileasthesethingsarecounted,butillustrativeforthepurpose.Hereitis:
Problem: Print the Names of Planets Having Fewer Than
TenMoons
Thisisnotaveryprofoundproblem,andusestherawdataasitappearsonthefile.The
filemustbeopenedandtheneachlineofdataisread,thevalueofthe11thdataelement
(i.e.,index10)retrievedandcomparedagainst10.Iflarger,thenameoftheplanet(index
0)isprinted.Theplanis:
Openthefile
Read(skipover)theheaderline
Foreachplanet
Readalineasstrings
BreaksintocomponentsbasedoncommasgivinglistP
IfP[10]<10printtheplanetname,whichisP[0]
Itisall somethingthat hasbeen donebefore exceptforbreaking thestring intoparts
basedonthecomma.Fortunately,thedesignersofPythonanticipatedthiskindofproblem
and have provided a very useful function: split(). This function breaks up a string into
parts using a specified delimiter character or string and returns a list in which each
componentisonesectionofthefracturedstring.Forexample:
s=“Thisisastring”
z=s.split(”“)
yieldsthelistz=[“This”,“is”,“a”,“string”].Itsplitsthestringsintosubstringsateach
space character. A call like s.split(“,”) should give substrings that are separated by a
comma.Giventheabovesketchandthesplit()function,thecodenowprettymuchwrites
itself.
try:
#Openthefile
infile=open(“planets.txt”,“r”)
#Read(skipover)theheaderline
s=infile.readline()
#Foreachplanet
foriinrange(0,8):
#Readalineasstrings
s=infile.readline()
#BreaksintocomponentsbasedoncommasgivinglistP
P=s.split(“,”)
#IfP[10]<10printtheplanetname,whichisP[0]
ifint(P[10])<10:
print(P[0],”hasfewerthan10moons.”)
exceptFileNotFoundError:
print(“Thereisnofilenamed'planets.txt'.Pleasetryagain”)
Thingstonotice:almosttheentireprogramresideswithinatrystatement,sothatifthe
filedoesnotexist,thenamessagewillbeprintedandtheprogramwillendnormally.Also
notethatP[10]hastobeconvertedintoaninteger,becauseallcomponentsofthelistPare
strings.Stringsarewhathasbeenreadfromthefile.
CSV files are common enough so that Python provides a module for manipulating
them.Themodulecontainsquitealargecollectionofmaterial,andforthepurposesofthe
planets.pyprogramonlythebasicsareneeded.Toavoidthedetailsofageneralpackage,a
simplerversionisincluded withthisbook:simpleCSV has the essentials neededto read
mostCSVfileswhilebeingwritteninsuchawaythatabeginningprogrammershouldbe
abletoreadandunderstandit.
Touseit,thesimpleCSVmoduleisfirstimported.Thismakestwoimportantfunctions
available:nextRecord()andgetData().ThenextRecord()functionreadsoneentireline
ofCSVdata.Itallowsskippinglineswithoutexaminingthemindetail(likeheaders).The
functiongetData()willparseonelineofdata,thelastoneread,intoatuple,eachelement
ofwhichisoneofthecomma-separatedfields.
ThesimpleCSVlibraryneedstobeinthesamedirectoryastheprogramthatusesit,or
beinthestandardPythondirectoryforinstalledmodules.Thesourcecoderesidesonthe
accompanyingdisc
andiscalledsimpleCSV.py.TheprogramabovecanberewrittentousethesimpleCSV
moduleasfollows:
importsimpleCSV
try:#Read(skipover)theheaderline
infile=open(“planets.txt”,“r”)#Openthefile
simpleCSV.nextRecord(infile)#Readtheheader
foriinrange(0,8):#Foreachplanet
simpleCSV.nextRecord(infile)#Readalineand
#collectsubstrings
#inalist
p=simpleCSV.getData(infile)
ifint(P[10])<10:#Ifnumberofmoonsless
#than10
print(P[0],”hasfewerthan10moons.”)
#printtheplanetname
exceptFileNotFoundError:
print(“Thereisnofilenamed'planets.txt'.Pleasetryagain”)
Problem:PlayJeopardyUsingaCSVDataSet
ThetelevisiongameshowJeopardyhasbeenontheairfor35yearsinoneofitstwo
incarnations,andisperhapsthebestknownsuchprogramontelevision.Playersselecta
topicandapointvalueandareaskedatriviaquestionthattheymustanswerintheformof
a question. There are sets of questions that have been used in Jeopardy over the years,
someinCSVform,soitshouldbepossibletostageasimulatedgameusingPythonasthe
moderator.
A simple version of the game could work like this: read a bunch of questions and
answers, and select questions at random to ask. Questions that have single word
unambiguous answers would be best. The player types in an answer and wins if they
answertencorrectlybeforegettingthreewrong.
Asinglelineofdatafromthefilemightlooklikethis:
5957,2010-07-06,Jeopardy!,”LETꞌSBOUNCE”,”$600”,”Inthiskid’sgame,youbounce
asmallrubberballwhilepickingup6-prongedmetalobjects”,”jacks”
Thereare7differentdatafieldshereseparatedbycommas.They are:ShowNumber,
Air Date, Round, Category, Value, Question, and Answer; all are strings, but some
questionsmaycontaincommas.TheCSVmodulecandealwiththat.
Therearemanywaysthatarandomquestioncanbechosen.Onewouldbetoreadallof
thedataintoalist,butthatwouldrequirealotofmemory.Onewaywouldbetorandomly
readaquestionfromthefile,butthatwouldbehardtodobecauseeachlinehasadifferent
length. What could be done relatively easily would be to pick a random number of
questionstoskipoverbeforereadingonetouse.So,selectarandomnumberKbetweenN
andM,readKquestions,andthereadthenextoneandasktheuserthatquestion.When
theendofthefileisreached,itcanbereadagainfromthebeginning.Ifthefileislarge
enoughitwouldbeunlikelytoaskthesamequestiontwiceinashorttimeperiod.
Hereisasketchofhowthismightwork:
Openinfileasthefileofquestionstobeused
Whilegamecontinues:
SelectarandomnumberKbetweenNandM
ForI=NtoM:
Readalinefromthefile
Ifnomorelines:
Closeinfileandreopen
Readaquestionandprintit,asktheuserforananswer
Readtheuser’sanswerfromthekeyboard
Iftheuser’sansweriscorrect:
Countrightanswers
Else:
Countwronganswers
IftheCSVmoduleisused,theparsingoftheinputfileisdealtwith.Whatisnewabout
this? When all of the data in the file has been used, the program may not be complete.
Whatisdonethenisnew:closethefile,reopenit,andstartagainfromthebeginning.This
isanunusualactionforaPythonprogram,butillustratestheflexibilityofthefilesystem.
There is a nested try-except pair, the outer one that checks the existence of the file of
questionsandtheinneronethatchecksfortheendofthefile.Whenthefileisreopeneda
newreaderhastobecreated,becausetheoldoneisconnectedtoaclosedfile.Thefileon
thediskisthesame,butwhenitisopenedagainanewhandleisbuilt;theoldCSVreader
islinkedtotheoldhandle.
The program counts the number of right answers (CORRECT) and the number of
wrongones(INCORRECT).Whenthereare10correctanswersor3incorrectones,the
game is over; a variable again is set to False and the main while loop exits. A break
could have been used, but having the condition become False is the polite way to exit
fromawhileloop.
Theentireprogramlookslikethis:
#Jeopardy!
importsimpleCSV,random
try:
infile=open(“q.txt”,“r”)#Openthefile
simpleCSV.nextRecord(infile)#Read(skipover)the
#headerline
CORRECT=0
INCORRECT=0
again=True
whileagain:
k=random.randint(5,10)#Howmanyquestionsto
#skip?
foriinrange(0,k):
ifnotsimpleCSV.nextRecord(infile):
#Skipthisquestion
infile.close()
print(“Reopening”)
infile=open(“JEOPARDY_small.txt”,“r”)
simpleCSV.nextRecord(infile)
s=simpleCSV.getData(infile)#Readthequestion
#tobeasked.
print(s[5])#Printthequestion
a=input()#Readtheanswer
ifa.lower()==s[6].lower():#Doesplayeranswer
#agree?
CORRECT=CORRECT+1#Yes.countto10.
ifCORRECT>=10:
print(“Youwin!”)
again=False
else:
INCORRECT=INCORRECT+1#No.Countto3
print(“Sorry.Theansweris“,s[6])
ifINCORRECT>12:
print(“Youlose.”)
again=False
exceptFileNotFoundError:
print(“Thereisnoquestionfile.Wecan'tplay.”)
TheWithStatement
Adifficultywiththecodepresentedsofaristhatitdoesnotcleanupafteritself.Afile
shouldbeclosedafterinputfromitoroutputtoitisfinished;noneoftheprogramswritten
so far do that, at least not after the file operations are complete. There has been no
significant discussion of the close() operation, but what it does has been described.
Normallywhenaprogramterminates,itsresourcesarereturnedtothesystem,including
the closing of any open files. Intentionally closing a file is important for three reasons:
first,iftheprogramabortsforsomereason,openfilesshouldbeclosedbythesystembut
may not be, and file problems can be the result. Second, as in the Jeopardy program,
closingafilecanbeusedasastepinre-usingit.Openingitagainstartsreadingitatthe
beginning. Third, closing a file frees its resources. Programs that use many files and/or
manyresourceswillprofitfromfreeingthemwhentheyarenolongerneeded.
The Python with statement, in its simplest form, takes care of many of the details
surroundingfileaccess.Anexampleofitsuseis:
try:
withopen(“planets.txt”)asinfile:#Openthefile
simpleCSV.nextRecord(infile)#Readtheheader
foriinrange(0,9):#Foreachplanet
simpleCSV.nextRecord(infile)#Readaline,
#makealist
P=simpleCSV.getData(infile)
ifint(P[10])<10:#Ifnumberofmoons
#lessthan10
print(P[0],”hasfewerthan10moons.”)
#printthename
exceptFileNotFoundError:
print(“Thereisnofilenamed'planets.txt'.Pleasetryagain”)
Oncethefileisopen,thewithstatementguaranteesthatcertainerrorswillbedealtwith
and the file will be closed. The problem is that the file has to be open first, so the
FileNotFounderrorshouldstillbecaughtasanexception.
5.4 WRITINGTOFILES
Thefirststepinwritingtoafileisopeningit,butthistimeforoutput:
outfile=open(“out.txt”,“w”)
The“w”asthesecondparametertoopen()meanstoopen thefileforwriting. When
writingtoafileitisimportanttonotethatopeningitwillcreateanewfilebydefault.Ifa
filewiththegivennamealreadyexistsitwillberewritten,andthepreviouscontentswill
begone.
Thebasicfileoutputfunctioniswrite();ittakesaparameter,astringtobewrittento
thefile.Itonlywritesstrings,sonumbersandothertypeshavetobeconvertedintostrings
before being written. Also, there is no concept of a line. This function simply moves
characterstoafile,oneatatime,intheordergiven.Inordertowritealine,anendofline
character has to be written. This is usually specified in a string as “\n,” spoken as
“backslashn.”The“n”standsfornewline.
Example:WriteaTableofSquarestoaFile
This will illustrate the typical code involved in writing to a file. The file must be
opened,thenaloopfrom0to25isconstructed.Eachnumberinthatrangeiswrittento
thefile,asisthatnumbermultipliedbyitself.Eachoutputstringrepresentsaline,andso
musthaveanewlinecharacteraddedtotheend.
outfile=open(“out.txt”,“w”)
outfile.write(”XXsquared\n”)
foriinrange(0,25):
sout=”“+str(i)+”“+str(i*i)+”\n”
outfile.write(sout)
outfile.close()
Notethattheintegersareexplicitlyconvertedintostringsandconcatenatedintoaline
tobewritten.Theelementsofthelinecouldbewritteninseparatecallstowrite:
outfile=open(“out.txt”,“w”)
outfile.write(”XXsquared\n”)
foriinrange(0,25):
outfile.write(”“)
outfile.write(str(i))
outfile.write(”“)
outfile.write(str(i*i))
outfile.write(“\n”)
outfile.close()
Theoutputfileisclosedafteralldatahasbeenwritten.
5.4.1 AppendingDatatoaFile
Openingthefilein“w”modestartswritingatthebeginningofthefile,andwillresult
inexistingdatabeinglost.Thisisnotalwaysdesirable.Forexample,whatifalogfileis
beingcreated?Thelogshouldcontainarecordofeverythingthathashappened,notjust
themostrecentthing.
Opening the file in append mode, signified by the parameter “a,” opens the file for
outputandstartswritingattheendofthefileifitalreadyexists.Thismeansthatdatacan
beaddedtotheendofanexistingfile.
Example: Append Another 20 Squares to the Table of
SquaresFile
Thepreviousexamplecreatedafilenamed“out.txt”andwrote26linestoit.Itwasa
tableofsquares,andthefinalonewas24.Thisexamplewillthereforebeginat25andadd
20morevaluestothetable.
Themaindifferenceistheopeningoftheoutputfileinappendmode,andstartingthe
loopat25insteadofat0:
outfile=open(“out.txt”,“a”)
foriinrange(25,45):
sout=”“+str(i)+”“+str(i*i)+”\n”
outfile.write(sout)
outfile.close()
The file “out.txt” will contain the squares of the integer between 0 and 44 inclusive
afterthisprogramruns.
5.5 SUMMARY
Filesarecomputerstructureswithinwhichdataarestored,andalmostalwaysresideon
disk devices, tape devices, or other secondary storage. Files have some common
properties: files have names; files have a size; basic operations on a file are read and
write;filesmustbeopenbeforetheycanbeused;onlyoneprogramatatimecanwriteto
afile.Accesstodataonafileismuchslowerthanaccesstodatainmemory,butfiledata
hastobemovedintomemorybeforeitcanbemanipulated.
Exceptionsareeventsthatoccurwhileaprogramisexecuting,suchasdividingbyzero.
Ratherthancheckforallpossibleexceptionseverytimeastatementisexecuted,Python
providesatry-exceptstatementthatallowstheprogrammertoprovidecodetorunwhen
anerroroccurs.SpecificnamedexceptionsexistinPythonthatcanbespecificallycaught,
likeValueError,orallexceptionscanbecaughtbynotspecifyingaparticularone.
Filesareopenedusingacalltoopenpassingafilenameandamode.Ifthemodeis“r”
then the file will be read from; if it is “w” it will be written to. Example: x =
open(“input.txt”,“r”).Readingfromafilexisaccomplishedbyareadcall:x.read(n)
willreturnastringofncharacters;x.readline()willreturnonelinefromthefilex.When
there are no more characters in the file read() will return the empty string: “”. This is
calledtheendoffilecondition,anditisimportantthatitbedetected.
A CSV (comma separated values) file is a specific format that is common for some
kinds of data, including spreadsheets. The simpleCSV package provided on the
accompanyingdisccanbehelpfulinreadingthesefiles.
Outputtoafilexisdonewithacalltowrite:x.write(s)writesthestringstothefile
representedbyx.Thestring“\n”representstheendofaline.
NOTE ThischapterwillbeextendedinChapter8toexpandthekindoffileoperations
anddatathatcanbereadfromandwrittentoafile.
Exercises
1.Writeaprogramthatreadsafilenamefromtheuser(console)andprintsouthow
manycharactersbelongtothatfile.
2.Writeaprogramthatopensafilecontainingalistoffilenames.Foreachoneprintthe
filenamefollowedbyYESifthatfileexistsinthecurrentdirectoryandNOifitdoes
not.
3.Createafilecopyfacility.Theprogramtobewrittenreadsthenameofafilefromthe
userconsoleandcreatesanotherfilewiththesamecontents.Iftheoriginalfileis
named“xx.txt”thenthenewfilewillbenamed“xx-copy.txt.”Theoriginalfilewill
alwayshaveanameendingin“.txt,”andsowillthecopy.
4.TheCSVfile“avatardata.csv”containssavedinformationconcerningthepreferred
avatarsforplayersofavideogame.Thefieldsare:playercode(integer),avatartype
(string,noquotes),numberoftimesthisavatarwasplayedatthislevel(integer),a
gamelevelreached(integer,outof12),andthehighestscoreachievedonthislevel
(integer);thereisnoheader.Readthisfileanddetermineandprintwhichplayer/avatar
hasthehighestscoreoneachlevel.
5.UsingaPythonprogram,createaCSVfilefrom“avatardata.csv”thatcontainsonly
informationforlevel10.
6.InanHTMLfile(i.e.,awebpage)animagetobedisplayedisusuallyidentifiedina
sourcetagoftheform:src=“name.jpg.”Thequotesareapartofthetag,andthetext
betweenthemisanimagefilename.WriteaprogramthatreadsanHTMLfileand
printsthenamesofalloftheimagefilesthatitreferences.
7.Auserwillspecifythenameofanimagefile(i.e.,afilenamethatendsin“.jpg,”
“.gif,”or“.png”)fromtheconsole.Yourprogramwillreadthisnameandcreate
“disp.html,”anhtmlfilethat,whenopenedbyabrowser,willdisplaythisimage
(requiresaknowledgeofbasicHTML).
8.Twofiles,namedsorted1.txtandsorted2.txt,containnumericdatathatappearinthe
fileinsortedascendingorder(whenlookedatasastring).Mergethesetwofilesto
createasinglefilehavingthedataofboth,alsoinsortedorder.
NotesandOtherResources
PythonCSVLibrary:https://docs.python.org/3/library/csv.html
1.RemziArpaci-DusseauandAndreaArpaci-Dusseau.(2015).OperatingSystems:
ThreeEasyPieces,AmazonDigitalServices,Inc.
2.DanielP.BovetandMarcoCesati.(2005).UnderstandingtheLinuxKernel,
O’ReillyMedia.
3.DominicGiampaolo.(1999).PracticalFilesystemDesign,MorganKaufmann
Publishers,Inc.,http://www.nobius.org/~dbg/practical-file-system-design.pdf,
www.nobius.org/~dbg/practical-file-system-design.pdf
4.RobertStetson.(2013).HowDiskDrivesWork,CreateSpaceIndependentPublishing
Platform.
5.Jeopardyquestionsmaybefoundathttps://docs.google.com/uc?
id=0BwT5wj_P7BKXUl9tOUJWYzVvUjA&export=download
CHAPTER6
CLASSES
6.1ClassesandTypes
6.2ClassesandDataTypes
6.3SubclassesandInheritance
6.4DuckTyping
6.5Summary
Inthischapter
Howmanyjokesbeginwithaphraselike“Amanwalksintoabar”?Somanythatwhen
someonehearsthatphrase,itislikelythattheywillassumeitisajoke.So,toruinthejoke
andspeak philosophically, what is a man,what isa bar, andwhat does walking entail?
Walkingseemstobesomethingthatamancando,anactiontheycanperform.Andabar
isaplacewhereamancanwalk.Canamandoanythingelsebutwalk?Isabartheonly
placeamancanwalkto?
It seems silly to examine a sentence in that way, but in the context of a computer
program it may be more meaningful. Imagine that this discussion involves a computer
game or simulation. A man now represents some kind of thing or object that is
manipulatedbytheprogram.Amanhaspropertiesandthingsitcando,whichistosay
operationsitcanperform.Whatpropertiesdoesamanhave?Well,asasmallsubsetofthe
possibilities:
Property Type
Name String
Sex Boolean
Phonenumber Integer
Height Float
Weight Float
Job String?
Home(location,address) String?
Interests ArrayofString
Income Float
Possessions(otherobjects) Arrayofobject
Spouse person
Children Arrayofperson
Soamanwouldappeartobeacomplexdatatypehavinganumberofproperties.Note
especially that a man can have a property or characteristic called spouse. A spouse is
somethingcalledaperson;soisaman,really.Thisisprettyabstract,butstaywithit:a
manisaperson,andperhapssomeofthecharacteristicsofamanarereallythoseof(i.e.,
inheritedfrom)aperson.Infact, it wouldappearthatmostofthemare.Theonly thing
that distinguishes a man from other persons would (from the list above) be sex, which
wouldbe(perhaps)falseforamanandtrueforawoman,anotherkindofperson.
Imagine that there is a whole class of things called person that have most of these
properties.Amancouldbederivedfromthis,sincemanhasmanyofthesepropertiesin
common.Awomancouldbeanotherclass,perhapshavingafewdifferentproperties.A
mancouldhave,forexample,a“dateoflastprostateexam”asaproperty,butawoman
couldnot.Awomancouldhavea“dateoflastpapsmear,”butamancouldnot.Atsome
point,personhasmanycommon
characteristics,butmanhassomethatwomandoesnotandviceversa.
So,consideringtheoriginalproposition:whatisabar?Itisclearlysomething(object)
thatcanhold(contain)aman.Perhapsitcancontainmanymen.Women?Whynot?Ifa
personhastobeeitheramanorawoman,thenabarcancontainsomenumberofpersons.
Abarisaclassofobjectsthatcanholdorcontainsomenumberofpersons.Itwouldbea
containerclass,onesupposes,oraholderofsomekind.
So,thephrase“Amanwalksintoabar”mightbeexpressedas:
aMan.walksInto(aBar)
whereaManisaparticularman(aspecificinstanceofamanclass)andaBarisaspecific
instanceofaclassofobjectsknownasbar.ThismanhasaName,whichistosaythatone
ofthepropertiesthatamanhasisaName,andthisisreallyjustavariable.Sinceeach
individualmanhasaName,therehastobeawayofgettingat(accessing)eachone.Itis
donethrougheachinstance,likeso:
print(aMan.Name)#Accessing/printingthename.
aMan.Name=“TedSmith”#Assigningtothename.
Using this syntax, the dot (“.”) is placed after the name of the instance. The syntax
“aMan.Name” means “look at the variable aMan, which is an instance of man, for a
propertycalledName.”
Okay, so what is walksInto in the above expression aMan.walksInto(aBar)?
Consideringthesyntaxjustdescribed,itwouldappeartobeafunctionthatwasapartof
thedefinitionofman.Ittakesoneparameter,whichissomethinghavingthetypebar.
Thismayallseemveryabstractstill,butthiswayoflookingatthingsseemssensiblein
thatitappearstoorganizeinformationandprovideaclearandformalwaytoaccessitand
manipulateit.Thisdiscussionhasbeenametaphorfortheconceptofaclassandtheideas
behindobjectorientation, two key elements of modern programming structures. Python
permitstheprogrammertodefineclasseslikethemanorbarobjectspreviouslydescribed
and to use them to encapsulate variables and functions and create convenient modular
constructions.
6.1 CLASSESANDTYPES
A class, in the general sense, is a template for something that involves data and
operations(functions).Anobjectisaninstanceofaclass,aspecificinstantiationofthe
template.DefiningaclassinPythoninvolvesspecifyingaclassnameandacollectionof
variablesandfunctionsthatwillbelongtothatclass.Themanclassthathasbeenreferred
to so far has only a few characteristics that we know about for certain. It does have a
functioncalledwalksInto,asone
example.Afirstdraftofthemanclasscouldbeasfollows:
classman:
defwalksInto(aBar):
#codegoeshere
Afunctionthatbelongstoaclassisgenerallyreferredtoasamethod.Thisterminology
likelyrefersbacktoalanguagedevisedinthe1970snamedSmalltalk.Accordingtothe
standardforthatlanguage,“Amethodconsistsofa
sequence of expressions. Program execution proceeds by sequentially evaluating the
expressions in one or more methods.”In the above example, walksInto is a method;
essentially,amethodisanyfunctionthatispartofaclass.
Classescanhavetheirowndatatoo,whichwouldbevariablesthat‘belong’totheclass
inthattheyexistinsideit.Suchvariablescanbeusedinsidetheclassbutcan’tbeseen
fromoutside.
Lookingcloselyat thesimpleclass manabove, notice that it isactually still a rather
abstractthing.Inthenarrativeaboutamanwalkingintoabaritwasaspecificman,as
indicated by a variable aMan. So it would seem that a class is really a description of
something,andthatexamplesorinstancesshouldbecreatedinordertomakeuseofthat
description.Thisiscorrect.Infact,manyindividualinstancesofanyclasscanbecreated
(instantiated) and assigned to variables. To create a new instance of the class man,the
followingsyntaxcouldbeused:
aMan=man()
Whenthisisdoneallofthevariablesusedinthedefinitionofmanareallocated.Infact,
whenevera new manclass is created, aspecial method thatis local toman is called to
initializevariables.Thismethodistheconstructor,andcantakeparametersthathelpin
the initialization. Creating a man might involve giving him a name, so the instantiation
maybe:
aMan=man(“JimParker”)
Inthiscase theconstructor acceptsa parameter,astring, andprobably assignsit toa
variablelocaltotheclass(Name,mostlikely).Theconstructorisalwaysnamed__init__:
def__init__(self,parameter1,parameter2,…):
Theinitialparameternamedselfisareferencetotheclassbeingdefined.Anyvariable
thatisapartofthisclassisreferredtobyprefixingthevariablenamewith“self.”Tomake
aconstructorformanthatacceptedaname,itwouldlooklikethis:
def__init__(self,name):
self.Name=name
Whenamaniscreated,thestatementwouldbe:
aMan=man(“JimParker”)
This metaphor has fulfilled its purpose for the moment. There are some exercises
concerningitattheendofthechapter,butanothermorepracticalexamplemightbebetter
now.
6.1.1 ThePythonClass–SyntaxandSemantics
ThemanwalksintoabarexampleillustratesmanyaspectsofthePythonclassstructure
but obviously omits many details, especially formal ones that can be so important to a
programmer.Aclasslookslikea function inthatthereisakeyword, class,andaname
andacolon,followedbyanindentedregionofcode.Everythinginthatindentedregion
“belongs”totheclass,andcannotbeusedfromoutsidewithoutusingtheclassnameor
thenameofavariablethatisaninstanceoftheclass.
The method __init__ is used to initialize any variables that belong to the class. Java
wouldcallthismethodaconstructor,andthat’showitwillbereferencedheretoo.Any
variables that belong to the class must be accessed through either an instance (from
outside of the class) or by using the name self (from within the class). So, self.name
wouldrefertoavariablethatwasdefinedinsideoftheclass,whereassimplyusingname
wouldrefertoavariablelocaltoamethod.When__init__iscalled,asetofparameters
canbepassedandusedtoinitializevariablesintheclass.Ifthefirstparameterisself,it
meansthatthemethodcanaccessclass-localvariables;otherwiseitcannot.Normallyself
ispassedto__init__oritcan’tinitializethings.Anyvariableinitializedwithin__init__
andprefixedbyselfisaclass-localvariable.Anymethodthatispassedselfasaparameter
candefinea newclass-local variable,butit makessense toinitializeall ofthemin one
placeifthat’spossible.
Asimpleexampleofaclass,initialization,andmethodis:
classperson:
def__init__(self,name):
self.name=name
defintroduce(self):
print(“Hi,mynameis“,self.name)
me=person(“Jim”)
me.introduce()
This class has two methods, __init__()andintroduce(). After the class is defined, a
variablenamedmeisdefinedandisgivenanewinstanceofthepersonclasshavingthe
name“Jim.”Thenthisvariableisusedtoaccesstheintroducemethod,whichprintsthe
introduction message “Hi, my name is Jim.” A second instance could be created and
assignedtoasecondvariablenamedyouusing:
you=person(“Mike”)
andthemethodcall
you.introduce()
would result in the message “Hi, my name is Mike.” Any number of instances can be
created,andsomehavethesamenameasothers—theyarestilldistinctinstances.
Anewclass-localvariablecanbecreatedbyanymethod.Inintroduce(),forexample,a
newlocalnamedintroductionscanbecreatedsimplybyassigningavaluetoit.
defintroduce(self):
print(“Hi,mynameis“,self.name)
self.introductions=True
ThisvariableisTrue if the method introductions has been called. The main program
canaccessthisvariabledirectly.Ifthemainprogrambecomes:
me=person(“Jim”)
me.introduce()
print(me.introductions)
thentheprogramwillgeneratetheoutput:
Hi,mynameisJim
True
This is the essential information needed to define and use a class in Python. A more
complexexamplewouldbeusefulinseeinghowthesefeaturescanbeusedinpractice.
6.1.2 AReallySimpleClass
Acommonexampleofabasicclassisapoint,aplaceonaplanespecifiedbyxandy
coordinates.Thebeginningofthisclassis:
classpoint:
def__init__(self,x,y):
self.x=x
self.y=y
Thissimplyrepresentsthedataassociatedwithamathematicalpoint.Whatmoredoesit
need?Well,twopointshaveadistancebetweenthem.Adistancemethodcouldbeadded
tothepoint:
defdistance(self,p):
d=(self.x-p.x)*(self.x-p.x)+(self.y-p.y)*
(self.y-p.y)
returnsqrt(d)
If a traditional function were to be used to compute distance, it would be written
similarlybutnotidentically.Itwouldtaketwopointsasparameters:
defdistance(p1,p2):
d=(p1.x-p2.x)*(p1.x-p2.x)+(p1.y-p2.y)*(p1.y-p2.y)
returnsqrt(d)
Thedistance method uses one of the points as a preferred parameter,in a sense. The
distancebetweenpointsp1andp2wouldbecalculatedas:
d=p1.distance(p2)ord=p2.distance(p1)
usingthedistancemethod,butas:
d=distance(p1,p2)
if the function was used. To a degree the difference is a philosophical one. Is distance
somepropertythatapointhasfromanotherpoint(themethod),orisitsomethingthatisa
thingthatiscalculatedfortwothings(thefunction)?Aprogrammerbegins,afterawhile,
toseethemethodsanddataofaclassasbelongingtotheobject,andassomehowbeing
propertiesofit.That’swhatmakesaclassatypedefinition.
Many object-oriented languages offer the concept of accessor methods. Some
languages do not allow variables that belong to a class to be used directly, or allow
specificcontrolsonaccesstothem.Thetruthisthathavingtheabilitytofindthevalueof
variablesandtomodifythemisgenerallyabadidea.Iftheonlyplacethataclasslocal
variable can be modified is within the class, then that limits the places where that can
occur, and allows more control over what is possible. Preventing errors in programs is
partlyamatterofrestrictingactionstoasmallregion,ofknowingexactlywhatisgoingon
atalltimes.
Similarly,ifsomeobjectoutsideofaclasshasaccesstothelocalvariablesofthatclass,
thenitpromotesadependencyonaspecificimplementation,andoneoftheadvantagesof
an object-oriented implementation is that the interface to the class is fixed and
independentofthewaythatclassisimplemented.Itmayseemobviousthatapointobject
has an x, y position and that those would be real numbers, but the point class is the
simplestclass,andtakingadvantageofhowaclassiscodeditnotalwayshealthy.
Allthatanaccessormethoddoesisreturnavalueimportanttoauserofaclass.Thex
and y positions are variables local to the class, and many would agree that they should
haveanaccessormethod:
defgetx(self):
returnself.x
defgety(self):
returnself.y
Rewritingthedistance()methodtouseaccessormethodschangesitonlyslightly:
defdistance(self,p):
d=(self.x-p.getx())*(self.x-p.getx())+
(self.y-p.gety())*(self.y-p.gety())
returnsqrt(d)
Methodscalledmutatorsorsettersareusedtomodifythevalueofavariableinaclass.
They may do more than that, such as checking ranges and types, and tracking
modifications.
defsetx(self,x):
self.x=x
defsety(self,y):
self.y=y
Thereareothermethodsthatcouldbeaddedtoeventhissimpleclassjustincasethey
wereneeded,suchastodrawthepoint,toreturnastringthatdescribestheobject,torotate
abouttheoriginorsomeotherpoint,touseadestructormethodthatiscalledwhenthe
objectisnolongerneeded,andsoon.Untilitisknownwhattheclasswillbeusedfor,
there may not be any value for this effort, but if a class is being provided for general
utility, like the Python string, as much functionality would be provided as the
programmer’s imagination could invent. A draw method could simply print the
coordinates,andcouldbeusefulfor
debugging:
defdraw(self):
print(“(“,self.x,“,”,self.y,“)“)
Usingthisclass involvescreatinginstances and usingthe providedmethods,and that
shouldbeall.Atriangleconsistsofthreepoints.Atriangleclasscouldbedefinedas:
classtriangle:
def__init__(self,p0,p1,p2):
self.v0=p0
self.v1=p1
self.v2=p2
self.x=(p0.getx()+p1.getx()+p2.getx())/3
self.y=(p0.gety()+p1.gety()+p2.gety())/3
defset_vertices(self,p0,p1,p2):
self.v0=p0
self.v1=p1
self.v2=p2
defget_vertices(self):
return((self.v0,self.v1,self.v2))
defgetx(self):
returnself.x
defgety(self):
returnself.y
The (x, y) value of a triangle is its center, or the average value of the x and the y
coordinatesofthevertices.Thesearethebasicmethods.Atriangleislikelytobedrawn
somehow,andthenextchapterwillexplainhowtodothatspecifically.However,without
knowingthedetails,atriangleisasetoflinesdrawnbetweentheverticesandsomightbe
donethatway.Asitis,usingtextonly,itwillprintitsvertices:
defdraw(self):
print(”Triangle:”)
self.v0.draw()
self.v1.draw()
self.v2.draw()
Thetrianglecanbemovedtoanewposition.Achangeinthexandylocationspecifies
thechange,anditisdonebychangingthecoordinatesofeachofthevertices:
defmove(self,dx,dy)
coord=p0.getx()
p0.setx(coord+dx)
coord=p0.gety()
p0.sety(coord+dy)
coord=p1.getx()
p1.setx(coord+dx)
coord=p0.gety()
p1.sety(coord+dy)
coord=p2.getx()
p2.setx(coord+dx)
coord=p2.gety()
p2.sety(coord+dy)
self.x=self.x+dx
self.y=self.y+dy
In this way of expressing things, it is clear that moving the triangle is a matter of
changingthecoordinatesofthevertices.Ifeachpointhadamove()method,thenitwould
beclearer:movingatriangleisamatterofmovingeachofthevertices:
defmove(self,dx,dy):
p0.move(dx,dy)
p1.move(dx,dy)
p2.move(dx,dy)
self.x=self.x+dx
self.y=self.y+dy
Whichofthesetwomove()methodsseemsthebestdescriptionofwhatishappening?
Themorecomplexaretheclasses,themorevaluethereisinmakinganefforttodesign
themtoeffectivelycommunicatetheirbehaviorsandtomakethingseasiertoexpandand
modify. It is also plain that the move() method for a point is simpler than that for a
triangle.Thatfactisinvisiblefromoutsidetheclass,anditactuallynotrelevant.
6.1.3 Encapsulation
Intheexampleofthepointclassthereisnoactualneedforanaccessormethod,because
thevariablescanbeaccessedfromoutsidetheclass,inspiteoftheargumentsthathave
beengivenformorecontrolleduseofthesevariables.
A careful programmer would want to ensure the integrity of classes by forcing the
variablestoremainprotectedinsomeway,andPythonallowsthiswhilenotrequiringit.
Thevariablesxandyareaccessibleandmodifiablefromoutsidebecauseofhowthey
arenamed.Anyvariable nameina classthatbeginswith anunderscorecharacter (“_”)
cannotbemodifiedbycodethatdoesnotbelongtotheclass.Suchavariableissaidtobe
protected.Avariablenamethatbeginswithtwounderscorecharacterscan’tbemodified
orevenexaminedfromoutsideoftheclass,andissaidtobeprivate.Allothervariables
arepublic.Thisappliestomethodnamestoo,sothemethod__init__()thatistheusually
constructorisprivate.
Rewritingthepointclasstomaketheinternalvariablesprivatewouldbedonelikethis:
classpoint:
def__init__(self,x,y):
self.__x=x
self.__y=y
defgetx(self):
returnself.__x
defgety(self):
returnself.__y
detsetx(self,x):
self.__x=x
defsety(self,y):
self.__yy=y
defdistance(self,p):
d=(self.__x-p.getx())*(self.__x-p.getx())+
(self.__y-p.gety())*(self.__y-p.gety())
returnsqrt(d)
defmove(self,dx,dy):
self.__x=self.__x+dx
self.__y=self.__y+dy
defdraw(self):
print(“(“,self.__x,“,”,self.__y,“)“)
Now the internal variables x and y can’t be modified or even have their values
examinedunlessexplicitlyallowedbyamethod.
6.2 CLASSESANDDATATYPES
Consideraninteger. How can it be described so that a person who has not used one
beforecanimplementsomethingthatlooksandactslikeaninteger?Thisisaspecificcase
of the general problem faced when using computers—to describe a problem in enough
detailsothatamachinecansolveit.Thedefinitioncouldstartwiththeideathatintegers
canbeusedforcountingthings.Theyarenumbersthathavenofractionalpart,andthat
havebeenextendedsothattheycanbepositiveornegative.
What can be done with them? Integers can be added and subtracted, multiplied and
divided.Whendividingtwointegerstherecanbeanintegerremainderleftover.Theycan
bedisplayedinmanyforms:asbase10numbers,inanyotherbase,asRomannumerals,
andsoon.Thereareotheroperationsonintegers,butthesearethemostcommonlyused
ones.
What has been done here is to define a type. Python types, the ones built into the
language,includetheintegertype,aswellasfloatingpointnumbers,strings,andsoon.
Each is characterized by an underlying implementation, which is often hidden from the
programmer,andasetofoperationsthataredefinedonthingsofthattype.Thisisafair
definitionofatypeingeneral.Aclasscanbeusedtoimplementatype—notoneofthe
typesthatthelanguagealreadyefficientlyprovides,butnewtypesthatprogrammersfind
usefulfortheirpurposes.Themanandpersonclassesdescribedearliercanbethoughtof
astypes.
Whendesigningprogramsthatuseclasses,itislikelythattheclassesrepresenttypes,
althoughtheymaynotbecompletelyimplemented.Thedesignschemewouldbetosketch
ahigh-levelsolutionandobservewhatcomponentsofthatsolutionlookandbehavelike
types.Those components can be implementedas classes. The remainder ofthe solution
willhavestructureimposedonitbyvirtueofthefactthattheseothertypesexistandare
defined to be used in specific ways. Types can hide their implementation, for example.
Theunderlyingnatureofanintegerprobablydoesnotmattermuchtoaprogrammermost
times,andsocanbehiddenbehindtheclassboundary.Thishastheaddedfeaturethatit
encourages portability: if the implementation has to change, the class can be rewritten
whileprovidingthesameinterfacetotheprogrammer.
Theoperationson the typeareimplemented asmethods.The methodscanaccess the
internal structure of the class while providing the desired view ofthe data and ways of
manipulatingit.Ifaclassnamedintegerexisted,forexample,thenadd(),subtract(),and
soonwouldbemethods.Theninstancesofthisclasscouldbeimplemented:
a=integer(5)
b=integer(21)
andcomputingasumwouldbe:
c=a.add(b)
Theunderlyingrepresentationofanintegerisunknowntoauserofthisclass.Allthatis
known is the interface, described as methods. If the interface is well-documented, then
that’s all a programmer needs to know. In fact, exposing too much of the class to a
programmercancompromiseit.
6.2.1 Example:ADeckofCards
Traditionalplayingcardsthesedayshaveredandblackcolors,foursuits,andatotalof
52cards,13ineachsuit.Individualcardsarecomponentsofadeck,andcanbesorted:a2
islessthana3,aJacklessthanaKing,andsoon.TheAceisaproblem:sometimesitis
thehighcard,sometimesthelowcard.Acardwouldpossessthecharacteristicsofsuitand
value.Whenplayingcardgames,cardsaredealtfromthedeckintohandsofsomenumber
ofcards:13cardsforbridge,
5formostpokergames,andsoon.Thevalueofacardusuallymatters.Sometimes
cards are compared against each other (poker), sometimes the sum of values is key
(blackjack,cribbage),andsometimesthesuitmatters.Theseusesofadeckofcardscanbe
usedtodefinehowclasseswillbecreatedtoimplementcardgamesonacomputer.
Operationsonacardcouldincludetoviewit(itcouldbefaceuporfacedown)andto
compare it against another card. Comparison operations could include a set of complex
specificationstoallowforacesbeinghighorlowandforsomecardshavingspecialvalues
(spades,baccarat)soadefinitionstepmightbeveryimportant.
Adeckisacollectionofcards.Thereareusuallyoneofeachcardinadeck,butinsome
places(e.g.,LasVegas) therecouldbe fourormore complete decksusedwhen playing
blackjack.Operationsonadeckwouldincludetoshuffle,toreplacetheentiredeck,andto
deal a card or a hand. With these things in mind, a draft of some Python classes for
implementingacarddeckcanbecreated:
Thewaythatthemethodsareimplementeddependsontheunderlyingrepresentation.
Whentheprogrammercallsdeal()theyexpectthemethodtoreturnacard,whichisan
instanceofthecardclass.Howthathappensisnotrelevanttothem,butitisrelevantto
the person who implements the class. In addition, how it happens may be different on
differentcomputers,andaslongastheresultisthesameitdoesnotmatter.
Forexample,acardcouldbeaconstantvaluerthatrepresentedoneofthe52cardsin
the deck. The class could contain a set of values for these cards and provide them to
programmersasareference:
classcard:
CLUBS_1=1
DIAMONDS_1=2
…
HEARTS_ACE=51
SPADES_ACE=52
Def__init__(self,face,suit):
…
ThevariablesCLUBS_1,DIAMONDS_1,andsoonareaccessibleinallinstancesof
the card class and have the appropriate value. Variables defined in this way have one
instanceonly,andaresharedbyallinstances.
Asecondimplementationcouldbeasatuple.Theaceofclubswouldbe(Clubs,1),for
instance.Eachhasadvantages,butthesewillnotbeapparenttotheuseroftheclass.For
example, the tuple implementation makes it easier todetermine the suit of a card. This
matterstogamesthathavetrumpsuits.Theintegervalueimplementationmakesiteasier
todeterminevaluesanddospecificcomparisons.Thevalueofacardcouldbestoredina
tuplenamedranks,forexample,andranks[r]wouldbeanumericalvalueassociatedwith
thespecificcard.
6.2.2 ABouncingBall
Animations and computer simulations see the world as a set of samples captured at
discretetimes.Ananimation,forexample,isasetofimagesofsomescenetakenatfixed
timeintervals,generally1/24thofasecondor1/30thofasecond. Simulationsuse time
intervalsthatare appropriate to the thingbeing simulated.This exampleis asimulation
andanimationofabouncingball,firstinonedimensionandthenintwodimensions.
Aballdroppedfromaheighthfallstothegroundwhenreleased.Itsspeedincreasesas
itfalls,becauseitisbeingpulleddownwardsbygravity.Thebasicequationgoverningits
movementis:
s=1/2at2+v0t(6.1)
wheresisthedistancefallenattimet,v0isthevelocitytheobjecthadattimet=0,anda
isthevalueoftheacceleration.Foranobjectattheearth’ssurface,thevalueofais32
feet/second2=9.8meters/second2.Foraballbeingdropped,v0is0,sinceitisstationary
initially.So,thedistancesatsuccessivetimeintervalsof0.5secondswouldbe:
Aclasscouldbemadethatwouldrepresentaball.Itwouldhaveapositionandaspeed
at any given time, and could even be drawn on a computer screen. Making it bounce
wouldbeamatterofgivingtheballavaluethatindicatedhowmuchofitsenergywould
belosteachtimeitbounced,meaningthatitwouldeventuallystopmoving.Writingthe
codefortheclassBallcouldbeginwiththeinitialization(theconstructor):
classBall:
def__init__(self,height,elasticity):
self.height=height
self.e=e
self.speed=0.0
self.a=32.0
Thiscreatesandinitializesfourvariablesnamedheight,e,a,andspeedthatarelocalto
the class. Remember, the parameter selfrefers to the classitself, and any variable that
beginswith‘self.’isapartoftheclass.Avariablewithinthefunction__init__thatdidnot
beginwith‘self.’andwasnotglobalwouldbelongtothefunction,andwouldbecreated
anddestroyedeachtimethatfunctionwascalled.
Amethod(function)thatcalculatestheheightoftheballataspecifictimeissomething
elsethattheBallclassshouldprovide.Thisissimplythevalueoftheclasslocalvariable
height,so:
defheight(self):
returnself.height
The self parameter has to be passed, otherwise the function can’t access the local
variableheight.Thesimulationwillneedvaluesofheightasafunctionoftime,andtime
willincreaseindiscretechunks.Thiscouldbeimplementedinacoupleofways:theclass
couldkeeptrackofthetimesinceitwasdropped,oritcouldusethetimeincrementto
determine the next speed and position. If the former then a new class variable must be
usedtostorethetime;ifthelatterthenameanshastobefoundtoincrementthespeed
ratherthanusingtotalduration.Thissecondideaissimplerthanitsounds.Theequationof
motion s = 1/2at2 + v0t can use a time increment in place of t, and v0 would be the
velocityatthestartofthetimeinterval;thisyieldsthenewposition.Thenewvelocitycan
befoundfromarelatedequationofmotion,whichis:
v=at+v0(6.2)
wheretisagainthetimeincrementandv0isthespeedatthebeginningoftheinterval.
Thefunctionthatupdatesthespeedandpositioninthismannerwillbecalleddelta:
defdelta(self,dt):
s=0.5*self.a*dt*dt+self.speed*dt
height=height-s
self.speed=self.speed+self.a*dt
Heretheparameterdtisthetimeinterval,andsothatcanbevariedbytheprogrammer
togetpositionvaluesatvariousresolutions.
FornowthiswillbetheBallclass.Somecodeisneededtotestthisclassandshowhow
well(orwhether)itworks,andthiswillbethemainpartoftheprogram.Aninstanceof
Ball has to be created and then the delta method will be called repeatedly at time
incrementsof,foranexample,0.1seconds.Atableofheightandtimecanbeconstructed
inthis way, and itis a simplematter to see whether the numbers arecorrect. The main
programis:
b=Ball(12.0,0.5)
foriinrange(0,20):
b.delta(0.1)
print(“Attime“,i*0.1,”theballhasfallento”,b.height(),”Feet”)
Theresultsarewhatshouldbeexpected,showingthatthisclassfunctionscorrectly:
Attime0.0theballhasfallento12.0Feet
…
Attime0.5theballhasfallento7.999999999999997Feet
…
Attime1.0theballhasfallento-4.0000000000000036Feet
…
Attime1.5theballhasfallento-24.000000000000004Feet
…
Attime2.0theballhasfallento-52.000000000000014Feet
…
Attime2.5theballhasfallento-88.00000000000003Feet
…
Becausetheinitialheightwas12feet,thedistancefallenis12minusthevaluegiven
above,so:4,16,36,64,and100feet,whichisinagreementwiththeinitialtableforthe
timeslisted.Itappearstoworkcorrectly.
Thiscodedoesnotyetdothebounce,though.Whentheheightreaches0theballisat
groundlevel.Itshouldthenbounce,whichistosaybeginmovinginthereversedirection,
withaspeedequaltoitsformerdownwardspeedmultipliedbytheelasticityvalue.This
doesnotseemhardtodountilitisrealizedthattheballisnotlikelytoreachaheightof0
exactlyatatimeincrementboundary.Atonepointtheballwillbeabove0andthenafter
the next time unit the ball will be below 0. When does it actually hit the ground, and
where will be the ball actually be at the end of the time increment? This is not a
programmingissuesomuchasanalgorithmicormathematicalone,butisadetailthatis
importanttothecorrectnessoftheresults.
Itseemsclearthatthebouncecomputationshouldbeperformedinthemethoddelta().
Theheightvalueintheclassbeginsatapositivevalueanddecreasestowards0astheball
falls. During some specific call to delta(), the ball will have a positive height at the
beginningofthecallandanegativeoneattheend;thismeansabounceshouldhappen.At
thattimetheheightoftheballwillbenegative.Theheightofthebouncedballattheend
of the time interval will be the negated value of the height (i.e., so it is positive again)
multipliedbytheelasticity.
Thespeedthatshouldbeusedinthebounceisbasednotthefinalspeedbutthespeed
theballwastravelingatthetimewhentheheightwas0.Thishappenswhenself.height-s
iszero,orwhen:
self.height=s=0.5*self.a*dt*dt+self.speed*dt
Solve this for the time xt that makes the equation work out, which would be the
standardsolutiontoaquadraticequationthatistaughtinhighschool:
Thevalueofxtwillbebetween0anddt,andisthetimewithintheincrementatwhich
theballstrucktheground.Atthistimetheballwillbemovingwithspeed(self.speed+
self.a*xt) instead of (self.speed + self.a*dt) for a normal time interval. The ball will
reverse direction and reduce speed by the value of elasticity. Now the ball is moving
upwards.
The ball will be slowed by gravity until it stops on its upward path and drops down
again.Atthetopofthepathitsspeedwillbe0;atthebeginningofthetimeintervalthe
speed will be negative and at the end it will be positive, and that’s how the peak is
detected.Thissituationismuchsimplerthanthebounce.
Theannotatedprogramisasfollows:
#Ball.py
importmath
classBall:
#Constructor/initializer
def__init__(self,height,elasticity):
self.height=height#Currentheightoftheball
self.e=elasticity#Howmuchenergyisretainedeachbounce
self.speed=0.0#Currentspeedoftheball,
#initially0,down+
self.a=32.0#Acceleration:G=32ft/sec^2
#WhatJavawouldcallanaccessor:notreallyneeded.
defgetHeight(self):
returnself.height
#Calculatethenewheightandspeedforachangeintimeofdtseconds.
defdelta(self,dt):
startHeight=self.height#Rememberthestate
#beforedt
startSpeed=self.speed
s=0.5*self.a*dt*dt+self.speed*dt#Equation1:
#positionupdate
self.height=self.height-s
self.speed=self.speed+self.a*dt#Equation2:Speed
#update
ifself.height<0:#Thesignchanged;bounce,when?
#Equation3:Solvethequadraticequationtofindthetime
#ofbounce
xt=(-startSpeed-math.sqrt(startSpeed*startSpeed+2*self.a*startHeight))/self.a
ifxt<0:
xt=(-startSpeed+math.sqrt(startSpeed*startSpeed+2*self.a*startHeight))/self.a
print(“Bouncesattime“,xt)
#Equation2withelasticity
self.speed=-(self.speed+self.a*xt)*self.e
self.height=-self.height*self.e#Correctthe
#height
ifself.e<0.03:self.e=0.0
else:self.e=self.e-0.03
#Peakoftheupwardbounce,velocitychangessignfrom+to-
elifstartSpeed*self.speed<0:#Ifsigndiffersthen
#theproductis-ve
self.speed=0#Speedis0atthetop
#ofthebounce
print(“Peak”)
print(”Newspeedis“,self.speed,”andheightstarts
at“,self.height)
ifself.height<0.:
self.height=0.
b=Ball(12.0,0.5)#Initialheight12feet,elasticityis0.5
s=Screen(20,40)
foriinrange(0,50):
b.delta(0.1)#Timeincrementis0.1seconds
Howcanthisprogrambeeffectivelytested?Thecomputedvaluescouldbecompared
againsthandcalculations,butthisistime-consuming.Itwasdoneforafewcasesandthe
simulation was accurate. For this example, another program was written in a different
programminglanguagetocalculatethesamevalues,andtheresultfromthetwoprograms
wascompared—theywerenearlyexactlythesame.Thisisnotdefinitive,butiscertainlya
good indication that this simulation is working properly. In both programs similar
approximationsweremade,andthenumbersagreedtosevendecimalplaces.
Thisclasswillbeexpandedlatertoincludeananimationofthebouncingball.
6.2.3 Cat-A-Pult
Early in the development of personal computers, a simple game was created that
involved shooting cannons. The player would set an angle and a power level and a
cannonballwouldbefiredtowardstheopposingcannon.Iftheballstruckthecannonthen
it would be destroyed, but if not then the opposing player (or the computer) would fire
back at the player’s cannon. This process would continue until one or the other cannon
was destroyed. This game evolved with time, having more complex graphics,
mountainous terrain, and more complex aspects. Its influence can be seen in modern
gameslikeAngryBirds.
A variation of this game is proposed asan example of how classes can be used. The
basicideaistoeliminateamousethatiseatingyourgardenbyfiringcatsatit;hencethe
namecat-a-pult.Thegamewillusetextasinputandoutput,becausenographicsfacility
isavailableyet.Aplayertypestheangleandthepowerlevelandthecomputerwillfirea
catatthemouse.Thelocationwherethecatlandswillbemarkedonasimplecharacter
display, and the player can try again. The goal is to hit the mouse with as few tries as
possible.
BasicDesign
Beforewritinganycode,oneneedstoconsidertheitemsinthisgameandtheactions
theycantake.Theitemswillbeclasses,theactionswillbemethods.Thereseemtobetwo
items:acannonball(acat)andacannon.Thetarget(themouse)couldbeaclasstoo.The
cannonhasalocation,anangle,andapowerorforcewithwhichthecannonballwillbe
ejected. Both of the last two factors affect the distance the ball travels. The cannon is
givenatargetasaparameter—inthisexamplethetargetwillbeanothercannon,basically
toavoidmakingyetanotherclassdefinition.
The action a cannon can perform is to be fired. This involves releasing a cannonball
with a particular speed and direction from the location of the cannon. In this
implementation an instance of the cannonball class will be created when the cannon is
firedandwillbegiventheangleandvelocityasinitialparameters;theballwill,fromthen
on, be independent. As a class, the ball has a position (x, y) and a speed (dx, dy). The
actionthatitcanperformistomove,whichwillbeaccomplishedusingamethodnamed
step(),andtocollidewithsomething,
accomplishedbythemethodtestCollision().
DetailedDesign
Inthemetaphorofthisgame,thecannonballisactuallyacatandthetargetisamouse,
buttotheprogramthesedetailsarenotimportant.Here’swhatisimportant:
ClassCannonClassBall
Has:positionx,ypositionx,y
angle(whenfired)speeddx,dy
power(whenfired)name(text)
target(anothercannon)target(aCannonclassinstance)
ballgravity(forcechangingtheheight)
Does:firestep
steptestforcollision
AlloftheHasaspectsareclasslocalvariables,andinthisdesigntheywillbeinitialized
withinthe__init__methodofeachclass.Thiswouldentailthefollowing:
self.x=xself.x=x
self.y=yself.y=y
self.power=0self.dx=dx
self.angle=0self.dy=dy
self.target=targetself.target=target
self.ball=Noneself.gravity=1.0
self.name=””
The game is essentially one-dimensional. The cannonball will land at a specific x
coordinate,andifthatisnearenoughtothexcoordinateofthetarget,thenthetargetis
destroyed and the game is over. Without a way to draw proper graphics, this can be
imaginedasasimpletextdisplaywiththecannonononesideofthescreenandthetarget
ontheother,somethinglikethatseeninFigure6.1.
Theslashcharacter(“/”)ontheleftrepresentsthecannon,andthe“Y”representsthe
mouse,whichisthetarget.Thecannonisathorizontalcoordinate12,andthemouseisat
60;bothverticalcoordinatesare0.
AlloftheDoesaspectsrepresentactions,orthingstheclassobjectcando.Whenthe
cannonisfiredtheballiscreatedatthecannoncoordinates(12,0)andisgivenaspeed
that is related to the angle and power level using the usual trigonometric calculations
learnedinhighschool(Figure6.2):
dy=sin(angle*3.1415/180.0)
dx=cos(angle*3.1415/180.0)
The angles passed to sin and cos must be in radians, so the value PI/180 is used to
convert degrees into radians. The coordinates in this case have y increasing as the ball
moves upwards. So, when the cannon is fired a ball is created that has the x and y
coordinates of the cannon and the dx and dy values determined as above. This is
accomplishedbyamethodnamedfire():
Fire:takesanangleandapower.
Angleisindegrees,between0and360
Powerisbetween0and100(apercentage)
1.Computevaluesfordxanddyfromangleandpower,wheremaxpoweris0.1
2.Createaninstanceof Ballgivingitx,y, dx,dy,a name(“cat”),andatarget(the
mouse)
Thesimulationmakestimestepsofafixeddurationandcalculatespositionsofobjects
at the end of that step. Each object should have a method that updates the time by one
interval, and it will be named step(). The cannon does not move, but sometimes has a
cannonballthatithasfired,soupdatingthestatusofthecannonshouldupdatethestatus
oftheballaswell:
Step:makeone-timestepforthisobjectinthesimulation.Noparameter.
1.Ifaballhasbeenfired,thenupdateitsposition.Thisisdonebycallingthestep()
methodoftheball.
Thisdefinesthecannon.
Theballmustalsopossessastep()method,anditwillupdatetheball’spositionbased
onitscurrentspeedandlocation.Thexpositionisincreasedbydx,andtheyisincreased
bydy.Gravitypullsdownontheball,effectivelydecreasingtheverticalspeedoftheball
duringeachinterval.After sometrialsit wasdeterminedthatthe valueofdy should be
decreasedbythe valueof gravity during each interval. If the ball strikes the ground, it
shouldstopmoving.Whendoesthishappen?Whenybecomessmallerthan0.Whenthis
occurs,setdxanddyto0,andchecktoseeiftheimpactlocationisneartothetarget.
Step:makeaone-timestepforthisobjectinthesimulation.Noparameter.
1.Letx=x+dx,changingthexposition
2.Lety=y+dy,changingtheyposition
3.Decreasedybygravity(dy=dy-gravity)
4.Iftheballhasstrucktheground
5.Letdx=dy=gravity=0
6.Checkforcollisionwithtarget
Checkingtoseeiftheballhitthetargetisamatteroflookingatthexvalueoftheball
andthexvalueofthetarget.Ifthedifferenceissmallerthansomepredefinedvalue,say
1.0, then the target was hit. This is determined by a method that will be called
testCollision().Ifthecollisionoccurredthensuccesshasbeenachievedbytheplayer,so
setaflagthatwillendthegame.
testCollision:checktoseeiftheballhashitthetargetand,ifso,setaflag:
1.Subtractthexpositionoftheballfromthexpositionofthetarget.Callthisd.
2.Ifd<=1.0thensetaflagdonetoTrue.
ThisdefinestheclassBallandcompletesthetwomajorclasses.
Themainprogramthatusestheseclassescouldlooksomethinglikethis:
mouse=Cannon(60,0,None)#Createthetarget
player=Cannon(12,0,mouse)#createthecannon
player.fire(42,65)#Example:firecannonat
#42degrees65%power
done=False#initializevariable
#ꞌdoneꞌ
whilenotdone:#solongasthesimulation
#isnotover
player.step()#Updatethepositionof
#theball.
ActualcodeformostofthisexampleisshowninFigure6.4,andtheentireprogramis
ontheaccompanyingdisc.Includedinthediscversionisanextraclassthatdrawseach
stateofthegameascharactergraphicsthatcanbedisplayedinthePythonoutputwindow;
theexampleinthefiguredoesnotincludeanyoutput,andisunsatisfyingtoexecute.The
program on the disc will generate a numeric and graphical representation of the state,
showingtheaxes,thecannon,theball,andthetargetaftereachstep.Thesecanbemade
into distinct text files and can be made into an animation using MovieMaker on a
WindowscomputerorFinalCutonaMac.Suchananimationisalsoincludedonthedisc,
andisnamedcatapult.mp4.
The process by which Cat-a-pult was designed and coded loosely defines a way to
designandcodealmostanyprogramthatusesclasses.
6.3 SUBCLASSESANDINHERITANCE
Classesaredesignedaslanguagefeaturesthatcanrepresentahierarchyofinformation
orstructure.Aclasscanbeusedtodefineanother,andpropertiesfromthefirstclasswill
bepassedon(inherited)bytheother.Aclassthatisbasedonanotherinthiswayiscalled
asubclass,andexplanatoryexamplessuffusetheInternet:apetclasswithdogsandcatsas
special cases; a polygon having triangles and rectangles as subclasses; a dessert class,
havingsubclassespie,cake,andcookie;eventheinitialexampleinthischapterofaman
and a woman class and the person class that they can be derived from. A subclass is a
morespecificcaseofthesuperclass(orparentclass)onwhichitisbased.
The examples above are for explanation, and are not really useful as software
components, which begs a question about whether subclasses are really useful things.
Theyare,butitrequiresnon-trivialexamplestoreallydemonstratethis.
6.3.1 Non-TrivialExample:ObjectsinaVideoGame
Tosomedegreeallobjectsinagamehavesomethingsincommon.Theyarethingsthat
can interact with other game objects; they have a position within the volume of space
definedbythegame,andtheyhaveavisualappearance.Thus,adescriptionofaclassthat
couldimplementagameobjectwouldinclude:
classgobject:
position=(0,0,0)#Objectpositionin3D
visual=None#Graphicsthatrepresent
#theobject
def__init__(self,pos,vis)
defgetPosition(self):
defsetPosition(self,p):
defsetVisual(self,v):
defdraw(self):
Anyonewhohasplayedavideogameknowsthatsomeoftheobjectscanmovewhile
otherscannot.Objectsthatmovecanhavetheirpositionchange,andthepositionhastobe
updatedregularly.Anobjectthatcanmovecanhaveaspeedandamethodthatupdates
theirposition;otherwiseitislikeagobject.Thisisagoodcaseforasubclass:
classmobject(gobject):
speed=(0,0,0)#Speedinpixelsperframethe
#x,y,zdirections
def__init__(self,s)
defgetSpeed(self):
defsetSpeed(self,s):
defmove(self):
defcollision(self,gobject):
Thesyntaxofthishasthesuperclassgobjectasaparameter(apparently)ofthesubclass
mobjectbeingdefined.Ifaninstanceofagobjectiscreated,its__init__methodiscalled
andtheresultingreferencehasaccesstoallofthemethodsinthegobjectdefinition,just
asonewouldexpect.Ifaninstanceofmobjectiscreated,the__init__methodofmobject
iscalled,butnotthatofgobject.Nonetheless,allpropertiesandmethodsofbothclasses
areavailablethroughthemobjectreference;thatis,thefollowingislegal:
m=mobject((12,0,0))#Creatembojectwithspeed
#(12,0,0)
m.draw()#Drawthisobject
eventhoughanmobjectdoesnotpossessamethoddraw();themethoddefinedinthe
parent class is accessible and will be used. When the mobject is created it is also a
gobject, and all of the variables and methods belonging to a gobject are defined also.
However,the__init__()methodforgobjectisnot calledunlessthe mobject__init__()
methoddoesso.Thismeansthat,forthemobject,thevaluesofpositionandvisualare
not specified by the constructor and will take the default values they were given in the
gobjectclass.Ifnosuchvaluewasgiven,theywillbeundefinedandanerrorwilloccurif
theyarereferenced.
Callingthe__init__()methodoftheparentclasscanbedoneasfollows:
super().__init__((10,10,10),None)
In this instance the constructor for gobject is called, passing a position and a visual.
Thiswouldnormallybedoneonlyinthe__init__()ofthesubclass.
Nowconsiderthefollowingcode.Themethodsaremainlystubsthatprintamessage,
buttheoutputoftheprogramisinstructive:
Outputfromthisis:
gobjectinitfromthecreationofthegobjectinstanceg
gobjectinitwhenmiscreateditcallstheparent__init__
mobjectinitfromthemobject__init__whenmiscreated
(10,10,10)m.getPosition,showingaccesstoparent
methods
Movem.movecall
Drawm.drawcall,againshowingaccesstoparentmethod
Attemptingtocallg.move()wouldfailbecausethereisnomove()methodwithinthe
gobjectclass.Hence,ifanobjectwaspassedtoafunctionthatwouldattempttomoveit,
itwouldbecriticaltoknowwhethertheparameterpassedwasagobjectoranmobject.
Consideramethodthatmovesanobjectxoutofthepathofanmobjectinstanceifitcan,
orchangesthepathofthemobjectifitcannot.Thismethod,nameddodge(),mightdothe
following:
defdodgeself,(x):
c=x.getPosition()
c=c+(dx,dy,0)
x.setPosition(c)
However,iftheparameterisaninstanceofagobject,thenitshouldnotbemoved.The
functionisinstance()canbeusedtodeterminethis.Theresultof:
isinstance(x,gobject)
willbeTrueifxisagobjectandFalseotherwise.IfFalse,thenitcan’tbemovedandthe
dodge()methodwillhavetomovethecurrentmobjectoutofthewayinstead:
defdodgeself,(x):
ifisinstance(x,gobject):
self.position=self.position+(dx,dy,0)
else:
c=x.getPosition()
c=c+(dx,dy,0)
x.setPosition(c)
6.4 DUCKTYPING
In many programming languages, types are immutable and compatibility is enforced.
ThisisnotgenerallytrueinPython,butstillthereareoperationsthat
requirespecifictypes.Indexingintoastringortuplemustbedoneusingsomething
muchlikeaninteger,andnotbyusingafloat.Nowthatclassescanbeusedtobuildwhat
amountstonewtypes,moreattentionshouldbepaidtothethingsatypeshouldofferand
therequirementsthisputsonaprogrammer.APythonphilosophycouldbethatthefewer
restrictionsthebetter,andthisisaprincipleofducktypingaswell.
Itshouldnotreallymatterwhattheexacttypeoftheobjectisthatisbeingmanipulated,
onlythatitpossesses thepropertiesthatare needed.Inavery simplecase,considerthe
classes point and triangle that were discussed at the beginning of this chapter. It was
proposed that both could have a draw() method that would create a graphical
representationoftheseonthescreen,andbothhaveamove()methodaswell.Afunction
couldbewrittenthatwouldmoveatriangleawayfromapointanddrawthemboth:
defmoveaway(a,b)
dx=a.getx()-b.getx()
dy=a.gety()-d.gety()
a.move(dx/10,dy/10)
b.move(-dx/10,-dy/10)
Question: which of the parameters, a or b, is the triangle, and which is the point?
Answer:itdoesnotmatter.Bothclasseshavethemethodsneededbythisfunction,namely
getx(),gety(),andmove().Becauseofthisthecallsaresymmetrical,
andbothofthefollowingarethesame:
moveaway(a,b)
moveaway(b,a)
Infact,aclassthatpossessesthesethreemethodscanbepassedtomoveaway()anda
resultwillbecalculatedwithouterror.Theessenceofducktypingisthat,solongasan
objectofferstheserviceneeded(i.e.,amethodofthecorrectnameandparameterset)to
anotherfunctionormethod,thenthecallisacceptable.Thereisawaytotellwhetherthe
classinstanceahasagetx()method.Thebuilt-infunctionhasattr():
ifhasattr(v1,“getx”):
x=v1.getx()
Thefirstargumentisaclassinstance,andthesecondisthenameofthemethodthatis
needed,asastring.ItreturnsTrueifthemethodexists.
Thenamecomesfromtheoldsayingthat“ifsomethingwalkslikeaduckandquacks
likeaduck,thenitisaduck.”Aslongasaclassoffersthethingsaskedfor,thenitcanbe
usedinthatcontext.
6.5 SUMMARY
A class, in the general sense, is a template for something that involves data and
operations(functions).Anobjectisaninstanceofaclass,aspecificinstantiationofthe
template.DefiningaclassinPythoninvolvesspecifyingaclassnameandacollectionof
variablesandfunctionsthatwillbelongtothatclass.Amethodisafunctionthatbelongs
toaclass,andsocanhaveeasyaccesstoitsinternaldata.Asafirstparameter,amethod
canbepassedtheselfvariablebydefault,whichcanbethoughtofasareferencetothe
objectcurrentlyexecuting.Thus,withinamethod,theexpressionself.xreferstoavariable
xdefinedintheclass.Anobjectiscreatedusingthenameoftheclass:foraclassnamed
thing,aninstancexiscreatedusingx=thing().Whenthisoccurs,ifthereisamethodin
thingnamed__init__,thenthatmethodiscalled.Thisisreferredtoasaninitializerora
constructor.
Accessingmethodsinanobjectisdoneusinga“dot”notation:obj.method().Variables
canbeaccessedinthiswaytoo.
Asubclassisaclassthatpossessesallofthepropertiesofsomeotherclass,theparent
classorsuperclass,plussomenewones.Thedataandmethodsoftheparentclasscanbe
accessedfromthesubclass(orchildclass).Asubclassofthingnamedsomethingwould
bedefinedusingthesyntax:
classsomething(thing):
Aclasscanrepresentanewtype,wheremethodsrepresentoperations.
Public variables can be accessed and modified from outside of a class; protected
variablescanbeaccessedbutnotmodifiedfromoutsideofaclass,andmustbeginwithan
underscore character (e.g., “_variable”); private variables can neither be accessed nor
modifiedfromoutsideoftheclass,andmustbeginwithtwounderscorecharacters(e.g.,
“__variable”).
Theprincipleofducktypingisthatitshouldnotreallymatterwhattheexacttypeofthe
objectisthatisbeingmanipulated,onlythatitpossessestheproperties
thatareneeded.
Exercises
1.Defineaclassnamedsquareinwhichtheconstructtakesthelengthofthesideasa
parameter.Thisclassshouldhaveamethodarea()thatcomputesandreturnsthearea
ofthesquare.
2.Defineasubclassofsquarenamedbuttonthatalsohasalocation,passedasXandY
parameterstotheconstructor.Abuttonalwayshasawidthof10.Thebuttonclasshas
thefollowingmethods:
center()Returnthecoordinatesofthecenterofthebutton
label(s)Setthevalueofatextlabeltobedrawntos
3.Createaclassclient.Aclientisadata-onlyclassthathasnomethodsotherthan
__init__(),butthatholdsdata.Inthiscasetheclientclassholdsaname,acategory
(retailorcommercial),atimevalue(integer),andaservicevalue(integer).Allvalues
areestablishedwhentheinstanceiscreatedbypassingparametersto__init__().Now
createtwosubclassesofclient,oneforeachcategory,retailandcommercial.
4.Defineaclassnamedfractionthatimplementsfractionalnumbers.Theconstructor
takesthenumeratoranddenominatorasparameters,andtheclassprovidesmethodsto
add,multiply,negate(makenegative),print,andfindthereciprocalofafraction.Test
thisclassbycalculating:
14/16*3/4
1/2–1/4
Bonus:resultsarereducedtosmallestpossibledenominator.
5.Giventhefollowingclass:
classvalue:
def__init__(self)
self.val=randrange(0,100)
andtheinitialization:
t=()
foriinrange(0,100):
v=value()
t=t+(v,)
writethecodethatscansthetupletandlocatesthesmallestintegersavedinanyofthe
classinstances.
6.CreateaclassthatsimulatesaNANDlogicgatewiththreeinputs.Theoutputwillbe
1unlessallthreeinputsare1,inwhichcasetheoutputis0.Everytimeaninputis
changed,theoutputischangedtoreflectthenewstate;thus,methodstoseteachinput
andtocalculatetheresultwillbeneeded,inadditiontoamethodthatreturnsthe
output.
7.Aqueueisadatastructurethatacceptsnew(incoming)dataatoneend(theback)and
storesitintheorderofarrival,givingthedataatthefrontofthequeuewhen
requested.It’slikealineatacashierinastore:customerswaitforthecashierinorder
ofarrival.Implementaqueueasaclass;ithasoperationsinto()andout()toadd
thingsandremovethingsfromthequeue,andempty(),whichreturnsTrueifthe
queuehasnodatainit.Whatisaddedtothequeueareobjectsofaclassclient,asseen
inExercise6.3above.
8.Simulation:Thegestationperiodforarabbitis28–32days,andtheywillbreeda
weekafterhavingalitter.Afemalerabbit(adoe)willbreedforthefirsttimeatabout
100daysold.Createaclassthatrepresentsarabbitandsimulatethegrowthofarabbit
populationthatstartswiththreedoesatday0.Assumealittersizeofbetween3and8,
andthathalfoftheoffspringwillbemale.Increasetimeby1dayatatimeandanswer
thequestion:“Howmanyrabbitswilltherebeafter1year?”iftheinitialpopulationis
threedoesandonemale(buck).
NotesandOtherResources
NotesonPythonClasses:
http://www.jesshamrick.com/2011/05/18/an-introduction-to-classes-and-inheritance-
in-python/
August12,2015.http://componentsprogramming.com/using-the-right-terms-method/
Duck typing in Python:
http://www.voidspace.org.uk/python/articles/duck_typing.shtml
1.R.Chugh,P.Rondon,andR.Jhala.(2012,January).Nestedrefinements:
Alogicforducktyping,ACMSIGPLANNotices,47(1),231–244.
2.Ole-JohanDahl.(2002).Thebirthofobjectorientation:
TheSimulalanguages,inSoftwarePioneers:ContributionstoSoftwareEngineering,
editedbyManfredBroyandErnstDenert,Programming,SoftwareEngineering,and
OperatingSystemsSeries,Springer,79–80.
3.O.-J.DahlandK.Nygaard.(1968).Classandsubclassdeclarations,inSimulation
ProgrammingLanguages,editedbyJ.Buxton,ProceedingsfromtheIFIPWorking
ConferenceinOslo,May1967,NorthHolland.
4.AdeleGoldbergandAlanKay.Smalltalk-72InstructionManual,44.
5.ANSISmalltalkStandardv1.9199712NCITSX3J20draft,Section3.1,9.
6.B.Liskov,ASnyder,R.Atkinson,andC.Schaffert.(1977).Abstractionmechanisms
inCLU,CommunicationsoftheACM,20(8),564–576.
CHAPTER7
GRAPHICS
7.1IntroductiontoGraphicsProgramming
7.2Summary
Inthischapter
Since the advent of Windows, computer graphics has been assumed as a feature of a
computer.Beforethatitwasarelativelyrarething,relegatedtosomeresearch,toafew
expensive Hollywood movies, and to science fiction. Believe it or not, the first use of
computergraphicsinacommercialmotionpicturewasinthefilmTheAndromedaStrain
(1971) in which it was used to show a rotating 3D map (they called it an electronic
diagram)oftheundergroundinstallationwheretheactionmainlytakesplace.Afewyears
laterthefilmWestworld(1973)used2½minutesofdigitallyprocessedvideotoshowthe
visualperspectiveofanandroid.Itwasaverytime-consumingandexpensivetaskatthat
time;ittookabout8hourstoprocess10secondsoffilm,orabout120hoursinall.
Modern computers all possess very fast graphics cards that perform most of the
renderingtasks,andaddedtothebuilt-insoftwareoncurrentoperatingsystems,itallows
for a very sophisticated yet simple-to-use graphical/windows interface to desktop
computers. The graphics software is hierarchical; the screen itself is merely an array of
pictureelements(pixels)thatcanbesettoanycolor,anditisdifficulttoseehowthatcan
bemadetodisplaycomplexpictures.
Ithasreachedthepointwhereeverythingseenonacomputerscreenisactuallydrawn—
icons,windows,backgrounds,andeventext.
What this means is that interacting with a computer is now done with graphics, not
characters and text. Since that is the situation, it makes sense to permit a beginning
programmertoexperimentwithprogramminggraphicsapplications.
7.1 INTRODUCTIONTOGRAPHICSPROGRAMMING
Atthemostprimitivelevelofgraphicssoftwareistheabilitytosetindividualpixels.It
is,aswasmentioned,quitedifficulttousethiscapabilitytocreatecomplexpictures.How
isadogdrawn,orabuilding,orevenjustastraightline?Thosethingshavebeenfigured
out,fortunately.
Soatthebottomlayerofsoftwarearefunctionsthatmanipulatepixels.Atthenextlevel
arelinesandcurves;thesearethebasiccomponentsofdrawingsandsketches.Anartist
with a pencil uses lines and curves to represent scenes. At the level above lines are
functions that use lines to create other objects, such as rectangles, circles, and ellipses.
These can be line drawings or can be filled with colors. The next higher levels can be
arguedabout,buttextisprobablyinthenextsoftwarelayerandthenshadingandimages
followedby3Dobjects,whichincludesperspectivetransformationandtextures.
Pythondoesnotitselfhavegraphicstools,butvariousmodulesthatareassociatedwith
Python do. The standard graphical user interface library for use with Python is tkinter.
Therearemanyfeaturesofthismodule,includingthecreationofwindows,drawing,user
interface widgets like buttons, and a host of other features. It is free and is normally
included in the Python distribution but it can easily be downloaded and used with any
Pythonversion.BecausetherearemanywaysthatPythoncanbeconfiguredonvarious
differentsystems,theinstallationprocesswillnotbedescribedindetailhere.Agraphics
modulewillbedescribedandisincludedontheDVDthataccompaniesthisbook,andit
requiresthattkinterbeavailable.Thisisalmostalwaystrue(rememberthatthisbookuses
Python3).
ThelibrarythatallowsgraphicsprogrammingiscalledGlib.IftheGlib.pyfileisinthe
samedirectoryasthesourcecodeforthePythonprogramthatusesit,thenitshouldwork
fine. Glib consists of a collection of functions that implement the first few levels of a
graphicssystemandthatcreateawindowwithinwhichdrawingcantakeplace.Itdoesnot
allowforinteraction,animation,sound,orvideo,allofwhicharethesubjectofthenext
chapter.
7.1.1 Essentials:TheGraphicsWindowandColors
To start creating computer graphics, it is necessary to understand how colors and
images are represented. When using a computer everything must be represented as
numbers.Apixelisthecolorofapictureataparticularlocation,andsotheremustbea
way to describe a color at that place. In physics frequency is used: each color has a
specific frequency of electromagnetic radiation. Unfortunately, this does not map very
wellontoacomputerdisplay,becauseitisbasedontelevisiontechnology.OnaTV,there
are three colors, red, green, and blue, that are used in various proportions to represent
everycolor.Therearered,green,andbluedotsontheTVscreenthatarelituptovarious
degreestocreatethecolorsthatareseen.Thisisbasedonthewayahumaneyeseescolor;
there are red, green, and blue sensors in the eye that in combination create our color
perception.Anotherreasonthatfrequencyisnotusedisthattherearecolorsthatarenot
accuratelyrepresentedasfrequencies;theydonotappearintherainbow.Thecolorspink
andbrownaretwoexamples.
So,eachcolorinthegraphicssystemisrepresentedasthedegreeofred,green,andblue
thatcombinetocreatethatcolor.Inthatsenseitisabitlikemixingpaint.Yellow,ona
computer,isamixtureofredandgreen.Eachpixelthereforehasthreecomponents:ared,
green,andblue component.Thesecouldbe expressedas percentages, butwhenusing a
computeritisbettertoselectnumbersbetween0and255(8bits,or1byte)foreachcolor.
Each pixel requires 3 bytes of storage; actually 4 bytes in some cases, as will be seen
shortly. If an image contains 100 rows of 100 pixels, then it has 10,000 pixels and is
10000*3=30000bytesinsize.
To humans, colors have names. Here’s a list of some named colors and their RGB
equivalents:
Thereare,ofcourse,agreatmanymorenamedcolors,andevenmorecolorsthatcanbe
represented with RGB values in this way—16,777,202 of them in fact. Each pixel is a
colorvalue.AllgreyvalueshavethespecialsituationR=G=B,sothereare256 distinct
valuesofgreyrangingfromblacktowhite.
The graphics library provides functions for creating colors. The functions are
cvtColor()andcvtColor3():
cvtColor(g) - Return a color value that has R=G=B = g, which is to say it
specifiesagreylevel=g.
cvtColor3(r,g,b)-ReturnacolorvaluethathasthespecifiedR,G,B
values.
When using Glib the program must initialize the system before drawing anything. All
graphics must take place between a call to startdraw() and a call to enddraw(). If
startdraw() is not called then important items will not be initialized, most especially a
drawing window will not be created and an error will occur at some time during
execution.Ifenddraw()isnotcalledthennothingwillbedrawn.Foranabstractexample,
ahypotheticalmainprogramcouldbe:
#startofprogram
Manycalculations
…
startdraw(width,height)
#Graphicscalls
…
enddraw()
The function startdraw() accepts two parameters: the width and the height of the
drawingregiontobecreated.Thesevaluescanberetrievedbyaprogramusingcallstothe
functions Width() and Height() if they are needed. Startdraw() creates the needed
windowanddrawingarea,initializesstartingcolors,fonts,andmodes,butdoesnotopen
thewindow.Nothingthatisdrawnwillbevisibleuntilenddraw()iscalled.
7.1.2 PixelLevelGraphics
Theonlypixelleveloperationdrawsapixelataspecifiedlocation;so,forexample,the
call:
point(x,y)
willsetapixelatcolumn(horizontalposition)xandrow(verticalposition)ytoacolor.
Whatcolor?Glibhastwodefaultcolorsthatcanbeset:thefillcolor,whichisthecolor
withwhichpixelswillbedrawn,polygonsandellipseswillbefilled,andcharacterswill
bedrawn;andthestrokecolor,thecolorwithwhichlineswillbedrawn.Settingthefill
colorisdoneusingacalltothefill()function:
fill(200)#Setthefillcolorto(200,200,200)
fill(100,200,100)#Setthefillcolorto(100,200,100)
Usingoneparametersetsthefillcolortoagreyvalue,whereaspassingthreeparameters
specifies the red, green, and blue values (respectively) of the fill color. Similarly the
functionstroke()canacceptoneorthreeparameters:
stroke(200)#Setthestrokecolorto(200,200,200)
stroke(255,0,0)#Setthestrokecolorto
#(255,0,0)=red
Thereisonemorefunctionthatcouldbethoughttobeinthepixellevelcategory.The
functionbackground()isusedtosetthebackgroundcolorofthegraphicswindow.Again,
itcanaccepteitheroneparameter(grey)orthree
(color).Thisleadstothefirstexample.
Example:CreateaPageofNotepaper
Notepaperhasbluelinesseparatedbyenoughspacetowriteorprinttextbetweenthem.
It often has a red vertical line indicating an indentation level, a place to begin writing.
Drawingthisisamatterofdrawingasetofconnectedbluepixelsinasetofrows,and
thenmakingaverticalcolumnofredpixels.Hereisonewaytocodethis:
fromGlibimport*
y=60#Heightatwhichtostart
startdraw(400,600)#Begindrawingthings
background(255)#Papershouldbewhite
fill(0,0,200)#Fillcolor=pixel
#color=blue
forninrange(0,27):#Draw30horizontalblue
#lines
forxinrange(0,Width()):#Drawallpixelsinone
#line
point(x,y)#Drawabluepixel
y=y+20#Thenextlineis20
#pixelsdown
fill(200,0,0)#Pixelcolorred
foryinrange(0,Height()):#Drawconnectedvertical
#pixels
point(25,y)#toformthemarginline
enddraw()
The output of this program is shown in Figure 7.2a. When pixels are drawn
immediatelynexttoeachothertheyappeartobeconnected,andsointhiscasetheyform
horizontal and vertical lines. This is not easy to do for arbitrary lines; it is not obvious
exactlywhichpixelstofillforalinebetween,say,(10,20)and(99,17).That’swhythe
linedrawingfunctionsexist.
Example:CreatingaColorGradient
Whencreatingavisualonacomputer,thefirststepistohaveaclearpictureofwhatit
will look like. For this example, imagine the sky on a clear day. The horizon shows a
lighterbluethantheskydirectlyabove,andthecolorchangescontinuouslyalloftheway
from horizon to zenith. If a realistic sky background were needed, then it would be
necessarytodrawthisusingthetoolsavailable.Whatwouldthemethodbe?
First,decideonwhatthecolorisatthehorizon(y=ymax)andatthehighestpointinthe
scene(y=ymin).Nowask:“howmanypixelsbetweenthosepoints?”Thechangeinpixel
color will be the color difference from ymax to ymin divided by the number of pixels.
Nowsimplydrawrowsofpixelsbeginningwiththe
horizon and moving up the image (i.e., decreasing Y value), changing the color by this
amounteachtime.
Asanimplementation,assumethatthecoloratthehorizonwillbeblue=(40,40,255)
andthetopoftheimagewillbe(40,40,128),adarkerblue.Theheightoftheimagewill
be400pixels;thechangeinblueoverthatrangeis127units.Thus,thecolorchangeover
each pixel is going to be 127.0/400, or about 0.32. A color can’t change a fractional
amount,ofcourse,butwhatthismeansisthatthebluevaluewilldecreaseby1unitfor
every 3-pixel increase in height. Do not forget that the horizon is at the bottom of the
image, which has the greatest Y coordinate value, so that an increase in Y means a
decreaseinheightandviceversa.
Theexampleprogramthatimplementsthisis:
fromGlibimport*
blue=128
startdraw(400,400)
delta=127.0/Height()
foryinrange(0,Height()):
yy=Height()-y
fill(40,40,blue)
forxinrange(0,Width()):
point(x,y)
blue=blue+delta
enddraw()
Figure7.2bshowswhatthegradientimagelookslike(afull-colorversionofthisand
allimagesisontheaccompanyingdisc).
7.1.3 LinesandCurves
Straight lines and curves are more complex objects than pixels, consisting of many
pixelsinanorganizedarrangement.Alineisactuallydrawnbysettingpixels,though.The
factthataline() functionexists meansthat the programmerdoes nothave to figure out
whatpixelstodrawandcanfocusonthehigherlevelconstruct,thelineorcurve.
Alineisdrawnbyspecifyingtheendpointsoftheline.UsingGlibthecallis:
line(x0,y0,x1,y1)
whereoneendofthelineisat(x0,y0)andtheotherisat(x1,y1).Thecolorofthelineis
specified by the stroke color. If any part of the line extends past the boundary of the
window,that’sOK;thelinewillbeclippedtofit.
Example:NotepaperAgain
Theexampleofdrawingapieceofnotepapercanbedoneusinglinesinsteadofpixels,
andwillbealittlefaster.Setthestrokecolortoblueanddrawacollectionofhorizontal
lines (i.e., that have the same Y coordinate at the endpoints) separated by 20 pixels, as
before. Then draw a vertical red line for the margin. The program is a variation on the
previousversion:
fromGlibimport*
y=60#Heightatwhichtostart
startdraw(400,600)#Begindrawingthings
background(255)#Papershouldbewhite
stroke(0,0,200)#Fillcolor=pixel
#color=blue
forninrange(0,27):#Draw30horizontalbluelines
line(0,y,Width(),y)#Drawabluehorizontalline
y=y+20#Thenextlineis20pixels
#down
stroke(200,0,0)#Pixelcolorred
line(25,0,25,Height())#Drawaverticalredline
enddraw()
Theoutputfromthisprogramisthesameasthatfortheversionthatdrewpixels,which
isshowninFigure7.2a.
Acurveistrickierthanaline,inthatitishardertospecify.ThemethodusedinGlibis
basedonthatintkinter:acurve(arc)isdefinedasaportionofanellipsefromastarting
angleforaspecifiednumberofdegrees,asreferencedfromthecenteroftheellipse.The
angle0degreesishorizontalandtotheright;90degreesisupwards(decreasingYvalue).
Theellipseisdefinedbyaboundingrectangle,specifyingtheupperleftandlowerright
coordinates of a box that just holds the ellipse. So, looking at Figure7.3, the rectangle
definedbytheupperleftcornerat(100,50)andthelowerrightat(300,200)hasacenter
at (200, 125) and contains an ellipse slightly longer than it is high (upper left of the
figure).Thefunctionthatdrawsacurveisnamedarc(),andtakestheupperleftandlower
rightcoordinates,astartingangle,andthesizeofthearcalsoexpressedasanangle.
Intheupperrightofthefigureisthearcdrawnbythecallarc(100,50,300,200,0,90),
whichmeansthatthepartoftheellipsefromthe0degreepointcounterclockwisefor90
degreeswillbedrawn.Theexampleatthelowerleftofthefiguredrawsthecurvefrom
the 45-degree point for 90 degrees, resulting in the upper section of the ellipse being
drawn. The final arc shown, at the lower right, uses a negative angle. The call
arc(100,50,300,200,45,-60) starts at the 45-degree point and draws the arc clockwise,
becausetheanglespecifiedwasnegative.
Thiswayofspecifyingarcsisfineforsimpleexamplesandsingle curves, butmakes
combiningmanyarcsintoamorecomplexcurveratherdifficult.Joiningtheendstogether
smoothlyisthetrick.
Thearc()functionhasanotherparameterthatcanbespecified.Theseventhparameter
tells the system what kind of arc to draw; the possibilities are ARC, CHORD, and
PIESLICE,andthedefaultvalueARCisillustratedinthefigure.Thecall:
arc(100,50,300,200,45,-60,CHORD)
givestheresultshowninFigure7.3a.Theresultofthecall:
arc(100,50,300,200,45,-60,PIESLICE)
isshowninFigure7.34b.Iffillingisturnedon,thentheseshapeswouldbefilledwiththe
defaultfillcolor.Amajorexampleinvolvingcurves/arcsisthepiechartprogramlaterin
thischapter.
7.1.4 Polygons
For the purposes of discussion, a polygon will include all closed regions, including
ellipses and circles. In that context, the arc() function can be used to draw circles and
ellipsesbyspecifyinganangleverynearto360degrees.Thecall:
arc(100,100,300,300,0,359.9,ARC)
willdrawanearlycompletecircle,onethatlookscomplete.However,thereisafunction
thatdrawsrealcirclesandrealellipses:itisnamedellipse().Inthedefaultdrawingmode
theellipse()functionispassedthecoordinatesofthecenteroftheellipseanditswidthand
heightinpixels.Ifthewidthandheightareequal,thentheresultisacircle.Thecall:
ellipse(100,100,30,40)
drawstheellipseshowninFigure7.4a.Thecolorwithwhichtheellipseisfilledisthefill
color.Ifnofillingisdesiredthenacalltonofill()turnsofffilling.
The default mode for drawing ellipses is referred to as CENTER mode, where the
centeroftheellipseisgiven.Therearethreeothers:RADIUSmode,inwhichthewidth
and height parameters represent semi-major and semi-minor axes; CORNER mode in
which the upper left corner is specified instead of the center; andCORNERS mode, in
which the upper left and the lower right corner of the bounding box are specified. The
mode is changed using a call to the function ellipsemode(), passing one of the four
constantsastheparameter:CENTER,RADIUS,CORNER,orCORNERS.
A rectangle is drawn using the rect() function. Again, there are four drawing modes.
Considerthecallrect(x,y,w,h):
CENTER mode: x, y are the coordinates of the center; w andh are the width and
height.
RADIUSmode:x,yarethecoordinatesofthecenter;wandrarethehorizontaland
verticaldistancestoanedge.
CORNERmode:x,yarethecoordinatesoftheupperleftcorner;
wandharethewidthandtheheight.
CORNERSmode:x,yarethecoordinatesoftheupperleftcorner;
wandharethecoordinatesofthelowerrightcorner.
Aswiththeellipse,therectangledrawingmodeischangedbycallafunction,thistime
rectmode().
Thefunctiontriangle()draws—yes,atriangle.Itispassedthreepoints,whichistosay
sixparameters:triangle(10,10,20,20,30,30)drawsatrianglebetweenthethreepoints
(10,10),(20,20),and(30,30).
7.1.5 Text
Drawingtextisnotcomplicated,butchangingfontsismoreofaproblem.
Afontissavedonafileandhastobeinstalled.Ifafontisspecifiedbyaprogrambutdoes
notexist,theneitherthefinishedimagewilllookdifferentfromwhatwasanticipatedor
anerrorwilloccur.Drawingtextisperformedbyacalltotext():
text(”Hellothere.”,x,y)
Thisdrawsthetextstring“Hellothere!”inthegraphicswindowsothatthelowerleftof
thetextisatlocationx,y.Thedefaulttextsizeis12pixels,butcanbechangedbyacallto
textsize()passingthesizedesired.
7.1.6 Example:AHistogram
Ahistogramisawaytovisualizenumericaldata.Itisespeciallyusefulfordiscretedata
likecolorsorpoliticalpartiesorchoicesofsomekind,butcanalsobeusedforcontinuous
data.Itdisplayscountsofsomethingagainstsomeothervalue,acategory;percentageof
peoplevotingforspecificparties,ortheheightsofgradesixgirls.Itdrawsbarsofvarious
heightseachrepresentingthenumberofentriesineachcategory.Inthisexampletheonly
problem is the plotting of the histogram, but the more general programming problem
wouldincludecollectingandorganizingthedata.Inthiscasetheprogramwillreadadata
filenamed “histogram.txt” that containsa few key values. The programvariable names
andthecorrespondingdatafilevaluesare:
Whencreatingagraph,itpaystodesignitcarefully.Inthiscasethehistogramwillhave
thegeneralappearanceshowninFigure7.5a.Thisvisuallayouthelpswiththedetailsof
thecode,especiallyifthedesignhasbeendrawnongraphpapersothatthecoordinates
caneasilybedetermined.
Assumethatthevariablesneededhavebeenreadfromthefile(See:Exercise2).Here’s
whattheprogrammustdo:
Createawindowabout600x600pixelsinsize.
Drawthehorizontalandverticalaxes(120,80)
Drawthetitleandaxislabels.
Determinethewidthandheightofeachrectangle.
Foriinrange(0,ncategories)
Drawrectanglei
Drawlabeli
Development can now proceed according to the plan. Create a window, set the
background,andchangethefonttoHelvetica(afavorite):
startdraw(600,600)
background(180)
setfont(“Helvetica”)
Nowdrawtheaxes.Y-axis(vertical)from100,100to100,500;X-axis(horizontal)from
100,500to500,500.Useathickerlinebysettingthestrokeweightto3pixels.
strokeweight(3)
line(100,100,100,500)#Yaxis
line(100,500,500,500)#Xaxis
Thetitleisinalargefont(24pixels)atthetoppartofthecanvas(y=80)
textsize(24)#Titleusesabigfont
text(title,120,80)#It’satthetopofthedrawing
Thehorizontalaxislabelisinasmallerfont(14pixels)atthebottomofthecanvas(y-
580). It looks nicer if the text is basically centered. It’s hard to do this exactly because
thereisnoeasywaytofindouthowlongastringwillbewhendrawn.However,astring
that is 14 pixels in size and N characters long will be approximately N*14 pixels long
when drawn. The axis is 400 pixels wide, so the amount of leftover space will be 400-
N*14. Divide this in half to get the indentation of the left to approximately center the
string!
textsize(14)#Labelsuseamediumsizedfont
cx=(400-len(hlabel)*14)/2#Howmanypixelstoindent
text(hlabel,100+cx,580)#Centeredatthebottom
Drawing the vertical label is more difficult, and so it will be done later. Make it a
functionandmoveon
verticalLabel(vlabel)
It is time to draw the rectangles. The width of each one will be the same, and is the
width of the drawing area divided by the number of categories. The height will be the
heightofthedrawingareadividedbythemaximumvaluetobedrawn,maxsize.Compute
those values and set the line thickness to one pixel, then set a fill color. Using the
CORNER rectangle drawing mode is easiest for this application, as all that needs to be
figuredoutistheupperleftcoordinate—thewidthandheightarealreadyknown.
wid=(400-10)/ncategories#Widthofaboxinpixels
ht=390.0/maxsize#Eachvalueisthismanypixelshigh
strokeweight(1)#Rectangleoutline1pixelthick
fill(200,50,200)#Purplefill
rectmode(CORNER)#Cornermodeiseasiest
Nowmakealoopthatdrawseachrectangle.TheXpositionofarectangleisitsindex
timesthewidthofarectangle—that’seasy.Theheightoftherectangleisthevalueofthat
data element multiplied by the variable ht that was determined before. It’s also a good
ideatodrawthevaluebeingrepresentedatthetopofthebar,whichisjustaboveandto
therightoftherectangle’supperleft.
foriinrange(0,ncategories):
ulx=100+i*wid+2#UpperleftX
uly=500-val[i]*ht#UpperleftY
rect(ulx,uly,wid,val[i]*ht-2)#Heightisval[i]*ht
text(val[i],ulx+5,uly-2)#Drawthevalueatthetop
Finally,drawthelabelsforeachrectangle.ThesearebelowtheXaxis,centeredmore
orlesswithinthehorizontalregionforeachbin.ThelabelsstartattheYaxis(X=100or
so)andtheirlocationincreasesbythewidthofthebinduringeachiterationofthedrawing
loop.TheYlocationisfixed,at520—theXaxisis500.Finally,anattempttocenterthese
labelsisdoneinthesamewaythatitwasdoneforthehorizontallabel,buttheparameters
aredifferent.
x=100+2#StartatX=100withextraforthethinckline
textsize(10)#Useasmallsizefont
fill(255,255,255)#Textwillbewhite
foriinrange(0,ncategories):#foreachrectangle
cx=(wid-len(lab[i]*9))#Indexnttocenterthetext
ifcx<0:cx=0#Indentcan’tbenegative
text(lab[i],x+cx/2,520)#Drawthelabel
x=x+widNextlabelisonerectangleright
enddraw()
Drawingtheverticallabelinvolvespullingouttheindividualwordsanddrawingeach
oneonitsownpixelrow.Wordsareseparatedbyspaces(blanks),soonewayofdrawing
theverticaltextistolookforaspaceinthetext,drawthatword,thenmovedownafew
pixels,extractthenextword,drawit,andsoonuntilallwordshavebeendrawn.Thetext
willbedrawnstartingatX=12,andtheinitialverticalpositionwillbe200,movingdown
(increasingY)by40pixelsforeachword.ThisisdonebythefunctionverticalLabel(),
whichispassedthestringtobedrawn:
defverticalLabel(v):
lasti=0#Indexofthenextwordinthestring
x=12#StartdrawingatX=12
y=200#andY=200
foriinrange(0,len(v)):#Lookatallcharactersin
#thelabel
if(v[i]==”“):#Ablackindicatetheendofa
#word
text(v[lasti:i],x,y)#Drawtheword
y=y+40#Movedown40pxiels
lasti=i#endofthiswordisthe
#startofthenext
text(v[lasti:],x,y)#Drawthelastword
#afterexitingtheloop
This program is available on the disk as two examples: “viperHisto.py” and
“gradesHisto.py,”which drawhistograms of twodifferentsetsof data.The output from
thesetwoprogramsisshowninFigure7.6.
Thisisaminimalprogram,andwon’talwayscreateaniceimage.Labelsthataretoo
longandusetoomanycategoriescancausebadlyformattedgraphics.
7.1.7 Example:APieChart
A pie chart is really just a histogram where the relative size of the categories is
illustratedbyanangleinsteadoftheheightofarectangle.Eachclassisshownasapie-
shapedsliceofacirclewhoseareaisrelatedtoitsproportionofthewholesample.Pie-
shapedregionsare easy, becausearc()will drawthem. So, using the same examples as
before,lookspecificallyatthegradesdata:thereare38studentswhosegradesarebeing
displayed,andthereare360degreesinacircle.Acategoryof10 students,forexample
(suchasthosereceivinga“B”grade),willrepresentapieslicethatis10/38ofthewhole
circle,orabout95degrees.Theprocessseemstobetodeterminehowmanydegreeseach
categoryrepresentsanddrawapiesliceofthatsizeuntilthewholepie(circle)isusedup.
Createawindowabout600x600pixelsinsize
Drawthetitlelabel
Establishafillcolor
Foriinrange(0,ncategories)
DeterminetheangleAusedforthiscategoryi
DrawarcfrompreviousangleforAdegrees
Drawlabeliforthisslice
Changethefillcolor
Thelabelsmaypresentaproblem,astheymaynotfitinsidethepieslice.Itisprobably
besttodisplaythelabeloutsideofthesliceanddrawalinetotheslicethatrepresentsit.
Theprogramissimilartothatforthehistogram.So,beginningafterthelabelisdrawn:
Findthetotalnumberofelementsinallcategories—inthiscase,thenumberofstudents
intheclass.Thisisthesumofallelementsinval.
totalSize=0
r=255
fill(r,200,200)
foriinrange(0,ncategories):
totalSize=totalSize+val[i]
Eachcountval[i]inacategoryrepresentsval[i]/totalSizeoftheentiredataset,orthe
angle360.0*val[i]/totalSize.Theconstant360/totalSizewillbenamedanglePerCount.
Nowstartingatangle0degrees,createapie-shapedarcthesizeofeachcategory:
angle=0
foriinrange(0,ncategories):
span=val[i]*anglePerCount
arc(150,150,450,450,angle,span,PIESLICE)
Drawthelabel—thishasbeenleftforlater,socallafunction.
label(300,300,150,lab[i],angle,span)
Theangletostartdrawingmustbeincreasedsothatthenextarcstartswherethisone
leftoff:
angle=angle+span
Changethefillcolorsothateachpiepieceisadifferentcolor.Thecode
belowchangestheredcomponentjustalittle.
r=r-20
fill(r,200,200)
Figure7.7bshowsawaytodeterminewherealabelcouldgo;alinefromthecenterof
thecirclethroughtheouteredgepointsinthedirectionofthelabel.Simplyfindthexand
ycoordinates.Theycoordinateisthesineoftheangle*thedistancefromthecenter,and
the x coordinate is the cosine of the angle * the same distance. For a distance, use the
radius*1.5.Thefunctionlabel()cannowbewritten:
deflabel(xx,yy,r,s,a1,ap):
angle=a1+ap/2#Bisector=startangle+halfof
#span
d=r*1.25#Distance
x=cos(angle*3.1415/180.)*d+xx#Angleisradians
y=-sin(angle*3.1415/180.)*d+yy#Yisinverted
text(s,x,y)
TheresultisillustratedinFigure7.8.
There’sonemorethingthatcouldbeaddedtothepiechartprogram.Sometimes,oneof
thepiecesismovedoutofthecircletoemphasizeit.Itturnsoutthatthisusefulfeature
canbeimplementedinamannerverysimilartothewaythelabelsweredrawn.Findthe
bisectoroftheangleforthatsectionandbeforeitisdrawnidentifyanewcenterpointfor
that piece a few pixels down that bisector. This pulls the piece away from the original
circlecenter.
Thecodeisbrief,andisincludedbelowtobecomplete:
defpull(x,y,a1,ap):
angle=a1+ap/2#a1istheangle,apisthespan
d=12#Pulloutonlyasmallamount
#(12pixels)
y=-sin(angle*3.1415/180.)*d+y#Newcenter
#coordinates
x=cos(angle*3.1415/180.)*d+x
arc(x-150,y-150,x+150,y+150,a1,ap,PIESLICE)
Thisfunctioniscalledinsteadofthecalltoarc()forthepiecethatistobepulledout.A
sampleoutputisshowninFigure7.8b.
7.1.8 Images
Unlike the graphical components displayed so far, an image is fundamentally a
collectionofpixels.Acameracapturesanimageandstoresitdigitallyaspixels,andsoit
wasneveranythingelse.Displayinganimagemeansdrawingeachpixelintheappropriate
color,ascaptured.
Glibcanloadanddisplayimagesinalimitedfashion.Imagesresideinfilesofvarious
formats: JPEG, GIF, BMP, PNG. The same image in each format is stored in a quite
distinctway,anditcanrequirealotofcodejusttogetthepixelsfromtheimage.Python,
through tkinter, allows GIF image files to be read directly. Any image file can be
convertedinto a GIF by using one of a hundred conversion tools, includingPhotoshop,
Paint,andGimptonameafew.
ThefunctionloadImage()willreadaGIFimagefileandreturnanimageofasortthat
canbedisplayedinthegraphicswindow.Thefile“charlie.gif”isaphotoofcheckpoint
CharlieinBerlin,andhasbeenincludedontheaccompanyingdisc.Itcouldbereadintoa
Pythonprogramwiththecall:
im=loadImage(“charlie.gif”)
Thevariableimnowholdstheimage,andwhilethedetailsarenotcompletelyrelevant,
itisgoodtoknowthatim.widthandim.heightgivethewidthandheightoftheimagein
pixels.Displaying the image is a matterof calling the function image()and passing the
imageandthecoordinateswheretheupperleftcorneroftheimageistobeplaced.Acall
suchas:
image(im,0,0)
woulddisplaythecheckpointCharlieimage.
ThesmallestPythonprogram(usingGlib)thatcanloadanddisplayanimageisthus5
lines:
fromGlibimport*
startdraw(600,600)
im=loadImage(“charlie.gif”)
image(im,0,0)
enddraw()
However,thisdisplaystheimageinawindowthatisbiggerthanitis.ThatmaybeOK,
butoftenanimageistobeusedasabackgroundimage,andthesizeofthewindowshould
bethesameastheimagesize.Ifthatiswhatisneededthereisaspecialfunctiontocall:
imageSize().Itmustbecalledbeforestartdraw(),ofcourse,becauseitreturnsthesizeof
theimage,andhencethesizetobepassedtostartdraw().Itreturnsatuplethathasthe
widthasthefirstcomponentandtheheightasthesecond.
Openingawindowthatisthesamesizeasanimageisaccomplishedasfollows:
fromGlibimport*
s=imageSize(“charlie.gif”)
startdraw(s[0],s[1])
im=loadImage(“charlie.gif”)
image(im,0,0)
enddraw()
TheoutputofthisprogramisshowninFigure7.9.
Pixels
AnimageasreturnedbyloadImage()isabuilt-intypenamedPhotoImage.Itisreally
not designed to be used for much except displaying in the window, but some more
advanced operations are possible, if slow. For example, individual pixel values can be
extracted and modified. Glib offers a handful of low-level functions to make pixel
operationseasier.
Animageconsistsofrowsandcolumnsofpixels,andapixelisacolor.Thecolorofthe
pixelathorizontalpositionxandverticalpositionyofimagepiccanbeaccessedusing
thecall:
c=getpixel(pic,x,y)
The value of c is a color, and the components of this can be accessed using the
functions:
r=red(c)
g=green(c)
b=blue(c)
Settingthevalueofthepixelatlocation(x,y)isaccomplishedbycallingsetpixel().It
requiresthattheimage,thecoordinates,andthecolorbepassedasparameters:
setpixel(pic,x,y,c)
Thesefunctionsoperateonanimage,notdirectlyonthescreendisplay.Changestoan
imagewillonlybevisibleiftheyaredonebeforetheimageisdisplayed.
Example:IdentifyingaGreenCar
Thereisapatternherethatisimportanttorecognizewhenworkingwithimagesatthe
pixellevel—therasterscan.Allofthepixelsintheimageareusuallyexaminedoneata
timeusinganestedloop.Itwilllooklikethis:
foriinrange(0,Width()):
forjinrange(0,Height()):
#Dosomethingtopixel(i,j)
Thisexampleusescolortoidentifythepixelsthatbelongtoacarinanimage,asseen
in Figure 7.10. The problem requires identifying pixels that are “green” and somehow
makingthemstandoutintheimage.Whatisgreen?Allpixelshaveagreencomponent.
Whensomethingisgreen,thegreencomponentisthemostsignificantone;itislargerthan
theredandbluecomponentsbysomemargin.Inthiscasethatmarginwillbearbitrarily
setat20,andifitdoesnotworkthenitcanbemodified.Ifapixelisgreenitwillbesetto
black,otherwiseitwillbecomewhite;thiswillmakethepixelsthatbelongtothecarstand
out.
Theprogrambeginsbycreatingawindowandreadingintheimage:
startdraw(640,480)
im=loadImage(”eclipse.gif”)
Nowlookatallofthepixels,searchingforagreenone:
foriinrange(0,Width()):
forjinrange(0,Height()):
c=getpixel(im,i,j)#Getthecolorofthepixel
#(i,j)
Ifthepixelisgreen,thenchangeittoblack.Otherwise,changeittowhite:
ifgreen(c)>(red(c)+20)andgreen(c)>(blue(c)+20):#Green?
setpixel(im,i,j,cvtColor3(0,0,0))#Black
else:
setpixel(im,i,j,cvtColor3(255,255,255))#White
AnimagecanbesavedintoafilebycallingtheGlibfunctionsave():
save(im,“out.gif”)
TheoutputofthisprogramisshowninFigure7.10b.Therearesome“green”pixelsthat
donotbelongtothecar,butmostofthecarpixelshavebeenidentified.
Example:Thresholding
Imageprocessingisalargesubjectallbyitself,andthisparticularlibraryisnotthebest
choice for exploring it in detail. There are some basic things that can be done, and
commononesincludethresholding,edgeenhancement,noisereduction,andcount,allof
whichcanbedoneusingGlib.Thresholdinginparticularisanearlystepinmanyimage-
analysis processes. It is the creation of a bi-level image, having just black and white
pixels,fromagreyorcolorimage.Thepreviousexampleisdifferentfromthresholdingin
thataparticularcolorwasbeingsearchedfor.InthresholdingasimplegreyvalueT,the
threshold,isusedtoseparatepixelsintoblackandwhite:allpixelshavingavaluesmaller
thanTwillbeblack,andtheotherswillbewhite.
Glibhasonemorefunctionthatisuseful,especiallyinthiscontext.Thefunctiongrey()
willconvertacolorintoasimplegreylevel,whichisanintegerintherange0to255.It
finds the mean of the three color components. The thresholding program begins in the
samewayasdidthepreviousexample.Lookatthecolorofallofthepixelsintheimage,
oneatatime:
startdraw(640,480)
im=loadImage(”eclipse.gif”)
foriinrange(0,Width()):
forjinrange(0,Height()):
c=getpixel(im,i,j)
This is the standard scan of all pixels. Now convert the color c to a grey level and
comparethatagainstthethresholdT=128.Pixelshavingagreylevelbelow128willbeset
toblack,theremainderwillbewhite:
ifg<T:
setpixel(im,i,j,cvtColor3(0,0,0))
else:
setpixel(im,i,j,cvtColor3(255,255,255))
image(im,0,0)
save(im,“out.gif”)
enddraw()
Theresult,theimagedisplayedbythisprogram,isshowninFigure7.11.
Transparency
AGIFimagecanhaveonecolorchosentobetransparent,meaningthatitwillnotshow
upandanypixeldrawnpreviouslyatthesamelocationwillbevisible.Thisisveryhandy
ingamesandanimations.Imagesarerectangular,whereasmostobjectsarenot.Considera
smallimageofadoughnut;thepixelssurroundingitandintheholecanhavethepixelsset
tobetransparent,andthenwhentheimageisdrawnoveranotheronethebackgroundwill
beseenthroughthehole.
The transparency value must be set within the image by a program. Photoshop, for
example,candothis.ThenwhenPythondisplaystheimages,thebackgroundimagemust
bedisplayedfirst,followedbytheimageswithtransparency.Asanexample,Figure7.12a
shows a photo of the view through the rear and side windows of a Volvo. The window
glassarea,theplaceswheretransparencyisdesired,havebeencoloredyellow.Thecolor
yellowwasthenselectedinPhotoshopastransparent,andtheimagewassavedagainasa
GIF. A short Python program using Glib will display a background image and the car
imageoverit,andthebackgroundwillbeseenthroughthewindowregionsasinFigure
7.12b.Theprogramis:
fromGlibimport*
s=imageSize(“car.gif”)#Getsizeofimage
startdraw(s[0],s[1])#Openwindowoftheright
#size
s=loadImage(“perseus.gif”)#Backgroundimage
t=loadImage(“car.gif”)#Carimagewithtransparency
image(s,0,0)#Displaybackgroundfirst
image(t,0,0)#thendisplaythecarimage
enddraw()
7.1.9 GenerativeArt
Ingenerativeartanartworkisgeneratedbyacomputerprogramthatusesanalgorithm
createdbytheartist.Theartististhecreativeforce,thedesignerofthevisualdisplay,and
thecomputerimplementsit.TherearemanygenerativeartiststobefoundontheInternet:
onelistcanbefoundhere:
http://blog.hvidtfeldts.net/index.php/generative-art-links/
Muchgenerativeartisdynamic,whichistosayitinvolvesmotionand/orinteraction,
butmanyworksareequivalenttopaintingsanddrawings(static).Glibcouldbeatoolfor
helping create these sorts of generative art works. Unlike other sorts of computer
programs, those associated with art do not have a known predictable result that can be
affirmedascorrectornot.Itistruethatanartistbeginswithanideaofwhattheirwork
should look like and what the message underlying it is, but paintings, sculptures, and
generativeworksrarelyfinishthewaytheybegan.
So,eitherbeginwithanideaofwhattheimagewilllooklikeoradmitthatthewhole
thingisanexperimentandcouchtheideaintermsofasentenceortwo.Here’sonesuch
sentence:“Imagineacollectionofstraightlinesradiatingfromasetofrandomlyplaced
points within the drawing window, with each set of lines drawn in a saturated strong
color.”
NowanattemptwouldbemadetocreatesuchanimageusingthefunctionsthatGlib
offers. It is often the case that the first few tries are in error, but that one of them is
interesting.Anartistwouldpursuetheinterestingcourseinsteadofstickingtotheoriginal
idea,ofcourse.Hereisanexample:thecodebelowwaswrittenwiththeideathatitwould
produceacollectionoflinesradiatingfromthepoint(400,600)from0degrees(horizontal
right)to180degrees(horizontalleft)withthecolorvaryingslightly:
r=255
foriinrange(1,180,2):
stroke(r,128,128)
line(400,600,cos(i*conv)*500,sin(i*conv)*500)
r=r-0.5
Thecalltoline()shouldhavebeen:
line(400,600,400+cos(i*conv)*500,600-sin(i*conv)*500)
soas toinvert theY coordinate.Instead,this createda muchmore interestingimage.
Thecodeforoneofthefourloopsinthefinalcodeis:
x=randrange(100,Width())
y=randrange(100,Height()-100)
r=255
foriinrange(1,180,2):
stroke(128,r,128)
line(x,y,sin(i*conv)*500,cos(i*conv)*500)
r=r-0.5
Sometimesasmallerrorcanresultinamoreinterestingresult.Thisisrarelythecase
whenwritingscientificorcommercialsoftware.
Generativeartshouldbeunderthecontroloftheartist,butdoesuserandomelementsto
addinteresttotheimage.InthepieceSnowBoxesbyNoahLarsenasetofrectanglesis
drawn,butthespecificsizeandlocationoftheserectanglesisrandomwithinconstrained
parameters.Theoverallcolorisalsorandomwithinspecifiedboundaries.Eachrectangle
isdrawnasacollectionofwhitepixelswithadensitythathasbeendefinedspecifically
for that rectangle so that the image consists of spatters of white pixels that can be
identified as rectangular regions (Figure 7.14). Each time the program is executed a
different image is created. The program for Snow Boxes was originally written in a
languagecalledProcessing,butaPythonversionthatusesGlibis:
#Snowboxes
#OriginalbyNoahLarsen,@earlatron
fromGlibimport*
fromrandomimport*
startdraw(640,480)
background(randrange(0,75),randrange(150,255),
randrange(0,75))
fill(255,255,255)
foriinrange(0,10000):
point(randrange(0,Width()),randrange(0,Height()))
foriinrange(0,20):
xs=randrange(0,Width())
ys=randrange(0,Height())
xe=randrange(xs,xs+randrange(30,300))
ye=randrange(ys,ys+randrange(30,300))
forjinrange(0,10000):
point(randrange(xs,xe+1),randrange(ys,ye+1))
enddraw()
7.2 SUMMARY
Since the advent of Windows, computer graphics has been assumed as a feature of a
computer, but this has not always been true. Python does not have built-in features for
doing graphics, but the standard user interface library tkinter does, and the library that
accompanies this book, Glib, expands this and makes it more accessible. Drawing is
accomplished by setting pixels within a drawing window to a desired color. Colors are
specifiedbygivingtheamountofred,green,andbluethatcomprisethecolor.
Gliband most graphics libraries allow the userto draw lines, polygons,text, images,
andtosetpixels.Thesebasicfunctionsarecombinedbytheprogrammertocreatedesired
visualizations,suchashistogramsandpiecharts.
Exercises
1.WriteaPythonprogramtocreatetheimageshowninFigure7.15a.Theimageisgrey,
butthecolorsthataretobeusedtofillthecirclesaregivenastext.Youneednot
includethetextinyouroutput.
2.Drawasetof10linesseparatedhorizontallyby20pixels,eachparalleltotheline
specifiedbytheendpoints(10,20)and(200,421).Theselinesmaybeginanywhere
inthewindow.
3.Drawapyramidusingdarkgreybricks(rectangles)ascomponents.Thebaseofthe
pyramidistobe15brickshorizontally,andeachsuccessivelevelisonebricksmaller.
(Figure7.15b)
4.Drawacheckerboard.Eachsquareshouldbe20x20pixels,andthesquaresareredor
yellow,alternating.Acheckerboardis8x8squares.
5.Drawatriangle,asquare,aregularpentagon,andaregularhexagon.Labelthesewith
theirnamesastextabovetheshapes.
6.WriteaprogramtodrawavisualworkinthevisualstyleofPietMondrian’sfamous
rectangularcompositions,anexampleofwhichisshowninFigure7.15d.Could
triangularshapesbeusedinsteadofrectangles?
7.Modifythepiechartprogramsothatthedataisreadfromafilename
“piein.dat.”
8.Writeaprogramthatreadsthefilenameofanimageanddisplaystheimageina
windowthatiscorrectlysized.
9.Theprovidedimagenamed“digit.gif”containssomepixelsthatarepurered;thatis,
theyhaveapixelvalueof(255,0,0).Writeaprogramthatlocatesthesepixels,drawsa
circlearoundtheminadisplayoftheimage,andprintstheirxandycoordinates.
10.Anedgeinanimagehasthepropertythatthepixelvaluesononesideoftheedgeare
significantlydifferent(i.e.,morethan40levels)fromthoseontheotherside.Writea
pythonprogramthatreadsanimageandsetspixelsatverticaledgelocationstoblack
andallotherpixelstowhite;itthendisplaystheresultinawindow.Hints:convertthe
imagetogreyorselectonecolorvaluefortheedges;makeaworkingcopyofthe
image.
NotesandOtherResources
Tkinter-PythoninterfacetoTcl/Tk,https://docs.python.org/3/library/tkinter.html
Tkinter8.5reference.http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/index.html
http://www.generativeart.com/
1.JohnF.Hughes,AndriesvanDam,MorganMcGuire,DavidF.Sklar,JamesD.Foley,
StevenK.Feiner,andKurtAkeley.(2013).ComputerGraphics:Principlesand
Practice,3rdEdition,Addison-WesleyProfessional.
2.RobinLanda,RoseGonnella,andStevenBrower.(2006).2DVisualBasicsfor
Designers,DelmarCengageLearning.
3.JeffreyMcConnell.(2005).ComputerGraphics:TheoryintoPractice,Jones&
BartlettLearning.
4.MattPearson.(2011).GenerativeArt,ManningPublications,ISBN-10:1935182625.
CHAPTER8
MANIPULATINGDATA
8.1Dictionaries
8.2Arrays
8.3FormattedText,FormattedI/O
8.4AdvancedDataFiles
8.5StandardFileTypes
8.6Summary
Inthischapter
A fair definition of Computer Science would be the discipline that concerns itself with
information.Computersareanenablingtechnology,butcomputerscienceislargelyabout
how to store, retrieve, represent, compress, display, transmit, and otherwise handle
information. Python happens to be pretty good at offering facilities for manipulating
information or, at a lower level, data. Data becomes information when a person can
interpretit,andinformationbecomesknowledgeonceunderstood.
Repeatingathemeofthisbook,dataonacomputerisstoredasnumbersnomatterwhat
itsoriginalformwas.Computerscanonlyoperateonnumbers,soanimportantaspectof
usingdata is therepresentation of complexthings as numbers.Based on manyyears of
experience, it seems to be possible in all cases, and the manner in which the data is
representedasnumbersisreflectedinthemethodsusedtooperateonthem.
Thischapterwillbeanexaminationofhowcertainkindsofdataarerepresentedandthe
consequencesinsofarascomputerprogramscanusethesedata.Pythoninparticularwill
beusedforthisexamination,althoughsomeofthediscussionismoregeneral.Ofcourse,
thediscussionwillbedrivenbypracticalthingsandbyhowthingscanbeaccomplished
usingPython.
Most data consist of measurements of something, and as such are fundamentally
numeric.Astronomersmeasurethebrightnessofstars,asanexample,andnotehowthey
vary or not as a function of time. The data consists of a collection of numbers that
represent brightness on some arbitrary scale; the units of measurements are always in
some sense arbitrary. However, units can be converted from one kind to another quite
simply,sothisisnotaproblem.Biologistsfrequentlycountthings,soagaintheirdatais
fundamentally numeric. Social scientists ask questions and collect answers into groups,
againanumericresult.Whatthingsarenot?
Photographsarecommonenoughinscienceandarenotnumericvaluesbutare,instead,
visual;theyrelatetoahumansensethatcanbeunderstoodbyotherhumanseasily,rather
thantoananalyticalapproach.Ofcoursemostphotographsareultimatelyanalyzedbya
computerthesedays,sotheremustbeawaytorepresentthemdigitally.Anotherhuman
sensethatisusedtoexaminedataishearing.Birdsmakesongsthatindicatemanythings,
includingwhattheyobserveandtheirwillingnesstomate.Soundsarevibrations,andcan
indicateproblemswithmachinery,theapproachofavehicle,thepresenceofapredator,or
thecurrentstateoftheweather.Touchislessoftenused,butisessentialinthecontrolof
objectsbyhumans.Apersoncontrollingadeviceatagreatdistancecanprofitfromthe
abilitytofeelthetouchofatoolacrossacomputernetwork.
Then there are search engines, which can be thought of as an extension of human
memoryandreasoning.Theabilityofhumanstoaccessinformationhasimprovedhugely
over the past twenty years. If the phrase “python data manipulation” is entered into the
Google search engine, over half a million results are returned. True, many may not
directly relate to the query as it was intended, but part of the problem will be in the
phrasingoftherequest.Bytheway,thefirstresponsetothequeryconcernsthepandas
modulefordataanalysis,whichmayinfacthavebeentherightanswer.
Howisallof thisdone?It doestakesome cleveralgorithmsandgood programming,
butitalsorequiresalanguagethatofferstherightfacilities.
8.1 DICTIONARIES
A Python dictionary is an important structure for dealing with data, and is the only
important language feature that has not been discussed until now. One reason is that a
dictionaryis more properly anadvanced structurethat isimplemented interms ofmore
basic ones. A list, for example, is a collection of things (integers, reals, strings) that is
accessed by using an index, where the index is an integer. If the integer is given, the
contentsofthelistatthatlocationcanberetrievedormodified.
A dictionary allows a more complex, expensive, and useful indexing scheme: it is
accessedbycontent.Well,byadescriptionofcontentatleast.Adictionarycanbeindexed
by a string, which in general would be referred to as a key, and the information at that
locationinthedictionaryissaidtobeassociatedwiththatkey.Anexample:adictionary
that returns the value of a color given the name. A color, as described in Chapter 7, is
specifiedbyared,green,andbluecomponent.Atuplesuchas(100,200,100)canbeused
torepresentacolor.Soinadictionarynamedcolorsthevalueofcolors[“red”]mightbe
(255,0,0)andcolors[“blue”]is(0,0,255).Naturally,itisimportanttoknowwhatnames
are possible or the index used will not be legal and will cause an error. So
colors[“copper”]mayresultinanindexerror,whichiscalledaKeyErrorforadictionary.
ThePythonsyntaxforsettingupadictionarydiffersfromanythingthathasbeenseen
before.Thedictionarycolorscouldbecreatedinthisway:
colors={'red':(255,0,0),'blue':(0,0,255),'green':(0,255,0)}
Thebraces{…}encloseallofthethingsbeingdefinedaspartofthedictionary.Each
entryisapair,withakeyfollowedbya“:”followedbyadataelement.Thepair“red”:
(255,0,0) means that the key “red” will be associated with the value (255,0,0) in this
dictionary.
Nowthenamecolorslookslikealist,butisindexedbyastring:
print(colors['blue'])
Theindexiscalledakeywhenreferringtoadictionary.That’sbecauseitisnotreally
anindex,inthatthestringcan’tdirectlyaddressalocation.Insteadthekeyissearchedfor,
andifitisalegalkey(i.e.,hasbeendefined)thecorresponding
dataelementisselected.Thedefinitionofcolorscreatesalistofkeysandalistofdata:
Whentheexpressioncolors[“blue”]isseen,thekey“blue”issearchedforinthelistof
allkeys.Itisfoundatlocation1,sotheresultoftheexpressionisthedataelementat1,
which is (0,0,255). Python does all of this work each time a dictionary is accessed, so
whileitlookssimpleitreallyinvolvesquiteabitofwork.
Newassociationscanbemadeinassignmentstatements:
colors['khaki']=(240,230,140)
Indeed,adictionarycanbecreatedwithanemptypairofbracesandthenhavevalues
givenusingassignments:
colors={}
colors['red']=(255,0,0)
...
Aswithothervariables,thevalueofanelementinadictionarycanbechanged.This
wouldchangetheassociationwiththekey;therecanonlybeonethingassociatedwitha
key.Theassignment:
colors['red']=(200.,0,0)
reassigns the value associated with the key “red.” To delete it altogether use the del()
function:
del(colors['blue'])
Other types can be used as keys in a dictionary. In fact, any immutable type can be
used.Henceitispossibletocreateadictionarythatreversestheassociationofnametoits
RGB color, allowing the color to be used as the key and the name to be retrieved. For
example:
names={}
names[(255,0,0)]='red'
names[(0,255,0)]='green'
Thisdictionaryusestuplesaskeys.Listscan’tbeusedbecausetheyarenotimmutable.
8.1.1 Example:ANaiveLatin–EnglishTranslation
Asuccessfullanguagetranslationprogramisdifficulttoimplement.Humanlanguages
are unlike computer languages in that they have nuances. Words have more than one
meaning,andmanywordsmeanessentiallythesamething.Somewordsmeanonething
inaparticularcontextandadifferentthinginanothercontext.Sometimesawordcanbea
noun and a verb. It is very confusing. What this program will do is substitute English
wordsforLatinones,usingaPythondictionaryasthebasis.
From various sites on the Internet a collection of Latin words with their English
counterpartshasbeencollected.Thisisatextfilenamed“latin.txt.”IthastheLatinword,
aspace,andtheEnglishequivalentonsinglelinesinthefile.Theprogramwillaccepttext
fromthekeyboardandtranslateitintoEnglish,wordbyword,assumingthatitoriginally
consisted of Latin words. The file of Latin words has 3129 items, but it should be
understoodthatonewordinanylanguagehasmanyformsdependingonhowitisused.
Manywordsaremissinginoneformoranother.
Thewaytheprogramworksisprettysimple.Thefileofwordsisreadinandconverted
intoadictionary.ThefilehasaLatinword,acomma,andanEnglishword,soalineis
read,convertedtoatupleusingsplit(),andtheLatinwordisusedasakeytostorethe
Englishwordintothedictionary.
Next, the program asks the user for a phrase in Latin, and the user types it in. The
phraseissplitintoindividualwordsandeachoneislookedupinthedictionaryandthe
Englishversionisprinted.This will notworkverywell ingeneral,butisa firststepin
creatingatranslationprogram.Thecodelookslikethis:
defload_words(name,dict): #Readthefileofwords
f=open(name,“r”)
s=f.readline() #Readonewordpair
whiles!=””: #exitwhenthefilehasbeen
#read
c=s.split(“,”) #Splitatthecomma
iflen(c)<2: #Possibleerror:nowords?
s=f.readline() #Readnextandcontinue
continue
sw=c[0].strip() #GetthelatinandEnglish
#words.
ew=c[1].strip()
iflen(ew)<=0: #OK?
s=f.readline() #Nope.Justskipit.
continue
ifew[-1]==“\n”: #Getrideoftheendline
ew=ew[0:-2]
dict[sw]=ew #Placeindictionary
s=f.readline() #Nextwordpairfromthefile
f.close() #Alwaysclosewhendone
dict={}
load_words(“latin.txt”,dict) #Readallofthewordpairs
inp=input(“Enteralatinphrase“) #GettheLatintext
whileinp!=””: #Done?
book=inp.split(”“) #Splitatwords
foriinrange(0,len(book)): #Foreachwordthisline
sword=book[i].lower() #Lowercase
try:
enword=dict[sword] #LookupLatinword
print(enword,end=””) #PrintEnglishversion
except:
print(sword,end=””) #Latinnotin
#dictionary
print(”“,end=””) #PrinttheLatin
print(“.”)
inp=input(“Enteralatinphrase“) #Doitagain
Of course translation is more complex than just changing words, and that’s all this
programdoes.Still,sometimesitdoesnotdotoobadly.AfavoriteLatinphrasefromthe
TVprogramTheWestWingis“Posthocergopropterhoc.”Giventhisphrasetheprogram
produced:
afterthisthereforebecauseofthis.
whichisaprettyfairtranslation.Tryinganother,“Alldogsgotoheaven”wassentto
anonlinetranslationprogramanditgave
omnescanesadcaelumireconspexerit
ThisprogramheretranslatesitbackintoEnglishas:
“alldogstoskygoconspexerit.”
Theword ‘conspexerit’ wasnot successfullytranslated,so itwas leftas itwas (the
onlineprogramtranslatesthatwordas“glance”).Thisisstillnotterrible.
Sadly,itmakesacompletehashoftheLord’sPrayer:
Paternosterquiesincaelissanctificeturnomentuum.
Adveniatregnumtuum.
Fiatvoluntastuasicutincaeloetinterra.
Panemnostrumquotidianumdanobishodieetdimittenobisdebitanostrasicutetnos
dimittimusdebitoribus.
Fiatvoluntastuasicutincaeloetinterra.
Amen
Isturnedinto:
fatherourthatyouareagainstheavensholynameyour.
downruleyour.
becomeslastyourasagainstheavenandagainstearth.
breadourdailydausdayanddimitteusdebitaourasandusforgivedebtors.
becomeslastyourasinheavenandinearth.
amen
A useful addition to the code would be to permit the user to add new words into the
dictionary. In particular, it could prompt the user for words that it could not find, and
perhaps even ask whether similar words were related to the unknown one, such as
“dimittimus”and“dimitte.”Ofcourse,beingabletohavesomebasicunderstandingofthe
grammarwouldbebetterstill.
8.1.2 FunctionsforDictionaries
The power of the store-fetch scheme in the dictionary is impressive. There are some
methods that apply mainly to dictionaries and that can be useful in more complex
programs.Themethodkeys()returnsthecollectionofallofthekeysthatcanbeusedwith
adictionary.So:
list(dict.keys())
isalistofallofthekeys,andthiscanbesearchedbeforedoinganycomplex
operationsonthedictionary.Thelistofkeysisnotinanyspecificorder,andiftheyneed
tobesortedthen:
sorted(dict.keys())
willdothejob.Thedel()methodhasbeenusedtoremovespecifickeysbutdict.clear()
willremoveallofthem.
The method setdefault() can establish a default value for a key that has not been
defined.Whenanattemptismadetoaccessadictionaryusingakey,anerroroccursifthe
keyhasnotbeendefinedforthatdictionary.Thismethodmakesthekeyknownsothatno
errorwilloccurandavaluecanbereturnedforit;None,perhaps.
dict.setdefault(key,default=None)
Otherusefulfunctionsinclude:
dict.copy() returnsa(shallow)copyofdictionary
dict.fromkeys() createsanewdictionarysettingkeysandvalues;e.g.,dict.fromkeys(
(“one”,“two”),3)creates{(“one”,3),(“two”,3)}
dict.items() returnsalistofdict’s(key,value)tuplepairs.
dict.values() returnslistofdictionarydict’svalues
dict.update(dict2) addsthekey-valuepairsfromdictionarydict2todict
TheexpressionkeyindictisTrueifthekeyspecifiedexistsinthedictionarydict.
8.1.3 DictionariesandLoops
Dictionaries are intended for random access, but on occasion it is necessary to scan
throughpartsorallofone.Thetrickistocreatealistfromthepairsinthedictionaryand
thenloopthroughthelist.Forexample:
for(key,value)indict.items():
print(key,”hasthevalue“,value)
Thekeysaregiveninaninternalorderwhichisnotalphabetical.Itisasimplematterto
sortthem,though:
for(key,value)insorted(dict.items()):
print(key,”hasthevalue“,value)
Byconvertingthedictionarypairsinalist,anyoftheoperationsonlistscanbeapplied
toadictionaryaswell.Itisevenpossibletousecomprehensionstoinitializeadictionary.
Forexample
d={angle:sin(radians(angle))foranglein(0,45.,90.,135.,180.)}
createsadictionaryofthesinesofsomeanglesindexedbytheangle.
8.2 ARRAYS
For programmers who have used other languages, Python lists have many of the
properties of an array, which in C++ or Java is a collection of consecutive memory
locationsthatcontainthesametypeofvalue.Listsmaybedesignedtomakeoperations
suchasconcatenationefficient,whichmeansthatalistmaynotbethemostefficientway
tostorethings.APythonarrayisaclassthatmimicsthearraytypeofotherlanguagesand
offersefficiencyinstorage,exchangingthatforflexibility.
Onlycertaintypescanbestoredinanarray,andthetypeofthearrayisspecifiedwhen
itiscreated.Forexample:
data=array('f',[12.8,5.4,8.0,8.0,9.21,3.14])
createsanarrayof6floatingpointnumbers;thetypeisindicatedbythe“f”asthefirst
parameter to the constructor. This concept is unlike the Python norm of types being
dynamicandmalleable.Anarrayisanarrayofonekindofthing,andanarraycanonly
holdarestrictedsetoftypes.
Thetypecode,thefirstparametertotheconstructor,canhaveoneof13values,butthe
mostcommonlyusedoneswillbe:
“b”AC++chartype
“B”AC++unsignedchartype
“i”:AC++inttype
“l”:AC++longtype
“f”:AC++floattype
“d”:AC++doubletype
Arraysareclassobjectsandareprovidedinthebuilt-inmodulearray,whichmustbe
imported:
fromarrayimportarray
Anarray is a sequence type, and has the basic properties and operations that Python
provides all sequence types. Array elements can be assigned to and can be used in
expressions, and arrays can be searched and extended like other sequences. There are
somefeaturesofarraysthatareunique:
frombytes
(s)
Thestringargumentsisconvertedintobytesequencesandappendedtothe
array.
fromfile(f,
num)
Readnumitemsfromthefileobjectfandappendthem.Aninteger,for
example,isoneitem.
fromlist(x) Appendtheelementsfromthelistxtothearray.
tobytes() Convertthearrayintoasequenceofbytesinmachinerepresentation.
tofile(f) Writethearrayasasequenceofbytestothefilef.
In most cases arrays are used to speed up numerical operations, but they can also be
used(andwillbeinthenextsection)toaccesstheunderlying
representationsofnumbers.
8.3 FormattedText,FormattedI/O
There is a generally believed theory among many users of data, including some
engineersandfinancialanalysts,thatifnumberslineupinnicecolumnsthentheymustbe
correct.Thisisobviouslynottrue,butappearancescanmatteragreatdeal,andnumbers
thatdonotlineupproperlyforeasyreadinglooksloppyandgivepeopletheimpression
thattheymaynotbeascarefullypreparedastheyshouldhavebeen.ThePythonprint()
functionasusedsofarsimplyprintsacollectionofvariablesandconstantswithnoreal
attentiontoaformat.Eachoneisprintedintheorderspecifiedwithaspacebetweenthem.
Sometimesthat’sgoodenough.
ThePythonversionssince2.7haveincorporatedastringformat()methodthatallowsa
programmertospecifyhowvaluesshouldbeplacedwithinastring.Theideaistocreatea
stringthatcontainstheformattedoutput,andthenprintthestring.Asimpleexampleis:
s=“x={}y={}”
fs=s.format(121.2,6)
Thestringfsnowcontains“x=121.2y=6.”Thebraceswithintheformatstringshold
theplaceforavalue.Theformat()methodlistsvaluestobeplacedintothestring,and
with no other information given it does so in order of appearance, in this case 121.2
followedby6.Thefirstpairofbracesisreplacedbythefirstvalue,121.2,andthesecond
pairofbracesisreplacedbythesecondvalue,whichis6.Nowthestringfscanbeprinted.
This is not how it is usually done, though. Because this is usually part of the output
process,itisoftenplacedwithintheprint()call:
print(“x={}y={}”.format(121.2,6))
wheretheformat()methodisreferencedfromthestringconstant.Noactualformattingis
donebythisparticularcall,merelyaconversiontostringandasubstitutionofvalues.The
way formatting is done depends on the type of the value being formatted, the most
commontypesbeingstrings,integers,andfloats.Anexamplewillbeilluminating.
8.3.1 Example:NASAMeteoriteLandingData
NASApublishesahugeamountofdataonitswebsites,andoneoftheseisacollection
ofmeteoritelandings.Itcoversmanyyearsandhasover4800entries.Thetaskassigned
hereistoprintanicelyformattedreportonselectedpartsofthedata.Thedataonthefile
has its fields separated by commas, and there are ten of them: name, id, nametype,
recclass,mass, Fall, year, reclat,reclong, and GeoLocation. The reportrequires that the
name,recclass,mass,reclatandreclongbearrangedinanicelyformattedsetofcolumns.
Readingthedataisamatterofopeningthefile,whichisnamed“met.txt,”andcalling
readline(),thencreatingalistofthefieldsusingsplit(“,”).Ifthisisdoneandthefields
aresimplyprintedusingprint(),theresultismessy.Anabbreviatedexampleis(simulated
data):
infile=open(“met.txt”,“r”)
inline=infile.readline()
whileinline!=””:
inlist=inline.split(“,”)
mass=float(inlist[4])
lat=float(inlist[7])
long=float(inlist[8])
print(inlist[0],inlist[3],inlist[4],inlist[7],
inlist[8])
inline=infile.readline()
infile.close()
Theresultis,aspredicted,messy:
AshdonH5121.1351998525487489.85924301385958-126.27404435776049
ArbolSoloH666.9477713434351625.567048824444797160.58088365396014
BaldwynL647.6388587105465-7.708508536783924-81.22266156597777
AnkoberL615.265523451122064-32.01862330869428102.31244557598723
AnkoberLL657.584802700693885-84.85880091616322106.31130649523368
AshCreekL662.13008952551615576.02832670618457-140.03422105516938
AlmahataSittaLL530.476879105555653-12.90674540458647.411816322674
Nothinglinesupincolumns,andthenumbersshowanimpossibledegreeofprecision.
Alsothereshouldbeheadings.
Thefirstfieldtobeprintediscalledname,andisastring;itisthenameofthelocation
wheretheobservationwasmade.Theprintstatementsimplyaddsaspaceafterprintingit,
and so the next thing is printed immediately following it. Things do not line up.
Formattingastringforoutputinvolvesspecifyinghowmuchspacetoallowandwhether
thestringshouldbecenteredoralignedtotheleftorrightsideoftheareawhereitwillbe
printed.Applyingaleftalignmenttothestringvariablenamedplacenameinafieldof16
characterswouldbedoneasfollows:
ꞌ{:16s}ꞌ.format(placename)
The braces, which have previously been empty, contain formatting directives. Empty
braces mean noformatting, and simply hold the place for a value. A full format could
containaname,aconversionpart,andaspecification:
{[name][‘!’conversion][‘:’specification]}
where optional parts are in square brackets. Thus, the minimal format specification is
‘“{}.” In the example “{:16s}” there is no name and no conversion parts, only a
specification.Afterthe“:”is‘16s,’meaningthatthedatatobeplacedhereisastring,and
that 16 characters should be allowed for it. It will be left aligned by default, so if
placenamewas“Atlanta,”theresultoftheformattingwouldbethestring“Atlanta,”left
aligned in a 16-character string. Unfortunately, if the original string is longer than 16
charactersitwillnotbetruncated,andallofthecharacterswillbeplacedintheresulting
stringevenifitmakesittoolong.
Torightalignastring,simplyplacea“>”characterimmediatelyfollowingthe“:”.So:
“{:>16s}”.format(“Atlanta”)
wouldbe“Atlanta.”Placinga“<”charactertheredoesaleftalignment(thedefault)and
“^” means to center it in the available space. The alignment specifications apply to
numbersaswellasstrings.
Thefirsttwovaluestobeprintedintheexamplearethecityname,whichisininlist[0]
andthemeteoriteclasswhichisinlist[3].Formattingtheseisdoneasfollows:
s='{:16s}{:10s}'.format(inlist[0],inlist[3])
Bothstringswillbeleftaligned.
Numericformats are morecomplicated. For integersthere is the totalspace to allow,
andalsohowtoalignitandwhattodowiththesignandleadingzeros.Theformatting
letterforanintegeris“d”,sothefollowingarelegaldirectivesandtheirmeaning:
Floatingpointnumbershavetheextraissueofthedecimalplace.Theformatcharacter
isoften“f,”butitcanbe“e”forexponentialformator“g”forgeneralformat,meaningthe
systemdecideswhethertouse“f”or“e.”Otherwise,theformattingofafloatingpointis
likethatofpreviousversionsofPythonandlikethatofCandC++:
Thenextthreevaluestobeprintedarefloatingpoint:themassofthemeteoriteandthe
location,aslatitudeandlongitude.Printingeachoftheseas7places,2totherightofthe
decimal,wouldseemtowork.Or,asaformat:“{:7.2f}.”
Thesolutiontotheproblemisnowathand.Thedataisreadlinebyline,convertedinto
alist,andthenthefieldsareformattedandprintedintwosteps:
infile=open(“met.txt”,“r”)
inline=infile.readline()
print(”PlaceClassMassLatitude
Longitude”)
whileinline!=””:
inlist=inline.split(“,”)
mass=float(inlist[4])
lat=float(inlist[7])
long=float(inlist[8])
print('{:16s}{:14s}{:7.2f}'.format(inlist[0],
inlist[3],mass),end=””)
print('{:7.2f}{:7.2f}'.format(lat,long))
inline=infile.readline()
infile.close()
Theresultis:
Place Class Mass Latitude Longitude
Bloomington L5 13.58 9.53 -150.85
Bogou LL6 121.09 -66.28 -53.08
Alessandria L4 106.11 63.68 10.96
BoXian L5 85.92 0.33 -50.28
Ashdon Eucrite-mmict 6.59 -88.22 -178.84
Berduc L6 111.76 -64.20 107.10
…
Therearemanymoreformattingdirectives,andahugenumberoftheircombinations.
8.4 ADVANCEDDATAFILES
File operations were discussed Chapter 5, but the discussion was limited to files
containingtext.Textiscrucialbecauseitishowhumanscommunicatewiththecomputer;
peopleare unhappy abouthaving to enterbinary numbers. On the other hand, textfiles
takeupmorespacethanneededtoholdtheinformationtheydo.Eachcharacterrequiresat
least one byte. The number 3.1415926535 thus takes up 12 bytes, but if stored as a
floatingpointnumberitneedsonly4or8dependingonprecision.
Thefilesystem onmost computersalso permitsa varietyof operationsthat havenot
beendiscussed.Thisincludesreadingfromanypointinafile,appendingdatatofiles,and
modifyingdata.Theneedforprocessingdataeffectivelyisamainreasonforcomputersto
exist at all, so it is important to know as much as possible about how to program a
computerforthesepurposes.
8.4.1 BinaryFiles
A binary file is one that does not contain text, but instead holds the raw, internal
representationofitsdata.Of course,allfileson acomputerdiskare binaryinthestrict
sense,becausetheyallcontainnumbersinbinaryform,butabinaryfileinthisdiscussion
does not contain information that can be read by a human. Binary files can be more
efficientthatotherkinds,bothinfilesize(smaller)andthetimeittakestoreadandwrite
them(less).Manystandardfilestypes,suchasMP3,existasbinaryfiles,soitisimportant
tounderstandhowtomanipulatethem.
Example:CreateaFileofIntegers
Thearraytypeholdsdatainaformthatismorenaturalformostcomputersthanalist,
andalsohasthetofile() methodbuilt in. If a collection of integers is to bewritten as a
binaryfile,afirststepistoplacethemintoanarray.Ifasetof10000consecutiveintegers
aretobewrittentoafilenamed“ints,”thefirststepistoimportthearrayclassandopen
theoutputfile.Noticethatthefileisopenin“wb”mode,whichmeans“writebinary”:
fromarrayimportarray
output_file=open('ints','wb')
Now create an array to hold the elements and fill the array with the consecutive
integers:
arr=array('i')
forkinrange(10000,20000):
arr.append(k)
Finally,writethedatainthearraytothefile:
arr.tofile(out)
out.close()
Thisfilehasasizelistedas40kbonaWindowsPC.Afilehavingthesameintegers
writtenastextis49kb.Thisisnotexactlyahugesavingofspace,butitdoesaddup.
Readingthesevaluesbackisjustassimple:
inf=open('ints','rb')
arrin=array('i')
forkinrange(0,10001):
try:
arrin.fromfile(inf,1)
except:
break
print(arrin[k])
inf.close()
Thetryisusedtocatchanendoffileerrorincaseswherethenumberofitemsonthe
fileisnotknowninadvance.Orjustbecausealwaysdoingsoisagoodidea.
Sometimesabinaryfilewillcontaindatathatisallofthesametype,butthatsituationis
not very common. It is more likely that the file will have strings, integers, and floats
intermixed. Imagine a file of data for bank accounts or magazine subscriptions; the
information included will be names and addresses, dates, financial values, and optional
data, depending on the specific situation. Some customers have multiple accounts, for
example.Howcanbinaryfilesbecreatedthatcontainmorethanonekindofinformation?
Byusingstructs.
8.4.2 TheStructModule
Thestructmodulepermitsvariablesandobjectsofvarioustypestobeconvertedinto
whatamountstoasequenceofbytes.Itisacommonclaimthatthisisinordertoconvert
between Python forms and C forms, because C has a struct type (short for structure).
However, many files exist that consist of mixed-type data in raw (i.e., machine
compatible) form that have been created by many programs in many languages. It is
possiblethatCissingledoutbecausethenamestructwasused.
Example:AVideoGameHighScoreFile
Videogameplayersneedlittleincentivetotryhardtowinagame,butformanyyearsa
specialrewardhasbeengiventothebetterplayers.Thegame
“remembers”thebestplayersandliststhematthebeginningandendofthegame.This
kindofegoboostisapartoftherewardsystemofthegame.Thegameprogramstoresthe
informationon afile indescending order ofscore. Thedata that issaved isusually the
player’snameorinitials,thescore,andthedate.Thismixesstringwithnumericdata.
Considerthattheplayer’snameisheldinavariablename,thescoreisanintegerscore,
andthe dateis aset ofthree stringsyear,month,andday. In this situation the size of
eachvalueneedstobefixed,soallow32charactersforthename,4foryear,2formonth,
and 2 for day. The file was created with the name first, then the score, then the year,
month,and day.The order matters because it will be read inthe sameorder thatit was
written.Onthefilethedatawilllooklikethis:
cccccccccccccccccccccccccccccccc iiii cccc cc cc
Player’sname Score Year Month Day
Each letter in the first string represents a byte in the data for this entry. The ‘c’s
representcharacters;the‘i’srepresentbytesthatarepartofaninteger.Thereare44bytes
in all, which is the size of one data record,which is what one set of related data is
generallycalled.Afilecontainstherecordsforalloftheelementsinthedataset,andin
thiscasearecordisthedataforoneplayer,oratleastonetimethattheplayerplayedthe
game.Therecanbemultipleentriesforaplayer.
Oneway to convert mixed data like thisinto a struct is to use the pack() method. It
takesaformatparameterfirst,whichindicateswhatthestructwillconsistofintermsof
bytes. Then the values are passed that will be converted into components of the final
struct.Fortheexampleherethecalltopack()wouldbe:
s=pack(“32si4s2s2s”,name,score,year,month,day)
Theformatstringis“32si4s2s2s”;thereare5partstothis,oneforeachofthevaluesto
bepacked:
32sisa32-characterlongstring.Itshouldbeoftypebytes.
iisoneinteger.However,2iwouldbetwointegers,and12iis12integers.
4sisa4-characterlongstring.
2sisa2-characterlongstring.
Otherimportantformatitemsare:
cisacharacter
fisafloat
disadoubleprecisionfloat
Thevaluereturnedfrompack()hastypebytes,andinthiscaseis44byteslong.The
highscorefileconsistsofmanyoftheserecords,allofwhicharethesamesize.Arecord
canbewrittentoafileusingwrite().So,aprogramthatwritesjustonesuchrecordwould
be:
fromstructimport*
f=open(“hiscores”,“wb”)
name=bytes(“JimParker”,'UTF-8')
score=109800
year=b”2015”
month=b”12”
day=b”26”
s=pack(“32si4s2s2s”,name,score,year,month,day)
f.write(s)
Readingthisfileinvolvesfirstreadingthestringofbytesthatrepresentedadatarecord.
Thenitisunpacked,whichisthereverseofwhatpack()does,andthevariablesarepassed
totheunpack()functiontobefilledwithdata.Theunpack()methodtakesaformatstring
asthefirstparameter,thesamekindofformatstringaspack()uses.Itwillreturnatuple.
Anexamplethatreadstherecordintheabovecodewouldbe:
fromstructimport*
f=open(“hiscores”,“rb”)
s=f.read(44)
name,score,year,month,day=unpack(“32si4s2s2s”,s)
name=name.decode(“UTF-8”)
year=year.decode(“UTF-8”)
month=month.decode(“UTF-8”)
day=day.decode(“UTF-8”)
The data returned by unpack are bytes, and need to be converted into strings before
beingusedinmostcases.Notetheinputmodeontheopen()callis“rb,”readbinary.
Afileinthisformathasbeenprovided,andisnamedsimply‘hiscore.’Whenaplayer
playsthegametheywillentertheirname;thecomputerknowstheirscoreandthedate.A
newentrymustbemadeinthe‘hiscore’filewiththisnewscoreinit.Howisthatdone?
StartwiththenewplayerdataforKarlHolter,withascoreof100000.Toupdatethe
fileitisopenedandrecordsarereadandwrittentoanewtemporaryfile(named“tmp”)
untiloneisfoundthathasasmallerscorethanthe100000thatKarlachieved.ThenKarl’s
recordiswrittentothetemporaryfile,andtheremainderof‘hiscores’iscopiedthere.This
createsanewfilenamed“tmp”thathasKarl’sdataaddedtoit,andinthecorrectplace.
Nowthatfilecanbecopiedto“hiscores”replacingtheoldfile,orthefilenamed“tmp”
canberenamedas“hiscores.”Thisiscalledasequentialfileupdate.
Renaming the file requires access to some of the operating system functions in the
moduleos;inparticular:
os.rename(“tmp”,“hiscores”)
8.4.3 RandomAccess
It seems natural to begin reading a file from the beginning, but that is not always
necessary. If the data that is desired is located at a known place in the file, then the
locationbeingreadfromcanbesettothatpoint.Thisisanaturalconsequenceofthefact
thatdiskdevicescanbepositionedatanylocationatanytime.Whynotfilestoo?
Thefunctionthatpositionsthefileataspecificbytelocationisseek():
f.seek(44)#Positionthefileatbyte44,
#whichisthesecondrecordinthehiscores
#file.
It’salsopossibletopositionthefilerelativetothecurrentlocation:
f.seek(44,1)#Positionthefile44bytesfromthis
#location,
#whichskipsoverthenextrecordin
#hiscores.
Afilecanberewoundsothatitcanbereadoveragainbycallingf.seek(0),positioning
the file at the beginning. It is otherwise difficult to make use of this feature unless the
recordsonthefileareofafixedsize,astheyareinthefile‘hiscores,’ortheinformation
onrecordsizesissavedinthefile.Somefilesareintendedfromtheoutsettobeusedas
randomaccessfiles.Thosefileshaveanindexthatallowsspecificrecordstobereadon
demand.This is verymuch like adictionary, buton afile. Assumingthat thescore for
playerArlenFranksisneeded,thenameissearchedforintheindex.Theresultisthebyte
offsetforArlen’shighscoreentryinthefile.
Arlen’srecordstartsatbyte352(8threcord*44bytes).Hejustplayedthegameagain
andimprovedhisscore.Whynotupdatehisrecordonthefile?Thefileneedstobeopen
for input and output, so mode “rb+,” meaning open a binary file for input and output,
wouldworkinthiscase.ThenpositionthefiletoArlen’srecord,createanewrecord,and
writethatonerecord.Thisisnew—beingabletobothreadandwritethesamefileseems
odd,butifthedatabeingwrittenisexactlythesamesizeastherecordonthefilethenno
harmshouldcomefromit.Theprogramis:
#readandprinthiscorefile
fromstructimport*
f=open(“hiscores”,“r+b”)#Openbinaryfile,inputand
#output
pos=44*8#Desiredrecordis8,44
#bytesper
f.seek(pos)#Seektothatpositionone
#thefile
s=f.read(44)#Readthetargetrecord
name=bꞌArlenFranksꞌ#Makeanewonewithanew
#score
score=100300
year=bꞌ2015ꞌ
month=bꞌ12ꞌ
day=bꞌ26ꞌ#Packthenewdata
ss=pack(“32si4s2s2s”,name,score,year,month,day)
f.seek(44*8)#Seektheoriginalposition
#again!
f.write(ss)#Writethenewdataover
#theold
f.close()#Closethefile
Thisworksfine,providedthatthepositionofArlen’sdatainthefileisknown.Itdoes
notmaintainthefileindescendingorder,though.
Example:MaintainingtheHighScoreFileinOrder
Thecircumstancesofthenewproblemarethataplayeronlyappearsinthehighscore
fileonceandthefileismaintainedindescendingorderofscore.Ifaplayerimprovestheir
score, then their entry should move closer to the beginning of the file. This is a more
difficultproblemthanbefore,butonethatisstillpractical.So,presumethataplayerhas
achievedanewscore.Theentireprocessshouldbe:
Gettheplayer’sold
score.
Readthefile,gettheplayer’srecord,unpackit.
Isthenewscorelarger? Ifnot,closethefile.Done.
Yes,sofindoutwhere
thescore
belongs,inthefile.
Lookatsuccessivelyprecedingrecordsuntiloneisfoundthathas
alargerscore.
Placethenewrecord
whereitbelongs.
Copytherecordsfromthenewpositionfortherecordaheadone
positionuntiltheoldpositionisreached.
Theprocessislikemovingaplayingcardclosertothetopofthedeckwhileleavingthe
other cards in the same order. It’s probably more efficient to move the record while
searchingforthecorrectposition,though.Eachtimethepreviousrecordisexamined,ifit
does not have a larger score then the record being placed is copied ahead one position.
Thisresultsinaprettycompactprogram,giventhenatureoftheproblem,butitisabit
trickytogetright.Forexample,whatifthenewscoreisthehighest?Whatifthecurrent
highscoregetsahigherscore?(See:Exercise11)
8.5 STANDARDFILETYPES
Everyone’s computer has files on it that the owner did not create. Some have been
downloaded; some merely came with the machine. It is common practice to associate
specific kinds of files, as indicated initially by some letters at the end of the file name,
withcertainapplications.Afilethatendsin“.doc,”forexample,isusuallyafilecreated
byMicrosoftWord,andafileendingin“.mp3”isusuallyasoundfile,oftenmusic.Such
fileshave a formatthat is understoodby existing softwarepackages, and someof them
(“.gif”)havebeenaroundforthirtyyears.
Eachfiletype hasbeen designedto makecertain operationseasy, andto pass certain
informationtotheapplication.Overtheyearsasetofdefactostandardshaveevolvedfor
howthesefilesarelaidout,andforwhatdataareprovidedforwhatkindsoffile.Andyet
most users and many programmers do not understand how these files are structured or
why.Manyusersdonotcare,ofcourse,andsomeprogrammerstoo,butopeningupthese
filestosomescrutinyisaneducationalexperience.
8.5.1 ImageFiles
Images have been processed using computers since the 1960s when NASA started
processingimagesat theJet Propulsion Laboratory.After someyearspeople (scientists,
mainly) decided that having standards for computer images would be useful. The first
formatswereadhoc,andbasedessentiallyonrawpixeldata.Rawdatameansknowing
what the image size is in advance, so headers were introduced providing at least that
information,leadingtotheTARGAformat(.tga)andtiff(TaggedImageFileFormat)in
themid-1980s.WhentheInternetandtheWorldWideWebbecamepopular,theGIFwas
invented, which compressed the image data. This was followed by JPEG and other
formatsthat couldbeused bywebdesignersand renderedbybrowsers, andeachhad a
specificadvantage.Afterall,reducingsizemeantreducingthetimeittooktodownloadan
image.
Onceafileformathasbeenaroundforafewyearsandhasbecomesuccessfulittends
tostickaround,somanyoftheimagefileformatscreatedinthe1980sarestillhereinone
formoranother.Therearenewonestoo,likePNG(PortableNetworkGraphics),which
have been specifically designed for the Internet. Older ones (like JPEG) have found
commonusesinnewtechnologies,likedigitalcameras.Aprogrammer/computerscientist
needstoknowaboutthenatureofthevariousformats,theirprosandconsasitwere.
8.5.2 GIF
TheGraphicsInterchange Formatisinteresting frommanyperspectives. First,it uses
compression to reduce the size of the file, but the compression method is not lossy,
meaningthattheimagedoesnotchangeafterbeingcompressedandthendecompressed.
ThecompressionalgorithmusediscalledLZW,andwillbediscussedinChapter10.GIF
usesacolormaprepresentation,soanelementintheimageisnotacolor,butinsteadisan
indexintoanarraythatholdsthecolor.Thatis,ifv=image[row][column]thenthecolor
ofthatpixelis(red[v],green[v],blue[v]).Thecoloritselfcouldbeafull24bits,butthe
valuevisabyte,andsoinaGIFtherecanonlybe256distinctcolors.GIFusesalittle-
endianrepresentation,meaningthattheleastsignificantbyteofmulti-byteobjectscomes
firstonthefile.
OneadvantageoftheGIFisthatoneofthecolorscanbemadetransparent.Thismeans
thatwhenthiscolorisdrawnoveranother,thecolorbelowshowsthrough.Itisessentially
a“donotdrawthispixel”value.Itisimportantforthingslikespritesincomputergames.
AnotheradvantageofGIFisthatmultipleimagescanbestoredinasinglefile,allowing
an animation to be saved in a single file. GIF animations have been common on the
Internetformanyyears,andwhiletheyusuallyrepresentsmall,briefanimationssuchas
Christmas trees with flashing lights, they can be as long and complex as television
programs.Still,thefactthattherecanonlybe256differentcolorscanbeaproblem.
AGIFisabinaryfile,butthefirstsixcharactersareaheaderblockcontainingwhatis
called a magic number, or an identifying label. For a GIF file the three characters are
always“GIF”andthenextthreerepresenttheversion;forthe1989standardthefirstsix
characters are “GIF89a.” Magic numbers are common in binary files, and are used to
identifythefiletype.Thefilenamesuffixdoesnotalwaystellthetruth.
Followingtheheaderisthelogicalscreendescriptor,whichexplainshowmuchscreen
spacetheimagerequires.Thisissevenbytes:
Canvaswidth 2bytes
Canvasheight 2bytes
Packedbyte 1byte
Asetofflagsandsmallvalues
Bit 8 765 4 321
Global color sort sizeof
Color resolution flag globalcolor
Table? table
Backgroundcolorindex 1byte
Pixelaspectratio 1byte
Thisis followed by the global colortable, other descriptors,and the imagedata. The
details can be found in manuals and online. The information in the first few bytes is
critical,though,andtheknowledgethatLZWcompressionisusedmeansthatthepixels
arenotimmediatelyavailable.Decompressionisdonetotheimageasawhole.
fromstructimport*
f=open(“test.gif”,“rb”)
s=f.read(13)#Readtheheader
id,ht,wd,flags,bci,par=unpack('6shhBBB',s)
#6shhBBB
f.close()
id=id.decode(“utf-8”)
print(id)
print(“Height”,ht,“Width”,wd)
print(“Flags:”,flags)
print(“Backgroundcolorindex:“,bci)
print(“Pixelaspectratio:”,par)
8.5.3 JPEG
AJPEGimageusesalossycompressionscheme,andsotheimageisnotthesameafter
compression as it was before compression. For this reason it should never be used for
scientific or forensic purposes when measurements will be made using the image. It
shouldneverbeusedforastronomy,forexample,
althoughitisperfectlyfineforportraitsandlandscapephotographs.
ThenameJPEGisanacronymfortheJointPhotographicExpertsGroup,andactually
refers to the nature of the compression algorithm. The file format is an envelope that
contains the image, and is referred to as JFIF (JPEG File Interchange Format). The file
headercontains20bytes:themagicnumberisthefirst4andbytes6–10.Thefirst4bytes
are hex FF, D8, FF, and E0. Bytes 6–10 should be “JFIF\0,” and this is followed by a
revisionnumber.Ashortprogramthatdecodestheheaderis:
fromstructimport*
f=open(“test.jpg”,“rb”)
s=f.read(20)#Readtheheader
b1,b2,a1,a2,sz,id,v1,v2,unit,xd,yd,xt,yt=unpack('BBBBh5sBBBhhBB',s)
#BBBBh5sBBBhhBB
f.close()
id=id.decode(“utf-8”)
print(id,“revision”,v1,v2)
ifb1==0xffandb2==0xd8:
print(“SOIchecks.”)
else:
print(“SOIfails.”)
ifa1==0xffanda2==0xe0:
print(“Applicationmarkerchecks.”)
else:
print(“Applicationmarkerfails.”)
print(“App0segmentis”,sz,“byteslong.”)
ifunit==0:
print(“Nounitsgiven.”)
elifunit==1:
print(“Unitsaredotsperinch.”)
elifunit==2:
print(“Unitsaredotspercentimeter.”)
ifunit==0:
print(“Aspectratiois“,xd,“:”,yd)
else:
print(“Xdensity:“,xd,”Ydensity:“,yd)
ifxt==0andyt==0:
print(“Nothumbnail”)
else:
print(“Thumbnailimageis“,xt,“x”,yt)
The compression scheme used in JPEG is very involved, but is does cause certain
identifiable artifacts in an image. In particular, pixels near edges and boundaries are
smeared,essentiallyaveragingvaluesacrosssmallregions
(Figure8.1). This can cause problems if a JPEG image is to be edited, for example in
PhotoshoporPaint.
8.5.4TIFF
TheTaggedImageFileFormathasapotentiallyhugeamountofmetadataassociated
withit,andthatisallintextforminthefile.It’safavoriteamongscientistsbecauseof
that:thedeviceusedtocapturetheimage,thefocallengthofthelens,time,subject,and
scoresofotherinformationcanaccompanytheimage.Infact,theTIFFhasbeenseconded
forusewithnumericnon-imagedataaswell.Theotherreasonitispopularisthatiscan
beusedwithuncompressed(raw)data.
ThewordTaggedcomesfromthefactthatinformationisstoredinthefileusingtags,
suchasmightbefoundinanHTMLfile—exceptthatthetagsinaTIFFarenotintext
form.Ataghasfourcomponents:anID(2bytes,whattagisthis?),adatatype(2bytes,
whattypearetheitemsinthistag?),adatacount
(4bytes,howmanyitems?),andabyteoffset(4bytes,wherearetheseitems?).Tagsare
identifiedbynumber,andeachtaghasaspecificmeaning.Tag257meansImageHeight
and256isImageWidth;315isthecodemeaningArtist,306meansDate/Time,and270is
the Image Description. They can be in any order. In fact, the whole file structure is
flexiblebecauseallcomponentsarereferencedusingabyteoffsetintothefile.
ATIFFbeginswithan8-byteImageFileHeader(IFH):
Byteorder:Thisis2bytes,andis“II”ifdataisinlittle-endianformand“MM”ifitis
big-endian.
VersionNumber:Always42.
FirstImageFileDirectoryoffset:4bytes,theoffsetinthefileofthefirst
image.
TheotherimportantpartofaTIFFistheImageFileDirectory(IFD),whichcontains
informationaboutthespecificimage,includingthedescriptivetagsanddata.TheIFHis
always8byteslongandisatthebeginningofthefile.AnIFDcanbealmostanysizeand
canbeanywhereinthefile;therecanbemorethanone,aswell.ThefirstIFDisfoundby
positioningthe file to the offsetfound in the IFH. Subsequentones are indicated in the
IFD.TheIFDstrictureis:
Numberoftags:2bytes
Tags:Arrayoftags,sizeunknown
NextIFDoffset:4bytes.FileoffsetofthenextIFD.Iftherearenomore,
then=0.
Thestructureofatagwasgivenpreviously,soaTIFisnowdefined.Theimagedata
canbe,andfrequentlyis,rawpixels,butcanalsobecompressedinmanywaysasdefined
bythetags.
The program below reads the IFH and the first IFD, dumping the information to the
screen:
#TIFF
fromstructimport*
f=open(“test.tif”,“rb”)
s=f.read(8)#ReadtheIFH
id,ver,off=unpack('2shL',s)
#2shL
id=id.decode(“utf-8”)
print(“TIFFIDis“,id,end=””)
ifid==“II”:
print(“whichmeanslittle-endian.”)
elifid==“mm”:
print(“whichmeansbig-endian”)
else:
print(“whichmeansthisisnotaTIFF.”)
print(“Version”,ver)
print(“Offset”,off)
f.seek(off)#GetthefirstIFD
n=0
b=f.read(2)#Numberoftags
n=b[0]+b[1]*256
#n=int(s.decode(“utf-8”))
foriinrange(0,n):
s=f.read(12)#Readatag
id,dt,dc,do=unpack(“hhLL”,s)
print(“Tag“,id,“type”,dt,“count”,dc,“Offset”,do)
f.close()
Whenthisprogramexecutes using “test.tif”as the input file, the firsttwo tags in the
IFDare256and257(widthandheight)whicharecorrect.
8.5.5 PNG
A PNG (Portable Network Graphics) file consists of a magic number, which in this
context is called a signature and consists of 8 bytes, and a collection of chunks,which
resembleTIFFtags.Thereare18differentkindsofchunk,thefirstofwhichisanimage
header.TheSignatureisalways:13780787113102610.Thebytes807871arethe
letters“PNG.”
Achunk haseither3or 4fields:alength field, achunktype,an optionalchunkdata
field, and a check code based on all previous bytes in the chunk that is used to detect
errors(calledacyclicredundancycheck,orCRC).
Theimageheaderchink(IHDR)hasthefollowingstructure:
Imagewidth: 4bytes
Imageheight: 4bytes
Bitdepth: 1byte.Numberofbitspersample(1,2,4,8,or16).
Colortype: 1byte.0(grey),2(RGB),3(colormap),4(greyscalewithtransparency)
or6(RGBwithtransparency)
Compression
method:
1byte.Always0.
Filtermethod: 1byte.Always0.
Interlace
method:
1byte.0=nointerlace.1=Adam7interlace
(See:references)
Thisfilehascompression,butitisnon-lossy.Italso,likeGIF,allowstransparency,but
allows full RGB color. It does not have an option for animations, though. Reading the
signatureandthefirst(IHDR)chunkisdoneinthefollowingway:
#PNG
fromstructimport*
b2=(137,80,78,71,13,10,26,10)#Correctheader
types=(“Grey”,””,“RGB”,“Colormap”,
“Greywithalpha”,””,“RGBA”)#Colortypes
f=open(“test.png”,“rb”)
s=f.read(8)#Readtheheader
b1=unpack('8B',s)
ifb1==b2:
print(“HeaderOK”)
else:
print(“Badheader”)
s=f.read(8)#ThenextchunkmustbetheIHDR
length,type=unpack(“>I4s”,s)#Unpackthefirst
8bytesprint(“Firstchunk:Lengthis”,length,“Type:”,type)
s=f.read(length)#Weknowthelength,readthechunk
wd,ht,dep,ctype,compress,filter,interlace=unpack(“>ii5B”,s)
#IIBBBBB
print(“PNGImagewidth=”,wd,“Height=”,ht)
print(“Imagehas“,dep,“bytespersample.”)
print(“Colortypeis“,types[ctype])
ifcompress==0:
print(“CompressionOK”)
else:
print(“Compressionshouldbe0butis”,compress)
iffilter==0:
print(“FilterisOK”)
else:
print(“Filtershouldbe0butis”,filter)
ifinterlace==0:
print(“Nointerlace”)
elifinterlace==1:
print(“Adam7interlace”)
else:
print(“Badinterlacespecified:“,interlace)
f.close()
8.5.6 SoundFiles
Asoundfilecanbealotmorecomplexthananimagefile,andsubstantiallylarger.To
properlyplaybackasound,itiscriticaltoknowhowitwassampled:howmanybitsper
sample,howmanychannels,howmanysamplespersecond,compressionschemes,andso
on.Thefilemustbereadableinrealtimeorthesoundcan’tbeplayedwithoutaseparate
decoding step. All that is really needed to display an image is its size pixel format and
compression.
There are, once again, many existing audio file formats. MP3 is quite complex, too
muchsotodiscusshere.TheusualoptiononaPCwouldbe“.wav”and,asithappens,
thatformatisnotespeciallycomplicated.
WAV
AWAVfilehasthreeparts:theinitialheader,usedtoidentifythefiletype;theformat
sub-chunk, which specifies the parameters of the sound file; and the data sub-chunk,
whichholdsthesounddata.
The initial header should contain the string “RIFF” followed by the size of the file
minus8bytes(i.e.,thesizefromthispointforward),andthestring“WAVE.”Thisis12
bytesinsize.
Thenext“sub-chunk”hasthefollowingform:
ID: =“fmt”
Size1: Sizeoftherestofthesub-chunk
Format: 1ifPCM,anothernumberifcompressed
No.ofChannels: mono=1,stereo=2,etc.
Samplerate: Soundsamplespersecond.CDrateis44100
Alignment: ShouldbeNo.ofchannels*samplerate*bitsper
sample/8
Bitspersample: AKAquantization.Bitsineachsample:8,12are
usual.
Thefinalsectioncontainsthe
following:
ID: =“data”
Size: Numberofbytesinthedata
Data: Theactualsounddata,asalargeblockofSizebytes.
Aprogramthatreadsthefirsttwosub-chunksis:
#WAV
fromstructimport*
f=open(“test.wav”,“rb”)
s=f.read(12)
riff,sz,fmt=unpack(“4si4s”,s)
riff=riff.decode(“utf-8”)
fmt=fmt.decode(“utf-8”)
print(riff,sz,“bytes“,fmt)
s=f.read(24)
id,sz1,fmt,nchan,rate,bytes,algn,bps=unpack
(“4sihhiihh”,s)
#4sihhiihh
id=id.decode(“utf-8)”)
print(“IDis”,id,“Channels“,nchan,“Samplerateis“,rate)
print(“Bitspersampleis“,bps)
iffmt==1:
print(“FileisPCM”)
else:
print(“Fileiscompressed“,fmt)
print(“Byteratewas“,bytes,“shouldbe“,rate*nchan*bps/8)
8.5.7 OtherFiles
Every type of file has a specific purpose and a format that is appropriate for that
purpose.Forthatreasonthenatureoftheheadersandthefilecontentsdiffer,butthefact
thattheheadersandotherspecificfieldsexistshouldbynowmakesomesense.Whena
programisaskedtoopenafilethereshouldbesomewaytoconfirmthatthecontentsof
the file can be read by the program. The code that has been presented so far is only
sufficienttodeterminethefiletypeandsomeofitsbasicparameters.Thecodeneededto
readanddisplayaGIF,forexample,wouldlikelybeover1000lineslong.Itisimportant,
forsomeonewhowishestobeaprogrammer,toseehowtoconstructafilesothatitcan
be used effectively by others and so that other programmers can create code that can
identifythatfileanduseit.
With that in mind, some other file types will be described briefly and considered as
examplesofhowtoorganizedataintoafile.
HTML
AnHTML(HyperTextMarkupLanguage) file is one that is recognizedby a browser
and can be displayed as a web page. It is a text file, and can be edited, saved, and
redisplayedusingsimpletools;thefancywebeditorsareuseful,butnotnecessary.
ThefirstlineoftextinanHTMLfileshouldbeeitheravariationon:
<!DOCTYPEhtml>
oravariationon:
<html>
The problem is that these are text files, so spaces and tabs and newlines can appear
without affecting the meaning. Browsers are also supposed to be somewhat forgiving
abouterrors,displayingthepageifatallpossible.Asimpleexamplethatshowssomeof
theproblemswhilebeinglargelycorrectis:
importwebbrowser
f=open(“other.html”)
html=False
whileTrue:#Lookatmanylines
s=f.readline()#Read
s=s.strip()#Removewhitespace
#(blanks,tabs)
s=s.lower()#Converttolowercasefor
#compare
k=(s.find(“doctype”))#doctypefound?
ifk>0:#Yes
kk=s.find(“html”)#Lookalsofor'html'
ifkk>=k+7:#Foundit,afterDOCTYPE
html=True#Closeenough
break
else:
k=s.find(“html”)#No'doctype'.'html'?
ifk>0ands[k-1]==“<”:#Yes.Precededby'<'?
html=True#Yes,Closeenough.
break
iflen(s)>0:#isthestringnon-blank?
html=False#Yes.SoitisnotHTML
#probably
break
ifhtml:
webbrowser.open_new_tab('other.html')
else:
print(“ThisisnotanHTMLfile.”)
Thisprogram usesthe webbrowsermodule of Pythonto display theweb page ifit is
one.Thecallwebbrowser.open_new_tab('other.html')opensthepageinanewtab,ifthe
browserisopen.Thismoduleisnotabrowseritself.Itsimplyopensanexistinginstalled
browsertodotheworkofdisplayingthepage.
EXE
ThisisaMicrosoftexecutablefile.Thedetailsoftheformatareinvolved,andrequirea
knowledge of computers and formats beyond a first-year level, but detecting one is
relativelysimple.ThefirsttwobytesthatidentifyanEXEfileare:
Byte0:0x4D
Byte1:0x5a
Itisalwayspossiblethatthefirsttwobytesofafilewillbethesetwoby
accident,butitisunlikely.Ifthefilebeingexaminedis,infact,anEXEfile,thenaPython
programcanexecuteit.Thisusestheoperatingsysteminterfacemoduleos:
importos
os.system(“program.exe”)
8.6 SUMMARY
AfairdefinitionofComputerSciencewouldbethedisciplinethatconcernsitselfwith
information.Computerscanonlyoperateonnumbers,soanimportantaspectofusingdata
istherepresentationofcomplexthingsasnumbers.Mostdataconsistofmeasurementsof
something,andassucharefundamentally
numeric.
A dictionary allows a more complex indexing scheme: it is accessed by content. A
dictionarycanbeindexedbyastringortuple,whichingeneralwouldbereferredtoasa
key,andtheinformationatthatlocationinthedictionaryissaidtobeassociatedwiththat
key.
A Python array is a class that mimics the array type of other languages and offers
efficiencyinstorage,exchangingthatforflexibility.Thestructmodulepermitsvariables
andobjectsofvarioustypestobeconvertedintowhatamountstoasequenceofbytes.It
hasapack()andanunpack()methodforconvertingPythonvariablesintosequencesof
bytes.
The string format() method allows a programmer to specify how values should be
placedwithinastring.Theideaistocreateastringthatcontainstheformattedoutput,and
thenprintthestring.
Pythondatacanbewrittentofilesinraw,binaryform.Itisalsopossibletopositionthe
fileatanybyteinabinaryfile,allowingthefiletobereadorwrittenatanylocation.
Exercises
1.Asktheuserforafilename.Extractthesuffixanduseittolookupthetypeofthefile
inadictionaryandprintashortdescriptionofit.Recognizedtypesincludeimagefiles
(jpg,gif,tiff,png),soundfiles(wav),andothers(dll,exe).
2.ModifytheLatintranslationprogramsothatitaskstheuserforatranslationofany
worditcannotfindandaddsthatwordtothedictionary.
3.Writeaprogramthatreadskeysandvalues(strings)fromtheconsoleandcreatesa
dictionaryfromthosedata.Whentheusertypestheword“done”thentheinputis
complete.Citynamesaregoodexamplesofvalues,andcouldrepresentthecitywhere
thepersonnamedinthekeylives.
4.ModifytheanswertoExercise3sothatafterthedataentryiscomplete,theusercan
enteravalueandtheprogramwillprintallofthekeysassociatedwiththatvalue.
5.Givenadictionary,writeafunctionwritedict()thatwritesthatdictionarytoafile,and
anotherfunctionreaddict()thatwillreadthatfileandrecreatethedictionary.For
simplicityassumethatthekeysaresimplenumbersorstrings.
6.ThePNMfileformatforimageshasthreetypesofimageintwoforms:monochrome,
grey,andcolor,savedastextorinbinaryform.Abinarygreylevelimageiscalleda
PGM(PixelGreyMap)andhasashortheaderfollowedbypixels.Theheaderistext
andconsistsoftheidentifyingcode“P5”followedbythewidthoftheimageinpixels
(NC),followedbytheheight(NR),followedbythemaximumvalueforagreylevel
(NGL)followedbyanendofline.NowthedatafollowsasrowsofNCbytes:
P5
ncnrngl
<imagepixels,1byteeach>
Writeaprogramthatwillreadanimagefileinthisformatanddisplayitonthescreen
asanimageusingGlib.
http://netpbm.sourceforge.net/doc/pgm.html
7.Assumethatthefollowingvariablesexistandhavetheobviousmeanings:year,
month,day,hour,minute,second.Allareintegerexceptsecond,whichisafloat.
TheISO8601standardfordisplayingdatesusestheformat:
YYYY-MM-DDThh:mm:ss.s
wheretheletter“T”endsthedateportionandbeginsthetime.Anexamplewouldbe
2015-12-27T10:38:12.3
Writeafunctionthattakesthegivenvariablesasparametersandprintsthedateinthis
format.
http://www.w3.org/TR/NOTE-datetime
8.WriteaPythonprogramthatopensafilenamedbytheuserfromthekeyboard;the
filehasthesuffix“.jpg,”“.gif,”or“.png.”Determinewhetherthefilecontentsagree
withthesuffix,andprintamessageindicatingtheresult.
9.Aconcordanceisalistofwordsfoundinatext.Buildaconcordanceusinga
dictionarythatkeepstrackofthenumberoftimesthatawordisused,inadditiontoits
merepresence.Printtheresultinglistinalphabeticalorder.
10.Writeaprogramthatprintsoutchecks.Thedateandthepayeeareenteredasstrings
andtheamountisenteredasafloatingpointnumber,themaximumamountbeing
1000dollars.TheFORfieldisalways“Books.”Theprogramformatsthecheck
accordingtothefollowingimage(Figure8.2),where:
Thedateisline3startingatcharacter58
Thepaytofieldisline6startingatcharacter20
Thenumericamountisline6startingatcharacter57
Thetextamountisline8character10
TheFORfieldisline12character15
NotesandOtherResources
Listoffreedatasetstodownload:https://r-dir.com/reference/datasets.html
NASAMeteoriteLandingDatabase:https://data.nasa.gov/view/ak9y-cwf9
String Formatting: https://infohost.nmt.edu/tcc/help/pubs/python/web/format-
spec.html
TheArraytype:https://docs.python.org/3/library/array.html
Image file formats:
https://www.library.cornell.edu/preservation/tutorial/presentation/table7-1.html
http://www.scantips.com/basics09.html
HomepageforMPEG:http://mpeg.chiariglione.org/
GIF1989specification:http://www.w3.org/Graphics/GIF/spec-gif89a.txt
Byte by byte GIF:
http://www.matthewflickinger.com/lab/whatsinagif/bits_and_bytes.asp
TIFFDescription:http://www.fileformat.info/format/tiff/egff.htm#TIFF.FO
PNGSpecification:http://www.w3.org/TR/PNG/
Adam7Interlacing:http://www.libpng.org/pub/png/pngpics.html
EXEfileformat:http://www.delorie.com/djgpp/doc/exe/
FileSignatures:http://www.garykessler.net/library/file_sigs.html
SamplePGMImages:http://people.sc.fsu.edu/~jburkardt/data/pgmb/pgmb.html
1.GunterBorn.(1995).TheFileFormatsHandbook,CengageLearningEMEA,
ISBN-13:978-1850321170.
2.DavidKay.(1994).GraphicsFileFormats,Windcrest,ISBN-13:978-0070340251.
3.Dr.CharlesR.Severance.(2013).PythonforInformatics:ExploringInformation,
CreateSpaceIndependentPublishingPlatform,ISBN-13:978-1492339243.
4.AlanTharp.(1988).FileOrganizationandProcessing,1stedition,JohnWiley&
Sons,ISBN-13:978-0471605218.
5.JohnWatkinson.(2001).MPEGHandbook,FocalPress,ISBN-13:978-0240516561.
CHAPTER9
MULTIMEDIA
9.1MouseInteraction
9.2TheKeyboard
9.3Animation
9.4RGBAColors–Transparency
9.5Sound
9.6Video
9.7Summary
Inthischapter
Foragreatmanypeoplecomputershavebecometheplatformofchoiceforthedeliveryof
entertainment,education,andinformation.Partofthereasonforthisistheubiquityand
speedoftheInternet,butthemainreasonisthatcomputerscandelivermediainalmost
any form: text and images, sound, video, animation, and mixtures of all of these. If
someonehassomethingtosay,thecomputercanpresentittotheworldinfullcolorand
5.1 channel sound. Moreover, the availability of free and inexpensive tools for content
creationallowsalmostanyonetobeamusicproducerorfilmdirector.
Pythoncanbeusedtoprocessanddisplaymostformsofmediathroughpackagesthat
can be downloaded and installed. There are many of these and multiple versions in a
bewilderingarrayofcombinations.ItisnotpossibletodiscussallofthewaysthatPython
canbeusedtodomultimediaandallofthepackagesandlibrariesthathelpprogrammers
implementthesethings.Afacilityhasalreadybeendescribedfordisplayingimagesand
graphics:Glib.WhynotsimplyaddmoremediacapabilitytoGlib,thusbuildingonwhat
has already been discussed? At the same time, of course, yet another module is being
addedtotheglobalmixture.
This extended version of Glib is built using an easily available module that must be
installed first—that module is Pygame. For the new version of Glib to work properly,
Pygamemustbeinstalledonthehostcomputerfirst.Thisshouldnotbedifficult,butthe
processvariesdependingontheoperatingsystemandthenatureofthecomputer(e.g.,32
bitor64bit)sotheprocesswillnotbedescribedhere.TheResourcessectionattheendof
thechapterprovideslinksand
referencesthatwillbehelpful.
ItisessentialtoinstallaversionofPygamethatworkswithPython3.
TherearetwoversionsofGlib.TheversionthatwasusedinChapter7usestkinterasa
basisandshouldnotrequireanythingextratobeinstalled.ItisreferredtoasStaticGlib,
whereasthenew,extendedversionisDynamicGlib.DynamicGlibisupwardscompatible
from Static Glib in that all programs that run using Static Glib should also run using
DynamicGlib,andproduceaverysimilaroutput.Therearesomedifferencesbetweenthe
two,such as font styles and such.There are also new programming idioms that will be
neededwhenusingDynamicGlibonaccountofthedynamicnatureofsomeofthemedia
forms.Infact,onewaytolookatitistothinkofStaticGlibandDynamicGlibasbeing
twodifferentoperatingmodesofthesamelibrary.DynamicGlibisusedindynamicmode,
whereinteractionwiththeusercanoccur,thegraphicsscreencanchange,andsoundcan
beplayed.
Dynamic mode will be explained using the example of mouse positions and button
clicks, which represent a form of dynamic interaction that most people would have
experienced.Followingthat,animation,video,andsoundcanbe
discussedandcombinedintointerestingprojects.
9.1 MOUSEINTERACTION
Using mouse position and button presses is a basic form of communication with a
computer.Theuseofthemousepositiontoactivatesomevisualdeviceonthescreenlike
a button is familiar to everyone who uses a computer, although it is being gradually
replacedbytouchscreens.Theideaisthatwhentheusermovesthemouse,acursoror
indicator moves correspondingly. The position of this cursor indicates a point on the
screen that is active in some way, and if a graphical device is there then it can be
manipulatedusingthemousebuttons.Theproblemisthatamousebuttonpresscanoccur
atanytime; itisunpredictable.This iswhat programmers callan event: something that
happensat an unpredictable moment that must be dealt with. Somesoftware someplace
mustbewatchingthemouseatalltimes,determiningthex,ycoordinatesofthecursoron
thescreenanddrawingthecursorinthecorrectplace.
TheGlibmodulekeepstrackofthemouseusingPygameandcontinuallyupdatesthe
position,whichcanbeaccessedusingfunctions:mouseX()andmouseY(),whichreturn
themost recent xand y coordinates.However, if auser’sPythonprogram is executing,
howcantheGlibsystemalsorunandupdatethemouseposition?Itcannot.So,adynamic
modeprogramgivesupcontrolandletsGlibcontrolmostofthework.
AdynamicGlibprogramconsistsoftwoparts:aninitializationpartandadrawingpart.
Initializationtakesplaceonlyonce,whentheprogramstartsexecuting;itcantakeplacein
afunctioncalledinitialize(),orit can beinthemain program.Glibwill calla function
namedinitialize()once,ifitexists.
Thepart of the computer thatdraws willbe codedas afunction nameddraw(). Glib
willcallthisfunctionmanytimeseachsecond,andtheprogrammerisexpectedtoredraw
the graphics window each time. This scheme allows the mouse position to update very
frequentlyandallowstheprogrammertoaccessthemostrecentmousecoordinatesfrom
within the draw() function, which can take the place of the main program. A simple
example of the dynamic mode and use of the mouse is a program that moves a circle
around the screen. The scheme described here is something that will be familiar to
programmersoftheProcessinglanguage.
Example:DrawaCircleattheMouseCursor
UsingGlibrequirestheuseofthefunctionstartdraw()toinitializethings,inparticular
toestablishthesizeofthedrawingwindow.InDynamicGlibthecalltostartdraw()will
also call the user’s initialize() function if one exists. Other code can appear between
startdraw()andenddraw(),usuallycalculations
andinitializations.Onceenddraw()iscalledcontrolpassestotheuser’sdraw()function.
Itwillbecalled30timespersecondbydefault,althoughthisratecanbechanged.
Drawing a circle at the current mouse position involves repeatedly determining the
mousepositionandthendrawingacircleatthatsetofcoordinates.Thisshouldbedone
withindraw().Initializationinthisprogramistrivialsono
initialize()functionisneeded.
ImportGlib
defdraw():
Glib.background(200)
Glib.fill(255,0,0)
Glib.ellipse(Glib.mouseX(),Glib.mouseY(),30,30)
Glib.startdraw(400,400)
Glib.enddraw()
The result is a red circle that follows the mouse! The draw() function sets the
backgroundcolortoagreylevelof255,thensetsthefillcolortored,thendrawsthecircle
(ellipse).Itdoesthis30timesper second,everytimedraw()iscalled.Themostrecent
mousepositionisalwaysfoundusingthefunctionsGlib.mouseX()andGlib.mouseY().It
seemslikeitshouldnotbenecessarytosetthefillcoloreachtime,sothatcanbeputin
themainprogrambetweenstartdraw()andenddraw()orintheinitialize()function:
Inmain: Ininitialize():
importGlib
importGlib
defdraw():
Glib.background(200)
Glib.ellipse (Glib.mouseX(),
Glib.mouseY(),30,30)
Glib.startdraw(400,400)
Glib.fill(255,0,0)
Glib.enddraw()
definitialize():
Glib.fill(255,0,0)
defdraw():
Glib.background(200)
Glib.ellipse (Glib.mouseX,
Glib.mouseY,30,30)
Glib.startdraw(400,400)
Glib.enddraw()
Isitnecessary tocallthe background()function eachtimedrawis called?Yes. This
functionnotonlysetsthebackgroundcolorbutfillsthescreenwithit,thuserasingwhat
hasbeendrawnsofar.Unlessacalltobackground()occursindraw(),allofthecircles
drawntothatpointwillbevisible(Figure9.1).
Ifthestatement:
fromGlibimport*
appearsatthebeginningoftheprogramthentheGlibfunctionswon’thavetobeprefixed
withthenameGlib.VariablesthatbelongtoGlibstilldo.Theimport*allowsallofthe
namesfromGlib to be used in the program, but is not always recommended because it
complicates the collection of names to be remembered. Still, for the purposes of this
chapter,itwillbeused.
Example:ChangeBackgroundColorUsingtheMouse
Theideahereistochangethebackgroundcolorbasedonthemouseposition.Thereare
only two directions to move, horizontally or vertically, so one of the three colors will
remainconstant;letthatcolorbeblue.Thehorizontalmousepositionwillcontrolthered
value,with the leftmost positionrepresenting no red and the rightmostrepresenting full
red(255).Similarly,themousebeingatthebottomoftheimagerepresentsnogreen,and
at the top it represents full green. The background color will be changed in draw()
accordingly.
GiventhatthepositionofthemouseonthescreenisgivenbymouseX(),thevalueof
theredcoordinatewillbe(mouseX()/width*255).Itmayrequireachangeinxcoordinate
ofmultiplepixelstoshiftthecolorbyoneunit.Asimilarexpressionisusedtochangethe
greenvalue.
Theprogramis:
fromGlibimport*
importGlib
defdraw():
r=int((Glib.mouseX/width)*255.0)
g=int((Glib.mouseY/height)*255.0)
background(r,g,128)
startdraw(400,400)
enddraw()
9.1.1 MouseButtons
Mouse button clicks, as they are called, can be retrieved by writing a function that
handles them. Each time a mouse button is pressed, Glibtries to calla function named
mousePressed().Ifthereisnosuchfunction,that’sOK,andnothingelsehappens.Ifthe
userwritesafunctionnamedmousePressed(),thenitwillbeexecuted.Similarly,whena
mousebuttonisreleased,ittriestocallmouseReleased().Ifthemousebuttonispressed,
then the mouse is moved, and then it is released, the coordinates of the press and the
releasepointwillbedifferent,andbothcanberetrieved.Forexample,when the mouse
buttonis pressed,the mouse coordinatescould be savedas the beginningof a line,and
whenreleasedthecoordinatescouldbetheendoftheline.Multiplelinescouldbedrawn
inthisway.
Example:DrawLinesUsingtheMouse
Usingtheschemedescribedabove,thefunctionmousePressed()willstorethemouse
position in global variables x0 and y0, and mouseReleased() will store the release
coordinatesatx1andy1.mouseReleased() will also draw the line from (x0,y0)to (x1,
y1):
fromGlibimport*
importGlib
x0=x1=y1=y0=0
defmousePressed(b):
globalx0,y0
x0=Glib.mouseX()
y0=Glib.mouseY()
defmouseReleased(b):
globalx1,y1
x1=Glib.mouseX()
y1=Glib.mouseY()
line(x0,y0,x1,y1)
startdraw(400,400)
enddraw()
Notethatthereisnodraw()functionandnoinitialize()function.Drawingisperformed
insideofmouseReleased(),andnoinitializationisneeded.Thisisararesituation.These
functionsacceptoneparameter,whichisthenumberofthebuttonthatwaspressed:leftis
0, middle is 1, and right is 2. Note that all buttons could be pressed before any are
released.Inthisexampleitdoesnotmatterwhatbuttonispressed;theresultisthesame.
These functions are traditionally called callback functions. The occurrence of some
eventcausesthefunctiontobecalled.
Example:AButton
This example will change the background color of the drawing window when a
graphicalbuttonispressed.Abutton,intheuserinterfacesense,isarectangularregionon
thecomputerscreenthatrespondstoamouseclickwithaspecificaction.Itisatwo-part
process: when the mouse cursor enters the rectangular region, the button is said to be
activated.Sometimesitwillbecausedtochangecoloratthispoint,orsomeotheraction
will be performed that indicates that it is ready to function. When a mouse button is
pressed while the button is activated, then some action occurs, usually as defined by a
function being called. The basic idea is simple enough to implement, although some
buttonscanhavecomplexactionssuchassounds,images,andirregularshapes.
Thecursoriswithinarectangularregionwhenitscoordinatesaregreaterthantheupper
left coordinate of the rectangle and smaller than the lower right coordinates. When that
occursthebuttonisreadytobepressed,andshouldchangecolor.Thisdoesnotrequire
anythingbutknowledgeofthemousecoordinates.Ittheleftbuttonispressedinthisstate
(activated), then the action defined by the button will occur; the background color will
change,inthiscase.Theprogrambeginsasnormal,withimportsandinitialization.Here
isaprogramthatdoesthisforabuttonat(100,100)thatis60x20pixelsinsize:
fromGlibimport*
fromrandomimport*
x0=100#upperleftbuttonposition
y0=100
w=60#Buttonsize
h=20
bc=cvtColor(200)#Initialcolor
active=False#Isthebuttoncurrentlyactive?
defdraw():
globalbc,w,h,x0,x1,y0,y1,active
background(red(bc),green(bc),blue(bc))
#Setbackgroundcolortobc
x=mouseX()#Isthemouseintherectangle?
y=mouseY()
ifx>x0andx<x0+wandy>y0andy<y0+h:
fill(50,200,50)#YES.Buttonisactive.Green
active=True
else:
fill(200,50,50)#NO.Buttonisinactive.Red
active=False
rect(x0,y0,w,h)#Drawthebutton
defmouseReleased(b):
globalactive,bc
ifactiveandb==0:#Buttonactive?Leftbutton
#released?
#Ifsogeneratearandom
#backgroundcolor.
bc=(randrange(100,200),randrange(100,200),randrange(100,200))
startdraw(400,400)
enddraw()
Allofthesoftwarebuttonseverywhereworkinbasicallythisway.
9.2 THEKEYBOARD
Like mouse motions and button presses, pressing a key on the keyboard is an event.
Likebuttonpresses,akeypressisasingleeventwithmultipleoptions.Thefactthatakey
hasbeenpressedisanevent,andexactlywhichkeyitwasisa
detail,justasitwaswhenamousebuttonwaspressed.Itisimportanttounderstandthat
using a function such as input() will not be successful when trying to read from the
keyboardwithanevent-drivensystem,althoughknowingabouteventscanbevaluablein
understanding how input() could be implemented. When input() is called it does not
returnuntilalinehasbeenread;keyPressed()capturesthekeypressevent.Itappearsthat
acalltoinput()mayinvolvemanykeypressevents.Whatsoftwarereceivesthem?That
istheimportantquestion.Thesituationisreallytooconfusingtoberesolvedsensibly,so
theruleis:neveruseinput()andrelatedfunctionswhenhandlingkeypresses.ItisOKto
callprint()becauseitisprintingtoaconsoledeviceforwhichnoconflictexists.
Everykeypresswilleventuallycorrespondtoakeyrelease,sotherearetwocallback
functionsagain:
keyPressed(k): Calledwhenakeyispressed.Parameterkisthekeythatwaspressed.
keyReleased(k): Calledwhenakeyisreleased.Parameterkisthekeythatwasreleased.
Theparametersarenotcharactersinthenormalsense,butarenumericcodesthatcan
identify the character. These are based on the Pygame character constants, but extends
them slightly. Table 9.1 gives a list of all of the constants provided by Glib. As an
example,ifaprogrammustrecognizewhentheuparrowkeyispressed,thekeyPressed
()functionthatwoulddothisis:
defkeyPressed(k):
ifk==K_UP:
print(“Uparrowkeypressed.”)
Inanevent-drivenprogramitisunusualforkeypressestobeconvertedintostrings,as
theynormallywouldbeinatypicalconsole-styleprogram.That’sbecauseitisexpected
thattheinterfacetotheevent-drivenprogramwillbethroughmousegesturesandusing
singlekeycommandsfromthekeyboard,like“up
arrow”meaning“moveforward.”
Example:Pressinga“+”CreatesaRandomCircle
Thisprogramwilldrawacircleatarandomlocationwhenthe“+”keyispressed.Old
circles will remain. This illustrates the use of the keyboard in an obvious way. The
initialization is to clear the screen and set the background color and fill color. The
keyPressed()functiongeneratesrandomx,ycoordinatesanddrawsacirclethere:
fromGlibimport*
fromrandomimport*
defkeyPressed(k):
ifk==K_PLUS:
ellipse(randrange(0,width),randrange(0,height),30,30)
startdraw(400,400)
fill(200,0,0)
background(200)
enddraw()
Thekeyvalueispassedasaninteger,butthebuilt-infunctionchr()willconvertthatto
thepropercharacterinmanycases.WhatispossiblytheshortestfunctionalGlibprogram
simplyreadsthekeysandprintsthemontheconsole:
fromGlibimport*
defkeyPressed(k):
print(chr(k))
startdraw()
enddraw()
Thestartdraw()functionhasdefaultparameters,andcreatesa50x50pixelwindowif
nosizeisspecified.Thisprogramdoesnotdrawanything,sodraw()isnotneeded.
Example:ReadingaCharacterString
Therearesomereasonswhyanevent-drivenprogrammightwishtoreaddatafromthe
user as a string. Perhaps a name is required, or a key value to access a database, or a
password.Whateverthereason,itshouldbepossibletoreadastringusingkeyPressed().
The way it would normally be done is to read one character at a time, normal for
keyPressed(),andconstructastringbyconcatenation.That’showthisprogramworks:
fromGlibimport*
s=””
t=””
defkeyPressed(k):#kisthevalueofthekeythat
#waspressed
globals,t
ifk==K_RETURN:#TypingRETURNendsthestring
#construction
t=s
s=””
return
ifk==K_BACKSPACEandlen(s)>0:#Deletethe
#previouscharacter
s=s[0:len(s)-1]#Shortenthestringbyone
#character
else:
s=s+chr(k)#Appendthenewcharacterto
#thestring
defdraw():
globals,t
background(200)
text(“Enterastring:“,10,100)
text(s,20,130)
if(t!=””):
text(“Completedstringis“+t,20,150)
startdraw(200,200)
enddraw()
Theglobalvariablesholdsthestringbeingbuilt,andthestringtholdsthefinalstring.
Characters are captured from the keyboard by keyPressed() and fall into one of three
categories:
1.Mostcharactersareaddedtotheglobalstringsthroughconcatenation.The
characterpassedtokeyPressed()isaninteger.Thechr()functionconvertsittoa
characterwhichisaddedtotheendofs.
2.ABACKSPACEwilldeletethelastcharactertypedfromthestring.Thisisdone
usingasubstringfrom0tothesecondlastcharacter.
3.ARETURNwillendthestring.Thecurrentstringinswillbeassignedtot,ands
willberesettoanemptystring.
Thiskindofstringdataentryisespeciallyusefulwhenenteringfilenamesandnumeric
parameters. There are frequently special interface objects (widgets) that perform these
tasks,suchastextboxes.Glibcouldbeusedtoimplementsuchawidget(see:Exercise6).
9.3 ANIMATION
Makinggraphicalobjectschangepositionissimple,butmakingthemseemtomoveis
more difficult. Animation is something of an optical illusion; images are drawn in
succession,andsoquicklythatthehumaneyecan’tdetectthattheyaredistinctimages.
Smallchangesinpositioninasequenceoftheseimageswillbeseenasmotionratherthan
asasetofstillpictures.Atypicalanimationdrawsanewimage(frame)between24and
30timespersecondtomaketheillusionwork.
TherearetwokindsofanimationthatcanbedoneusingGlib.Thefirstinvolvesobjects
thatconsistofprimitivesthataredrawnbythelibrary.Acirclecanrepresentaball,for
instance,orasetofrectanglesandcurvescouldbeacar.Thesecondkindofanimation
usesimages,whereeachimageisoneframeinthesequence.Theseimagesaredisplayed
entirelyinrapidsuccessiontocreatetheanimation.Inthefirstcasetheanimationisbeing
createdastheprogram
executes,whereasinthesecondtheanimationiscompletebeforetheprogramruns,and
theprogramreallyjustputsitonthescreen.
9.3.1 ObjectAnimation
Animatinganobjectinvolvesupdatingitsposition,speed,andorientationatsmalltime
intervals,soalloftheseaspectsoftheobjectmustbekeptinvariables.Iftherearemany
objects being animated, then all of these variables must exist for each object, and are
updated at the end of each time interval. If the animation is displaying 30 frames per
second, then a new frame is drawn every 0.03 seconds. In Glib the function named
framerate() can be called passing the number of times that draw() will be called each
second,andthendraw()candotheworkneededtoupdatetheobjectsanddrawtheframe.
Example:ABallinaBox
Imagineaballbouncinginasquarebox.Aboxhasthreedimensions,ofcourse,butfor
thisexampleitwillberestrictedtotwo,soitwilllooklikeacirclewithinasquare.The
ball is moving, and when it strikes one of the sides of the square it will bounce, thus
changingdirection.Thereisonemovingobject:theball.Graphicallyitissimplyacircle,
withpositionx,yandspeeddxinthexdirectionanddyintheydirection.Itwillhavesize
30pixels.Theboxwillsimplybethewindowthecircleisdrawnin.
Duringeachframetheballwillmovedxpixelsinthexanddypixelsintheydirection,
sowithinthedraw()functionthepositionisupdatedas:
x=x+dx
y=y+dy
Thisnewpositioniswheretodrawthecircle.However,iftheballisoutsideofthebox
afteritismoved,thenabouncehastobeperformed.Thatis,ifthenewpositionofxis,
for instance, less than 0, then it would have struck the left side of the square and then
changed x direction (bounced). In this case, and also if x>width, the bounce is
implementedby:
dx=-dx
Similarly,iftheycoordinateoftheballbecomeslessthan0orgreaterthantheheight,
thenitbouncesvertically:
dy=-dy
Thiswouldallbetrueiftheballwereverytiny,asinglepoint,butithasasizeof30
pixels,andthecoordinatesofthecirclearethecoordinatesofitscenter.Thismeansthat
themethoddescribedabovewillbouncethecircleonlyafterthecentercoordinatepasses
theboundary,meaningthathalfofthecircleisalreadyontheotherside.It’seasytofix:
the ball is 30 pixels in size, so it should bounce when it gets within 15 pixels of any
boundary. For example, the x bounce should occur when x<=15 or x>=width-15. The
entiresolutionis:
#Bouncingballanimation.
fromGlibimport*
defdraw():
globaldx,dy,x,y
background(200)#Erasethepriorframe
x=x+dx#Changeballposition
y=y+dy
ifx<=15orx>=width-15:#BounceinXdirection?
dx=-dx
ify<=15ory>=height-15:#BounceinYdirection?
dy=-dy
ellipse(x,y,30,30)#Drawtheball
startdraw(200,200)
x=100#Initialxpositionofthe
#ball
y=100#Initialyposition
dx=3#Speedinx
dy=2#Speediny
fill(30,200,20)#Fillwithgreen
enddraw()
Eightframesfromthisanimationshowingtheballbouncinginacorneroftheboxare
shown in Figure 9.2. An entire second’s worth of frames (30) are given on the
accompanyingdisc.
Iftherearemanyobjectsthenallofthepositionsandspeeds,andperhapsevenshape,
size,andcolorwouldhavetobekeptandupdatedduringeachframe.Therearetwousual
waystodothis.Inthefirstcasetheparametersarekeptinarrays(lists).Therewouldbe
an array of x coordinates, an array of y coordinates, of speeds, and so on. Each frame
couldinvolveanupdatetoallelementsofthearrays.Updatingthepositioncanbedone
likethis:
foriinrange(0,Nobjects):
x[i]=x[i]+dx[i]
y[i]=y[i]+dy[i]
Theotherusualmethod forhandlingmultipleobjects is to create anobject classthat
containsalloftheparametersneededtodisplaytheobject.Thereisstillanarray,butitis
anarrayofobjectinstances,andifitiscleverlyprogrammedtheclasscanbeupdatedby
callinganupdate()method:
foriinrange(0,Nobjects):
ball[i].update()
Example:ManyBallsinaBox
Thisexampleusesthesamepremiseasthepreviousone,butwilldrawmanyballsin
thewindow,allofthembouncing.Bothmethodsforkeepingtrackofobjects,arrays,and
classeswillbeillustrated.Themanyarrayssolutionhaslistsforxandy,fordxanddy,for
colorandforsize.Allparametersareinitializedatrandomwhentheprogrambegins.
The solution that uses classes defines a class ball within which the position, speed,
color,andsizearedefined.Theconstructorinitializesthevalues,andtheupdatemethod
changestheball’spositionandperformsanyneededbounces.Thetwosolutionsare:
These two solutions illustrate how classes work very neatly. The class contains
individualpropertiesofaballandmanyarecreated;thearrayscontainmanyinstancesof
eachproperty.Sox[i]andball[i].xrepresentthesamething.Inthiscasethetwoprograms
areaboutthesamesize,buttheclass-basedimplementationencapsulatesthedetailsofthe
ballandwhatcanbedonewithit.Theclass-baseddraw()functiononlysays“draweach
ball,”butinthearrayimplementationthedraw()functionlooksatallofthedetailsofall
ballstodrawthem.Oneoftheimplicationsisthatitwouldbepossibletodividethelabor
betweentwopersons,onewhowrotetheclassandanotherwhowrotetherestofthecode.
Forlargeprogramsthiscanmatterquitealot.
9.3.2 FrameAnimation
Thehardworkinframeanimationisdonebeforethecomputerprogramiswritten.An
animator has created drawings of an object in various stages of movement. All the
program does is display frames one after the other, often looping them to create the
desiredeffect.Acommonexampleofthisistheanimationofgait,walkingorrunning.An
artistdrawsmultiplestagesofasinglestep,beingcarefultoensurethattimingiscorrect:
howlongdoesittakeforanormalpersontostakeapairofsteps(left,right)?Thistime
shouldagreewiththeframestheartistcreates.Ifittakesonesecondtomakethestep,then
itshouldbedrawnas30frames.
Other kinds of animation are performed too. A fire can be animated as a very few
frames,as can smoke and water. The programthat draws theanimation readsall of the
image files into a collection. When the animation is played, the program displays one
image after another within the draw function. This can be complicated by the fact that
theremaybemultipleanimationsplayingatthesametime,possiblyofdifferentlengths
andframesizes.
Example: Read Frames and Play Them Back as an
Animation
In this example there are 10 drawn animation frames of a cartoon character walking.
Theseframesareintendedtorepresentasinglegaitcycle,andsoshouldberepeated.The
program will do the following: when the up arrow key is pressed and held down, the
characterdrawninthewindowwill“walk”;otherwiseastillimagewillbedisplayed.
Firsttheimagesshouldbereadinandstoredinanarray(list)sothattheycanbeplayed
repeatedly.ThenthekeyPressed()functionshouldbewrittensothatwhentheuparrow
keyispressedtheframeswillbedrawn.AflagcanbesetTruewhenthekeyispressed,
andFalsewhenthekeyisreleasedsothatdraw()cantellwhentodrawframesandwhen
not.
defkeyPressed(k):
globalkeydown
keydown=True
defkeyReleased(k):
globalkeydown
keydown=False
A list named frames is initialized with all of the images in the sequence. All that
draw()doesisplaythenextone,usingaglobalvariableftoidentifythecurrentframe.
defdraw():
globalkeydown,f
ifkeydown:
image(frames[f],0,0)
f=f+1
if(f>10):
f=1
Itcyclesthroughtheframesandrepeatswhenallhavebeendisplayed.
The initialization can be a simple matter of reading ten images into variables and
creatingalist.Thiscodedoesitinaloop,usinganumberinthenameandincrementingit:
startdraw(320,240)
keydown=False
frames=[]
foriinrange(1,10):
s=“images/a00”+str(i)+”.bmp”
x=loadImage(s)
frames=frames+[x,]
x=loadImage(“images/a010.bmp”)
frames=frames+[x,]
x=loadImage(“images/a011.bmp”)
frames=frames+[x,]
f=1
image(frames[0],0,0)
enddraw()
Thevariableframesisalistholdingalloftheimages,andframes[i]istheithimagein
thesequence.
Thebuildingofthefilenameisinteresting.Itiscommontousenumberednamesfor
animationframes; thingslike frame01,frame02,and so on. In this case the sequence is
a***.bmpwhere the *** represents a three-digit number. If the variable i is an integer,
thenstr(i)isastringcontainingthatinteger,butleadingzerosarenotpresent.Thus,for
valuesofibetween0and9(onedigit),thestringwillbe“a00”+str(i)+“.bmp”;forvalues
ofibetween10and99(twodigits),thestringwillbe“a0”+str(i)++“.bmp”;finally,for
numbersbetween100and999,thestringwillbe“a”+str(i)+“.bmp”(three digits). The
leadingzerosaremanuallyinsertedintothestring.
Theanimationframesforthegaitsequenceareonthediskalongwiththiscode.
Example: Simulation of the Space Shuttle Control Console
(A Class That Will Draw an Animation at a Specific
Location)
Animationscan sometimesbeusedtodecorateascenein interestingways. Acontrol
panel showing video screens and data displays could use animations to fill the screens,
givingtheillusionofrealthingsbeingmonitored.Aclassthatcanplayaframe-by-frame
animationatanylocationonthescreencouldbeinstantiatedmanytimes,onceforeach
display.
Theclasswouldhavetoreadtheframesitwastoplayandstorethem,playbackthe
framesinaloopwhenrequested,andplacethemwithinthewindowatanylocation.None
ofthesetasksisespeciallyhard.Codeforreadingframesfromafilewaswrittenforthe
previousexample,aswascodefordisplayingtheframes.Eachclassinstancewouldneed
aframecountsothattheloopcouldstartoverattherightplace,andeachclassinstance
couldhaveananimation withadifferentnumber offrames.Finally, placingattheright
locationisamatterofpassingthecorrectparameterstotheimage() function. The class
wouldbeinstantiatedgiventhepositionasxandycoordinatesoftheupperleftcorner.
Sometimes, especially when multiple animations are playing, it will be necessary to
slowdownsomeanimationssothattheylookright.TheGlibcodecallsdraw()afixed
number of times each second, but that may not always be the correct speed for an
animation.Acountcanbeintroducedsothattheframeadvancestothenextonlywhena
countexceedsafixeddelayvalue.Ifthecountis2,forexample,then2callstodraw()are
requiredbeforeanewframeischosen,meaningthattheframeratehasbeendecreasedby
50%.
Thespecificexampleissupposedtoimplementa“simulation”ofaspaceshuttlecontrol
console.Thisisavisualsimulation,notonethatallowsinteractionatanylevel,andthe
idea is to insert animations into a still photo of a real shuttle console and make it look
moreactive.Figure9.4ashowsthestaticimagethatwillbeused.Therearemanyvideo
screensvisible,andtheprogrambeingdevelopedwillreplacethestillimageonsomeof
thosescreenswithmoving,animatedimages.
Threeofthescreensareselectedforanimation.TheimagewasdisplayedusingPaint
andthecoordinatesoftheupperleftcornerofeachofthesescreenswasdetermined,as
werethesizes.Figure9.4bshowsthelocationoftheseregionsontheimage.
Thecodefortheclassstartslikethis:
classAnim:
def__init__(self,x,y):#Constructor––––-
self.frames=[]#Theactualimages
self.xpos=x#Positionofupperleft
self.ypos=y
self.n=0#Howmanyframesarethere?
self.f=0#Whichframeiscurrently
#beingshown?
self.active=False#Isthisanimationbeing
#played?
self.delay=1#Usedtoslowtheframe
#rate
self.count=100000#Whencount>delayaframe
#isdrawn
defdraw(self):
ifself.active:#Drawthecurrentframeatthe
#correctlocation
image(self.frames[self.f],self.xpos,self.ypos)
self.count=self.count+1#Incrementcount.
ifself.count>=self.delay:#Changetheframe
#yet?
self.f=self.f+1#Yes.Andalso
#resetthecount
self.count=0
if(self.f>=self.n):#Looptheframes;
#startoverat0
self.f=0
Thepartoftheclassthatreadstheframesasimagesisbasicallytakenfromtheprevious
example:
defgetframes(self,s1,s2):
self.frames=[]#Thelistvariable
#'frames'containsall
#images
foriinrange(0,100):#Upto100imagescanbe
#read.
ifi<10:
s=s1+“0”+str(i)+s2
print(“Reading“,s)
elifi<100:
s=s1+str(i)+s2
x=loadImage(s)
ifx==None:
self.n=i
print(“Saw“,self.n,”frames.”)
break
self.frames=self.frames+[x,]
There is a flag named active that determines whether the animations are currently
runningornot.Themethodsstart()andstop()turntheanimationonandoffbytoggling
thisvariable.
defstart(self):
self.active=True
defstop(self):
self.active=False
Finally,forthisclass,thedelaycanbesetusingacalltothesetdelay()method,which
simplychangesthevalueofaclasslocalvariabledelay.
defsetdelay(self,d):
self.delay=d
Thedraw()methodoftheprogramsimplydrawsthe
animationsbycallingtheirrespectivedraw()methods:
defdraw():
a.draw()
b.draw()
c.draw()
Themainprogramopensthewindowandloadsanddrawsthebackgroundimage:
startdraw(800,531)#Thesizeofthebackgroundimage
background=loadImage(“images/800px-STSCPanel.jpg”)
image(background,0,0)
Thefirstanimation,atx=239andy=284,willshowsometelevisionstatic,sevenframes
ofwhichwerecreatedforthispurposeusinganotherprogram.
A class instance is created to draw at (239,284) and getFrames() is called to load the
images(thefilenamesare“g100.gif”through“g106.gif”):
a=Anim(239,284)
a.getframes(“images/g1”,“.gif”)
Thesecondanimationisatx=319andy=258andwilldisplaysomeexteriorshotsofthe
spaceshuttle.Theprocessisthesameasbefore,butthefilenamesare“g200.jpg”through
“g204.jpg.”Inaddition,adelayof100isset,becausetheseimagesaretobedisplayedfor
multiplesecondseachtosimulateadisplayscanningasetofcameras:
b=Anim(319,258)
b.getframes(“images/g2”,“.jpg”)
b.setdelay(100)
Finally the third animation, at x=319 and y=322, consists of a computer display
showingPythoncode(thisclass,infact).Itwascreatedbyanotherprogramandconsists
ofnineframesnamed“g300.gif”through“g308.gif.”Thisanimationisdelayedalittleas
wellsothatitappearsasifthetextisscrollingproperly:
c=Anim(319,322)
c.getframes(“images/g3”,“.gif”)
c.setdelay(10)
Thelaststepintheprogramistostartalloftheanimationsplaying:
a.start()
b.start()
c.start()
enddraw()
The example is complete on the disk, and needs to be executed with the images
directory,whichcontainstheanimationframes.
https://commons.wikimedia.org/wiki/File:STSCPanel.jpg
9.4 RGBACOLORS–TRANSPARENCY
InChapter7,itwasseenhowitwaspossibletousetransparencyinanimagetoallow
thevisualizationto‘seethrough’toanimageinthebackground.Asithappens,anypixel
canbeassignedadegreeoftransparencythatpermitsthesamevisualcharacter.Acolor
can be assigned a value that dictates how opaque or transparent it is, allowing colors
behindittoinfluencehowthatpixelisseen.Onecanthinkofthisasafourthcolorvalue,
inadditiontored,green,andblue.Itisreferredtoasalpha,andacolorwithfourcolor
parametersissaidtobeintheRGBAcolorspace,forRed,Green,Blue,andAlpha.
If the value of Alpha is 255, then the color is opaque; as it decreases in value the
transparency increases until at Alpha=0 pixel or object cannot be seen. A program that
drawsthreeoverlappingcirclesusingcolorswithanAlphavalueof60showsthevisual
effect of using transparency (Figure 9.5a). Transparency is specified in this case by
providingtheAlphavalueasafourthparametertofill():
fromGlibimport*
startdraw(300,300)
fill(255,0,0,60)
ellipse(100,100,150,150)
fill(0,255,0,60)
ellipse(200,100,150,150)
fill(0,0,255,60)
ellipse(150,200,150,150)
enddraw()
Transparencycanbeaddedtostrokecolorsalso,andinthesameway(Figure9.5b).For
example:
stroke(255,0,0,65)
9.5 SOUND
Soundisanessentialcomponentofdigitalmedia.Proof?Almostnobodywatchessilent
filmsanymore, and nobody makesthem. Video games are rarelyplayed with the sound
turnedoff.Thereareafewimportantreasonsforthis.
1.Muchhumancommunicationisthroughsound.Speechisthebestexample,but
non-speechsounds,clapping,stampingoffeet,andsoon,arewaysthatpeople
maketheirfeelingsandintentionsknown.
2.Soundsareassociatedwithevents.Whenanobjectfallstothefloorasoundoccurs
withtheimpact.Abuttonispressedandadoorbellrings.Thesesoundsare
importantindicators.
3. Sounds cause emotional reactions in people. Music can do this; it can convey a
moodbetterthanalmostanythingelse.Butsoundcanalsoindicatethingsunseen.A
growling in the dark; a screech in the sky; the sound of an approaching vehicle
aroundacurveintheroad.
InGlibasoundismuchlikeanimageintermsofhowitisused.Asoundfileisloaded
and assigned to a variable, then that variable can be used to play, stop, rewind, and
perform all audio operations on that sound. Each sound must be loaded into a distinct
variableandhasitsowncontrols.TheGlibinterfacetosoundsfilesshouldthereforelook
familiar.
Oneproblemisthatthesoundsystemdoesnothavealargevarietyofsoundfilesthatit
canhandle:“.wav”and“.ogg”areaboutit.Thisleavesthemorepopularformat,“mp3,”
outofcontentionforPythonmediasoftware,atleastfornow.
Thefirststepinplayingasoundistoloadthefile.ThefunctionloadSound()isused
forthis,passingthenameofthesoundfile:
s=loadSound(“song.wav”)
PlayingthesoundisdoneusingtheplaySound()functionofGlib:
playSound(s)
StoppingasoundfromplayingisamatterofcallingstopSound().Settingthevolume
meanscallingvolumeSound()passingaparameterbetween0.0and1.0,where0.0isno
soundand1.0ismaximumvolume.That’sprettymuchitforthebasics.
Example:PlayaSound
The act of reading and playing a sound file will illustrate the essential operations for
usingsound.Thefileinputisdoneasaninitialization.Startingtoplaythefilecouldbe
donethatwaytoo.CallingplaySound()repeatedlyfromdraw()willcausethefiletostart
playing over and over again. A solution is to start some sounds in the main program;
anotheristosetaflagwhenasoundstartsplayingandchecktheflag.Thebestwaywould
betorecordthetimewhenthesoundstartedplayingandseeifthecurrenttimeexceeds
thelengthofthesound.
IfplaySound() is called with a second parameter, an integer, then the sound will be
replayed or repeated that many times. The call playsound(s,3) plays the sound s three
times.Iftherequestedfiledoesnotexist,thenloadSound()returnsNone.
Example: Control Volume Using the Keyboard. Pause and
Unpause
This example adds volume control. The function volumeSound() accepts a single
parameter, and it is a number between 0.0 (lowest volume) and 1.0 (highest volume).
AddingavolumecontrolisassimpleascodingakeyPressed()functionthatchangesthe
volumelevelbyanincrementeachtimeakeyispressed.Use“+”foravolumeincrease
and“-”foradecrease.Thenewprogramis:
fromGlibimport*
defkeyPressed(k):
globalvolume,s
ifk==K_PLUS:
volume=volume+.1
elifk==K_MINUS:
volume=volume-.1
ifvolume<0:volume=0
ifvolume>1:volume=1
volumeSound(s,volume)
startdraw()
s=loadSound(“sun.wav”)
volume=1.0
volumeSound(s,volume)
ifs==None:
print(“Nosuchsoundfile.”)
else:
playSound(s)
enddraw()
Example: Play a Sound Effect at the Right Moment:
Bounces
Asoundeffectrepresentssomeevent,andneedstobeplayedatthemomenttheevent
happens.Synchronizingthetwothingsisassimpleasplayingasoundwhentheeventis
detected. This example program will play a sound representing a ball hitting something
when a simulated ball hits the side of the window and bounces. The bouncing ball
animation program will provide the impact event: when the ball hits the side of the
window,thesoundofanimpactwillbeplayed.
The sound effect is a file, and was recorded using an inexpensive microphone, a
computerwithasoundcard,andtheAudacitysoftware,whichisfreeanddownloadable
(see the end-of-chapter resources). The sound of a glass hitting a desk was recorded,
edited, and saved as a “.wav” file named “bounce.wav.” The program was modified to
readthatfile,andthenplayitbackwheneveracollisionwiththewindowwasdetected.
Theprogramhasthreenewlinesofcode:
#Bouncingballanimation.
fromGlibimport*
defdraw():
globaldx,dy,x,y
background(200)#Erasethepriorframe
x=x+dx#Changeballposition
y=y+dy
ifx<=15orx>=width-15:#BounceinXdirection?
playSound(s)
dx=-dx
ify<=15ory>=height-15:#BounceinYdirection?
dy=-dy
playSound(s)
ellipse(x,y,30,30)#Drawtheball
startdraw(200,200)
x=100#Initialxpositionofthe
#ball
y=100#Initialyposition
dx=3#Speedinx
dy=2#Speediny
s=loadSound(“bounce.wav”)
fill(30,200,20)#Fillwithgreen
enddraw()
Inmanysituationsthere can beasmall delaybetweentheevent andthesound being
played.Asoundcanrarelybeplayedinstantaneously.
9.6 VIDEO
The video facilities provided by Glib are limited, but the library increases the
functionalityoftheunderlyingPygamemodule. It isimportant tounderstand thatvideo
playsataparticularrateand,unlikeaudio,mustacquireaportionofthedisplaywindow
withinwhichtobedrawn.Thismeansthatplayingavideointhenormalwaytakescontrol
awayfromtheprogrammerandallowsthevideosoftwareautonomy.Thiscancausesome
trouble,asprogrammersoftenwanttodrawintothedisplaywindowaswell.
UsingthebasicfunctionalityitispossibletoplayanMPEG-formattedvideowithsound
anywhereinthewindow,andthewindowcanbesizedtosuitthepurpose.Onlyasingle
videocanbeplayedinthiswayatonetime,though.Avideohascharacteristicsofboth
sounds and images: the video resides in a file; it is placed in a specific location in the
window, and has a two-dimensional size (a width and height), but the image displayed
changes as a function of time. Also, a video can have sound as one of its properties.
However,becausemultiplevideosmayhavemultiplesoundchannels,onlyoneofthemis
giventhesoundoutputchannel,meaningthatonlyonecanplaysound.
A video is read into a variable in the same way as an image or sound: a function
loadVideo()returnsavariablethatreferencesthedataonanMPEGfilewhichisspecified
byfilenameastheparameter.AvideoisplayedbycallingthefunctionplayVideo()and
passingthevaluereturnedbyloadVideo().Thesmallestprogram thatplaysavideo file
wouldbesomethinglikethis:
fromGlibimport*
startdraw(400,400)
s=loadVideo(“ellipsis.mpg”)
ifs!=None:
playVideo(s)
enddraw()
Thefilebeingopenedandplayedisoneprovidedontheaccompanying
disc, “ellipsis.mpg,” and it has no sound. The variable s represents the video in the
program,andisreturnedbyloadVideo().Itis,infact,areferencetoaGlibclassnamed
Gvideo.IfithasthevalueNone,thennovideofilewasloadedforsomereason:perhaps
the file does not exist or is in a format that can’t be processed. Glib only recognizes
MPEG video files, and even then only MPEG I. The playVideo() function places the
imageintheupperleftcornerofthewindowandbeginsplayingit,soundincluded.
ThereareasmallcollectionofusefulfunctionsinGlibfordealingwithvideos.They
are:
pauseVideo(m):Parametermisavideo.Pausesthevideoifitisplayingorresumesit
ifitispaused
stopVideo(m):Parametermisavideo.Stopsplayingthevideom
rewindVideo(m):Parametermisavideo.Returnsthevideotothebeginning
isVideoPlaying(m):Parametermisavideo.ReturnsTrueifthevideoisplaying
setVideoVolume(m, v): Parameter m is a video. Adjusts the audio volume on the
videomtobethevaluev,wherev=0istheminimumandv=1.0isthemaximum
lengthVideo(m):Parametermisavideo.Returnsthelengthofthevideoinseconds
whereVideo(m):Parametermisavideo.Returnsthecurrentlocationinthevideo,in
secondsfromthestart
getVideoFrame(m):Parametermisavideo.Returnsthecurrentvideoframeplaying
setVideoFrame(m, f): Parameter m is a video. Changes the playback so the next
frameinthevideotoplayisframef
getVideoPixel(m, x, y): Parameter m is a video. Returns the value of the pixel at
location(x,y)intheframecurrentlybeingdisplayed
sizeVideo(m):Parametermisavideo.Returnsthedimensionsofavideoframe.The
videomustbeloaded
videoSize(s):Parametersisastring.Returnsthedimensionsaframeinthevideofile
s,wheresisastring.Usethisforfindingthesizebeforeloadingthefile
locVideo (m, x, y, w, h): Position the video m at position (x,y) in the window, and
makeitwxhpixelsinsize.Doesnotstartitplaying
Example: Carclub – Display the Video carclub2.mpg
(Annotated)
Whenthepkeyispressedthevideowillpause,andwhenpressedagainitwillresume.
Displaythevideoatlocation100,100inthewindowandmakeit200x200pixels.Display
thenumberofthecurrentframeandthecurrenttimeoftheframebeingplayed.
Thisprogramuseseightofthefifteenvideofunctions.Firstthefileisloadedandthe
locationandsizearesetusinglocVideo().Thenthemainprogramstartsthevideoplaying.
Thedraw()functionisresponsibleforupdatingthenumericvaluesdisplayed.Itresets
thebackgroundandthencheckstoseeifthevideoisplaying;itdisplays“Playing”ifso,
or“Notplaying”otherwise.Thecurrentposition,currenttime,andtotaltimeareextracted
anddisplayedusingcallstotext().
Finally the pause feature is implemented. In the function keyPressed() the function
checksthatthekeypressedwas“p,”andifsoitcallspauseVideo().Thisfunctionkeeps
track of whether or not the function is playing and knows whether to start or stop the
video.Hereistheprogram:
fromGlibimport*
defdraw():
globalvid
background(0,200,190)#Clearthebackground,
#settoaqua.
ifisVideoPlaying(vid):#Playing?Printan
#indicator
text(“playing”,10,40)
else:
text(“Notplaying”,10,40)
whr=whereVideo(vid)#Currenttimeofplay
text(“Frame“+str(getVideoFrame(vid)),10,60)
#Currentframe
text(“Length“+str(whr)+”of“+str(lengthVideo(vid)),
40,370)
defkeyPressed(k):
globalvid
ifk==K_p:#Keypressedwasa'p'
pauseVideo(vid)#Pause
startdraw(500,500)
vid=loadVideo(“carclub2.mpg”)#Loadthecarclubvideo
locVideo(vid,100,100,200,200)#Positionitat
#(100,100)
playVideo(vid)#Playit.
enddraw()
A screen shot of this program in action is shown in Figure9.5. Note that the current
time is displayed to 16 or so digits. This can be changed to display something more
reasonable(see:Exercise8).
Therearefourdifferentwaysto“play”avideousingGlib,andeachhasadistinctsetof
prosand cons and a processfor how to manage the video. After loading the video into
variablem:
1.UsingtheGlibfunctions–loadVideo(),locVideo(),play()andsoonisolatethe
programmerfromtheactualvideoclassGvideo.Thesearetypicallyusedwhen
thereareoneortwovideosandthereisnocomplicatedprocessinggoingon.Only
onevideowithsoundcanbeplayed;theotherswillplaymuted.
2.AutoPlay(m)–Thevideomwillplayautomaticallyinthecurrentdisplaywindow.
Theuserhassomecontrol:thevideocanbepausedandcanbelocatedataspecific
locationandsize.Itwillplayattheinternallydesignatedrate(framespersecond)
withsound.Onlyonevideowithsoundcanbeplayedinthisway;theotherswill
nothavethesoundplayed.
3.PlayVideo(m)–Thevideowillplayatitsinternalframerate,butaframewillnot
bedisplayeduntilthedrawVid()functioniscalled.IfdrawVid()iscalledinsideof
theuser’sdraw()functionthentheframeswillbedisplayedattheGlib-specified
framerate,butsomevideoframesmightbemissed.
4.DrawFrame(m,f)–Theframenumberedfofthevideomwillbedrawn.Sound
willnotplay.Thispermitsthebestcontrolofthevideo,becauseeachframecanbe
played at any speed without missing any; or, every second or third frame can be
played;orrandomframescanbeplayed.Thevideocanevenbeplayedbackwards–
simplystartfatthelargestframevalueanddecreaseby1eachtime.
Threeofthesedifferentstylesareillustratedinthetablebelow:
Exercise:ThresholdaVideo(ProcessingPixels)
InChapter7,aprogramwaswrittenthatthresholdedanimage.Itconvertedeachpixel
toagreyvalueandifthatvaluewassmallerthanaspecifiedthresholditwouldbesetto
black;otherwiseitwouldbesettowhite.Eachframeofavideoisanimage,anditshould
bepossibletothresholdeachframeandthendisplayit.GlibprovidesthegetVideoPixel()
functionthatreturnsthevalueofaspecifiedpixelinthecurrentframeofavideo.
Thisexamplewillusedraw_frame()todisplaythevideo,anddraw_frame()willbe
calledfromtheuser’sdraw()function.Aftertheframeisdisplayedeachpixelintheframe
isexamined(getVideoPixel()),convertedtoagreyvalue(grey(p)),andtestedagainsta
threshold;if smallerthan the threshold,it will be drawn as black bycalling fill(0)and
drawingthepixelwithpoint().Otherwise,thepixelwillbedrawnaswhite.Theoriginal
imageisdisplayedatthetopofthewindow,thethresholdedonebelow.Thusthewindow
has to be created initially with double the height of the image to make room for two
copies.
fromGlibimport*
defdraw():
globalframe,v,wid,ht,x
background(200)
draw_frame(v,frame)
foriinrange(0,wid):
forjinrange(0,ht):
p=getVideoPixel(v,i,j)
g=grey(p)
ifg<t:
fill(0)
else:
fill(255)
point(i,j+ht)
frame=frame+1
fill(0)
text(“Original:Frame”+str(frame),10,30)
text(“Thresholded:Frame“+str(frame),10,ht+30)
s=videoSize(“carclub2.mpg”)
startdraw(s[0],s[1]*2)
v=loadVideo(“carclub2.mpg”)
frame=1
wid=s[0]
ht=s[1]
t=100
locVideo(v,0,0,wid,ht)
enddraw()
9.7 SUMMARY
Afacilityhasalreadybeendescribedfordisplayingimagesandgraphics:Glib,andhere
moremediacapabilityisaddedtoGlib,thusbuildingonwhathasalreadybeendiscussed,
sothatsoundandvideocanbedisplayed.Thenew
library is dynamic Glib and offers the same functionality as previously plus sound,
animation,andvideo.
Using mouse position and button presses is a basic form of communication with a
computer. The Glib module keeps track of the mouse using Pygame and continually
updatesthepositionastwovariables:mouseXandmouseYholdthemostrecentxandy
coordinates.IftheuserwritesafunctionnamedmousePressed(),thenitwillbeexecuted
whenamousebuttonispressed.Similarly,whenamousebuttonisreleasedittriestocall
mousereleased().A software graphical button is a rectangle or other area which, if the
mousebuttonisclickedwhilethemousecursoriswithinthatarea,willperformatask;in
otherwords,theclickwhilethecursorisinthatareacallsafunction.
Thekeyboardissimilarlydealtwithbyhavingauser-codedfunctionkeyPressed()and
anothernamedkeyReleased().Theyarepassedthevalueofthekeythatwaspressedas
theparameter.
Animationis performedby rapidlydisplaying drawn images, or frames,one afterthe
other,orbycreatinganddrawinggraphicalobjectsandthenchangingtheirpositions.A
functionnameddraw()canbewrittenbytheprogrammertodrawtheframesmanytimes
eachsecond.
Soundsaredisplayedbyreadingthemfromafileandcallingaplay()functionwhenthe
soundisneeded.Soundscanbemusic,voice,ambiance,orsoundeffects.
Video is the most resource-consuming of the media types. Glib allows videos to be
recalledandplacedinthewindow,andtobeplayedautomaticallyorframebyframe.
Exercises
1.Writeaprogramthatfiguresouthowfastthemouseismoving(pixelspersecond
assuming30framespersecond)anddisplaysthatvalue.
2.Considertheexamplethatprintsacircleatarandompositionwhena“+”keyis
pressed.Modifyitsothatwhenthe“-”keyispressed,thepreviouscircleisdeleted
(nolongerappearsonthescreen).
3.Writeaprogramthatreadslinesfromafileaspairsofx,ycoordinatesonasingleline
anddrawthemall.Eachlinewouldhavefourintegers:
100100200200
whicharethe(x,y)coordinatesofthestartandendpointsoftheline.Intheexample
above,thelinewouldbedrawnbetween(100,100)and(200,200).
4.Implementacircularbutton.Itisrepresentedonthescreenasacircleat(100,100)of
size30pixels.Normallyitisred,butitturnsgreenwhenactivated.Whenamouse
buttonispressedwhilethebuttonisactivated,arectangleisdrawnsomewhere
(random)inthewindow.
5.Implementabuttonthatnormallyhasthetext“Yes”drawnwithinit,butthatchanges
thattextto“No”whenthebuttonisactivated.Pressingitdoesnothing.
6.UseGlibtoimplementatextboxthatpermitsafilenameorothertexttobeentered
whenthemousecursoriswithinthatregiondefinedbythebox.Usethistocreatea
programthatallowstheusertoenterafilenameofanimageandhavetheprogram
displaythisimageinthewindow.
7.ModifytheprogramfromExercise5abovesothatasoundismadewhenthebuttonis
pressed.Aclickingsoundwouldbemostappropriate,butwhateveritisitmustbeof
shortduration.
8.Floatingpointvalues,suchasthecurrenttimeofthevideoinseconds,areoften
convertedintostringsandrequiretenormoredigitstobedisplayed.Writeafunction
thatchangesafloatingpointnumbersothatitwilldisplayinfivedigits,andmodify
thevideodisplayprogramnamedcarclubsothatallfloatsaredisplayedwithonlytwo
digitstotherightofthedecimal.
9.Modifythesimulationofthespaceshuttleconsolesothatvideosareplayedinthe
simulatedscreensinsteadofframe-by-frameanimations.
NotesandOtherResources
ThankstotheestateofcomposerandmusicianandfriendMichaelBeckerforthe
useofthesong‘HoldingOn,’andfortheuseofthe.wavfile.
DownloadforthePygamemodule.http://www.pygame.org/download.shtml
Anexcellentsoundeditorfor.wavand.mp3files.http://www.goldwave.ca/
Anotherexcellentsoundfileeditor.http://sourceforge.net/projects/audacity/
Complete Pygame documentation.
https://media.readthedocs.org/pdf/pygame/latest/pygame.pdf
Convert video files into MPEG-I format. http://video.online-convert.com/convert-to-
mpeg-1
A good tutorial on video formats. http://www.videomaker.com/article/c10/15362-
video-formats-explained
Free software that converts between video formats. http://www.any-video-
converter.com/products/for_video_free/
1.AlSweigart.(2012).MakingGameswithPython&Pygame,CreateSpace
IndependentPublishingPlatform,ISBN-13:978-1469901732.
2.SeanRiley.(2003).GameProgrammingwithPython,CharlesRiverMedia.
3.VicCostello.(2016).MultimediaFoundations,2ndedition,FocalPress,ISBN-
13:978-0415740036.
4.RichardBoulangerandVictorLazzarini(Eds.).(2010).TheAudioProgramming
Book,TheMITPress,Har/DVDedition,ISBN-13:978-0262014465.
5.Sendpoints.(2015).GUI:GraphicalUserInterfaceDesign,ISBN-13:978-
9881383495.
6.MaheshVenkitachalam.(2015).PythonPlayground:GeekyProjectsforthe
CuriousProgrammer,NoStarchPress,ISBN-13:978-1593276041.
CHAPTER10
BASICALGORITHMS
10.1Sorting
10.2Searching
10.3RandomNumberGeneration
10.4Cryptography
10.5Compression
10.6Hashing
10.7Summary
Inthischapter
Analgorithm,asdiscussedinpreviouschapters,isastep-by-stepdescriptionofameans
tosolveaproblem.Assomeonewhoislearningtoprogram,whatarethemostimportant
algorithms? That rather depends on how “important” is defined. Does it reflect
commercialvalue?Numberoftimesitisused?Pedagogicaluses?Sincetherearemany
waysanalgorithmcanbeimportant,thischapterdealswiththemostcommonalgorithms
discussedonprogrammingwebpagesandinintroductorycomputingtexts.Noneofthese
methodsrequireaknowledgeofadvancedmathematicsordatastructures.
10.1 SORTING
Mostpeopleknowwhatsortingisandcansortasmallsequenceofnumbersinafew
seconds.Eachmayhaveadistinctstrategyfordoingit,butfewcan
explaintosomeoneelsehowtosortanarbitrarysetofnumbers.Theythemselvesmaynot
know how they do it; they can simply tell when something is sorted, and have some
processforsortinginmind.Inshort,theprocessofsortingisoneofthesimplestthings
thatishardtodescribe.
Becausesortingissoimportantincomputerscience,ithasbeenstudiedatgreatlength.
Butwhatisit?Sortinginvolvesplacingthingsinanorderdefinedbyafunctionthatranks
themsomehow.Fornumbers,rankingmeansusingthenumericalvalue.So:thesequence
1,3,2isnotinproperorder,but1,2,3isinascending(gettinglarger)orderand3,2,1is
indescending(gettingsmaller)order.Formally,asequencesisinascendingorderifsi<=
si-1foralli.Theactofsortingmeansarrangingthevaluesinasequencesothatthisis
true.Itisclearthatitcanbedecidedwhenasequenceissorted.
So how can a sequence be placed in sorted order? By using a sorting algorithm, of
course.Forallofthefollowingdiscussiononsorting,assumethattheproblemistosort
intoascendingorder.
10.1.1 SelectionSort
Smallsequencesareeasiertosortthanlongerones,andmayprovidesomeinsightinto
theprocess.Thesequence:
84
isnotsortedinascendingorder,buttestingthisiseasyandfixingitistrivial:simplyswap
thetwovalues.Thelongersequence:
849
is also not sorted but is more difficult to sort because it is longer and there are more
combinationsofthenumbersthatareunsorted.Howcanthissequencebeplacedinorder?
Here’soneidea:
1.Findthesmallestelementinthelist.
2.Swapthatelementfortheelementatthebeginningofthelist.
3.Findthesmallestelementintherestofthelist.
4.Swapthatelementforthesecondelementinthelist.
…andsoonuntilthelistissorted.
Thisiscalledtheselectionsortalgorithm,becauseateachstageitselectsthesmallest
oftheunsorteditemsinthelistandplacesitwhereitbelongs.Considerthefollowinglist:
[12,18,5,21,9]
01234-index
Thesmallestelementinthislistis5,atindex2.Swapelement2forelement0:
[5,18,12,21,9]
Theboldelementsaboveareinsortedorder,whichhereisonlytheoneatlocation0.
Fortheremainderoftheelements,repeattheprocessoffindingthesmallestelementand
placingitatthebeginningoftheunsortedlist(element1).Thatmeansswapping9for18,
element4forelement1:
[5,9,12,21,18]
Repeating,itturnsoutthatelement2,value12,isnowthesmallest,andisinthecorrect
place.
[5,9,12,21,18]
Nowthevalue18issmallestandshouldbeplacedatlocation3.
[5,9,12,18,21]
Nowthesortiscomplete.Whenonlyoneremainsitmustbeinthecorrectplace.
Findingthesmallestelementinalistinvolvesthreethings.First,beginwiththeinitial
elementandassumethatisitthesmallest.Identifyitusingitsindeximin.Next,checkthe
valueofallsuccessiveelementsinthelist(fromimin to theendof thelist)againstthe
valueatimin.Finally,inthecasewhereoneofthesuccessivevaluesatindexkissmaller
than the one at index imin, set imin to k to indicate where a new smallest value was
found.Insimple,impreciseEnglish,scanallof theelementsaboveiminandremember
thelocation ofthe smallest one.Presuming thatthe list tobe sorted isnamed data,the
codeforfindingthesmallestelementfromimintotheendofthelistis:
foriinrange(imin,len(data)):
ifdata[i]<data[imin]:
imin=i
Thiscodedoeswork,butitmodifiesimin,whichisusedtodeterminetheloopbounds,
withintheloopitself.Thiscanbeconfusingtosome,andisbadformgenerally.Itisbetter
tocodethisloopas:
imin=istart
foriinrange(iend,len(data)):
ifdata[i]<data[imin]:
imin=i
What happens after this is to swap the smallest value found for the one at location
istart. In most programming languages this would take three statements, which would
looksomethinglikethis:
temp=data[imin]
data[imin]=data[istart]
data[istart]=temp
Oneofthejoys ofPythonisthat thisswapcanbe performedusingadifferent,some
wouldsayprettier,syntax:
(data[istart],data[imin])=(data[imin],data[istart])
Thisisthecore of thealgorithm,andneeds tobedone for allvaluesof imin; that is
from0tolen(data)-1.Thisisanotherforloop,ofcourse,withinwhichthiscodeisplaced.
Thatouterloopwouldbe:
foristartinrange(0,len(data)-1):
Thisisallthatisneededforthesort.Writingitasafunction,itlookslikethis:
defselection(data):
foristartinrange(0,len(data)-1):
imin=istart
foriinrange(istart,len(data)):
ifdata[i]<data[imin]:
imin=i
(data[istart],data[imin])=(data[imin],data[istart])
Thissortingmethodappearstobenaturaltohumans.Itistheonemostoftendescribed
by students when asked how they sort numbers. It is not the fastest in many cases, but
does a small number of swaps. If the data is already sorted it does no swaps; if it is in
reverseorderitdoeslen(data)-1swaps,thesmallestthatcanbedoneandstillsortthelist.
Whenlookingatalgorithmsitiscommontodefineaworstcaseandabestcase,andto
defineperformancenotinsecondsbutintermsofoneoftheoperationsperformed.Inthat
waythenatureofthecomputer,whetheritisfastorslow,doesnotaffecttheanalysis.For
sortingitiscommontoselecttheoperationtobeusedasabasisforcomparisontobethe
compareoperation:data[i]<data[imin].Howmanyofthesearedone?
Thebestcasefortheselectionsortoccurswhenthelistisalreadysorted.Inthatcaseit
willperformclosetoN2comparisons,whereN=len(data).Thisisthesamenumberof
comparisonsneededfortheworstcase,inwhichthelistisinreverseorder.Atleastitis
consistent. However, it minimizes the number of times swaps occur, and if swapping is
expensivethenthiscouldbethesortingmethodtochoose.
Selection sort is unstable. If there are repeated values in the data, then they will of
courseenduptogetherinthefinal,sortedlist.However,ifasortisstabletheywillremain
inthesameordertheywereoriginally.Selectionsort,likemanyothers,doesnotguarantee
this.Itseemsasifthisisaminorthing,butitdoesmatterinsomecases.Consideralistof
namesinalistthataregiven,inorderofsomesortofscore,onawebpage.Namesfortie
scoresshouldalwaysbeinthesameorderonthepage,sothatifthepageisrefreshedora
linkisfollowedthepagelooksthesame.
Itshouldbesaidherethatgenerallythereisnobestsortingmethod.Thepropertiesof
suchamethodwouldbe:
1.Fast.SelectionsortisN2intermsofcomparisons.Thebestonecannormally
expectfromanysortwouldbeN*log(N)intheworstcase.
2.Doesnotneedextraspace.Thismeansthatthearraycanbesortedinplace,with
perhapsatemporaryvariableforperformingswaps.
3.PerformsnomorethanNswapsintheworstcase.
4.Adaptive.Themethoddetectswhenitisfinishedinsteadofloopingthrough
unproductiveiterations.If,forexample,suchamethodisgivenanalreadysorted
list,itwillfinishinasinglepassthroughthedata.
5.Stable.
Nomethodhasallofthesecharacteristics.
10.1.2 MergeSort
Iftherewerea“best”sortingalgorithmthenthiswouldbetheplaceto
describeit.Asthereisnot,perhapsthebestthingtodowouldbetolookatanalgorithm
thatisquitedifferentfromtheselectionsort,andthathaspropertiesthatitdoesnothave.
Themethodnamedmergesortfitsthatdescriptionnicely:itisanN*log(N)sort,itdoes
needextraspace,anditusesmorethanNswapsbutitisstable.
Merge sort is an example of a divide and conquer style of algorithm, in which a
problemisrepeatedlybrokenupintosub-problems,oftenusingrecursion,untiltheyare
smallenoughtosolve;thesolutionsarecombinedtosolvethelargerproblem.Theidea
behindmergesortistobreakthedataintopartsthatcanbesortedtrivially,thencombine
those parts knowing that they are sorted. Using the sample data from the selection sort
example, the first step in the merge sort is to split the data into two parts. There are 5
elementsinthislist,andthemiddle
elementwouldbeat5//2,or2,sothetwopartsare:
[12,18][5,21,9]
Splittingagain,thefirstsethas2elements,themiddlebeingat0;thesecondsethas3
elements,sosplitat1:
[12][18][5][21,1]
Thefinalsplitbreaksthedataintoindividualcomponents:
[12][18][5][21][1]
The splitting is done in such a way that the original locations are remembered. This
happensintherecursivesolution,butcouldbedoneinotherways.Onewaytovisualize
thisisasatreestructure:
Thiscompletesthedivide portionof thedivideandconquer. Now that the individual
elementsareavailable,itiseasytosortthem,aspairs.Onthelowerrightthepair[21]and
[9]isoutoforder,sotheymustbeswappedwitheachother.Nowtheyaresorted.Onthe
nextlevelupwards,lookingfromlefttoright,theelementsaresorted,althoughmostare
singleelements:
Moving up again, [12] and [18] are combined to make [12,18], a sorted pair. On the
right,thesingleton[5]ismergedwiththepair[9,21]bylookingatthebeginningofeach
listandcopyingthesmallestelementofthepairintoanewlist:
Theresultis:
Ateach stage,the listscontain moreelements andthey aresorted internally,smallest
elementatthebeginning.Combiningapairoftheseissimplyamatteroflookingatthe
elementatthebeginningofeachandcopyingthesmallestonetotheresultuntilthelists
areempty.Thenext,andfinal,mergeinthissetofdatawouldbe:
Step List1 List2 Mergedlist
1 [12,18] [5,9,21] [5] 5issmallerthan12,copy5
2 [12,18] [9,21] [5,9] 9issmallerthan12,copy9
3 [12,18] [21] [5,9,12] 12issmallerthan21,copy12
4 [18] [21] [5,9,12,18] 18issmallerthan21,copy18
5 [] [21] [5,9,12,18,21] Firstlistisempty,copy21
Thefinallistis[5,9,12,18,21]whichissorted,aspromised.
Oncethedatahasbeensplitintoindividualcomponents,themergestagecreatessorted
pairs,thenextmergecreatessetsof4sortednumbers,thenext8,andsoon,doublingeach
timeuntiltheyareallsorted.Alogicalwaytowritetheprogramistouserecursion,where
eachrecursivecallsplitsthedataintwomorepartsuntilthereisonlyoneelement.The
lowestlevelofrecursioncombinestheindividualsintosortedpairs,andreturnstothenext
levelwherethepairsarecombinedintofours,theneights,andsoonuntilatthehighest
levelthelistiscompletelysorted.Writtenasarecursivefunctionthisis:
data=[12,18,5,21,9]
defmergesort(data):
n=len(data)#Forthiscalltherearenelements
#tobesorted
ifn<=1:#Dividethedataintotwoparts
return#unlessn-1,whichmeanssortingis
#complete
middle=n//2#Indexoftheelementinthemiddle
lower=data[:middle]#Lowerindexes,ortheleft
#sublist
upper=data[middle:]#Largerindices,ortheright
#sublist
mergesort(upper)#Sorttheleftsublist
mergesort(lower)#Sorttherightsublist
#TherearenowtwosortedsublistsoflengthN//2.
#MergethemintoonelistoflengthN
(i,j,k)=(0,0,0)
whilei<len(lower)andj<len(upper):#Onesublist
#maybeshorter…
iflower[i]<=upper[j]:#Iftheelementatindexi
#ofthe
data[k]=lower[i]#leftlistissmaller,
#copyittotheresult
i=i+1
else:
data[k]=upper[j]#Otherwisecopytheelement
#atindexj
j=j+1#oftherightsublistto
#theresult
k=k+1#Resultgetslongerby1element
foriinrange(i,len(lower)):#Iftheleftlistwas
#longer,copy
data[k]=lower[i]#theremainingitemsto
#theresult
k=k+1
forjinrange(j,len(upper)):#Iftherightlistwas
#longer,copy
data[k]=upper[j]#theremainingitems
#totheresult
k=k+1
Themergesortisnotasobviousaswasselectionsort,butisfasterinmostcases.Ithas
anotherinterestingapplication:itcanbeusedtosortfiles.Ifafilecontains,forexample,a
billiondatasamplesthatneedtobesorteditisunlikelythattheycanbereadintomemory
andsortedwithaselectionsort.Howthentosortthem?
10.2 SEARCHING
Searching is the act of determining whether some specific data item appears in a list
and,ifso,atwhichindex.Itseemslikeanoddthingtodo;whatcanbedoneknowingthis
information?Itisespeciallyusefulwhenmultiplelistsholddifferentdataconcerningthe
sameitems.Anemployee,asoneexample,mighthavetheirvariousdatasavedasaname
list, an employee ID list, phone number, office number, home address, and so on. The
same index gives information of the same individual for each list. Thus, search the
employeeIDlistfor18762;ifthatindexis32,thentheemployee’snamecanbefoundat
name[32].
OfcoursePythonhasbuilt-inoperationsonalistthatwilldothis:
if18762inemployeeID:#IsthisIDamemberofthelist?
k=employeeID.index(18762)#Whatistheindexof
#18762?
A reason to examine searching algorithms is that not all languages possess these
specificfeaturesandnotallprogramsarewritteninPython.Anotheristhatsomeonehad
toimplementtheoperationsforthePythonsystemitself,andtheyhadtoknowhow.Did
theydoagoodjob?Arethebuilt-inoperationsasfastasonesthataprogrammercould
codeforthemselves?Thiswillbediscoveredusinganexperiment.
10.2.1 Timings
AnysectionofcodeinPythonrequiressomeamountoftimetoexecute.Thespecific
amountdependsonmanythings:thecomputerbeingused,the
Pythoncompiler,thespecificstatements,thedata,andrandomeventssuchaswhatother
programsareexecutingonthatcomputeratthesametime.However,ifitisimportantto
knowwhetherasectionofcodeisfasterthananother,therearetimingfunctionsthatcan
provide a pretty good idea. The time module includes a function named clock() that
returns(onWindows)theelapsedtimeexpressedinsecondselapsedsincethefirstcallto
thisfunction.OnLinuxitbehavesdifferently,andtime.time()maybeabetterchoice.Be
suretolookitup.
Timing a section of code is done by calling time.clock() before and after the code
executesandsubtractingthetwotimes.Forexample,timingasearchofalistusingthein
operatorcouldbedonethisway:
importtime
list=[19872,87656,10982,18756,56344,29765,12856,12534,
88768,90012]
t0=time.clock()
if90012inlist:
found=True
t1=time.clock()
print(“Timewas“,t1-t0)
Thisprintsthemessage:
Timewas2.062843880463903e-05
That’s a pretty small time, as is to be expected. When run again the result was
3.07232e-06; running again gets 2.194514766e-06 and again 7.9002531e-06. These
numbers are all small but very different. Since that is true it is better to time many
executionsofthecodeanddividebythenumberoftimesitran:
t0=time.clock()
foriinrange(0,10000):
if90012inlist:
found=True
t1=time.clock()
print(“Timewas“,(t1-t0)/10000)
This yields more consistent results: 5.5284e-07, 5.5951e-07, and 5.415e-07 in three
different trials. Averaging the result of multiple trials gives even better results, because
spurioustimesonanyonerunwillbeaveragedout.
10.2.2 LinearSearch
Considerthelistthatwasusedinthetimingexample:
list=[19872,87656,10982,18756,56344,29765,12856,12534,
88768,90012]
Findingwhetherthetargetnumber90012appearsinthislistisamatteroflookingat
eachelementtoseeifitisequaltothetarget.Ifsotheansweris“yes”and,bytheway,the
indexatwhichitwasfoundisalsoknown.Thiscanbedoneinabasicforstatement:
index=-1
foriinrange(0,len(list)):
iflist[i]==target:
index=i
break
#Ifthevalueofindexis>=0thenitwasfound.
Thisalgorithmlooksateachelementasking“Isthisequaltothetarget?”When/ifthe
targetislocated,theloopendsandtheanswerisknown.Ifthetargetisnotamemberof
thelist,thenthealgorithmhastoexamineallmembersofthelisttodeterminethatfact.
Thus,theworstcaseiswhentheelementisnotinthelist,anditrequiresNcomparisonsto
findthatout.Iftheelementisapartofthelistthen,ontheaverage,itwillrequireN/2
comparisonstofindit.Itcouldbethefirstelement,orthelast,oranyoftheothers,which
averagesouttoN/2.
Ifthelistisinsortedorderthentheloopcanbeexitedassoonasitisknownwhether
theelementisinthelistornot.Thatis,assoonasthetargetissmallerthantheelementit
isbeingcomparedagainstinthelist,itisclearthatitcan’tbeamemberofthelist,andthe
loopcanbeexited.Thisnormallyspeedsuptheexecution,butthepenaltyisthatthelist
hastobesorted,andthetimeneededtodothis(onlyonce,ofcourse)hastobetakeninto
account.
10.2.3 BinarySearch
Ifthelisthasbeensortedthenthereisafasterwaytosearchforanelement.Thelistcan
bedividedintotwopartsbylookingatthevalueinthemiddleofthelistandcomparingit
tothetarget.Ifthetargetissmallerthanthemiddleelement,thenitwouldhavetobein
thelowerindices(left),otherwiseitwouldhavetoexistinahighervaluedindex(tothe
right).Whatthismeansforperformanceisthatthesearchareaiscutinhalfeachtimea
comparisonisdone.
Thisidea seems simple,but is actuallydifficulttoget right in an implementation. At
conferences where many PhDs in computer science are presenting papers, it has been
foundthatfewerthan10%oftheparticipantscancodeabinarysearchthatworksthefirst
time.Theterminalconditionsaretricky:inparticular,howcanitbedeterminedthatthe
targetisnotinthelist?OK,sothedetailsarecrucial.Atthebeginningthereisalist,and
itslengthisknown.Theindexofthemiddleelementisknowntoo,andthelistissorted.
So:findtheindexofthemiddleelement:
istart=0
iend=len(list)
m=(iend+istart)//2
Ifthetargetisinthelist,isitatasmallerindexthanm(i.e.,islist[m]>target):
iflist[m]>target:
Ifso,don’tbotherlookingatanyindexbiggerthanm.Inotherwords,thelargestindex
tolookatwouldbem-1:
iend=m-1
Ifthetargetisinthelist,isitatalargerindex(i.e.,islist[m]<target)?Ifso,don’tlook
atanylocationswithanindexlessthanm;inotherwords:
eliflist[m]<target:
istart=m+1
Iftarget=list[m]thenithasbeenfoundandthealgorithmterminates.
else:
returnm
Thiscodehastoberepeateduntilthetargethasbeenfound,orithasbeendetermined
thatitisnotinthelist.Theloopconditioniscritical.Theloopcontinuessolongasistart
<=iendsothatifthefinalstepfindsthetargetinthelist,thenitwillreturntheindex.If
theloopexitswithoutfindingtheelement,thentheindexvalueis-1.Thefinalcode,asa
function,is:
defsearch(list,target):
istart=0
iend=len(list)
whileistart<=iend:
m=(iend+istart)//2
iflist[m]>target:
iend=m-1
eliflist[m]<target:
istart=m+1
else:
returnm
returnNone
The speed of the binary search depends on the fact that it is searching a randomly
accessibledatasetlikeaPythonlistoraJavaarray,andnotafile.Itwilltakeontheorder
of log(n) probes into the list to find what it is looking for or to determine that it is not
there.
Timingthebinarysearchgaveanexecutiontimeof3.305e-06seconds,stillslowerthan
thebuilt-inoperation.
10.3 RANDOMNUMBERGENERATION
Pythonoffersarandomnumbermodulenamedrandomthatoffersabroadcollectionof
random number generation facilities. How is it possible to generate a random number
usingsoftware?Shouldn’tacomputerprogramexecuteconsistentlyandalwaysproduce
thesameanswereachtime?Yes,itshould.Theresolutionofthisapparentproblemliesis
thedefinitionofrandom.
First,randomnessisdefinedonlyforcollectionsofeventsornumbers.Onenumber,or
even a small collection, can’t be said to be random. Randomness reflects the lack of a
pattern,andonlyoneortwoeventsdon’treallydisplayapattern.Randomnessismoreof
a statistical property of a sequence, and is not necessarily related strictly to
unpredictability. After all, if a computer program can generate random numbers, then it
shouldbepossibletopredictthenextoneitwillgenerate.
Arandomnumbergenerator(RNG)onacomputerisreferredtoaspseudo-random;itis
not truly random, but exhibits properties of randomness. These properties can be tested
statistically. A typical RNG returns a floating point number between 0.0 and 1.0. This
value can easily be transformed into a random number, either real or integer, in any
desiredrange.Adierollisanintegerbetween1and6inclusive.AnRNGfunctionnamed
rand01()canbeconvertedintoadierollas:
int(rand01()*6+1)
Ifthenumbersgeneratedbyrand01()arerandom,thenitshouldproducedierollsthat
eachhaveaprobabilityof1/6.Ifnotthenthereisabias.
If a coin is flipped many times and the sequence HTHTHTHTHTHTHT results, the
probabilityofHorT(headsortails)is0.5,or50%,whichiswhatwouldbeexpected.Ifa
sequencehasthecorrectpercentagesforeachoutcome,thenitpassesthefrequencytest.
Yetthissequenceisprobablynotrandombecauseoftheobviouspatternintheresults.The
frequencytestisnotenough.
A second test would consider pairs in the sequence and compare the probability of
occurrence of each pair against the theoretical. In the coin toss there are four possible
pairs:HH,HT,TH,andTT.Eachpairshouldappearwithequalprobability,andyetthe
stringaboveshowsonlyHTinstances.Itisnotrandom.Astandardsuiteofrandomness
testscalledDiehardincludesamorecomplexversionofthistest,involvinggroupsoffive
elementsinthesequence,eachonehavingatheoreticalprobabilityof1in120.Thiskind
oftestcanbecalledtheserialtestoroverlappingpermutations.
AthirdtestinvolvesusingtheRNGtogeneratepokerhands.Theprobabilityofspecific
handsiswell-known,andanyconsistentvariationfromtheseprobabilitieswouldimplya
flawintheRNG.Thisisthepokertest.Anycomplexrandomgamecouldbeused,andthe
Diehardsuiteusesthegameofcraps.
There are many other tests that could be applied, and all are based on generating
complex situations and comparing the theoretical distribution of properties generated
againstwhattheRNGcreates.So,nowthattherearewaysoftestinganRNG,canonebe
writteninPythonandtested?
10.3.1 LinearCongruentialMethod
Pseudo-random number generators basically shuffle the bits around in a number in
complexandnon-repeatingways;atleast,theydon’trepeatforalargenumberoftrials.A
historicallycommonmethodfordoingthisistocalculateavaluethatisboundtobelarger
thantheplacewhereitistobestoredandkeeponlytheremaindereachtime.Thevalueof
thisremainderispseudo-randomundercertainconditions.Alinearequationcanbeused
andisfasttocalculate:
Xi+1=(aXi+b)modm(10.1)
whereXiisthepreviousrandomnumberinthesequenceandXi+1isthenextone.The
valueofmshouldbequitelargeanditshouldbeaprimenumber.Manycomputershave
used a 32-bit integer size, and as it happens 232 – 1 is a good value for m ( =
2147483647).Pythonintegerscanbeaslargeasdesired,solargervaluescouldbeused.
Keepingthen to 32 bits is accomplished using an and operation and masking the result
witha32-bitconstant:0xFFFFFFFF.
Valuesforaandb aremore flexible,but largevalues area good idea,and toomany
factors can cause problems. One good set of values is a=69069 and b=362437. This
methodusesapreviousvaluetocalculatethenextone,soaninitialvalueisrequired.This
iscalledtheseed,anditmustbepossibleforauser/programmertobeabletosetthisseed
valuetowhatevertheychoose.IfnotthentheRNGwillgeneratethesamesetofvalues
eachtimeitis used.That’s actuallyagood thingfordebugging, becausewhentracking
downaproblem,itisimportantthattheprogrambehaveconsistently.
ThebasicRNGdescribedabovewouldbe:
_xseed=76951
defirand01():
global_xseed
_xseed=(69069*_xseed+362437)&0xFFFFFFFF
return_xseed
Thisfunctionreturnsanumberbetween0and2147483647,andresetstheseed(_xseed,
aglobal)eachtime.It’sagoodstart,butwhatiswantedisafunctionthatreturnsanumber
between0and1;so,asecondfunctiondoesthissimplybydividingtheaboveresultby
2147483647:
defrand01():
returnirand01()/0xFFFFFFFF
Afunctionthatcansettheseedisneededtoo:
defsetseed(x):
global_xseed
_xseed=x
AcommonlyusedfunctioninthePythonrandompackageisrandrange(a,b),which
returnsarandomintegerbetweenaandb.Thecodeforadiehasalreadybeenwritten,and
sothemathisknown.Usingthetoolsjustwritten,thisiscodedas:
defrandrange(n1,n2):
x=(int)(rand01()*(n2-n1+1))+n1
returnx
How can a random number generator be made to generate a different set of numbers
everytimeaprogramstartsusingit?Simplybysettingtheseedtoanumberthatishardto
predict. Such a number is found in the low bits (milliseconds and microseconds) ofthe
systemclock.Itisimpossibletopredictwhatthesewillbe.SorandomizingtheRNGcan
beaccomplishedlikethis:
defrandomize():
global_xseed
_xseed=int(time.time())&0xFF
Thetime.time()functionreturnsthenumberofsecondssinceafixeddateinthepast,
calledtheepoch.ThisdateisusuallyJanuary1st,1970,midnight.
Othermethodsforgeneratingrandomnumbersexistandarecommonlyused.Python’s
random class uses the Mersenne Twister algorithm, which is often seen as a default in
programming languages but is a trifle slow. Blum-Blum-Shub resembles the linear
congruentialbutusestherelationxi+1=xi2modmwheremistheproductoftwoprime
numbers.Dozensmoremethodsexist.Therearealsopracticalmethodsforgeneratingtrue
randomnumbers,andthesearebasedonspecifichardwarethatcapturesatrulyrandom
process such as radioactive decay, the photoelectric effect, or random electromagnetic
noise.
Finally,therearewebsitesthatwillofferrandomnumbersandsequencesonrequest.
Random.org will serve up true random numbers, for example, and there are dozens of
othersuchsites.Thetimeneededtoconnecttoaserveranduploadarandomnumberis
considerable, so they should be used knowing the tradeoff of time for random number
quality.
10.4 CRYPTOGRAPHY
Cryptographyinvolvessendingmessagesthatonlycertainintendedpeoplecanreceive
andunderstand.Thisinvolvescodesandciphers.Acodesubstitutesonestringforalonger
message;thereisacodebookinwhichthecodestringsareassociatedwiththeirrelevant
message. So, the string “A76” could mean “retreat 100 meters.” Code books had to be
changedregularlybecauseeventuallyonewouldfallintothehandsofsomeonewhowas
notsupposedtohaveone.
A cipher is an algorithm that converts one string of characters into another one of
generallythesamelength.Itcanoperateonbits,oncharacters,oronblocksofcharacters.
Acipherdoesnothaveacodebookbutdoeshaveakey,whichisastringofnumbersor
characters, that the algorithm uses to transform the original string (called the plaintext)
intotheencryptedstring(calledtheciphertext).Theciphertextcanbetransmittedsafely
becauseitcannotbeunderstoodwithoutthekey.
Cryptographyhasbecomemuchmoreimportantinthelast30yearsorso.It’snotjust
that the world is an uncertain place. It is more that people wish to share private
information across the Internet. If a purchase is made with a credit card, then the card
number should be encrypted before sending it to the seller. Access to certain sites that
have valuable services or information requires a password. Installing new software
requiresanaccesskey.Theseareallexampleswhereencryptionisrequired.
Itshouldbementionedthatthesecuretransferofinformationdependsonoperational
securityaswellasonencryption.Someonewithapasswordcan access allservicesand
dataassociatedwiththatpassword,sokeysandpasswordsmustbeprotected.Thisaspect
isbeyondtheabilityofaprogrammertocontrol,andisoftenthewaysecuritysystemsare
broken.
Thereissometerminologythatneedstobeunderstood.Asymmetrickeysystemuses
one key to encode and the same key to decode. Asymmetric systems like public key
systemsuseonekeytoencryptthemessage,akeythatanyonecanknow,andasecond,
privatekeythatonlytherecipientknowsandisusedtodecrypt.Ablockcipherappliesa
keytoacollection(block)ofdata,oftenasizeof64,128,or256bitsatatime.Astream
cipher is usually a symmetric key cipher that encrypts a plain text character with a
characterfromthekey.It’salsocalledastatecipherbecausethe encryptionofthe next
characterdependsonwhathashappenedbefore.
Knowing a little about encryption is important, but it is also important to understand
that it is a very complex and highly mathematical subject, and requires a significant
amountofstudytobecomeanexpert.
10.4.1 One-TimePad
Having just said how complex the field of cryptography is, the first algorithm to be
examinedis,infact,ratheroldandperfectlysecure,ifdifficulttouseinpractice.Suppose
personAwishestosendpersonBthemessage“Meetyouatninepmatlocationalpha.”
Encodingthisrequiresasequenceofrandomcharactersatleastaslongasthemessage.In
actual use, this cipher often used pages from books as keys, books that were easily
accessiblebybothparties.Inthiscasethefollowingtextisusedasthekey:“itwasthe
bestoftimesitwastheworstoftimes.”Theencryptionprocess,knowntoboth, and in
factnotreallyasecret,istoapplytheexclusiveORoperationtocorrespondingcharacters
inthemessageandthekeytoproducetheciphertext:
TheexclusiveORoperationisabit-by-bitlogicaloperatorthatis0ifthetwobitsare
equalandis1otherwise.Itisappliedtothenumericalrepresentationsofthecharacters.
Thisisquite handybecause itis veryfast andcan easilybeaccomplished usingsimple
hardware.Considerthefirstcharacterinthemessage“m.”Thefirstcharacterinthekeyis
“i.”TheASCIIcodesarethenumbers109and105respectively,orinbinary:
01101101109“m”
01101001105“i”
000001004ExclusiveOR
One interesting observation here is that different characters can be encrypted to the
same cipher text byte, as in the above string where “s” and “t” both encrypt to 26.
Anyway,nowthisciphertextistransmittedtoBandisdecodedinexactlythesameway
that it was encoded: apply the exclusive OR between the ciphertext and the same key
(symmetrickey):
ThePythoncodethatcandothebasicencryptionis:
pt=“meetyouatninepmatlocationalpha”
key=“itwasthebestoftimesitwastheworstoftimes”
ct=””
xt=””
foriinrange(0,len(pt)):
v=ord(pt[i])^ord(key[i])
print(v)
ct=ct+chr(v)
print(ct)
Theexclusive-ORoperatoris“^”,andtheexpressionord(pt[i])^ord(key[i])performs
theXORonthemessageandthekeybytes,asnumbers.Doingitagainwiththesamekey
getsthemessageback.
The reason that this is called a one-time pad is that the key can only be used once,
otherwise the cipher is not secure. The security lies in the randomness of the key, and
reusing it reduces the randomness. Eventually if the same key is used often enough, an
observer, someone who can intercept all of the messages, can extract the pattern and
determinethekey.Soinpracticethekeyswerewrittenonpadsofpaperand,onceused,
weredestroyed.Keepingthepadssynchronizedbetweenthesenderandreceivercanbea
problem,especiallyiftherearemanyofeach.Hence,althoughthesystemissecure,itis
notusedveryoften.
10.4.2 PublicKeyEncryption(RSA)
A public key system is commonly used for secure communication across computer
networks,andinvolvesonekeyforencryptionandanotherfordecryption.Therearemany
variationsonthebasicidea,somebeingmuchtoocomplextodiscussinafewpages,but
theRSAalgorithmisrelativelysimple,quitepopular,andverysecure.Itisnamedforits
inventorsRivest,Shamir,andAdleman.
ThemathematicalideathatunderliesRSAisthatonecanfindthreeverylargeintegers
e,d,andn
for any m, and that even knowing e and n or even m, it can be extremely difficult to
findd.Thevaluesdandearethekeys,andmisthemessage.
So,encryptingamessagewouldworkasfollows:AsendsmessagemtoBusingB’s
publiclyknownencryptionkeye:
The value of c is the ciphertext and can be transmitted to B. When B receives the
message,itisdecryptedusingtheirprivatekeyd:
wheren> m.Thisworks becauseof the originalassertionthat (me)d mod n =m.The
successofthismethoddependsonafewotherthings:cancdmodnbecalculatedquickly
enough for large numbers (i.e., 500 bits), and can the numbers d,e, and n be found to
makethiswork?
Thefirststepindeterminingthekeysistoselecttwoverylargeprimenumberspandq.
Let n = p*q. A large number in this context has hundreds of bits, but that creates a
cumbersomeexample,sosmallernumberswillbeusedinthisdiscussion.
Nowcalculateφ(n)=(p-1)*(q-1)andfindanintegeresothateandn are co-prime;
thatisthegreatestcommondivisorbetweeneandnis1.
Letd=(e-1)modφ(n)sothatd*emodn=1.Thiscanbefoundusingasearch,which
maybeinfeasibleduetothesizeofthenumbers:
foriinrange(e,n):
if(i*e)%j==1:
d=i
break
oramathematicalprocessthatusesEuler’stheoremcangivetheanswerfaster,andcode
hasbeenprovidedforthisontheaccompanyingdisc.
Example: Encrypt the Message “Depart at Dawn” Using
RSA
Thefirststepistodeterminesomekeystouseandtodistributethepublickey.Using
theprimenumbers73and83(fartoosmallforarealsituation)thedeterminationofthe
keysis:
nis6059andϕ(n)is5904
eis17,chosenbecauseitisprime.Nowfinddsuchthatd*emodn=1.Searchingforitis
practicalfornumbersthissizeandonegets:
d=3473
Sothepublickeyis(17,6059)andtheprivatekeyis(3473).
Themessageis14characterslong,andwouldbe112bits;nisonly10bitslong,andthe
messagehastobeshorterthanthis.Inthisinstancethemessagecanbesentonecharacter
atatime,butthisisgenerallypoorpractice.Normallylargerblocksofdataareencrypted
at one time. The plaintext string is converted into integers using ord(),and each oneis
encryptedusingtheformula:
Anexamplewouldbe:
message=“Departatdawn”
imessage=()
cmessage=()
foriinrange(0,len(message)):
m=ord(message[i])
imessage=imessage+(ord(message[i]),)
c=(m**e)%n
cmessage=cmessage+(c,)
Nowthemessageconsistsof14blocksof1charactereach.Itcanbetransmittedtothe
recipient,whoisnormallynamedBorBob,inthisform.Thesender,namedAorAlice,
hadaccessto thepublickey only,which isall thatisneeded toencryptthe message.It
cannotbedecryptedusingthepublickey.
dgivend⋅e≡1(modφ(n))
Bobreceivestheciphertextmessage,whichinthiscaseis:
(4652,3518,4274,5770,1663,344,2498,5770,344,2498,2144,5770,1725,4601)
Hetakeseachblockanddecryptsitusing:
ThePythoncodeforthiscouldbe:
dmessage=()
foriinrange(0,len(cmessage)):
c=cmessage[i]
m=(c**d)%n
dmessage=dmessage+(m,)
Theresultingdecryptedmessageis:
(68,101,112,97,114,116,32,97,116,32,100,97,119,110)
Whichis the original message. Noticethat because only oneblock per characterwas
encrypted,theeffectisthatofasubstitutioncipher,inwhicheachletterhasbeenreplaced
byanother.This is very easy to decrypt by noting patterns of letters and frequencies of
letters in the language; the letter “e” is usually the most commonly used letter in an
Englishmessage.Thatiswhythemessageisencryptedasblocksofcharacters.Itishighly
unlikelythatalargeblockwouldberepeatedexactly,andifitwereitwouldbedifficultto
guesswhatitwasanyway.
10.5 COMPRESSION
A little arithmetic will start this discussion. The song “Blackbird” by The Beatles is
almostexactly 4 minutes long. This is 240 seconds, and if itwas converted into digital
form,itwouldbesampledatarateof44,100sampleseachsecond.Thismeansthatthe
songhas240*44100=10.6millionsamples.Butwait—it’sstereo,sodoublethatto21.2
millionsamples.Atypicalsampleis16bits,sothisworksoutto42.4millionbytes;42
megabytes!TheMP3fileforthissongistypically1.9megabytes.Howisthatpossible?
Byusingacompressionalgorithm.
Datacompressionisallaboutwaystotake,forexample,100bytesofinformationand
turnit into 10bytes while losingnone of theessential message. Ofcourse, compressed
datais incomprehensible justto look at and mustbe decompressed in orderfor it to be
used. Data is often compressed before storing it in a file to reduce its footprint on the
storage device, or before transmitting it along a communications channel to take better
advantageoflimitedbandwidth.
The question of how a string of data bytes can be made shorter while losing no
importantinformationremains,andasimpleexamplemaybeinorder.Consideracartoon
image.Thesehavearelativelysmallnumberofdistinctbutvividcolors,usuallylessthan
10colorsandthecolorvariationwithinanyregionissmall.TheexampleimageinFigure
10.1isinPNGformandis23.2Kbytesinsizeat400x456(=182400)pixels.Asrawdata
itwouldbealittleover182Kbytesinsize,547KbytesifRGBcolorwasused.
A simple compression technique that will work in this case is called run-length
encoding. In its simplest form data bytes are preceded by a count indicating how many
repetitionsofthatvaluewereencounteredinthedata.Soiftherewasasectionofdata:
11000000222121200000022222
Thiswouldbeencodedas
21 60 32 11 12 11 12 50 52
Twoones six
zeros
threetwos a
one
a
two
a
one
a
two
fivezeros fivetwos
In this case the original data required 26 bytes and the compressed data required 17
bytes. The new data takes 65% of the space that the original does. This is not a huge
saving,butisprobablyworththeeffort.Itdoesdependheavilyonthenatureofthedata.
ConsidertheimageofFigure10.1.Thecolorareasareuniformandratherlarge,sothis
imagewouldbeanidealcandidateforrun-lengthencoding.Whenwritingtheprogram,it
isimportanttouseabinaryfileandconvertthevalueandcountintounsignedbytesbefore
writing them to the file. This is a new data type called an unsigned byte that was not
discussedinChapter8, and hasthe code“B.”So, writingthecountand valuecouldbe
donelikethis:
s=pack(“BB”,n,v[1])
f.write(s)
Theentireprogramthatwillrun-lengthencodetheimagewillreadtheimagefileand
collectidenticalpixels,countingthemastheyarecollected,untilachangeinpixelvalue
occurs.Thenthe(count,value)pairiswrittentothefile.Alsothepairwillbewrittenif
255pixelshavebeencollected,sincethatisthebiggestnumberthatcanbecountedin8
bits.Theresultisabinaryfileofpairsofnumbers(count,value)thatrepresentthepixels
intheimage.Asthereareonlytwocolors,valuecanbe0or1,0beingwhiteand1being
green;ingeneraltherecanbe256distinctvalues.Theencodingprogramlookslikethis:
fromstructimport*
importGlib
defemit(v,n,f):#Writeapairofbytes(count,value)
s=pack(“BB”,n,v[1])#”'B'isunsignedbyte.
f.write(s)
Green=(123,210,0)#Theobjectcolor
White=(255,255,255)#Thebackgroundcolor
Glib.startdraw(400,456)
b1=Glib.loadImage(“b1.png”)#Readtheimage
outf=open(“b1.txt”,“wb”)#Opentheoutputfile
Glib.image(b1,0,0)#Displaytheimage
count=0
value=Glib.getpixel(b1,0,0)#Readapixelvalue
#initially
forjinrange(0,456):
foriinrange(0,400):#Foreveryimagepixel
ifcount==255:#Largestpossiblecount.
emit(value,count,outf)#Write(255,value)
count=0#Resetthecount
c=Glib.getpixel(b1,i,j)#Getthenextpixelvalue
ifc==value:#Sameasbefore?
count=count+1#Yes.Incrementthecount
else:#No,different
emit(value,count,outf)#Writethecountand
#value
count=1#Resetthecount
value=c#andthevalue
ifcount>0:#Aftertheloopendsaretherepixelsto
#write?
emit(value,count,outf)#Yes,sodoit.
outf.close()
Glib.enddraw()
Thedecodingprogramreadspairsofunsignedbytesfrom the binaryfile,andcreates
pixels.Apair(12,0)wouldbe12whitepixels,forinstance.Apair(12,1)couldbe12
pixelsofsomeothercolor,andthisprogramwritesthepixelssoitdecideswhatcolorthat
willbe.Itwillreadpairsanddrawpixels,intoanimageof400columnsand456rows,
untilallareaccountedfor.Aprogramthatdoesthis(nottheonlyonepossible)is:
fromstructimport*
importGlib
Glib.startdraw(400,456)
inf=open(“b1.txt”,“rb”)#Opentherunlengthencoded
#file
i=0
j=0
cols=400#Thesizeisknown,butcouldbea
rows=456#partofthedatafile.
whileTrue:
s=inf.read(2)#Reada(count,value)pair(bytes)
iflen(s)<=0:#Endoffile?
break#Yes.Exitloop,stopdrawing.
c,v=unpack(“BB”,s)#Converttointegers
ifv==255:#Backgroundpixel?(White)
Glib.fill(255,255,255)
else:#Objectpixel?(Green)
Glib.fill(123,210,0)
forkinrange(0,c):#Drawthepixelstothe
#screen.
ifi>=cols:#Atendofcolumn,add1torow
i=0
j=j+1
Glib.point(i,j)#Drawasa'point'
i=i+1#Nextpixel(increaseX)
ifj>=456:#Lastrow
break
Glib.enddraw()
Themorecomplexthedatais,inthiscasemeaningthemoredistinctvaluesthedatacan
take,thelessusefulthisencodingmethodwillbe.Insomecasesitcanmakethefilesize
larger that the raw data would have been. In the case of this particular image, the run-
lengthencodedfileisabout6Kbytes,asopposedtohalfamillionbytesthatwouldhave
beenneededfortherawimage,savedaspixels.Still,thisservesasabasicproofthatitis
possible to compress a data file without losing any information. There are, of course,
many more algorithms that will compress data to a greater extent and with fewer
constraints.
10.5.1 HuffmanEncoding
Ifatypicaltextfileisexaminedcarefully,itcanbefoundthatthevastmajorityofthe
fileconsistsofrelativelyfewcharacters.Asageneralestimate,over95%ofthecharacters
canbe accounted forby between 25–30 distinctvalues. Acoding scheme thattook this
intoaccountwouldreducethesizeofatextfile,andperhapsitwouldgeneralizetoother
kindsoffile.Forexample,inmanyfilesthevalue0isthemostcommon,andgivingita
smallerrepresentationthan,say,9mayreducetheoverallfilesize.
Thisisnotreallyanovelidea.TheinternationalMorsecodeisbasedonthisideaand
hasbeenaroundforalongtime,beginningin1836.Themostcommonlyusedlettersin
English are shown in Table 10.1. In the Morse code the letter “E” is represented by a
singledot,theletter“T”isasingledash,and“A”isadotfollowedbyadash.Inother
wordsthemostcommonlettershavethesmallestcoderepresentation,asageneralrule.
ThisishowtheHuffmancodeisorganizedtoo.
AHuffmancodeisconstructedfromthegroundup,likeawall.Thelowerlevelsofthe
wallrepresenttheleastfrequentlyusedsymbols,andhavethegreatestnumberofbricks
abovethem.Thefinalcodewillbebinarynumbers,andthelengthofthecodeinbitsfora
symbol is related to the number of bricks above it. The wall is actually shaped like a
pyramid,andiscalledabinarytreebycomputersciencefolks.It’saveryusefulstructure
ingeneral,butthedescriptionwillberestrictedheretoitsuseinHuffmancodes.
Asanexample,considertheEnglishtext:
IthinkthatatthattimenoneofusquitebelievedintheTimeMachine11
Thecharactersoccurinthisparticulartextwiththefollowingfrequencies:
t10 a4 q1 k1
e9 m3 c1 s1
i8 o2 d1 l1
n5 u2 v1
h5 b1 f1
The‘leaves’(ornodes)atthebottomofthetree(itisdrawnupside-down)containthe
lowestfrequencyitems,andsoareplacedfirst.Eachtwonodesinthetreewillhaveone
node above them, straddling them, containing the sum of the frequencies of all nodes
below. All characters are turned into nodes, and each also contains the number of
occurrencesofthatletter.Thiscollectionofnodeswillbecalledaheap.Initiallyallhave
onlyonecharacter,butthiswillchange.
Theruleinbuildingthetreeistopickthepairofnodes(initiallycharacters)thatsumto
thesmallestnumberandconnectthemusinganothernode,oneabovethemthathasaleft
and right node. The first bricks, alphabetically, would be “b” and “c” both with a
frequencyof1.Thefirsttwowouldlooklikethis:
Thebottomnodeshavecharactersandcounts.Theoneabovehasonlyacount,anditis
thesumofthecountsofthetwonodesitisconnectedto.Thisnewnode,withacountof
2,isplacedbackintheheapandthenodesforBandCareremoved.Theheapwillalways
getsmaller.
Repeatingthisprocesswiththeothers,thesmallestpairwecanmakeiswith“d”and
“f,”then“k”and“l,”andthen“q”and“s.”Atthatpointthesmallestnodeis“v”witha
countof1,buttherearenomorenodeswithacountof1.Thesmallest
sumis2,whichuses“v”and‘o’:
Alloftheseare intheheap, andasearch isdonefor the smallestsum of nodes.The
character “u” has a count of 2 and so do any of the nodes above that link to twoother
characters.Thesearenodestoo,solink“u”withtheleftmostnodeabovetogetabigger
grouping—thisiscalledasubtree,becauseitisatree,butitisalsopartofabiggertree.
The‘u’nodeandtheothergivesasumof4:
The tree that is being built has the least commonly used characters placed at a greater
distancefromthetopofthetreethanarethefrequentlyusedcharacters.Thisdistancewill
beusedtoconstructthecodes,smallerforcommoncharacters.Nowthesmallestsumsof
twonodesinthetreeis4,thenodesconnecting“d,”“f,”“k,”and“l”:
Themethodheretakesthesmallesttwonodes,whicharegoingtocreatethesmallest
sum, and connects them, removing the original nodes and replacing them with the new
one.Thesmallestnodesnowarethenodeconnecting“q”and“2”(value2),thenodewith
“m”(value3)andthenodeconnecting“v”and“o”(value3).Thenodewith“m”willbe
selectedtolinktothe2-valuednode.Thetreeisadisconnectedcollectionofnodes,but
rightnowlookslikethis:
plus all of the unconnected nodes for individual letters. So, what’s next? The smallest
valued character remaining is ‘a’ at 4. That would make the smallest sum 7 after
connectingitwiththesubtreeontheright(‘v’and‘o’).Nextintheheaparethetwo4-
nodesabovetocreatean8,andlinking‘h’(5)and‘n’(also5)togeta10:
The pattern should be clear by now. Notice that the nodes with nothing below them
always consist of characters, and the nodes above have only numbers. But oops—the
spacecharacterswerenotcounted,andtheymustbeforthemessagetomakeanysense.
Thereare14spacesinthemessage.Thefinalsumwillbe14+9forthespace.Anodefora
spacehastobeaddedtotheheap.
Thelasttwostepsdon’tinvolveanynewcharacters,buttheywilllinkallofthenodes
togetherandmakethemaccessiblefromonesinglenodeatthetop.Thefinal(top)node
shouldhaveavaluethatisthelengthoftheoriginalstring.
Now comes the bottom line: what was the point of all this? The tree that has been
constructedwillbeusedtoconstructthecodesforeachletter,andthelengthofeachcode
willbethenumberofnodesbetweenthecharactersandthetop(root)ofthetree.Thepath
toeachleftnodeislabeledwithadigit,inthiscasea0,andthepathtotherightnodesis
labelledwitha1,asinthetreeabove.Thecodeforanycharacterisreadoffofthelinks
thatwerefollowedtogetfromthetopofthetreetothenodecontainingthecharacter.So
thespace character, the mostcommon one, is reached by goingleft two times; its code
willbe“00.”The“t”isthesecondmostfrequentcharacter,andisreachedfromthetop
nodebygoingright,thenleft,thenright;thecodeis“101.”Thecompletesetofcodesis:
‘‘ 00 A 11111 D 011100 V 111100
T 101 M 11101 F 011101
E 100 U 01100 K 011110
I 010 O 111101 L 011111
H 1100 B 011010 Q 111000
N 1101 C 011011 S 111001
The coded message is the concatenation of all of the codes for the characters in the
ordertheyappearinthemessage.Theencodedmessagewouldread:
This amounts to 259 bits = 33 bytes. The original string is 71 bytes long, so the
compressed data is 46% of the size of the original data. The Huffman coded string is
brokeninto8-bitbytesandtransmittedthatway:
010001011100010110101111000101110011111101001111
110100101110011111101001010101110110000110111110
111011000011110101110100011001110010011100001100
010101100000110101000111110101001111001000111000
001011010010111001000010101011101100001110111111
0110111000101101100
Decoding requires the table or the tree. If a known table is used, such as the natural
frequencies of English letters, then it would not have to be transmitted along with the
message. The use of a Python dictionary type makes the program for decoding very
elegantindeed.Giventhetableandthemessage,bitsareremovedfromthebeginningof
themessageandplacedintocodestringuntiltheymatchoneofthecodesinthetable.The
Huffmancodehasthepropertythatthebitsequencesareuniquewhenappendedasalong
message.Thefirstbitsequencethatmatchesacodewillbethecodeforthefirstletterin
themessage.
#Huffmandecode
#Thisisthecodedmessage:
bitstring=“01000101110001011010111100010111001111110100111111”+\
“0100101110011111101001010101110110000110111110111011000011110”+\
“1011101000110011100100111000011000101011000001101010001111101”+\
“0100111100100011100000101101001011100100001010101110110000111”+\
“011111101101111000101101100”
table={}#Thisisthetableofcodes
table['00']=”“
table[“11111”]=“A”
table[“011100”]=“D”
table[“111100”]=“V”
table[“101”]=“T”
table[“11101”]=“M”
table[“011101”]=“F”
table[“100”]=“E”
table[“01100”]=“U”
table[“011110”]=“K”
table[“010”]=“I”
table[“111101”]=“O”
table[“011111”]=“L”
table[“1100”]=“H”
table[“011010”]=“B”
table[“111000”]=“Q”
table[“1101”]=“N”
table[“011011”]=“C”
table[“111001”]=“S”
#Pullbitsfromthestringmakingasubstringuntilthe
#substringisfoundinthedictionary.Thenemitthe
#characterindexed.
#Loopuntilallbitsareused
whilelen(bitstring)>0:
code=””#Clearthecurrentcode
#WhilecodeNOTinthedictionary…
whilenot(codeintable):
#Addthenextbitfromthemessage
code=code+bitstring[0]
#Removethatbitfromthemessage
bitstring=bitstring[1:]
#Whenthecodematches,printthecharactercorresponding
#tothecode
print(table[code],end=””)
10.5.2 LZWCompression
Likemanyalgorithms,LZWcompressionisnamedafterthepeoplewhodevisedit:A.
Lempel,J.Ziv,andTerryWelch.Ithasbeenthestandardfordatacompressionformany
years,itwasthemethodusedintheGIFfileformat,andwasusedinmanyversionsof
PDF. It is notthe most effectivemethod ofcompression, but itis lossless andefficient.
LiketheHuffmancode,LZWcreatesatablefromtheoriginaltextandusesthecodesin
thetabletoperformthecompression.UnliketheHuffmancode,thedecompressionstage
doesnotrequirethatthetablebeknowninadvance;itbuildsthetableasitdecompresses
the file. The LZW algorithm also replaces multiple characters with single codes, thus
increasingthecompressionrate.
LZWcompressionusuallybeginswithaknowncodetable,mostoftenthe256ASCII
characters, but any table known by the compressor and decompressor will work. As an
example,anothershortsectionoftextfromTheTimeMachinewillbecompressed:
The Time Traveller for so it will be convenient to speak of him was expounding a
reconditemattertousHisgreyeyesshoneandtwinkledandhisusuallypalefacewas
flushed and animated The fire burned brightly and the soft radiance of the
incandescentlightsintheliliesofsilvercaughtthebubblesthatflashedandpassedin
ourglasses.
Punctuation has been removed for simplicity. The algorithm begins with a table of
characters,inthisinstancetheonesthatappearinthequote,butingeneralthetablecan
contain any starting set of symbols. This is called the code table, and associates a
numerical code with a string. The code table in this case will consist of the letters
(uppercase)andtheirvaluesstartingwith0:“A”=0,“B”=1,andsoon.Thespacehastobe
includedaswell.Thecodesequence024wouldbethestring“ACE”usingthisscheme.
Naturallytherehastobemoretothisifitistobeaviablecompressionmethod.When
encoding,thecharactersareexaminedoneatatimeandappendedtoaninputstring,and
lookedupinthetable.Ifthestringisfoundinthetable,thenthenextcharacterisreadand
appendedtothestringanditislookedupagain.Thisrepeatsuntilthestringisnotfound,
atwhichpointafewthingshappen:thecodeforthelaststringthatwasfoundiswrittento
theoutput,thenewstringthatwasencounteredinthestringbutnotfoundinthetableis
addedtothetables,andtheprocesscontinuesusingthelastcharacterreadin.Thismeans
that not only characters but also short strings that occur in the text will have numeric
codes,andthatthetablewillbecreatedfromthetextthatwasgiven.
Considerthetextintheexample:Thefirstcharacterseenis“T”:
1.“T”existsinthetablealready,soanewcharacterisreadinandappendedtothe“T”
tocreatethepair“TH.”
2.“TH”isnotinthetable.Thecharacter“T”hasthecode19,so19iswrittentothe
outputfile.
3.Thestring“TH”isaddedtothetable.Itwillbecode27.
4.Theinputstringisnow“H.”
5.Thecharacter“H”isinthetableandhascode7.Thenextcharacterisreadinand
appendedto“H”creating“HE.”
6.“HE”isnotinthetable,sothecodeforcharacter“H,”whichis7,iswrittentothe
outputfile.
7.Thestring“HE”isaddedtothetable,code28.
8.Theinputstringisnow“E.”
Theprocessrepeats.Ifamultiple-characterstringisfoundinthetable,thenthesteps
arebasicallythesame.Hypothetically:
1.Thecharacter“T”isnextandisinthetable.Readthenextcharacter“H”and
appendto“T”toget“TH.”
2.“TH”isinthetable.Readthenextcharacter“E”andappendto“T”toget“THE.”
3.“THE’isnotinthetabletoemitthecodefor“TH,”whichis27.
4.Inputstringisnow“E.”
Step1repeatsuntilastringisobtainedthathasnotbeenseenbefore.Intheexample
herethefirst27codesarelettersandthespacecharacter.Thenextfewcodesare:
TH27 HE28 E29
T30 TI31 IM32
ME33 ET34 TR35
Thefirst3-characterstring(trigram)inthetableis“ET.”
Python’s dictionary type is especially valuable for coding the LZW algorithm. The
facilityforlookingupastringinatableisexactlywhatisrequiredhere.Thecriticalpart
oftheprogramcouldbewrittenasfollows:
#countisthenextunassignedsymbol
#chisthelastcharacterreadin
#sisthecurrentcharacterstring
#infistheinputfile(text)
s=””#Initialstringisempty.
ch=inf.read(1).upper()#Readthefirstcharacter,upper
#case.
whilelen(ch)>0:#Whilethefilestillhasdata…
ifs+chindict:#Isstringconcatenatedwithch
#inthetable?
s=s+ch#Yes.Concatenateandrepeat
else:#No.
print(dict[s],”“,end=””)#Printthecodefor
#thestrings
dict[s+ch]=count#Putthenewstringinto
#thedictionary
count=count+1#Newcodeisnextinteger.
s=ch#Stringisnowthelast
#characterread.
ch=inf.read(1).upper()#Readanewcharacter
When decoding the LZW file, the initial table is known. Again, this is often just the
ASCIIcharactersbutcanbesomethingelse,andinthiscaseisthelettersplusthespace.
Thefilecontainscodes,notcharacters,butthecodesareinthetable,right?No,onlythe
startingcodesareinthetable.Sodecodingthemessageintheexamplestartseasily.The
firstfewcodesinthemessageare:
19742619812291917021…
Thefirstcodeisreadinandisthecodefortheletter“T.”Thisisfollowedby7(“H”)
and4(“E”)andsoonuntilthecode29isreached.Thereisnoentryforthecode29inthe
table.ThisiswherethereallycleverpartoftheLZWalgorithmhappens.
Whendecoding,theprogrambuildsthetableagain.Afterall,thecharactersareinthe
sameorderintheencodeddata,soitshouldbepossibletoreproducetheprocessthatwas
usedtobuildthecodetableinthefirstplace.Whenthefirstcodeisreadin,thecodeis
expectedtobeinthetable,andthecorrespondingletter“T”iswrittenandplacedintoa
string.Thenextcodeisreadandcorrespondsto‘H.’Now“TH”isaddedtothedictionary,
and“H”iswrittenandbecomesthecurrentstring.Now“E”isseen,“HE”isaddedtothe
table,and“E”iswritten,andsoon.Againadictionarycanbeusedtostorethecodes,but
alistismoreefficient.Theindicesarecodes,whicharenumbers,soalistisfinehere.The
centralpartoftheprocessis:
code1=int(inf.readline())#CODE1isthefirstcode
#onthefile
print(dict[code1],end=””)#Outputthestringfor
#CODE1
whileTrue:#Whilemodecodesonthe
#file…
code0=int(inf.readline())#CODE0isthenextcode
#onthefile
ifcode0<len(dict):#IsCODE0inthetable?
s=dict[code0]#YES.Sisthestring
#forCODE0
else:
s=dict[code1]#NO.Sisthestringfor
#CODE1
s=s+ch#AppendCHtoS.
print(s,end=””)#INEITHERCASEemitS
ch=s[0]#CHbecomesthefirst
#characterofS
dict=dict+[dict[code1]+ch,]#Addnewstringtothe
#table
count=count+1
code1=code0
A pseudo-code summary of both the encoding and decoding processes is given in
Figure10.9,andworkingprogramsareprovidedonthedisc
(lzwe.py and lzwd.py). If punctuation is to be added, then a different conversion to
uppercasewouldhavetobedone.Forpracticalapplications,theentireASCIIcharacterset
wouldbeusedattheoutset.
10.6 HASHING
Ahashing algorithmattempts to characterizeacomplexpieceofdatawithsomething
simpler, and preferably unique. The most common example would be to find a number
thatcouldrepresentacharacterstring.Ahashingalgorithmhastobefast,becausetheidea
veryoftenistoconvertastringintoanindextoalistortuple.Considerthestring“while.”
There are five characters (bytes) here. How can this string be used as an index into a
tuple?
Anynumericaloperationonthecodesusedtorepresentthecharactermightwork,but
some result in codes that are too large. Simply adding the codes would give a value of
537,whichcouldworkbutalsomightbetoolarge.Imaginetheapplicationistolookup
Pythonkeywords;thereare33ofthem.Thevalueresultingfromthehashshouldbean
indexbetween0and32,sotakethehashmod33.Ifthatistriedtheresultisthathalfof
the33entries willbeempty, andhalf willhavetwo ormore stringsthathave thesame
index.Theresultis:
4:“None” 12:“return” 21:“try” 31:“global”
6:“class” 13:“global” 22:“is”
7:“from” 14:“as” 25:“finally”
9:“while” 15:“lambda” 27:“or”
10:“and” 17:“in” 29:“False”
11:“continue” 20:“True” 30:“for”
When two things hash to the same value it is said to be a collision. In this case the
collisionsare:
(class,def) (False,nonlocal) (return,del) (from,not) (lambda,with)
(True,elif) (while,if) (from,yield) (global,assert) (False,else)
(from,import) (and,pass) (is,break) (is,except) (None,raise)
Twovaluescan’toccupythesamelocationinatuple,sosomethingmustbedone.The
simplestwaytodealwithcollisionsistohaveextraspaceinthelistortuple.Ifthesizeof
thetupleisspecifiedas145,thenallstringshashtodistinctvalues.Ofcourse,now112
tupleentriesareempty,butdoesthatreallymatter?Thealternativetoatableindexedby
hashing(ahashtable)wouldbealistthathastobesearched,andhashingisverymuch
faster.
Asithappens,simplyaddingthecharacterstogetherisnotaverygood
hashingmethod.Thereareafewwell-knownones.
djb2
Thisalgorithm startswith apredefined seed for a hashvalue, multipliesit by 33 and
addsthenextcharacterfromthestring,multipliesthatby33,addsthenextcharacter,and
soon.Thecodeis:
defdjb2(s,size):
sum=5381
foriinrange(0,len(s)):
sum=sum*33+ord(s[i])
sum=sum%size
returnsum
Whymultiplyby33?Itworkswell,andnobodyknowswhy.Theseedof5381canbe
changedtosee howdifferentvalueswork. Withtheconfigurationgiven here,therewill
needtobe112elementsinthetupletoavoidcollisions.Iftheprogramischangedslightly
sothatanexclusiveORreplacesthesum,thesizedecreasesto105.Thatis:
sum=sum*33^ord(s[i])
10.6.1 sdbm
Thisisamethoddevisedforscramblingbits,butmakesfora goodhashingfunction.
Theiterationishash(i)=hash(i-1)*65599+str[i].Thenumber65599isarbitrary,but
happenstobeprime.Afunctiontoimplementthisis:
defsdbm(s,size):
hash=0
foriinrange(0,len(s)):
hash=ord(s[i])*65599+hash
returnhash%size.
Therearemanyotherhashingmethods(see:Knuth).Theideaisanimportantone.Itis,
forexample,awaytoimplementPythondictionaries:hashthekeytoanintegeranduse
thattoaccessthevalue.
10.7 SUMMARY
The goal of this chapter was to introduce important algorithms or general techniques
used in computer science. Sorting is a traditional programming problem for
undergraduatesandisessentialinmanydata-handlingapplications.Theselectionsortand
themergesortwerediscussedatlength.
Searching involves finding some piece of data within a larger collection. A linear
searchstartsatthebeginningandlooksatconsecutiveelementsuntilthetargetisfound.A
binarysearchsplitsthedataintotwohalveseachtimeanelementinthesetisexamined
andsoisfaster,butitdependsonthedatabeingsorted.
Randomnumbergenerationcreatesasequenceofnumbersthatsatisfiesastatisticaltest
for randomness. Such numbers are crucial in computer simulations and games, and in
somenumericalalgorithms.
Cryptographyinvolvessendingmessagesthatonlycertainintendedpeoplecanreceive
andunderstand.Acipherisanalgorithmthatconvertsonestringofcharactersintoanother
oneofgenerallythesamelength.Theone-timepadmethodwasexamined,followedby
theverypopularRSAalgorithm.
Datacompressionisaboutwaystotakemanybytesofinformationandturntheminto
fewer bytes while losing none of the essential message. Of course, compressed data is
incomprehensiblejusttolookatandmustbedecompressedinorderforittobeused.This
sectiondemonstratedrunlengthencoding,Huffmancodes,andtheLZWalgorithm.
Thefinalsectionwasa briefdiscussionofhashing, awayto convertstringsor other
complexdatatypesandreducethemtosimplerformssuchasintegers.Thedjb2andthe
sdbmmethodsweresingledoutasbeingtypicalofthewaythatsuchalgorithmswork.
Exercises
1.Hashingalgorithmsmustbefast.Usethetimingschemesdiscussedinthischapterto
determinewhichofthethreehashingalgorithmspresentedisthefastest.
2.Whenasequenceofnumbersissortedintoascendingorderthenelementi-1isalways
smallerthanorequaltoelementi.Hereisadescriptionofasortingalgorithm:scanthe
datasetStofindanypairsofadjacentlocationswhereS[i-1]>S[i],andwhenanyare
foundswapthetwovalues.Repeattheprocessuntilthearrayissorted.Doesitever
getsorted?Whatisthebestcaseandwhatistheworstcase?Implementthemethodin
Python.
3.Comparethelinearcongruentialrandomnumbergeneratordescribedinthischapter
againsttherandom()functioninPython.Implementadierollusingeachmethod,and
rolladie1000times.Whichmethodisnearesttotheexpectedfrequencydistribution
(equalforallvalues)?Repeattheprocess1000timesandscorePythononepointwhen
itsrandomnumbergeneratorwinsbythismeasure,andscorethebook’sgeneratorone
pointwhenitwins.Whichistheoverallwinner?
4.Thequalityofahashingalgorithmismeasuredbyhowrandomthehashcodesare
whengivenasamplesetofstrings.Oneestimateofrandomnessisthenumberofcells
withmorethanonevaluehashedtoit(thebestherewouldbe0),andtheaverage
numberofvalueshashedtooccupiedcells—thisshouldbecloseto1.Measurethese
forthethreehashingmethodspresentedforasizeof60cells.
5.Dataforregistrantsinaswimmingcompetitionconsistsoftheswimmersname,
number,nationalranking,andtimeinthe200-meterfreestylecompetition.Thesedata
arelocatedinfourlists:name,number,rank,t200.Inallcasesthesameindexis
usedtoaccessallofthedataforthesameperson.Sortthesedataindescendingorder
ontimeandidentifythepersonsinthetopthreespotsandtheirtimes.
6.Steganographyworksbyconcealingamessageratherthanmakingitunreadable,asis
donewhenusingencryption.Intheidealsituationnobodywillevensuspectthatthere
isasecondmessagehiddenwithinthefirst.Consideraschemethatusesthespacesin
amessage:asinglespaceisa‘0’andadoublespaceisa“1.”Thelettersarecodedas
5-bitcodesstartingwith“A”=00000,“B”=00001,andsoon.Writeprogramsthat
encodeanddecodesuchmessages.
NotesandOtherResources
Random.orgrandomnumberserver.https://www.random.org/
A pretty good description of RSA:
https://en.wikipedia.org/wiki/RSA_%28cryptosystem%29
Encode/decodestenographicmessagesdisguisedasspam.http://www.spammimic.com/
1.DonaldKnuth.(1997).TheArtofComputerProgramming,Volume3:Sorting
andSearching,3rdEdition,Addison-Wesley,138–141,ISBN0-201-89685-0.
2.AnanyLevitin.IntroductiontotheDesign&AnalysisofAlgorithms,2ndEdition,
98–100,ISBN0-321-35828-7.
3.RobertSedgewick.(1998).AlgorithmsinC++,Parts1–4:Fundamentals,Data
Structure,Sorting,Searching,2ndEdition,Addison-WesleyLongman,273–274,
ISBN0-201-35088-2.
4.G.Marsaglia.(2003).http://www.csis.hku.hk/~diehard
5.MakatoMatsumotoandTakujiNishimura.(January1998).Mersennetwister:A623-
dimensionallyequidistributeduniformpseudo-randomnumbergenerator,ACM
Trans.Model.Comput.Simul.8(1),3–30,DOI=
http://dx.doi.org/10.1145/272991.272995
6.LenoreBlum,ManuelBlum,andMikeShub.(1982).Comparisonoftwopseudo-
randomnumbergenerators,AdvancesinCryptology:ProceedingsofCRYPTO’82,
Plenum,61–78.
7.ClaudeE.Shannon.(October1949).Communicationtheoryofsecrecysystems
(PDF),BellSystemTechnicalJournal,28(4),656–715,retrieved2011-12-21,
doi:10.1002/j.1538-7305.1949.tb00928.x
8.TheOnlyUnbreakableCryptosystemKnown—TheVernamCipher,retrieved
2014-03-17,Pro-technix.com
9.B.Schneier.(1994).Descriptionofanewvariable-lengthkey,64-bitblockcipher
(Blowfish),inFastSoftwareEncryption,editedbyRossAnderson,Cambridge
SecurityWorkshopProceedings(December1993),Springer-Verlag,191–204.
10.StevenW.Smith.(2007).DataCompressionTutorial:Part1,
http://www.eetimes.com/document.asp?doc_id=1275417&page_number=2
11.H.G.Wells.(1895).TheTimeMachine,WilliamHeinemann,
http://www.gutenberg.org/cache/epub/35/pg35.txt
CHAPTER11
PROGRAMMINGFORTHESCIENCES
11.1FindingRootsofEquations
11.2Differentiation
11.3Integration
11.4Optimization:FindingMaximaandMinima
11.5LongestCommonSubsequence(EditDistance)
11.6Summary
Inthischapter
It is true that the earliest calculating devices were created to help with commercial
concerns, like payments, credit, and inventory. The abacus is an excellent example—it
doesbasicarithmeticandwaslikelyanearly“cashregister.”Mucholderdevicesdoexist,
such as the Lebombo bone that helped ancient African bushmen do simple calculations
andkeeptrackoftime.Theelectroniccomputer,ontheotherhand,wasdesignedtocarry
outscientificcalculations,inparticularthoserelatedtodecryptingmilitarymessagesand
buildingtheatombomb.Computersare,ofcourse,usedforthosethingsstill,butthereis
now a vast array of computations in the scientific domain that could not be carried out
withoutthehelpofacomputer.
Scientists from different disciplines would disagree about what the most important
algorithmsandtechniquesforsciencewere.That’sbecauseofthewidelydisparatethings
thatphysicistsandbiologists,astwoexamples,study.Thereareafewrecurringproblems
thatpopupinalmostallsciencedomains,andsomeimportanttechniquesthatgeneralize
tomanyscienceandsomenon-scienceareas.
11.1 FINDINGROOTSOFEQUATIONS
Therootofanequationisthexcoordinatecorrespondingtoitszerovalue.Thismaynot
be the smallest or the largest value, but the place where a function equals zero is often
important. For example, if a function for the error in a calculation can be found, then
finding the place where the error is zero would be important. In one dimension the
problembeingsolvedis:
x:f(x)=0(11.1)
Orinotherwords,findthevalueofxthatresultsinf(x)beingequaltozero.Thefunction
couldbequitecomplicated,butforthetechniquetoworkitshouldhaveaderivative.
ThebasisofmanyrootfindingproceduresisNewton’smethod.Theprocedurebegins
with a guess at the right answer. The guess in many cases does not have to be very
accurate,butissimplyastartingpoint.Ifarangeofvaluesisgivenwithinwhichtofind
thesolution,thecenterofthatrangemaybeagoodstartingguess.So,hereisaproblemto
startwith:
f(x)=(x-1)3betweenx=-2andx=12(11.2)
Thecenteroftherangeisx=5.
Theinitialguessiscalledx0,andherex0=5.Thefunctionvaluethere,f(x0), is 64.
Thealgorithmnowsaysthatthenextguessforx,x1,willbe:
x0–f(x0)/f’(x0)(11.3)
wheref’(x0)isthederivativeoffatthepointx=x0.Thisisawrinkle—thederivativeoff
has to be calculated. It’s easy to do for many functions, hard for others. A numerical
methodwillbeexaminedalittlelaterinthischapter,sointhemeantimeitispossibleto
simplycode a functionthat givesthe derivative,having donethe calculuson paperand
thenwrittenthefunctionbasedonthat.Thederivativeof(x-1)3is3x2-6x+3.
#Rootsofafunction
defobjective(x):
return(x-1)*(x-1)*(x-1)
defderiv(x):
return3*x*x-6*x+3
#Rangeis-2to+12
x=5.
fx=1000.
delta=0.000001
print(“Step0:x=”,x,”obj=“,objective(x))
i=1
whileabs(fx)>delta:
f=objective(x)
ff=f/deriv(x)
x=x-ff
fx=objective(x)
print(“Step“,i,”:x=”,x,“obj=“,fx)
i=i+1
Step0:x=5.0obj=64.0
Step1:x=3.666666666666667obj=18.96296296296297
Step2:x=2.7777777777777777obj=5.618655692729766
Step3:x=2.185185185185185obj=1.6647868719199308
...
Step14:x=1.0137019495631274obj=2.5724508967303e-06
Step15:x=1.0091346330420865obj=7.622076731056633e-07
The correct answer in this case is x=1.0, so the method gets to within 0.009 of the
correct root in 15 steps. Depending on the application, this could be fine. What if the
initialguesswasterrible?Iftheprocessstartsatx=500thenittakes27steps,butgets
justalittleclosertotherightanswer(x=1.0087).Startingat-500alsotakes27steps.
It’s possible that there is no root. What happens in that case? The program keeps
looking.Itovershoots,andthengoesback,andforth,andbackagain.Topresentthisfrom
happeningitiscommontoplacealimitofthenumberoftimestheprogramwilltry.When
thislimitisexceededanerroroccursindicatingthatthereisnosolution.
This first example has illustrated some common concepts that are used in numerical
analysis, which is the mathematical discipline encompassing the computation of
mathematicalfunctionsandoperations.Thecommonconceptsinclude:
Theinitialguess: It isrelativelycommon tohave a numericalalgorithmbegin ata
guessedvalue.
Thedelta:Itisalsocommontohaveanalgorithmstepwhenthechangeintheresult
or some mathematical feature becomes smaller than a specified threshold, called
delta.
Iteration:Numericalmethodsfrequentlyrepeatacalculationexpectingittoconverge
onthecorrectresult,usingthepreviouslycalculatedvalueasthenewstartingpoint.
Maximumiterations:Auserofanumericalmethodcanassumethatthemethodwill
notconverge(getcloseenoughtotherightanswer)ifaspecifiednumberofattempts
havebeenmade.
11.2 DIFFERENTIATION
Determining the derivative of a function is something that is often thought of as a
symbolic operation, and the result is valid for any value of the function. This may not
alwaysbe true, and it may not be easy to do in thegeneral case. Think about what the
previousalgorithmdoes—itneedsthederivativeofafunctionatonespecificpoint.Can
thatbedeterminedifthealgebraicformofthefunctionisnotknown?Yes,itcan,towithin
somedegreeofaccuracy.
The derivative of a function at a point x is the slope of the curve defined by that
functionatthatpoint.Thedefinitionofthederivativeoffatthepointxis:
f’(x)=(f(x+h)–f(x-h))/(2h)(11.4)
ashgetssmallerandsmaller,whatiscalledalimitincalculus.Thisformulaisessentially
themathematicaldefinitionofaderivative.Onacomputerhcanbemadequitesmall,but
canneverbezero.Iftheexpressionaboveisusedasanestimateofthederivative,itwill
work in many cases. It is based on sampling two points of the function each time. An
improvementcanbemadebyusingmorepoints;forexample:
usesfourpointsandoftenproducesbetterresults.
Codingthisusesafunctionpassedasaparameter.Itmakessensethatthefunctiontobe
differentiatedwouldbeaparametertothefunctionthatdifferentiatesit;otherparameters
willbex,thepointatwhichitwillbeevaluated,delta,theaccuracydesired,andniter,the
maximumnumberofiterations.Thecalculationshouldtakeplaceinatry-exceptblockso
that numerical errors will be caught. The two-point and the four-point versions of the
functionthatperformsnumericaldifferentiationare:
Testing these functions is an excellent demonstration. First a function to be
differentiatediswritten.Thepreviousexampleonfindingrootshasasimpleone(renamed
asf1):
deff1(x):
return(x-1)*(x-1)*(x-1)
That example also has a function that represents the derivative of f1 at the point x
(renameddf1):
defdf1(x):
return3*x*x-6*x+3
Thefunctiondf1()shouldreturntheexactderivativeoff1(),andcanbeusedtocheck
thevaluereturnedbyderiv1()orderiv2().Createaloopthatrunsoverarangeofxvalues
andcomparethevaluereturnedfromdf1()tothatreturnedbyderiv1()and/orderiv2():
foriinrange(1,20):
x=i*1.0
f=f1(x)
df=df1(x)
mydf=deriv1(f1,x)
mydf2=deriv2(f1,x)
print(f,df,mydf,n0,”“,mydf2,n1)
Theresultlookssomethinglikethis:
Both functions give excellent results in a very few iterations in this case. Of course,
somefunctionspresentmoredifficultiesthandosimplepolynomials.[2]
11.3 INTEGRATION
An integral is most often thought of as the area under a curve, where the curve is a
function (Figure 11.1a). Numerical integration amounts calculating that area using an
algorithm.Theareaofarectangleiseasytocalculate,soiftheregionunderacurvecould
bereasonablyapproximatedbyabunchofrectangles,thentheproblemwouldbesolved.
Thisistheideabehindthetrapezoidalrule.Theintegralfromx0tox1ofafunctionf(x)
canbeapproximatedbythewidth(x1-x0)multipliedbytheheight(theaveragevalueof
thefunctioninthatrange),whichisjustarectanglethatapproximatestheareaunderthe
curve(Figure11.1b).Inmathematicalnotation:
This would generally be a pretty poor approximation of a curve, and would yield
correspondinglybadapproximationsoftheintegral.However,thesmallerthewidthx1-
x0 the more accurate the approximation can be, and so using a great many small
trapezoidswouldbemuchbetterthanusingonlyone(Figure11.1c).Howmany?Thatis
not known at the outset, but could be increased from an initial guess until a desired
accuracywasachieved.
A function that performs integration using this method would accept a function, the
starting x0 and the ending x1 for the integral. The function would break the interval
betweenx0andx1intonparts,whennisaninitialguess.Thefunctionisevaluatedforall
n parts, the area of each trapezoid is computed, and they are summed to get the final
result.Nowincreasenanddoitagain.Ifthetwovaluesarecloseenough(delta)thenthe
processiscomplete.
Thiswillbedoneintwosteps.Firstafunctiontrap0()thatcomputesandreturnsthe
sumofNtrapezoids.Theobviousbutslowwaytodothisis:
ftrap0(f,x0,x1,n):#Slowmethod
dx=(x1-x0)/n#DividerangeintoNparts
xa=x0#Startatx0
xb=x0+dx#Currenttrapezoidisxatoxb
sum=0#Sumofareasstartsat0.0
foriinrange(0,n):#AddupNtrapezoids
f0=f(xa)#Computefunctionatxaandxb
f1=f(xb)
sum=sum+dx*(f1+f0)/2#Areaofthetrapezoid
xa=xa+dx#Nextxaandxbaredxfrom
xb=xb+dx#thecurrentones
returnsum#Thesumistheintegral.
Theintegrationfunctiontrapezoid()willcallthisfunctionwithincreasingvaluesofn
untiltwoconsecutiveresultsshowasmallenoughdifference(i.e.,smallerthanaprovided
deltavalue):
deftrapezoid(f,x0,x1,delta=0.0001,niter=1024):
n=4
resold=trap0(f,x0,x1,2)
resnew=trap0(f,x0,x1,4)
whileabs(resnew-resold)>delta:
ifn>niter:break
resold=resnew
n=n*2
resnew=trap0(f,x0,x1,n)
returnresnew
The function trap0() can be sped up significantly by not re-computing the function
twiceeachtimethroughtheloop,butrememberingthepreviousvalueinstead(Exercise
2).A morepopularalgorithmfor integrationisSimpson’sRule,which tries to minimize
the error even more by using a quadratic approximation to the curve at the top of the
trapezoid,insteadofastraightline.
11.4 OPTIMIZATION:FINDINGMAXIMAAND
MINIMA
Findingextremevalues,eitherthemaximumorminimum,isaverycommonproblem
incomputing, notjust in sciencebut in manydisciplines. Itis sometimes referredto as
optimization.Naturallyfindingabest(insomesense)valuewouldbeappealing.Whatis
theleastamountof fuelneededto travel fromChicagoto Atlanta?Whatroutebetween
thosetwocitiesrequirestheleastamountofdrivingtime?Whatrouteisshortestinterms
ofdistance?Therearemanyreasonstowantanoptimumandmanywaystodefinewhat
anoptimumis.
Inthefollowingdiscussionthefunctiontobeoptimizedwillbeprovided,sotherewill
no guesswork on that subject. The question concerns how to find location (parameters)
wheretheminimumormaximumoccurs.
11.4.1 NewtonAgain
Figure11.2showsanexampleofafunctiontobeoptimized.Thereisaminimumof7
at the point x= 1. How can this be found? If the nature of the function is known, for
instancethatitisaquadraticpolynomial,thentheoptimumcanbefoundimmediately.It
willbeatthepointwherethederivativeiszero.Theproblemofoptimizationisthatone
doesnotknowmuch,ifanything,aboutthefunction.Itcanonlybeevaluated,andperhaps
thederivativescanbefoundnumerically.Giventhat,howcantheminormaxbefound?
Ifthederivativecanbefound,thenitmaybepossibletosearchforanoptimumpoint.
Atavaluex,ifthederivativeisnegative,thentheslopeofthecurveisnegativeatthat
point; if the derivative is positive then the slope is positive. If an x value can be found
wheretheslopeisnegative(callthispointx0)andanotherwhereitispositive(callthis
x1),thentheoptimum(slope=0)mustbebetweenthesetwopoints.Findingthatpoint
canbedoneasfollows:
1.Selectthepointbetweenthesetwo(x=(x0+x1)/2).
2.Ifthederivativeisnegativeatthispoint,letx0=x.Ifpositiveletx1=x.
3.Repeatfromstep1untilthederivativeiscloseenoughto0.
This process is pretty much random. Finding the two starting points is a matter of
guessinguntiltheyarefound.Thesearchrangegetssmallerbyafactorof2eachiteration.
Thefactthatthefunctioncanbeevaluatedatanypointmeansthatitispossibletomake
better guesses. In particular, it’s possible to assume that the function is approximately
quadratic at each step. Quadratics have an optimum at a predictable place. The method
calledNewton’sMethodfitsaquadraticateachpointandmovestowardsitsoptimalpoint.
Themethodisiterative,andwithoutdoingthemaththeiterationis:
A functiontocalculate the firstand secondderivative is needed.The formulafor the
secondderivativeisbasedonthedefinitionofdifferentiation,aswastheformulaforthe
first.Itis:
The program should therefore be straightforward. Repeat the calculation of xn-1 -
f’(x)/f’’(x)untilitconvergestotheanswer.Thiswillbethelocationoftheoptimum.An
examplefunctionwouldbe:
defnewtonopt(f,x0,x1,delta=0.0001,niter=20):
x=(x0+x1)/2
fa=1000.0
fb=f(x)
i=0
print(“Iteration0:x=”,x,”f=”,fb)
while(abs(fa-fb)>delta):
fa=fb
x=x-deriv(f,x)/derivsecond(f,x)
fb=f(x)
i=i+1
print(“Iteration“,i,“:x=”,x,”f=”,fb)
ifi>niter:
return0
Thisfindsalocaloptimumbetweenthevaluesofx0andx1.Alocaloptimummaynot
be the largest or smallest function value that the function can produce, but may be the
optimuminalocalrangeofvalues.
Figure 11.2a shows a typical quadratic function. It is f(x) = x2-2x+8, and has an
optimumatx=1.BecauseitisquadratictheNewtonoptimizationfunctionabovefinds
theresultinasinglestep.Figure11.2bisasinefunction,andcanbeseentohavemany
minimaandmaxima.AnyoneofthemmightbefoundbytheNewtonmethod,whichis
whyarangeofvaluesisprovidedtothefunction.
Thenewtonopt()functionsuccessfullyfindstheoptimuminFigure11.2aatx=1,and
finds one in Figure 11.2b at x = 90 degrees (π/2 radians). If there is no optimum the
iterationlimitwillbe reached.Ifeitherderivativedoesnotexist,thenanexceptionwill
occur.
11.4.2 FittingDatatoCurves–Regression
Scientists collect data on nearly everything. Data are really numerical values that
representsomeprocess,whetheritbephysical,chemical,biological,orsociological.The
numbersaremeasurements,andscientistsmodelprocesses using thesemeasurementsin
ordertofurtherunderstandthem.Oneofthefirstthingsthatisusuallydoneistotryto
findapatterninthedatathatmaygivesomeinsightintotheunderlyingprocess,oratleast
allow predictions for situations not measured. One of the common methods in data
analysisistofitacurvetothedata;thatis,todeterminewhetherastrongmathematical
relationshipexistsbetweenthemeasurements.
Asanexample,asetofmeasurementsoftreeheightswillbeused.Theheightofaset
ofaspecificvarietyoftreesismadeoveraperiodoftenyearsandthedataresidesinafile
named “treedata.txt.” The question: is there a linear relationship (i.e., does a tree grow
generally the same amount each year)? Specifically, what is that relationship (i.e., how
muchcanweexpectatreetogrow)?Figurezshowsavisualizationofthesedatainthe
formof a scattergramorscatterplot, in which the data are displayed as points in their
(x,y)positiononagrid.
The“curve”tobefitinthiscasewillbealine.Whatistheequationofthelinethatbest
representsthedatainthefigure?Ifthatwereknownthenitwouldbepossibletopredict
theheightofatreewithsomedegreeofconfidence,ortoestimateatree’sagefromits
height.
Oneformoftheequationofalineisthepoint-slopeform:
y=mx+b
wheremistheslope (angle) of theline and b is the intercept, the place where the line
crossestheYaxis.Thegoaloftheregressionprocess,inwhichthebestlineisfound,isto
identifythevaluesofmandb.Asimpleobservationisneededfirst:theequationofaline
canbewrittenas:
mx+b-y=0
Ifapointactuallysitsonthisline,thenpluggingitsxandyvaluesintotheequationwill
resultina0value.Ifapointisnotontheline,thenmx+b-ywillresultinanumberthat
amounts to an error; its magnitude indicates how far away the point is from the line.
Fitting a line to the data can be expressed as an optimization problem: find a line that
minimizesthetotalerroroverallsampledatapoints.If(xi,yi)isdatapointithenthegoal
istominimize:
byfindingthebestvaluesofmandb.Theexpressionissquaredsothatitwillalwaysbe
positive, which simplifies the math. It may be possible to do this optimization using a
generaloptimizationprocesssuchasNewton’s,butfortunatelyforastraightlinethemath
has been done in advance. Other situations are more complicated, depending on the
functionbeingfitandthenumberofdimensions.
Asimplelinearregressionisdonebylookingatthedataandcalculatingthefollowing:
Eachofthesecanbecalculatedusingaseparatefunction.Thentheslopeofthebestline
throughthedatawouldbe:
Andtheinterceptis:
b=meany–m*meanx(11.11)
The function regress() that does the regression accepts a tuple of X values and a
correspondingtupleofYvalues,andreturnsatuple(m,b)containingtheparametersof
the line that fits the data. It depends on other functions to calculate the mean, standard
deviation, and correlation; these functions could generally be more useful in other
applications.Theentirecollectionofcodeis:
11.4.3 EvolutionaryMethods
A genetic algorithm (GA) or an evolutionary algorithm (EA) is an optimization
techniquethatusesnaturalselectionasametaphortooptimizeafunctionorprocess.The
ideaistocreateacollectionofmanypossiblesolutions(apopulation),whicharereally
just sets of parameters to the objective function. These are evaluated (by calling the
function)andthebestofthemarekeptinthepopulation;theremainderarediscarded.The
population is refilled by combining the remaining parameter sets with each other in
various ways in a process that mimics reproduction, and then this new population is
evaluatedandtheprocessrepeats.
Theideais thatthe populationcontains the bestsolutions thathave beenseen sofar,
andthatbyrecombiningthemanew,bettersetofsolutionscanbecreated,justasnature
selects plants and animals to suit their environment. This method does not require the
calculationofaderivative,soitcanbeusedtooptimize“functions”thatcan’tbehandled
inotherways.Itcanalsodealwithlargedimensions;thatis,functionsthattakealarge
numberofparameters.
Thisideamaybenew,andsodevelopingitwithanexamplemaybethe best wayto
illustrateit.Considertheproblemoffindingtheminimumofafunctionoftwovariables.
This is really an attempt to find values for x and y that result in the smallest function
result. Evolutionary algorithms are often tested on quite nasty functions, having lots of
localminimaorlargeflatregions.Twosuchfunctionswillbeusedhere:theGoldstein-
Pricefunction:
(1+(x+y+1)2(19-14x+3x2-14y+6xy+3y2)(30+(2x+3y)2(18-32x+12x2+48y-
36xy+27y2))(11.12)
andBohachevsky’sfunction:
x2+y2–0.3cos(3px)–0.4cos(4py)+0.7(11.13)
GraphsofthesefunctionsareshowninFigure11.4.
The first step in the evolutionary algorithm is to create a population of potential
solutions. This is just a collection of parameter pairs (x,y) created at random. The
populationsizeforthisexamplewillbe100,andisaparameteroftheEAprocess.Thisis
doneintheobviousway:
defgenpop(population_size):
pop=()
foriinrange(0,population_size):
p=(randrange(-10,10),randrange(-100,100))
pop=pop+(p,)
returnpop
The population is a tuple of a hundred (x,y) parameter pairs. Now these need to be
evaluated, and so the objective function must be written. This will differ for each
optimizationproblem,ofcourse.Inthiscaseitisthesumoftheerrorsbetweenagiven
line(oneoftheparameters)andthedatapoints.Onewaytocalculatethisis:
defobjective(x,y):#Oneofmanypossiblkeobjective
#functions
returngoldsteinprice(x,y)
#returnboha(x,y)
defgoldsteinprice(x,y):#Goldstein-Pricefunction
#f(0,-1)=3
return(1+(x+y+1)**2*(19-14*x+3*x*x-14*y+6*x*y+
3*y*y))*\(30+(2*x-3*y)**2*(18-32*x+12*x*x+
48*y-36*x*y+27*y*y))
defboha(x,y):
z=x*x+y*y-0.3*cos(3*pi*x)-0.4*cos(4*pi*y)+0.7
returnz
Allmembersofthepopulationareevaluated,andthebestones—inthiscasetheones
havingthesmallestobjectivefunctionvalue—arekept.Agoodwaytodothisistohave
thevaluesinatupleEwhereE[i]istheresultofevaluatingparametersP[i],andsorting
thecollectionisdescendingorderonE.Sincethereare100entriesinEthiswillmeanthat
E[0:n]willcontainthebestn%ofthepopulation.Thefunctioneval()willcreateatuple
of function evaluations for the whole population, and sort() will sort these and the
correspondingparameters. These contain nothingnew,and willnot be shown here. The
program here will select the best 10% and discard the remainder, replacing them with
copiesofthegoodones.
Thekeyissueisoneofintroducingvarietyinthepopulation.Thismeanschangingthe
valuesoftheparameterswhile,onewouldhope,improvingtheoverallperformanceofthe
group. Using the metaphor of natural selection and genetics, there are two ways to
introducechangeintothepopulation:mutationandcrossover.Mutationinvolvesmakinga
smallchangeinaparameter.InrealDNA,amutationwouldchangeoneofthebasepairs
inthesequencewhichwouldusuallyamounttoarathersmallchange,butwhichwillbe
fatalinsomecases.IntheEAbeingwritten,amutationwillbearandomamountaddedto
orsubtractedfromaparameter.Mutationsoccurrandomlyandwithasmallprobability,
whichwillbenamedpmutintheprogram.Valuesbetween0.015and0.2aretypicalfor
pmut,butabestvaluecan’tbedetermined,andisproblemspecified.Avalueof0.02will
beusedhere.
The function mutate() will examine all elements in the global population, mutating
thematrandom(i.e.,addingrandomvalues):
defmutate(m):
globalpmut,population
foriinrange(int(m),len(population)):
c=population[i]
ifrandom()<pmut:#Mutatethexparameter
c[0]=c[0]+random()*10.0-5
ifrandom()<pmut:#Mutatetheyparameter
c[1]=c[1]+random()*10.0-5
population[i]=c
A crossoveris morecomplex, involvingtwo setsof parameters.It involves swapping
parts of the parameters sets from two “parents.” Some parameters could be swapped
entirely,inthiscasemeaningthat(x0,y0)and(x1,y1)wouldbecome(x0,y1)and(x1,
y0).Othertimespartsofoneparameterwouldbecombinedwithpartsofanother.There
are implementations involving bit strings that make this easier, but when using floating
pointvaluesasisbeingdonehere,agoodwaytodoacrossoveristoselecttwo“parents”
and replace one of the parameters in each with a random value that lies between the
originaltwo.
defcrossover(m):
globalpopulation,pcross
foriinrange(m,len(population)):#Keepthebestones
#unchanged
ifrandom()<pcross:#Crossoveratthe
#givenrate
k=randrange(m,len(population))#Pickarandom
#mate
w=randrange(0,1)#ChangeXorY?
c=population[i]#Getindividual1
g1=c[w]#GetXorYforthisguy
cc=population[k]#getindividual2
g2=cc[w]#GetXorY
if(g1>g2):t=g1;g1=g2;g2=t
#swapsog1issmallest
c[w]=random()*(g2-g1)+g1#Generatenew
#parameterfor1
cc[w]=random()*(g2g1)+g1#Generatenew
#parameterfor2
SampleoutputfromthreeattemptstofindtheoptimalvalueofBohachevsky’sfunction
is:
Iterations x y Result
5 [0.002528698, 0.0] at9.158793881192118e-05
173 [5.38185770e-10, -0.0006229] at1.2643415301605287e-05
4 [-0.0007491, 0.0] at8.0394329584621e-06
Thisshowsthatsometimestheprocesstakesmuchmoretimetoarriveatasolutionthan
others.Itdependsontheinitialpopulation,aswellasontheparametersoftheprogram:
themutation and crossover probabilities,the percentage of the top individualsto retain,
andthenatureofthemutationandcrossover
operatorsthemselves.
Figure11.5outlinestheoverallprocessinvolvedintheoptimization.Detailsonspecific
techniquescanbefoundinthereferences.
11.5 LONGESTCOMMONSUBSEQUENCE(EDIT
DISTANCE)
So far in this chapter the methods being discussed are numerical ones, and given the
topic being discussed that makes some sense. There are, on the other hand, many
algorithms that are not numeric in nature, but may be more symbolic, involve patterns,
pictures, sounds, or other more complex data forms. It is true that at some level all
problems to be solved on a computer must be formulated using numbers, but in the
examplessofarthenumbersarereallythesubjectoftheproblem,andtheproblemwould
besolvednumericallyevenifdonewithapencilandpaper.Inothercasesthisisnotso.
As a major example, consider the problem of comparing two sequences of DNA. A
sequenceinthisinstanceconsistsofastringofletters,eachonereferredtoasabaseinthe
sequence. DNA consists of a long sequence of base pairs involving four molecules:
Adenine (A), Guanine (G), Thymine (T), and Cytosine (C) linked together chemically.
Theseultimatelydefinethestructureofaprotein,anditisthesequencethatisimportant.
Acommonproblemincomputationalbiologyistofindthelongestsequenceincommon
betweentwoDNAstrands,wherethesamplesmaybefromdifferentindividualsoreven
differentspecies.MethodsfordoingthistendtoinvolvetheeditdistanceorLevenshtein
distance.
Theeditdistanceisawayofspecifyinghowsimilarordissimilartwostringsaretoone
anotherbyfindingtheminimumnumberofeditingoperationsrequiredtotransformone
stringintotheother.Aneditingoperationcanbeachangeinacharacter,adeletion,oran
insertion.Forexample,whatistheeditdistance
betweentheword“planning”andtheword“pruning”?Itis3:
planning
pranningchange“l”to“r”
prunningchange“a”to“u”
pruningdelete“n”
Howis this used when looking at DNA?ADNAsequence is a set ofthe codes read
fromapieceofDNA,andisastringcontainingonlythelettersG,A,T,andC.Comparing
two pieces of DNA is a matter of comparing the two strings. So, the two strings
AGGACAT and ATTACGAT are distance 3 from each other. The longest common
subsequencehas5charactersinit:
AGGACAT
ATTACGAT
11.5.1 DeterminingLongestCommonSubsequence(LCS)
Exhaustive searching of two S1 and S2 strings for the longest common subsequence
wouldsimplybetooslowforanypracticalpurpose.Fortunately,alotofworkhasbeen
doneoverthepast50yearsonthisproblem,andthemethodofchoiceappearstobethe
Smith-Watermanmethod.Itbuildsamatrix(two-dimensionalarray)whereeachcharacter
ofthefirststringrepresentsacolumnofthematrix,andeachcharacterintheotherstring
formsarow,inorderofappearance.Thematrixisfilledwithnumbersusingthefollowing
relation:
Thefunctionσ(a,b)givesapenaltyforamatch/mismatchbetweentwocharactersaand
b.Hereitwillbe2foramatchand-2foramiss.Thegappenaltyisthevalueassignedto
havingtoleaveagapinthesequencetoperformabettermatch.Usuallythiswouldbe-1.
Theschemeoffersadegreeofflexibility,sothatdifferentpenalties(andrewards)canbe
appliedindifferentcircumstances.
ThefirststepintheSmith-Watermanmethodistocreateamatrix(atable)Tinwhich
there are len(S1+1) columns and len(S2+1) rows. The first index in T(i,j) refers to the
columnandthesecondindexistherow.Thevaluesinthecurrentrowareafunctionof
thoseinthepreviousone.Placea0ineachelementofthefirstrowandthefirstcolumn.
Forthetwostringsusedpreviouslythiswouldlooklikethetablebelow.
Now,foranyelementT(i,j)theneighboringelementsare:
T(i-1,j-1)T(i,j-1)T(i+1,j-1)
T(i-1,j)T(i,j)T(i+1,j)
ThefirstcelltofillinthetableTisT(1,1),markedwitha“*”characterintheexample
totheleft.Therelationusedtofillthiscellhasfourparts:
1.T(i–1,j–1)+σ(S1(i),S2(j))Thecharactersintherowandcolumnmatch,
soσ(S1(i),S2(j))=σ(A,A)=1+T(0,0)=2
2.Gappenaltyis-1,T(i-1,j)=0,T(0,1)=0.Resultis-1
3.Gappenaltyis-1,T(i,j-1)=T(1,1)=0.Resultis-1
4.Resultis0
Themaximumvalueofthesefourcalculationsis1,soT(1,1)=2.
ThenextcelltocomputeisT(2,1).Thistimethetwocharactersarenotthesame,so:
1.T(i−1,j−1)+σ(S1(i),S2(j))whereσ(S1(i),S2(j))=σ(G,A)=−1+T(1,0)
=
2.Gappenaltyis-1,T(i-1,j)=T(1,1)=2.
Resultis2-1=1
3.Gappenaltyis-1,T(i,j-1)=T(2,1)=0.Resultis-1
Resultis0
andsoT(2,1)=1
ForT(3,1):
1.GandAarenotthesame,s(G,T)=-2:
2.T(i-1.j)=T(2,1)=1so1-1=0
3.T(i,j-1)=T(3,0)=0so0–1=-1
4.0
Resultis0
ForT(4,1):
1.σ(A,A)=2,
2.T(i-1,j)=T(3,1)=0,0-gap=-1
3.T(i,j-1)=T(4,0)=0,0-gap=-1
4.0
Resultis2
ForT(5,1):
1.σ(C,A)=-2,
2.T(i-1,j)=T(4,1)=2,2-gap=1
3.T(i,j-1)=T(5,0)=0,0-gap=-1
4.0
Resultis1
ForT(6,1):
1.σ(A,A)=2,
2.T(i-1,j)=T(5,1)=1,1-gap=0
3.T(i,j-1)=T(60)=0,0-gap=-1
4.0
Resultis2
FinallyforT(7,1):
1. σ(T,A)=-2,
2. T(i-1,j)=T(6,1)=2,2-gap=1
3. T(i,j-1)=T(70)=0,0-gap=-1
4. 0
Resultis1
Theresultafterrow2iscompleteis:
Nowmovetothenextrow.Theprocessrepeatsuntilallcellshavebeenexaminedand
assignedvalues.Forthisexamplethefinalmatrixis:
Thelowerrightentryiscolumn7row8,or(7,8).
Thismatrixindicatesthedegreeofmatchatpointsinthestring.Todeterminetheactual
matchbetweenthestrings,beginwiththelargestvalueinthematrix.Inthiscaseitisin
thelowerrightcorner,butthat’snotalwaystrue.Whereverthemaximumis,startatthat
pointinthematrixandtraceleftandupwards;thisisessentiallymovingfromtheendof
eachstringbacktothebeginning.Ateachstepthemoveisleft,up,ordiagonally.
Theprocesswillbuildtwowaystomatchthestring.Oneindicateshowtochanges1
intos2(callthisM1),andtheotherindicateshowtoturns2intos1(callthisM2).Both
stringsareconstructedfrom,inthiscase,(7,9)back
to(0,0).
IfthereismorethanonecellinTwithamaximumvalue,thenarouteshouldbetraced
backfromeachmaximum.
Fortheexamplestringtheresultis:
M1=AGGACCAT
M2=ATTAC_AT
ThereisamismatchattheGG/TTpairandaninsertedgapinM2.
11.6 SUMMARY
A discussion of some of the more important problem types studied by scientists was
presentedalongwithsomemethodfortheirsolution.Therootofanequationistheplace
whereitsvalueiszero,andNewton’smethodwasdescribedasameansoffindingaroot.
Newton’s method requires that the derivative of the function be known, so means of
numericallydeterminingthederivativewerealsodiscussed.
Since derivatives could be calculated, methods for performing integration were
describedandfunctionsfordoingthecalculationusingthetrapezoidalrulewerewritten.
One of the more common calculations in science is to find an optimumvalue for a
function.AnothermethodofNewton’swasusedtofindmaximaorminimaofafunction.
The modeling of data is important in scientific (and other) disciplines. A method for
finding the best straight line that passes through a set of data was illustrated (linear
regression)andcodewasdesignedandtestedforthisproblem.
Evolutionaryalgorithmscanbeusedtofindtheoptimumofafunction,andisespecially
useful when dealing with multidimensional functions or functions that have many local
optima,andwhennoderivativeofthefunctionexists.
BiologistssometimesneedtomatchsequencesofDNA.Amethodthatdoesthisusing
bases as characters and sequences as strings was presented; this is the Smith-Waterman
algorithmforlocalsequencematching,andiscommonlyusedfortheseproblems.
Exercises
1.Modifytherootfindingexamplesothatanumericalderivativeisusedinsteadofan
analyticalone(i.e.,usederive1()orderive2()).Thisisamorepracticalsituation.What
istheeffect?
2.Modifythetrap0()functioninthetrapezoidruleexamplesothatitnevercallsthe
functionbeingevaluatedmorethanonceforanypoint.
3.LookupSimpson’sRuleandcodeyourownversion.Compareitwiththetrapezoid
rulefortwofunctionsofyourchoice.Whichoneismoreaccurateaftereachiteration?
4.Writeafunctionerror()thatacceptsXandYdatatuples,andvaluesaandb.Itreturns
thetotalerrorbetweenthedatapointsandthecurveax2+bx.
5.The(natural)logarithmofavaluevisdefinedtobetheintegralof1/xfrom1tov.
Createafunctionthatcalculatesthenaturallogusingtheexistingintegrationfunction.
6.RuntheevolutionaryalgorithmtooptimizetheGoldstein-Pricetwentytimes.Doesit
everfailtoapproachtheminimum?Howoften?WhatcanbedoneifanEAdoesnot
arriveatanoptimum,andhowcanitbedetermined?
7.Usingsoftwaredevelopedinthischapter,findtwopositivenumberswhosesumis9
andsothattheproductofonenumberandthesquareoftheothernumberisa
maximum.
NotesandOtherResources
Onlineeditdistancecalculator:http://planetcalc.com/1721/
Smith-Waterman algorithm: http://www.slideshare.net/avrilcoghlan/the-smith-
waterman-algorithm
https://www.youtube.com/watch?v=jrJ23aaByE8
1.D.Levy.LectureNotes,(Ch5)NumericalDifferentiation,
http://www2.math.umd.edu/~dlevy/classes/amsc466/lecture-notes/differentiation-
chap.pdf
2.WilliamPress,SaulTeukolsky,WilliamVetterling,andBrianFlannery.(2007).
NumericalRecipes:TheArtofScientificComputation,3rdedition,Cambridge
UniversityPress.
3.RichardHamming.(1987).NumericalMethodsforScientistsandEngineers,Dover
Publications.
4.J.R.Parker.(2002).GeneticAlgorithmsforContinuousProblems,15thCanadian
ConferenceonArtificialIntelligence,Calgary,Alberta,
May27–29.
5.D.E.Goldberg.(1989).GeneticAlgorithms,Optimization,andMachine
Learning,Addison-Wesley,Reading,MA.
6.vlab.amrita.edu.(2012).GlobalAlignmentofTwoSequences-Needleman-
WunschAlgorithm,Retrieved9December2015,vlab.amrita.edu/?
sub=3&brch=274&sim=1431&cnt=1
7.S.B.NeedlemanandC.D.Wunsch.(1970).Ageneralmethodapplicabletothe
searchforsimilaritiesintheaminoacidsequenceoftwoproteins,
J.Mol.Biol.,48,443–453.
8.T.F.SmithandM.S.Waterman.(1981).Identificationofcommonmolecular
subsequences,J.Mol.Biol.,147(1),195-197.
CHAPTER12
HOWTOWRITEGOODPROGRAMS
12.1ProceduralProgramming–WordProcessing
12.2ObjectOrientedProgramming–Breakout
12.3DescribingtheProblemasaProcess
12.4RulesforProgrammers
12.5Summary
Inthischapter
Thereisnogeneralagreementonhowbesttoputtogetheragoodprogram.Good,bythe
way,meansfunctionallycorrect,readable,modifiable,reasonablyefficient,andthatsolves
aproblemthatsomeoneneedssolved.Thischapterwillbedistinctfromtheothersinthis
book:we’llmoveintosecondpersonnarrative,partlybecauseofthemorepersonalnature
ofthesubjectmaterial.Writingcodeforsomepeopleisliketellingastoryormakinga
painting:it’snotthatitisart,butthatitispersonal.Ifyouwishtoinsultaprogrammer,
saythattheircodeispoorlystructured,ornaïve,orinsomewaylessthanadequate.
Therearemanyprocessesthathavebeendescribedforprogramming,andthetruthis
thatnotonlyistherenotonebestone,butitisrarelycertainthananyofthemisbetter
than any of the others. When someone writes a program, they are trying to solve a
problem.Whattheyaredoingistranslatingaloosecollectionofideasintoaformthatcan
berepresentedonacomputer,whichistosayas
numbers.Theideasareassociatedwithalgorithms,thingsthatcanbeshowntoworkforat
leastarangeofsituations.Thenthatneedstobeconvertedintoasequenceofstepsthat
leadstoasolutiontotheoriginalproblem.
Thisisinpartaprobleminsynthesis,thecombiningofseparatecomponents,elements,
and ideas into a coherent whole. There is something called synthesis programming, but
that’snotwhatisbeingdiscussed.Thepartsofaprogramincludedecisionconstructs(IF
statements), looping (FOR and WHILE), expressions, assignment statements, and data
structures(tuples,dictionaries,strings,etc.).Thereisadegreeofskillinvolvedinusing
theseunitstobuildasensiblelargerprogram.Thisskillissomewhatindividual.Notwo
programmerswillcreateexactlythesameprogramforanon-trivialproblem.
Whatwe’regoingtodointhischapterisshowthedevelopmentofanentirecomputer
program,withallof theintermediatesteps,flaws, errors,andflashes of genius(ifany).
Why?Theansweris“becausethatisrarelydoneinlecturesorinabook.”Whenteaching
mathematics the professor often shows the proof of a theorem on the blackboard (or as
PowerPointslides)andexplainsthesteps.Whattheyneverdoisshowhowthetheorem
wasactuallyprovedwhentheoriginalpersonprovedit—deadends,daysofnoprogress,
goodideas,badideas:thewholemessyprocess.
Thisiscrucial.Notheoremandnocomputerprogramflows fully formedandcorrect
fromsomeone’shead.Observingthefullprocessmaybeavaluablestageintheeducation
of a programmer. They will see that the process is prone to error, even for good
programmers. They’ll see that not all ideas that seem good are actually good; that the
processisnotalinearone,butthatitappearsinsomesensetospiral,gainingfunctionality
ateachloop.Andtheywillseethattherecanbeasimpleandobviousmethodthatcould
beagreeduponbymanydifferentprogrammersandyet adapted foreachnewsituation.
Themethod that we’ll use is callediterativerefinement, and it is nearly independent of
languageorphilosophy.Ofcourse,noteveryonewillagree.
Oneexampleprogramwillbeacomputergame,andonethatcan’tbeplayedwithouta
computer.Itwillbeabreakoutstylegamethatusescirclesinsteadofrectangles.Theother
willbeasystemthatformatstypedtext.
12.1 PROCEDURALPROGRAMMING–WORD
PROCESSING
Intheearlydaysofdesktoppublishing,theprogramsthatwritersuseddidnotdisplay
theresultsonthescreenin“what-you-see-is-what-you-get”form.Formattingcommands
were embedded within the text and were implemented by the program, which would
createaprintableversionthatwasproperlyformatted.Programslikeroff,nroff,tex,and
variationsthereofarestillused,butmostwritingtoolsnowlooklikeWordorPageMaker
withcommandsbeinggiventhroughagraphicaluserinterface.
Thereisalimittowhatkindoftextprocessingcanbedoneusingsimpletextfiles,but
when you think about it that’s really what a typewriter produces—simple text on paper
withfixed size fonts. Thatworked for a verylong time. It was good enough for Ernest
HemingwayandRaymondChandler.
The program that will be developed here will accept text from a file and format it
according to a set of commands that have a specific format and are predefined by the
system.Theinputwillresemblethatacceptedbynroff,anoldUnixutility,butwillbea
subsetforsimplicity.Sinceitusesstandardtextinputandoutputanymeasurementswill
bemadeincharacters,notinchesorpoints.Commandswillbeginonanewlinewitha“.”
character and will be alphabetic. A line beginning with “.br”, for instance, results in a
forcedlinebreak.Somecommandstakeaparameter:thecommand“.ll55”setstheline
lengthto55characters.
Hereisalistofallofthecommandsthatthesystemwillrecognize:
.pln Setsthepagelengthtonlines
.bpn Beginpagen
.br Break
.fi Filloutputlines(e.g.,justify)
.nf Don’tfilloutputlines
.na Nojustificaton
.cen Centerthenextninputlines
.lsn Outputn-1linespacesaftereachline
.lln Linelengthisncharacters
.inn Indentncharacters
.tin Temporarilyindentncharacters
.nh Donothyphenate
.hy Hyphenationon
.spn Generatenlines
Theprogramwillreadatextfileandidentifythewordsandthecommands.Thewords
willbewrittentoanoutputfileformattedasdescribedbythecommands.Thedefaultwill
betorightjustifythetext,andtouseemptylinesasparagraphbreaks.Thequestionstobe
answeredhereare:
1.Howdoesonebegincreatingsuchaprogram?
2.Cantheprocessofprogramcreationbedescribed?
a.Istheprocesssystematicorcasual?
b.Isthereonlyoneprocess?
Beginningwiththelastquestionfirst,thereisnosingleprocess.Whatispresentedhere
is only one, but it should be understood that there are others, and that some processes
probablyworkbetterthanothersforsomekindsofprogram.Theprogramtobecreated
herewillnotuseclasses,andwillinvolveaclassicalortraditionalmethodologygenerally
referredtoastop-down.Somepeopleonlyuseobject-orientedcode,butaproblemwith
teachingthatwayisthataclasscontainstraditional,procedure-orientedcode.Tomakea
class,onemustfirstknowhowtowriteaprogram.
12.1.1 Top-Down
The idea behind top-down programming is that the higher levels of abstraction are
describedfirst.Adescriptionof what theentire programis to dois writtenin akind-of
English/computer hybrid language (pseudocode), and this description involves making
callstofunctionsthathavenotyetbeenwrittenbutwhosefunctionisknown.Whenthe
highestleveldescriptionisacceptable,thenthefunctionsusedaredescribed.Inthisway
thehigh-leveldecisionsaredescribedintermsofthelowerlevel,whoseimplementationis
postponeduntilthedetailsareappropriate.Theprocessrepeatsuntilallpartshave been
described, at which time the translation of the pseudocode into a real programming
language can proceed, and should be straightforward. This can result in many distinct
programs,butallshoulddobasicallythesamething,simplyinsomewhatdifferentways.
Forthetaskathand,thefirststepistosketchtheactionsoftheprogramasawhole.The
programbeginsbyopeningthetextfileandopeninganoutputfile.Thebasicactionisto
copyfrominputtooutput,withcertainadditionstotheoutputtext.Thedatafileisreadin
ascharactersorwords,butoutputaslinesandpages.Soperhapsthefollowing:
Openinputfileinf
Openoutputfileoutf
Readawordwfrominf
Whilethereismoretextoninf:
Ifwisacommand:
Processthecommandw
Else:
Thenextwordisw.Processit
Readawordfrominf
Closeinf
Closeoutf
Thisrepresentstheentireprogram,althoughlackingadegreeofdetail.AsPythonthis
wouldlookalmostthesame:
filename=input(“PYROFF:Enterthenameiftheinput
file:“)
inf=open(filename,“r”)
outf=open(“pyroff.txt.“w”)
w=getword(inf)
whilew!=””:
ifiscommand(w):
process_command(w)
else:
process_word(w)
w=getword(inf)
inf.close()
outf.close()
Inorderfortheprogramtocompilethefunctions,theymustexist.Theyshouldinitially
bestubs,relativelynon-functionalbutresultinginoutput:
fromrandomimport*
defgetword(f):
print(“Getword“)
defiscommand(w):
print(“ISCOMMANDgiven“,w)
ifrandom()<0.5:
returnFalse
returnTrue
defprocess_command(w):
print(“Processingcommand“,w)
defprocess_word(w):
print(“Processingtheword“,w)
Thisprogramwillrun,butneverendsbecauseitneverreadsthefile.Still,wehavea
structure.
Nowthefunctionsneedtobedefined,andintheprocessfurtherdesigndecisionsare
made. Consider getword(): what comprises a word and how does it differ from a
command?Acommandstartsatthebeginningofalinewitha“.”character.Itisfollowed
bytwoalphabeticcharactersthataredefinedbythesystem.Ifthetwocharactersdonot
matchanycombinationsinthelistofcommands,thenitisnotacommand.Aword,onthe
otherhand,beginsorendswithawhitespace(blank,tab,orendofline)andcontainsall
ofthecharactersbetweenthosewhitespaces.Itmaynotbeawordinthetraditionalsense,
in that it may not be an English word; it could be a number or other sequence of
characters.Thosemay causeproblems,but itwillbe leftup totheuser tofigure itout.
Example:alongURLmayextendoveraline.Theprogramhastodosomething,andso
will probably put an end of line when the count of characters exceeds a maximum and
leavetheproblemtotheusertofix.
So,let’sfigureoutthegetword()function.Itwillconstructawordasacharacterstring
fromindividualcharactersthathavebeenreadfromtheinputfile.Afirsttrycouldbe:
defgetword(f):
w=””
whilewhitespace(ch(f)):
nextch(f)
whilenotwhitespace(ch(f)):
w=w+ch(f)
nextch(f)
print(“Getwordis“,w)
returnw
Thefunctionwhitespace()returnsTrueifitsparameterisawhitespacecharacter.The
functionnextch() reads the next characterfrom the specified file, andthe function ch()
returns the value of the current character. To effectively test getword(), we need to
implementthesethreefunctions.Here’safirstattempt:
defwhitespace(c):
ifc==”“:returnTrue
ifc==“\t”:returnTrue
ifc==“\n”:returnTrue
returnFalse
defch(f):
globalc
return(c)
defnextch(f):
globalc
c=f.read(1)
This way of handling input is a bit unusual, but there is a reason for it. We are
anticipating a need to buffer characters or to place them back on the input stream. It is
similartotheinputschemeusedinPascal,orthesystemfoundinearlyformsofUNIX
whichusedgetchar–putchar-ungetc.Thenecessityofextractingcommandsfromthe
input stream, and that commands must begin a new line, might make this particular
schemeuseful.Theinitialimplementationofnextch()simplyreadsanewcharacterfrom
thefile,butitcouldeasilybemodifiedtoextractacharacterfromabuffer,andrefilethe
bufferifitisempty.Bothwouldlookthesametotheprogrammerusingthem.
Theprogramruns,buthasaproblem:itneverterminates.Afterthetextfilehasbeen
read,theprogramseemstocallnextch()repeatedly.Aftersomethoughtthereasonisclear
—whentheinputrequestresultsinanemptystring(“”)thecurrentcharacterisnotawhite
space,andtheloopingetword()thatisbuildingawordrunsforever.Thisisatraditional
end-of-fileproblemandcanbesolvedinafewdifferentways:aspecialcharactercanbe
usedforEOF,aflagcanbeset,ortheemptystringcanbetestedforintheloopexplicitly.
Thelattersolutionwaschosen,andfixestheinfiniteloop.Thewordconstructionloopin
getword()becomes:
whilenotwhitespace(ch(f))andch(f)!=””:
A possible next step is to distinguish between commands and words. Because a
commandstartsalineandbeginswitha“.”therearetwothingstodo:markthebeginning
ofanewline,andlookuptheinputstringinatableofcommands.Thecommandcouldbe
searchedfirst,thenifitmatchesacommandnamewecouldbackuptheinputtoseeifit
was preceded by a newline character (“\n”). A newline counts as a white space, and
anotheroptionwouldbetosetaflagwhenanewlinecharacterisseen,clearingitwhen
anothercharacterisreadin.Nowastringisacommandiftheflagsetbeforeitwasreadin
anditmatchesoneofthecommands.Timingiseverythinginthismethod,butwhitespace
separates words, so it could work by simply remembering (saving) the last white space
characterseenbeforeanyword.Thatsoundslikeagoodidea.
Oops.Whenimplemented,noneofthecommandsarerecognized.Atableofnameswas
implementedasatuple:
table = (“.pl”,”.bp”,”.br”,”.fi”,”.nf”,”.na”,”.ce”,
“.ls”,”.ll”,”.in”,”.ti”,”.nh”,”.hy”,”.sp”)
Thenextch()functionwasmodifiedso:
defnextch(f):
globalc,lastws
c=f.read(1)
ifwhitespace(c):
lastws=c
and the function iscommand() is implemented by checking for the newline and the
matchofthestringinthetable:
defiscommand(w):
globaltable,lastws
iflastws==“\n”:
ifwintable:
returnTrue
returnFalse
To discover the problem some print statements were inserted that show the previous
white space character and the match in the table for all calls to iscommand(). The
problem,whichshouldhavebeenobvious,isthatwhenthecommandisreadin,thelast
whitespaceseenwillbetheonethatterminatedit,nottheoneinfrontofit.
A solution: keeping the same theme of remembering white space characters, how about
savetheprevioustwowhitespacecharactersseen.Themostrecentwhitespacewillbethe
one that terminated the word string, and the second most recent will always be the one
before it. All of the others, if any, would have been skipped within getword(). The
solution,ascodedinthenextch()function,wouldbe:
defnextch(f):
globalc,clast,c2last
c=f.read(1)
ifwhitespace(c):
c2last=clast
clast=c
Therearetwovariablesneeded,clastbeingthepreviouswhitespaceandc2lastbeing
the one encountered before clast. Now iscommand() is modified slightly to look for
c2last:
defiscommand(w):
globaltable,c2last
ifc2last==“\n”:
ifwintable:
returnTrue
returnFalse
Yes,thisnowidentifiesthecommandsinthesourcefile,eventhetextthatlookslikea
commandbutisnot:“.xx.”
Noticethatthedevelopmentoftheprogramconsistsofaninitialsketchandthenfilling
inthecodeasstubsandcodingthestubstobefunctionalcode,oneatatime.Sometimesa
stub requires further undefined functions to be used, and those could be coded as stubs
too,orcompletediftheyaresmallsoastoallowtestingtoproceed.It’sajudgmentcallas
towhethertocompletethestubsdownthechainforonepartoftheprogramortoproceed
tothenextoneatthecurrentlevel.Forexample,shouldwehavecompletedthenextch()
andch()functionsbeforetryingtodesignprocess_command()?Itdoesdependonhow
testingcanproceedandwhat“level”we’reat.Thenextch()functionlookslikeitwon’t
call other functions that have not been implemented, and it is tough to test getword()
withoutfinishingnextch().
Thisdiscussionspeakstowhatthenextstepwillbefromhere,andtheansweris“there
couldbemany.”Let’s lookatcommandsnext,becausetheywilldictatetheoutput,and
then deal with formatting last. It is known that a string represents a command, and the
function called as a consequence is process_command(). This function must determine
whichcommandstringwasseenandwhattodoaboutit.Thewaycommandsarehandled
andthewaytheoutputdocumentisspecifiedhastobesortedoutbeforethisfunctioncan
befinished,butasetofstubscanholdtheplaceoffuturedecisionsasbefore.
Thestringthatwasseentobeacommandisstoredinatuple.Theindexofthestring
withinthetupletellsuswhichcommandwasseen,althoughastringmatchcouldbedone
directly.Usingatupleisbetterbecausenewcommandscanalwaysbeaddedtotheendof
the tuple during future modifications and it is easier to modify command names. The
function,whichusedtobeastub,isnow:
defprocess_command(w):
globaltable,inf,page_length,fill,center,center_
count,globalspacing,line_length,adjust,hyphenate
k=table.index(w)
ifk==0:#.PL
s=getword(inf)
page_length=int(s)
elifk==1:#.BP
genpage()
elifk==2:#.BR
genline()
elifk==3:#.FI
fill=True
elifk==4:#.NF
fill==False
elifk==5:#.NA
adjust=False
elifk==6:#.CE
center=True
s=getword(inf)
center_count=int(s)
elifk==7:#.LS
s=getword(inf)
spacing=int(s)
elifk==8:#.LL
s=getword(inf)
line_length=int(s)
print(“Linelength“,line_length,“characters”)
elifk==9:#.IN
s=getword(inf)
indent(int(s))
elifk==10:#.TI
s=getword(inf)
temp_indent(int(s))
elifk==11:#.NF
hyphenate=False
elifk==12:#.HY
hyphenate=True
elifk==13:#.TL
dotl()
elifk==14:#.SP
s=getword(inf)
space(int(k))
This completes iteration 5 of the system and generates quite a few new stubs and
defineshowsomeoftheoutputfunctionswilloperate.Therearesomeflags(hyphenate,
center,fill, adjust) and some parameters for the output process (line_length, spacing,
etc.)thatareset,andsowillbeusedinsendingoutputtexttothefile.Theseparameters
beingknown,itistimetodefinetheoutputprocess,whichwillbeimplementedstarting
withthefunctionprocess_word().
Aswasmentionedearlier,theprogramreadsdataonecharacteratatimeandemitsitas
words.Thereisaspecifiedlinelength,andwordscanbereadandstoreduntilthatlength
isnearedorexceeded.Wordscouldbestoredinastring.Whenthelinelengthisreached,
thestringcouldbewrittentothefile.Ifrightjustificationisbeingdone,spacescouldbe
addedtosomeotherspacesinthestringuntilthelinelengthwasmetexactly,orthefinal
wordcouldbehyphenatedtomeetthelinelength.Ifrightjustificationisnotbeingdone,
thenthelinelengthonlyhastobeapproached,butnotexceeded.
Fortextcenteringinputlinesarepaddedwithequalnumbersofspacesonbothsides.
Thepagesizeismetbycountinglines,andthenbycreatinganewpagewhenthepage
sizeis met, possibly byentering a form feedor perhaps by printing emptylines until a
specifiedcountisreached.Indentingissimple:theincommandresultsinafixednumber
ofspacesbeingplacedatthebeginningofeachoutputline;theticommandresultsina
specifiednumberofspacesbeingplacedatthebeginningofthecurrentline.Hyphenation
isdonebytablelookup.Certainsuffixesandprefixesandlettercombinationsarepossible
locationsforahyphen.Thefinalwordonalinecanbehyphenatedifalocationwithinitis
subjecttoahyphenasindicatedbythetable.
Theprocessistoreadandbuildwordsandcopythemtoastring,thenextoutputline.
Noactionistakenuntilthestringnearsthelinelength,atwhichpointinsertionofspaces,
hyphenation,orotheractionsmaybetakentomakethestringfittheline,eithercloselyor
precisely.Afteralinemeetsthesizeneededitiswritten,perhapsfollowedbyothersifthe
linespacingislargerthanone.So,thebasicactionoftheprocess_word()functionwillbe
tocopythewordtoastring,theoutputbuffer,underthecontrolofasetofvariablesthat
aredefinedbytheuserthroughcommands:
page_length 55 Numberoflinesoftextonasinglepage
fill True Controlswhetherthetextisbeingformatted
adjust True Controlswhetherthetextisrightjustified
center False Controlswhethertextisbeingcentered
center_count 0 Numberoflinesstilltobecentered
spacing 1 Numberoflinesoutputperlineoftext
nindent 0 Numberofspacesontheleft
line_length 66 Numberofcharactersononeline
hyphenate True Arewordshyphenatedbythesystem?
Thesimplestversionofprocess_word()wouldcopywordstothebufferuntiltheline
wasfullandthensimplywritethatlinetotheoutputfile.
defprocess_word(w):
globalbuffer,line_length
iflen(buffer)+len(w)+1<=line_length:
buffer=buffer+””+w
else:
emit(buffer)
buffer=w
The code above adds the given word plus a space to the buffer if there is room.
Otherwiseitcallstheemit()functiontowritethebuffertotheoutputfileandplacesthe
word at the beginning of a new line. This is nearly correct. Some of the output for the
samplesourceis:
ThisissampletextfortestingPyroff.Thedefaultistoright
adjustcontinuously,butembeddedcommandscanchangethis.
Nowthelinewidthshouldbe
30characters,andsotheleft
marginwillpulledback.This
lineiscentered.xxnota
command.Indented4
Notethatthecommand“.ll30”wascorrectlyhandled,butthatthereisanextraspaceat
the beginning of the first line. That’s due to the fact that process_word() adds a space
between words, and if the buffer is empty that space gets placed at the beginning. The
solutionistocheckforanemptybuffer:
iflen(buffer)+len(w)+1<=line_length:
iflen(buffer)>0:
buffer=buffer+””+w
else:
buffer=w
Thiswasasuccessful fix,andcompletesiteration6 ofthesystem, whichisnow150
lineslong.
Withinprocess_word()therearemultipleoptionsforhowwordswillbewrittentothe
output. What has been done so far amounts to filling but no rightjustification. Other
optionsare:nofilling,centering,andjustification.Whenthefillingisturnedoff,aninput
linebecomesanoutputline.Thisistrueforcenteringaswell.Whenjustificationistaking
place the program will make the output lines exactly line_length characters long by
insertingspacesinthelinetoextenditandbyhyphenation,wherepermitted,toshortenit.
Theruleisthatthelinemustbethecorrectlengthandmustnotbeginorendwithaspace.
Theimplementationofthispartoftheprogramisattheheartoftheoverallsystem,but
wouldnotbepossiblewithoutasensibledesignuptothispoint.
12.1.2 Centering
First,acenteredlineistobewrittentooutputwhenanendoflineisseenoninput.This
meansthattheclastvariablewillbeusedtoidentifytheendoflineandtoemitthetext.
Next, the line will have spaces added to the beginning and end to center it. The buffer
holdsthe lineto be writtenand haslen(buffer) characters.The number of spacesto be
addedtotalsline_length–len(buffer),andhalfwillbeaddedtothebeginningoftheline
andhalftotheend.Afunctionthatcentersagivenstringwouldbe:
defdo_center(s):
globalline_length
k=len(s)#Howlongisthestring?
b1=line_length-k#Howmuchshorterthantheline?
b2=b1//2#Splitthatamountintwo
b1=line_length-k-b2
s=”“*b1+s+”“*b2#Addspacestocenterthetext
emit(s)#Writetofile
In the process_word() function some code must be added to handle centering. This
codehas todetect theend ofline andpass the bufferto do_center().It alsocounts the
lines,becausethe“.ce”commandspecifiesanumberoflinestobecentered.
ifcenter:#Textisbeingcentered,nofill
iflen(buffer)>0:#Addthiswordtotheline
buffer=buffer+””+w
else:
buffer=w
ifclast==“\n”:#Aninputline=anoutputline
do_center(buffer)#Emitthetext
center_count=center_count-1#Countlines
ifcenter_count<=0:#Done?
center=False#Yes.Stopcentering.
Thiscodeisnotquiteenough.Therearetwoproblemsobserved.Oneproblemisthat
thebuffercouldbepartlyfullwhenthe“.ce”commandisseen,andmustbeemptied.This
problem is serious, because filling may be taking place and the line might have to be
justified.Forthemomentacalltoemit()willhappenwhenthe“.ce”commandisseen,
butthiswillhavetobeexpanded.
Theotherproblemissimpler:thedo_center()functiondoesnotemptythebuffersothe
linebeingcenteredoccurstwiceintheoutput.Forexample:
marginwillpulledback.
Thislineiscentered←Thisiscorrect
Thislineiscentered.xxnot←Thisiswrong.Textisrepeated.
acommand.Indented4
Thesolutionistoclearthebufferafterdo_center()iscalled:
do_center(buffer)#Emitthetext
buffer=””#Clearthebuffer
12.1.3 RightJustification
Centeringtextisafirststeptounderstandinghowtojustifyit.Rightjustifiedtexthas
the sentences arranged so that the right margin is aligned to the line. When centering,
spaces are added to the left and right ends of the string so as to place any text in the
middle of the line. When justifying, any space in the line can be made into multiple
spaces,thusextendingthetextuntilitreachestherightmargin.Naturallyitwouldnotbe
acceptabletoplacealloftheneededspacesinonespot.Itlooksbestiftheyaredistributed
asevenlyaspossible.However,nomatterwhatisdonetherewillbesomesituationsthat
causeuglyspacing.We’llhavetolivewiththat.
Thenumberofspacesneededtofillupalineisline_length–len(buffer),justasitwas
whencentering.Aswordsareaddedtothelinethisvaluebecomessmaller,andwhenitis
smallerthanthelengthofthenextwordtobeadded,thentheextraspacesmustbeadded
andanewlinestarted.Thatis,when
k=line_length-len(buffer)
ifk<len(word):
thenadjustingisperformed.First,countthespacesinthebufferandcallthisnspaces.If
k>nspaces then change each single space into k//nspaces space characters and set k =
k%nspaces.Thiswillrarelyhappen.Nowweneedtochangesomeofthespacesinthe
bufferintodoublespaces.Whichones?Inanattempttospreadthemaround,setxk=k+
k//2. This will be used as an increment to find consecutive spots to put spaces. So for
example,letk=5,inwhichcasexk=7.Thefirstspacecouldbeplacedinthemiddle,or
atspacenumber2.Nowcountxkpositionsfrom2,startingoveratzerowhenyouhitthe
end.Thiswillgive4asthenextposition,followedby1,then3,andthen0.Thisprocess
seemstospreadthemout.Nowthebufferiswrittenoutandthenewwordisplacedinan
emptybuffer.
Thissoundstricky,solet’swalkthroughit.Neverentercodethatisnotlikelytowork!
Insideoftheprocess_word()function,checktoseeifadjustingisgoingon.Ifso,check
toseeifthecurrentwordfitsinthecurrentline.Ifso,putitthereandmoveon.
elifadjust:
k=line_length-len(buffer)#Numberofspaces
#remaining
ifk>len(w):#Doesthewordwfit?
iflen(buffer)==0:#Yes.Emptybuffer?
buffer=w#Yes.Buffer=word.
else:#No.Addwordtothe
#buffer
buffer=buffer+””+w
print(“Buffernow“,buffer,k,len(w))
else:#Notenoughspaceremains
print(buffer,k,w,len(w))
nspaces=buffer.count(”“)#Howmanyspacesin
#buffer?
xk=k+k//2+1#Spaceinsertincrement
whilek>0:
i=nth_space(buffer,xk)
buffer=buffer[0:i]+””+buffer[i:]
k=k-1
xk=xk+1
emit(buffer)
buffer=w
The function nth_space (buffer, xk) locates the nth space character in the string s
modulo the string length. The spaces were not well distributed with this code in some
cases.Therewasasuspicionthatitdependedonwhetherthenumberofremainingspaces
wasevenorodd,sothecodewasmodifiedtoread:
...
xk=k+(k+1)//2#Spaceinsertincrement
ifk%2==0:
xk=xk+1
...
whichworkedbetter.Theoutputforthefirstpartofthetestdatawas:
This is sample textfor testing Pyroff. The default is to right adjust continuously, but
embeddedcommandscanchangethis.
Nowthelinewidthshouldbe
30characters,andsotheleft
marginwillpulledback.
Thislineiscentered
.xxnotacommand.Indented4
characters.Theideabehind
top-downprogrammingisthat
thehigherlevelsof
abstractionaredescribed
...
Theshortlinesarerightjustified,butthedistributionofthespacescouldstillbebetter.
Thefunctionnth_space()isimportant,andlookslikethis:
defnth_space(s,n):
globalnindent
nn=0#nnisacountofspacesseensofar
i=0#iistheindexofthecharacterbeingexamined
whileTrue:
ifs[i]==”“:#Ischaracteriaspace?
nn=nn+1#Yes.Countit
ifnn>=n:#Isthisenoughspaces?
returni#Yes,returnthelocation
i=(i+1)%len(s)#Nexti,wrappingaroundtheend
12.1.4 OtherCommands
The rest of the commands have to do with hyphenation, pagination, and indentation,
exceptforthe“.br”command.Dealingwithindentationfirst,thecommand“.in”specifies
anumberofcharacterstoindent,asdoes“.ti.”The“.in”commandbeginsindentinglines
from the current point on, whereas “.ti” only indents the next line. Since the “.ti”
commandonlyindentsthenextlineoftext,perhapsinitializingthebuffertothecorrect
numberofspaceswilldothetrick.Therestofthetextforthelinewillbeconcatenatedto
thespaces,resultinginanindentedline.
The“.in”commandcurrentlyresultsinthesettingofavariablenamednindenttothe
numberofspacestobeindented.Followingthesuggestionfor atemporaryindent,why
notreplaceallinitializationsofthebufferwithindentedones?Therearemultiplelocations
withintheprocess_word()functionwherethe
bufferissettothenextword:
buffer=w
Thesecouldbechangedto:
buffer=”“*nindent+w
Thissoundscleanandsimple,butitfailsmiserably.Hereiswhatitlookslike.Forthe
inputtext:
Indented4characters.
.in2
The idea behind top-down programming is that the higher levels of abstraction are
described first. A description of what he entire program is to do is written in a kind-of
English/computer hybrid language (pseudocode), and this description involves making
callstofunctionsthathavenotyetbeenwrittenbutwhosefunctionisknown.
Weget:
Indented4characters.The
ideabehindtop-down
programmingisthatthe
higherlevelsofabstraction
aredescribedfirst.A
descriptionofwhathe
entireprogramistodois
writteninakind-of
English/computerhybrid
language(pseudocode),and
thisdescriptioninvolves
makingcallstofunctions
thathavenotyetbeen
Canyoufigureoutwheretheproblemisbylookingattheoutput?Thisisaskillthat
developsasyoureadmorecode,writemorecode,anddesignmorecode.Thereisaplace
intheprogramthatwilladdspacestothetext,andclearlythathasbeendonehere.Itis,in
fact,howthetextisrightadjusted.Thespacesarecountedandsometimesreplacedwith
doublespaces.Thishappenedheretosomeofthespacesusedtoimplementtheindent.
Possiblesolutionsincludetheuseofspecialcharactersinsteadofleadingblanks,tobe
replacedwhenprinted; findinganotherway toimplementindenting; modifyingthe way
right adjusting is done. Because the number of spaces at the beginning of the line is
known,thelattershouldbepossible:whencountingspacesintheadjustmentprocess,skip
thenspacescharactersatthebeginningoftheline.Thisisamodificationtothefunction
nth_character()topositionthecountaftertheindent:
defnth_space(s,n):
globalnindent
nn=0
i=0
whileTrue:
print(“nn=”,nn)
ifs[i]==”“:
nn=nn+1
print(ꞌ”“ꞌ)
ifnn>=n:
returni
i=(i+1)%len(s)
ifi<nindent+tempindent:←
i=nindent+tempindent←
Asecondproblemintheindentationcodeisthatthereshouldbealinebreakwhenthe
commandisseen.Thisisamatterofwritingthebufferandthenclearingit.Thisshould
also occur when a temporary indent occurs, but before it inserts the spaces. Say, the
temporaryindentwillhavethesameproblemasindentwithrespecttorightadjustment,
andwehavenotdealtwiththat.
Thelinebreakcanbehandledwithanewfunction:
deflbreak():
globalbuffer,tempindent,nindent
iflen(buffer)>0:
emit(buffer)
buffer=”“*(nindent+tempindent)
tempindent=0
Thebreakinvolveswritingthebufferandclearingit.Clearingitalsomeanssettingthe
indentation.Becausethissequenceofoperationshappenselsewhereintheprogram,those
sequencescanbereplacedbyacalltolbreak().Notethatanewvariabletempindenthas
beenadded;itholdsthenumberofspacesforatemporaryindentation,andisaddedtothe
regularnindentvalueeverywherethatvariableisusedtoobtainthetotalindentationfora
line.Nowrightadjustmentofatemporarilyindentedlineshouldwork.
The lbreak() function is used directly to implement the “.br” command. A stub
previouslynamedgenline()canberemovedandreplacedbyacalltolbreak().
Linespacingcanbehandledinemit(),whichiswhere linesarewrittentothe output
file.Afterthecurrentbufferiswritten,anumberofnewlinecharactersarewrittentoequal
thecorrectlinespacing.Thenewemit()functionis:
defemit(s):
globaloutf,lines,tempindent,spacing,page_length
outf.write(s+”\n”)
lines=(lines+1)%page_length
foriinrange(1,spacing):
outf.write(“\n”)
lines=(lines+1)%page_length
tempindent=0
Whataboutpages?Thereisacommandthatdealswithpagesdirectly,andthatis“.bp,”
whichstartsanewpage.Thepagelengthisknownintermsofthenumberoflines,and
emit counts the lines as it writes them. Implementing the “.bp” command should be a
matter of emitting the number of lines needed to complete the current page. Something
likethis:
defgenpage():
globalpage_length,lines
lbreak()
foriinrange(lines,page_length):
emit(””)
Atthispointallthatismissingistheabilitytohyphenate,whichwillbeleftasoneof
theexercises.Thesystemappearstodowhatisneededusingthe
smalltestfile,sothetimehascometoconstructmorethoroughtests.Afile“preface.txt”
holdsthetextfortheprefaceofabooknamed“PracticalComputerVisionUsingC.”This
book was written using Nroff, and the commands not available in pyroff were removed
fromthesourcetextsothatitcouldbeusedastestdata.Itconsistsofover500linesof
text.Theresultofthefirsttrywasinteresting.
Pyroff appeared to run using this input file but never terminated. No output file was
created.Thefirststepwastotrytoseewhereitwashavingtrouble,soaprintstatement
wasadded toshow whatword had been processed last.That wordwas “spectrograms,”
anditappearsinthefirstparagraphoftext,after
headingsandsuch.Nowthedatathatcausedtheproblemisknown.Whatistheprogram
doing? There must be an unterminated loop someplace. Putting prints in likely spots
identifies the culprit as the loop in the nth_space() function. Tracing through that loop
findsanoddthing:thevalueofnindentbecomesnegative,andthatcausestheloopnever
toterminate.Thetestdatacontainedasituationthatcausedtheprogramtofail,andthat
situationresultedfromadifferencebetweenNroffandpyroff:inNroffthecommand“.in
-4”subtracts4fromthecurrentindentation,whereasinpyroffitsetsthecurrentindentto
-4.
Thiskindoferrorisverycommon.Allvaluesenteredbyausermustbetestedagainst
thelegalboundsforthatvariable.Thiswasnotdonehere,andthefixissimple.However,
itremindsustodothatforallotheruserinputvalues.Theseareprocessedinthefunction
process_command(),solocatingthosevaluesiseasy.Oncethiswasdonethingsworked
prettywell.Therewasoneproblemobserved,andthatwasanindentationerror.Consider
theinputtext:
.nf
1.Introduction
.in3
1.1Imagesasdigitalobjects
1.2Imagestorageanddisplay
1.3Imageacquisition
1.4Imagetypesandapplications
Theprogramformatsthisas:
1.Introduction
1.1Imagesasdigitalobjects
1.2Imagestorageanddisplay
1.3Imageacquisition
1.4Imagetypesandapplications
Thereisanextraspaceinthefirstlineaftertheindent.Thisseemslikeitshouldbeeasy
to find where the problem is, but the function that implements the command, indent(),
looksprettyclean.However,oncarefulexamination(andprintingsomebuffervalues)it
can be seen that it should not call lbreak() because that function sets the buffer to the
properlyindentednumberofspacecharacters.Thismeansthatwhenthelatertestsforan
emptybufferoccur, thebufferis not empty and text is appended toit rather than being
simplyassignedtoit.Thatis,foranemptybufferthefirstwordisplacedintoit:
buffer=word
Whereasiftextispresentthewordisappendedafteraddingaspace:
buffer=buffer+””+word
Theindentfunctionnowlookslikethis:
defindent(n):
globalnindent,buffer
nindent=n
emit(buffer)
buffer=””
Theprefacenowformatsprettywell,ifnotuptoWordstandards.Otherproblemsmay
well exist, and should be reported to the author and publisher when discovered. The
book’swikiistheplaceforsuchdiscussions.
12.2 OBJECTORIENTEDPROGRAMMING–
BREAKOUT
TheoriginalgamenamedBreakoutwasbuiltin1976,conceivedbyNolanBushnelland
SteveBristowandbuiltbySteveWozniak(somesayaidedbySteveJobs).Inthisgame
there are layers of colored rectangles in the upper part of the screen. A simulated ball
moves around the game window, and if it hits a rectangle it accumulates points and
bounces. The ball also bounces off of the top and sides of the window, but will pass
throughthebottomandbelostunlesstheplayermovesapaddleintoitspath.Ifsotheball
will bounce back up and perhaps score more points; if not the ball moves out of play.
Afterafixednumberofballsarelostthegameisover.
The game being developed here will use circles, that we’ll call tiles, rather than
rectangles.Therewillbe5rowsoftiles,eachofadifferentcolorandpointvalue:5,10,
15,10,and5pointsforeachrowrespectively.Thatwaythemostconcealedrowhasthe
mostpoints.Theplayerwillgetthreeballstotrytoclearallofthetilesaway.Thepaddle
willmove leftwhentheleftarrowkeyispressedandrightwhentherightarrowkeyis
pressed. The speed of the ball and of the paddle will be determined when the game is
tested.Asoundwillplaywhenatileisremoved,whentheballhitsthesideortopofthe
window,whentheballhitsthepaddle,andwhentheballislost.Thecurrentscoreandthe
numberofballsremainingwillbedisplayedonthescreensomeplaceatalltimes.
Figure12.1 showsanexampleof a breakoutgamecloneontheleft,withrectangular
bricks. On the right is a possible example of how the game that we’re developing here
mightlook.
12.3 DESCRIBINGTHEPROBLEMASAPROCESS
The first step is to write down a step-by-step description of how the program might
operate. This may be changed as it is expanded, but we have to start someplace. A
problemoccursalmostimmediately:istheprogramtobeaclass?Functions?Doesituse
Glib?
Thisdecisioncanbepostponedalittlewhile,butinmostcasesaprogramisnotaclass.
It is more likely to be a collection of classes operated by a mail program. However, if
objectorientationisalogicalstructure,anditoftenis,itshouldevolvenaturallyfromthe
waytheproblemisorganizedandnotimposeitselfonthesolution.
Thegameconsistsofmultiplethingsthatinteract.Playisaconsequenceofthebehavior
ofthosethings.Forexample,theballwillcollidewithatileresultinginsomepoints,the
tiledisappearing,andabounce.Thenexteventmaybethattheballcollideswithawall,
orbouncesoffofthepaddle.Thegameisasetofmanagedeventsandconsequences.This
makesitappearasifanobjectorienteddesignandimplementationwouldbesuitable.The
ball,eachtime,andthepaddlecouldbeobjects(classinstances)andcouldinteractwith
eachotherunderthesupervisionofamainprogramwhichkepttrackofallobjects,time,
scores,andotherdetails.
Let’sjustfocusonthegameplaypartofthegame,andignoreintroductorywindowsand
highscorelistsandotherpartsofarealgame.Thegamewillstartwithaninitialsetupof
the objects. Tiles will be placed in their start locations, the paddle will be placed, the
locationsofthewallsdefined;thenitwillbedrawn.Theinitialsetupwasoriginallydrawn
on paper and then a sample rendering was made, shown in Figure 12.1. The code that
drawsthisis:
#Ver0-Renderinitialplayarea
importGlib
Glib.startdraw(400,800)
Glib.fill(100,100,240)
foriinrange(0,12):
Glib.ellipse(i*30+15,30,30,30)
Glib.fill(220,220,90)
foriinrange(0,12):
Glib.ellipse(i*30+15,60,30,30)
Glib.fill(220,0,0)
foriinrange(0,12):
Glib.ellipse(i*30+15,90,30,30)
Glib.fill(180,120,30)
foriinrange(0,12):
Glib.ellipse(i*30+15,120,30,30)
Glib.fill(90,220,80)
foriinrange(0,12):
Glib.ellipse(i*30+15,150,30,30)
Glib.fill(0)
Glib.rect(180,350,90,10)
Glib.enddraw()
This code is just for a visual examination of the potential play area. The first one is
alwayswrong,andthisoneistoo,butitallowsustoseewhyitiswrongandtodefinea
morereasonablesetofparameters.Inthiscasethetilesdon’tfullyoccupythehorizontal
region,thetilegroupsaretooclosetothetop,becausewewanttoallowaballtobounce
betweenthetoprowandthetopoftheplayarea,andtheplayareaistoolargevertically.
Fixing these problems is a simple matter of modifying the coordinates of some of the
objects. This code will not be a part of the final result. It’s common, especially in
programs involving a lot of graphics, to test the visual results periodically, and to write
sometestingprogramstohelpwiththis.
This program already has some obvious objects: a tile will be an object. So will the
paddle,andsowill theball.Theseobjects havesomeobviousproperties too:atilewill
haveapositioninx,ycoordinates,anditwillhaveacolorandasize.Itwillhaveamethod
todrawitonthescreen,andawaytotellifithasbeenremovedorifitstillactive.The
paddlehasapositionandsize,andsodoestheball,althoughtheballhasnotbeenseen
yet.
Whatwillthemainprogramlooklikeifthesearetheprincipalobjectsinthedesign?
Thefirstsketchisveryabstractanddependsonmanyfunctionsthathavenotbeenwritten.
Thisfleshesoutthewaytheclassesandtheremainderofthecodewillinteractandpartly
definesthemethodstheywillimplement.Theinitializationstepwillinvolvecreatingrows
oftilesthatwillappearmuchlikethoseintheinitialrenderingabovebutactuallyconsist
offiverowsoftileobjects.Thiswillbedonefromafunctioninitialize(), buteach row
wouldbecreatedinaforloop:
foriinrange(0,12):
tiles=tiles+tile(i*30+15,y,thiscolor,npoints)
wherethetilewillbecreatedandispasseditsx,yposition,color,andnumberofpoints.
Theentirecollectionoftilesisplacedintoatuplenamedtiles.Theballwillbecreatedata
random location and with a random speed within the space between the paddle and the
tiles, and the paddle will be created so that is initially is drawn in the horizontal center
nearthebottomofthewindow.
definitialize():
score=0
nballs=2
b=ball()#Createtheball
p=paddle()#createthepaddle
thiscolor=(100,100,240)#Blue
npoints=5#Toprowis5pointseach
foriinrange(0,12):
tiles=tiles+tile(i*30+15,y,thiscolor,npoints)
#andsoonfor4morerows
Themaindraw()functionwillcallthedraw()methodsofeachoftheclassinstances,
andtheywilldrawthemselves:
defdraw():
background(200)
b.move()#Movetheball
b.draw()#Drawtheball
p.draw()#Drawthepaddle
forkintiles:
k.draw()#Draweachtile
text(“Scoreis:”+score,scorex,scorey)
text(“Ballsremaining:“+nballs,remainx,remainy)
When this function is called (many times each second) the ball is placed in its new
position,possiblyfollowingabounce,andthenisdrawn.Thepaddleisdrawn,andifitis
to be moved it will be done through the user pressing a key. Then the active tiles are
drawn,andthemessages aredrawnonthe screen.Thestructureof the mainpartof the
programisdefinedbytheorganizationoftheclasses.
12.3.1 InitialCodingforaTile
Atilehasagraphicalrepresentationonthescreen,butitismorecomplexthanthat.It
cancollidewithaballandhasacolorandapointvalue.Alloftheseaspectsofthetile
have to be coded as a part of its class. In addition, a tile can be active,meaningthat it
appearsonthescreenandcancollidewiththeball,orinactive,meaningthattheballhas
hititanditisoutofplayforallintentsandpurposes.Here’saninitialversion:
classtile:
def__init__(self,x,y,color,points):
self.x=x
self.y=y
self.color=color
self.points=points
self.active=True
self.size=30
defdraw(self):
ifself.active:
Glib.fill(self.color)
Glib.ellipse(self.x,self.y,self.size,self.size)
Atthebeginningofthegameeverytilemustbecreatedandinitializedwithitsproper
position,color,andpointvalue.Thenthedraw()functionforthemainprogramwillcall
thedraw()methodofeverytileduringeverysmalltimeinterval,orframe.Accordingto
thecodeaboveifthetileisnotactive,thenitwillnotbedrawn.Let’stestthis.
Rule: Never write more than 20-30 lines of code without testing at least part of it.
Thatwayyouhaveaclearerideawhereanyproblemsyouintroducemaybe.
Asuitabletestprogramtostartwithcouldbe:
defdraw():
globaltiles
forkintiles:
k.draw()
Glib.startdraw(360,350)
red=(250,0,0)
print(red)
tiles=()
foriinrange(0,12):
tiles=tiles+(tile(i*30,90,red,15),)
Glib.enddraw()
whichsimplyplacessometilesonthescreeninarow,passingacolorandpointvalue.
Thisalmostworks,butthefirsttileiscutinhalfbytheleftboundary.Iftheinitialization
becomes:
tiles=tiles+(tile(i*30+15,90,red,15),)
thenaproperrowof12redcirclesisdrawn.Modificationswillbemadetothisclassonce
weseemoreclearlyhowitwillbeused.
12.3.2 InitialCodingforthePaddle
Thepaddleisrepresentedasarectangleonthescreen,butitsroleinthegameismuch
moreprofound:itistheonlywaytheplayerhastoparticipateinthegame.Theplayerwill
typekeystocontrolthepositionofthepaddlesoastokeeptheballfromfallingoutofthe
area.Sotheballhastobedrawn,asthetilesdo,butalsomustbemoved(i.e.,changethe
Xposition)inaccordancewiththe player’swishes. Thepaddleclass initiallyhas a few
basicoperations:
classpaddle:
def__init__(self,x,y):
self.x=x
self.y=y
self.speed=3
self.width=90
self.height=10
defdraw(self):
Glib.fill(0)
Glib.rect(self.x,self.y,self.width,self.height)
defmoveleft(self):
ifself.x<=self.speed:
self.x=0
else:
self.x=self.x–self.speed
defmoveright(self):
ifself.x>width-self.width-self.speed:
self.x=width-self.width
else:
self.x=self.x+self.speed
WhentherightarrowkeyispressedaflagissettoTrue,andthepaddlemovestothe
right (i.e., its x coordinate increases) each time interval, or frame. When the key is
released the flag is set to False and the movement stops as a result. Movement is
accomplishedbycallingmoveleft()andmoveright(),andthesefunctionsenforcealimit
onmotion:thepaddlecannotleavetheplayarea.Thisisdonewithintheclasssothatthe
outsidecodedoesnotneedtoknowanythingabouthowthepaddleisimplemented.Itis
important to isolate details of the class implementation to the class only, so that
modificationsanddebuggingcanbelimitedtotheclassitself.
Thepaddleis simplya rectangle,as faras thegeometryis concerned,and presentsa
horizontal surface from which the ball will bounce. It is the only means by which the
playercanmanipulatethegame,soitisimportanttogetthepaddleoperationsandmotion
correct.Fortunately,movingarectangleleftandrightisaneasythingtodo.
Testingthisinitialpaddleclassusedadraw()functionthatrandomlymovedthepaddle
leftandright,andamainprogram,thatcreatesthepaddle:
defdraw():
globalp,f
Glib.background(200)
p.draw()
iff:
p.moveright()
else:
p.moveleft()
ifrandom()<.01:
f=notf
Glib.startdraw(360,350)
f=True
p=paddle(130)
Glib.enddraw()
Thiscodeworks,andsumsupthefunctionalityofthepaddle.
12.3.3 InitialCodingfortheBall
The ball really does much of the actual work in the game. Yes, the bounces are
controlledbytheuserthroughthepaddle,butoncetheballbouncesoffofthepaddleithas
to behave properly and do the works of the game: destroying tiles. According to the
standardclassmodelofthisprogram,theballshouldhaveadraw()methodthatplacesit
into its proper position on the screen. But the ball is moving, so its position has to be
updatedeachframe.Italsohastobounceoffofthesidesandtopoftheplayingarea,and
thedraw()methodcanmakethishappen.Theessentialcodefordoingthisis:
classball():
def__init__(self,x,y):
self.x=x
self.y=y
self.dx=3
self.dy=-4
self.active=True
self.color=(230,0,230)
self.size=9
defdraw(self):
ifnotself.active:
return
Glib.fill(self.color[0],self.color[1],
self.color[2])
Glib.ellipse(self.x,self.y,self.size,self.size)
self.x=self.x+self.dx
self.y=self.y+self.dy
if(self.x<=self.size/2)or\
(self.x>=Glib.width-self.size/4):
self.dx=-self.dx
ifself.y<=self.size/2:
self.dy=-self.dy
elifself.dy>=Glib.height:
self.active=False
This version only bounces off of the sides and top, and passes through the bottom.
Testingitrequiresamainprogramthatcreatestheballandadraw()functionthatsimply
callstheball’sdraw()method:
defdraw():
globalb
Glib.background(200)
b.draw()
Glib.startdraw(360,350)
b=ball(300,300)
Glib.enddraw()
The ball is created at coordinates (300,300) and does three bounces, disappearing
throughthebottomafterthat.Abouncingballhasbeencodedbefore,sothereisnothing
newhereyet.
12.3.4 CollectingtheClasses
Anextstepistotestallthreeclassesrunningtogether.Thiswillensurethatthereareno
problems with variable, method, and function names and that interactions between the
classes are in fact isolated. All three should work together, creating the correct visual
impressiononthescreen.Thecodeforthethreeclasses
was copied to one file for this test. The main program simply creates instances of each
classasappropriate,reallydoingwhattheoriginaltestprogramdidineachcase:
Glib.startdraw(360,350)
red=(250,0,0)
print(red)
tiles=()
foriinrange(0,12):
tiles=tiles+(tile(i*30+15,90,red,15),)
f=True
p=paddle(130)
b=ball(300,300)
Glib.enddraw()
Thedraw() function callsthe draw() methods for each class instance and moves the
paddlerandomlyasbefore:
defdraw():
globaltiles,p,f,b
Glib.background(200)
forkintiles:
k.draw()
p.draw()
iff:
p.moveright()
else:
p.moveleft()
ifrandom()<.01:
f=notf
b.draw()
Theresultwasthatallthreeclassesfunctionedtogetherthefirsttimeitwasattempted.
The game itself depends on collision, which will be implemented next, but at the very
leasttheclassesneedtocooperate,oratleastnotinterferewitheachother.That’strueat
thispointinthedevelopment.
12.3.5 DevelopingthePaddle
Placingthepaddleundercontroloftheuseristhenextstep.Whenakeyispressedthen
thepaddlestatewillchange,fromstilltomoving,andviceversawhenreleased.Thisis
accomplishedusingthekeypressed()andkeyreleased()functions.Theywillsetorcleara
flag, respectively, that causes the paddle to move by calling the moveleft() and
moveright() methods. The flag movingleft will result in a decrease in the paddle’s x
coordinateeachtimedraw()iscalled;movingrightdoesthesameforthe+xdirection:
defkeyPressed(k):
globalmovingleft,movingright
ifk==Glib.K_LEFT:
movingleft=True
elifk==Glib.K_RIGHT:
movingright=True
defkeyReleased(k):
globalmovingleft,movingright
ifk==Glib.K_LEFT:
movingleft=False
elifk==Glib.K_RIGHT:
movingright=False
Fromtheuserperspectivethepaddlemovesaslongasthekeyisdepressed.Insideof
theglobaldraw()function,theflagsaretestedateachiterationandthepaddleismovedif
necessary:
defdraw():#07-classes-01-20.py
global…movingleft,movingright
...
ifmovingleft:
p.moveleft()
elifmovingright:
p.moveright()
p.draw()
...
The other thing the paddle has to do is serve as a bounce platform for the ball. A
question surrounds the location of collision detection; is this the job of the ball or the
paddle?Itdoesmakesensetoperformmostofthistaskintheballclass,becausetheball
isalwaysinmotionandisthethingthatbounces.However,thepaddleclasscanassistby
providing necessary information. Of course, the paddle class can allow other classes to
examine and modify its position and velocity and thus perform collision testing, but if
thosedataare tobe hidden, theoption istohave amethod thattestswhether amoving
objectmighthavecollidedwith thepaddle.Theypositionof thepaddleis fixedandis
storedinaglobalvariablepaddle,sothatisnotanissue.Amethodinpaddlethatreturns
Trueifthex
coordinatepassedtoitliesbetweenthestartandendofthepaddleis:
definpaddle(self,x):
ifx<self.x:
returnFalse
ifx>self.x+self.width:
returnFalse
returnTrue
Theball classcan now determinewhether it collides with the paddle by checkingits
own y coordinate against the paddle and by calling inpaddle() to see if the ball’s x
position lies within the paddle. If so, it should bounce. The method hitspaddle() in the
ballclassreturnsTrueiftheballhitsthepaddle:
defhitspaddle(self):#08classes-01-21.py
ifself.y<=paddleY+2andself.y>=paddleY-2:
ifp.inpaddle(self.x):
returnTrue
returnFalse
Themostbasicreactiontohittingthepaddleistochangethedirectionofdyfromdown
toup(dy=-dy).
12.3.6 BallandTileCollisions
Thecollisionbetweenaballandatileismoredifficulttodocorrectlythananyofthe
othercollisions.Yes,determiningwhetheracollisionoccursisasimilarprocess,andthen
pointsarecollectedandthetileisdeactivated.Itisthebounceoftheballthatishardto
figureout.Theballmaystrikethetileatnearlyanyangleandatnearlyanylocationonthe
circumference. This is not a problem in the original game, where the tiles were
rectangular,becausetheballwasalwaysbouncingoffofahorizontalorverticalsurface.
Nowthere’ssomethinkingtodo.
Thecorrectcollisioncouldbecalculated,butwouldinvolveacertainamountofmath.
The specification of the problem does not say that mathematically correct bounces are
required.Thisisagamedesignchoice,perhapsnotaprogrammingchoice.Whatdoesthe
gamelooklikeifasimplebounceisimplemented?Thatcouldinvolvesimplychanging
dyto–dy.
Thisversionofthegameturnsouttobeplayable,evenfun;buttheballalwayskeeps
thesamexdirectionwhenitbounces.Whatwoulditlooklikeifitbouncedinroughlythe
right direction, and how difficult would that be? The direction of the bounce would be
dictated by the impact location on the tile, as seen in Figure12.3. This was figured out
afterafewminuteswithapencilandpaper,andisintuitiveratherthanprecise.
We’llhavetofigureoutwheretheballhitsthetile,determinewhichofthefourpartsof
thetilethisliesin,andthencreatethenewdxanddyvaluesfortheball.Akeyaspectof
the solution being developed is to avoid too much math that has to be done by the
program.Isthispossible?
Thefirststepistofindtheimpactpoint.Wecouldusealittlebitofanalyticgeometry,
orwecouldapproximate.Thefactisthattheballisnotmovingveryfast,andtheexact
coordinatesoftheimpactpointarenotrequired.Atthebeginningofthecurrentframethe
ball was at (x,y) and at the beginning of the next is will be at (x+dx, y+dy). A good
estimateofthepointofimpactwouldbethemeanvalueofthesetwopoints,or(x+dx/2,
y+dy/2).Closeenoughforacomputergame.
Now the question is: within which of the four regions defined in Figure 12.3 is the
impactpoint?Theregionsaredefinedbylinesat45degreesand-45degrees.Theatan()
functionwill, when using screencoordinates, havethe –dxpoints between-45 and +45
degrees.The–dypoints,wherethedirectionofYmotionchanges,involvetheremaining
collisions.Whatneedstobedoneistofindtheangleofthelinefromthecenterofthetile
totheballandthencomparethatto-45…+45.
Hereisanexamplemethodnamedbounce()thatdoesexactlythis.
#Returnthedistancesquaredbetweenthetwopoints
defdistance2(self,x0,y0,x1,y1):
return(x0-x1)*(x0-x1)+(y0-y1)*(y0-y1)
defbounce(self,t):
dd=t.size/2+self.size/2#Bounceoccurswhenthe
#distance
dd=dd*dd#Betweenballandtile<
#radiisquared
collide=False
ifself.distance2(self.x,self.y,t.x,t.y)>=ddand\
self.distance2(self.x+self.dx,self.y+self.dy,t.x,
t.y)<dd:
self.x=self.x+self.dx/2#Estimatedimpact
#pointoncircle
self.y=self.y+self.dy/2
collide=True
elifself.distance2(self.x,self.y,t.x,t.y)<dd:
collide=True#Balliscompletelyinside
#thetime
ifnotcollide:
return
#Iftheballisinsidethetile,backitout.
whileself.distance2(self.x,self.y,t.x,t.y)<dd:
self.x=self.x-self.dx*0.5
self.y=self.y-self.dy*0.5
ifself.x!=t.x:#Computetheball-tileangle
a=atan((self.y-t.x)/(self.x-t.y))
a=a*180./3.1415
else:#Ifdx=0thetangentisinfinite
a=90.0
ifa>=-45.0anda<=45.0:#Thexspeedchange
self.dx=-self.dx
else:
self.dy=-self.dy#Theyspeedchanges
Aftersometestingthecode:
#Iftheballisinsidethetile,backitout.
whileself.distance2(self.x,self.y,t.x,t.y)<dd:
self.x=self.x-self.dx*0.5
self.y=self.y-self.dy*0.5
wasadded.Itwasfoundthatiftheballwastoofarinsidethetilethenitsmotionwasvery
odd; as it moved through the tile it constantly changed direction because the program
determinedthatitwasalwayscolliding.
12.3.7 BallandPaddleCollisions
Now we return to examine the collision between the ball and the paddle. Thepaddle
seems to be flat, and colliding with any location on the paddle should have the same
result.Perhaps.Whatiftheballhitsthepaddleveryneartooneend?Thereisacorner,
andmaybehittingtooneartothecornershouldyieldadifferentbounce.Thiswasthecase
intheoriginalgames.Iftheballstruckthenearedgeofthepaddleonthecorneritcould
actuallybouncebackintheoriginaldirectiontoagreaterorlesserdegree.Thisgivesthe
playeragreaterdegreeofcontrol,oncetheyunderstandthesituation.Otherwisethegame
isreallypredeterminediftheplayermerelyplacesthepaddleinthewayoftheball.Itwill
alwaysbounceinexactlythesamemanner.
Theproposedideaistobounceatadifferentangledependingonwheretheballstrikes
thepaddle.Weneedtodecidehownearandhowintensetheeffectwillbe.Iftheballhits
thepaddlenearthecenter,thenitwillbouncesothattheincomingangleisthesameasthe
outgoingangle.Whenithitsthenearendofthepaddleitwillbouncesomewhatbackin
theincomingdirection,andwhenitstrikesthefarendthebounceanglewillbeshallower
abouncefromthecenter.
Let’s say that if the ball hits the first pixel on the paddle it will bounce back in the
originaldirection,meaningthatdx=-dxanddy=-dy.Abouncefromthecenterdoesnot
change dx but does set dy = -dy. If the relationship is linear across the paddle, the
implication would be that striking the final pixel would set dx = 2*dx and dy = -dy.
Strikinganypixelinbetweenwoulddividethechangeindxbythenumberofpixelsinthe
paddle,initially90.Iftheballhitspixelntheresultwillbe:
delta=2*dx/90.0
dx=-dx+n*delta
Aproblemhereisthatthedxvaluewilldecreasecontinuouslyuntiltheballisbouncing
upanddown.Perhapstheincomingangleshouldnotbeconsidered.Thebounceangleof
theballcouldbecompletelydependentonwhereithitsthepaddleandnothingelse.Ifdx
is-5onthenearendofthepaddleand+5onthefarend,then:
dx=-5+n*10.0/90.0
Thecodeinthedraw()methodoftheballclassismodifiedtoread:
ifself.hitspaddle():
self.dy=-self.dy
self.dx=-5+(1./9.)*(self.x-p.x)
Theusernowhasalotmorecontrol.Thegamedoesappearslow,though.Andthereis
onlyoneball.Oncethatislost,thegameisover.
12.3.8 FinishingtheGame
What remains to be done is to implement multiple balls. Multiple balls are tricky
becausetherearetimingissues.Whentheballdisappearsthroughthebottomoftheplay
area it should reappear someplace, and at a random place. It should not appear
immediately, though, because the player needs some time to respond; let’s say three
seconds.Meanwhilethescreenmustcontinuetobe
displayed.It’stimetointroducestates.
Astateisasituationthatcanbedescribedbyacharacteristicsetofparameters.Astate
canbelabeledwithasimplenumberbutrepresentssomethingcomplex.Inthisinstance
specificallytherewillbeaplaystate,inwhichthepaddlecanbemovedandtheballcan
scorepoints,andapausestate,whichhappensafteraballislost.Thedraw()functionis
the place where each step of the program is performed at a high level, and so will be
responsibleforthemanagementofstates.
Thecurrentstageoftheimplementationhasonlytheplaystate,andallofthecodethat
managesthatisinthedraw()functionalready.Changethenameofdraw()tostate0()and
createa statevariable statethat can havevalues 0or 1: playis 0, pause is1. The new
draw()functionisnowcreated:
defdraw():
globalplaystate,pausestate
ifstate==playstate:
state0()
elifstate==pausestate:
state1()
where:
playstate=0
pausestate=1
The program should still be playable as it was before as long as state == playstate.
Whathappensinthepausestate?Thecontrolsofthepaddleshouldbedisabled,andno
ballisdrawn.Thegoalofthepausestateistoallowsometimefortheusertogetready
forthenextball,sosometimeisallowedtopass.Perhapstheplayershouldbepermitted
tostartthegameagainwithanewballwhenakeyispressed.Thiseliminatestheneedfor
atimer, which are generally to beavoided. So, the pause state is entered when the ball
departsthefieldofplay. The gameremainsinthepause stateuntilthe playerpressesa
key,atwhichpointanewballiscreatedandthegameenterstheplaystate.
Enteringthepausestatemeansmodifyingthecodeintheballclassalittle.Thereisa
lineofcodeattheendofthedraw()methodoftheballclassthatlookslikethis:
elifself.dy>=Glib.height:
self.active=False
Thisiswheretheclassdetectstheballleavingtheplayarea.Weneedtoaddtothis:
elifself.dy>=Glib.height:
self.active=False
while,ofcourse,makingcertainthatthevariablesneededarelistedasglobal.Thisdidnot
do as was expected until it was noted that the condition should have been if self.y >=
Glib.height.Thecomparisonwithdywasanerrorintheinitialcodingthathadnotbeen
noticed.Also,itseemsliketheactivevariableintheballclasswasnotuseful,soitwas
removed.
NowinthekeyPressed()functionallow akey press tochange from the pause tothe
playstate.Anykeywilldo:
ifstate=pausestate:
resume()
Theresume()functionmustdotwothings.First,itmustchangestatebacktoplay.Next
itmustrepositiontheballtoanewlocation.Easy:
defresume():
globalstate,playstate
b.x=randrange(30,Glib.width-30)
b.y=250
state=playstate
Thisworksfine.Thegameistoonlyhaveaspecifiednumberofballs,though,andthis
numberwastobedisplayedonthescreen.So,whenintheplaystateandaballislost,the
count of remaining balls (balls_remaining) will be decreased by one. If there are any
ballsremaining,thenthepausestateisentered.Otherwisethegameisover.Perhapsthat
shouldbeathirdstate:gameover?Yes,probably.
Thegameoverstateisenteredwhentheballleavestheplayareaandnoballsareleft
(intheballclassdraw()method.Intheglobaldraw()functionthethirdstatedetermines
ifthegameiswonorlostandrendersanappropriatescreen:
…
Glib.text(“Score:“+str(score),10,30)
ifscore>=maxscore:
Glib.background(0,230,0)
Glib.text(“YouWin”,200,200)
else:
Glib.background(100,10,10);
Glib.text(“YouLose”,200,200)
Glib.text(“Score:“+str(score),10,30)
Andthat’sit!Screenshotsfromthegameinvariousstatesareshownin
Figure12.4.(14playble3.py)
12.4 RULESFORPROGRAMMERS
Theauthorofthisbookhascollectedasetofrulesandlawsthatapplytowritingcode,
andontensofthousandsoflineswrittenand45yearsasaprogrammer(hestartedvery
young). There are over 250 of them, but not all apply to Python. For example, Python
enforcesindentingandhasnobegin-endsymbols.Theonesthatdoapplyareasfollows:
2.Usefour-spaceindentsandnottabs.
5.Placeacommentinlieuofadeclarationforallvariablesinlanguageswhere
declarationsarenotpermitted.
6.Declarenumericconstantsandprovideacommentexplainingthem.
7.Rarelyuseanumericconstantinanexpression;nameanddeclarethem.
8.Usevariablenamesthatrefertotheuseormeaningofthevariable.
9.Makeyourcodecleanfromthebeginning,notafteritworks.
10.Anon-workingprogramisuseless,nomatterhowwellstructured.
11.Writecodethatisasgeneralaspossible,evenifthatmakesitlonger.
12.Ifthecodeyouwriteisgeneral,thenkeepitandreuseitwhenappropriate.
13.Functionslongerthan12(notincludingdeclarations)linesaresuspect.
15.Avoidrecursionwhereverpossible.
16.Everyprocedureandfunctionmusthavecommentsexplainingfunctionanduse.
17.Writeexternaldocumentationasyoucode—everyprocedureandfunctionmust
haveadescriptioninthatdocument.
18.Somedocumentationisforprogrammers,andsomeisforusers.Distinguish.
19.Documentationforusersmustneverbeinthecode.
20.Avoidusingoperatingsystemcalls.
21.Avoidusingmachinedependenttechniques.
22.Dousetheprogramminglanguagelibraryfunctions.
23.Documentationforaprocedureincludeswhatotherproceduresarecalled.
24.Documentationforaprocedureincludeswhatproceduresmightcallit.
26.Whendoinginput:assumethattheinputfileiswrong.
27.YourprogramshouldacceptANYinputwithoutcrashing.Feeditanexecutableas
atest.
28.Sideeffectsareverybad.Aproperfunctionshouldreturnavaluethatdependsonly
onitsparameters.Exceptionsdoexistandshouldbedocumented.
29.Everythingnotdefinedisundefined.
33.Buffersandstringshavefixedsizes.Knowwhattheyareandbeconstrainedby
them.
34.Ahandleisapointertoastructureforanobject;makecertainthathandlesusedare
stillvalid.
35.Stringsandbuffersshouldnotoverlapinstorage.
36.Contentsofstringsandbuffersareundefineduntilwrittento.
40.Everyvariablethatisdeclaredistobegivenavaluebeforeitisused.
41.Putsomeblanklinesbetweenmethoddefinitions.
42.Explaineachdeclaredvariableinacomment.
44.Solvetheproblemrequested,notthegeneralcaseorsubsets.
45.Whitespaceisoneofthemosteffectivecomments.
48.Avoidglobalsymbolswherepossible;usethemproperlywhereuseful.
49.Avoidcasts(typecasting).
50.Roundexplicitlywhenroundingisneeded.
51.Alwayschecktheerrorreturncodes.
52.Leavespacesaroundoperatorssuchas=,==,!=,andsoon.
53.Amethodshouldhaveaclear,single,identifiabletask.
54.Aclassshouldrepresentaclear,single,identifiableconcept.
57.Dothecommentsfirst.
58.Afunctionshouldhaveonlyoneexitpoint.
59.Readcode.
60.Commentsshouldbesentences.
61.Acommentshouldn’trestatetheobvious.
62.Commentsshouldalign,somehow.
65.Don’tconfusefamiliaritywithreadability.
67.Afunctionshouldbecalledmorethanonce.
68.Codeusedmorethanonceshouldbeputintoafunction.
69.Allcodeshouldbeprintable.
70.Don’twriteverylonglines.80Characters.
71.Thefunctionnamemustagreewithwhatthefunctiondoes.
72.Formatprogramsinaconsistentmanner.
75.Havealog.
76.Documentalltheprincipaldatastructures.
77.Don’tprintamessageforarecoverableerror—logit.
78.Don’tusesystem-dependentfunctionsforerrormessages.
79.Youmustalwayscorrectlyattributeallcodeinthemoduleheader.
80.Providecrossreferencesinthecodetoanydocumentsrelevanttotheunderstanding
ofthecode.
81.AllerrorsshouldbelistedtogetherwithanEnglishdescriptionofwhattheymean.
82.Anerrormessageshouldtelltheuserthecorrectwaytodoit.
83.Commentsshouldbeclearandconciseandavoidunnecessarywordiness.
84.Spellingcounts.
85.Runyourcodethroughaspellingchecker.
87.Functiondocumentationincludesthedomainofvalidinputstothefunction.
88.Functiondocumentationincludestherangeofvalidoutputsfromthefunction.
91.Eachfilemuststartwithashortdescriptionofthemodulecontainedinthefileand
abriefdescriptionofallexportedfunctions.
92.Donotcommentoutoldcode—removeit.
93.Useasourcecodecontrolsystem.
95.Commentsshouldneverbeusedtoexplainthelanguage.
96.Don’tputmorethanonestatementonaline.
97.Neverblindlymakechangestocodetryingtoremoveanerror.
98.Printingvariablevaluesinkeyplacescanhelpyoufindthelocationofabug.
99.Onecompilationerrorcanhidescoresofothers.
100.Ifyoucan’tseemtosolveaproblem,thendosomethingelse.
101.Explainittotheduck.Getaninanimateobjectandexplainyourproblemtoit.This
oftensolvesit.(Wyvill)
102.Don’tconfuseeaseoflearningwitheaseofuse.
103.Aprogramshouldbewrittenatleasttwice—throwawaythefirstone.
104.Hasteisnotspeed.
105.Youcan’tmeasureproductivitybyvolume.
106.Expecttospendmoretimeindesignandlessindevelopment
107.Youcan’tprograminisolation.
108.Ifanifendsinreturn,don’tuseelse.
109.Avoidoperatoroverloading.
110.Scoresofcompilationerrorscansometimesbefixedwithonecharacter—startat
thefirstone.
111.Programsthatcompilemostlystilldonotwork.
112.Incrementallyrefineyourcode.StartwithBEGIN-SOLVE-END,thenrefine
SOLVE.
113.Drawdiagramsofdataandalgorithms.
114.Useasymbolicdebuggerwhereverpossible.
115.Makecertainthatfileshavethecorrectname(andsuffix!)whenopening.
116.Neverassignavalueinaconditionalexpression.
117.Ifyoucan’tsayitinEnglish,youcan’tsayitinanyprogramminglanguage.
118.Don’tmovelanguageidiomsfromonelanguagetoanother.
119.First,donoharm.
120.Ifobjectoriented,designtheobjectsfirst.
121.Don’twritedeeplynestedcode.
122.Multipleinheritanceisevil.Avoidsin.
123.Productivitycanbemeasuredinthenumberofkeystrokes(sometimes).
125.Yourcodeisnotperfect.Notevenclose.Havenoegoaboutit.
126.Variablesaretobedeclaredwiththesmallestpossiblescope.
127.Thenamesofvariablesandfunctionsaretobeginwithalowercaseletter.
133.Collectyourbestworkingmodulesintoacodelibrary.
134.Isolatedirtycode(e.g.,codethataccessesmachinedependencies)intodistinctand
carefullyannotatedmodules.
135.Anythingyouassumeabouttheuserwilleventuallybewrong.
136.Everytimearuleisbroken,thismustbeclearlydocumented.
137.Writecodeforthenextprogrammer,notforthecomputer.
138.Yourprogramshouldalwaysdegradegracefully.
139.Don’tsurpriseyouruser.
140.Involveusersinthedevelopmentprocess.
142.Mostprogramsshouldrunthesamewayandgivethesameresultseachtimethey
areexecuted.
143.Mostofyourcodewillbecheckingforerrorsandpotentialerrors.
144.Normalcodeanderrorhandlingcodeshouldbedistinct.
145.Don’twriteverylargemodules.
146.Puttheshortestclauseofanif/elseontop.
149.Havealibraryoferror-reportingcodeanduseit(beconsistent).
150.Reporterrorsinawaythattheymakesense.
151.Reporterrorsinawaythatallowsthemtobecorrected.
152.Onlyfoolsthinktheycanoptimizecodebetterthanagoodcompiler.
153.Changethealgorithm,notthecode,tomakethingsfaster.Polynomialis
polynomial.
154.Copyandpasteisonlyforprototypes.
155.It’salwaysyourfault.
156.Knowwhattheproblemisbeforeyoustartcoding.
157.Don’tre-inventthewheel.
158.Keepthingsassimpleaspossible.
159.Datastructures,notalgorithms,arecentraltoprogramming.(Pike)
160.Learnfromyourmistakes.
161.Learnfromthemistakesofothers.
162.Firstmakeitwork;THENmakeitworkfaster.
163.Wealmostneverneedtomakeitfaster.
164.Firstmakeitwork;thenmakeitworkbetter.
165.Programmersdon’tgettomakebigdesigndecisions—dowhatisasked,effectively.
166.Learnnewlanguagesandtechniqueswhenyoucan.
167.Neverstartanewprojectinalanguageyoudon’talreadyknow.
168.Youcanlearnanewlanguageeffectivelybycodingsomethingsignificantinit,just
don’texpecttoselltheresult.
169.Youwillalwaysknowonlyasubsetofanygivenlanguage.
170.Thesubsetyouknowwillnotbethesameasthesubsetyourcoworkersknow.
171.Objectorientationisnottheonlywaytodothings.
172.Objectorientationisnotalwaysthebestwaytodothings.
173.Tocreateadecentobject,onefirstneedstobeaprogrammer.
174.Youmaybesmarterthanthepreviousprogrammer,butleavetheircodealone
unlessitisbroken.
175.Youprobablyarenotsmarterthanthepreviousprogrammer,soleavetheircode
aloneunlessitisbroken.
176.Yourprogramwillneveractuallybecomplete.Livewithit.
177.Allfunctionshavepreconditionsfortheircorrectuse.
178.Sometimesafunctioncannottellwhetheritspreconditionsaretrue.
180.Computershavegigabytesofmemory,mostly.Optimizingitisthelastthingtodo.
181.Computeimportantvaluesintwodifferentwaysandcomparethem.
182.0.1*10isnotequalto1.
183.Addingmanpowertoalatesoftwareprojectmakesitlater.
184.Italwaystakeslongerthanyouexpect.
185.Ifitcanbenull,itwillbenull.
186.Donotusecatchandthrowunlessyouknowexactlywhatyouaredoing.
187.Beclearaboutyourintention.i=1-iisnotthesameasif(i==0)theni=1elsei=0.
188.Fancyalgorithmsarebuggierthansimpleones,andthey’remuchharderto
implement.(Pike)
189.Thefirst90%ofthecodetakes10%ofthetime.Theremaining10%takesthe
other90%ofthetime.
190.Allmessagesshouldbetagged.
191.DonotuseFORloopsastimedelays.
192.Auserinterfaceshouldnotlooklikeacomputerprogram.
193.Decomposecomplexproblemsintosmallertasks.
194.Usetheappropriatelanguageforthejob,whengivenachoice.
195.Knowthesizeofthestandarddatatypes.
198.Ifyousimultaneouslyhittwokeysonthekeyboard,theonethatyoudonotwant
willappearonthescreen.
199.Patternsarefortheweak—itassumesyoudon’tknowwhatyouaredoing.
200.Don’tassumeprecedencerules,especiallywhendebugging—parenthesize.
202.++and—areevil.What’swrongwithi=i+1??
204.It’shardtoseewhereaprogramspendsmostofitstime.
205.Fancyalgorithmsareslowwhennissmall,andnisusuallysmall.(Pike)
206.Assumethatthingswillgowrong.
207.Computersdon’tknowanymath.
208.Expecttheimpossible.
209.Testeverything.Testoften.
210.Dothesimplebitsfirst.
211.Don’tfixwhatisnotbroken.
212.Ifitisnotbroken,thentrytobreakit.
213.Don’tdrawconclusionsbasedonnames.
214.Acarelesslyplannedprojecttakesthreetimeslongertocompletethanexpected;a
carefullyplannedprojecttakesonlytwiceaslong.
215.Anysystemwhichdependsonhumanreliabilityisunreliable.
216.Theeffortrequiredtocorrectcourseincreasesgeometricallywithtime.
217.Complexproblemshavesimple,easytounderstand,andwronganswers.
218.Anexpertisthatpersonmostsurprisedbythelatestevidencetothecontrary.
219.Oneman’serrorisanotherman’sdata.
220.Noiseissomethingindatathatyoudon’twant.Someonedoeswantit.
12.5 SUMMARY
There is no general agreement on how best to put together a good program. A good
programisonethatisfunctionallycorrect,readable,modifiable,reasonablyefficient,and
that solves a problem that someone needs solved. No two programmers will create the
sameprogramforanon-trivialproblem.Theprogramdevelopmentstrategydiscussedin
this chapter is called iterative refinement, and is nearly independent of language or
philosophy.
There is no single process that is best for writing programs. Some people only use
object-oriented code, but a problem with teaching that way is that a class contains
traditional,procedure-orientedcode.Tomakeaclass,onemustfirstknowhowtowritea
program.
The idea behind top-down programming is that the higher levels of abstraction are
described first. A description of what he entire program is to do is written in a kind-of
English/computer hybrid language (pseudocode), and this description involves making
calls to functions that have not yet been written but whose function is known—these
functionsarecalledstubs.Thefirststepistosketchtheactionsoftheprogramasawhole,
then to expand each step that is not well-defined into a detailed method that has no
ambiguity.Compile and testthe program asfrequently as possibleso that errorscan be
identifiedwhileitisstilleasytoseewheretheyare.
Thekeytoobject-orienteddesignisidentifyingthebestobjectstobeimplemented.The
restoftheprogramwilltakealogicalshapedependingontheclassesthatituses.Tryto
isolatethedetailsoftheclassfromtheoutside.Alwaysbewillingtorethinkachoiceand
rewritecodeasaconsequence.
Exercises
1.Addsoundtothegame.Whentheballcollideswithanobjectasoundeffectshould
play.
2.ConsiderhowhyphenationmightbeaddedtoPyroff.Howwoulditbedecidedto
hyphenateaword,andwherewouldthenewcodebeplaced?Inotherwords,sketcha
solution.
3.InsomeversionsofBreakout-stylegames,certainofthetilesortargetshavespecial
properties.Forexample,sometimeshittingaspecialtargetwillresultintheball
speedinguporslowingdown,willhaveanextrapointvalue,orwillchangethesizeof
thepaddle.Modifythegamesothatsomeofthetargetsspeeduptheballandsome
othersslowitdown.
4.ThePyroffsystemcanturnrightadjustingoff,butnoton.Thisseemslikeaflaw.Add
anewcommand,“.ra,”thatwillturnrightadjustingon.
5.Mostwordprocessorsallowforaheaderandafooter,somespaceandpossiblysome
textatthebeginningandend,respectively,ofeverypage.Designacommand“.he”
thatattheleastallowsforemptyspaceatthebeginningofapage,andacorresponding
command“.fo”thatallowsforsomelinesattheendofapage.
6.WhichthreeoftheRulesforProgrammersdoyouthinkmakethegreatestdifference
inthecode?Whichthreeaffectcodetheleast?Arethereanythatyoudon’t
understand?
NotesandOtherResources
Bouncing a ball off of a wall: https://sinepost.wordpress.com/2012/08/19/bouncing-
off-the-walls/
1.JonBentley.(1999).ProgrammingPearls,2nded.,Addison-WesleyProfessional,
ISBN-13:978-0201657883.
2.AdrianBowyerandJohnWoodwark.(1983).AProgrammer’sGeometry,
Butterworth-HeinemannLtd.,ISBN-13:978-0408012423.
3.FrederickBrooks.(1995).TheMythicalMan-Month:EssaysonSoftware
Engineering,AnniversaryEdition,Addison-WesleyProfessional.
4.JimParker.(2015).100CoolProcessingSketches,eBook,
https://leanpub.com/100coolprocessingsketches
5.JimParker.(2015).GameDevelopmentUsingProcessing,MercuryLearningand
Publishing.
6.R.Rhoad,G.Milauskas,andR.Whipple.(1984).GeometryforEnjoymentand
Challenge,rev.ed.,Evanston,IL,McDougal,Littell&Company.
7.GeraldM.Weinberg.(1998).ThePsychologyofComputerProgramming,AnlSub
ed.,DorsetHouse,ISBN-13:978-0932633422.
CHAPTER13
COMMUNICATINGWITHTHEOUTSIDE
WORLD
13.1Email
13.2FTP
13.3CommunicationBetweenProcesses
13.4Twitter
13.5CommunicatingwithOtherLanguages
13.6Summary
Inthischapter
Pythoncanreaddatafromthekeyboardandprintonthescreen,itcandisplaygraphics,
audio, and video, allow mouse (and touch) interactions, and read and writedata to and
fromfiles.That’salotofcommunication,butitallhappensononecomputer—theoneon
whichtheprogramisrunning.Intheageofhigh-speedInternet,socialmedia,podcasts,
blogs,andwikis,thisisnotenough.Thewideworldoutsideofthedesktopbeckons,anda
programmerwithaknowledgeofPythonandtherelevantmodulescanrespond.
Canacomputercommunicatewithanotherone?Ofcourse.Canaprogramsendemail?
Yes, that’s what a mail program like Thunderbird or Outlook does. Can a program be
writtenthatreadstweetsastheyaresent?Sure,butthereisaprice.Thatis:thesethings
aredoneaccordingtosomeoneelse’srules.Thefirstemailwassentin1971onaprivate
network named Arpanet. It sent mail between distinct computers, rather than sending
messagesbetweenusersonaspecificmachine.In1972UnixEmailwasmadeavailable,
andwasnetworkedin1978;thatwasthestartofsomethingbig.
Thesenderandreceiverhadtoagreeonhowtoencodeanddecodeamessage,andhow
toaccessitfromthenetwork.Tosendmailbetweendifferentcomputersalwaysrequiresa
standard,aschemethatisagreeduponbyimplementersofthesystem.Otherwisemailcan
onlybesentbetweenUNIXsystems,orWindows,oriOS.Email,tobepractical,needsto
bemoreflexible.Itneedstobeubiquitous,andsoallneedtoagreeonastandardforhow
Emailcanbesentandreceived.Astandardwaseventuallyagreedon,anditwascalledthe
SimpleMailTransferProtocol(SMTP)andwasestablishedin1982.
ThiswassevenyearsbeforetheWorldWideWeb,soEmailreallyrepresentsthefirst
practicalwaytocommunicatebetweencomputersoveralongdistance.FTPhappenedat
about the same time. The enabling technology for the Web, TCP/IP, came next. All of
these developments in networking and software combined to create the modern
interconnectedsociety,butallarebasedonacollectionofrulesthatsoftwaremustagree
to(protocols)iftheyaretomakeuseofthenetworkinfrastructure.Thisisanexampleof
design by contract, in which designers create formal specifications for components and
using those involves a kind-of contract or agreement between programmers developing
clientsoftwareandthosewhobuiltthemodulesanddesignedtheprotocols.
Therearehigh-levelprogramsthatprovideagooduserinterfacetotheInternetandthat
implement these protocols beneath their visual presentation. When using Python a
collectionofmodulesareusedthathandletheverylow-leveldetails,buttheinterfaceto
theprogrammerexposestheprotocol.Someofthesemodulesareprovidedinastandard
Pythoninstallation(smtplib,email),andsomearenot(MPI,Tweepy),andwillhavetobe
installedbeforethecodeinthischapterwillrun.
When communicating with another machine a key issue is that of authentication.
Almost all protocols require that a connection be formed between the two computers,
usingsomekindofidentificationofthosemachinessuchastheirIPaddress.Thentheone
initializingtheconnectionmustprovethatithaspermissiontodowhatitisabouttodo.
Thisresemblesloggingin,andinvolvesauseridentificationandapasswordofsometype.
Oncethe user has been identified there is an exchange of messages that tell the remote
computerwhatisdesiredofit,andthatallowinformationtobereturnedtothecaller.This
processisnearlyuniversal,buttakessomewhatdifferentformsondifferentsystems.
13.1 EMAIL
Emailisagoodexampleofaclient-serversystem,andonethatgetsusedmillionsof
timeseachminute.TheEmailprogramonaPCistheclient,andallowsausertoentertext
messages,specifydestinations,attachimages,andallofthefeaturesexpectedbysucha
program.ThisclientpackagestheEmailmessage(data)accordingtotheSimpleMessage
TransferProtocol(SMTP)andsendsthattoanothercomputerontheInternet,theEmail
server.AnEmailusermusthaveanaccountontheserverforthistoworksotheycanbe
identified and the user can receive replies; so, the process is: log into the Email server,
thensendtheSMTPmessagetotheEmailserverprogramonthatserver.Thustheclient
side of the contract is to create a properly formatted message, to log into the server
properly,andpassthemessagetoit.
Nowtheserverdoesthework.Giventhedestinationofthemessage,itsearchesforthe
server that is connected to that destination. For example, given the address
xyz@gmail.com, the server for gmail.com is located. Then the email message is sent
acrossthenetworktothatserver.Theserversoftwareatthatendreadsthemessageand
placesitintothemailbox,whichisreallyjustadirectoryonadiskdriveconnectedtothe
server,forthespecifiedusexyz.Themail
messageisessentiallyatextfileatthispoint.
This description is simplified but essentially accurate, and describes what has to be
donebyaprogramthatissupposedtosendanEmailmessage.ThePythonmodulethat
permitsthesendingofEmailimplementstheprotocolandofferstheprogrammerwaysto
specifytheparameters,likethedestinationandthemessage.Theinterfaceisimplemented
asasetoffunctions.Thelibraryneededforthisissmtplib,apartofthestandardPython
system.
Example:SendanEmail
Sending an Email message starts with establishing a connection between the client
computerandtheuser’smailserver,theoneonwhichtheyhaveanaccount(username
and password). For the purposes here a Gmail (Google) server will be used, which
complicatestheissueatinybit.TheEmailaccountsintheexamplearealsoGmailones,
andthesecanbehadforfreefromGoogle.
Theprogrammustdeclaresmtplibasanimportedmodule.Thesendingaddressandthe
receivingaddresswillbethesameinthisexample,butthisisjustatest.Normallythiswill
not be the situation. The Email address is the user ID for Gmail authentication and the
passwordisdefinedbytheuser.Thesearealljuststrings.
importsmtplib
LOGIN=yourloginID#LoginUserIDforGmail,stringPASSWD=yourpassword
#LoginpasswordforGmail,string
sndr=pythontextbook@gmail.com#Senderꞌsemailaddress
rcvr=pythontextbook@gmail.com#Receiverꞌsemailaddress
Part of the SMTP scheme is a syntax for Email messages. There is a header at the
beginningthatspecifiesthesender,receiver,andsubjectofthemessage.Theseareusedto
formatthemessage,nottorouteit—thereceiveraddressisspecifiedlater.Asimplesuch
messagelookslikethis:
From:user_me@gmail.com
To:user_you@gmail.com
Subject:Justamessage
Astringmustbeconstructedthatcontainsthisinformation:
msgt=“From:user_me@gmail.com\n”
msgt=msgt+“To:user_you@gmail.com\n”
msgt=msgt+“Subject:Justamessage\n”
msgt=msgt+“\n”
Nowthebodyofthemessageisattachedtothisstring.ThisisthepartoftheEmailthat
isimportanttothesender:
msgt=msgt+“Attention:ThismessagewassentbyPython!\n”
Thestringvariablemsgtnowholdsthewholemessage.Thismessageisintheformat
definedbytheMultipurposeInternetMailExtensions(MIME)standard.Thenextstepfor
theprogramistotrytoestablishaconnectionwiththesender’semailserver.Forthisthe
smtpmoduleisneeded,specificallytheSMTP()function.Itiscalledpassingthenameof
theuser’sEmailserverasaparameter,anditreturnsavariablethatreferencesthatserver.
Inthisexamplethatvariableisnamedserver:
server=smtplib.SMTP(ꞌsmtp.gmail.comꞌ)
Ifitisnotpossibletoconnecttotheserverforsomereason,thenanerrorwilloccur.It
isthereforeagoodideatoplacethisinatry-exceptblock:
try:
server=smtplib.SMTP(ꞌsmtp.gmail.comꞌ)
except:
print(“Erroroccurred.Canꞌtconnect”)
else:
Now comes the complexity that Gmail and some other servers introduce. What has
happened after the call to smtplib.SMTP() is that a communications session has been
openedup.Thereisnowanactiveconnectionbetweentheclientcomputerandtheserver
at smtp.gmail.com. Some servers demand a level of security that, among other things,
ensures that other parties can’t modify or even read the message. This is accomplished
using a protocol named Transport Layer Security (TLS), the details of which are not
completely relevant because the modules take care of it. However, to send data to
smtp.gmail.comtheservermustbetoldtobeginusingTLS:
server.starttls()
NowtheusermustbeauthenticatedusingtheirIDandpassword:
server.login(LOGIN,PASSWD)
Onlynowcanamessagebesent,andonlyiftheloginIDandpasswordarecorrect.The
senderisthestringsndr,therecipientisrcvr,andthemessageismsgt:
server.sendmail(sndr,rcvr,msgt)
Nowthatthemessagehasbeensent,itispolitetoclosethesession.Loggingoffofthe
serverisdoneasfollows:
server.quit()
Thisprogramwillsendoneemail,butitcanbeeasilymodifiedtosendmany emails
oneaftertheother.Itcanbemodifiedtoreadthemessagefromthekeyboard,orperform
anyofthefunctionsofatypicalEmail-sendingprogram
(Exercise1).
ThemoduleemailcanbeinvokedtoformatthemessageinMIMEform.Thefunction
MIMEText(s) converts the message string s into an internal form, which is a MIME
message.Fieldslikethesubjectandsendercanbeaddedtothemessage,andthenitissent
aswasdonebefore.Forexample:
importsmtplib
fromemail.mime.textimportMIMEText
LOGIN=yourloginID
PASSWD=yourpassword
fp=open(“message.txt”,“r”)#Readthemessagefroma
#file
mtest=fp.read()
#Or:simplyuseastring
#mtest=“AmessagefromPython:MerryChristmas.”
fp.close()
msg=MIMEText(mtest)#CreateaMIMEstring
sndr=pythontextbook@gmail.com#SenderꞌsEmail
rcvr=pythontextbook@gmail.com#RecipientꞌsEmail
msg[ꞌSubjectꞌ]=ꞌMailfromPythonꞌ#AddSubjecttothe
#message
msg[ꞌFromꞌ]=sndr#Addsendertothe
#message
msg[ꞌToꞌ]=rcvr#Addrecipenttothe
#message
#SendthemessageusingGoogleꞌsSMTPserver,asbefore
s=smtplib.SMTP(ꞌsmtp.gmail.comꞌ)#localhostcouldwork
s.starttls()
s.login(LOGIN,PASSWD)
s.send_message(msg)
s.quit()
UsingMIMEText() to create the message avoids having to format it correctly using
basicstringoperations.
13.1.1 ReadingEmail
ReadingEmailismorecomplicatedthanwritingit.ThecontentofanEmailisoftena
surprise,andsoareadermustbepreparedtoparseanythingthatmightbesent.Therecan
be multiple mailboxes: which mailbox will be looked at? There are usually many
messages in a mailbox: how can they be distinguished? In addition, the protocol for
retrievingmailfromaserverisdifferentfromthatusedtosendit;infact,therearetwo
competingprotocols:POPandIMAP.
ThePostOfficeProtocol(POP)istheolderofthetwoschemes,althoughithasbeen
updatedafewtimes.Itcertainlyallowsthebasicrequirementsofamailreader,whichis
todownloadanddeleteamessageinaremotemailbox(i.e.,ontheserver).TheInternet
Message Access Protocol (IMAP) is intended for use by many Email clients, and so
messagestendnottobedeleteduntilthatisrequested.WhensettingupanEmailclient
oneoftheseprotocolsusuallyhastobespecified,andthenitwillbeusedfromthenon.
TheexampleherewilluseIMAP.
Example:DisplaytheSubjectHeadersforEmailsinInbox
AnoutlinefortheprocessofreadingEmailissketchedontheright-handsideofFigure
13.1. Reading Email uses a different module that was used to send Email: imaplib, for
readingfromanIMAPserver.Thefunctionnamesaredifferentfromthoseinsmtplib,but
thepurposeofsomeofthemisthesame.ThefirstthreestepsinreadingEmailare:
importimaplib
server=ꞌimap.gmail.comꞌ#GmailꞌsIMAPserver
USER=pythontextbook@gmail.com#UserID
PASSWORD=“password”#Maskthispassword
EMAIL_FOLDER=“Inbox”
mbox=imaplib.IMAP4_SSL(server)#Connecttotheserver
mbox.login(USER,PASSWORD)#Authenticate(login)
The next step is to select a mailbox to read. Each has a name, and is really just a
directory someplace. The variable mbox is a class instance of a class named
imaplib.IMAP4_SSL, the details of which can be found in many places, including the
Internet.Ithasamethodnamedselect()thatallowstheexaminationofamailbox,given
its name (a string). The string is a variable named EMAIL_FOLDER which contains
“Inbox,”andthecalltoselect()thatessentiallyopenstheinboxis:
z=mbox.select(EMAIL_FOLDER)
The return value is a tuple. The first element indicates success or failure, and if z[0]
contains the string “OK” then the mailbox is open. The usual alternative is “NO.” The
secondelement of the tupleindicates how many messagesthere are, but itis in an odd
format. If there are 2 messages, as in the example, this string is b’2’; if there were 3
messagesitwouldbeb’3’;andsoon.Thesearecalledmessagesequencenumbers.
Havingopened the mailbox, the nextstep is to read it andextract the messages. The
protocol requires that the mailbox be searched for the messages that are wanted. The
imaplib.IMAP4_SSLclassoffersthesearch()methodforthis,thesimplestformbeing:
mbox.search(None,“ALL”)
whichreturnsallofthemessagesinthemailbox.IMAPprovidessearchfunctionality,and
all this method does is connect to it, which is why it seems awkward to use. The first
parameterspecifiesacharacterset,andNoneallowsittodefaulttoageneralvalue.The
secondparameterspecifiesasearchcriterionasastring.Therearedozensofparameters
thatcanbeusedhereandthedocumentationforIMAPshouldbeexaminedindetailfor
solutionstospecificproblems.However,someofthemoreusefultagsinclude:
ANSWERED:Messagesthathavebeenanswered.
BCC<string>:MessageswithaspecificstringintheBCCfield.
BEFORE<date>:Messageswhosedate(nottime)isearlierthanthespecifiedone.
HEADER<field-name><string>:Aspecifiedfieldintheheadercontainsthestring.
SUBJECT<string>:MessagesthatcontainthespecifiedstringintheSUBJECTfield.
TO<string>:MessagesthatcontainthespecifiedstringintheTOfield.
UNSEEN:Messagesthatdonothavethe\Seenflagset.
So,acalltosearch()thatlooksforthetext“Python”inthesubjectlinewouldbe:
mbox.search(None,“SUBJECTPython”)
Thesearch()functionreturnsatupleagain,wherethefirstcomponentisastatusstring
(i.e., “OK,” “NO,” “BAD”) and the second is a list of messages satisfying the search
criteriainthesameformatasbefore.Ifthesecondmessageiftheonlymatch,thisstring
willbeb’2.’Ifthefirstthreematchitwillbeb’123.’
Finally,themessagesareread,orfetched.Theimaplib.IMAP4_SSLclasshasafetch()
methodtodothis,anditagaintakessomeoddparameters.Whataprogrammerthinksof
theinterfaceortheAPIor,inotherwords,thecontract,isnotimportant.Whatmustbe
doneistosatisfytherequirementsandacceptthedataasitisoffered.Thefetch()method
acceptstwoparameters:thefirstistheindicationofwhichmessageisdesired.Thefirst
messageisb’1’, the secondis b’2’, and so on. The second parameter is an indicator of
whatitisthat shouldbereturned. Theheader?Ifso, pass“(RFC822.HEADER)”as the
parameter.Why?Becausetheyaskforit.RFC822isthenameofaprotocol.IftheEmail
bodyiswanted,thenpass“(RFC822.TEXT)”.Ashortlistofpossibilitiesis:
RFC822-Everything
RFC822.HEADER-Nobody,headeronly
RFC822.TEXT-Bodyonly
RFC822.SIZE-Messagesize
UID-Messageidentifier
Multipleofthesespecifierscanbepassed;forexample:
mbox.fetch(num,ꞌ(UIDRFC822.TEXTRFC822.HEADER)ꞌ)
returnsatuplehavingthreeparts:theID,thebody,andtheheader.Bytheway,theheader
tends to be exceptionally long, 40 lines or so, and is mostly uninteresting to a specific
application.Forthisexample,theonlypartoftheheaderthatisinterestingisthe“Subject”
part.Fieldsintheheaderareseparatedbythecharacters“\r\n”sotheyareeasytoextract
inacalltosplit().Eliminatingtheheaderdataforamoment,thecall:
(env,data)=mbox.fetch(num,ꞌ(UIDRFC822.TEXT)ꞌ)
resultsinatuplethathasan“envelope”thatshouldindicate“OK”(theenvvariable).The
datapartisastringthatcontainstheUIDandthetextbodyofthemessage.Forexample:
[(bꞌ2(UID22RFC822.TEXT{718}ꞌ,b”Gota collectionof old45ꞌs for sale. Contact
me.\r\n\r\n—\r\n”),bꞌ)ꞌ]
Thissaysthatthisismessage2andshowsthetextofthatmessage.
Thisexampleissupposedtoprintallofthesubjectheadersinthismailbox.Thecallto
fetch()shouldextracttheheaderonly:
(env,data)=mbox.fetch(num,ꞌ(RFC822.HEADER)ꞌ)
ThedetailsofIMAParecomplexenoughthatitiseasytoforgetwhattheoriginaltask
was, which was to print the subject lines from the messages in the mailbox. All of the
relevantmethodshavebeendescribedandcompletingtheprogramispossible.Theentire
programis:
importimaplib
server=ꞌimap.gmail.comꞌ#IMAPServer
USER=“pythontextbook@gmail.com”#USERID
PASSWORD=””#Maskthispassword
EMAIL_FOLDER=“Inbox”#Whichmailbox?
mbox=imaplib.IMAP4_SSL(server)#Connect
mbox.login(USER,PASSWORD)#Authenticate
env,data=mbox.select(EMAIL_FOLDER)#Selectthemailbox
ifenv==ꞌOKꞌ:#Diditwork?
print(”Printingsubjectheaders:“,EMAIL_FOLDER)
env,data=mbox.search(None,“ALL”)#Selectthe
#messageswanted.
ifenv!=ꞌOKꞌ:#Arethereany?
print(“Nomessages.”,env)#Nope.
exit()
fornumindata[0].split():#Foreachselected
#messagebꞌ123…ꞌ
(env,data)=mbox.fetch(num,ꞌ(RFC822.HEADER)ꞌ)
#Readit
ifenv!=ꞌOKꞌ:
print(“ERRORgettingmessage”,num,“,“,env)
break
s=str(data[0][1])#Lookforthestring
#“Subject”intheheader
k=s.find(“Subject”)
if(k>=0):#Foundit?
s=s[k:]#Extractthestringtothe
#nextꞌ\rꞌ
k=s.find(ꞌ\rꞌ)
s=s[:k]
print(s)#Andprintit.
mbox.close()
else:
print(“Nosuchmailboxas“,EMAIL_FOLDER)
mbox.logout()
Typicaloutputwouldbe:
Printingsubjectheaders:Inbox
Subject:ContentsofChapter13
Subject:45RPM
Subject:anotherEmail
ThepointofthissectionwastodemonstratehowaPythonprogram,oranyprogramfor
thatmatter,mustcomplywithexternalspecificationswheninterfacingwithsophisticated
software systems, and to introduce the concept of a protocol, a contract between
developers.OfcourseaprogramthatcansendEmailisusefulbyitself.
13.2 FTP
TheFileTransferProtocolisusedtoexchangefilesbetweencomputersonanetwork,
inparticularacrosstheInternet.Itprovidesthesamesortofinterfacetodataonadistance
computeraswouldbeexpectedfromafilesystemonadesktop.Itcancopyafileineither
direction, but can also change directories, list the directory contents, and perform other
useful operations. This again presumes that the rules set up by the FTP interface are
followed.
Havingjustseenthecommunicationrequirementsforsendingand receivingEmail,it
shouldbepossibletopredictthewaythatFTPwilloperate.Aconnectionwillhavetobe
madetoaremotecomputer,andsomeformofauthenticationwilltakeplace.Theclient
(the program that established the connection) will now send a set of commands to the
server, which will read and process them. Then, finally, the client will terminate the
connection.Thisisalltrue.
The commands that can be processed by an FTP server include things like LIST the
contents of this directory, change the working directory (CWD), retrieve a file (RETR)
andsendorstoreafile(STOR).Thesearesentacrossthenetworkasstringsandrepresent
raw FTP commands, those that take place at a low level of abstraction in the system.
HigherlevelcommandsareimplementedasspecificmethodsintheFTPclassofftplib.
Forexample,thereisacommandnamedPWDthatwilldisplaythenameofthecurrent
remotedirectory.FTPoffersafunctionthatwillsendthiscommand:
FTP.pwd()
Doing the same thing by sending the command directly would use the sendcmd()
methodofFTP,andwouldpassthecommandasastring:
ftp.sendcmd(“PWD”)
There is a difference to the programmer. The pwd() method returns the string that
representsthedirectory,whereaswhenthetextcommandissent,the return valueisthe
stringthattheFTPsystemreturned,whichissomethinglike:
257“/”isthecurrentdirectory
AnexamplewillbeusedtoillustratetheuseofFTP.
Example:DownloadandDisplaytheREADMEFilefroman
FTPSite
The site chosen for the example belongs to NASA, but any ftp site will work. The
connectionandauthenticationstepsare:
fromftplibimportFTP
ftp=FTP(“ftp.hq.nasa.gov”)#PleasedonꞌtalwaysuseNASA
ftp.login()#Selectadifferentsite.
The login step is interesting because there are no parameters given. This is an
anonymous FTP connection, which is common for sites that offer things for download.
The default login when using the login() method is a user ID of “anonymous” and a
password, if one is requested, of “anonymous.” It is also possible to specify an ID and
passwordiftheuserhasthem:
ftp.login(“myuserid”,“mypassword”)
Thelogin()functionreturnsthesite’swelcomemessage,whichisastringthatcanbe
ignored.
The example is supposed to download the file named README and display it. The
methodretrlines()candothis,becauseitisatextfile.Ifitwereabinaryfile,likeanMP3
orJPGfile,thentheretrbinary()methodwouldbeusedinstead.Thefirstparameterto
retrlines()isacommand.ToretrieveafilethecommandisthekeywordRETRfollowed
bythefilename.Thesimplestversionis:
ftp.retrlines(ꞌRETRREADMEꞌ)
whichwilldisplaythetextfromthefileonthescreen.That’swhatwaswanted,butthe
method can do more. If a function name is passed as the second parameter, then that
functionwillbecalledforeverylineoftext,andwillbepassedthatlineasaparameter.To
illustratethisconsiderasimplefunctionthattakesastringandprintseachline,lookingfor
“\n”characters.Thefunctionis:
defmyprint(ss):
s=str(ss)#Sometimestheparameterssistypebyte.
x=s.split(“\n”)
foriinrange(0,len(x)):
print(x[i])
Nowacalltoretrlines()couldbeasfollows:
ftp.retrlines(ꞌRETRREADMEꞌ,myprint)
oreven:
ftp.retrlines(ꞌRETRREADMEꞌ,print)
tousethestandardprint()function.Ofcourseanyfunctionthattakesastringparameter
couldbepassed.TosavetheREADMEfileasalocalfile,forexample:
ftp.retrlines(ꞌRETRREADMEꞌ,open(ꞌREADMEꞌ,ꞌwꞌ).write)
willwritethefiletoalocalonenamedREADME,butlackingtheend-of-linecharacters.
Binary files use retrbinary(), and it has the same form as retrlines(). However, the
secondparameter,thefunction,mustbepassed,becausebinaryfilescannotbesenttothe
screen.Downloadingandsavinganimagefilemightbedoneasfollows:
ftp.retrbinary(ꞌRETRorion.jpg,open(ꞌorion.jpg,ꞌwbꞌ).write)
Thesessionwouldendbyloggingout:
ftp.quit()
Uploadingafile,thatismovingafilefromadesktoptoasiteontheInternet,usedthe
methodstorlines()fortextandstorbinary()forbinaryfiles.Examples:
f=open(“message.txt”,“rb”)
ftp.storlines(“STORmessage.txt”,f)
Themethodcopieslinesfromthelocalfiletotheremoteone.Thefiletobecopiedis
openin“rb”mode.Forabinaryexampleassumeanimage:
f=open(“image.jpg”,“rb”)
ftp.storbinary(“STORimage.jpg”,f)
session.storbinary(“STORkitten.jpg,”file)#sendthefile
13.3 COMMUNICATIONBETWEENPROCESSES
UnderneaththeFTPandEmailprotocols,whichallowinterfacestoapplications,liesa
communications layer, the programs that actually send bytes between computers or
betweenprogramsonthesamecomputer.Itisconductedverymuchlikeaconversation.
One person, the client, initiates the conversation (“Hi there!”). The other (the server)
responds (“Hello. Nice to see you.”). Now it is the client’s turn again. They take turns
sendingandacceptingmessagesuntilonesays“goodbye.”Thesemessagesmightcontain
Email,orFTPdata,orTVprograms.Thislayerdoesnotcarewhatthedatais;noneofits
business,really.Itsjobistodeliverit.
Dataaredeliveredinpackets,eachcontainingacertainamount.Inorderfortheclient
todeliverthedatatheremustbeaserverwillingtoconnecttoit.Theclientneedstoknow
theaddressofaserver,justasanFTPaddressorEmaildestinationwasrequiredbefore,
butnowallthatisneededisthehostnameandaportnumber.Aportisreallyalogical
construction,somethingakintoanelementofalist.Iftwoprogramsagreetosharedata
byhavingoneofthemplaceitinlocation50001ofalistandtheotheronereaditfrom
there,itgivesanapproximateideaofwhataportis.Someportnumbersareassignedand
should not be used for anything else; FTP and Email have assigned ports. Others are
availableforuse,andanytwoprocessescanagreetouseone.
Amodulenamedsocket,basedontheinterprocesscommunicationschemeonUNIXof
the same name, is used with Python to send messages back and forth. To create an
exampletwocomputersshouldbeused,onebeingtheclientandonetheserver,andtheIP
addressoftheserverisrequiredtoo.
Example:AServerThatCalculatesSquares
Theclientwillopenacommunicationslink(socket)totheserver,whichhasaknownIP
address.Theserverwillengageinashorthandshake(exchangeofstrings)andthenexpect
toreceiveanumberfortheclient.Theclientwillsendaninteger,theserverwillreceiveit,
square it, and send back the answer. This simple exchange is really the basis for all
communicationsbetweencomputers:onemachinesendsinformation,theotherreceivesit,
processesit,andreturnsareplybasedonthedataitreceived.
Theclient:willbegintheconversation.Itcreatesaconnection,asocket,totheserver
using the socket() function of the socket module. Protocols must be specified, and the
mostcommononeswillbeused:
importsocket
HOST=ꞌ19*.***.*.***ꞌ#Theremotehost
PORT=50007#Thesameportasusedbytheserver
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect((HOST,PORT))
Port 50007 is used because nothing else is using it. Now the client starts the
conversation,justasitappearsatthebeginningofthissection:
s.send(bꞌHithere!ꞌ)
Thesend()functionsendsthemessagepassedasaparameter.Thestring(asbytes)is
transmitted to the server through the variable s, which represents the server. The client
nowwaitsfortheconfirmationstringfromtheserver,whichshouldbe“Hello.Nicetosee
you.”Theclientcalls:
data=s.recv(1024)
whichwaits for a response from the server.This response will be 1024 bytes long at
most,anditwillwaitonlyforashorttime,atwhichpointitwillgiveupandanerrorwill
bereported.Whenthisclientgetstheresponse,itproceedstosendnumberstotheserver.
Theyareconvertedintothebytestypebeforetransmission.Inthisexampleitsimplyloops
through100consecutiveintegers:
foriinrange(0,100):
data=str(i).encode()
s.send(data)
Aftersendingtotheserveritwaitsfortheanswer.Actuallythat’sapartofthereceive
function:
data=s.recv(1024)
after100integerstheloopendsandtheconnectionisclosed:
s.close()
The Server: is always listening. It creates a socket on a particular port so that the
operatingsystemknowssomethingispossiblethere,butbecausetheservercannotpredict
whenaclientwillconnectorwhatclientitwillbeitsimplylistensforaconnection,by
callingafunctionnamedlisten():
importsocket
fromrandomimport*
HOST=ꞌꞌ#Anullstringiscorrecthere.
PORT=50007
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.bind((HOST,PORT))
s.listen()
AF_INETandSOCK_STREAMareconstantsthattellthesystemwhichprotocolsare
beingused.Thesearethemostcommon,butseethedocumentationforothers.Thebind()
andthelisten()functionsarenew.Associatingthisconnectionwithaspecificportisdone
usingbind().Thetuple(HOST,PORT)saystoconnectthishosttothisport.Theempty
stringforHOSTimpliesthiscomputer.
Thelisten()callstartstheserverprocess,thisprogram,acceptingconnectionswhenasked.
A process connecting on the port that was specified in bind() will now result in this
process, the server, being notified. When a connection request occurs, the server must
acceptitbeforedoinganyinputoroutput:
conn,addr=s.accept()
Inthe tuple (conn, addr) that isreturned, conn represents the connection, like a file
descriptorreturnedfromopen(),andisusedtosendandreceivedata;addristheaddress
ofthesender,theclient,andisastring.Iftheaddrwereprinted:
print(“Connectedto“,addr)
ItwouldlooklikeanIPaddress:
Connectedto423.121.12.211
Nowtheservercanreceivedataacrosstheconnection,anddoessobycallingrecv():
data=conn.recv(1024)
print(“Serverheardꞌ“,data,“ꞌ“)
Theparameter1024specifiesthesizeofthebuffer,orthemaximumnumberofbytes
thatcanbereceivedinonecall.Thevariabledataisoftypebytes,justastheparameterto
send() was in the client. The client was the first to send, and it sent the message “Hi
there!” That should be the value of data now, if it has been received properly. The
responsefromtheservershouldbe“Hello,nicetoseeyou.”
conn.send(bꞌHello.Nicetoseeyou.ꞌ)
Thesameconnectionisusedforsendingandreceiving.
Nowtherealdatagetsexchanged.Theserverwillacceptintegers,sentasbytes.Itwill
squarethemandtransmittheanswerback.
whileTrue:
data=conn.recv(1024)#Readtheincomingdata
ifdata:
i=int(data)#Convertittointeger
print(“Received“,i)
data=str(i*i).encode()#Squareitandconverttobytes
conn.send(data)#Sendtotheclient
Theservercantellwhentheconnectionisclosedbytheclient,butitisalsopolitetosay
“Goodbye”somehow, perhaps by sending aparticular code.If the loopever terminates,
theservershouldclosetheconnection:
conn.close()
This is a pretty good example of a data exchange and a contract, because there are
specified requirements for each side of this conversation which will result in success if
done correctly and failure if messed up. Failure is sometimes indicated by an error
message,oftenatimeout where theclient or server was expecting something thatnever
arrived.Inothercasesfailureisnotformallyindicatedatall;theprogramsimply“hangs”
thereanddoesnothing.Ifatanytimebothprocessesaretryingtoreceivedata,thenthe
programwillfail.
Figure13.2showsthecommunicationbetweentheclientandtheserverasadiagram.If
theclientandtheserverareatanytimebothtryingtoacceptdatafromtheconnection,
thentheprogramwillfail.Inthediagramalldatatransferscanbeseenastransmit-accept
pairsbetweenthetwoprocesses,andasread-writepairswithintheserverandwrite-read
pairswithintheclient.
The FTP protocol can now be seen as a socket connection, wherein the client sends
strings (commands) to the server, which parses them, carries out the request, and then
sendsanacknowledgementback.
13.4 TWITTER
ForthefewpeoplewhomaybeunfamiliarwithTwitter,itisasocialmediaservicethat
allowsitsuserstosendshort(140character)messagesouttotheworld,orreallytotheir
subscribedlisteners.Fromitsbeginningin2006,Twitterhasgrowntothepointwhereit
handleshundredsofmillionsofmessages(tweets)perdayfromtheir302millionactive
users. It differs from Email in that it broadcasts messages, and the recipients are self-
selected.
ThemessagesareenteredbyTwitterusers,eachofwhomhasanaccount.Allmessages
becomepartofastream,andtheonesthataparticularuserwantstoseearepulledfrom
that stream and placed on the user’s feed. It is, however, possible to see the feed and
examinemessagesastheyaresent,collectingdataoridentifyingpatterns.Twitterallows
accesstothestream,butwhenusingPythonitrequirestheuseofamodulethatmustbe
downloadedandinstalled.That
moduleiscalledtweepy.
Awarning:settinguptheauthenticationsothattheTwitterstreamcanbeaccessedis
notsimple.Atwitteraccountisneeded,anapplicationhastoberegistered,andtheapp
mustbespecifiedasbeingabletoread,write,anddirectmessages.Twitterwillcreatea
unique set of keys that must be used for the authentication: the consumer key and
consumersecretkey,thentheaccesstokenandaccesssecrettoken.Again,itdoesnotpay
toaskwhybecauseitsimplymustbedone.Howisabetterquestion.
Atweetislimitedto140characters,butthatonlyconsiderscontent.Theamountofdata
sentinatweetissubstantiallylargerthanthat,6000bytesormore.That’sduetothelarge
amountofmetadata,ordescriptiveinformation,inatweet.Mostpeopleneverseethat,but
aprogramthatreadstweetsandsiftsthemforinformationwillhavetodealwithit.The
twitterinterfacereturnstweetdatainJSONformat(JavaScriptObjectNotation)whichisa
standardforexchangingdata,similarinpurposetoXML.Thisformathasto beparsed,
butasecondPythonmodulenamedjsonwilldothatsonofurtherdiscussionofJSONwill
benecessary.
Example:ConnecttotheTwitterStreamandPrintSpecific
Messages
This program will examine the twitter feed and print messages that have the term
“startrek” in them. It is useful to see that once again, authentication is one of the first
thingstodo.InthecaseofTweepyanobjectiscreated,passingtheauthenticationstrings.
First:
importtweepy
importjson
#Authenticationdetailsfromdev.twitter.com
consumer_key=ꞌgetyourownꞌ
consumer_secret=ꞌgetyourownꞌ
access_token=ꞌgetyourownꞌ
access_token_secret=ꞌꞌgetyourownꞌ
authentication=tweepy.OAuthHandler(consumer_key,
consumer_secret)
authentication.set_access_token(access_token,
access_token_secret)
Nowsomethingdifferentisneeded.Tweepywantstohaveanobjectpassedtoitthatis
asubclassofonethatitdefines,StreamListener.Asapartofthedealthatismadewith
Tweepy,theclassmusthaveamethodnamedon_data()andanothernamedon_error().
Theon_data()methodiscalledbyTweepywhenthereisdatainthestreamtoberead,and
thedatais passedas astringin JSONformat; theon-error()method iscalled whenan
erroroccurs,andispassedastringwiththeerrormessage.Creatingthissubclasswillbe
describedalittlelater.However,assumethatitiscalledtweet_listener.Thenextstepin
theprocessistocreateaninstanceofthisclass:
listener=tweet_listener()
Through this class instance the stream will be accessed. Now tell Tweepy what this
instanceissoitcanuseit.Alsodotheauthentication:
stream=tweepy.Stream(authentication,listener)
FinallytellTweepywhattoextractfromtheTwitterstream.Forthisexample,thecallis:
stream.filter(track=[ꞌStarTrekꞌ])
butotherpartsofthestreamcanbeaccessedandsenttothisprogram:timesand dates,
locations, etc. In this case the track argument looks into the message text for the “Star
Trek”string,caseinsensitive.Multiplesearchstringscanbeplacedinthelist:[“startrek”,
“casablanca”].
OK,whataboutthetweet_listenerclass?ItisasubclassofStreamListener,aswassaid
earlier.Theon_data()methodneedstoparsetheJSON-formattedstringitispassedand
printthepartsofthemessagethataredesired.Sincethefilter()callrestrictsthemessages
tothosecontainingthestring“startrek,”allthathastobedoneinthismethodistoprint
thebodyofthemessage.Hereistheclassshowingthemethod;theexplanationfollows:
classtweet_listener(tweepy.StreamListener):
defon_data(self,data):
#TwitterreturnsdatainJSONformat-decodeitfirst
dict=json.loads(data)
print(dict[ꞌuserꞌ][ꞌlocationꞌ])
print(dict[ꞌuserꞌ][ꞌscreen_nameꞌ],dict[ꞌtextꞌ])
returnTrue
defon_error(self,status):
print(status)
TheparameterdataisinJSONformat.Toconvertitintosomethinguseable,passitto
thejson.loads()method.ItreturnsaPythondictionarywiththedataavailable,indexedby
fieldname.ThedatastructureusedbyTwitteriscomplex,andisshowninsmallpartin
Table13.1.Theleftsideofthetableshowsthemessagefieldnames,therightlistssome
oftheuserfields;userisafieldwithinthemessagethatdescribesthesender.Thevariable
dictistheresultingdictionary.
To simply solve the problem posed, all that would have to be done is to print
dict[‘text’],whichisthemessagebody.Thevalueofdict[‘user’]isthedataforthesender
ofthemessage.Thereisalotofthat,mostlynotusefultoanyonebutanappdeveloper
(e.g., background color of the user’s window), but dict[‘user’]’[‘screen_name’] is the
Twitteridentityofthesender,anddict[‘user’][‘location’]oftenindicateswheretheyare.
Itwouldbepossibletocollectdataonwherethelargestnumberoftweetsarebeingsent
from, what kind of information is being conveyed, and in this way perhaps develop an
earlywarningsystemforeventshappeningintheworld.
13.5 COMMUNICATINGWITHOTHERLANGUAGES
Pythonisterrificformanythings,butitcanbequiteslow.Itisinterpretedandhasalot
ofoverheadformanyofitsfeatures;dynamictypingdoesnotcomecheap.Also,itmaybe
hard to easily access operating system functions from Python. C, C++, and other
languagesdonothavetheseproblems.It’spossibletowriteaprograminPythonthatcalls,
forexample,aCprogramtodocomplexcalculationsofsystemcalls.
Consider the problem of finding the greatest common divisor (GCD) between two
integers; that is, the largest number that divides evenly into both of them. If the GCD
betweenNandMis1,thenthesenumbersarerelativelyprime,andtheycouldfindusein
arandomnumbergenerator.
Example:FindTwoLargeRelativelyPrimeNumbers
ThisproblemwillbesolvedusingaCprogramtodotheGCDcalculationandaPython
programtopassitlargenumbersuntilarelativelyprimeparisfound.TherearemanyC
versionsoftheGCDprogram.Thisisacommonfirst-yearprogrammingassignment.One
suchisgcd.c,providedontheaccompanyingdisc
#include“stdafx.h”
#include<stdio.h>
int_tmain(intargc,_TCHAR*argv[])
{
longn,m;
scanf(“%ld%ld”,&n,&m);
while(n!=m)
{
if(n>m)
n-=m;
else
m-=n;
}
printf(“%d”,n);
return0;
This is written for Visual C++ 2010 Express, but very similar code will compile for
othercompilersandsystems.Thebasicideaisthatitreadstwolargenumbers,namedn
and m, determines their largest common divisor, and prints that number to standard
output. The way that Python will communicate with this C program is through the I/O
system.Creadsfromstandardinput,andwritestostandardoutput.ThePythonprogram
will co-opt input and output, pushing text data containing the values of n and m to the
input,andcapturingstandardoutputandcopyingittoastring.
Thisrequirestheuseofamodulenamedsubprocessthatpermitstheprogramtoexecute
thegcd.exeprogramandconnecttothestandardI/O.AfunctionnamedPopen()takesthe
nameofthefiletobeexecutedasaparameterandrunsit.Italsoallowsthecreationof
pipes,which are data connections thatcan take the place of files. The Popen() call that
runsthegcdprogramis:
p=subprocess.Popen(ꞌgcd.exeꞌ,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE)
ConnectingstdinandstdouttosubprocessPIPEsmeansthatnowPythoncanperform
I/Owiththem.WhenGCDstartstoexecuteitexpectstwointegersoninput.Thesecan
nowbesentfromthePythonprogramlikethis:
p.stdin.write(data)
Theexpressionp.stdinrepresentsthefileconnectiontotheprogram,andwritingtoit
does the obvious thing. The Python program writes data to the C program, and the C
programreadsitfromstdin.Datashouldbeoftypebytes,andshouldcontainbothlarge
numbersincharacterform.Correspondingly,whentheCprogramhasfoundthegreatest
commondivisor,itwritestostandardoutput.CapturingthisinPython:
s=str(p.stdout.readline())
TheCprogramwrites;thePythonprogramreads.Thevaluereturnedisoftypebytes
again,soitisconvertedintoastring.
ThefinalPythonsolutioncallstheCprogramrepeatedlyuntiltheGCDis1:
importsubprocess
n=11111122
m=121
data=bytes(str(n)+ꞌꞌ+str(m),ꞌutf-8ꞌ)
whileTrue:
p=subprocess.Popen(ꞌgcd.exeꞌ,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE)
p.stdin.write(data)
p.stdin.close()
s=str(p.stdout.readline())
print(s)
ifs==“bꞌ1ꞌ“:
print(“Numbersare“,n,m)
break
m=m+1
data=bytes(str(n)+ꞌꞌ+str(m),ꞌutf-8ꞌ)
Thismethodofcommunicatingwithotherlanguagesisquiteuniversal,butslowerthan
passing parameters to functions and methods directly. There are a lot of problems with
calling functions in other languages, not the least of which concerns typing. Python,
dynamically typed interpreted language that it is, would have to have the programmer
performsomesignificantgymnasticstoconvertlistsordictionariesintoaformthatCor
Javacoulduse.
13.6 SUMMARY
Design by contract has designers create formal specifications for components, and
using those involves a kind-of contract or agreement between programmers developing
clientsoftwareandthosewhobuiltthemodulesanddesignedtheprotocols.ForEmail,as
anexample,senderandreceiverhavetoagreeonhowtoencodeanddecodeamessage,
andhowtoaccessitfromthenetwork.Tosendmailbetweendifferentcomputersalways
requires a standard, a scheme that is agreed upon by implementers of the system, a
protocol.
TheSimpleMessageTransferProtocolisaspecificationoftheprocessanddataneeded
tosend anEmail message.The Python modulesmtplibprovides the methods needed to
interfacewiththissystem,whichistosaythatsmtplibimplementsSMTPandprovidesa
programmeraccessatvariousplaces.ForreadingEmailtherearetwoschemesinplay:the
Post Office Protocol (POP) and the more modern Internet Message Access Protocol
(IMAP). An Email client must agree to satisfy one of these. They Python module for
IMAPisimaplib.
The File Transfer Protocol (FTP) is used to move entire files and directories across
networks.Itprovidesthesamesortofinterfacetodataonadistancecomputer
aswouldbeexpectedfromafilesystemonadesktop.Theftplibmoduleoffersmethods
for handling this protocol. After authentication, commands are sent as character strings,
andfilescanbesentorreceivedusinganalogsofreadandwriteoperations.
FTP is built on top of lower level communication primitives such as sockets, which
createbidirectionaldataconnectionsbetweentwoprogramsondifferentcomputers.
Twitter sends out a stream of data containing all of the tweets sent by users, and the
client can scan these for tweets that are of interest (subscribed). This stream can be
capturedusingPythonandtheTweepymodule,andautomaticscanningofthefeedcanbe
doneaccordingtotheuser’sprogram.
It is also possible for Python to communicate with other programs written in other
languagesbyco-optingtheinputandoutputfilesforthoseprogramsandfeedingdatainto
andextractingresultsfromtheI/Ochannels.
Exercises
1.WriteasimpleEmailsendingprogramsendmail.Itwillaskforthedestinationfrom
thekeyboardandacceptthemessagethatwaytoo.Multipledestinationaddressescan
bespecifiedbyseparatingthemwithcommas.Thesender’sEmailaddressandthe
servernameshouldbebuiltintotheprogram.
2.Writeanapplicationthatallowsausertospecifyawordorwordsandwillexamine
theirmailboxforanyEmailthatcontainsthem.ThecorrespondingEmailmessages
willbewrittentoatextfilenamed“search.txt.”
3.WriteaPythonprogramthatwilldownloadthefilesnamed“one.txt,”“two.txt,”and
“three.txt”fromanftpsitespecifiedbyyourinstructorintofilesofthesamenameona
desktopcomputer.
4.Designandcodeaserverprogramthatwilldealpokerhandsusingthesocket
protocol.Whentheserverisconnectedtobyaclient,theclientsendsastring“deal.”
Theserverwillgeneratearandompokerhandandsenditastext:Thespadessuitin
thisschemeis:s1s2s3s4s5s6s7s8s9s10sjsqsk;heartsare“h,”diamondsare“d,”
andclubsare“c.”Writeatestprogramthatreadsandprintstheresultinghands.
5.DIFFICULT.Writeatwo-playerponggameusingsockets.Thegamewilldisplaya
graphicalversionofthestandardPongscreenonthelocalandremotescreens.The
localplayercanmoveonlythelocalpaddle,andmotionsbytheremoteplayerare
reflectedonthelocalscreen.Theballispositionedatthesameplace,asnearlyas
possible,onbothscreenssimultaneously.
6.Therearewordsandphrasesthatmanygovernmentsusetoindicateapotential
securityproblem.Theycanexamineemailsandvarioussocialmediaoutlets.Examine
theTwitterstreamfortweetscontainingthewords“assassination,”“security,”
“weapon,”or“hostage.”Printthetweetandlocationfromwhichitclaimstobesent.
7.HowcantheIPaddressofadistantsitebedetermined?SearchtheInternetforthat
informationasitcanbeimplementedinaPythonprogram,andimplementaprogram
thataskstheuserforaURLandreturnsanIPaddressforthatURL.
NotesandOtherResources
Pythonftplibdocumentation:https://docs.python.org/3/library/ftplib.html
SearchcriteriainIMAP:http://tools.ietf.org/html/rfc3501#section-6.4.4
Tweepdownload:https://pypi.python.org/pypi/tweepy/3.4.0
Tweepyintro:http://docs.tweepy.org/en/latest/streaming_how_to.html
How to use the Twitter API to stream tweets. https://www.youtube.com/watch?
v=pUUxmvvl2FE
Twittermessagefields:https://dev.twitter.com/overview/api/tweets
Twitteruserfields:
JSONTutorial:http://www.w3schools.com/json/
1.ToddCampbell.(2002).TheFirstEmailMessage,
https://www.cs.umd.edu/class/spring2002/cmsc434-
0101/MUIseum/applications/firstemail.html
2.TimBerners-Lee,RobertCailliau,AriLuotonen,HendrykNielsen,andArthurSecret.
(1994).TheWorld-WideWeb,CommunicationsoftheACM,37(8),76–82.
3.TakeshiSakaki,MakotoOkazaki,andYutakaMatsuo.(2010).Earthquakeshakes
Twitterusers:Real-timeeventdetectionbysocialsensors,inProceedingsofthe
19thInternationalConferenceontheWorldWideWeb(WWW’10),ACM,NewYork,
NY,USA,851–860.
CHAPTER14
ABRIEFGLIBREFERENCE
14.1Glibtkinter
14.2Images
14.3DynamicGlib
14.4Video
14.5Audio
14.6Interaction
14.7Other
Inthischapter
The reason that there are two “versions” of Glib is that not all schools and other
institutions will permit installing new modules. The module tkinter is standard with
Python3,butdoesnotbyitselfofferthetoolsformultimedia.Thebasicsystemconsists
ofgraphicsoperations,includingimageinput,output,anddisplay.
Forplaceswhereitispossibletoinstallthepygamemodule,thedynamicGlibmodule
canbeused.Thisoffersthesametoolsasdoesthetkinterversionplushassound,mouse
andkeyboardinteraction,andvideo.
Whatfollowsisdocumentationforbothsystems.
14.1 GLIBTKINTER
BLACK=“#000000”
WHITE=“#ffffff”
RED=“#ff0000”
GREEN=“#00ff00”
BLUE=“#0000ff”
BACKSPACE='\b'
Width()
Returnthewidthofthegraphicswindowinpixels.
Height()
Returntheheightofthegraphicswindowinpixels.
fill(r,g=1000,b=1000)
Turnonfilling.Setthefillcolorforpolygons,textcolortoo,to(r,g,b)orjustrifone
parameterisgiven.
nofill()
Turnfillingoff.
stroke(r,g=1000,b=1000)
Set the line and outline color to (r,g,b). If one parameter is given, then it is a grey
level.
nostroke()
Turnoffoutlinedrawing.
ellipsemode(z)
Setthemodefordrawingellipses.Thedefaultmodefordrawingellipsesisreferredto
as CENTER mode, where the center of the ellipse is given. There are three others:
RADIUSmode,inwhichthewidthandheightparametersrepresentsemi-majorandsemi-
minor axes; CORNER mode in which the upper left corner is specified instead of the
center; and CORNERS mode, in which the upper left and the lower right corner of the
boundingboxarespecified.
rectmode(z)
Setthemodefordrawingrectangles.Thedefaultmodefordrawingellipsesisreferred
to as CENTER mode, in which x, y are the coordinates of the center; w and h are the
widthandheight.InRADIUSmodex,yarethecoordinatesofthecenter;wandrarethe
horizontalandverticaldistancestoanedge.InCORNERmodex,yarethecoordinatesof
theupperleftcorner;wandharethewidthandtheheight.InCORNERSmodex,yare
the coordinates of the upper left corner; w and h are the coordinates of the lower right
corner.
ellipse(xpos,ypos,width,height)
Drawanellipse.Alsousedforcircles.Fourmodesasdescribedinellipsemode.
line(x0,y0,x1,y1)
Draw a line using the current stroke color between screen coordinates (x0, y0) and
(x1,y1).
point(x,y)
Drawapoint(pixel)atlocation(x,y)inthecurrentfillcolor.
rect(xpos,ypos,x2,y2)
Drawarectangle.Same4modesasellipse.Fillwiththecurrentfillcolor.
triangle(x0,y0,x1,y1,x2,y2)
Drawatrianglespecifiedbythreepoints.
bold()
Setfonttobold.
italic()
Setfonttoitalic.
normal()
Setthefontweightandslanttonormal(notbold)androman(notitalic).
setfont(s)
Setthefontfamilytothenamegiveninthestringparameters.Defaultis“Helvetica.”
text(s,x,y)
Drawthetextstringsstartingatscreencoordinate(x,y).
quad(x0,y0,x1,y1,x2,y2,x3,y3)
Drawaquadrilateralhavingthefourcornersbeing(x0,y0),(x1,y1),(x2,y2)and(x3,y3).
strokeweight(n)
Setthethicknessofdrawnlinesandstrokestonpixels.
arc(x0,y0,x1,y1,start,angle,s=ARC)
Draw anarc.An arc is defined as a portion of an ellipse from a starting angle for a
specified number of degrees, as referenced from the center of the ellipse. The angle 0
degrees is horizontal and to the right; 90 degrees is upwards (decreasing Y value). The
ellipse is defined by a bounding rectangle, specifying the upper left and lower right
coordinatesofaboxthatjustholdstheellipse.ThefinalparameterbeingARCmeansto
draw only the curve; if it is CHORD then it will draw the curve and a line joining the
endpoints.PIESLICEdrawsthearcandlinesfromtheendpointsofthearctothecenterof
theellipse.
cvtColor(z)
Takethegreyvaluezandreturnacolorobjectthathasred=z,green=z,andblue=z.
cvtColor3(r,g,b)
Take thecolor componentvaluesr, g,and b(red, green, andblue) andreturn acolor
objecthavingthosevalues.
background(r,g=1000,b=1000)
Set the background color to the color given by components (r,g,b). This effectively
clearsthegraphicswindowaswell.
textsize(n)
Changethesizeoftexttonpixelshigh.
14.2 IMAGES
loadImage(s)
Readtheimagefromthefilewhosenameisstoredintheparametersandreturnitas
thevalue.
image(im,x,y)
Displaytheimageimatwindowcoordinate(x,y).
copyImage(x)
Returnacopyoftheimageparameterx.
getpixel(im,i,j)
Returnthecolorofthepixelatlocation(i,j)oftheimageim.
setpixel(im,i,j,c)
Setthepixelvalue(color)oftheimageimatcoordinates(i,j)tothecolorc.
grey(c)
Returnthegreylevelequivalentofthecolorc.Itaveragesthethreecolorcomponents.
red(c)
Extractandreturntheredcomponentofthecolorpassedasc.
green(c)
Extractandreturnthegreencomponentofthecolorpassedasc.
blue(c)
Extractandreturnthebluecomponentofthecolorpassedasc.
save(im,s)
Savetheimageiminafilenamedbythestrings.
imageSize(s)
Acquire the size (width, height in pixels) of an image that exists as a file named s.
Return a tuple with the (width,height). This can be executed outside of the startdraw–
enddrawblocksothatstartdraw()canopenawindowofasizeappropriatetoanimage.
startdraw(xs=width,ys=height)
Thisindicatesthebeginningofasectionofcodewithinwhichdrawingoperationswill
beperformed.Itopensawindowonthescreenwithawidthofxspixelsandaheightofys
pixels.Itsetsupatitleonthewindowanddoessomeinitializations,suchassettingupa
font,settingdrawingmodes,andestablishingabackgroundcolor.
enddraw()
Forusersoftkinter,allthisfunctiondoesistocallmainloop().Foreveryoneelse,this
functionmustbecalledinorderthatcontrolcanbepassedtothedrawingfunctionsand
thattheuser’sdrawingcanbedisplayedinthegraphicswindow.Itistheclosetotheblock
ofcodebegunwiththecalltostartdraw().
14.3 DYNAMICGLIB
Symbolicnamesforspecialcharacters(LEFT,etc.)Theseareglobalvariablesthatare
usedbytheuser.
K_BACKSPACE =pygame.K_BACKSPACE
K_TAB =pygame.K_TAB
K_CLEAR =pygame.K_CLEAR
K_RETURN =pygame.K_RETURN
K_PAUSE =pygame.K_PAUSE
K_ESCAPE =pygame.K_ESCAPE
K_SPACE =pygame.K_SPACE
K_EXCLAIM =pygame.K_EXCLAIM
K_QUOTEDBL =pygame.K_QUOTEDBL
K_HASH =pygame.K_HASH
K_DOLLAR =pygame.K_DOLLAR
K_AMPERSAND =pygame.K_AMPERSAND
K_QUOTE =pygame.K_QUOTE
K_LEFTPAREN =pygame.K_LEFTPAREN
K_RIGHTPAREN =pygame.K_RIGHTPAREN
K_ASTERISK =pygame.K_ASTERISK
K_PLUS =pygame.K_PLUS
K_COMMA =pygame.K_COMMA
K_MINUS =pygame.K_MINUS
K_PERIOD =pygame.K_PERIOD
K_SLASH =pygame.K_SLASH
K_0 =pygame.K_0
K_1 =pygame.K_1
K_2 =pygame.K_2
K_3 =pygame.K_3
K_4 =pygame.K_4
K_5 =pygame.K_5
K_6 =pygame.K_6
K_7 =pygame.K_7
K_8 =pygame.K_8
K_9 =pygame.K_9
K_COLON =pygame.K_COLON
K_SEMICOLON =pygame.K_SEMICOLON
K_LESS =pygame.K_LESS
K_EQUALS =pygame.K_EQUALS
K_GREATER =pygame.K_GREATER
K_QUESTION =pygame.K_QUESTION
K_AT =pygame.K_AT
K_LEFTBRACKET =pygame.K_LEFTBRACKET
K_BACKSLASH =pygame.K_BACKSLASH
K_RIGHTBRACKET =pygame.K_RIGHTBRACKET
K_CARET =pygame.K_CARET
K_UNDERSCORE =pygame.K_UNDERSCORE
K_BACKQUOTE =pygame.K_BACKQUOTE
#K_a =pygame.K_a
K_b =pygame.K_b
K_c =pygame.K_c
K_d =pygame.K_d
K_e =pygame.K_e
K_f =pygame.K_f
K_g =pygame.K_g
K_h =pygame.K_h
K_i =pygame.K_i
K_j =pygame.K_j
K_k =pygame.K_k
K_l =pygame.K_l
K_m =pygame.K_m
K_n =pygame.K_n
K_o =pygame.K_o
K_p =pygame.K_p
K_q =pygame.K_q
K_r =pygame.K_r
K_s =pygame.K_s
K_t =pygame.K_t
K_u =pygame.K_u
K_v =pygame.K_v
K_w =pygame.K_w
K_x =pygame.K_x
K_y =pygame.K_y
K_z =pygame.K_z
K_DELETE =pygame.K_DELETE
K_KP0 =pygame.K_KP0
K_KP1 =pygame.K_KP1
K_KP2 =pygame.K_KP2
K_KP3 =pygame.K_KP3
K_KP4 =pygame.K_KP4
K_KP5 =pygame.K_KP5
K_KP6 =pygame.K_KP6
K_KP7 =pygame.K_KP7
K_KP8 =pygame.K_KP8
K_KP9 =pygame.K_KP9
K_KP_PERIOD =pygame.K_KP_PERIOD
K_KP_DIVIDE =pygame.K_KP_DIVIDE
K_KP_MULTIPLY =pygame.K_KP_MULTIPLY
K_KP_MINUS =pygame.K_KP_MINUS
K_KP_PLUS =pygame.K_KP_PLUS
K_KP_ENTER =pygame.K_KP_ENTER
K_KP_EQUALS =pygame.K_KP_EQUALS
K_UP =pygame.K_UP
K_DOWN =pygame.K_DOWN
K_RIGHT =pygame.K_RIGHT
K_LEFT =pygame.K_LEFT
K_INSERT =pygame.K_INSERT
K_HOME =pygame.K_HOME
K_END =pygame.K_END
K_PAGEUP =pygame.K_PAGEUP
K_PAGEDOWN =pygame.K_PAGEDOWN
K_F1 =pygame.K_F1
K_F2 =pygame.K_F2
K_F3 =pygame.K_F3
K_F4 =pygame.K_F4
K_F5 =pygame.K_F5
K_F6 =pygame.K_F6
K_F7 =pygame.K_F7
K_F8 =pygame.K_F8
K_F9 =pygame.K_F9
K_F10 =pygame.K_F10
K_F11 =pygame.K_F11
K_F12 =pygame.K_F12
K_F13 =pygame.K_F13
K_F14 =pygame.K_F14
K_F15 =pygame.K_F15
K_NUMLOCK =pygame.K_NUMLOCK
K_CAPSLOCK =pygame.K_CAPSLOCK
K_SCROLLOCK =pygame.K_SCROLLOCK
K_RSHIFT =pygame.K_RSHIFT
K_LSHIFT =pygame.K_LSHIFT
K_RCTRL =pygame.K_RCTRL
K_LCTRL =pygame.K_LCTRL
K_RALT =pygame.K_RALT
K_LALT =pygame.K_LALT
K_RMETA =pygame.K_RMETA
K_LMETA =pygame.K_LMETA
K_LSUPER =pygame.K_LSUPER
K_RSUPER =pygame.K_RSUPER
K_MODE =pygame.K_MODE
K_HELP =pygame.K_HELP
K_PRINT =pygame.K_PRINT
K_SYSREQ =pygame.K_SYSREQ
K_BREAK =pygame.K_BREAK
K_MENU =pygame.K_MENU
K_POWER =pygame.K_POWER
K_EURO =pygame.K_EURO
K_A =65
K_B =66
K_C =67
K_D =68
K_E =69
K_F =70
K_G =71
K_H =72
K_I =73
K_J =74
K_K =75
K_L =76
K_M =77
K_N =78
K_O =79
K_P =80
K_Q =81
K_R =82
K_S =83
K_T =84
K_U =85
K_V =86
K_W =87
K_X =88
K_Y =89
K_Z =90
noloop()
Stopcallingthedrawfunctionrepeatedly.Thishastheeffectofceasinganydynamic
activity.
defsize(xs,ys):
globalwidth,height,canvas
width=xs
height=ys
canvas=pygame.display.set_mode((xs,ys),
pygame.DOUBLEBUF,32)#Makethesketchwindow
pygame.display.set_caption('Drawing')
14.4 VIDEO
loadVideo(s)
Loadsavideofromthefilenamedbythestringsandreturnsareferencetothatvideo
asaGvideoclassinstance.Thisinstanceneedsneverbemanipulatedbytheprogrammer,
onlypassedtootherfunctions.
playVideo(m)
Playthevideorepresentedbythevariablem.
pauseVideo(m)
Pause(orunpauseifpaused)thevideorepresentedbythevariablem.
stopVideo(m)
Stopplayingthevideorepresentedbythevariablem.
rewindVideo(m)
Placethevideombackatthebeginningframe.
isVideoPlaying(m)
ReturnsTrueifthevideomiscurrentlyplayingandFalseifnot.
setVideoVolume(m,v)
Settheaudiovolumeofthevideomtoalevelbetween0(off)and1
(maximum).
lengthVideo(m)
Returnthelengthofthecurrentvideo,inseconds.
whereVideo(m)
For the video m, return the number of seconds that the video has been playing (i.e.,
currentposition)
getVideoFrame(m)
Returnthenumberofthecurrentframeplayinginthevideom.
setVideoFrame(m,f)
Setthepositionofthevideomsothatthenextframetoplaywillbenumberf.
getVideoPixel(m,x,y)
Returnthepixelatlocation(x,y)inthevideoframefrommcurrentlybeingdisplayed.
sizeVideo(m)
Returnthesizeofaframe(widthxheightinpixels)ofthevideom.
locVideo(m,x,y,w,h)
Placethevideomatposition(x,y)onthegraphicswindow,andgiveitsize(w,h).
videoSize(s)
Return the size (width, height) of the video in the file s without loading it. For
specifyingthesizeofthegraphicswindow.
14.5 AUDIO
loadSound(s)
Loadasoundfilewherethefilenameisinstrings.Returnanobjectreference(handle)
tothatfile.
playSound(a,loop=0)
Playthesoundindicatedbya.
defstopSound(a):
Stopplayingthesoundindicatedbya.
volumeSound(a,v)
Setthevolumeofsoundatoavaluebetween0(off)and1(maximum).
durationSound(a)
Returnthelengthofthesoundsinseconds.
14.6 INTERACTION
mouse()
Returnthecoordinatesofthemousecursorasatuple(x,y).
14.7 OTHER
capture(s)
Savetheimageinthecurrentgraphicswindowastheimagefiles.
TheGlibmodulecontainstwoclassesforinternaluse,animageclassandavideoclass.
Theyaredescribedhereonlyforcompleteness,butifyouplantousetheirinternalcodein
anywaypleasereaditcarefullyfromthesourcefileandbecertainthatyouunderstandit.
Animageclass:
classGimage:
im=None#Image
pixels=None#Pixelsdata
w=h=0
defsetIm(self,x):
defget(self,x,y):
defset(self,x,y,c):
defsave(self,s):
defdraw(self,canvas,x,y):
Avideoclass:
classGvideo:
def__init__(self,name):
defloadVideo(self,s):
defplay(self):
deflocVideo(self,x,y,w,h):
defdraw(self):
defcopy(self,x,y,w,h):
defstop(self):
defrewind(self):
defis_playing(self):
deflength(self):
defwhere(self):
defget_frame(self):
defpause(self):
defget_frame_number(self):
defdraw_frame(self,f,iplay=0):
defget_pixel(self,x,y):
defauto_play(self):
defset_volume(self,v):
INDEX
A
A
Accessor,224
Accumulator,14
LoadAccumulator,14
Acousticdelaylines,11
Addresses,12
Advanceddatafiles,299–317
Binaryfiles,299–301
EXE,317
HyperTextMarkupLanguage(HTML),316–317
Randomaccess,304–306
Maintainingthehighscorefileinorder,305–306
Standardfiletypes,306–315
GIF,307–308
Imagefiles,306–307
JPEG,309–310
PNG,312–314
Soundfiles,314–315
TaggedImageFileFormat(TIFF),310–312
WAVfile,314
Structmodule,301–303
Aliasing,161
Alohanet,23
AmericanStandardCodeforInformationInterchange(ASCII),27
Analyticalengine,3
Assignmentstatements,432
B
Babbage,Charles,3
Basicalgorithms,361–398
Bernoullinumbers,4
Binarynumbers,6
Arithmeticin,9–11
Converttodecimal,8
Bit,8
Bootloader,18
BranchifAccumulatorisZero(BAZ),17
Bristow,Steve,452
Buffering,200
Bushnell,Nolan,452
Button,21
C
Calculationsbymachine,2–3
Changetheworkingdirectory(CWD),490
Checkbox,21
Cipher,399
Class,217–248
Datatypesand,228–243
Abouncingball,231–237
Adeckofcards,229–231
Basicdesign,237–238
Cat-a-pult,237–243
Detaileddesign,238–243
Ducktyping,246–248
Subandinheritance,243–246
Objectsinavideogame,243–246
Typesand,219–228
Areallysimple,223–227
Encapsulation,227–228
ClearAccumulator(CLA),17
Client-serversystem,26
Codetable,393
Collision,23
Compression,382–396
Huffmanencoding,385–392
Huffmanalgorithm,388
LZW,392–396
Runlengthencoding,383
Computernetworks,22–26
Internet,24–25
WorldWideWeb,25–26
Computersystemlayers,17–22
Assemblersandcompilers,18–19
Graphicaluserinterfaces(GUIs),19–22
Concatenate,107
Constructor,221
Core,12
Coredump,12
Crossover,418
Cryptography,376–382
One-timepad,378–379
Plaintextandciphertext,377
PublicKeyEncryption(RSA),379–382
Statecipher,377
Streamcipher,377
Symmetrickey,377
D
Datastructures,432
Decimalnumbers,8
Converttobinary,8–9
Delta,406
Designbycontract,504
Dictionary(Python),287–293,304,391
AnaiveLatin–Englishtranslation,289–291
Functionsfor,291–292
Loopsand,292–293
Differentiation,406–407
2and4-pointfunctionversions,407
Documentation,55–57
Docstring,57
Drop–downlist,21
E
Email,481–497
Communicationbetweenprocesses,492–497
Communicationbetweentheclientandtheserverprocesses,497
Packetsandportnumber,493
FileTransferProtocol(FTP),490
Ftplib,490
Reading,484–490
Procedureforsending,485
Usefultags,486–487
READMEFile,491
Sendan,481
Smtplib,482
Endoffilecondition,201
Epoch,376
Equationroots,404–406
Commonconcepts,406
Event,325
Exception,92,199
F
Fetch,12
Fetch-executecycle,13,17
File(s),189–213
Alittletheory,191–195
Filestorageondisk,194–195
Slowfileaccess,195
Commonthingsin,190–191
Keyboardandinput,195–197
Listof,190
Writingto,211–213
Appendingdatatoa,212–213
Filedescriptororhandle,198
FormattedtextandI/O,294–299
NASAmeteoritelandingdata,295–299
Function(s),144–184
Definition,144–149
Generalizethehistogramcodeforotheryears,147–149
Parameterorargument,146
Syntaxofa,145
Usepoundntodrawahistogram,146–147
Execution,149–170
Defaultparameters,156–158
Functionsasreturnvalues,168–170
Gameofsticks,158–161
None,158
Parameters,153–156
Returningavalue,150–153
Scope,161–163
Variableparameterlists,163–165
Variablesasfunctions,165–168
Recursion,170–176
Avoidinginfinite,175–176
Programdesign(Nimgame),178–184
Developmentprocessexposed,182–184
G
Glib,255,283,507–519
Audio,518
Dynamic,512–516
Functionsandconstants,283–284
Images,510–512
Interaction,518–519
Playavideo,353–355
Staticanddynamic,324
tkinter,507–510
Ellipsemode,509
Usefulfunctionsin,351–353
Video,516–517
Graphics,254–280
Graphicprogramming,254–280
Generativeart,277–280
Histogramexample,264–268
Images,271–277
Identifyingacar,274–275
Pixels,273–274
Thresoldingexample,275–276
Transparency,276–277
Linesandcurves,260–262
Piechartexample,268–271
Poixellevelgraphics,257–260
Polygons,262–263
Text,263–264
Windowandcolors,255–257
Guessanumber,37,39–40,50–51,71–72,
Solvingproblem,40–41,94–96
H
Hashing,396–398
djb2,397–398
sdbm,398
Headcrash,193
I
Icon,21
IfStatements,52–55,432
Else,54–55
Infiniteloop,70
Infiniterecursion,175
Initialguess,406
Interpreter,19
Integration,408–410
Instructionregister,14
InternetMessageAccessProtocol(IMAP),485
InternetProtocol(IP),24
Iteration,184,406
Maximum,406
Iterativerefinement,184,432
L
Linearprogramming,167
List(s),123–135
Append,126–127
Count,131
Editing,125–126
Exceptions,133–135
Extend,127
Index,128
Insert,126
Listcomprehension,131–132
Pop,128–129
Remove,127–128
Reverse,130–131
Sort,129–130
Tuplesand,132–133
Loading,17
Longestcommonsequence(editdistance),421–427
DeterminingLongestCommonSubsequence(LCS),422–427
EditdistanceorLevenshteindistance,421
Smith–Watermanmethod,422
Loops,67,432
Counting,78–79
Forloop,78
Nested,84–86
M
Memory,11–13
MersenneTwisteralgorithm,376
Method,220
Multimedia,323–356
Animation,335–346
Aballinabox,336–338
ChangeBackgroundColorUsingtheMouse,327–328
DrawaCircleattheMouseCursor,325–327
Framewithtwoexamples,340–346
Manyballsinabox,338–340
Object,336–340
Keyboard,330–335
Pressinga“+”CreatesaRandomCircle,331–334
ReadingaCharacterString,334–335
Mouseinteraction,324–330
Button,329–330
DrawLinesUsingtheMouse,328–329
Mousebuttons,328–330
RGBAcolors,346–347
Sound,347–350
Controllingvolume,349
Playasound,348
Playasoundeffect,349–350
Video,350–355
PlayvideousingGlib,353–355
Processingpixels,355–356
UsefulfunctionsinGlib,351–353
MultipurposeInternetMailExtensions(MIME)standard,482
Mutation,418
Mutatorsorsetters,224
N
NetworkAccessPoint(NAP),25
Newline,200
Newton’smethod,404
O
Objectorientedprogramming–breakout,452–453
Optimization:maximaandminima,410–420
Evolutionaryorgeneticalgorithm,416–420
Goldstein–Pricefunction,417
Fittingdatatocurves-regression,413–416
Newton’smethod,411–413
P
Parameterorargument,146
PDP–8,14
Predefinednamesorsystemvariables,39
Point,136
PointofPresence(POP),25
PostOfficeProtocol(POP),485
Primeornon–primenumber,79–84
Else,83–84
Exitingfromaloop,82–83
Problemasprocess,453–470
Ballandpaddlecollusions,466–467
Ballandtilecollusions,463–466
Collectingtheclasses,461–462
Developingthepaddle,462–463
Finishingthegame,467–470
Initialcodingforatile,456–457
Initialcodingfortheball,459–461
Initialcodingforthepaddle,457–459
Proceduralprogramming,433–452
Centering,443–445
Commands,442
Filling,443
Listofsystemcommands,433–434
Othercommands,447–452
Pseudocode,434
Rightjustification,445–447
Top–down,434–443
Program,4
Programcounter,13
Programmability,4
Programminglanguagecommunication,502–504
Greatestcommondivisor(GCD),502
Prompt,38
Publicvariables,248
Puzzles,74
PyCharm,38
Pygame,324
Python,19,36–62,69,75,78,90,92,101,107,109,121,123,135,144,153,163,
177,191,219,228,230,272,286,293,318,323,369,373,394,470,479,485,498,
502,504,507
Arrays,293
Class-SyntaxandSemantics,221–223
Creatingmodules,176–178
Dictionaries,287–293
Executing,37–39
List,123–135
PythonGUI,38
Usingfilesin,197–211
CommaSeparatedVariable(CSV),205
Commonfileinputoperations,202–205
Endoffile,201
Filenotfoundexceptions,199
Openafile,198–200
Playjeopardy,208–210
Printtheplanetname,205–207
Readingfromfiles,200–211
Thewidthstatement,210–211
Q
Quantization,30
R
Randomnumber(s),74–78
Built–infunction,75
Functioncall,76
Generation,373–376
Linearcongruentialmethod,374–376
Registers,12
Repetition,67–96
Drawingahistogram,86–89
Exceptionsanderrors,90–96
Loopsingeneral,89–90
Representation,26–31
Reservedwords,39
Retrieveafile(RETR),490
Rock–paper–scissors,37,40,57–62,73–78
Solvingproblem,41–42
Exchanginginformationwithcomputers,45–46
Numberbases,48–50
Strings,integers,andrealnumbers,47–48
VariablesandvaleswithGUI,42–45
Typesaredynamic(advanced),60–62
Rulesforprogrammers,470–477
S
Sampling,29
Script,sourcecodeorcomputerprogram,36
Searching,369–373
Binarysearch,371–373
Linearsearch,371
Timings,370–371
Secondarystorage,192
Swendorstoreafile(STOR),490
Sequence,102
Settypes,135–138
Crap,136–138
SimpleMailTransferProtocol(SMTP),480
Simpson’sRule,410
Slice,106
Slider,21
Sorting,361–369
Mergesort,365–369
Propertiesof,365
Selectionsort,362–365
Squareroot,76
Statements,44
Storedprograms,13–17
String(s),102–114
Comparing,103–105
Editing,107–110
Forloops,113
Methods,110–112
Slicing,105–107
Spanningmultiplelines,112–113
Stubs,477
Synthesisprogramming,432
T
TheWhileStatements,69–73
Modifyingthegame,72–73
Tracks,194
TransportLayerSecurity(TLS),483
Transistor,7
Tuple(s),78,115–123
Assignment,121–122
Built–infunctionsfor,122–123
Delete,119–120
Inforloops,116–118
Membership,118–119
Update,120–121
Turing,Alan,13
Twitter,497–501
Fieldsinamessage,500