Chapter 5 Iteration and Loops
One of the main benefits of using computers to perform tasks is that computers never get tired or bored, and so can do the same thing over and over and over and over and over and over again. This is a process known as iteration. Iteration represents another form of control flow (similar to conditionals), and is specified in a program using a set of statements called loops. In this module, you will learn the basics of writing loops to perform iteration; more advanced iteration concepts will be covered in later modules.
Contents
- Resources
- While Loops
- Counting and Loops
- Conditionals and Sentinels
- For Loops
- Difference from While Loops
- Working with Files
- Try/Except
5.1 Resources
5.2 While Loops
Loops are control structures (similar to if
statements) that allow you to perform iteration. These statements specify that a block of code should be executed repeatedly—the block is executed statement by statement, and then the flow of control “loops” back to the top to execute the statements again. Programming languages such as Python support a number of different kinds of loops, which differ primarily in how they determine whether or not to repeat the block (though this difference is reflected in the syntax).
In programming, the most basic control structure used for iteration is known as a while loop, which is used for “indefinite iteration” A Python while loop has the following structure:
while condition:
# lines of code to run if the condition is True
This construction looks much like an if
statement, and is similar in many regards. As with an if
statement, the condition can be any Boolean value (or any expression that evaluates to a boolean value), and it determines whether or not the loop’s block will be executed.
In order to understand how a while loop influences the flow of program control, consider a more concrete example:
count = 5
while count > 0:
print(count)
count = count - 1
print("Blastoff!")
In this example, when the interpreter reaches the while
statement, it first checks the condition (count > 0
, where count
is 5). Finding that condition to be True
, the interpreter then executes the block, printing the count
and then decrementing it. Once the block is executed, the interpreter loops back to the while
statement and rechecks the condition. Finding that count
(4) is still greater than 0, it executes the block again (causing count
to decrement again). This continues until the interpreter loops to the while
statement and finds that count
has reached 0, and thus is no longer greater than 0. Since the condition is now False
, the interpreter does not enter the loop, and proceeds to the following statement (“Blastoff!”).
Importantly, the condition is only checked when the block is about to execute: both at the “start” of the loop, and then at the beginning of each subsequent iteration. Having the condition become True
in the middle of the block (e.g., temporarily) will have no impact on the control flow. It is also possible for the interpreter to “never enter” the loop if the condition is not initially True
.
5.2.1 Counting and Loops
The above example also demonstrates how to use a “counter” to determine whether or not the loop has run a sufficient number of times: this is known as a loop control variable (LCV). The “standard” counting loop looks like:
count = 0 # 1. initialize the counter
while count < 100: # 2. check if the counter has reached its target
print(count) # 3. do some work (this may be multiple statements)
count = count + 1 # 4. update the counter
In order for a counted loop to work properly, you need to be careful about steps 2 and 4: the condition and the update.
First, recall that the condition is whether to run the loop, not whether to stop:
count = 0
while count == 100: # bad condition!
print(count)
count = count + 1
In this case, the condition is not initially True, so the interpreter never enters the loop.
- When writing conditions, think “do we keep going” rather than “are we there yet?”. In loops (as in life), the journey is more important than the destination!
Second, consider what happens if you forget to update the loop control variable:
count = 0
while count < 100
print(count)
# no counter update
In this case, the interpreter checks that count
(0) is less than 100, then runs the loop. Then checks that count
(still 0) is less than 100, then runs the loop. Then checks that count
(still 0) is less than 100, then runs the loop…
This is known as an infinite loop: the loop will run forever, never being able to “break out” and reach the next statement.
- If you hit an infinite loop in Jupyter Notebook, use
Kernel > Interrupt
to break it and try again. If running a.py
file on the command-line, useCtrl-C
to cancel the script.
There are lots of ways to accidently produce an infinite loop:
- Having a condition that is “too exact” can cause the loop control variable to “miss” a particular breaking value:
python count = 0 while count != 100: # if we aren't yet at 100 print(count) count = count + 3 # this will never equal 100
Thus it is always safe to use inequalities (e.g., <
or >
) when writing loop conditions.
- Resetting the counter in the body of the loop can cause it to never reach its goal:
python count = 0 while count < 100: count = 0 # this resets the count! print(count) count = count + 1
The best way to catch these errors is to “play computer”: pretend that you are the compiler, and go through each statement one by one, keeping track of the loop control variable (writing its value down on a sheet of paper does wonders). This will help you be able to “trace” what your program is doing and catch any bugs there may be.
5.2.2 Conditionals and Sentinels
As with if
statements the block for a while
loop can contain any valid Python statements, including if
statements or even other loops (called a “nested loop”). Control statements such as if
are intended an extra step (4 spaces or 1 tab):
# flip a coin until it shows up heads
still_flipping = True
while still_flipping:
flip = random.randint(0,1)
if flip == 0:
flip = "Heads"
else:
flip = "Tails"
print(flip)
if flip == "Heads":
still_flipping = False
In this example, the still_flipping
boolean variable acts as the loop control variable, as it determines whether or not the loop is repeated. Using a Boolean as a LCV is known as using a sentinel variable. A sentinel (guard) variable is used to control whether or not the program flow gets out of the loop: as long as the sentinel is True
, the loop continues to run. Thus the loop can be “exited” by assigning the sentinel to be False
. This is particularly useful when there may be a complex set of conditions that need to be met before the program can carry on.
- It is of course possible to design a sentinel such as
done_flipping
, and then have thewhile
condition check that the sentinel is notTrue
. This may be useful depending on how you’ve structured the algorithm. In either case, be sure that your sentinels are named carefully and accurately reflect the information conveyed by the variable!
5.3 For Loops
If you look back at the basic counting loop, you’ll notice that tracking the count
loop control variable can be problematic. It is easy to forget the update statement (count = count + 1
), and the the count
variable itself acts as an extra “global” (or “less local”) variable that the interpreter needs to keep track of.
To avoid these problems, we can instead use a different kind of loop called a for loop. A for loop is used for “definite iteration”, when we want execute a loop a specific number of times. The basic Python for loop has the following structure:
for local_variable in range(maximum):
# lines of code to run for each number
For example, the basic counting while loop example could be rewritten using a simpler for loop:
for count in range(100):
print(count)
range()
is a function that returns a value representing a sequence of numbers. It is an example of a sequence data structure, which is a way of representing multiple data items in a single variable. Sequences will be discussed more in later modules (including the most common type of sequence, a list). The range()
function can be called with different arguments depending on the range and spread of numbers you wish to use:
# numbers 0 to 10 (not inclusive)
# (0 through 9)
range(10)
# numbers 1 through 11 (not inclusive)
# (1 through 10)
range(1,11)
# numbers 0 through 10 (not inclusive), skipping by 3
# (0, 3, 6, 9)
range(0, 10, 3)
Thus for count in range(100)
can be read as “for each number (called count) from 0 to 100”.
For loops may more properly be thought of as “for each” loops; they are used to go through the items in a collection (e.g., each number in a range
), executing the loop body once for each item. The local_variable
(e.g., count
) in a for loop is implicitly assigned the value of the “current” item in the collection (e.g., which number in the range we’re on) at each iteration.
- For ranges, this basically means that the for loops keeps track of the current iteration; but the idea of this local variable will be more important as we introduce additional collection types.
5.3.1 Difference from While Loops
The main difference between while loops and for loops is:
while loops are used for indefinite iteration; for loops are used for definite iteration.
While loops are appropriate when the interpreter doesn’t know in advance (before the loop starts) how many times the loop block will be executed: the loop does not have a definite number of iterations. On the other hand, a for loop is appropriate when the interpreter does know in advance (before the loop starts) how many times the block will be executed—a definite number of iterations. Note that that number of iterations may be a variable so not determined until runtime; however, the value of that variable will still be known when the for
statement is executed.
All iteration can be written with a while loop; but when performing definite iteration, it is easier, faster, and more idiomatic to use a for loop!
5.4 Working with Files
For loops can be used to iterate through any collection (technically any “iterable” type). One of the more useful collections when working with data is external files (e.g., text files). Files can be treated as a collection or sequence of lines (each divided by a \n
newline character), and thus Python can “read” a text file using a for loop to iterate over the lines of text in the file.
In order to read or write text file data, you use the built-in open()
function, passing it the path to the file you wish to access. This function will then return an object representing that particular file (e.g., it’s location on the disk), with methods that you can use to read from and write to it.
- Remember to always use relative paths. Note that when using a Jupyter Notebook, the “current working directory” is the direction in which you ran the
jupyter notebook
command to start the server.
my_file = open('myfile.txt') # open the file
for line in my_file:
print(line) # print each line in the file
Once you have opened a file, you can use a for loop to iterate through its line (as in the example above). You can also use a while loop, calling the readline()
method on the file in order to read a single line at a time.
It is also possible to write out content to a file. To do this, you need to open the file with “write” access (allowing the program to write to and modify it) by passing w
as the second argument to the open()
function. You can then use the write()
method to “print” text to the file:
# "open" the file with "write" access
file = open('myfile.txt', 'w')
file.write("Hello world!\n")
file.write("It's a mighty fine morning\n")
- Note that unlike the
print()
function,write()
does not include a line break at the end of each method call; you need to add those yourself!
5.4.1 Try/Except
File operations rely on a context that is internal to the program itself: namely, that the file you wish to open actually exists at the location you specify! But that may not be the case—particularly if which file to open is specified by the user:
filename = input("File to open: ") # which file to open
file = open(filename) # "open" the file
for line in file:
print(line)
If the user provides a bad file name, your program will encounter an error through no fault of your own as a programmer:
$ python script.py
File to open: neener neener
Traceback (most recent call last):
File "script.py", line 4, in <module>
file = open(filename)
FileNotFoundError: [Errno 2] No such file or directory: 'neener neener'
Since it’s possible for the user to make a mistake, we could like the program not to simply fail with an error, but to instead be able to “recover” and keep running (e.g., by asking the user for a different file name). We can peform this kind of error handling by utilizing a try
statement with an except
clause:
filename = input("File to open: ")
try: # this might break (not our fault)
file = open(filename)
for line in file:
print(line)
except: # catching FileNotFoundError
print("No such file")
A try
statement acts somewhat similar to an if
statement; however, the try
statement checks to see if any errors occur within its block. If such an error occurs, rather than the program catching, the interpreter will immediately jump to the except
class and execute that block, before continuing on with the rest of the program.
- Note that this is an exception to the general rule that conditions are only checked at the start of a block; a
try
block effectively tells the computer to keep an eye out for any errors (“try this, but it might break”), with theexcept
clause specifying what to do if such an error occurs.
try
statements are used when a program may hit an error that is not caused by programmer’s code, but by an external input (e.g., from a user or a file). You should not use a try
statement to fix broken program logic or invalid syntax: instead, you should fix those problems directly!