Session Eight: Generators, Iterators, Decorators, and Context Managers¶
The tools of Pythonicity
Review/Questions¶
Review of Previous Class¶
- Advanced OO Concepts
- Properties
- Special Methods
- Testing with pytest
Homework review¶
- Circle Class
- Writing Tests using the pytest module
Decorators¶
A Short Digression
Functions are things that generate values based on input (arguments).
In Python, functions are first-class objects.
This means that you can bind symbols to them, pass them around, just like other objects.
Because of this fact, you can write functions that take functions as arguments and/or return functions as values (we played with this a bit with the function generator assignment):
def substitute(a_function):
    def new_function(*args, **kwargs):
        return "I'm not that other function"
    return new_function
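As a small sketch of what "first-class" means in practice, the snippet below binds a second name to a function, passes a function as an argument, and returns a function from a function. The names shout, apply_twice and make_adder are made up for illustration, and print() is written with parentheses so the code runs on both Python 2 and Python 3.

```python
def shout(text):
    """Return text in upper case."""
    return text.upper()

# bind a second name to the same function object
yell = shout

def apply_twice(func, value):
    """Take a function as an argument and call it twice."""
    return func(func(value))

def make_adder(n):
    """Return a new function that adds n to its argument."""
    def adder(x):
        return x + n
    return adder

add_five = make_adder(5)
print(yell("hello"))             # HELLO
print(apply_twice(add_five, 1))  # 11
```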
A Definition¶
There are many things you can do with a simple pattern like this one.
So many, that we give it a special name:
Decorator
“A decorator is a function that takes a function as an argument and returns a function as a return value.”
That’s nice and all, but why is that useful?
An Example¶
Imagine you are trying to debug a module with a number of functions like this one:
def add(a, b):
    return a + b
You want to see when each function is called, with what arguments and with what result. So you rewrite each function as follows:
def add(a, b):
    print "Function 'add' called with args: %r" % locals()
    result = a + b
    print "\tResult --> %r" % result
    return result
That’s not particularly nice, especially if you have lots of functions in your module.
Now imagine we defined the following, more generic decorator:
def logged_func(func):
    def logged(*args, **kwargs):
        print "Function %r called" % func.__name__
        if args:
            print "\twith args: %r" % args
        if kwargs:
            print "\twith kwargs: %r" % kwargs
        result = func(*args, **kwargs)
        print "\t Result --> %r" % result
        return result
    return logged
We could then make logging versions of our module functions:
logging_add = logged_func(add)
Then, where we want to see the results, we can use the logged version:
In [37]: logging_add(3, 4)
Function 'add' called
with args: (3, 4)
Result --> 7
Out[37]: 7
This is nice, but we have to call the new function wherever we originally had the old one.
It’d be nicer if we could just call the old function and have it log.
Remembering that you can easily rebind symbols in Python using assignment statements leads you to this form:
def logged_func(func):
    # implemented above

def add(a, b):
    return a + b
add = logged_func(add)
And now you can simply use the code you’ve already written and calls to add will be logged:
In [41]: add(3, 4)
Function 'add' called
with args: (3, 4)
Result --> 7
Out[41]: 7
Syntax¶
Rebinding the name of a function to the result of calling a decorator on that function is called decoration.
Because this is so common, Python provides special syntax to perform it more declaratively: the @ symbol:
# this is the imperative version:
def add(a, b):
    return a + b
add = logged_func(add)

# and this declarative form is exactly equivalent:
@logged_func
def add(a, b):
    return a + b
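One side effect of decoration is worth knowing about: after rebinding, the name add refers to the inner wrapper function, so add.__name__ reports the wrapper's name rather than 'add'. The standard library's functools.wraps copies the original function's name and docstring onto the wrapper. The sketch below uses a simplified, hypothetical decorator called announced (not the logged_func defined earlier) and runs on both Python 2 and Python 3.

```python
import functools

def announced(func):
    """Hypothetical decorator: prints the function name, then calls it."""
    @functools.wraps(func)  # copy __name__ and __doc__ from func
    def wrapper(*args, **kwargs):
        print("calling %s" % func.__name__)
        return func(*args, **kwargs)
    return wrapper

@announced
def add(a, b):
    """Return the sum of a and b."""
    return a + b

print(add(3, 4))     # prints "calling add", then 7
print(add.__name__)  # add (it would be "wrapper" without functools.wraps)
```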
Callables¶
Our original definition of a decorator was nice and simple, but a tiny bit incomplete.
In reality, decorators can be used with anything that is callable.
In Python, a callable is a function, a method on a class, a class itself, or even an instance of a class that implements the __call__ special method.
So in fact the definition should be updated as follows:
“A decorator is a callable that takes a callable as an argument and returns a callable as a return value.”
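The builtin callable() reports whether an object can be called. The sketch below tries it on each kind of callable; Greeter and plain_function are hypothetical names used only for this illustration.

```python
class Greeter(object):
    """Instances are callable because the class defines __call__."""
    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return "%s, %s!" % (self.greeting, name)

def plain_function():
    pass

hello = Greeter("Hello")

print(callable(plain_function))  # True: a function
print(callable(Greeter))         # True: calling a class creates an instance
print(callable(hello))           # True: the instance defines __call__
print(callable(42))              # False: integers cannot be called
print(hello("world"))            # Hello, world!
```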
An Example¶
Consider a decorator that would save the results of calling an expensive function with given arguments:
class Memoize:
    """
    memoize decorator from avinash.vora
    http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
    """
    def __init__(self, function):  # runs when the Memoize class is called
        self.function = function
        self.memoized = {}

    def __call__(self, *args):  # runs when a Memoize instance is called
        try:
            return self.memoized[args]
        except KeyError:
            self.memoized[args] = self.function(*args)
            return self.memoized[args]
Let’s try that out with a potentially expensive function:
In [56]: @Memoize
   ....: def sum2x(n):
   ....:     return sum(2 * i for i in xrange(n))
   ....:
In [57]: sum2x(10000000)
Out[57]: 99999990000000
In [58]: sum2x(10000000)
Out[58]: 99999990000000
It’s nice to see that in action, but what if we want to know exactly how much difference it made?
Nested Decorators¶
You can stack decorator expressions. The result is like calling each decorator in order, from bottom to top:
@decorator_two
@decorator_one
def func(x):
    pass

# is exactly equal to:
def func(x):
    pass
func = decorator_two(decorator_one(func))
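The order matters, as a small sketch shows. The two decorators below, double_result and add_one_to_result, are hypothetical and exist only to make the order of application visible.

```python
def double_result(func):
    """Hypothetical decorator: doubles whatever func returns."""
    def wrapper(*args, **kwargs):
        return 2 * func(*args, **kwargs)
    return wrapper

def add_one_to_result(func):
    """Hypothetical decorator: adds one to whatever func returns."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + 1
    return wrapper

@double_result
@add_one_to_result
def first(x):
    return x

@add_one_to_result
@double_result
def second(x):
    return x

print(first(5))   # 12: add one (6), then double
print(second(5))  # 11: double (10), then add one
```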
Let’s define another decorator that will time how long a given call takes:
import time
def timed_func(func):
    def timed(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        elapsed = time.time() - start
        print "time expired: %s" % elapsed
        return result
    return timed
And now we can use this new decorator stacked along with our memoizing decorator:
In [71]: @timed_func
   ....: @Memoize
   ....: def sum2x(n):
   ....:     return sum(2 * i for i in xrange(n))
In [72]: sum2x(10000000)
time expired: 0.997071027756
Out[72]: 99999990000000
In [73]: sum2x(10000000)
time expired: 4.05311584473e-06
Out[73]: 99999990000000
Examples from the Standard Library¶
It’s going to be a lot more common for you to use pre-defined decorators than for you to be writing your own.
Let’s see a few that might help you with work you’ve been doing recently.
For example, we saw that staticmethod() can be applied with a decorator expression. This imperative form:
class C(object):
    def add(a, b):
        return a + b
    add = staticmethod(add)
can be implemented as:
class C(object):
    @staticmethod
    def add(a, b):
        return a + b
And the classmethod() builtin can do the same thing:
In imperative style...
class C(object):
    def from_iterable(cls, seq):
        # method body
    from_iterable = classmethod(from_iterable)
and in declarative style:
class C(object):
    @classmethod
    def from_iterable(cls, seq):
        # method body
Perhaps most commonly, you’ll see the property() builtin used this way.
Remember this from last week?
class C(object):
    def __init__(self):
        self._x = None
    def getx(self):
        return self._x
    def setx(self, value):
        self._x = value
    def delx(self):
        del self._x
    x = property(getx, setx, delx,
                 "I'm the 'x' property.")
With decorator expressions, the same class can be written as:
class C(object):
    def __init__(self):
        self._x = None
    @property
    def x(self):
        return self._x
    @x.setter
    def x(self, value):
        self._x = value
    @x.deleter
    def x(self):
        del self._x
Note that in this case, the property object returned by the property decorator itself provides additional decorators, setter and deleter, as attributes. Each of them returns a new property that also includes the decorated method.
Does this make more sense now?
Iterators and Generators¶
Iterators¶
Iterators are one of the main reasons Python code is so readable:
for x in just_about_anything:
    do_stuff(x)
The thing you loop over does not have to be a “sequence” (list, tuple, etc.).
Rather, you can loop through anything that satisfies the “iterator protocol”.
The Iterator Protocol¶
An iterator must have the following methods:
an_iterator.__iter__()
Returns the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements.
an_iterator.next()
Returns the next item from the container. If there are no further items, raises the StopIteration exception.
List as an Iterator:¶
In [10]: a_list = [1,2,3]
In [11]: list_iter = a_list.__iter__()
In [12]: list_iter.next()
Out[12]: 1
In [13]: list_iter.next()
Out[13]: 2
In [14]: list_iter.next()
Out[14]: 3
In [15]: list_iter.next()
--------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-15-1a7db9b70878> in <module>()
----> 1 list_iter.next()
StopIteration:
Making an Iterator¶
A simple version of xrange() (whoo hoo!)
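class IterateMe_1(object):
    def __init__(self, stop=5):
        self.current = 0
        self.stop = stop
    def __iter__(self):
        return self
    def next(self):
        if self.current < self.stop:
            # return the current value, then advance, so that the
            # values run from 0 to stop - 1 like xrange(stop)
            value = self.current
            self.current += 1
            return value
        else:
            raise StopIteration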
(demo: Examples/Session08/iterator_1.py)
iter()¶
How do you get the iterator object (the thing with the next() method) from an “iterable”?
The iter() function:
In [20]: iter([2,3,4])
Out[20]: <listiterator at 0x101e01350>
In [21]: iter("a string")
Out[21]: <iterator at 0x101e01090>
In [22]: iter( ('a', 'tuple') )
Out[22]: <tupleiterator at 0x101e01710>
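For an arbitrary object, iter() calls the __iter__ method. But it also knows about some objects (str, for instance) that don’t have an __iter__ method: for those it falls back to the sequence protocol, calling __getitem__ with 0, 1, 2, ... until IndexError is raised.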
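The fallback can be seen with a small sketch. Countdown is a hypothetical class that defines __getitem__ but no __iter__; iter() still builds an iterator for it by asking for index 0, 1, 2, ... until IndexError is raised.

```python
class Countdown(object):
    """Hypothetical sequence-like class with no __iter__ method."""
    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # iter() calls this with 0, 1, 2, ... until IndexError is raised
        if index > self.start:
            raise IndexError(index)
        return self.start - index

print(hasattr(Countdown, "__iter__"))  # False
print(list(iter(Countdown(3))))        # [3, 2, 1, 0]
```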
What does for do?¶
Now that we know the iterator protocol, we can write something like a for loop:
(Examples/Session08/my_for.py)
def my_for(an_iterable, func):
    """
    Emulation of a for loop.
    func() will be called with each item in an_iterable
    """
    # equiv of "for i in l:"
    iterator = iter(an_iterable)
    while True:
        try:
            i = iterator.next()
        except StopIteration:
            break
        func(i)
Itertools¶
itertools is a collection of utilities that make it easy to build an iterator that iterates over sequences in various common ways.
http://docs.python.org/library/itertools.html
NOTE:
Iterators are not only for for loops.
They can be used with anything that expects an iterable: sum, tuple, sorted, and list, for example.
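A short sketch of both points: a few itertools building blocks, and iterators handed straight to functions that consume them. Only standard library calls are used; the variable names are arbitrary.

```python
import itertools

evens = itertools.count(0, 2)                  # 0, 2, 4, ... without end
first_five = list(itertools.islice(evens, 5))  # take only the first five
print(first_five)                              # [0, 2, 4, 6, 8]

chained = itertools.chain([3, 1], (2,), "ab")  # one iterator over several iterables
print(list(chained))                           # [3, 1, 2, 'a', 'b']

numbers = iter([3, 1, 2])
print(sum(numbers))                            # 6
print(sorted(iter([3, 1, 2])))                 # [1, 2, 3]
print(tuple(iter("abc")))                      # ('a', 'b', 'c')
```

Note that an iterator can only be consumed once: after sum(numbers) has run, numbers is exhausted.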
LAB / Homework¶
In the Examples/Session08 dir, you will find: iterator_1.py
- Extend IterateMe_1 (in iterator_1.py) to be more like xrange(). Add three input parameters: IterateMe_2(start, stop, step=1)
- See what happens if you break out in the middle of the loop:
it = IterateMe_2(2, 20, 2)
for i in it:
    if i > 10: break
    print i
And then pick up again:
for i in it:
    print i
- Does xrange() behave the same?
- make yours match xrange()
Generators¶
Generators give you the iterator immediately:
- There is no access to the underlying data ... if it even exists.
- Conceptually:
  Iterators are about various ways to loop over data; generators generate the data on the fly.
- Practically:
  You can use either one for either purpose (and a generator is one type of iterator).
  Generators do some of the book-keeping for you.
yield¶
yield is a way to make a quickie generator with a function:
def a_generator_function(params):
    some_stuff
    yield something
Generator functions “yield” a value, rather than returning a value.
State is preserved in between yields.
A function with yield in it is a “factory” for a generator.
Each time you call it, you get a new generator:
gen_a = a_generator_function(params)
gen_b = a_generator_function(params)
Each instance keeps its own state.
A generator function is really just a shorthand for an iterator class: it does the book keeping for you.
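A minimal sketch of that independent state, using a hypothetical generator function count_up_to. The builtin next() is used instead of the .next() method so that the code runs on both Python 2 and Python 3.

```python
def count_up_to(limit):
    """Hypothetical generator function: yields 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1

gen_a = count_up_to(3)
gen_b = count_up_to(3)

print(next(gen_a))  # 1
print(next(gen_a))  # 2
print(next(gen_b))  # 1: gen_b has its own state
print(list(gen_a))  # [3]: gen_a resumes where it left off
print(list(gen_a))  # []: an exhausted generator stays exhausted
```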
An example: like xrange()
def y_xrange(start, stop, step=1):
    i = start
    while i < stop:
        yield i
        i += step
Real World Example from FloatCanvas:
Note:
In [164]: gen = y_xrange(2,6)
In [165]: type(gen)
Out[165]: generator
In [166]: dir(gen)
Out[166]:
...
'__iter__',
...
'next',
So the generator is an iterator.
A generator function can also be a method in a class.
More about iterators and generators:
http://www.learningpython.com/2009/02/23/iterators-iterables-and-generators-oh-my/
Examples/Session08/yield_example.py
generator comprehension¶
yet another way to make a generator:
>>> [x * 2 for x in [1, 2, 3]]
[2, 4, 6]
>>> (x * 2 for x in [1, 2, 3])
<generator object <genexpr> at 0x10911bf50>
>>> for n in (x * 2 for x in [1, 2, 3]):
...     print n
...
2
4
6
This gets more interesting when the input ([1, 2, 3] here) is itself a generator.
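When the input is itself a generator, the expressions chain into a lazy pipeline: no value is computed until something asks for it. A sketch, with a hypothetical generator function naturals as the source:

```python
def naturals(limit):
    """Hypothetical generator function yielding 0, 1, ..., limit - 1."""
    n = 0
    while n < limit:
        yield n
        n += 1

squares = (n * n for n in naturals(1000000))       # nothing computed yet
even_squares = (s for s in squares if s % 2 == 0)  # still nothing computed

# values are produced one at a time, only as they are requested
print(next(even_squares))  # 0
print(next(even_squares))  # 4
print(next(even_squares))  # 16
```

Only the first five values of naturals are ever generated here, even though the limit is a million.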
Generator LAB / Homework¶
Write a few generators:
- Sum of integers
- Doubler
- Fibonacci sequence
- Prime numbers
(test code in Examples/Session08/test_generator.py)
Descriptions:
- Sum of the integers:
keep adding the next integer
0 + 1 + 2 + 3 + 4 + 5 + ...
so the sequence is:
0, 1, 3, 6, 10, 15 .....
- Doubler:
Each value is double the previous value:
1, 2, 4, 8, 16, 32, ...
- Fibonacci sequence:
The fibonacci sequence as a generator:
f(n) = f(n-1) + f(n-2)
1, 1, 2, 3, 5, 8, 13, 21, 34...
- Prime numbers:
Generate the prime numbers (numbers only divisible by themselves and 1):
2, 3, 5, 7, 11, 13, 17, 19, 23...
- Others to try:
- Try x^2, x^3, counting by threes, x^e, counting by minus seven, ...
Context Managers¶
A Short Digression
Repetition in code stinks.
A large source of repetition in code deals with the handling of external resources.
As an example, how many times do you think you might type the following code:
file_handle = open('filename.txt', 'r')
file_content = file_handle.read()
file_handle.close()
# do some stuff with the contents
What happens if you forget to call .close()?
What happens if reading the file raises an exception?
Resource Handling¶
Leaving an open file handle lying around is bad enough. What if the resource is a network connection, or a database cursor?
You can write more robust code for handling your resources:
# open outside the try block: if open() fails there is nothing to close
file_handle = open('filename.txt', 'r')
try:
    file_content = file_handle.read()
finally:
    file_handle.close()
# do something with file_content here
But what exceptions do you want to catch? And do you really want to have to remember all that every time you open a file (or other resource)?
Starting in version 2.5, Python provides a structure for reducing the repetition needed to handle resources like this.
Context Managers
You can encapsulate the setup, error handling and teardown of resources in a few simple steps.
The key is to use the with statement.
Since the introduction of the with statement in PEP 343, the above six lines of defensive code have been replaced with this simple form:
with open('filename.txt', 'r') as file_handle:
    file_content = file_handle.read()
# do something with file_content
The file object returned by the open builtin works as a context manager.
That resource (file_handle) is automatically and reliably closed when the code block ends.
At this point in Python history, many functions you might expect to behave this way do:
- open and codecs.open both work as context managers
- network connections via socket do as well (from Python 3.2 on).
- most implementations of database wrappers can open connections or cursors as context managers.
- ...
But what if you are working with a library that doesn’t support this (urllib)?
There are a couple of ways you can go.
If the resource in question has a .close() method, then you can simply use the closing context manager from contextlib to handle the issue:
import urllib
from contextlib import closing
with closing(urllib.urlopen('http://google.com')) as web_connection:
    # do something with the open resource
    content = web_connection.read()
# and here, it will be closed automatically
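To watch closing do its work without a network connection, here is a sketch with a stand-in resource. Connection is a hypothetical class, not part of any library; all it has is a close() method.

```python
from contextlib import closing

class Connection(object):
    """Hypothetical resource with a close() method but no with-support."""
    def __init__(self):
        self.is_open = True

    def send(self, message):
        return "sent: %s" % message

    def close(self):
        self.is_open = False

with closing(Connection()) as conn:
    print(conn.send("hello"))  # sent: hello
    print(conn.is_open)        # True

print(conn.is_open)            # False: closing() called conn.close()
```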
But what if the thing doesn’t have a close() method, or you’re creating the thing and it shouldn’t?
You can also define a context manager of your own.
The interface is simple. It must be a class that implements these two special methods:
- __enter__(self):
  Called when the with statement is run. It should return something to work with in the created context; that value is bound to the name after as.
- __exit__(self, e_type, e_val, e_traceback):
  Clean-up that needs to happen is implemented here.
  The arguments describe the exception raised in the context: its type, its value, and its traceback. If no exception was raised, all three are None.
  If the exception is handled here, return True to suppress it. If not, return False and it will propagate.
Let’s see this in action to get a sense of what happens.
An Example¶
Consider this code:
class Context(object):
    """from Doug Hellmann, PyMOTW
    http://pymotw.com/2/contextlib/#module-contextlib
    """
    def __init__(self, handle_error):
        print '__init__(%s)' % handle_error
        self.handle_error = handle_error
    def __enter__(self):
        print '__enter__()'
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        print '__exit__(%s, %s, %s)' % (exc_type, exc_val, exc_tb)
        return self.handle_error
This class doesn’t do much of anything, but playing with it can help clarify the order in which things happen:
In [46]: with Context(True) as foo:
   ....:     print 'This is in the context'
   ....:     raise RuntimeError('this is the error message')
__init__(True)
__enter__()
This is in the context
__exit__(<type 'exceptions.RuntimeError'>, this is the error message, <traceback object at 0x1049cca28>)
What if we try with False?
In [47]: with Context(False) as foo:
   ....:     print 'This is in the context'
   ....:     raise RuntimeError('this is the error message')
__init__(False)
__enter__()
This is in the context
__exit__(<type 'exceptions.RuntimeError'>, this is the error message, <traceback object at 0x1049ccb90>)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-47-de2c0c873dfc> in <module>()
1 with Context(False) as foo:
2 print 'This is in the context'
----> 3 raise RuntimeError('this is the error message')
4
RuntimeError: this is the error message
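Roughly speaking, the with statement calls the two special methods for you. The sketch below spells that out by hand; it is a simplified approximation of the expansion described in PEP 343, not the interpreter's actual implementation. Recorder, run_with and failing_body are hypothetical names used only here.

```python
import sys

class Recorder(object):
    """Hypothetical context manager that records what happens to it."""
    def __init__(self, handle_error):
        self.handle_error = handle_error
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        name = exc_type.__name__ if exc_type else None
        self.events.append("exit(%s)" % name)
        return self.handle_error

def run_with(manager, body):
    """Approximately what `with manager as value: body(value)` does."""
    value = manager.__enter__()
    try:
        body(value)
    except BaseException:
        # hand the exception details to __exit__;
        # re-raise unless it returns a true value
        if not manager.__exit__(*sys.exc_info()):
            raise
    else:
        manager.__exit__(None, None, None)

def failing_body(value):
    raise RuntimeError("this is the error message")

rec = Recorder(handle_error=True)
run_with(rec, failing_body)   # the error is swallowed
print(rec.events)             # ['enter', 'exit(RuntimeError)']
```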
The contextlib.contextmanager decorator turns generator functions into context managers.
Consider this code:
from contextlib import contextmanager
@contextmanager
def context(boolean):
    print "__init__ code here"
    try:
        print "__enter__ code goes here"
        yield object()
    except Exception as e:
        print "errors handled here"
        if not boolean:
            raise
    finally:
        print "__exit__ cleanup goes here"
The code is similar to the class defined previously.
And using it has similar results. We can handle errors:
In [50]: with context(True):
   ....:     print "in the context"
   ....:     raise RuntimeError("error raised")
__init__ code here
__enter__ code goes here
in the context
errors handled here
__exit__ cleanup goes here
Or, we can allow them to propagate:
In [51]: with context(False):
   ....:     print "in the context"
   ....:     raise RuntimeError("error raised")
__init__ code here
__enter__ code goes here
in the context
errors handled here
__exit__ cleanup goes here
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-51-641528ffa695> in <module>()
1 with context(False):
2 print "in the context"
----> 3 raise RuntimeError("error raised")
4
RuntimeError: error raised
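As a slightly more practical sketch of the same pattern, the generator below changes a dictionary entry for the duration of a with block and restores it afterwards, even if the block raises. The name temporary_setting and the config dictionary are assumptions made up for this illustration.

```python
from contextlib import contextmanager

@contextmanager
def temporary_setting(settings, key, value):
    """Hypothetical helper: set settings[key] to value inside the with block."""
    missing = object()          # sentinel meaning "key was not present"
    old = settings.get(key, missing)
    settings[key] = value       # setup, runs on entering the block
    try:
        yield settings          # the yielded object is bound by "as"
    finally:
        # teardown, runs even if the block raises
        if old is missing:
            del settings[key]
        else:
            settings[key] = old

config = {"debug": False}
with temporary_setting(config, "debug", True) as active:
    print(active["debug"])  # True
print(config["debug"])      # False
```

Because there is no except clause, any exception raised inside the block still propagates after the finally clean-up has run.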
Homework¶
Python Power
More reading, etc:¶
Iterators, generators, and containers:
A nice post that clearly lays out how all these things fit together:
http://nvie.com/posts/iterators-vs-generators/
Transforming Code into Beautiful, Idiomatic Python:
Raymond Hettinger (again) talks about Pythonic code.
A lot of it is about using iterators – now you know what those really are.
Assignments¶
Task 1: Timing Context Manager
Create a context manager that will print to stdout the elapsed time taken to run all the code inside the context:
In [3]: with Timer() as t:
   ...:     for i in range(100000):
   ...:         i = i ** 20
   ...:
this code took 0.206805 seconds
Extra Credit: allow the Timer context manager to take a file-like object as an argument (the default should be sys.stdout). The results of the timing should be printed to the file-like object.
Task 2: p-wrapper Decorator
Write a simple decorator you can apply to a function that returns a string. Decorating such a function should result in the original output, wrapped by an HTML ‘p’ tag:
In [4]: @p_wrapper
   ...: def return_a_string(string):
   ...:     return string
   ...:
In [5]: return_a_string("this is a string")
Out[5]: '<p> this is a string </p>'
Note that this is a very simple version of the very useful decorators provided by Web Frameworks.
Task 3: Generator Homework (documented above)
Task 4: Iterator Homework (documented above)