Howard Klein

Musings on software, discrete event simulation and other pseudo-random topics

Python Coroutine Options, Part 1: Generators

As I mentioned in an earlier post, coroutines provide a means to express processes occurring over simulated time in a fairly natural way. Python has one built-in mechanism, the generator, which enables coroutine-like functionality. Two other flavors of coroutine, greenlets and tasklets, are implemented by third-party packages; these will be discussed in a later post.

Simply put, generators are Python functions containing one or more yield statements; they were introduced in Python 2.2. I'm not going to attempt a complete description of or tutorial on generators; there are plenty of those elsewhere on the web, written by folks much more qualified than I am. Let it suffice to say that:

  1. A generator is an iterator created by calling the generator function.
  2. The generator function's body starts executing when the built-in next() is invoked on the generator (which can occur implicitly via a loop).
  3. Any yield statement within the generator function returns control (and optionally, a value) to the generator's caller, while allowing the caller to later resume execution of the generator function from the point where the yield was invoked.
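The three points above can be seen in a minimal example (the countdown function here is my own illustration, not from any library):

```python
def countdown(n):
    """A generator function: calling it builds a generator; no body code runs yet."""
    while n > 0:
        yield n  # hand control (and a value) back to the caller; resume here later
        n -= 1

gen = countdown(3)   # point 1: calling the function returns an iterator
print(next(gen))     # point 2: the body runs only now, up to the first yield -> 3
print(next(gen))     # point 3: execution resumes just after the yield -> 2
print(list(gen))     # a loop drives the remaining next() calls implicitly -> [1]
```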

This feature allows us to create a discrete event simulator in which processes are coded as generator functions.  Any process action which consumes (or might consume) simulated time takes the form, in code, of a yield expression.  For example, a simple process that acquires a resource, waits for ten time units, and then releases the resource might look something like this:

yield acquire_resource(rsrc1)
yield wait_for(10)
release_resource(rsrc1)
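To make the idea concrete, here is a toy event loop driving generator-based processes. This is a sketch of the general technique, not SimPy's actual implementation; for simplicity, a process signals a timed wait by yielding a plain delay value rather than calling a wait_for-style helper:

```python
import heapq

def run(processes):
    """Toy simulator: each process is a generator that yields a delay
    (in simulated time units) whenever it wants to pause."""
    clock = 0.0
    queue = []  # heap of (wakeup_time, tie_breaker, process)
    for seq, proc in enumerate(processes):
        heapq.heappush(queue, (clock, seq, proc))
    seq = len(processes)
    while queue:
        clock, _, proc = heapq.heappop(queue)
        try:
            delay = next(proc)          # resume the process until its next yield
        except StopIteration:
            continue                    # process ran to completion
        heapq.heappush(queue, (clock + delay, seq, proc))
        seq += 1
    return clock                        # final simulated time

def worker(name, log):
    log.append((name, "start"))
    yield 10                            # stands in for `yield wait_for(10)`
    log.append((name, "done"))

log = []
final_time = run([worker("a", log), worker("b", log)])
```

Both workers start at time 0 and finish at simulated time 10; the real wall-clock cost is negligible, which is the whole point of a discrete event simulator.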

This is, in fact, the basic approach taken by SimPy, an existing open source simulation framework. It works, but it does have some limitations. In particular, generators, as initially implemented, were not easily stackful; while a generator can invoke a regular function, that function cannot in turn invoke yield in any form while still being a "regular function" – once we include a yield statement, it becomes another generator function, and calling it simply returns a generator object (rather than executing its body).

This restriction is not insignificant from our perspective.  It means, for example, that a process implemented through a generator function cannot invoke a wait() function that blocks (through a yield) completely behind the scenes.  It also makes it more difficult for someone coding a complex process to decompose that process into subroutines, as the simulation would fail to execute properly if any of these subroutines attempted to yield, either directly or indirectly.
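The pitfall is easy to demonstrate (the names here are illustrative, not from SimPy):

```python
def wait(units):
    # Intended as a "regular" blocking helper, but the yield silently turns
    # this into a generator function: calling it builds a generator object
    # and executes none of its body.
    yield units

def process():
    wait(10)        # BUG: creates and discards a generator; nothing pauses
    yield 5         # the process proceeds as if the wait never happened

p = process()
assert next(p) == 5  # the wait(10) call had no effect at all
```

Nothing fails loudly here, which is exactly the danger: the simulation runs, it just doesn't simulate what you designed.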

This limitation was partially addressed in Python 3.3, which introduced enhanced support for subgenerators* via the yield from syntax, e.g.

yield from subprocess()

where subprocess is another generator function – a subgenerator. Both the generator and the subgenerator maintain their state between yields, allowing us to create, in effect, pretty much stackful coroutines. The price we pay is the need to make any function that might block (or that calls a subroutine that potentially blocks) a generator, and the requirement to invoke that function via a yield from statement – all of which looks somewhat (OK, a lot) messier than we would like, and is fraught with the potential for bugs that don't make themselves obvious at run time. If you don't expect your function to be a generator and it is, your program may run without complaint, but it almost certainly won't run as designed.
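A small sketch of the delegation (using the post's subprocess naming; the yielded values are placeholders for simulated-time events):

```python
def subprocess():
    # A subgenerator: its yields pass straight through to whoever is
    # driving the outer generator, and its local state survives each one.
    yield 1
    yield 2

def process():
    yield 0
    yield from subprocess()  # delegate; resumes here when subprocess finishes
    yield 3

assert list(process()) == [0, 1, 2, 3]
```

Note that subprocess must itself be invoked via yield from; a bare subprocess() call inside process would, as before, build a generator and do nothing with it.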

*Strictly speaking the use of subgenerators was possible prior to Python 3.3, but doing so in a robust way was pretty challenging and generally not recommended.
