A couple of previous posts (here and here) presented three Python-based alternatives for expressing process or agent-based behavior that takes place over (or blocks for) simulated time: generators, greenlets and tasklets.
So which approach is best? There are (at least!) three factors to consider:
- Ease of use – ease-of-coding, expressiveness, maintainability.
- Eco-system factors – compatibility with the rest of my chosen Python tool chain, now and in the future.
- Performance.
Let’s have a look, starting with ease-of-use.
Ease of Use
I’ll start by considering a process-oriented model that simulates a single-server queuing system using greenlets or tasklets. We can implement a base SimProcess class with the following methods (see this post for the gory details):

```python
def acquire(self, server):
    # blocks until the server is available and assigned to the caller

def wait_for(self, delaytime):
    # blocks for the specified amount of simulated time
```
Our single server model then just needs to implement a SimProcess subclass with attributes server and servicetime, and an execute() method that calls acquire() and wait_for():

```python
def execute(self):
    self.acquire(self.server)
    self.wait_for(self.servicetime)
    self.server.release(self)
```
In contrast, for a generator-based implementation, execute() is a generator function, as are the base class acquire() and wait_for() functions. execute() would then look something like this:
```python
def execute(self):
    yield from self.acquire(self.server)
    yield from self.wait_for(self.servicetime)
    self.server.release(self)
```
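To make the yield from mechanics concrete, here is a minimal, self-contained sketch of how a framework might drive such a generator-based process. All names here are hypothetical stand-ins, not the actual framework’s API; the real acquire() and wait_for() would suspend the process until the scheduler resumes it.

```python
def acquire(server):
    # Sub-generator: in a real framework this would loop until the
    # server is free; here it just yields control once.
    yield server

def wait_for(delay):
    # Sub-generator: yields to the framework, which would reschedule
    # the process after 'delay' simulated time units.
    yield delay

def execute(server, servicetime):
    # yield from transparently forwards each suspension point of the
    # sub-generator up to whoever is driving this generator.
    yield from acquire(server)
    yield from wait_for(servicetime)

proc = execute("server-1", 5.0)
events = list(proc)   # the framework would pull these one at a time
print(events)         # ['server-1', 5.0]
```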
The generator-based implementation is at least a bit messier-looking with the required yield from statements. Is that difference significant from a clarity and ease-of-coding perspective? Perhaps not, but the advantages of the greenlet/tasklet approach do become more pronounced as model complexity increases.
A greenlet/tasklet process class allows us to create and call other methods (subprocesses) that can also block via acquire() and/or wait_for() calls. A generator-based process class would have to use sub-generators instead, and then we would need to remember to call these using the yield from syntax – and so on down the entire calling tree (should there be sub-sub-processes). Forgetting to include yield from anywhere in that calling tree breaks the model, and it might do so silently. (Calling a generator function without yield from simply creates and returns the generator, without executing it. Is it possible to detect this error at run time? Perhaps, but I haven’t figured out how.)
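That silent failure mode is easy to reproduce. In this self-contained sketch (hypothetical names, standard library only), the buggy version drops its sub-generator on the floor without raising any error:

```python
# Record side effects so we can see whether the sub-generator ran.
log = []

def wait_for(delay):
    log.append(('waited', delay))
    yield

def execute_correct():
    yield from wait_for(5)   # sub-generator actually executes

def execute_buggy():
    wait_for(5)              # oops: creates a generator, silently discards it
    yield

list(execute_correct())      # drive each process to completion
list(execute_buggy())
print(log)                   # [('waited', 5)] -- the buggy version never waited
```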
We might also be tempted to encapsulate some non-blocking behaviors in regular methods or functions, called by the execute() generator. If we later add blocking behavior to one of those methods (directly or indirectly), that method becomes a generator, and we need to modify each and every caller of that method.
Finally, I’ll note that while simple queuing models can be implemented with only these two defined blocking operations, we’ll inevitably add more to support different and more complex modeling scenarios – all of which must be generators in a generator-based framework.
In short, the greenlet/tasklet approach provides a cleaner and more natural mechanism for invoking blocking or potentially blocking operations. For relatively simple models, this might look like syntactic sugar, but the real benefits increase with the size and complexity of our models.
I’ve been discussing, over the course of a number of posts now, the various ways of implementing coroutines in Python, and how each of these mechanisms can be used to express “behavior over time” in a discrete event simulation model. To recap:
- I started by presenting Python generators, followed by greenlets and tasklets.
- I dove into the details through a discussion of the code for a simple but complete simulation model (an M/M/1 queue), using each of those three options.
- In my last post, I compared the three approaches from the ease-of-use, expressiveness and maintainability perspective. Greenlets and tasklets came out on top, particularly as model complexity increases.
Ecosystem
Next, I’ll look at what I call (for lack of a better term) “Ecosystem” factors.
Up front, I’ll have to acknowledge that I’m going to simply ignore the “Python 2 vs. Python 3” debate. For some, I realize that this is a caveat so large that it renders the rest of the discussion moot – but it’s also a quasi-religious issue that is well-covered by theologians far more qualified than myself. Suffice to say, I’ve chosen Python 3.x. If you’re a Python 2 devotee I certainly can understand why, but I’ve decided to make my life easier by choosing one side of that fence, and 3.x is the side I’m choosing.
With that aside, I’m trying to address questions like:
- Is the approach (generator/greenlet/tasklet) available in or for the Python distribution I want to use?
- Is the approach compatible with other third-party packages or tools I wish to use?
- Is the approach future-proof? Or in other words, will it still be available to me as the Python ecosystem evolves?
Let’s look at these in turn…
Is the approach available in or for the Python distribution I want to use?
Generators are the winner here, assuming that you are able and willing to use Python language version 3.3 or above, as that’s when the yield from syntax was introduced. As a core language feature, every implementation of Python 3.3 and above includes generator and sub-generator support.
Greenlets are available as a third-party package for all relatively recent CPython distributions, including every Python 3.x distribution. Greenlets are also supported by PyPy, a stand-alone Python distribution. As far as I know, greenlet support is not available for any other mainstream Python distribution (e.g. Jython or IronPython), either built-in or through a third-party package.
Tasklets are available through two stand-alone distributions, Stackless Python and PyPy. There are, as far as I know, no tasklet implementations available that are compatible with any other mainstream Python distribution. As far as language version is concerned, there are, as of this writing, Stackless Python distributions for each Python version through 3.4.2. PyPy is a bit further behind; as of this writing, its most recent Python3-compatible release implements Python 3.2.5.
Is the approach compatible with other third-party packages or tools I wish to use?
Generators, being a core language feature, are pretty much by definition compatible with any and all third-party packages and tools. Provided you are using a CPython distribution, the same can be said of greenlets, since they are provided as an extension package to CPython.
The answer is less clear-cut for tasklets, as their use implies a stand-alone Python distribution (either Stackless Python or PyPy). Most prominently, PyPy does not provide full support for the heavily-used numpy module. Stackless, being more closely related to the standard CPython distribution, generally seems (at least to me – I have to acknowledge I haven’t looked terribly closely) to be more compatible with extension modules than PyPy. As far as tools go, some IDEs support Stackless Python better than others, as noted here. I can’t speak to PyPy in this regard.
Is the approach future-proof? Or in other words, will it still be available to me as the Python ecosystem evolves?
As far as generators go, the answer is clearly “yes”. It is a core language feature that has recently been enhanced. A new (as of Python 3.4) extension module, asyncio, has been built on top of this enhanced generator functionality. It’s not going away any time soon.
As an independently maintained extension package, greenlet’s future is not quite as clear-cut. There are other third-party extensions that make use of greenlet, but several of them focus on asynchronous I/O and could conceivably be replaced by asyncio-based solutions at some point in the future. The greenlet package is certainly stable and has been continuously updated for new CPython releases. Will that continue indefinitely? That’s a more difficult question to answer. If greenlets were the only third-party extension that I used, I might be concerned by that possibility. But it isn’t, and I’m not.
While reading the tea leaves is always a challenge, I get the general impression that much Stackless Python development effort and interest is moving to PyPy (and here). PyPy is certainly an interesting option, particularly for performance-critical applications such as simulation. Is PyPy under active development? Certainly. Does it rival CPython in terms of user base? Not yet. Will it eventually? Your guess is as good as (and quite possibly better than) mine.
In summary, there are absolutely no ecosystem-based concerns with simulation based on generators; while nothing is forever, they are likely to outlive me. There are no real concerns right now with greenlets, but their future comes with fewer guarantees. Tasklet support does require a separate Python distribution, which inevitably brings a number of ecosystem-related issues into play.
All of which leads to…
Performance
When it comes to simulation – or, at least, a simulation framework – there’s no such thing as “fast enough”. We always want to build larger, more complex models. We always want to run more experiments and more replications. In short, even if our current models and analyses run “fast enough”, we always want the ability to do more. Simulation applications will, sooner or later, consume all of the compute power made available to them. Performance isn’t the only factor to be considered when choosing or building a simulation framework, but it most definitely does matter. So we can and should look at the performance implications of our three options.
Preliminaries
Let’s start by looking at simple generators, greenlets and tasklets in isolation. Below is the code for a simple generator, along with a function that instantiates one of those generators and runs it to the first yield statement:
```python
def generatorFunc():
    while True:
        yield

def startGenerator():
    generator = generatorFunc()
    next(generator)
```
And the corresponding code for greenlets…:
```python
main_greenlet = greenlet.getcurrent()

def greenletFunc():
    while True:
        main_greenlet.switch()

def startGreenlet():
    gr = greenlet(greenletFunc)
    gr.switch()
```
and tasklets:
```python
channel = stackless.channel()

def taskletFunc():
    while True:
        channel.receive()

def startTasklet():
    tasklet = stackless.tasklet(taskletFunc)()
    tasklet.run()
```
I timed startGenerator() and startGreenlet() on my (Windows) laptop (system specs: i7-4600U, 2.1 GHz, 8 GB memory) using CPython 3.4.3. I also timed startTasklet() using Stackless Python 3.3.5. For what it’s worth, the timings were generated using the timeit module’s Timer.repeat() method, with n = 100,000, using the minimum of the ten repetitions. The results are shown below (the right-hand vertical axis presents a normalized scale, with the generator time being 1.0):
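For reference, that measurement pattern can be sketched as follows (a simplified harness using the same startGenerator() as above; taking the minimum of the repetitions filters out timing noise from other processes on the machine):

```python
import timeit

def generatorFunc():
    while True:
        yield

def startGenerator():
    generator = generatorFunc()
    next(generator)

# Timer.repeat(): ten repetitions of 100,000 calls each; report the best.
timer = timeit.Timer(startGenerator)
best = min(timer.repeat(repeat=10, number=100_000))
print('microseconds per startGenerator() call:', best / 100_000 * 1e6)
```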
Next, I looked at the cost of subsequent calls on these three coroutine implementations, creating test objects:
```python
test_generator = generatorFunc()
test_greenlet = greenlet(greenletFunc)
test_tasklet = stackless.tasklet(taskletFunc)()
```
and then timing each of the following calls:
```python
next(test_generator)
test_greenlet.switch()
channel.send(0)
```
The results are shown below, again in both microseconds/call and with next(generator) calls normalized to 1.0. For entertainment purposes, I also timed and included a plain-old-function call (coming in as about seven times faster than a generator):
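As a rough, standard-library-only sketch of that second comparison (generator resumption versus a plain function call; the exact ratio will vary by machine and interpreter version):

```python
import timeit

def generatorFunc():
    while True:
        yield

def plainFunc():
    pass

test_generator = generatorFunc()
next(test_generator)   # advance to the first yield before timing resumptions

# Time 10 x 100,000 of each operation; keep the best repetition.
t_gen = min(timeit.Timer('next(g)', globals={'g': test_generator})
            .repeat(repeat=10, number=100_000))
t_fun = min(timeit.Timer('f()', globals={'f': plainFunc})
            .repeat(repeat=10, number=100_000))
print('generator resume / plain call ratio: %.1f' % (t_gen / t_fun))
```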
So far, it looks like generators are the clear winner. They are created and initially run almost twice as fast as tasklets, and more than four times faster than greenlets. Their relative performance on subsequent calls is even better – three times faster than tasklets, six times faster than greenlets.
Performance of a “Real” Simulation
But I’m more interested in the performance of an actual simulation. Let’s look at the performance of M/M/1 simulations (as described here), with generator, greenlet and tasklet-based process implementations. (This time, each timing result is the minimum CPU from five simulation runs, each run for one million simulated time units):
Out of curiosity (or perhaps just because of too much time on my hands), I ran the generator-based simulation using both CPython and Stackless. That model ran about 6% more slowly under Stackless, though I’m not sure what, if any, general conclusions can be drawn from that result. (I also ran the simple generator timing tests under both CPython and Stackless, and the differences there were negligible.) The tasklet-based model was about 18% slower than the CPython generator-based model, while the greenlet-based model was about 10% slower.
Scalability
What about larger models? Are there any differences (e.g. in memory footprint) that would change these results if we scale up the number of concurrent process objects in a simulation? I tried to evaluate that scenario by modifying my model to simulate multiple, concurrent M/M/1 queues. To simulate n queues, this simulation starts by creating n Server instances and n ArrivalEvents (one per server). I then ran simulations for up to 10,000 concurrent queues (with each simulation running for 10,000 simulated time units). Again, I compared implementations using generators (both in CPython and Stackless), greenlets, and tasklets:
A normalized graph illustrates the differences with a bit more precision:
Our previous results appear to scale pretty well; greenlet and tasklet-based implementations are generally 10-15% slower than generators, regardless of scale. Again I’m not really sure what to make of the differences running identical generator-based code on two Python distributions (CPython and Stackless).
So…
What does this all mean? Generators are 2–6 times faster than greenlets and tasklets, but most of that advantage disappears in the M/M/1 simulations; clearly the bulk of the process time is being spent elsewhere. Throw in more extensive output data collection and other overhead activities of a “real” simulator, and I suspect that the generator-based performance advantage all but disappears – well into the single digit percentages, if I had to guess.
Are there cases where the differences would be much more significant? I can’t absolutely rule that out – but based on my rough scalability experiments, I believe they would be few and far between, at least on the types of models that I’m thinking about.
Summary and Conclusions
To summarize:
- Greenlets and tasklets are essentially equivalent from an ease and power of use perspective; both have some clear advantages over generators, particularly as model logic becomes more complex.
- Tasklets are problematic due to their incompatibility with the standard CPython distribution; using them requires either Stackless Python or PyPy. Greenlets, implemented as an extension module, are compatible with CPython. Generators work everywhere Python (3.3 and above) does.
- The future of Stackless Python is a bit murky; while PyPy is under more active development, it arguably has not yet acquired a critical mass of mindshare. The Greenlet package is stable and appears to have a reasonable user base, but as a third-party package, future support can’t be absolutely guaranteed. Generators are part of the core language, and I see no reason for that to change in the foreseeable future.
- Generator-based solutions are very slightly faster – there’s not likely to be enough difference to have any practical impact.
So which way should I go? It comes down to generators vs. greenlets – the ecosystem issues that come with tasklets are too significant. For me, the ease-of-use and expressiveness of a greenlet-based framework make that the most attractive option, in spite of any worries about long term support for the package. (I use a number of third-party extension modules; like many developers, I find that the benefit of the functionality that they provide far outweighs the risk that one or more of them won’t outlive the core Python language.) So I’m going with greenlets.