[PEAK] Reactor-driven microthreads
Phillip J. Eby
pje at telecommunity.com
Wed Dec 31 13:12:37 EST 2003
At 12:12 PM 12/31/03 -0500, Bob Ippolito wrote:
>On Dec 31, 2003, at 12:22 AM, Phillip J. Eby wrote:
>
>>Hm. This seems like the first really good use case I've seen for
>>having a macro facility in Python, since it would allow us to spell
>>the yield+errorcheck combination with less boilerplate. Ah well.
>
>That, and "flattening" generators
>
>def someGenerator():
>    for something in someOtherGenerator():
>        yield something
>
>This gets really ugly after a while, but it happens a lot when you are
>doing "microthreads".
Actually, the Thread class in my second post allows you to do:
def someGenerator():
    yield someOtherGenerator(); resume()
Thread objects maintain an external "stack" of nested generators, so you
don't need to do this kind of passthru.
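To make that concrete, here is a minimal sketch of how such a Thread could keep an external stack of nested generators; the names and details are my assumptions, not PEAK's actual implementation:

```python
# Hypothetical sketch: a Thread that keeps an explicit stack of
# nested generators, so inner generators need no pass-through
# "for x in inner(): yield x" loops.
import types

class Thread:
    def __init__(self, gen):
        self.stack = [gen]      # explicit stack replaces lexical nesting

    def run(self):
        out = []
        while self.stack:
            try:
                value = next(self.stack[-1])
            except StopIteration:
                self.stack.pop()            # inner generator finished;
                continue                    # fall back to its caller
            if isinstance(value, types.GeneratorType):
                self.stack.append(value)    # descend into nested generator
            else:
                out.append(value)           # an ordinary yielded value
        return out

def inner():
    yield 1
    yield 2

def outer():
    yield 0
    yield inner()   # yield the generator itself; Thread flattens it
    yield 3
```

With this sketch, Thread(outer()).run() produces [0, 1, 2, 3] without any pass-through loop in outer().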
>By the way, the microthread thing in Twisted has been talked about
>before (http://twistedmatrix.com/pipermail/twisted-python/2003-February/002808.html)
>and is implemented to some extent as twisted.flow (entirely different
>from the flow module in that discussion).
I've looked at twisted.flow. To me, it's overcomplicated in both
implementation and concepts. The approach I've just sketched involves
only two interfaces/concepts: threads and schedulers (three, if you
count "reactor"). twisted.flow, by contrast, seems to have stages,
wrappers, blocks, and controllers, just for starters (again, not
counting "reactor" as a concept, and ignoring that the package also
depends on Deferreds). Last, but not least, twisted.flow doesn't
support the kind of error-passback handling I've devised, unless I'm
misunderstanding something about how it works.
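As an illustration of what I mean by error passback (this is a reconstruction for explanation, not my actual code): an exception escaping an inner generator can be re-thrown into its caller at the caller's yield point, using the generator throw() method.

```python
# Illustrative sketch of error passback: when a nested generator
# raises, the error is delivered to its caller at the point of the
# yield, via generator.throw().
import types

def run(gen):
    stack, out, error = [gen], [], None
    while stack:
        top = stack[-1]
        try:
            if error is not None:
                err, error = error, None
                value = top.throw(err)      # resume caller with the error
            else:
                value = next(top)
        except StopIteration:
            stack.pop()
            continue
        except Exception as err:
            stack.pop()                     # this generator didn't handle it
            if not stack:
                raise                       # escaped the outermost caller
            error = err                     # pass back at the yield point
            continue
        if isinstance(value, types.GeneratorType):
            stack.append(value)             # descend into nested generator
        else:
            out.append(value)
    return out

def failing():
    yield 1
    raise ValueError("boom")

def caller():
    try:
        yield failing()
    except ValueError:
        yield "recovered"
```

Here run(caller()) produces [1, "recovered"]: the ValueError from failing() is caught by the ordinary try/except around the yield in caller().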
I do understand that twisted.flow allows you to chain iterable "flows"
while remaining co-operative. However, that kind of chaining is
trivial in this framework as well, if you're using a generator
designed for it:
lines = sock.readlines()
yield lines; resume()
for line in lines:
    print line
    yield lines; resume()
You would implement sock.readlines() as something like:
def readlines(self):
    queue = Queue()
    def genLines():
        # ... loop to accumulate a line
        queue.put(line)
    Thread(self).run(genLines())
    return queue
For clarity, I've omitted the part where the generator in readlines()
accumulates the data, pushes back partial lines, etc. As you can see, the
only new concept we need is a "queue", or perhaps we should call it a
"pipe". Actually, the concept is probably used often enough that it would
be easier to do:
def readlines(self):
    def genLines():
        # ... loop to accumulate a line
        queue.put(line)
    return Queue(self, run=genLines())
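A self-contained sketch of the "queue" concept, under stated assumptions: the real framework would run the producer generator cooperatively under a scheduler, but here I drain it eagerly so the example stands alone, and readlines() takes buffered data rather than a socket.

```python
# Hypothetical sketch of the "queue"/"pipe" concept. A scheduler
# would normally drive genLines() cooperatively; here we drain the
# producer eagerly so the example is self-contained.
from collections import deque

class Queue:
    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def __iter__(self):
        while self._items:
            yield self._items.popleft()

def readlines(data):
    # Stand-in for sock.readlines(): split buffered data into lines
    # and push each complete line onto the queue.
    queue = Queue()
    def genLines():
        for line in data.splitlines():
            queue.put(line)
            yield               # give up control after each line
    for _ in genLines():        # a Thread/scheduler would drive this
        pass
    return queue
```

So list(readlines("a\nb\nc")) gives ["a", "b", "c"]; the consumer side is just ordinary iteration over the queue.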
This is roughly equivalent to twisted.flow's notion of a 'Stage', but is a
bit more concrete/explicit. A critical difference between my proposed
framework and twisted.flow is that t.f tries to intermingle control
instructions and data in the values yielded from an iterator. I think
that's a bad idea, because it makes the implementation hard to
explain (cf. "The Zen of Python": "If the implementation is hard to
explain, it's a bad idea").
By contrast, having two distinct concepts of "thread" and "queue" makes
both sides of the code (app and tools) relatively easy to follow. I find
twisted.flow much harder to follow on the "tool" side, and thus harder to
see how I would create new data sources, generators, or "instructions".
More information about the PEAK mailing list