[PEAK] A taxonomy of event sources: designing the events API
Phillip J. Eby
pje at telecommunity.com
Wed Jan 7 00:23:50 EST 2004
Just a few thoughts on the kinds of event sources that PEAK will deal
with... if anybody has thoughts/suggestions/issues beyond what's listed
here, please let me know. At the moment I'm at home sick with the flu or
something, so I'm not sure that all my thoughts will be coherent. Then
again, nobody may notice the difference from my usual writing. :)
Stream vs. Broadcast vs. Handled -- Streamed events are consumed by the
next registered callback. Broadcast events are sent to all registered
callbacks as of the time of the event. Handled events are passed to a
series of "handlers" until the event is "consumed". Handled events can't
be used to directly resume a thread, but the handler receiving them can of
course forward the event through another event source.
Edge vs. Level -- An edge-triggered event resumes threads only when the
event "occurs". A level-triggered event (condition) resumes threads so
long as the condition is present. Callbacks registered while the condition
is present are called immediately.
Examples:
Edge stream - queue-like data source, GUI clicks/commands.
Edge broadcast - set a value, like a model field for a GUI's view.
Level stream - token passing/semaphore: i.e., a thread can "take" it by
turning "off" the level, and then "release" it by turning "on" the level,
whereupon the next thread in line may "take" it. Also useful for I/O
conditions like "readable(stream)" and "writable(stream)", since the
condition may no longer hold once a thread receives the event and does
something about it.
Level broadcast - condition flag (possibly an automatic computation based
on other kinds of event sources).
Edge handled - GUI commands, logger events, timer events.
Level handled - no such thing, since handled events don't resume threads.
The above taxonomy is quite rough, but it seems to boil down into two basic
kinds of event sources: handled events and everything else. :) The
"everything else" then needs to know whether it's stream or broadcast, and
that seems like something that could live as a simple flag. The edge
vs. level thing would then branch out a bit lower in the class hierarchy,
with some sort of "condition" base class for the level-triggered
stuff. It would need to have a way to know if the "condition" in question
holds, in order to handle shouldSuspend(thread) differently than
edge-triggered sources. (Edge sources simply do
'self.addCallback(thread.step); return True', or some such, while level
sources check whether the condition is true first, and if so, just tell the
thread not to suspend.)
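To make that shouldSuspend() difference concrete, here's a rough sketch. (The class names, the thread.step protocol, and the one-shot callback semantics are all just illustrative assumptions, not the actual peak.events API.)

```python
class EdgeSource:
    """Edge-triggered (sketch): a waiting thread always suspends, and is
    resumed by the next occurrence of the event."""

    def __init__(self):
        self._callbacks = []

    def addCallback(self, callback):
        self._callbacks.append(callback)

    def fire(self, event):
        # Fire the callbacks registered as of this moment (one-shot).
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(self, event)

    def shouldSuspend(self, thread):
        self.addCallback(thread.step)
        return True


class LevelSource(EdgeSource):
    """Level-triggered (sketch): if the condition already holds, the
    thread isn't suspended at all."""

    def __init__(self, holds=False):
        super().__init__()
        self.holds = holds

    def shouldSuspend(self, thread):
        if self.holds:
            return False        # condition present: keep running
        self.addCallback(thread.step)
        return True
```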
It's rather interesting, I think, that despite the simplicity of these five
categories, there is an immense amount of room for variation in actual
event source implementations. For example, one could create transformers
from one kind of source to another, like stream-to-broadcast or
edge-to-level (e.g. by testing whether the event value meets a condition),
or transformers that transform the events they forward in some way.
The difficult thing for me at this point, then, is to ferret out which are
the most basic forms of the fundamental five (four, really, since the
"handled" event source paradigm doesn't really have any fundamental
variations, though there are plenty of possible implementation variations
having to do with handler priorities). I could easily spend forever
creating variations on each theme.
I also don't want to force people to spell out all the options every time
they create an event source. It should be possible to just create a
Condition (level broadcast), Semaphore (level stream), Value (edge
broadcast), or Distributor (edge stream).
It'd be nice to have a way to apply transforms or dependencies, perhaps as
an input to these types. So that whenever the basic event source has its
addCallback() or shouldSuspend() methods called, the "dependencies" can be
linked. OTOH, transforms seem to be the most varied concept of all, since
there are any number of possible rules being applied, coming from any
number of event sources. Consider, for example, a Condition that is based
on the relationship between two or three Values, or a Distributor whose
events are being converted from XML-RPC to pickles (just to make up a
random nonsense conversion).
There are a few useful abstractions, though, like the notion of a
dependency to another event source, which means that while callbacks exist
on the dependent source, there must be a callback registered with the
depended-on source. In the simplest case this might be a single callback
function.
Another idea is that perhaps it would be useful to have operators
applicable to event sources: aValue * 3 might create a derived Value, and
aValue.gt(27) might produce a Condition tied to aValue.
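For instance, 'aValue.gt(27)' might be sketched like this (all names hypothetical; for brevity, callbacks here persist instead of firing one-shot, and the Condition is reduced to a bare boolean holder):

```python
class Condition:
    """Level broadcast (sketch): reduced here to just holding a boolean."""

    def __init__(self, value=False):
        self.value = bool(value)

    def __call__(self):
        return self.value

    def set(self, value):
        self.value = bool(value)


class Value:
    """Edge broadcast (sketch) with a derived-Condition operator."""

    def __init__(self, value=None):
        self._value = value
        self._callbacks = []

    def __call__(self):
        return self._value

    def set(self, value):
        self._value = value
        for callback in list(self._callbacks):
            callback(self, value)

    def addCallback(self, callback):
        self._callbacks.append(callback)

    def gt(self, bound):
        # Derived condition, kept in sync whenever this Value changes.
        condition = Condition(self._value is not None and self._value > bound)
        self.addCallback(lambda source, v: condition.set(v > bound))
        return condition
```

'aValue * 3' could be built the same way, returning a derived Value instead of a Condition.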
Hm. Actually, maybe the simplest thing of all would be to spawn event
threads to do transformations, as the only "primitive" transform needed
then would be an 'AnyOf(*eventSources)' object, thus allowing threads to
wait on multiple events simultaneously. It seems horribly wasteful to have
so many threads, as it's hard not to think of them as expensive
things. But event threads are scarcely more than a simple object holding a
list of generators. Not only are they trivial in memory usage, they don't
consume any CPU until/unless something happens to task switch to them.
If we take this approach, we can easily write any possible transform or
dependency by writing a generator function, as long as we have a suitable
implementation of 'AnyOf'. The trick to writing a decent 'AnyOf' event is
that it must resume a thread suspended on it once and *only* once, even if
more than one of the "any" events fires. I suppose the simplest way to do
this is for AnyOf.shouldSuspend() to register a special callback with each
of the "any" events, that will only call the thread once. Heck, I guess
actually that could be done by AnyOf.addCallback() - it would simply wrap
the supplied callback with a forwarder that fires only once, and then call
.addCallback() on each of the "any" events.
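A sketch of that addCallback() approach (assumed names throughout; the Distributor here is just a stand-in edge-stream source whose send() fires the next callback in line):

```python
class Distributor:
    """Edge stream stand-in (sketch): send() fires one callback, next in line."""

    def __init__(self):
        self._callbacks = []

    def addCallback(self, callback):
        self._callbacks.append(callback)

    def send(self, event):
        if self._callbacks:
            self._callbacks.pop(0)(self, event)


class AnyOf:
    """Forwards to a given callback once and *only* once, no matter how
    many of the underlying sources fire."""

    def __init__(self, *sources):
        self.sources = sources

    def addCallback(self, callback):
        fired = [False]                 # shared across all the forwarders

        def forwardOnce(source, event):
            if not fired[0]:
                fired[0] = True
                callback(source, event)

        for source in self.sources:
            source.addCallback(forwardOnce)
```

So if two of the "any" events fire in succession, only the first one is delivered.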
With this primitive available, we could then manage any sort of event
transformations we liked, using event threads. And since event threads are
based on functional operations, we can build up a library over time of
common functional patterns of transformation, such as "value matches
constant -> condition" and "value computed from values using
function". One would then simply spawn a thread on the function with
parameters, e.g.:
    self.scheduler.spawn( fireWhenEqual(self.aValue, 42, self.aCondition) )
This would then fire 'aCondition' when 'aValue' equals 42. Or perhaps:
    self.scheduler.spawn(
        computeValue(
            lambda: self.aValue * 3,
            self.tripleValue,
            self.aValue
        )
    )
to ensure that 'tripleValue' is always set to three times 'aValue',
whenever 'aValue' changes. Of course, it might be easier (and more
efficient) to simply write:
    def updateTriple(self):
        while True:
            yield self.aValue; value = events.resume()
            self.tripleValue.set(value * 3)

    updateTriple = binding.Make(
        events.threaded(updateTriple), uponAssembly=True
    )
So we'll have to see over time how that works out. (events.threaded will
be an advice wrapper that turns a generator function into a function that
returns a thread, so once this component is assembled, 'self.updateTriple'
will be an initialized events.Thread object that has already run up to its
first 'yield'. You can see how this will make using event threads
ridiculously easy.)
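As a sanity check of the whole idea, here's a toy trampoline, much simplified from what peak.events would actually do (and with a module-level last-event variable standing in for the real resume() machinery; all names are made up for illustration):

```python
_last_event = None

def resume():
    """Return the event that resumed the current thread (sketch)."""
    return _last_event


class Value:
    """Edge broadcast (sketch): one-shot callbacks, fired on set()."""

    def __init__(self, value=None):
        self._value = value
        self._callbacks = []

    def __call__(self):
        return self._value

    def set(self, value):
        self._value = value
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(self, value)

    def addCallback(self, callback):
        self._callbacks.append(callback)


class Thread:
    """Scarcely more than an object advancing a generator: it suspends on
    whatever source the generator yields, and resumes when that fires."""

    def __init__(self, generator):
        self.generator = generator
        self.step(None, None)       # run up to the first yield

    def step(self, source, event):
        global _last_event
        _last_event = event
        try:
            yielded = next(self.generator)
        except StopIteration:
            return                  # generator finished; thread is done
        yielded.addCallback(self.step)


aValue, tripleValue = Value(), Value()

def updateTriple():
    while True:
        yield aValue; value = resume()
        tripleValue.set(value * 3)

Thread(updateTriple())
aValue.set(14)          # tripleValue becomes 42
```

Note that the thread consumes no CPU between set() calls; it's just a parked generator waiting on aValue's callback list.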
Hm. I just realized something rather interesting about this style of event
dependency. There's no way to end up in an infinite event loop if you
always use threads to handle events. By that I mean that if our
'updateTriple' method updated 'self.aValue' instead of 'self.tripleValue'
by mistake, it would not result in an infinite loop. The reason is that
while a thread is triggering an event, it cannot also be waiting on an
event. Thus, no such looping is possible.
Indeed, if you think about it, you'll see that any thread resumed by the
first thread has the same dilemma - it can't fire an event that will cause
the first thread to be resumed, because the first thread isn't suspended.
This doesn't mean that you can't create an infinite loop at all, just that
you can't do it in a single event cycle. For example, the second thread
could, upon waking, suspend itself for, let's say, 0.0001 seconds, to allow
the first thread to go back to sleep. Then it could fire an event that
would re-wake the first. Note, however, that 1) you have to go out of your
way to do it, 2) you have to wait for another event (thus setting a cap on
the speed of re-firing, and allowing the possibility of other threads
intervening), and 3) you can't cause a stack overflow due to recursive
invocation of threads. The maximum Python stack depth during any event
threading activity is equal to the number of simultaneously runnable threads.
(Of course, all that assumes that you don't do evil things like manually
registering a callback to run 'aThread.step()' while the thread is already
running, although we could certainly implement a re-entrancy check to
prevent that if we wanted to.)
Anyway, that's a bit of a digression, certainly from the event
taxonomy. But it does seem as though we now know what basic types are
required for non-handler events:
Condition
Semaphore
Value
Distributor
AnyOf
I've already mentioned that I'm dropping Queue, Subscription, and the
associated interfaces. They had circular reference issues and other
complications, and writing this message has proved to me that they're
entirely superfluous. The existing State and Value types (and interfaces)
will be merged into one 'Value' type; the distinction between them is
trivial. In the new Value, the 'set()' method will take an optional
'force' parameter to mean, "fire an event even if the new value set is
equal to the existing value". There doesn't seem to be any point in
maintaining different types for such a minor behavior variation.
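The merged Value's set() might then look roughly like this (a sketch, with one-shot callbacks assumed):

```python
class Value:
    """Merged State/Value (sketch): set() fires callbacks only when the
    value actually changes, unless force=True."""

    def __init__(self, value=None):
        self._value = value
        self._callbacks = []

    def __call__(self):
        return self._value

    def set(self, value, force=False):
        changed = value != self._value
        self._value = value
        if changed or force:
            # Fire (and clear) the callbacks registered as of now.
            callbacks, self._callbacks = self._callbacks, []
            for callback in callbacks:
                callback(self, value)

    def addCallback(self, callback):
        self._callbacks.append(callback)
```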
Semaphore should take a count (default of 1), and have
set(count)/put()/take() methods. put() would increment the count, and
take() would decrement it. Yielding to a semaphore suspends a thread
unless the semaphore has a positive count. A suspended thread will resume
as soon as the count rises above zero again.
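A sketch of those semantics, with a plain callback standing in for a waiting thread (the callback signature is an assumption for illustration):

```python
class Semaphore:
    """Level stream (sketch): a waiter is woken only while count > 0, and
    each wake-up goes to a single waiter, next in line."""

    def __init__(self, count=1):
        self.count = count
        self._waiters = []

    def set(self, count):
        self.count = count
        self._wake()

    def put(self):
        self.count += 1
        self._wake()

    def take(self):
        assert self.count > 0, "take() with nothing available"
        self.count -= 1

    def addCallback(self, callback):
        self._waiters.append(callback)
        self._wake()        # level-triggered: maybe resume right away

    def _wake(self):
        # Resume waiters one at a time, as long as the count stays positive.
        while self.count > 0 and self._waiters:
            self._waiters.pop(0)(self, self.count)
```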
Condition should have a set() method that takes a boolean, and it would
suspend threads yielding to it when the boolean is false. Unlike a
semaphore, however, it invokes all its callbacks when the condition becomes
true. (Note that a Condition is different from a boolean Value: a boolean
Value *always* suspends yielding threads until the next change of the
value, while a Condition only suspends when the value is false.)
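And Condition, sketched the same way (again with hypothetical names and a plain callback standing in for a suspended thread):

```python
class Condition:
    """Level broadcast (sketch): while true, callers aren't suspended at
    all; on becoming true, *all* waiting callbacks fire."""

    def __init__(self, value=False):
        self.value = bool(value)
        self._callbacks = []

    def __call__(self):
        return self.value

    def set(self, value):
        self.value = bool(value)
        if self.value:
            # Broadcast to everyone waiting, then clear the list.
            callbacks, self._callbacks = self._callbacks, []
            for callback in callbacks:
                callback(self, True)

    def addCallback(self, callback):
        if self.value:
            callback(self, True)    # already true: fire immediately
        else:
            self._callbacks.append(callback)
```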
Distributor would have just a 'send(event)' method, that would fire a
single callback, passing in the event. And AnyOf will be just a pure
IEventSource.
I think the interfaces will change a bit too. There should probably be an
'IStatefulEventSource', adding a no-arguments __call__ method to
IEventSource, to get the current "state" of the event source. This would
be implemented by all of the types previously listed, except for
Distributor and AnyOf. I suppose there could also be an
'ISettableEventSource', adding a 'set(value)' method to
'IStatefulEventSource'. This again could be implemented by everything but
Distributor and AnyOf, so perhaps it's sufficient to have a single
interface for get/settability. OTOH, functions like 'writable(socket)'
shouldn't have to guarantee they'll return a settable event source, so it
probably is best to keep the interfaces separate.
Finally, I suppose that Distributor and Semaphore should have interfaces to
cover their special methods (send, and put/take, respectively).
Hm. Well that doesn't sound too bad... five classes, maybe an abstract
base, and one more if you count Thread. Not bad at all for an event-driven
programming microkernel! Oh yeah, throw in the 'resume()' function, and a
'threaded()' advice wrapper.
There will also be the scheduler and I/O selector components, but
technically they're not part of the microkernel. You can actually use
event threads without any kind of scheduler or selector if you don't need
to wait for time or I/O events.
I also did a bit of research, looking at Twisted's half-dozen or so reactor
variants for supporting different GUI frameworks, and they all look like
variations of the idea I had the other day for simply adding a
periodically-scheduled event thread that asks the GUI framework for GUI
events, and/or using the GUI framework's I/O event support to schedule
I/O. We can interoperate with Twisted's support for these by simply adding
a thread to 'iterate()' the Twisted reactor repeatedly, or by adding GUI
threads on our own. Similarly, we can either have a raw I/O selector
manage I/O event sources, or we can implement an I/O selector that works by
delegating to a reactor. Heck, the scheduler can also be implemented as
either a standalone scheduler, or as a component that schedules threads
using a reactor. Either way, the interface can be the same.
So, the parts of peak.events that will lie beneath scheduling --
fundamental event source types and the event thread type -- really *are* a
microkernel that doesn't care much what architectures you build on top of it.
I think that about wraps this up. Unless somebody can see some gaping
design flaws that I've missed here, or has some use cases that aren't
covered, I think this represents a decent plan for the microkernel
API. Thoughts, anyone?