[PEAK] Callback-free publish/subscribe now in Trellis SVN
Phillip J. Eby
pje at telecommunity.com
Thu May 15 19:09:54 EDT 2008
I've just checked a new Trellis data type, ``collections.Hub``, into
the Trellis SVN. It lets you broadcast and receive messages
without explicit subscriptions or callbacks. You can create a rule
that reads messages from a hub using its ``get()`` method, and you
can write messages to the hub via its ``put()`` method. Any rule
that depends on a ``get()`` call will be recalculated after any
matching ``put()`` calls occur. Below is an excerpt from the
documentation describing how it works. Enjoy, and
please let me know if you have any questions or problems!

Hub
---

A ``collections.Hub`` is used for loosely-coupled many-to-many
communications with flexible pattern matching -- aka
"publish/subscribe" or "pub/sub" messaging::

    >>> hub = collections.Hub()

You can send messages into a hub by calling its ``put()`` method::

    >>> hub.put(1, 2, 3)

However, this does nothing unless there are rules using the hub's
``get()`` method to receive these messages::

    >>> @trellis.Performer
    ... def watch_3_3():
    ...     for message in hub.get(None, None, 3):
    ...         print message

    >>> hub.put(1, 2, 3)
    (1, 2, 3)

The ``put()`` and ``get()`` methods both accept an arbitrary number
of positional arguments, but ``get()`` will only match ``put()``
calls with the same number of arguments::

    >>> hub.put('x', 'y')
    >>> hub.put(1, 2, 3, 4)

Even then, a message is only delivered if the non-``None`` arguments
to ``get()`` match the corresponding arguments given to ``put()``::

    >>> hub.put(1, 2, 4)
    >>> hub.put(5, 4, 3)
    (5, 4, 3)

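The matching rule described above can be sketched as a small standalone predicate. This is only an illustration of the rule as stated here (same number of arguments, ``None`` acting as a wildcard); the ``matches`` function is a hypothetical helper and is not part of Trellis:

```python
def matches(pattern, message):
    """Return True if `message` satisfies `pattern`.

    A pattern matches only messages of the same length, and each
    non-None pattern field must equal the corresponding message field.
    """
    if len(pattern) != len(message):
        return False
    return all(p is None or p == m for p, m in zip(pattern, message))


pattern = (None, None, 3)
delivered = [
    msg
    for msg in [(1, 2, 3), ('x', 'y'), (1, 2, 3, 4), (1, 2, 4), (5, 4, 3)]
    if matches(pattern, msg)
]
# delivered now holds only (1, 2, 3) and (5, 4, 3), the same two
# messages that watch_3_3 printed in the examples above.
```
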
You can of course have multiple rules monitoring the same hub::

    >>> @trellis.Performer
    ... def watch_2_4():
    ...     for message in hub.get(2, 4, None):
    ...         print "24:", message

    >>> hub.put(2, 4, 3)
    24: (2, 4, 3)
    (2, 4, 3)

    >>> hub.put(2, 4, 4)
    24: (2, 4, 4)

And you can send more than one value in a single recalculation or
atomic action, with the relative order of messages being preserved
for each observer::

    >>> def send_many():
    ...     hub.put(1, 2, 3)
    ...     hub.put(2, 4, 4)
    ...     hub.put(2, 4, 3)

    >>> trellis.atomically(send_many)
    24: (2, 4, 4)
    24: (2, 4, 3)
    (1, 2, 3)
    (2, 4, 3)

Note, however, that all arguments to ``put()`` and ``get()`` must be hashable::

    >>> hub.put(1, [])
    Traceback (most recent call last):
      ...
    TypeError: list objects are unhashable

    >>> hub.get(1, [])
    Traceback (most recent call last):
      ...
    TypeError: list objects are unhashable

This is because hubs use a dictionary-based indexing system that
avoids the need to test every message against every observer's match
pattern. Each active ``get()`` pattern is saved under an index,
keyed by the position and value of its rightmost non-``None`` argument.

Each value in a message is then looked up in this index, and the
message is tested only against the (hopefully small) subset of active
patterns found there. For
example, if we look at the contents of our sample hub's index, we can
see that the ``(None, None, 3)`` match pattern is indexed under
"position 2, value 3", and the ``(2, 4, None)`` pattern is indexed
under "position 1, value 4"::

    >>> hub._index
    {(2, 3): {(None, None, 3): 1}, (1, 4): {(2, 4, None): 1}}

This means that ``(2, 4, None)`` will only be checked for messages
with a 4 in the second position, and ``(None, None, 3)`` will only be
checked for messages with a 3 in the third position (which of course
it will always match, as long as the message has exactly three fields).

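The indexing scheme just described can be sketched roughly as follows. This is not the Trellis implementation: ``MiniHub``, ``add_pattern`` and ``matching_patterns`` are hypothetical names, the integer stored per pattern is assumed to be a usage count, and the separate handling of all-``None`` patterns is a guess, since the excerpt does not say how those are indexed:

```python
class MiniHub(object):
    """Hypothetical stand-in for the indexing idea; not Trellis code."""

    def __init__(self):
        self.index = {}      # (position, value) -> {pattern: count}
        self.wildcards = {}  # all-None patterns (assumed handling)

    @staticmethod
    def _key(pattern):
        # Find the rightmost non-None field of the pattern.
        for pos in reversed(range(len(pattern))):
            if pattern[pos] is not None:
                return (pos, pattern[pos])
        return None

    def add_pattern(self, *pattern):
        key = self._key(pattern)
        if key is None:
            bucket = self.wildcards
        else:
            bucket = self.index.setdefault(key, {})
        bucket[pattern] = bucket.get(pattern, 0) + 1

    def matching_patterns(self, *message):
        # Look up each (position, value) of the message, then test the
        # message only against the patterns found under those keys.
        candidates = list(self.wildcards)
        for pos, value in enumerate(message):
            candidates.extend(self.index.get((pos, value), ()))
        return [
            pattern for pattern in candidates
            if len(pattern) == len(message)
            and all(p is None or p == m for p, m in zip(pattern, message))
        ]


mini = MiniHub()
mini.add_pattern(None, None, 3)
mini.add_pattern(2, 4, None)
```

With these two patterns registered, ``mini.index`` has the same shape as the ``hub._index`` shown above, and a message such as ``(1, 2, 4)`` is never tested against either pattern because neither of its index keys is present.
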
So, for best performance in high-volume applications, make sure you
design your messages to place "more distinct" fields further to the
right. For example, if you have a small number of distinct message
types, you should probably make the message type the first field, so
that if a ``get()`` matches on both the message type and some
more-distinctive field, it will be indexed only on the
more-distinctive field, so it is not tested against every
message of the desired type. (Unless, of course, the ``get()`` is
*supposed* to return all messages of the desired type!)

In contrast, if you placed the message type as the last field, then
any ``get()`` targeting a particular message type would incur a
match-time penalty for *every* message of that type. Thus, you
should place fields with fewer possible values more to the left, and
fields with a larger number of possible values more to the right.
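
To make the cost difference concrete, here is a rough sketch that counts how many messages would be tested against a single pattern under the rightmost-non-``None`` indexing rule. The helper names and the ``"order"`` message type are made up for illustration, and the treatment of an all-``None`` pattern is an assumption:

```python
def index_key(pattern):
    """Return (position, value) of the rightmost non-None field, or None."""
    for pos in reversed(range(len(pattern))):
        if pattern[pos] is not None:
            return (pos, pattern[pos])
    return None


def count_pattern_checks(pattern, messages):
    """Count how many messages would be tested against `pattern`."""
    key = index_key(pattern)
    if key is None:
        # Assumption: a pattern with no indexable field sees every message.
        return len(messages)
    pos, value = key
    return sum(1 for m in messages if len(m) > pos and m[pos] == value)


# 100 messages of one type, each with a distinct id.
type_first = [("order", n) for n in range(100)]
type_last = [(n, "order") for n in range(100)]

# Indexed under the id field: only the one message with id 42 is tested.
checks_first = count_pattern_checks(("order", 42), type_first)

# Indexed under the type field: every "order" message is tested.
checks_last = count_pattern_checks((42, "order"), type_last)
```

Here ``checks_first`` comes out as 1 and ``checks_last`` as 100, even though both patterns ultimately match the same single message.
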