[PEAK] Heads up: Trellis API changes coming

Phillip J. Eby pje at telecommunity.com
Wed Jul 25 16:17:06 EDT 2007


After giving more thought to hubs, spokes, and factbase stuff (see my 
most recent braindump), I think I've sorted out how to make some 
improvements to the current API.  It has been bothering me that we 
have 9 decorators for component class stuff, and that they overlap in 
all sorts of weird little ways, and the idea of adding even *more* 
decorators to support hub-and-spoke stuff was just not good.

So, here's what I've figured out: if I make a certain change to the 
way "event" cells reset their values, I don't need to implement a 
"hub" cell type or any ability to do "write rules".  And in addition, 
I won't need to support event cells with rules any more, which 
simplifies the existing decorator set as well as getting rid of the 
need to add more.

Specifically, the Lisp "Cells" library at some point decided to 
change how they handle "ephemeral" cells (equivalent to our "event" 
cells), so that instead of silently resetting their contents after a 
pulse, they would actually act as if they had been manually reset to 
a new value, triggering a pulse and propagation.

I was previously tempted to follow this approach myself, because 
although it requires some extra rule recalculations, it eliminates 
the need to worry about "or" conditions in rules that refer to event 
cells.  As discussed here:

   http://www.eby-sarna.com/pipermail/peak/2007-July/002720.html

it's currently necessary to "or" such conditions using "|" instead of 
"or", in order to properly maintain dependencies.

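To make the "|" versus "or" point concrete, here is a minimal sketch in plain Python. It does not use the Trellis at all: TrackedCell and dependencies_of are hypothetical stand-ins that just log which cells a rule reads, on the assumption that dependencies are discovered by watching reads while a rule runs.

```python
class TrackedCell:
    """Hypothetical stand-in for a cell: it logs every read of its value."""

    reads = []  # shared log of cell names read during the current "rule"

    def __init__(self, name, value):
        self.name = name
        self._value = value

    @property
    def value(self):
        TrackedCell.reads.append(self.name)
        return self._value


def dependencies_of(rule):
    """Run a rule once and return the names of the cells it read."""
    TrackedCell.reads = []
    rule()
    return list(TrackedCell.reads)


clicked = TrackedCell("clicked", True)
key_pressed = TrackedCell("key_pressed", False)

# Short-circuit "or": key_pressed is never read because clicked is already true.
with_or = dependencies_of(lambda: clicked.value or key_pressed.value)

# Bitwise "|": both operands are always evaluated, so both reads are recorded.
with_pipe = dependencies_of(lambda: clicked.value | key_pressed.value)

print(with_or)    # ['clicked']
print(with_pipe)  # ['clicked', 'key_pressed']
```

With "or", the rule would never learn that it also depends on key_pressed, so a later change to that cell alone would not wake it.
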
I was a little worried, though, because Cells' author Ken Tilton 
warned that this change to event propagation could have bad effects 
on rules containing side-effects.  (Cells tries to separate 
side-effects into so-called "observer" methods, while the Trellis 
just treats everything as rules.)

But after giving it some thought, what sealed the deal for this 
change in propagation strategy is that it *eliminates the need 
to have "event" rules*.  Currently, you can make a cell an "event" 
(i.e., automatically resetting its value after every pulse) *and* 
give it a rule.  This feature is needed because if a rule depends on 
an event, you need a way to have the rule reset itself, too.

However, if event cells' resetting is propagated, then the rule can 
simply reflect its reset state as part of the actual rule.

So, if I make this change to propagation, event handling is 
simplified everywhere.  Rules are just rules -- there is no need to 
worry about whether the rule should also be considered an 
"event".  Events are then simpler to define as well, since all they 
need to know is their "reset" value.
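
As a rough illustration of the propagated reset, here is a toy model in plain Python. The Receiver class, its send() method and the listeners list are all invented for this sketch and are not the Trellis implementation. The only point is that the reset is delivered as a second round of recalculation, so an ordinary rule returns to its resting state by itself.

```python
class Receiver:
    """Toy receiver: holds a value for one pulse, then resets itself."""

    def __init__(self, reset_value=None):
        self.reset_value = reset_value
        self.value = reset_value
        self.listeners = []  # rules to re-run whenever the value changes

    def _propagate(self):
        for rule in self.listeners:
            rule()

    def send(self, value):
        # Pulse 1: the new value is visible to dependent rules.
        self.value = value
        self._propagate()
        # Pulse 2: the reset is treated like any other change and propagated,
        # rather than happening silently.
        if self.value != self.reset_value:
            self.value = self.reset_value
            self._propagate()


mouse_move = Receiver(reset_value=None)
history = []


def position_label():
    # An ordinary rule: its "reset state" is just part of the rule body.
    pos = mouse_move.value
    history.append("idle" if pos is None else "at %d,%d" % pos)


mouse_move.listeners.append(position_label)
mouse_move.send((10, 20))
print(history)  # ['at 10,20', 'idle']
```

The rule needs no "event" flag of its own. It is simply run again when the receiver resets.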

In fact, given their new, more restrictive role, I think they could 
be called something other than "events".  I've got a few candidates:

* message port
* mailbox
* trigger
* command
* inbox
* consumer
* receiver

But I'm not sure how much I like any of them.  The real idea to 
convey is that it's a thing you can put one thing into, that will 
then be received, processed, and removed.  Mailbox is closest to that 
idea, except that it seems to imply you could put more than one thing 
in it at a time.  Message port also is close, but it's a long name 
and shortening it to 'port' makes it too ambiguous with other value cells.

Trigger implies that you set it off, but not that you can enclose a 
value or message.  Command isn't quite right because a command would 
be something put *into* this box.  Inbox is shorter than mailbox or 
message port, and seems to imply less capacity and delay than 
mailbox.  Consumer is a little too active.  Receiver isn't bad.

I guess I lean towards either inbox, trigger, or receiver.  The two 
common use cases are things like passing in some data from an 
external system (like a mouse-move event) or to flag a condition.  It 
makes sense to put a mouse position into an inbox, but not into a 
trigger.  Conversely, it makes sense to set a trigger to "true", but 
not as much to an inbox.  I suppose, however, that if it's a 
receiver, it can receive either mouse movements or flags.

Hm.  It would be nice if we could make receivers callable, such that 
if you have a 'mouse_move' receiver, you could use wx's Bind() to 
make it get called with the event.  That is:

     self.Bind(wx.EVT_MOTION, self.mouse_move)

instead of:

     self.Bind(wx.EVT_MOTION, self.__cells__['mouse_move'].set_value)

which is about what you have to do now.  Of course, I suppose you 
could always just do:

     self.Bind(wx.EVT_MOTION, partial(setattr, self, 'mouse_move'))

Which is good because the callable-receiver thing doesn't really work 
too well when I think about it in more detail.
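
For anyone unfamiliar with the partial() trick, here is a small standalone sketch that needs neither wx nor the Trellis. Widget is a hypothetical stand-in for a component with a 'mouse_move' attribute, and the string passed in stands in for a real event object.

```python
from functools import partial


class Widget:
    """Hypothetical stand-in for a component with a 'mouse_move' attribute."""
    mouse_move = None


widget = Widget()

# partial() pre-fills the first two arguments of setattr(obj, name, value),
# leaving a callable that takes only the value (here, the event object).
handler = partial(setattr, widget, "mouse_move")

handler("fake-event")       # what an event binding would do on each event
print(widget.mouse_move)    # fake-event
```

Because the assignment goes through ordinary attribute setting, any descriptor on the class (such as a CellProperty) would still see it.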

Okay, so receivers will still use the same CellProperty 
descriptor.  The 'event' and 'events' decorators will be replaced 
with a single 'receiver' function that creates a CellProperty and 
registers the right metadata.  It'll take one optional parameter: the 
reset value of the cell, defaulting to None.

That brings us down to a mere 8 decorators.  I think I should 
probably just drop 'value', because it's just as easy to use:

     trellis.values(x = 1)

as it is to use:

     x = trellis.value(1)

and there should be only One Obvious Way To Do It.  So that takes us 
down to 7 decorators.  (Note that we have to keep @rule and rules() 
both, because sometimes you need more than a lambda, and conversely 
it should be possible to be compact.  Practicality beats purity.)

The three remaining candidates for elimination or simplification are 
'optional', 'cell_factory', and 'cell_factories'.  'optional' only 
makes sense for rules, and no longer needs to be combinable with 
'event', so we could perhaps make optional work as both @optional 
over a function and as optional(**names_to_rules).

However, optional rules are likely to be rare.  The only reason to 
use them is to keep a rule from being updated unless some part of the 
program has actually used it at least once.  This implies both that 
1) the rule itself is somewhat unlikely to be a simple lambda 
expression, and 2) there aren't likely to be many of them.  Thus, an 
@optional-only syntax seems sufficient.

That brings us down to @cell_factory and cell_factories().  My 
original theory was that these would be needed for components that 
need to share receivers or writable values with other 
components.  For example, my "EditBridge" class that copies text from 
a wx widget to a cell.

But in practice, at least so far, it hasn't really been 
necessary.  Any place I've needed to have a specific cell, it's been 
just as easy for the component's creator to pass in the cell as a 
keyword argument.

The other reason for having cell-factory control was to allow the use 
of custom cell types, without needing to override __init__ or put the 
cell setup into another rule.  For example, at one point I planned to 
have the ability to specify an alternate comparison function for 
telling whether a value has "changed", and the only way to specify 
this would've been via custom cell construction.

However, it doesn't seem as though there really are going to be any such 
things as custom cell types, or even custom comparison 
functions.  For one thing, it's easy enough to implement most of 
these things as other rules in a fairly straightforward way.  For 
another, if we do have the need to set up custom cells, these can be 
done in one-time setup rules or __init__.

In addition, if the "custom cell" can be effectively implemented by 
decorating the cell's rule (and I think most of the use cases for 
custom cells would in fact work that way), then one might as well 
decorate the rule directly.

So, I'm going to call YAGNI on cell_factory and cell_factories, even 
though internally I have to keep much of the machinery for them in 
place.  That leaves us with the final five API functions, illustrated here:

     class Example(trellis.Component):

         trellis.values(
             C = 100,
             F = 212,
         )

         trellis.rules(
             C = lambda self: (self.F-32)/1.8,
             F = lambda self: (self.C*1.8)+32,
         )

         start_reporting = trellis.receiver(False)

         @trellis.optional
         def report(self):
             print "Celsius.....", self.C
             print "Fahrenheit..", self.F

         @trellis.rule
         def begin_reporting(self):
             if self.start_reporting:
                 self.report  # reading the cell activates the optional rule

If you set Example().start_reporting = True, it will begin printing 
both temperatures, with updates any time there's a change.  Before 
that point, it won't, because the 'report' rule is "optional" (i.e., 
not automatically activated upon __init__).

I think that trimming the API down to these five elements (with just 
three that will be used in most components) makes things a lot easier 
to learn and understand.

Okay, next bits to deal with: hubs and spokes.

First off, as I alluded to earlier, I've figured out how to get away 
without having a special hub cell type.  It's pretty much sufficient 
to just do something like this to implement a set, for example:

     empty = frozenset()

     class Set(trellis.Component):

         _added = _removed = empty

         trellis.values(
             data = None
         )

         changed = trellis.receiver(False)

         @trellis.rule
         def added(self):
             if self.changed:
                 a = self._added
                 self._added = empty
                 return a
             return empty

         @trellis.rule
         def removed(self):
             if self.changed:
                 r = self._removed
                 self._removed = empty
                 return r
             return empty

         @trellis.rule
         def data(self):
             data = self.data
             if data is None:
                 data = set()
             if self.added:
                 for item in self.added: data.add(item)
             if self.removed:
                 for item in self.removed: data.remove(item)
             return data

         def __iter__(self):
             return iter(self.data)

         def __len__(self):
             return len(self.data)

         def __contains__(self, item):
             return item in self.data

         def add(self, item):
             self.changed = True
             if item in self._removed:
                 self._removed.remove(item)
             elif self._added:
                 self._added.add(item)
             else:
                 self._added = set([item])

         def remove(self, item):
             self.changed = True
             if item in self._added:
                 self._added.remove(item)
             elif self._removed:
                 self._removed.add(item)
             else:
                 self._removed = set([item])

This set's "added" and "removed" attributes always reflect the 
changes in a Set's contents since the preceding pulse.  No special 
cell types are required, just a little care in design and 
implementation -- plus the fact that the change in "receiver" 
propagation means that added and removed will reset themselves automatically.
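
The bookkeeping in add() and remove() can be tried out without the Trellis. The sketch below is a simplified plain-Python analogue, not the component above: PendingSet and its explicit pulse() method are hypothetical, with pulse() standing in for whatever the Trellis does between pulses.

```python
class PendingSet:
    """Toy set that batches changes and applies them once per pulse."""

    def __init__(self):
        self.data = set()
        self._added = set()
        self._removed = set()

    def add(self, item):
        if item in self._removed:
            self._removed.discard(item)   # cancel a pending removal
        else:
            self._added.add(item)

    def remove(self, item):
        if item in self._added:
            self._added.discard(item)     # cancel a pending addition
        else:
            self._removed.add(item)

    def pulse(self):
        """Apply pending changes; return what was (added, removed)."""
        added, removed = self._added, self._removed
        self._added, self._removed = set(), set()
        self.data |= added
        self.data -= removed
        return added, removed


s = PendingSet()
s.add("a")
s.add("b")
s.remove("b")            # cancels the pending add of "b"

first = s.pulse()
second = s.pulse()       # nothing happened in between

print(first)             # ({'a'}, set())
print(second)            # (set(), set())
print(s.data)            # {'a'}
```

The second pulse reports empty change sets, which is the same "reset themselves automatically" behavior that the receiver change gives the real added and removed rules.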

Now on to spokes...

Spokes are needed in order to eliminate N-way recalculations when only 
a much smaller number of cells (e.g. zero or one) need 
recalculating.  Nominally, one could do this by having a rule that 
assigns values to the target (subscriber?) cells, but this results in 
the targets being out of date, because assignments don't take effect 
until the next pulse.

So the idea of a spoke is to allow assignment to happen in the 
current pulse.  This can only work, of course, if the assigner is a 
single rule, and that rule is always calculated before the cell's 
value is read in a given pulse.

I had been thinking that this would require a special cell type, too, 
but I think I've come up with a way to avoid it, by rethinking the 
idea of a spoke from a "write" orientation to a "read" orientation.

Specifically, the reason we don't want to use ordinary rule cells as 
spokes is that if they depend on a central cell that changes every 
time something happens, then they will be "awakened" every time 
something happens.  Conversely, if they depend on a central cell that 
*doesn't* change, then they will never wake up!

However, it has occurred to me that instead of trying to write to the 
cells to get around this, what we could do instead is make them 
depend on an un-changing central cell -- that has some way to 
forcibly "wake" the desired spoke cells.  That is, force them to be 
recalculated within the current pulse.

That way, a spoke cell could simply contain a rule that depends on 
the central cell, and computes its state using some non-cell data (so 
there's no dependency).  The central cell's rule then contains code 
to force recalculation of the appropriate "spoke" cells.  Here's an 
example of this approach for the perennial "Time service" example::

import heapq, time, weakref   # standard-library modules used below

class Time(trellis.Component, context.Service):

     trellis.rules(
         now        = lambda self: volatile() and Timestamp(time.time()),
         _schedule  = lambda self: [NOT_YET],
         _events    = lambda self: weakref.WeakValueDictionary(),
     )

     @trellis.rule
     def next_event(self):
         triggered = self._triggered = set()
         while self.now >= self._schedule[0]:
             key = heapq.heappop(self._schedule)
             if key in self._events:
                 triggered.add(key)
                 self._events.pop(key).force_recalc()
         return self._schedule[0]

     @trellis.rule
     def _update(self):
         self.next_event  # read only to create the dependency; value stays None

     def after(self, when):
         if when not in self._events:
             if when <= self.now:
                 return True
             heapq.heappush(self._schedule, when)
             self._events[when] = e = \
                 Cell(lambda:
                     e.value or self._update or when in self._triggered,
                     False
                 )
         return self._events[when].value

     def delay(self, secs=0):
         return self.after(self.now + secs)

It's not what I'd call beautiful, but it's okay.  The messy bits are 
that we have to make a dummy rule (_update) that depends on 
next_event but never changes itself -- its value is always 
None.  This means that the spoke Cells depending on it will never be 
recalculated unless forced to.  The second messy bit is that 
next_event has to stick some data in a non-cell attribute.

Finally, the Cell rule itself is a bit weird looking, in order to 
ensure that it can go Constant as soon as it has a true value.  It 
checks its own value first, so that it won't depend on _update any 
more in that case.  Then, it references _update so that it (and 
next_event) will always be checked if you ask for the cell's value 
directly.  And finally, it looks in _triggered to see if it's a go.

Whew.  So all that's needed to implement this use case is a 
'force_recalc()' method on cells, which would be pretty easy to add.
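
The scheduling half of this can be shown on its own. The following is a plain-Python sketch of the heap-with-sentinel loop only. Schedule, after() and advance() are names made up for the illustration, a plain callback stands in for force_recalc(), and NOT_YET is assumed to be a value that sorts after every real timestamp.

```python
import heapq

NOT_YET = float("inf")  # sentinel: always sorts after any real timestamp


class Schedule:
    """Toy scheduler: wakes only the callbacks whose time has arrived."""

    def __init__(self):
        self._heap = [NOT_YET]   # never empty, so _heap[0] is always valid
        self._callbacks = {}     # when -> callback to "force" at that time

    def after(self, when, callback):
        heapq.heappush(self._heap, when)
        self._callbacks[when] = callback

    def advance(self, now):
        """Pop every due entry, run its callback, return the next due time."""
        while now >= self._heap[0]:
            when = heapq.heappop(self._heap)
            callback = self._callbacks.pop(when, None)
            if callback is not None:
                callback()
        return self._heap[0]


woken = []
schedule = Schedule()
schedule.after(5, lambda: woken.append("five"))
schedule.after(2, lambda: woken.append("two"))
schedule.after(9, lambda: woken.append("nine"))

next_due = schedule.advance(now=6)
print(woken)     # ['two', 'five']
print(next_due)  # 9
```

Only the two due entries are touched. The entry for 9 costs nothing until its time comes, which is the whole point of a spoke.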

But this use case is what you might call a "steady value spoke" -- 
once triggered, we don't need for these "after()" cells to ever 
change value again.  How does this approach work for our factbase use 
case, where 'added' and 'removed' on derived sets need to reset 
themselves automatically?

If we used the same basic approach, the spoke cells would not be 
refreshed when the main set's added/removed sets reset to being 
empty.  So, a splitter's update rules would need to also refresh any 
cells that were directly affected by the last recalculation, in 
addition to the ones that were directly affected by the current update.

This is likely to also be a bit ugly, perhaps even more so than the 
Time example, but it should still work.

So, it seems we should be able to do without special hub *or* spoke 
cell types, as long as we're willing to put up with some messiness 
being required in order to implement services like Time or 
Splitters.  Of course, it might be possible to design some decorators 
or other tools that would decrease the messiness.  But for now I'll 
put up with the messiness, if only to learn enough of the patterns to 
be able to automate them.

One slightly tricky way to clean up the Time service is this:

class Time(trellis.Component, context.Service):

     trellis.rules(
         now        = lambda self: volatile() and Timestamp(time.time()),
         _schedule  = lambda self: [NOT_YET],
         _events    = lambda self: weakref.WeakValueDictionary(),
     )

     @trellis.rule
     def _updated(self):
         if self._updated is None:
             updated = set()
         else:
             updated = self._updated
             updated.clear()
         while self.now >= self._schedule[0]:
             key = heapq.heappop(self._schedule)
             if key in self._events:
                 updated.add(key)
                 self._events.pop(key).force_recalc()
         return updated

     @trellis.rule
     def next_event(self):
         self._updated
         return self._schedule[0]

     def after(self, when):
         if when not in self._events:
             if when <= self.now:
                 return True
             heapq.heappush(self._schedule, when)
             self._events[when] = e = \
                 Cell(lambda: e.value or when in self._updated, False)
         return self._events[when].value

     def delay(self, secs=0):
         return self.after(self.now + secs)

In this version, we make a rule, '_updated', that always returns the 
same set object, modified in-place.  This makes it seem as though the 
'_updated' cell never changes, even though it really does.  Thus, the 
spokes force it to recalculate as needed, but are never awakened by 
its recalculation (since it never changes), so the spokes are only 
recalculated when '_updated' explicitly asks them to.  Thus, the 
visible messiness of the previous approach has now been replaced with 
somewhat under-handed trickery...  which alas, is not necessarily an 
improvement!
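
Here is why the in-place trick hides the change, as a standalone sketch. ComparingCell is a hypothetical toy, and it assumes change detection works by comparing the previous value with the newly computed one. The two rule functions are likewise invented for the demonstration.

```python
class ComparingCell:
    """Toy cell that counts a change only when a recalculation yields a
    value that compares unequal to the previous one."""

    def __init__(self, rule):
        self.rule = rule
        self.value = rule(None)
        self.notifications = 0

    def recalc(self):
        new = self.rule(self.value)
        if new != self.value:
            self.notifications += 1
        self.value = new


def fresh_set(previous):
    # Builds a new set object on every recalculation.
    items = set() if previous is None else set(previous)
    items.add(len(items))
    return items


def same_set(previous):
    # Mutates and returns the very same set object every time.
    items = set() if previous is None else previous
    items.add(len(items))
    return items


fresh = ComparingCell(fresh_set)
shared = ComparingCell(same_set)
for _ in range(2):
    fresh.recalc()
    shared.recalc()

print(fresh.notifications)   # 2
print(shared.notifications)  # 0
print(shared.value)          # {0, 1, 2}
```

The shared set really does grow, but because the "old" and "new" values are one object, the comparison never sees a difference.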

Ah well, something to consider later.  For now, it looks to me like 
the reduction to 5 decorators, and the change to the propagation 
rules will suffice to let us implement an efficient fact base system, 
time service, and indeed any other many-to-many services (such as 
select()-triggered conditions).



