PEAK-Rules/Design |
UserPreferences |
The PEAK Developers' Center | FrontPage | RecentChanges | TitleIndex | WordIndex | SiteNavigation | HelpContents |
NOTE: This document is for people who are extending the core framework in some way, e.g. adding custom action types to specialize method combination, or creating new kinds of engines or conditions. It isn't intended to be user documentation for the built-in rule facility.
The PEAK-Rules core framework provides a generic API for creating and manipulating generic functions, with a high degree of extensibility. Almost any concept implemented by the core can be replaced by a third-party implementation on a function-by-function basis. In this way, an individual library or application can provide for its specific needs, without needing to reinvent the entire spectrum of tools.
The main concepts implemented by the core are:
The first versions will focus on developing a core framework for extensible functions that is itself implemented using extensible functions. This self-bootstrapping core will implement a type-tuple-caching engine using relatively primitive operations, and will then have a method combination system built on that. The core will thus be capable of implementing generic functions with multiple dispatch based on positional argument types, and the decorator APIs will be built around that.
The next phase of development will add alternative engines that are oriented towards predicate dispatch and more sophisticated ways of specifying regular class dispatch (e.g. being able to say things like isinstance(x,Foo) or isinstance(y,Foo)). To some extent this will be porting the expression machinery from RuleDispatch to work on the new core, but in a lot of ways it'll just be redone from scratch. Having type-based multiple dispatch available to implement the framework should enable a significant reduction in the complexity of the resulting library.
An additional phase will focus on adding new features not possible with the RuleDispatch engine, such as "predicate functions" (a kind of dynamic macro or rule expansion feature), "classifiers" (a way of priority-sequencing a set of alternative criteria) and others.
Finally, specialty features such as index customization, thread-safety, event-oriented rulesets, and such will be introduced.
(Note: Criteria, signatures, and predicates are described and tested in detail by the Criteria.txt document.)
A collection of rules, combined with some policy information (such as the default action type) and optional optimization hints. A rule set does not directly implement dispatching. Instead, rule engines subscribe to rule sets, and the rule set informs them when actions are added and removed due to changes in the rule set's rules.
This would almost be better named an "action set" than a "rule set", in that it's (virtually speaking) a collection of actions rather than rules. However, you do add and remove entries from it by specifying rules; the actions are merely implied by the rules.
Generic functions will have a __rules__ attribute that points to their rule set, so that the various decorators can add rules to them. You will probably be able to subclass the base RuleSet class or create alternate implementations, as might be useful for supporting persistent or database-stored rules. (Although you'd probably also need a custom rule engine for that.)
An object that manages the dispatching of a given rule set to implement a specific generic function. Generic functions will have an __engine__ attribute that points to their current engine. Engines will be responsible for doing any indexing, caching, or code generation that may be required to implement the resulting generic function.
The default engine will implement simple type-based multiple dispatch with type-tuple caching. For simple generic functions this is likely to be faster than almost anything else, even C-assisted RuleDispatch. It also should have far less definition-time overhead than a RuleDispatch-style engine would.
Engines will be pluggable, and in fact there will be a mechanism to allow engines to be switched at runtime when certain conditions are met. For example, the default engine could switch automatically to a RuleDispatch-like engine if a rule is added whose conditions can't be translated to simple type dispatching. There will also be some type of hint system to allow users to suggest what kind of engine implementation or special indexing might be appropriate for a particular function.
Method combination is performed using the combine_actions() API function:
>>> from peak.rules.core import combine_actions
combine_actions() takes two arguments: a pair of actions. They are compared using the overrides() generic function to see if one is more specific than the other. If so, the more specific action's override() method is called, passing in the less-specific action. If neither action can override the other, the first action's merge() method is called, passing in the other action.
In either case, the result of calling the merge() or override() method is returned.
So, to define a custom action type for method combination, and it needs to implement merge() and override() methods, and it must be comparable to other method types via the overrides() generic function.
The implies() function is used to determine the logical implication relationship between two signatures. A signature s1 implies a signature s2 if s2 will always match an invocation matched by s1. (Action implication is based on signature implication; see the Action Types section below for more details.)
For the simplest signatures (tuples of types), this corresponds to a subclass relationship between the elements of the tuples:
>>> from peak.rules.core import implies >>> implies(int, object) True >>> implies(object, int) False >>> implies(int, str) False >>> implies(int, int) True >>> implies( (int,str), (object,object) ) True >>> implies( (object,int), (object,str) ) False
It's possible for a longer tuple to imply a shorter one:
>>> implies( (int,int), (object,) ) True
But not the other way around:
>>> implies( (int,), (object,object) ) False
And as a special case of type implication, any classic class implies both object and InstanceType, but cannot imply any other new-style classes. This special-casing is used to work around the fact that isinstance() will say that a classic class instance is an instance of both object and InstanceType, but issubclass() doesn't agree. PEAK-Rules wants to conform with isinstance() here:
>>> class X: pass >>> implies(X, object) True >>> from types import InstanceType >>> implies(X, InstanceType) True
Type or class objects are used to represent "this class or a subclass", but istype() objects are used to represent either "this exact type" (using istype(aType, True)), or "anything but this exact type" (istype(aType, False)). So their implication rules are different.
Internally, PEAK-Rules uses istype objects to represent a call signature being matched, because the argument being tested is of some exact type. Then, any rule signatures that are implied by the calling signature are considered "applicable".
So, istype(aType, True) (the default) must always imply the same type or class, or any parent class thereof:
>>> from peak.rules import istype >>> implies(istype(int), int) True >>> implies(istype(int), object) True >>> implies(istype(X), InstanceType) True >>> implies(istype(X), object) True
But not the other way around:
>>> implies(int, istype(int)) False >>> implies(object, istype(int)) False >>> implies(InstanceType, istype(X)) False >>> implies(object, istype(X)) False
An exact type will also imply any exclusion of a different exact type:
>>> implies(istype(int), istype(str, False)) True
In other words, if type(x) is int, that implies type(x) is not str. But of course, that doesn't work they other way around:
>>> implies(istype(str, False), istype(int)) False
These implication rules are sufficient to bootstrap the basic types-only rules engine; additional rules for istype behavior are explained in Criteria.txt to show intersection of criteria such as istype, and other more-advanced criteria manipulation used in the full predicate rules engine.
The default action type (for rules with no specified action type) is Method. A Method combines a body, a signature, a definition-order serial number, and an optional "chained" action that it can fall back to. All of these values are optional, except for the body:
>>> from peak.rules.core import Method, overrides >>> def dummy(*args, **kw): ... print "called with", args, kw >>> meth = Method.make(dummy, (object,), 1) >>> meth Method(<...dummy...>, (<type 'object'>,), 1, None)
Calling a Method invokes the wrapped body:
>>> meth(1,2,x=3) called with (1, 2) {'x': 3}
One Method overrides another if and only if its signature implies the other's:
>>> overrides(Method.make(dummy,(int,int)), Method.make(dummy,(object,object))) True >>> overrides(Method.make(dummy,(object,object)), Method.make(dummy,(int,int))) False
When a method overrides another, you get the overriding method:
>>> meth.override(Method.make(dummy)) Method(<...dummy...>, (<type 'object'>,), 1, None)
Unless the overriding method's body is a function whose first parameter is named next_method, in which case a chain of methods is created via the "tail" of a copy of the overriding method:
>>> def overriding_fn(next_method, etc): ... print "calling", next_method ... return next_method(etc) >>> chain = Method.make(overriding_fn).override(Method.make(dummy)) >>> chain Method(<...overriding_fn...>, (), 0, Method(<...dummy...>, (), 0, None))
The resulting chain is a callable Method, and the next_method is passed in to the first function of the chain:
>>> chain(42) calling <function dummy at...> called with (42,) {}
Around methods are identical to normal Method objects, except that whenever an Around method and a regular Method are combined, the Around method overrides the regular one. This forces all the regular methods to be further down the chain than all of the "around" methods.
>>> from peak.rules.core import Around>>> combine_actions(Method.make(dummy), Around(overriding_fn)) Around(<...overriding_fn...>, (), 0, Method(<...dummy...>, (), 0, None))
You will normally only want to use Around methods with functions that have a next_method parameter, since their purpose is to wrap "around" the calling of lower-precedence methods. If you don't do this, then the method chain will always end at that Around instance:
>>> combine_actions(Method.make(overriding_fn), Around(dummy)) Around(<...dummy...>, (), 0, None)
The simplest possible action type is NoApplicableMethods, meaning that there is no applicable action. When it's overridden by another method, it will of course get chained to the other method's tail (if appropriate).
>>> from peak.rules import NoApplicableMethods >>> naf = NoApplicableMethods() >>> meth = Method.make(overriding_fn)>>> combine_actions(naf, meth) Method(<...overriding_fn...>, (), 0, NoApplicableMethods())>>> combine_actions(meth, naf) Method(<...overriding_fn...>, (), 0, NoApplicableMethods())
Calling a NoApplicableMethods raises it, displaying the arguments it was called with:
>>> naf(1,2,x="y") Traceback (most recent call last): ... NoApplicableMethods: ((1, 2), {'x': 'y'})
MethodList actions differ from normal method chain actions in a number of ways:
The Before and After action types are both MethodList subclasses. Before actions are invoked before their tail action, and After actions are invoked afterward:
>>> from peak.rules.core import Before, After >>> def primary(*args,**kw): ... print "primary method called" ... return 99 >>> b = Before.make(dummy).override(Method.make(primary)) >>> a = After.make(dummy).override(Method.make(primary)) >>> b(23) called with (23,) {} primary method called 99 >>> a(42) primary method called called with (42,) {} 99
Notice that to create a MethodList with only one method, you must use the make() classmethod. Method also has this classmethod, but it has the same signature as the main constructor. The main constructor for MethodList has a different signature for its internal use.
The combination of before, after, primary, and around methods is as shown:
>>> b = Before.make(dummy) >>> a = After.make(dummy) >>> p = Method.make(primary) >>> o = Around.make(overriding_fn) >>> combine_actions(b, combine_actions(a, combine_actions(p, o)))(17) calling <function before_template at ...> called with (17,) {} primary method called called with (17,) {} 99
Around methods take precedence over all other method types, so the around method's tail is a Before that wraps the After that wraps the primary method.
Within a MethodList, methods are ordered by signature implication first, and then by definition order within groups of ambiguous signatures:
>>> b1 = Before.make("b1", (), 1) >>> b2 = Before.make("b2", (), 2) >>> b3 = Before.make("b3", (int,), 3) >>> combine_actions(b2, b3).sorted() [((<type 'int'>,), 'b3'), ((), 'b2')] >>> combine_actions(b2, b1).sorted() [((), 'b1'), ((), 'b2')] >>> combine_actions(b3, combine_actions(b1,b2)).sorted() [((<type 'int'>,), 'b3'), ((), 'b1'), ((), 'b2')]
After methods sort the opposite way:
>>> a1 = After.make("a1", (), 1) >>> a2 = After.make("a2", (), 2) >>> a3 = After.make("a3", (int,), 3) >>> combine_actions(a2, a3).sorted() [((), 'a2'), ((<type 'int'>,), 'a3')] >>> combine_actions(a2, a1).sorted() [((), 'a2'), ((), 'a1')] >>> combine_actions(a3, combine_actions(a1,a2)).sorted() [((), 'a2'), ((), 'a1'), ((<type 'int'>,), 'a3')]
And lower-precedence duplicate bodies are automatically eliminated from the results:
>>> combine_actions(a1,a1).sorted() [((), 'a1')] >>> combine_actions(b1,b1).sorted() [((), 'b1')] >>> combine_actions(b1, Before.make("b1", (int,), 1)).sorted() [((<type 'int'>,), 'b1')]
When you combine actions whose signatures are ambiguous (i.e. identical, overlapping, or mutually exclusive), you end up with an AmbiguousMethods object containing the ambiguous methods:
>>> am = combine_actions(meth, meth) >>> am AmbiguousMethods([Method(...), Method(...)])
Ambiguous methods can be overridden by an action that would override all of the ambiguous actions:
>>> m1 = Method.make(dummy, (int,)) >>> combine_actions(am, m1) is m1 True >>> combine_actions(m1, am) is m1 True
And if appropriate, the AmbiguousMethods will end up chained to the overriding method:
>>> m2 = Method.make(overriding_fn, (str,)) >>> combine_actions(am, m2) Method(<...overriding_fn...>, (<type 'str'>,), 0, AmbiguousMethods(...)) >>> combine_actions(m2, am) Method(<...overriding_fn...>, (<type 'str'>,), 0, AmbiguousMethods(...))
Ambiguous methods override and ignore anything that would be overridden by any of their members:
>>> am = combine_actions(m1, m1) >>> combine_actions(am, meth) is am True >>> combine_actions(meth, am) is am True
But anything that overlaps just results in a bigger AmbiguousMethods:
>>> combine_actions(m2,am) AmbiguousMethods([Method(...), Method(...), Method(...)]) >>> combine_actions(am,m2) AmbiguousMethods([Method(...), Method(...), Method(...)])
And invoking an AmbiguousMethods instance just outputs diagnostic info:
>>> am(1,2,x="y") Traceback (most recent call last): ... AmbiguousMethods: ([Method(...), Method(...)], (1, 2), {'x': 'y'})
Custom method types can be defined by subclassing Method, and used as a generic function's default method type by setting the functions' rules' default_actiontype:
>>> class MyMethod(Method): ... def __call__(self, *args, **kw): ... print "calling!" ... return self.body(*args, **kw) >>> from peak.rules import when, abstract >>> from peak.rules.core import rules_for >>> tmp = lambda foo: 42 >>> def func_with(mtype): ... abstract() ... def f(foo): """dummy""" ... rules_for(f).default_actiontype = mtype ... when(f, (object,))(tmp) ... return f >>> f = func_with(MyMethod) >>> f(1) calling! 42
The compile_method(action, engine) function takes a method and a dispatch engine, and returns a compiled version of the action:
>>> from peak.rules.core import compile_method, Dispatching >>> engine = Dispatching(f).engine >>> compile_method(Method(tmp, ()), engine) is tmp True
However, for our newly defined method type, there is no compilation:
>>> m = MyMethod(tmp, ()) >>> compile_method(m, engine) is tmp False >>> compile_method(m, engine) is m True
This is because our method type redefined __call__() but did not include its own compiled() method.
The compiled() method of a Method subclass takes an Engine as its argument, and should return a callable to be used in place of directly calling the method itself. It should pass any objects it plans to call (e.g. its tail or individual submethods) through compile_method(ob, engine), in order to ensure that those objects are also compiled:
>>> class MyMethod2(Method): ... def compiled(self, engine): ... print "compiling" ... return compile_method(self.body, engine) >>> m = MyMethod2(tmp) >>> compile_method(m, engine) is tmp compiling True
As you can see, compile_method() invokes our new compiled() method, which ends up returning the original function. And, if we don't define a __call__() method of our own, we end up inheriting one from Method that compiles the method and invokes it for us:
>>> m(1) compiling 42
However, if we use this method type in a generic function, then the generic function will cache the compiled version of its methods so they don't have to be compiled every time they're called:
>>> f = func_with(MyMethod2) >>> f(1) compiling 42 >>> f(1) 42
(Note: what caching is done, and when the cache is reset is heavily dependent on the specific dispatching engine in use; it can also be the case that a similar-looking method object will be compiled more than once, because in each case it has a different tail or match signature.)
Now, Method subclasses do NOT inherit their compiled() method from their base classes, unless they are also inheriting __call__. This prevents you from ending up with strangely-broken code in the event you redefine __call__(), but forget to redefine compiled():
>>> class MyMethod3(MyMethod2): ... def __call__(self, *args, **kw): ... print "calling!" ... return self.body(*args, **kw) >>> f = func_with(MyMethod3) >>> f(1) calling! 42 >>> f(1) calling! 42
As you can see, the new subclass works, but doesn't get compiled. So, you can do your initial debugging and development without compilation by defining __call__(), and then switch over to compiled() once you're happy with your prototype.
Now, let's define a method type that works like MyMethod3, but is compiled using a template:
>>> class NoisyMethod(Method): ... def compiled(self, engine): ... print "compiling" ... body = compile_method(self.body, engine) ... return engine.apply_template(noisy_template, body)
So far, it looks a little like our earlier compilation. We compile the body like before, but then, what's that apply_template stuff?
The apply_template() method of engine objects takes a "template" function and one or more arguments representing values that need to be accessible in our compiled function. Let's go ahead and define noisy_template now:
>>> def noisy_template(__func, __body): ... return """ ... print "calling!" ... return __body($args) ... """
Template functions are defined using the conventions of DecoratorTools's @template_function decorator, only without the decorator. The first positional argument is the generic function the compiled method is being used with, and any others are up to you.
Any use of $args is replaced with the correct calling signature for invoking a method of the corresponding generic function, and you must name all of your arguments and local variables such that they won't conflict with any actual argument names. (In practice, this means you want to use __-prefixed names, which is why we're defining the template outside the class, to prevent Python from mangling our parameter names and messing up the template.)
Note, too, that all the other caveats regarding @template_function functions apply, including the fact that the function cannot actually use any of its arguments (or any variables from its containing scope) to determine the return string -- it must simply return a constant string. (It can, however, refer to globals in its defining module, as long as they're not shadowed by the generic function's argument names.)
Okay, let's see our new method type in action:
>>> f = func_with(NoisyMethod) >>> f(1) compiling calling! 42 >>> f(1) calling! 42
As you can see, the method is still compiled just once, but still prints "calling!" every time it's invoked, as the compiled form of the method is a purpose-built wrapper function.
To save time and memory, the engine.apply_template() tries to memoize calls so that it will return the same function, given the same inputs, so long as the function still exists:
>>> from peak.rules import value >>> m = NoisyMethod((), value(42)) >>> m1 = compile_method(m) compiling >>> m2 = compile_method(m) compiling >>> m1 is m2 True
This will only work, however, if all the arguments passed to apply_template are usable as dictionary keys. So, it's best to use tuples instead of lists, frozensets instead of sets, etc. (Also, this means you can't pass in keyword arguments.)
You can define one method type's precedence relative to another using the >> operator (which always returns its right-side operand):
>>> NoisyMethod >> Method <class 'peak.rules.core.Method'>
You can also chain >> operators to define overall method precedence between multiple types, e.g.:
>>> Around >> NoisyMethod >> Method <class 'peak.rules.core.Method'>
As long as you don't try to introduce a precedence cycle:
>>> NoisyMethod >> MyMethod2 >> Around Traceback (most recent call last): ... TypeError: <class 'peak.rules.core.Around'> already overrides <class 'MyMethod2'>
XXX decorators and how to create them: when, around, before, after:
>>> from peak.rules import before, after >>> def p(x): print x >>> def f(): p("yo!")
Rule decorators return the function they are decorating, unless the function's name is also the name of the generic function they're adding to:
>>> before(f)(lambda: p("before")) <function <lambda> at ...> >>> after(f)(lambda: p("after")) <function <lambda> at ...> >>> f() before yo! after
Decorators can accept an entry point string in place of an actual function, provided that the PEAK "Importing" package (peak.util.imports) is available. In that case, the registration is deferred until the named module is imported:
>>> before('some.module:somefunc')(lambda: p("before")) <function <lambda> at ...>
If the named module is already imported, the registration takes place immediately, otherwise it is deferred until the named module is actually imported.
This allows you to provide optional integration with modules that might or might not be used by a given application, without creating a dependency between your code and that package.
Note, however, that if the named function doesn't exist when the module is imported, then an attribute error will occur at import time. The syntax of the target name is lightly checked at call time, however:
>>> before('foo.bar')(lambda: p("before")) Traceback (most recent call last): ... TypeError: Function specifier 'foo.bar' is not in 'module.name:attrib.name' format >>> before('foo: bar')(lambda: p("before")) Traceback (most recent call last): ... TypeError: Function specifier 'foo: bar' is not in 'module.name:attrib.name' format
(This is just a sanity check, though, just to make sure you didn't accidentally put some other string first (like the criteria). It won't detect a string that points to a non-existent module, or various other possible errors, so you should still verify that your code gets run when the target module is imported and the relevant conditions apply.)
XXX custom combination demo from RuleDispatch (compute upcharges+tax)
Rules are currently implemented as 3-item tuples comprising a predicate, a body, and an action type that will be used as a factory to create the actions for the rule. At minimum, all a rule needs is a body, so there's a convenience constructor (Rule) that allows you to create a rule with defaults. The predicate and action type default to () and None if not specified:
>>> from peak.rules.core import Rule >>> def dummy(): pass >>> r = Rule(dummy, sequence=0) >>> r Rule(<function dummy ...>, (), None, 0)
An action type of None (or any false value) means that the ruleset should decide what action type to use. Actually, it can decide anyway, since the rule set is always responsible for creating action objects; the rule's action type is really just advisory to begin with.
RuleSet objects hold the rules and policy information for a generic function, including the default action type and optional optimization hints.
Iterating over a ruleset yields its actions:
>>> from peak.rules.core import RuleSet >>> rs = RuleSet() >>> list(rs) []
And rules can be added and removed with the add() and remove() methods:
>>> r = Rule(dummy, sequence=42) >>> rs.add(r) >>> list(rs) [Rule(<function dummy ...>, (), <...Method...>, 42)] >>> rs.remove(r) >>> list(rs) []
Observers can be added with the subscribe() and unsubscribe() methods. Observers have their actions_changed method called with an "added" set and a "removed" set of action definitions. (An action definition is a tuple of the form (actiontype, body, signature, serial), and can thus be used to create action objects.)
>>> class DummyObserver: ... def actions_changed(self, added, removed): ... for a in added: print "Add:", a ... for a in removed: print "Remove:", a >>> do = DummyObserver() >>> rs.subscribe(do) >>> rs.add(r) Add: Rule(<function dummy ...>, (), <...Method...>, 42) >>> rs.remove(r) Remove: Rule(<function dummy ...>, (), <...Method...>, 42) >>> rs.unsubscribe(do)
When an observer is first added, it's notified of the current contents of the RuleSet, if any. As a result, observers don't need to do any special case handling for their initial setup. Everything can be handled via the normal operation of the actions_changed() method:
>>> rs.add(r) >>> rs.subscribe(do) Add: Rule(<function dummy ...>, (), <...Method...>, 42)
Unsubscribing, however, does not send any removal messages:
>>> rs.unsubscribe(do)
This section is currently just design notes to myself, hopefully to grow into a more thorough discussion and doctests of the relevant sub-frameworks.
# These 2 funcs must skip dupes and return the item alone if only 1 found disj(*items) = Or( [i for item in items for i in disjuncts(item)] ) conj(items) = And([i for item in items for i in conjuncts(item)] ) def combinatorial(seq, *tail): if tail: return ((item,)+t for item in seq for t in combinatorial(*tail)) else: return ((item,) for item in seq) # this func would be more efficient if 'conj' were moved inside 'combinatorial' # especially if conj were a binary operation, and the results of each nested # loop were reduced to a unique set... # intersect(*items) = Or( map(conj, combinatorial(*map(disjuncts,items))) ) # simplified, but still needs dupe skipping/flattening of the Or intersect(i1, i2) = Or( *[conj((a,b)) for a in set(disjuncts(i1)) for b in set(disjuncts(i2))] ) disjuncts(Or) = Or.items disjuncts(Not) = map(negate, conjuncts(Not.expr)) disjuncts(*) = [*] conjuncts(And) = And.items conjuncts(Not) = map(negate, disjuncts(Not.expr)) conjuncts(*) = [*] negate(And) = Or(map(negate,And.items)) negate(Or) = And(map(negate,Or.items)) negate(Not) = Not.expr negate(Compare) = reverse comparison sense ? negate(*) = Not(*) Not-methods and negate() function could be eliminated by CriteriaBuilder tracking negation during build. implies(Or, *) iff all Or.items imply * implies(And, *) iff any And.items imply * implies(*, *) iff equal items [need to define for struct/struct, too!] implies(Range, Range) by range overlap implies(IsInstance, IsInstance) by subclass relationships/truth map to_logic(Call) -> via function mapping for Call(Const(),...) to_logic(Compare) -> Identity, IsInstance, Range, etc.? to_logic(*) -> Truth(expr, mode)
dispatch_table(*, Identity, cases) -> {seed: bitmap} where bitmap = inclusions[seed] | (inclusions[None] - exclusions[seed]) | (cases - known_cases)