[PEAK] Function expansion and condition negation issues in
PEAK-Rules
Phillip J. Eby
pje at telecommunity.com
Thu Jul 3 18:29:52 EDT 2008
So, I've got a prototype syntax.match() "meta function" implemented
in PEAK-Rules, but there are some issues. Right now, it works
correctly if -- and only if -- you call it as a top-level condition. e.g.:
syntax.match(...) and ... and ...
It does not work correctly if you use it within an expression, e.g.:
syntax.match(...) * 42
or as a negated condition, e.g.:
not syntax.match(...)
This is due to the rather naive way in which I implemented inline
function expansion, such that the expanded function doesn't know what
"truth mode" it's operating in (i.e., whether a "not" is active), or
whether it's in "expression mode" or "condition mode".
There are a few possible ways of refactoring to handle this. First,
I could require separate meta-functions to be registered for
conditions vs. expressions. This seems fairly reasonable since so
far most of my use cases for meta-functions are at the condition
level rather than the expression level. Indeed, I could just change
what I already have to make meta-functions operate strictly at the
condition level, and that would be fine for now.
Dealing with negated conditions is trickier. I could add a
general-purpose condition negation generic function to the framework,
but implementing it is not necessarily straightforward. A pattern
match like "type(`x`) is `y`" involves peering into various data
structures with about 11 or 12 conditions and-ed together. So in
principle, one could turn this into an or-ed collection of the
inverse conditions
But in PEAK-Rules' logic system, "ands" have a required execution
order, and "ors" do *not*. To simulate ordered logic, you have to turn this:
not (a and b)
into this:
(not a) or (a and not b)
And this:
not (a and b and c)
into:
(not a) or (a and not b) or (a and b and not c)
And so on, ad nauseam. This is because PEAK-Rules effectively tests
or-ed conditions in "parallel"; using an index tree something like
this, where the terminal nodes indicate whether "not (a and b and c)" is true:
a?
y: b?
y: c?
y: False
n: True
n: True
n: True
If you're wondering why it has to be in this order, consider what
happens if "a" is a type test of an object, and "b" is a test on some
attribute of the object; we must ensure that "b" is ONLY checked if
"a" is true. That's why this condition is so important.
Ironically, however, neither PEAK-Rules nor its predecessor
RuleDispatch actually implement this logic correctly right now;
they're treating "not (a and b)" as identical to "not a or not b",
despite the difference. So, that seems to actually be an argument in
favor of implementing a logical negation GF that handles this automatically.
This would also be the simplest way to handle match predicate
expansion, as I could get rid of the "truth mode" kluge currently
used by predicates.CriteriaBuilder. Creators of meta-functions that
return logical expressions would also not need to concern themselves
with the "truth mode" of the current expression.
One piece I'm not yet clear on, though, is how to negate OR-ed
conditions, since they are currently not treated sequentially. If
you write "a or b" in a rule, the conditions are tested
independently; i.e. "b" can be tested even if "a" is true, unlike
regular Python where condition "a" must fail for "b" to be evaluated.
This suggests to me that PEAK-Rules needs a way to represent an
ordered "or" sequence; what in some languages is called an "or-else",
so that if you have a sequence like:
OrElse([a, b, c])
Then its disjuncts() would be:
a, intersect(negate(a), b), intersect(intersect(negate(a), negate(b)), c)
And its negate() would be:
intersect(intersect(negate(a), negate(b)), negate(c))
Meanwhile, the negate() of any sequential conjunction (like a
Signature) would result in the OrElse of the negated parts of that
signature. CriteriaBuilder would then use OrElse in place of a
general-purpose disjunction to represent Python "or" lists.
I'm a little bit concerned since this is not a trivial semantic
change; on the other hand, it actually improves the determinism of
the system. Currently, it is undefined and perhaps even
unpredictable/nondeterministic which branch of an "or" condition will
be tested first, as it may depend indirectly on the contents of the
entire rule system for that generic function.
In contrast, the new approach would mean that rules always execute
just like Python, in strict-left-to-right order. (The only exception
is that tests operating directly on parameters, like value
comparisons or type tests on the arguments, can still be evaluated in
any order.)
Okay, so it looks like the implementation plan will work something like this:
* pull back predicates.meta_function to operate only on
criteria-level parsing, not expression level
* Define a new OrElse() type and a negate() GF in the criteria
module, with doc and tests. negate() needs to work on a minimum of
Test, Signature, OrElse, Class, istype, Classes, IsObject, Range,
Value, and NotObjects.
* Change CriteriaBuilder to use OrElse() and negate() in place of its
current use of Disjunction and a "truth mode" flag
New problem(s): it's easy to definte a negate() for singular criteria
with truth flags (like istype, Class, IsObject, and Value), but
Classes, Range, and NotObjects are all *unordered* conjunctions. How
do you negate those, except as unordered dis-junctions?
Suppose we are doing something like:
not (a is not b and a is not c)
Okay, bad example; that means that 'a is b is c', which is either
always false or else it's redundant. But the point is that the inner
condition:
a is not b and a is not c
currently compiles down to:
Test(Identity(Local('a')), NotObjects(IsObject(..., False),
IsObject(..., False))
This compaction is done because a Signature should only contain a
single Test per target expression+criterion type ("identity of a" in
this case), and the testing is done simultaneously by index lookup,
so order is irrelevant at this level. But if these were say, range tests:
not (1 <= a <= 20)
Then we would need to get this to mean the same as "a<1 or
a>20". Conversely, if we have:
not (a<1 or a>20)
Can we negate this and end up with a correct signature? Yes, because
when we convert the interior to "a>=1 and a<=20", we have two
operations on Comparison(a) that can be intersected to form a range
again, and their orderedness is irrelevant at that point.
So, it appears safe to convert conjunctive criteria to unordered
disjunctions and vice-versa. The only part that's still not clear is
what happens when the conversion to an unordered disjunction affects
the top-level structure being created.
More precisely, what is the OrElse (ordered disjunction) of an
(unordered) Disjunction? It seems that unlike Disjunction, OrElse
should not unpack the disjuncts of its children (at creation time),
thereby preserving their non-orderedness. It could instead unpack
child disjunctions when disjuncts() is called on itself.
Whew. I *think* that covers everything needed to implement negation
correctly, but I'll probably need to review this again when it gets
right down to it. It's also likely that this refactoring will
introduce bugs (like the istype/index refactoring did) because it
will affect behaviors and assumptions that are probably not fully
covered by the test suite at the moment. But I don't really see any
way around that; I'll be adding tests for everything I can think of,
but it's an even bigger combinatorial explosion than the istype changes were.
More information about the PEAK
mailing list