[PEAK] ORM and "fact base" objects for the Trellis
Phillip J. Eby
pje at telecommunity.com
Wed Jul 18 01:15:47 EDT 2007
Way back in 2004, I was sketching a system for implementing
constraint satisfaction, ORM, and event-driven programming, using
generic functions and something I dubbed a "fact base":
http://dirtsimple.org/2004/12/fact-types-fact-sets-and-change-events.html
As it happens, this idea seems a lot more practical now, with the
Trellis to handle event propagation and fact sets that are defined in
terms of each other.
In fact, it's downright straightforward with the hub-and-spokes
technology I described in my last post. Each fact type (table or
query) simply delegates all mutation operations to the fact base as a
whole, and then updates itself in response to the events it receives.
Its event inputs are spokes, tied to an all-purpose update rule on
the fact base. So, any set that the master fact base object can
generate keys for will receive its events straight from the
source. However, a fact set can *also* have rules for its events, in
order to derive them from other sets. Either way works, or even both
at once. (If it receives data directly from the master fact base,
its derivation rules won't run on that update cycle -- just like any
other mutually recursive rule overridden by direct update.)
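To make the delegation pattern concrete, here's a minimal sketch in plain Python. The class and method names (FactBase, FactSet, register, receive) are purely illustrative stand-ins, not the actual Trellis or PEAK API, and the event plumbing is simplified to direct method calls rather than real cells and spokes:

```python
class FactBase:
    """Master object: all mutation operations funnel through here."""
    def __init__(self):
        self._sets = {}          # key -> FactSet (the "spokes")

    def register(self, key, factset):
        self._sets[key] = factset

    def add(self, record):
        # Derive a key from the record and route the event to the
        # matching set, if one is currently live.  (A trivial
        # first-field key stands in for real key extraction.)
        key = record[0]
        target = self._sets.get(key)
        if target is not None:
            target.receive('add', record)


class FactSet:
    """A fact type (table or query): delegates all mutations upward."""
    def __init__(self, base, key):
        self.base = base
        self.contents = set()
        base.register(key, self)

    def add(self, record):
        self.base.add(record)    # delegate to the fact base as a whole

    def receive(self, op, record):
        # Update self in response to events received from the base.
        if op == 'add':
            self.contents.add(record)


base = FactBase()
people = FactSet(base, 'person')
people.add(('person', 'Alice'))   # round-trips through the fact base
```

In the real thing, receive() would be a spoke cell tied to the fact base's update rule, so derivation rules and direct updates could coexist in the same update cycle.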
The article I linked to fills in a lot of the remaining pieces needed
to define a kind of "all-purpose database" as a giant, non-enumerable
set of records. It just needs a way to generically parse records
into keys (identifying target sets) and values (indicating the values
to be added to or removed from the target sets), and a way to create
an appropriate implementation set for a given key, if one isn't
already available in cache.
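The "generic parsing" step might look something like this; the record layout here (a plain tuple whose first field names the fact type) is just an assumption for the sketch, not the EIM record format:

```python
def parse_record(record):
    """Split a record into (key, values).

    The key identifies the target set; the values are what gets
    added to or removed from that set.
    """
    fact_type, *values = record
    return fact_type, tuple(values)


key, values = parse_record(('email', 'pje', 'pje at telecommunity.com'))
```

A real implementation would presumably dispatch on the record's type to extract multiple candidate keys, since one record can feed several sets and indexes.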
The fact base cache is just a weakref dictionary pointing to sets,
where the sets' add/delete event cells are spokes off the fact base's
update rule. Indexes, whether they are simple key lookups or sorted
lists, can also be implemented this way, even in memory. Really, any
data structure that can be maintained by noting the creation or
deletion of rows is practical. And the fact base itself doesn't
really need to know or care how those data structures work; all that
matters is that the sets be cached by key and that the events be
linked as spokes.
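The cache itself is easy to sketch with the standard library: a WeakValueDictionary lets implementation sets be garbage-collected once nothing else references them, and a factory fills cache misses. The set type here is a placeholder, not a real Trellis set class:

```python
import weakref


class CachedSet:
    """Stand-in for an implementation set (table, index, query...)."""
    def __init__(self):
        self.contents = set()


class FactBaseCache:
    def __init__(self):
        # Weak values: a set disappears from the cache when no live
        # code holds a reference to it any more.
        self._cache = weakref.WeakValueDictionary()

    def get_set(self, key, factory=CachedSet):
        # Return the cached implementation set for `key`, creating
        # one via `factory` if it isn't already available.
        s = self._cache.get(key)
        if s is None:
            s = factory()
            self._cache[key] = s
        return s


cache = FactBaseCache()
s1 = cache.get_set('person')
s2 = cache.get_set('person')   # same live set is reused
```

The fact base never needs to know what kind of set a key maps to; the factory argument is where "create an appropriate implementation set for a given key" would plug in.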
A fairly simple in-memory database could be implemented using a
handful of set types, along with a small framework for defining
simple tuple-structured record types, similar to the EIM (External
Information Model) framework I designed for Chandler (and which was
foreshadowed a few years ago in the thread of blog posts I linked above!).
With some extensions to the model, one could create tuples that
represent queries, by putting "wildcard" or "variable" objects in the
fields whose value is left open for the query to determine. Such
tuples would make fine "keys" for the fact base, such that adding and
deleting them will update any currently-live query sets based on those keys.
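The wildcard idea can be sketched with a sentinel object marking the open fields; again, the names (Wildcard, ANY, matches) are illustrative, not the actual framework:

```python
class Wildcard:
    """Sentinel placed in a field whose value the query leaves open."""
    def __repr__(self):
        return '?'


ANY = Wildcard()


def matches(query, record):
    """True if `record` fits `query`, treating ANY fields as open."""
    return len(query) == len(record) and all(
        q is ANY or q == r for q, r in zip(query, record)
    )


# A query tuple for every 'person' record, whatever the name field holds:
query = ('person', ANY)
```

Such a query tuple is itself hashable (Wildcard instances hash by identity), so it can serve directly as a cache key for a live query set, and incoming records can be tested against the live query keys to decide which sets to update.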
The two pieces that are still a little bit vague are the mechanism
for figuring out what keys to potentially extract (which varies based
on the sets currently cached by the fact base), and the mechanism for
creating new implementation sets.
In my 2004 blog posts, I outlined an idea for using generic functions
to do this, but it was based on RuleDispatch. I'd rather use
PEAK-Rules now, but I may need to make some more progress on it
first. I originally intended to have PEAK-Rules finished this month,
with the Trellis coming out later in the year, but now things seem to
have reversed, with lots of progress on the Trellis and relatively
little on PEAK-Rules so far this month.
So... current plan looks like:
* Finish first cut doc and test re-org for the Trellis API
* Implement Hub, Spoke, and nail down some set manipulation patterns
and perhaps set base classes
* Start hammering out a fact base implementation, and a prototype
record schema framework. The latter will be something of a
throwaway, intended mainly to get the kinks worked out of the fact
base framework, and less to be the basis of a production-quality O-R
mapping system.
In other words, it'll be a proof of concept for how to handle set/key
management, indexes, and queries, without the use of an actual
backing store, and probably without general-purpose joins or a host
of other relational operators.
In the long term, however, it'll probably grow all of those things,
including the generator expression query syntax. However, the
precise definition of "long term" depends heavily on whether one of
my clients likes the results of their prototyping. :)