[TransWarp] Thoughts on the design of WarpCORE 2's kernel tables
Phillip J. Eby
pje at telecommunity.com
Fri Jun 28 09:38:32 EDT 2002
Changes from WarpCORE 1:
* Standardized names across all DB backends
* GUID support for "unbreakable" cross-database references
* Get rid of context, counter, and class tables, using objects
to represent them instead, thus allowing extensibility of
information associated with them, while making metaclass
support possible. Coupled with GUID support, this makes it
impossible for apps written separately to have class, context,
or counter naming conflicts.
* Require unique app_instance/class combination, to prevent
The wc_objects table:

  object_id      integer    primary key
  object_guid    varchar()  unique key
  instance_key   varchar()  class-specific key (defaults to GUID)
  class_obj      integer    reference to wc_objects:object_id
  counter        integer    updated on every modification to the object
  description    varchar()  human-readable object description

  app_instance + class_obj must be a unique key
  all fields are NOT NULL!
Instances that should be loaded at DB creation, using fixed,
negative object IDs and standardized GUIDs:

  "class" object, of class "class"
  "counter" class, of class "class"
  "object" counter, of class "counter"
The name/context table:

  named_object    integer    reference to wc_objects:object_id
  object_name     varchar()
  context_object  integer    reference to wc_objects:object_id

  context_object + object_name must be unique (primary key)
Notice that since any object can be a context, this allows any object to be
treated as a namespace, and any other object to be named in it. So a
"class" object's namespace could refer to objects to be used as attribute
descriptors in the class! Namespaces could also be used for URL traversals.
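Here's a sketch of that namespace idea in code; the table name wc_names and the helper functions are inventions for illustration, only the column names come from the layout above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE wc_names (
        named_object   INTEGER NOT NULL,
        object_name    VARCHAR NOT NULL,
        context_object INTEGER NOT NULL,
        PRIMARY KEY (context_object, object_name)
    )
""")

def bind(ctx, name, obj):
    # Name 'obj' as 'name' within the namespace of object 'ctx'
    conn.execute("INSERT INTO wc_names VALUES (?,?,?)", (obj, name, ctx))

def lookup(ctx, name):
    row = conn.execute(
        "SELECT named_object FROM wc_names"
        " WHERE context_object=? AND object_name=?", (ctx, name)).fetchone()
    return row and row[0]

# A "class" object (id 10) naming an attribute descriptor object (id 42):
bind(10, "title", 42)
```

Since context_object can be any object_id, the same two queries serve class attribute namespaces, URL traversal, or anything else that needs named containment.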
Sometimes I find it really strange that *removing* concepts from a model
can make it more powerful. In this case, removing three tables from the
previous WarpCORE pattern makes it considerably more expressive.
Using objects to represent classes means that we can define any number of
additional interfaces on classes as needed for the application
metamodel. For example, we could define metadata tables describing which
tables are used for what classes. And all of the metadata could be
manipulated using the same kinds of object-based routines as the data.
Potentially, this metadata could include SQL for manipulating the data,
which could be used by stored procedures. In practice, it might not be a good
idea to do this, although it could potentially make WarpCORE databases more
usable from other languages. One big downside is that it means writing
more SQL and less Python, which seems like the opposite of where we want to be!
It's probably more useful for our purposes to let the metadata drive
generation of Python classes and state management objects, instead. And
these objects in turn could be used to generate DDL for the non-metadata
portions of the database schema...
One interesting thing about this WarpCORE 2 approach is that it actually
lends itself to using ZODB's Persistent protocols for state
management. Every object has a _p_oid (its object_id) and a _p_jar (the
WarpCORE database driver). References are always in the form of _p_oids,
so "ghosting" can easily be done. Object's classes are even in the form of
_p_oids, so the same persistence mechanisms and caching can be used to
retrieve the class... The DB object would need some kind of factory hook
for the classes, so that domain logic could be wrapped around them.
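A minimal sketch of the ghosting idea, without the real ZODB Persistent machinery (the class names and the setstate method here are stand-ins, not ZODB's API):

```python
class Ghostable:
    """An object that starts as a 'ghost': just a _p_oid and a _p_jar."""
    def __init__(self, oid, jar):
        self._p_oid = oid
        self._p_jar = jar
        self._p_loaded = False

    def __getattr__(self, name):
        # Only called for attributes not yet in __dict__; load state lazily.
        if not self._p_loaded and not name.startswith("_p_"):
            self._p_loaded = True
            self._p_jar.setstate(self)   # jar fills in the real attributes
            return getattr(self, name)
        raise AttributeError(name)

class FakeJar:
    """Stands in for the WarpCORE DB driver; rows maps oid -> state dict."""
    def __init__(self, rows):
        self.rows = rows
    def setstate(self, obj):
        obj.__dict__.update(self.rows[obj._p_oid])

jar = FakeJar({1: {"description": 'the "class" class'}})
obj = Ghostable(1, jar)    # a ghost: no DB hit until an attribute is read
```

The factory hook mentioned above would slot in where Ghostable is instantiated, choosing a domain class based on the object's class_obj.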
The "counter" field of objects also enables optimistic conflict checking
ala ZODB, or quick cache consistency checking otherwise. The counter
should be bumped when any contents of the object are changed, so that
complex objects can be checked for consistency without lots of individual
checks. This could be used for things like dropdown list objects that keep
a list of which other objects are to be included on the list. If items are
added or removed from the list, the list object counter should
change. (Note that if the items themselves change in some way, this
doesn't affect the list object, which is still a valid *list* of the
items. The items only need to be validated when used.)
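The counter discipline can be sketched with a toy in-memory store (everything here is invented for illustration):

```python
counters = {}          # object_id -> current counter value in the "database"

def bump(oid):
    # Called on ANY change to the object's contents, including, for a
    # list object, adding or removing items from its list.
    counters[oid] = counters.get(oid, 0) + 1

def is_current(oid, cached_counter):
    # One cheap check validates a complex object's whole cached state.
    return counters.get(oid, 0) == cached_counter

bump(7)                 # dropdown-list object (id 7) is created
cached = counters[7]    # a client caches the list and its counter
bump(7)                 # another transaction adds an item to the list
```

Per the note above, bump(7) fires when the list's membership changes, but an edit to one of the member objects themselves would bump only that member's counter, not the list's.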
Presumably we could designate some objects as "data" and others as
"metadata", based on their classes. A specially desginated "metadata"
object's counter could be used to flag the fact that any "metadata" object
had been changed in the database. Read-write transactions could issue a
read-and-hold-lock against this counter to prevent concurrent modification
of metadata by any other transaction, thus ensuring consistent metadata
within any one transaction. (MVCC makes this unnecessary in PostgreSQL,
but other backends will need it.) Both read-write and read-only
transactions would check whether the current metadata counter value was
different from their cache value, and if so, require version checking upon
access of all cached metadata objects for the remainder of the transaction.
Data objects, of course, would be version-checked on every transaction.
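The metadata-counter check might look roughly like this (the class and the "__metadata__" key are assumptions; the lock-holding part is elided since it's backend-specific):

```python
class TxnCache:
    """Per-connection cache that tracks the global metadata counter."""
    def __init__(self, db):
        self.db = db
        self.meta_counter = db["__metadata__"]
        self.must_recheck = False

    def begin(self):
        # At transaction start, compare the DB's metadata counter with
        # the value we cached; if it moved, every cached metadata object
        # must be version-checked on access for the rest of the txn.
        current = self.db["__metadata__"]
        if current != self.meta_counter:
            self.must_recheck = True
            self.meta_counter = current

db = {"__metadata__": 1}
cache = TxnCache(db)
cache.begin()                  # metadata unchanged: cache stays trusted
db["__metadata__"] += 1        # another transaction alters some metadata
cache.begin()                  # now cached metadata must be rechecked
```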
In an application based on WC2, specialists would have little to do but
hand off queries to a WC driver object. Only specialists in charge of
multi-database objects (e.g. an LDAP<->WC2 bridge) would need to do much
else. Inverse relationships (ones where the "pointer" is on the opposite
side) could be implemented as persistent collections that looked up the
related objects upon use.
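An inverse relationship as a lazy collection could be sketched like so (the class and the query callable are made-up names):

```python
class InverseCollection:
    """A collection whose members are found by querying the 'other side'
    of a relationship -- the side that actually holds the pointer."""
    def __init__(self, query, parent_oid):
        self.query = query          # callable(parent_oid) -> list of oids
        self.parent_oid = parent_oid
        self._items = None

    def __iter__(self):
        if self._items is None:     # look up related objects only on use
            self._items = self.query(self.parent_oid)
        return iter(self._items)

# Toy backing data: parent oid -> child oids that point back at it
children = {1: [10, 11], 2: []}
kids = InverseCollection(lambda oid: children[oid], 1)
```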
Implementing the full WC2 concept might take a while, but the basics are
pretty straightforward. In some ways, it's too bad Ty and I still have to
deal with "legacy" CORE and WarpCORE 1 databases, as they don't work quite
so well for a ZODB-style persistence mapping.
On the other hand, I may be able to "back out" this design to a generic
form that could work with *any* database, not just CORE or WarpCORE ones,
using a class factory approach... Hmmm. That just might work... Have
OIDs contain information to identify what primary key fields they
represent, and have a factory that takes an OID and returns a class,
perhaps along with callables for state management... In a legacy database,
you could map a table to a class (in which case the OID contents would map
directly to a class), or use its contents (in which case, a select is done
to retrieve the info and determine the class).
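That generic form might be sketched like this; all the table and class names are invented, and the "use its contents" branch is reduced to a comment since it needs a real database:

```python
class Invoice: ...
class Customer: ...

# Legacy mapping: each table maps directly to one class.
CLASS_BY_TABLE = {"invoices": Invoice, "customers": Customer}

def make_oid(table, *pk):
    # The OID itself records which table and primary-key values it
    # represents, so it works against any database, not just WarpCORE.
    return (table,) + pk

def class_for_oid(oid):
    # For a table that maps directly to a class, the OID's contents are
    # enough; for a polymorphic table, a SELECT would inspect the row
    # to determine the class instead.
    return CLASS_BY_TABLE[oid[0]]

oid = make_oid("invoices", 123)
```

A fuller version would also return the state-management callables alongside the class, as described above.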
Presto! We can now use the Persistent machinery for managing object state
(ghost, active, changed...). And when an object is saved, any objects it
refers to that have a _p_oid of None can have an ID generated, and get
written to the DB. Voila... instant referential dependency management,
with no special coding. The only downside is not supporting queries within
a transaction that refer to changed values... but Z3 persistence registers
changes with the _p_jar, so the jar (DB driver) could possibly flush
changes whenever a query was performed.
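The flush-before-query idea in miniature (these names are invented, not the Z3 persistence API):

```python
class Jar:
    """Toy DB driver: changed objects register themselves, and pending
    changes are flushed before any query runs, so queries see them."""
    def __init__(self):
        self.store = {}             # oid -> committed state
        self.dirty = []             # (oid, state) pairs awaiting flush

    def register(self, oid, state):
        self.dirty.append((oid, state))

    def query(self, predicate):
        for oid, state in self.dirty:   # flush pending writes first
            self.store[oid] = state
        self.dirty = []
        return [oid for oid, state in self.store.items() if predicate(state)]

jar = Jar()
jar.register(1, {"paid": True})
jar.register(2, {"paid": False})
paid = jar.query(lambda s: s["paid"])   # sees the in-transaction changes
```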
Hm. Much to think about. I like the idea of being able to leverage the
Zope 3 persistence machinery in order to avoid reinventing wheels. But
right now, I've got to stop writing and head to the office!