[TransWarp] Thoughts on the design of WarpCORE 2's kernel tables
Phillip J. Eby
pje at telecommunity.com
Fri Jun 28 09:38:32 EDT 2002
Changes from WarpCORE 1:
* Standardized names across all DB backends
* GUID support for "unbreakable" cross-database references
* Get rid of context, counter, and class tables, using objects
to represent them instead, thus allowing extensibility of
information associated with them, while making metaclass
support possible. Coupled with GUID support, this makes it
impossible for apps written separately to have class, context,
or counter naming conflicts.
* Require unique app_instance/class combination, to prevent
The wc_objects table:

  object_id      integer    primary key
  object_guid    varchar()  unique key
  instance_key   varchar()  class-specific key (defaults to GUID)
  class_obj      integer    reference to wc_objects:object_id
  counter        integer    updated on every modification to the object
  description    varchar()  human-readable object description

  app_instance + class_obj must be a unique key
  all fields are NOT NULL!
Instances that should be loaded at DB creation, using fixed,
negative object IDs and standardized GUIDs:

  "class" object, of class "class"
  "counter" class, of class "class"
  "object" counter, of class "counter"
The name/context table:

  named_object    integer    reference to wc_objects:object_id
  object_name     varchar()
  context_object  integer    reference to wc_objects:object_id

  context_object + object_name must be unique (primary key)
Notice that since any object can be a context, this allows any object to be
treated as a namespace, and any other object to be named in it. So a
"class" object's namespace could refer to objects to be used as attribute
descriptors in the class! Namespaces could also be used for URL traversals.
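Here's a sketch of that namespace idea in code; the table name wc_names and the helper functions are inventions for illustration, only the column names come from the layout above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE wc_names (
        named_object   INTEGER NOT NULL,
        object_name    VARCHAR NOT NULL,
        context_object INTEGER NOT NULL,
        PRIMARY KEY (context_object, object_name)
    )
""")

def bind(ctx, name, obj):
    # Name 'obj' as 'name' within the namespace of object 'ctx'
    conn.execute("INSERT INTO wc_names VALUES (?,?,?)", (obj, name, ctx))

def lookup(ctx, name):
    row = conn.execute(
        "SELECT named_object FROM wc_names"
        " WHERE context_object=? AND object_name=?", (ctx, name)).fetchone()
    return row and row[0]

# A "class" object (id 10) naming an attribute descriptor object (id 42):
bind(10, "title", 42)
```

Since context_object can be any object_id, the same two queries serve class attribute namespaces, URL traversal, or anything else that needs named containment.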
Sometimes I find it really strange that *removing* concepts from a model
can make it more powerful. In this case, removing three tables from the
previous WarpCORE pattern makes it considerably more expressive.
Using objects to represent classes means that we can define any number of
additional interfaces on classes as needed for the application
metamodel. For example, we could define metadata tables describing which
tables are used for what classes. And all of the metadata could be
manipulated using the same kinds of object-based routines as the data.
Potentially, this metadata could include SQL for manipulating the data,
which could be used by stored procedures. In practice, it might not be a good
idea to do this, although it could potentially make WarpCORE databases more
usable from other languages. One big downside is that it means writing
more SQL and less Python, which seems like the opposite of where we want to be!
It's probably more useful for our purposes to let the metadata drive
generation of Python classes and state management objects, instead. And
these objects in turn could be used to generate DDL for the non-metadata
portions of the database schema...
One interesting thing about this WarpCORE 2 approach is that it actually
lends itself to using ZODB's Persistent protocols for state
management. Every object has a _p_oid (its object_id) and a _p_jar (the
WarpCORE database driver). References are always in the form of _p_oids,
so "ghosting" can easily be done. Object's classes are even in the form of
_p_oids, so the same persistence mechanisms and caching can be used to
retrieve the class... The DB object would need some kind of factory hook
for the classes, so that domain logic could be wrapped around them.
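A minimal sketch of the ghosting idea, without the real ZODB Persistent machinery (the class names and the setstate method here are stand-ins, not ZODB's API):

```python
class Ghostable:
    """An object that starts as a 'ghost': just a _p_oid and a _p_jar."""
    def __init__(self, oid, jar):
        self._p_oid = oid
        self._p_jar = jar
        self._p_loaded = False

    def __getattr__(self, name):
        # Only called for attributes not yet in __dict__; load state lazily.
        if not self._p_loaded and not name.startswith("_p_"):
            self._p_loaded = True
            self._p_jar.setstate(self)   # jar fills in the real attributes
            return getattr(self, name)
        raise AttributeError(name)

class FakeJar:
    """Stands in for the WarpCORE DB driver; rows maps oid -> state dict."""
    def __init__(self, rows):
        self.rows = rows
    def setstate(self, obj):
        obj.__dict__.update(self.rows[obj._p_oid])

jar = FakeJar({1: {"description": 'the "class" class'}})
obj = Ghostable(1, jar)    # a ghost: no DB hit until an attribute is read
```

The factory hook mentioned above would slot in where Ghostable is instantiated, choosing a domain class based on the object's class_obj.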
The "counter" field of objects also enables optimistic conflict checking
ala ZODB, or quick cache consistency checking otherwise. The counter
should be bumped when any contents of the object are changed, so that
complex objects can be checked for consistency without lots of individual
checks. This could be used for things like dropdown list objects that keep
a list of which other objects are to be included on the list. If items are
added or removed from the list, the list object counter should
change. (Note that if the items themselves change in some way, this
doesn't affect the list object, which is still a valid *list* of the
items. The items only need to be validated when used.)
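The counter discipline can be sketched with a toy in-memory store (everything here is invented for illustration):

```python
counters = {}          # object_id -> current counter value in the "database"

def bump(oid):
    # Called on ANY change to the object's contents, including, for a
    # list object, adding or removing items from its list.
    counters[oid] = counters.get(oid, 0) + 1

def is_current(oid, cached_counter):
    # One cheap check validates a complex object's whole cached state.
    return counters.get(oid, 0) == cached_counter

bump(7)                 # dropdown-list object (id 7) is created
cached = counters[7]    # a client caches the list and its counter
bump(7)                 # another transaction adds an item to the list
```

Per the note above, bump(7) fires when the list's membership changes, but an edit to one of the member objects themselves would bump only that member's counter, not the list's.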
Presumably we could designate some objects as "data" and others as
"metadata", based on their classes. A specially desginated "metadata"
object's counter could be used to flag the fact that any "metadata" object
had been changed in the database. Read-write transactions could issue a
read-and-hold-lock against this counter to prevent concurrent modification
of metadata by any other transaction, thus ensuring consistent metadata
within any one transaction. (MVCC makes this unnecessary in PostgreSQL,
but other backends will need it.) Both read-write and read-only
transactions would check whether the current metadata counter value was
different from their cache value, and if so, require version checking upon
access of all cached metadata objects for the remainder of the transaction.
Data objects, of course, would be version-checked on every transaction.
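The metadata-counter check might look roughly like this (the class and the "__metadata__" key are assumptions; the lock-holding part is elided since it's backend-specific):

```python
class TxnCache:
    """Per-connection cache that tracks the global metadata counter."""
    def __init__(self, db):
        self.db = db
        self.meta_counter = db["__metadata__"]
        self.must_recheck = False

    def begin(self):
        # At transaction start, compare the DB's metadata counter with
        # the value we cached; if it moved, every cached metadata object
        # must be version-checked on access for the rest of the txn.
        current = self.db["__metadata__"]
        if current != self.meta_counter:
            self.must_recheck = True
            self.meta_counter = current

db = {"__metadata__": 1}
cache = TxnCache(db)
cache.begin()                  # metadata unchanged: cache stays trusted
db["__metadata__"] += 1        # another transaction alters some metadata
cache.begin()                  # now cached metadata must be rechecked
```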
In an application based on WC2, specialists would have little to do but
hand off queries to a WC driver object. Only specialists in charge of
multi-database objects (e.g. an LDAP<->WC2 bridge) would need to do much
else. Inverse relationships (ones where the "pointer" is on the opposite
side) could be implemented as persistent collections that looked up the
related objects upon use.
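An inverse relationship as a lazy collection could be sketched like so (the class and the query callable are made-up names):

```python
class InverseCollection:
    """A collection whose members are found by querying the 'other side'
    of a relationship -- the side that actually holds the pointer."""
    def __init__(self, query, parent_oid):
        self.query = query          # callable(parent_oid) -> list of oids
        self.parent_oid = parent_oid
        self._items = None

    def __iter__(self):
        if self._items is None:     # look up related objects only on use
            self._items = self.query(self.parent_oid)
        return iter(self._items)

# Toy backing data: parent oid -> child oids that point back at it
children = {1: [10, 11], 2: []}
kids = InverseCollection(lambda oid: children[oid], 1)
```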
Implementing the full WC2 concept might take a while, but the basics are
pretty straightforward. In some ways, it's too bad Ty and I still have to
deal with "legacy" CORE and WarpCORE 1 databases, as they don't work quite
so well for a ZODB-style persistence mapping.
On the other hand, I may be able to "back out" this design to a generic
form that could work with *any* database, not just CORE or WarpCORE ones,
using a class factory approach... Hmmm. That just might work... Have
OIDs contain information to identify what primary key fields they
represent, and have a factory that takes an OID and returns a class,
perhaps along with callables for state management... In a legacy database,
you could map a table to a class (in which case the OID contents would map
directly to a class), or use its contents (in which case, a select is done
to retrieve the info and determine the class).
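That generic form might be sketched like this; all the table and class names are invented, and the "use its contents" branch is reduced to a comment since it needs a real database:

```python
class Invoice: ...
class Customer: ...

# Legacy mapping: each table maps directly to one class.
CLASS_BY_TABLE = {"invoices": Invoice, "customers": Customer}

def make_oid(table, *pk):
    # The OID itself records which table and primary-key values it
    # represents, so it works against any database, not just WarpCORE.
    return (table,) + pk

def class_for_oid(oid):
    # For a table that maps directly to a class, the OID's contents are
    # enough; for a polymorphic table, a SELECT would inspect the row
    # to determine the class instead.
    return CLASS_BY_TABLE[oid[0]]

oid = make_oid("invoices", 123)
```

A fuller version would also return the state-management callables alongside the class, as described above.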
Presto! We can now use the Persistent machinery for managing object state
(ghost, active, changed...). And when an object is saved, any objects it
refers to that have a _p_oid of None can have an ID generated, and get
written to the DB. Voila... instant referential dependency management,
with no special coding. The only downside is not supporting queries within
a transaction that refer to changed values... but Z3 persistence registers
changes with the _p_jar, so the jar (DB driver) could possibly flush
changes whenever a query was performed.
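The flush-before-query idea in miniature (these names are invented, not the Z3 persistence API):

```python
class Jar:
    """Toy DB driver: changed objects register themselves, and pending
    changes are flushed before any query runs, so queries see them."""
    def __init__(self):
        self.store = {}             # oid -> committed state
        self.dirty = []             # (oid, state) pairs awaiting flush

    def register(self, oid, state):
        self.dirty.append((oid, state))

    def query(self, predicate):
        for oid, state in self.dirty:   # flush pending writes first
            self.store[oid] = state
        self.dirty = []
        return [oid for oid, state in self.store.items() if predicate(state)]

jar = Jar()
jar.register(1, {"paid": True})
jar.register(2, {"paid": False})
paid = jar.query(lambda s: s["paid"])   # sees the in-transaction changes
```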
Hm. Much to think about. I like the idea of being able to leverage the
Zope 3 persistence machinery in order to avoid reinventing wheels. But
right now, I've got to stop writing and head to the office!