[TransWarp] Re: [ZODB-Dev] SQL/ZODB integration (was Re: ZODB and new style classes?)

Mon Jul 1 13:53:29 EDT 2002

At 01:24 PM 7/1/02 -0400, Shane Hathaway wrote:
>Also FWIW, I've been working on something very similar for the O'Reilly 
>open source conference.  The goal was transparent persistence to 
>anything, SQL in particular, reusing as much of ZODB as possible.  I 
>started by leveraging only persistence and transactions, but in the 
>latest iteration, I was able to reuse most of the ZODB Connection (jar) 
>and DB classes as well.
>
>I used ZODB 3 / Zope 2 to avoid depending on a moving target, but it 
>should be easy to move to ZODB 4 when it's ripe.  I have a working 
>product that lets you "mount" a SQL database, file system, or some 
>combination thereof.  And it can theoretically work over ZEO.  I have a 
>high level screenshot demo in a slide presentation.

I would be very interested in seeing it, but won't be at OSCON.  Any chance
of you posting a copy somewhere?

>Don't slow down on my account, though.  You have ideas I can use, and 
>maybe I have a few of my own.  I was impressed by the notion that you 
>may be building the Python equivalent of J2EE.  And you've written a lot 
>more words (English words ;-) ) than I have.

Well, it won't be a complete J2EE replacement.  The application server and
JSP-equivalent parts of PEAK will be Zope 3 and PageTemplates!  But we have
a partial JNDI replacement basically up and running, along with module
inheritance, a JavaBeans-like structural framework for domain level
objects, and an awful lot of component binding tools (that are used to
"wire" components together in explicit and semi-implicit fashions).  The
storage and deployment packages are the big unwritten zones at present.
My "day job", however, is entering a project phase where I need all this
stuff to work Real Soon Now, so I'll be putting in a lot more day cycles on
PEAK.  I expect Ty will as well, and maybe even our new developer who's
starting this week, at least once she learns Python.  :)

But back on topic, what ideas did you find useful, and what ideas do you
think you would add to what I've explained thus far on the TW list?

>The latest thing I'm trying to work out is conflict detection.  Make 
>sure you have some kind of answer for that.  If the Python app fetches 
>some objects from the database, then an external application writes to 
>the database, then the Python app tries to persist some changes that 
>would conflict with the external application's changes, a ConflictError 
>must be raised.  My current implementation ignorantly stomps on the 
>other app's changes. :-)

By default, so does mine.  But the framework specifies a place to check for
conflicts and raise a ConflictError: the save() method of a jar.  If the DB
schema includes an update timestamp, this is straightforward.  If not, you
have to keep a record of what you loaded.  Either way, the check is
reasonably simple to implement, and I might even automate the "compare to
what you loaded" approach once I get to developing SQL-specific data
managers.  (Note that the "compare to what you loaded" approach doesn't
work for long-running transactions, since what you most recently loaded is
not necessarily what was loaded when the user started editing.  That kind
of checking has to be done at the application level.)
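
A minimal sketch of the update-timestamp check described above, not the
actual PEAK/TransWarp code.  Everything named here is an assumption made
for illustration: the TimestampJar class, the person table, the integer
"updated" column standing in for an update timestamp, and a local
ConflictError standing in for ZODB's.  The idea is that save() only
updates the row if its stamp still matches what was seen at load time.

```python
import sqlite3


class ConflictError(Exception):
    """Local stand-in for ZODB's ConflictError (keeps the sketch self-contained)."""


class TimestampJar:
    """Hypothetical jar: remembers each row's 'updated' stamp at load time."""

    def __init__(self, conn):
        self.conn = conn
        self._stamps = {}  # row id -> 'updated' value seen when loaded

    def load(self, row_id):
        name, updated = self.conn.execute(
            "SELECT name, updated FROM person WHERE id = ?", (row_id,)
        ).fetchone()
        self._stamps[row_id] = updated
        return name

    def save(self, row_id, name):
        seen = self._stamps[row_id]
        # The UPDATE matches only if nobody bumped the stamp since we loaded.
        cur = self.conn.execute(
            "UPDATE person SET name = ?, updated = updated + 1 "
            "WHERE id = ? AND updated = ?",
            (name, row_id, seen),
        )
        if cur.rowcount != 1:
            raise ConflictError("row %r changed since it was loaded" % (row_id,))
        self._stamps[row_id] = seen + 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, updated INTEGER)"
)
conn.execute("INSERT INTO person VALUES (1, 'alice', 0)")

jar = TimestampJar(conn)
jar.load(1)

# Simulate an external application writing the same row behind our back.
conn.execute("UPDATE person SET name = 'bob', updated = updated + 1 WHERE id = 1")

try:
    jar.save(1, "carol")
    conflict = False
except ConflictError:
    conflict = True
```

In this sketch the external write wins and the jar's save() raises instead
of stomping on it.  As noted above, this only compares against the most
recent load, so it does not cover long-running edits.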

Side note: ZODB transactions aren't necessarily 100% serializable in any
event; it's not enough to check that something you changed wasn't changed
by someone else.  Technically, if you *access* anything in the current
transaction that was read in during a previous transaction, and its saved
state changed in the meantime, the result is an inconsistent transaction if
you used that information to decide what to write!  Unfortunately, the
current ZODB architecture has no good mechanism for detecting this
condition, as there is no list of "objects accessed but not loaded in this
transaction", nor a way to generate one.

Anyway, that's why my framework is biased towards letting the underlying DB
handle transactions.  The correct thing, really, is to ensure that a
transaction is in effect at the DB level and that the reads are part of it.
Then the rows are read-locked and other apps are prevented from writing.
For this to work, "volatile" (i.e. non-content, non-metadata) records are
deactivated at transaction end; that is why we don't keep them cached
across transactions.

For the most part, I'm making the overall assumption that "explicit is
better than implicit" where managing the mappings with external databases
is concerned, and that it's important for the developer to be able to
fine-tune such matters as object lifetimes and conflict detection methods
for the specific situation, if needed.  STASCTAP, in other words.  (Simple
Things Are Simple, Complex Things Are Possible.)