[PEAK] Making Data Managers easier to use

Fri May 20 22:23:09 EDT 2005

After thinking some more about Erik Rose's questions and suggestions, I've 
decided to go ahead and implement parts of the planned peak.schema 
interface for the current peak.storage DM classes.  Specifically, here's 
what I've added and am now checking in to CVS:

``dm.get(oid,default=None)``
     Return the object if found, otherwise return the default.  Unlike 
``dm[oid]``, this returns a *non-lazy* object (i.e., it will never return a 
ghost).  It should not be used when setting up inter-object references -- 
keep using ``dm[oid]`` for that, so that you don't cause your entire 
database to load at once!  But for "client" code that accesses DMs, you 
will usually want to use this method instead.  Note that to be able to use 
this method, you *must* modify the DM's ``_load()`` method to raise 
``storage.InvalidKeyError`` when it cannot find the requested object ID.  I 
had to use a new exception to avoid unintentionally trapping errors other 
than the item being not found.  Also note that using ``get()`` *will* cause 
a database access if the requested object is not in cache, or does not 
exist (which implies that it won't be in cache).
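
To make the ``_load()`` requirement concrete, here is a minimal sketch using 
a toy dict-backed class.  Everything in it is a stand-in: ``ToyDM``, its 
``rows`` and ``cache`` attributes, and the local ``InvalidKeyError`` class 
are hypothetical names for illustration, not the peak.storage code.  The 
point is only that ``get()`` traps the one dedicated exception, so any other 
error raised inside ``_load()`` still propagates.

```python
class InvalidKeyError(Exception):
    """Local stand-in for storage.InvalidKeyError (assumed, not imported)."""


class ToyDM:
    """Toy dict-backed data manager; NOT the real peak.storage classes."""

    def __init__(self, rows):
        self.rows = rows    # pretend this dict is the underlying database
        self.cache = {}     # oid -> already-loaded state

    def _load(self, oid):
        if oid not in self.rows:
            # Report "not found" with the dedicated exception only.
            raise InvalidKeyError(oid)
        return self.rows[oid]

    def get(self, oid, default=None):
        if oid in self.cache:
            return self.cache[oid]
        try:
            state = self._load(oid)   # a database access on a cache miss
        except InvalidKeyError:
            return default            # other exceptions are not trapped
        self.cache[oid] = state
        return state


dm = ToyDM({1: {"title": "hello"}})
found = dm.get(1)
missing = dm.get(99, "n/a")
```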

``dm.__contains__(ob_or_oid)``
     Return true if the given object or oid is present in the DB.  If 
``ob_or_oid`` is an instance of a ``Persistent`` subclass, then the return 
value indicates whether the object "belongs" to this DM.  Otherwise, the 
argument is treated as an ID, and the return value indicates whether 
``dm.get(oid)`` would return a non-None value.  (Note: this method is 
implemented in terms of ``get()``, so using it may result in a database 
access.  You must also update your ``_load()`` method as you would for 
``get()``.)
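
The two cases can be sketched roughly as follows.  This is a toy 
illustration under stated assumptions: the ``Persistent`` class here is a 
local stand-in, and the ``owner`` attribute is a hypothetical way of 
recording which DM an object belongs to; the real classes track ownership 
differently.

```python
class Persistent:
    """Local stand-in for the real Persistent base class."""
    owner = None    # hypothetical: the DM this object belongs to, if any


class ToyDM:
    """Toy dict-backed data manager; NOT the real peak.storage classes."""

    def __init__(self, rows):
        self.rows = rows    # pretend this dict is the underlying database

    def get(self, oid, default=None):
        return self.rows.get(oid, default)

    def __contains__(self, ob_or_oid):
        if isinstance(ob_or_oid, Persistent):
            # Object case: does it "belong" to this DM?
            return ob_or_oid.owner is self
        # ID case: would get() return a non-None value?
        return self.get(ob_or_oid) is not None


dm = ToyDM({1: "row one"})
mine = Persistent()
mine.owner = dm
stranger = Persistent()
```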

``dm.add(ob)``
     Add the already-created object to the DM.  The object must not already 
be owned by another DM.  (If it has already been added to the DM, this 
method is a no-op.)  You do not need to do anything special to use this 
method, but you *should* add a ``_check(ob)`` method to verify that ``ob`` is 
of a suitable type for storing in the DM.  Note that this method is now 
preferred to ``newItem()``, which is consequently deprecated.
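
As a rough sketch of the rules above (type check, no-op on repeat adds, 
refusal of objects owned elsewhere), assuming a hypothetical ``owner`` 
attribute and a hypothetical ``Bulletin`` element class.  The exception 
types used here are also assumptions; the actual DM classes may raise 
something different.

```python
class Bulletin:
    """Hypothetical element class stored by the toy DM below."""
    owner = None    # hypothetical: the DM that owns this object, if any


class ToyDM:
    """Toy data manager; NOT the real peak.storage classes."""

    def __init__(self):
        self.pending = []   # objects added but not yet written out

    def _check(self, ob):
        # Verify that ob is of a suitable type for this DM.
        if not isinstance(ob, Bulletin):
            raise TypeError("this DM only stores Bulletin objects")

    def add(self, ob):
        self._check(ob)
        if ob.owner is self:
            return          # already added here: no-op
        if ob.owner is not None:
            raise ValueError("object is already owned by another DM")
        ob.owner = self
        self.pending.append(ob)


dm = ToyDM()
b = Bulletin()
dm.add(b)
dm.add(b)   # second call changes nothing
```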

``dm.remove(ob)``
     Remove a previously-added object.  The object will no longer be owned 
by the DM, and future attempts to retrieve the object will fail (i.e., 
``ob in dm`` will be false, and ``dm.get(ob._p_oid)`` will return 
None).  The object is not actually deleted from the underlying database 
until ``dm.flush()`` is called, or the transaction commits.  You must 
implement a ``_delete_oids(oidList)`` method in order to use this 
method.  The ``_delete_oids()`` method should delete each of the supplied 
oids from the database, in the order given.  It is possible that some of 
the supplied oids may not exist in the underlying database, and this should 
*not* result in an error; it may be that the object was added and deleted 
without ever having been stored, and that is a perfectly valid use case.
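
A minimal sketch of a tolerant ``_delete_oids()``, assuming a toy DM whose 
"database" is just a dict held in a hypothetical ``rows`` attribute.  A real 
implementation would issue deletes against its own backend, but the same 
two properties apply: process the oids in the order given, and treat a 
missing oid as nothing to do.

```python
class ToyDM:
    """Toy dict-backed data manager; NOT the real peak.storage classes."""

    def __init__(self, rows):
        self.rows = rows    # pretend this dict is the underlying database

    def _delete_oids(self, oidList):
        for oid in oidList:             # in the order given
            # pop() with a default ignores oids that were never stored
            self.rows.pop(oid, None)


dm = ToyDM({1: "a", 2: "b"})
dm._delete_oids([2, 7])     # 7 was added and removed without being stored
```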

``dm.__iter__()``
     This is not actually implemented; in fact it raises 
``NotImplementedError``.  But if you want to do something like the 
``getAll()`` methods on the bulletins example DMs (which I'm changing to 
use ``__iter__``, by the way), then you should override it.  Your method 
should look something like this::

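         def __iter__(self):
             # anIterableOfStates() is a placeholder for your own query
             for state in anIterableOfStates():
                 oid = state['whatever_field_the_oid_is_in']
                 # skip objects with a pending (unflushed) delete
                 if oid not in self.to_delete:
                     yield self.preloadState(oid, state)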

     Your ``__iter__`` method should *not* yield items that have been 
deleted, but it *should* yield items that have been added to the DM even if 
they have not yet been added to the underlying database.  If you can't do 
this in a more efficient way, your method can always just start with a 
``self.flush()`` call.  This will flush all pending adds, deletes, and 
modifications to the underlying database, so you can then just query that 
database and get on with it, without worrying about uncommitted adds or 
deletes.
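
The flush-first fallback can be sketched like this.  It is a toy model 
under stated assumptions: ``rows``, ``to_add``, and this ``flush()`` are 
hypothetical stand-ins for the DM's real bookkeeping, and the iterator 
yields plain ``(oid, state)`` pairs where a real DM would yield 
``self.preloadState(oid, state)``.

```python
class ToyDM:
    """Toy dict-backed data manager; NOT the real peak.storage classes."""

    def __init__(self, rows):
        self.rows = rows        # pretend this dict is the underlying database
        self.to_add = {}        # oid -> state added but not yet written
        self.to_delete = set()  # oids removed but not yet deleted

    def flush(self):
        # Write pending adds and deletes through to the "database".
        self.rows.update(self.to_add)
        for oid in self.to_delete:
            self.rows.pop(oid, None)
        self.to_add.clear()
        self.to_delete.clear()

    def __iter__(self):
        self.flush()    # after this, the database alone is the truth
        for oid, state in sorted(self.rows.items()):
            yield oid, state


dm = ToyDM({1: "a", 2: "b"})
dm.to_add[3] = "c"      # added, not yet stored
dm.to_delete.add(1)     # removed, not yet deleted
items = list(dm)
```

The trade-off is that every iteration forces pending writes out to the 
database, which is simple but may cost more than filtering in memory.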

Whew.  I think that about covers it.  Hopefully, this will address a fairly 
long list of DM quirks that have been somewhat annoying to a lot of people 
for some time now.  Interestingly, it also makes several chunks of the 
IntroToPeak tutorial unnecessary.  The things that those sections say are 
still correct and should still work, mind you.  It's just that you don't 
need to write your own ``__contains__`` or ``get()`` or ``remove()`` 
methods any more, and the quirky nature of ``newItem()`` can now be 
avoided as well.

The implementation passes the existing tests, and a few new ones as well, 
but there remains the possibility that I've nonetheless introduced a bug 
somewhere, so be sure to test thoroughly before upgrading a production 
application to the latest CVS version -- not that any of you would do that, 
right?  :)  And remember, that applies whether you actually use the new 
features or not.