[PEAK] Progress on peak.schema
Phillip J. Eby
pje at telecommunity.com
Mon Feb 28 22:38:40 EST 2005
Just an FYI for anybody who's wondering whatever happened to peak.schema,
the future "workspace" persistence API, and conceptual queries. I have in
fact made some great progress in the last month.
OSAF (www.osafoundation.org) has a short-term consulting contract with my
corporation that involves, among other things, designing a developer
platform API for the Chandler open-source PIM project. This month I've
been working on Spike, which is a Chandler sandbox project at:
http://cvs.osafoundation.org/viewcvs.cgi/internal/Spike/
I've been keeping (somewhat) quiet about it because it wasn't clear what
its future with respect to Chandler was; things have now settled a little
bit on that front. That's not to say that it's going into Chandler, but
it's no longer a completely speculative project and there's now been an
official announcement regarding it:
http://lists.osafoundation.org/pipermail/dev/2005-February/002482.html
Anyway, all that aside, the relevant bit for PEAK users is that Spike's
'spike.schema' module is in fact a rough draft of many of the ideas I had
for peak.schema and SOAR (Simple Objects Archived Relationally). And,
Spike's architecture overview (src/spike/overview.txt) now contains a rough
plan of what the workspace API will end up looking like.
As you probably know, these are both APIs that I've been babbling about for
years without ever quite getting around to making them a reality. Working on
a tight deadline for OSAF with a focused set of requirements has helped
tremendously in narrowing down my vision to concrete APIs.
Some cool features of the implementation and architecture include:
* Fully event-driven model (i.e. change events for every attribute)
* Completely ZODB-free
* Monkey-typing for data (you can ask for 'SomeClass.someAttr.of(anyObject)';
see the sketch after this list)
* Relationships can exist independently of entity types (i.e. you can
create and persist relationships between types without those types needing
to know about it)
* Compact schema notation compared to peak.model (see src/spike/schema.txt
for examples)
* Schema objects have UUIDs for database synch and schema evolution (and
there's a tool to automatically edit your model files and tack UUIDs on at
the end for you, so they don't clutter up the schema definition and you
don't have to do it by hand)
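To illustrate that monkey-typing bullet, here's a minimal sketch of the
idea (the Attribute class and its use of the instance dict are purely
illustrative assumptions on my part, not Spike's actual implementation)::

    class Attribute(object):
        """Illustrative attribute object with an '.of()' accessor"""

        def __init__(self, name, default=None):
            self.name = name
            self.default = default

        def of(self, obj):
            # Read the value from any object's instance dict, whether
            # or not its class ever declared this attribute
            return obj.__dict__.get(self.name, self.default)

    class User(object):
        loginId = Attribute('loginId')

    class Anything(object):
        pass

    thing = Anything()
    thing.loginId = 'joe'
    print(User.loginId.of(thing))   # -> 'joe'

The point is that the attribute object itself is the access API, so any
object can carry the data, whether or not its class knows about it.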
I've also already figured out a process for automatically mapping the
schema to a relational database using the SOAR patterns that Ty and I
developed, so applications that don't need access to a legacy database
schema will be able to have worry-free automatic persistence. The
persistence API will be quite simple too, something like:
from protocols import Interface  # PEAK's PyProtocols interface package

class ISet(Interface):
    """Set with change events"""

    def add(item): pass
    def remove(item): pass
    def reset(iterable=()): pass
    def subscribe(receiver, hold=False): pass

    # ... a bunch of other methods, including query support

class IWorkspace(ISet):
    # add a bunch of undo/redo/flush stuff
    pass
That is, a workspace is logically just a set of objects with the same query
capabilities as any other set, and undo/redo/commit/rollback
support. Multi-valued attributes are modelled as sets, so you can perform
queries starting from any object, not just from "the database".
My earlier conception of workspaces was that they would be used to access
specially-altered classes using dotted names, but I've pretty much tossed
that out now. Class-replacement wiring can be managed a bit more easily
via command objects anyway.
Queries are now getting clearer, too, and I just drafted a rough API that
covers pretty much any kind of "single object per row" query, which is to
say it doesn't handle aggregation except of the IN (...) and EXISTS (...)
varieties. I do expect to be able to pick those other features up later,
but they're out of scope for Chandler in the near future so for now I've
got to think about them on my own time. ;)
Anyway, here are a couple of ways to express the same simple query using the
"bulletins" example schema::
aUser = ws.groupBy(User.loginId)['joe']
aUser = ws[User.loginId.eq('joe')]
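To make those two spellings concrete, here's a toy, purely in-memory
sketch of how such an interface might behave (the QuerySet, AttributeRef,
and Filter classes are illustrative assumptions, not the real
spike.schema API)::

    class Filter(object):
        """Illustrative filter object produced by 'attr.eq(value)'"""
        def __init__(self, attr, value):
            self.attr, self.value = attr, value
        def matches(self, obj):
            return getattr(obj, self.attr.name, None) == self.value

    class AttributeRef(object):
        """Illustrative stand-in for an attribute like User.loginId"""
        def __init__(self, name):
            self.name = name
        def eq(self, value):
            return Filter(self, value)

    class QuerySet(object):
        """Toy set supporting ws[filter] and ws.groupBy(attr)"""
        def __init__(self, items=()):
            self.items = list(items)
        def __getitem__(self, criterion):
            # filter-as-subscript: keep only the matching items
            return QuerySet(o for o in self.items if criterion.matches(o))
        def groupBy(self, attr):
            # map each attribute value to the subset carrying it
            groups = {}
            for o in self.items:
                groups.setdefault(getattr(o, attr.name),
                                  QuerySet()).items.append(o)
            return groups

    class User(object):
        loginId = AttributeRef('loginId')
        def __init__(self, loginId):
            self.loginId = loginId

    ws = QuerySet([User('joe'), User('ann')])
    joes = ws.groupBy(User.loginId)['joe']   # toy version yields a QuerySet
    alsoJoes = ws[User.loginId.eq('joe')]    # same selection via a filter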
There will presumably also be some sort of string syntax for queries, but
I'm not settled yet on what it will look like, except that it will be
syntactically valid Python. The intermediate form of filter objects,
however, will be objects like those in 'peak.model.query', only they won't
be lambdas. Instead, they will be introspectable so that for example the
SOAR backend can convert them into SQL, the Chandler backend can convert
them into repository-speak, etc.
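The value of making filters introspectable is that a backend can walk them
as plain data. Here's a minimal sketch of what such a walk might look like
(the Eq and And node classes and the to_sql function are illustrative
assumptions only)::

    class Eq(object):
        """Illustrative filter node: column = value"""
        def __init__(self, column, value):
            self.column, self.value = column, value

    class And(object):
        """Illustrative filter node: all criteria must hold"""
        def __init__(self, *criteria):
            self.criteria = criteria

    def to_sql(node, params):
        # Walk the filter tree, emitting an SQL fragment plus bind params
        if isinstance(node, Eq):
            params.append(node.value)
            return "%s = ?" % node.column
        if isinstance(node, And):
            return " AND ".join(to_sql(c, params) for c in node.criteria)
        raise TypeError("unknown filter node: %r" % (node,))

    params = []
    where = to_sql(And(Eq("login_id", "joe"), Eq("is_active", 1)), params)
    # where  == "login_id = ? AND is_active = ?"
    # params == ['joe', 1]

A lambda can only be called; a node tree like this can be translated into
SQL, repository-speak, or anything else a backend understands.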
*All* sets will have the query interface, not just workspaces, so you could
do this (again using the bulletins schema):
someBulletins = aCategory.bulletins[
    Bulletin.postedBy[User.loginId.eq('joe')]]
This would retrieve all posts in 'aCategory' that were posted by users
whose login ID equals 'joe'. In the case of a SOAR backend, all queries
end up getting pushed all the way down to the backend and implemented as
SQL, and there should be a generic function somewhere to allow custom query
tuning where needed.
Anyway, in case it wasn't clear, this model involves *no DMs* (data
managers), so you don't
have to write custom ones for every class. Instead, the backend of a
workspace just has to be able to map from class and attribute objects to
whatever storage mechanism it uses. (Probably using a generic function, so
that you can use mixins to do cross-database stuff.)
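To sketch that generic-function idea, here's a hand-rolled stand-in built
on a type registry (everything here, including the schema classes and the
table/column mapping, is an illustrative assumption rather than PEAK's
actual dispatch machinery)::

    _storage_rules = {}

    def when(schema_type):
        """Register a storage-mapping rule for a schema object type"""
        def register(func):
            _storage_rules[schema_type] = func
            return func
        return register

    def storage_for(schema_obj):
        # "Generic function": dispatch on the schema object's type,
        # walking the MRO so mixin classes can contribute rules too
        for cls in type(schema_obj).__mro__:
            if cls in _storage_rules:
                return _storage_rules[cls](schema_obj)
        raise TypeError("no storage rule for %r" % (schema_obj,))

    class ClassSchema(object):
        def __init__(self, table):
            self.table = table

    class AttributeSchema(object):
        def __init__(self, column):
            self.column = column

    @when(ClassSchema)
    def map_class(schema):
        return schema.table        # a class maps to a table

    @when(AttributeSchema)
    def map_attribute(schema):
        return schema.column       # an attribute maps to a column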
PEAK's overall requirements are a lot broader and deeper than I currently
know how to handle with the architecture I'm building for Spike, but the
really nice thing is that the API is now much more concrete. Most of
PEAK's specialty requirements (like cross-database storage and joins,
legacy DB support, workspaces that represent files or documents, etc.) can
be handled on the back-end in a way that's relatively invisible to the
API. PEAK also has lots of related APIs that have to be integrated, like
peak.events and the binding package metadata facilities.
So, to sum up... there's still no concrete timeframe for anything
regarding peak.schema, but you can see many hints of coming attractions in
spike.schema and related modules. Right now my days are occupied with OSAF
work, and my nights and weekends, when I have any time to spare, are spent
trying to finish my PyCon stuff. After PyCon, though, I'll hopefully have a
bit more time to start actually implementing peak.schema and SOAR.