[PEAK] PROPOSAL: Remove ZODB/Zope X3 dependencies/"compatibility"
Phillip J. Eby
pje at telecommunity.com
Sat May 29 17:36:40 EDT 2004
I've been reviewing the latest version of Zope X3, and it appears to me
that it's a waste of time to keep chasing compatibility with it in peak.web
and peak.model. So, I'd like to propose some changes to the affected packages:
ZODB 4 and peak.model
---------------------
PEAK's persistence machinery was built on an early version of ZODB 4. But
Zope X3 has dropped use of ZODB 4 and gone to an enhanced ZODB 3.3
instead. So, we're completely incompatible at this point, and if we're
going to make a change, we might as well change to something that suits
PEAK better.
My plan is basically as follows: create a dictionary-like mapping "record"
object that can be given functions that load individual attributes, as well
as a catch-all "load everything else" function, and that provides an event
source for changes made to loaded values. Then, model.Element will be
changed to no longer subclass the ZODB "persistent" base class, and the
various DataManager base classes will be changed to set objects' '__dict__'
attribute to a "record" object, and to subscribe to the event source. (And
before anybody asks: no, this will not make data managers asynchronous,
although it would in principle become *possible* for you to make them so,
with sufficient effort.)
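
As a rough illustration of the "record" idea described above, here is a minimal sketch. It is not PEAK code: the names 'Record', 'loaders', 'load_rest', and 'subscribe' are hypothetical, and the "event source" is assumed to be nothing more than a list of callbacks.

```python
class Record(dict):
    """Dict-like record that loads values lazily and reports changes.

    Hypothetical sketch; none of these names come from PEAK.
    """

    def __init__(self, loaders=None, load_rest=None):
        dict.__init__(self)
        self._loaders = dict(loaders or {})  # attribute name -> loader(name)
        self._load_rest = load_rest          # catch-all: returns a dict
        self._subscribers = []               # called as callback(record, key, value)

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def __missing__(self, key):
        # dict.__getitem__ calls this when 'key' has not been loaded yet.
        if key in self._loaders:
            value = self._loaders.pop(key)(key)
            dict.__setitem__(self, key, value)  # loading is not a "change"
            return value
        if self._load_rest is not None:
            # Run the catch-all loader at most once.
            load_rest, self._load_rest = self._load_rest, None
            for name, value in load_rest().items():
                if name not in self and name not in self._loaders:
                    dict.__setitem__(self, name, value)
            if key in self:
                return dict.__getitem__(self, key)
        raise KeyError(key)

    def __setitem__(self, key, value):
        self._loaders.pop(key, None)  # an explicit value replaces a pending load
        dict.__setitem__(self, key, value)
        for callback in self._subscribers:
            callback(self, key, value)


rec = Record(loaders={"name": lambda key: "Alice"},
             load_rest=lambda: {"age": 30})
rec.subscribe(lambda record, key, value: None)  # a DM would mark itself dirty here
rec["age"] = 31
```

The sketch exercises item access only. Wiring such an object up as an instance's '__dict__' is a separate matter that it does not attempt.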
I don't know if these changes can be accomplished with full
backwards-compatibility. I'd like to try to leave the current DM extension
API intact if possible, but it will necessarily involve performance
compromises. For example, today the normal return from a _load() method is
a dictionary containing values or LazyLoader instances. To preserve that,
I'm going to have to have code scan the return value for LazyLoaders and
handle them separately, which is going to impose some overhead that wasn't
there before. (Unless of course everybody tells me they aren't using
LazyLoaders and I don't need to preserve that functionality.)
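
To make the overhead concrete, the scan would look roughly like the sketch below. The 'LazyLoader' class here is only a stand-in marker, not PEAK's actual class, and 'split_state' is a hypothetical helper name.

```python
class LazyLoader(object):
    """Stand-in for PEAK's LazyLoader; only its role as a marker matters here."""

    def __init__(self, func):
        self.func = func


def split_state(state):
    """Separate plain values from LazyLoader instances in a _load() result.

    Every key has to be visited on every load, which is the extra cost
    of keeping the old return convention.
    """
    values, loaders = {}, {}
    for name, value in state.items():
        if isinstance(value, LazyLoader):
            loaders[name] = value
        else:
            values[name] = value
    return values, loaders


state = {"title": "Hello", "body": LazyLoader(lambda: "long text")}
values, loaders = split_state(state)
```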
One particularly dicey area is '_p_jar' and '_p_oid'. Technically these
are not part of the public interface of peak.model objects, but some of you
are probably using them (especially since I used them in peak.ddt). As it
happens, any attribute or method beginning with '_p_' would go away under
this proposal. Instead, one would have to use some extrinsic information
to track an object's oid or DM, such as via a closure created when the
object's ghost is created.
This isn't really as bad as it sounds, as I believe I can make it so that
code written to the old API won't notice any changes, as long as you can
identify one or more attributes that the old '_p_oid' was based
on. However, let this be a warning now: if you're using '_p_' attributes
in your code for any reason, you should probably get rid of them ASAP, as
they will definitely not be backward compatible.
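
One possible reading of "a closure created when the object's ghost is created" is sketched below. 'ToyDM', 'make_ghost', and 'load_state' are hypothetical names chosen for the illustration; the real DM classes are not shown.

```python
class ToyDM(object):
    """Toy data manager; 'rows' maps an oid to a dict of stored state."""

    def __init__(self, rows):
        self.rows = rows
        self.load_count = 0

    def _load(self, oid):
        self.load_count += 1
        return dict(self.rows[oid])

    def make_ghost(self, klass, oid):
        ob = klass.__new__(klass)  # a "ghost": created without any state

        def load_state():
            # The closure, not the object, remembers the DM (self) and the oid.
            ob.__dict__.update(self._load(oid))

        return ob, load_state


class Item(object):
    pass


dm = ToyDM({27: {"title": "hello"}})
ghost, load_state = dm.make_ghost(Item, 27)
load_state()  # ghost.title is now available; no '_p_' attributes were needed
```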
The ultimate goal will be to support a new, '_p_'-less and DM-free API,
using "editing context" objects with an API that's something like:
    # Find by primary key
    something = ec.find(SomeType, some_field=27)

    # Find by alternate key
    something = ec.find(SomeType, foo=99)

    # Query for multiple items
    for item in ec.find(SomeType, some_non_unique_field="baz"):
        pass

    # Query for all items
    for item in ec.find(SomeType):
        pass

    # Add an object
    newObj = SomeType(some_field=42, other_field="test")
    ec.add(newObj)

    # Delete object(s)
    ec.delete(SomeType, some_field=27)
instead of the current:
    # Find by unique key
    something = self.someDM[27]

    # Find by alternate key
    something = self.anotherDM[27]

    # Query for multiple items
    for item in self.someQueryDM["baz"]:
        pass

    # Query for all items
    # XXX write your own special method

    # Add an object
    newObj = self.someDM.newItem(SomeType)
    newObj.some_field = 42
    newObj.other_field = "test"

    # Delete object(s)
    # XXX write your own special method
My experience with DMs so far is that it's really tedious to keep track of
them, even in trivial applications like the 'bulletins' example. So, this
new format eliminates the need for FacadeDMs when searching on alternate
keys, *and* it eliminates the need for QueryDMs to manage collections. In
other words, the single 'find' method should be usable for finding both
individual items (whenever all the fields for a unique key are supplied)
and doing multi-item queries (when no unique keys are present). This API
should also be much more amenable to simple object-relational mappings, and
should be able to integrate with peak.query in a straightforward fashion.
For this API to be implementable, editing contexts will need to know what
the unique keys are, and to have a rough idea of collection sizes
(specifically, whether a given query or collection can or should be cached). However,
it's not clear to me at present whether this info will be part of the
peak.model objects, or in the underlying storage mechanism. It seems to
me, however, as though this information needs to be part of the program's
static model, because when you issue a 'find()' call it should be clear
whether you expect a single item or multiple ones. (And asking for a
single item that doesn't exist should result in an error.)
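
To show how declared unique keys could decide between a single-item lookup and a multi-item query, here is a small in-memory sketch. It is only an illustration under assumptions: 'EditingContext', 'NotFound', and the 'unique_keys' argument are hypothetical, and the sketch says nothing about where the key information should really live.

```python
class NotFound(LookupError):
    pass


class SomeType(object):
    def __init__(self, **fields):
        self.__dict__.update(fields)


_missing = object()


class EditingContext(object):
    def __init__(self, unique_keys):
        # unique_keys: type -> list of tuples of field names (assumed format)
        self.unique_keys = unique_keys
        self.objects = []

    def add(self, ob):
        self.objects.append(ob)

    def _matches(self, klass, criteria):
        return [ob for ob in self.objects
                if isinstance(ob, klass)
                and all(getattr(ob, k, _missing) == v
                        for k, v in criteria.items())]

    def find(self, klass, **criteria):
        matches = self._matches(klass, criteria)
        for key in self.unique_keys.get(klass, ()):
            if all(field in criteria for field in key):
                # All fields of a unique key were given: expect one item.
                if not matches:
                    raise NotFound(criteria)
                return matches[0]
        return iter(matches)  # no unique key supplied: a multi-item query

    def delete(self, klass, **criteria):
        doomed = set(id(ob) for ob in self._matches(klass, criteria))
        self.objects = [ob for ob in self.objects if id(ob) not in doomed]


ec = EditingContext({SomeType: [("some_field",), ("foo",)]})
ec.add(SomeType(some_field=27, foo=99, tag="baz"))
something = ec.find(SomeType, some_field=27)   # single object
for item in ec.find(SomeType, tag="baz"):      # iterator of objects
    pass
```

Because the keys are declared up front, a caller can tell from the call itself whether 'find()' will return one object or an iterator.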
Anyway, this API should fix a lot of quirks and warts in the current
DM-based API, such as the inability to check whether an object "exists" at
retrieval time. I'll probably write other posts later to flesh out this
design further, and the path along which the existing code will be migrated.
Zope X3 and peak.web
--------------------
PEAK's web application package is based on Zope X3's 'zope.publisher'
package. My original intent in doing this was to:
* Avoid having to maintain HTTP request/response classes
* Take advantage of Zope's 'publish()' algorithm
* Take advantage of Zope's I18N support for determining browser-preferred
languages and character sets (and possibly other locale preferences later)
The downside to all this is that one must currently install a significant
portion of Zope X3 (zope.testing, zope.interface, zope.component,
zope.proxy, zope.security, zope.i18n, and zope.publisher, to be
precise). PEAK doesn't really need most of this stuff. Indeed,
'zope.i18n.locales' and 'zope.publisher' are all we're really trying to use
at present.
And even the packages we do use have stuff we don't necessarily need. For
example, much of what HTTPRequest and HTTPResponse do is there (IMO) to
support the traditional Zope 2 APIs and marshalling functions, which are
just cruft where PEAK is concerned.
So, it's tempting to consider replacing them with simpler objects. The
"lingua franca" used by PEAK for HTTP requests and responses is just an
environment dictionary, and a set of in/out/error streams. Passing them
directly to published objects would give them total control over how their
inputs were parsed and their outputs formatted.
Another advantage of this approach is that it would allow a functional
interface for HTTP: pass in an environment and input stream, and receive
back a set of headers and an iterator over the output. The disadvantage is
that it couples knowledge of HTTP to the function. That is, the function
is defined in terms of HTTP and can't be migrated to something else. I
don't see this as a huge concern, however.
What else do we lose by not having request/response objects? Not much,
that I can tell. Pretty much every piece of functionality that they
provide can be replaced with function calls. For example, a
'get_cookie(environ)' function. Such functions can even cache their
results inside the 'environ' mapping, so that the work is only done
once. In essence, we're centralizing all of the system's mutability in an
environment mapping. Actually, we could make 'environ' immutable and have
the functions return a new environment, but that seems like overkill.
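
A function of this kind, caching its result in 'environ', might look like the sketch below. It is written for current Python (the 'http.cookies' module); the cache key 'example.cookies' and the extra 'name' and 'default' parameters are assumptions made for the example.

```python
from http.cookies import SimpleCookie

_CACHE_KEY = "example.cookies"  # hypothetical key, not a PEAK convention


def get_cookies(environ):
    """Parse HTTP_COOKIE once and cache the result inside 'environ'."""
    cookies = environ.get(_CACHE_KEY)
    if cookies is None:
        parsed = SimpleCookie(environ.get("HTTP_COOKIE", ""))
        cookies = dict((name, morsel.value) for name, morsel in parsed.items())
        environ[_CACHE_KEY] = cookies
    return cookies


def get_cookie(environ, name, default=None):
    return get_cookies(environ).get(name, default)


environ = {"HTTP_COOKIE": "session=abc123; theme=dark"}
session = get_cookie(environ, "session")
theme = get_cookie(environ, "theme")  # served from the cached mapping
```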
So, throughout the existing peak.web interfaces, the 'interaction' and
'ctx' parameters could be replaced by an 'environ' parameter. Most of the
functionality of the current 'web.TraversalContext' would move to functions
that operate on an 'environ', possibly returning a new 'environ'.
In this way, we could replace linear traversal with recursive traversal,
making all renderings capable of functional composition. Or, to put that
in English, it means you can use a "pipes and filters" pattern to assemble
components. (Ulrich will be happy because this means he'll be able to do
his XSL transforms over arbitrary renderables, without having to create a
dummy Response object, for example.)
The end result is that we'd have a uniform interface at all levels of
peak.web: a single-method interface, perhaps something like:
    def handle_http(environ, input_stream, error_stream):
        return status_code, header_sequence, output_iterable
You could then create an arbitrary number of processing stages over
this. For example, PEAK's first processing stage could simply call the
next stage, wrapped in a 'try' block that does transaction and error
handling. Each stage can delegate to a subsequent stage following a
traversal operation, if there is anything to traverse. Interestingly, this
approach completely eliminates the need to have complex logic like Zope's
"traversal stack" system, as each object being traversed has total control
over the subsequent traversal.
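
The sketch below shows how stages with this signature could be stacked: an error-handling wrapper, a traversal step that delegates to a child stage, and an output filter. All names are hypothetical, the status is assumed to be a string such as "200 OK", and headers are assumed to be preformatted strings.

```python
import io


def with_error_handling(next_stage):
    def stage(environ, input_stream, error_stream):
        try:
            return next_stage(environ, input_stream, error_stream)
        except Exception as exc:
            # Only errors raised while calling the stage are caught here,
            # not errors raised later while the output is being iterated.
            error_stream.write("error: %s\n" % (exc,))
            return ("500 Internal Server Error",
                    ["Content-Type: text/plain"], ["Internal error\n"])
    return stage


def traverse(children, default):
    def stage(environ, input_stream, error_stream):
        path = environ.get("PATH_INFO", "").lstrip("/")
        if not path:
            return default(environ, input_stream, error_stream)
        name, _, rest = path.partition("/")
        if name not in children:
            return "404 Not Found", ["Content-Type: text/plain"], ["Not found\n"]
        # Hand the child a new environ holding only the remaining path.
        child_env = dict(environ, PATH_INFO=("/" + rest) if rest else "")
        return children[name](child_env, input_stream, error_stream)
    return stage


def upper_filter(next_stage):
    def stage(environ, input_stream, error_stream):
        status, headers, body = next_stage(environ, input_stream, error_stream)
        return status, headers, (chunk.upper() for chunk in body)
    return stage


def hello(environ, input_stream, error_stream):
    return "200 OK", ["Content-Type: text/plain"], ["hello\n"]


app = with_error_handling(traverse({"shout": upper_filter(hello)}, hello))
status, headers, body = app({"PATH_INFO": "/shout"}, io.StringIO(), io.StringIO())
```

Each stage sees only an environ and two streams, so a filter such as 'upper_filter' can wrap any other stage without knowing what that stage does internally.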
This interface is easy to adapt to the existing IRerunnableCGI interface, too:
    def runCGI(self, input, output, errors, env, argv=()):
        status, headers, iterable = self.subject.handle_http(env, input, errors)
        output.write("Status: %s\n" % (status,))
        for header in headers:
            output.write("%s\n" % (header,))
        output.write("\n")  # a blank line ends the CGI header block
        for chunk in iterable:
            output.write(chunk)
So, it can integrate nicely with our current CGI, FastCGI, and HTTP wrappers.
Thus, I believe we can end our dependency on Zope X3 for publishing
support, but the changes to peak.web will be substantial. In addition to
the changes I've outlined above, we would also be dropping the use of
adaptation to convert components to their decorator/view objects. We'll
still have decorators, but they'll be registered via the configuration
system instead, allowing "placeful" lookups, and getting rid of the need to
have the page, error, and traversal protocols for registration
purposes. Indeed, we should end up with it being possible for most
decorators to be defined by simple configuration, without coding (as is
currently required).
Wrap-up
-------
So, those are the proposals. They involve quite a bit of work to
implement, so I don't know how long they'll take me, or even when I'll be
getting started. In the meantime, I'd like to hear your feedback,
comments, questions, objections, etc.