[PEAK] PROPOSAL: Remove ZODB/Zope X3 dependencies/"compatibility"

Wed Jun 2 17:20:10 EDT 2004

At 09:05 PM 6/2/04 +0200, ueck at net-labs.de wrote:
>Quoting "Phillip J. Eby" <pje at telecommunity.com>:
> > Most implementations of IHTTPHandler will check to see if the environ
> > indicates a need to traverse further, and if so, call an appropriate
> > ITraversable's traverse method to obtain the new object, then adapt the new
> > object to IHTTPHandler and delegate the handle_http call to it.
>
>one could then, within some component close to the traversal-root,
>also do the work of parsing headers and request-type to decide
>whether it is a Browser/WebDAV/XMLRPC/SOAP-Request and
>adapt to more specific Interfaces (e.g. IXMLRPCHandler) if needed.
>but this component is only plugged into the publishing if this feature
>is needed (same for ++XXX++/@@-URL handlers).

You could do that, of course, but IMO it makes more sense to let endpoints 
handle these matters.  That is, the actual method or template or whatever 
that's getting executed can implement these protocols if and as they wish.

I've been thinking since yesterday that there should be an API call as follows:

     web.lookup(environ, key, default=NOT_GIVEN)

where 'key' is adaptable to config.IConfigKey.  This API would attempt to 
compute the value of 'key' for 'environ', or return its cached value.  To 
compute a value, it would look up the key 'peak.web.context', look up 
'config.FactoryFor(key)' in that context, and then invoke the found factory 
on 'environ' and 'key'.  That is, something like:

     def lookup(environ,key,default=NOT_GIVEN):
         if key in environ:
             return environ[key]

         context = environ['peak.web.context']
         factory = config.lookup(
             context, config.FactoryFor(key), default=NOT_FOUND)

         if factory is NOT_FOUND:
             if default is NOT_GIVEN:
                 raise XXX
             return default

         result = environ[key] = factory(environ,key,default)
         return result

Now, by registering factories for property names in an .ini, one can create 
configuration namespaces within the environment for whatever 
application-defined values might be needed.
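
As a rough, self-contained illustration of that idea, the sketch below uses a plain dict in place of the .ini-driven configuration. The FACTORIES registry, the 'myapp.greeting' key, and greeting_factory are all invented for the example, and KeyError is only an assumed choice for the exception left unspecified above; the real function would go through config.lookup() and config.FactoryFor().

```python
NOT_GIVEN = object()   # sentinel: the caller supplied no default

# Hypothetical stand-in for factories registered in an .ini file
FACTORIES = {}

def lookup(environ, key, default=NOT_GIVEN):
    """Return the cached value for 'key', or compute and cache it."""
    if key in environ:
        return environ[key]

    factory = FACTORIES.get(key)
    if factory is None:
        if default is NOT_GIVEN:
            raise KeyError(key)   # assumption: exception type is not specified
        return default

    result = environ[key] = factory(environ, key, default)
    return result

def greeting_factory(environ, key, default):
    # Application-defined value computed from the environment
    return "Hello, %s" % environ.get('REMOTE_USER', 'anonymous')

FACTORIES['myapp.greeting'] = greeting_factory

environ = {'REMOTE_USER': 'ueck'}
greeting = lookup(environ, 'myapp.greeting')
```

After the first call the value lives in 'environ' itself, so later lookups of the same key never reach the factory.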

> > ITraversable, however, will not be the one to make changes to the
> > HTTP-specific environ values such as PATH_INFO and SCRIPT_NAME.  Only
> > IHTTPHandler will make those changes.  However, I'm thinking that
> > ITraversables will still be required to update something in the environ to
> > reflect the path travelled.
>
>and an up-to-date list of remaining items to traverse to.

No.  ITraversable is only responsible for traversing the name it has been 
given.  IHTTPHandler is responsible for adjusting PATH_INFO and SCRIPT_NAME 
as traversal progresses.

The reason for this division is that ITraversables can be used in non-HTTP 
contexts, such as traversal to resources from skins, and data path 
traversal in DOMlets.  These kinds of traversal should not have any effect 
on HTTP variables in the environment, or else anything that needs to know 
what URL was actually in effect would get confused.

So, although I still think that there will be something updated in the 
environment by ITraversables, I'm quite clear that it will not be any of 
the HTTP variables.
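
To make the split concrete, here is a minimal sketch under some assumptions: the traverse(environ, name) signature, the shift_path() helper, and the DictTraversable class are all hypothetical, since none of those details are settled yet. The point is only that the handler side moves path segments from PATH_INFO to SCRIPT_NAME, while the traversable just maps a name to an object.

```python
def shift_path(environ):
    """Move the next PATH_INFO segment onto SCRIPT_NAME and return it.

    Returns None when there is nothing left to traverse.
    """
    path = environ.get('PATH_INFO', '').lstrip('/')
    if not path:
        return None
    name, sep, rest = path.partition('/')
    environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + '/' + name
    environ['PATH_INFO'] = sep + rest
    return name

class DictTraversable:
    """Hypothetical traversable: resolves one name, touches no HTTP keys."""
    def __init__(self, children):
        self.children = children

    def traverse(self, environ, name):
        return self.children[name]

def handle_http(ob, environ):
    """Handler side: adjust the HTTP variables, delegate name resolution."""
    name = shift_path(environ)
    while name is not None:
        ob = ob.traverse(environ, name)
        name = shift_path(environ)
    return ob

root = DictTraversable({'docs': DictTraversable({'intro': 'intro page'})})
environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/docs/intro'}
found = handle_http(root, environ)
```

Because DictTraversable never looks at PATH_INFO or SCRIPT_NAME, the same object could be traversed from a skin or a DOMlet without disturbing the URL in effect.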

> > There are still some bits to be worked out with regard to how absolute and
> > relative URL's get calculated, but the rest is mostly just nailing down
> > division of responsibilities, defining names for certain values in
> > 'environ', and establishing API calls to get at most of the things that one
> > now gets at via attributes of the context or the interaction.
>
>the parallelism of Traversed-Path vs. Component-Tree will still exist ?

Depends on what you mean by that.  There was never any such parallelism 
required, although it's certainly convenient for lots of things.

For applications laid out using XML, yes, they will essentially be 
hierarchical in that way.  Resource directories too.  But, there is no 
tight binding occurring here, and the return value of a traversal operation 
need not have *any* relationship to the ITraversable that returned it.

> > At that point, peak.web will be something of a microkernel, with only two
> > very simple interfaces for someone to implement, and lots of imperative API
> > calls that can be used to do useful things.  To make it easy-to-use,
> > however, we'll then define controller classes (replacing the previous
> > Decorator concept) and an XML vocabulary for defining/declaring them.  And
> > as time goes by we'll likely end up writing lots of HTTP utility functions
> > to do things like extract cookies, set cookies, etc. given an
> > 'environ'.  But, since these functions won't be bound into "request" or
> > "response" types, any component will be free to use alternative ways of
> > accomplishing those tasks, should they need to do so.
>
>i think i like this :)
>
>how can we get ahead ?

We need to more precisely specify which interfaces do what.  I've decided 
also that IHTTPHandler is a base class of IHTTPApplication, where the 
latter is the interface that applications are adapted to by a container 
(like the web runner, supervisor, FastCGI control, etc.).  This then makes 
it easy to put a transaction/error handling wrapper around the root 
component of a web app, without forcing the serialization format to specify 
that an object is "top level".

What I meant to say is that if you design an application's site layout with 
the XML format, it should be composable.  You should be able to "include" 
that XML file into a larger application, and just have it work.  So, there 
shouldn't need to be anything in the configuration that says, "I'm the root 
of the site".

Of course, we could handle this by having a flag in the environment that 
indicates whether the transaction/error wrapping is being handled by a 
higher level component, but then this means that every 'handle_http()' 
method would have to check this.  So, better to wrap the top level in an 
adapter.  Anyway, this bit of policy (one hit = one transaction, plus a 
mechanism for viewing/handling exceptions) should be the only thing hardwired.

It's possible to question even this, though.  Should individual components 
within a site perhaps be specifically marked for transaction handling or 
exception handling?  What happens if a subcomponent traps certain errors, 
and then the parent component commits the transaction anyway?  But I don't 
see much way around this, since even the very beginnings of traversal will 
likely want to have authentication, which probably means using a DM, which 
therefore means a transaction is needed.  So, I guess it does make sense to 
stick with the implicit transaction model.

Anyway, IHTTPApplication will be the same as IHTTPHandler, except that it's 
responsible for preparing the environment by:

* setting the initial value for 'peak.web.context'
* beginning the transaction
* trapping exceptions from subsequent handlers
* committing or aborting the transaction, if it is still in progress
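
The four responsibilities above can be sketched as a wrapper around any handler. Everything named here is an assumption made for illustration: FakeTransaction stands in for whatever transaction service is used, the handler is a plain callable taking 'environ', and the (status, body) return value is invented because the return type of handle_http() has not been specified.

```python
class FakeTransaction:
    """Hypothetical transaction service that records what happened."""
    def __init__(self):
        self.active = False
        self.log = []

    def begin(self):
        self.active = True
        self.log.append('begin')

    def commit(self):
        self.active = False
        self.log.append('commit')

    def abort(self):
        self.active = False
        self.log.append('abort')

class Application:
    """Wraps a root handler with the one-hit-one-transaction policy."""
    def __init__(self, handler, context, txn):
        self.handler = handler
        self.context = context
        self.txn = txn

    def handle_http(self, environ):
        environ['peak.web.context'] = self.context   # initial context
        self.txn.begin()                             # begin the transaction
        try:
            result = self.handler(environ)
        except Exception as exc:                     # trap handler errors
            if self.txn.active:
                self.txn.abort()
            return ('500 Internal Server Error', str(exc))
        if self.txn.active:                          # still in progress?
            self.txn.commit()
        return result

def ok_handler(environ):
    return ('200 OK', 'hello')

def broken_handler(environ):
    raise ValueError('boom')

good_txn, bad_txn = FakeTransaction(), FakeTransaction()
good = Application(ok_handler, 'root-context', good_txn).handle_http({})
bad = Application(broken_handler, 'root-context', bad_txn).handle_http({})
```

Since the wrapper is the only place that knows about transactions, the wrapped handler can be included under a larger application without change.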

However, to make this more flexible, it's possible that it could be given a 
plugin architecture, so as to allow registering plugins to do additional 
pre- or post-processing functions, and I suppose that these functions 
themselves could be plugin-ized.

Luckily, all of these policy issues do not interfere with mechanism.  The 
mechanism is simply that you need an IHTTPApplication, and you can always 
write your own.

Anyway, the main bits that are still really open-ended in my mind have to 
do with computing traversed and absolute URLs.  A traversable should be 
able to give you its URL, given an environment.  (Which means that 
ITraversable also needs a getURL() method.)

In the current system, TraversalContext can compute both a "traversed URL" 
and an "absoluteURL" by keeping track of the previous TraversalContext.  I 
suppose I could do the same with 'environ', i.e. having each 'environ' have 
a pointer to the previous environ traversed from.  And the IHTTPApplication 
could set the initial environment's base "absolute" URL.
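
One way the "pointer to the previous environ" idea might look is sketched below. The three 'example.*' keys and the helper names are hypothetical, since the actual names for values in 'environ' are still to be defined, and this covers only the simple case where the absolute URL is the base URL plus the names traversed.

```python
# Hypothetical environ keys, invented for this sketch
PREVIOUS = 'example.previous_environ'
NAME = 'example.traversed_name'
BASE_URL = 'example.base_url'

def child_environ(environ, name):
    """Return a new environ for the object reached by traversing 'name'."""
    new = dict(environ)
    new[PREVIOUS] = environ   # pointer back to where we came from
    new[NAME] = name
    return new

def traversed_names(environ):
    """Walk the chain of previous environs to recover the path travelled."""
    names = []
    while PREVIOUS in environ:
        names.append(environ[NAME])
        environ = environ[PREVIOUS]
    names.reverse()
    return names

def absolute_url(environ):
    """Join the base URL set on the starting environ with the names."""
    return '/'.join([environ[BASE_URL]] + traversed_names(environ))

start = {BASE_URL: 'http://example.com/app'}   # set by the application
docs = child_environ(start, 'docs')
intro = child_environ(docs, 'intro')
url = absolute_url(intro)
```

Copying the dict at each step keeps the earlier environs intact, at the cost of one copy per traversal step.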

So, attempting to map from old -> new:

     TraversalContext -> functions operating on 'environ'
     Traversal        -> IHTTPApplication initializes starting 'environ'
     *Traversable,Decorator,MultiTraverser -> roughly the same, but maybe fewer methods
     CallableAsWebPage -> needs complete reimplementation to handle mapply and argument parsing
     Interaction, InteractionPolicy -> more funcs on 'environ'
     Resource classes -> some refactoring
     Request/response classes -> functions operating on 'environ'
     etc...

All in all, this is mainly a matter of rather tedious rewriting, while 
adding tests.  One of the annoying things about the Zope dependency of 
peak.web is that it was hard to write good unit tests that didn't depend on 
Zope, too.  And, it was hard to define sufficiently small units, because so 
many things were emergent properties of a larger system.  It should be a 
lot easier to write tests for small operations against 'environ' dictionaries.

The rewriting will take a while, though.  When I've replaced all the 
existing code, I will not yet have replaced the functionality, because we 
will be losing Zope-supplied functions like cookie parsing, query string 
and form post parsing, cookie setting on responses, character set 
negotiation, and all of that sort of thing.  Those will have to be written 
as property factories and added to the configuration, but that also makes 
them good candidates for contributed patches.  So, as soon as a 
'web.lookup()' function exists, volunteers such as yourself could start 
writing factories to replace functionality we currently get from Zope X3.
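
As an example of the kind of factory meant here, the following sketch parses cookies from an 'environ' dictionary. It assumes the (environ, key, default) factory signature from the lookup() sketch earlier in this message and a current Python, where the standard library's http.cookies module does the parsing; the function name and the 'example.cookies' key are made up.

```python
from http.cookies import SimpleCookie

def cookies_factory(environ, key, default=None):
    """Hypothetical property factory: parse HTTP_COOKIE into a plain dict."""
    jar = SimpleCookie()
    jar.load(environ.get('HTTP_COOKIE', ''))
    return {name: morsel.value for name, morsel in jar.items()}

# The factory needs nothing but a dictionary, so a unit test is tiny:
parsed = cookies_factory(
    {'HTTP_COOKIE': 'session=abc123; theme=dark'}, 'example.cookies')
empty = cookies_factory({}, 'example.cookies')
```

Nothing here depends on a request object or a running server, which is what makes this style of function easy to test in isolation.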

Contributions should include unit tests that can be integrated into the 
peak.web test suite.

Oh, and by the way, I don't want to go with the 'foo:int' style of 
parameter passing as the default means of handling input parameters.  That 
doesn't mean it shouldn't be optionally available to people (via a 
different property namespace), I just mean that I don't consider it a high 
priority, and I don't think it should be the default mechanism for getting 
at query/cookie/form variables.  It was originally designed for interacting 
with oblivious Python code that wasn't designed for web use.  But that's 
what controllers (formerly decorators) are for, and we have other things 
like function attributes and adaptation that can be used to determine what 
kind of data a function is looking for.

Anyway...  did that answer your question?  :)