[PEAK] What do you use 'offerAs' for?

Tue Sep 20 03:24:13 EDT 2005

No, don't worry, I'm not planning to remove it.  :)  At least, not any time 
soon.

Here's the thing.  I've been mulling over one of the big aspects of what 
PEAK does - providing context for components - and realizing that it looks 
a lot like a giant kludge to work around the absence of Lisp-style dynamic 
variables.  If this is true, it would mean that PEAK's API could be 
dramatically simplified for certain types of tasks, to the point of seeming 
to disappear altogether.  In essence, many aspects of PEAK would look a lot 
more like a library and a lot less like a framework.

So, the thing that we found a couple of years back was that most PEAK 
applications tend to concentrate their configuration at some "global" 
level.  We dubbed this a "service area", because you tend to have a bunch 
of singleton services in it, that are shared by virtually every component 
that needs a service of the given type.  An application can have more than 
one service area, but this is usually for special reasons, like the 
supervisor tool that maintains a pre-populated service area for processes 
that it's going to fork off from itself, and a separate service area for 
managing the child processes.  Similarly, an application server hosting 
multiple applications would want separate service areas.

So, our implementation shifted a wee bit to make it easier to do this sort 
of thing.  We added an explicit ServiceArea type and interface, facilities 
to automatically create singletons when you ask for a general-purpose 
component, and so on.

There are some gotchas with the existing setup, however.  One use case for 
creating an alternate service area is when you want to, say, create a second
transaction independent of your current transaction.  For example, if 
you're in an error handler rolling back a primary transaction, you might 
need a separate transaction in which to send out an email notice about the 
error.  This is rather awkward in PEAK at the moment because you can't just 
create a separate transaction and share the remaining components in your 
existing service area, because the coupling between components is via the 
service area.  In other words, all your existing components that use a 
transaction will be tied to your original service area.  This is a bit of a 
pain, to say the least.

Similarly -- and this is the use case that got me thinking about all this 
-- peak.web right now jumps through a lot of hoops with its Context object 
to keep track of a whole bunch of theoretically-orthogonal but practically 
linked variables: the base URL, current user, current skin, and a bunch of
other stuff.  Not only does this induce a bunch of slow context lookups to 
find common values, it has no clean way of being extended by multiple 
components.  You can't just go and add a 'shopping_cart' request variable 
cleanly, for example.  Also, the way the peak.web InteractionPolicy works, 
it's hard to replace any of the things it configures at lower levels of 
your object hierarchy.  As a result, peak.web isn't really geared to host 
multiple applications in the same site very well.

What I realized at some point was that peak.web's Context variables really 
want to be like Lisp dynamic variables, that can be set in a function, and 
retrieved by any code called from that function.  Whenever you return from 
a function that sets such a variable, its old value would be 
restored.  Thus, you could for example track the current user by creating a 
web.user variable, something like:

     # in the web module; 'context' is the proposed dynamic-variable module
     import context
     user = context.Variable()

You could then in a publishing routine do something like:

     web.user.push(auth_svc.getUser(request_data))
     try:
         print "the current user is", web.user.get()
     finally:
         web.user.pop()

Or in Python 2.5:

     with web.user(auth_svc.getUser(request_data)):
         print "the current user is", web.user.get()

Looks simple enough, yes?  Under the hood, Variable objects would be 
thread-local.  And, to support peak.events and Twisted, there would be a 
way to take a snapshot of all the current context variables, and restore 
the snapshot later, so that e.g. events.Task objects could swap their 
current context in and out.
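
To make the push/pop/get idea concrete, here is a minimal sketch of what such a Variable might look like. It is an illustration only, not the prototype's actual code: it is written for current Python 3, and the 'default' argument, the '_stacks' helper and the per-thread dict of stacks are all assumptions made for the example.

```python
import threading
from contextlib import contextmanager

# Assumption: each thread keeps its own {Variable: [values...]} mapping.
_state = threading.local()

def _stacks():
    """Return the current thread's mapping, creating it on first use."""
    try:
        return _state.stacks
    except AttributeError:
        _state.stacks = {}
        return _state.stacks

class Variable:
    """Hypothetical dynamically scoped variable with push/pop/get."""

    def __init__(self, default=None):
        self.default = default

    def get(self):
        stack = _stacks().get(self)
        return stack[-1] if stack else self.default

    def push(self, value):
        _stacks().setdefault(self, []).append(value)

    def pop(self):
        return _stacks()[self].pop()

    @contextmanager
    def __call__(self, value):
        # Supports "with var(value):" by pairing push() with pop().
        self.push(value)
        try:
            yield value
        finally:
            self.pop()

user = Variable(default="anonymous")

def whoami():
    # Any code called inside the "with" block sees the pushed value.
    return user.get()

with user("alice"):
    inside = whoami()
    with user("bob"):
        nested = whoami()
    restored = whoami()
outside = whoami()
```

Here 'whoami' never receives the user as an argument; it simply sees whatever value the calling context pushed, and the old value comes back as each block exits.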

I've actually prototyped data structures that do all this, and they are 
amazingly efficient - much more so than multi-level parent component 
walking to find things.  So I got to thinking about other uses.

One thing that occurred to me fairly quickly is that a *lot* of the edge 
cases in the usefulness of PEAK's component model can be fixed by using 
dynamic variables in place of hierarchy-based lookups.  For example, the 
idea of the "current transaction" is really time-bound, not 
space-bound.  It really should be "the transaction I'm in right now", not 
"the transaction service for my service area".  The latter just happens to 
be a useful approximation to the former, but if you can actually time-bind 
instead of space-bind, then why bother with the approximation?

So I tried writing some code samples using the idea of making 
'storage.transaction' a dynamic variable, and I quickly found that 
'transaction.get().begin()' was a pain.  So, I came up with the idea of a 
'context.Service', which would just be a proxy that delegated all its 
attributes and methods to the get() of a specified Variable.  I haven't 
worked out the precise API for defining and initially configuring one, but 
it would allow you to do stuff like 'storage.transaction.begin()' and 
'storage.transaction.commit()', and automatically forward the method calls 
to the value of the hidden Variable, so that they go to the right object 
for the current context.
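
A rough sketch of such a proxy follows, assuming the simplest possible design: a Service that forwards attribute lookups to its Variable's current value. The precise API is not worked out, so everything here (the constructor signature, the toy single-threaded Variable, the Transaction class and its 'log' attribute) is hypothetical.

```python
class Variable:
    """Toy single-threaded stand-in for a dynamic variable (assumption)."""

    def __init__(self, default=None):
        self._stack = [default]

    def get(self):
        return self._stack[-1]

    def push(self, value):
        self._stack.append(value)

    def pop(self):
        return self._stack.pop()

class Service:
    """Hypothetical proxy: delegates attributes to the Variable's value."""

    def __init__(self, variable):
        self._variable = variable

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself, so
        # '_variable' resolves normally and everything else is forwarded.
        return getattr(self._variable.get(), name)

class Transaction:
    """Illustrative transaction that just records what was called."""

    def __init__(self, label):
        self.label = label
        self.log = []

    def begin(self):
        self.log.append("begin")

    def commit(self):
        self.log.append("commit")

primary = Transaction("primary")
_txn_var = Variable(default=primary)
transaction = Service(_txn_var)

transaction.begin()                 # goes to the primary transaction

notice = Transaction("error-notice")
_txn_var.push(notice)               # a second, independent transaction
try:
    transaction.begin()             # same spelling, different object
    transaction.commit()
finally:
    _txn_var.pop()

transaction.commit()                # back on the primary transaction
```

The calling code always spells it 'transaction.begin()'; which object receives the call depends only on what is currently pushed, which is the time-bound behavior described above.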

Now, here's something interesting.  If you look closely, you'll see that I 
just wiped out the need to Obtain(ITransaction).  In fact, I wiped out the 
need for ITransaction itself, except for documentation or adaptation.  I 
just say, "storage.transaction", and there it is.  How simple can you get?

Really, the only way to get simpler is with a global variable.  But global 
variables are subject to unrestricted manipulation, and they don't play 
well with threads or events.Task pseudothreads.  But all context.Variables
can be swapped out with just one call:

     old_vars = context.swap(new_vars)

This operation takes around 2 microseconds on my PC, so it doesn't slow 
down inter-Task switching.  And, each thread has its own current "state of 
all variables" mapping, so there's no inter-thread pollution.  Finally, the 
API is designed so that push() and pop() have to be paired, so manipulation 
is constrained to relatively-comprehensible hierarchies based on calling 
contexts.
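
One way the swap could work, sketched under the assumption that each thread holds a single reference to its "state of all variables" mapping, so that switching tasks only replaces that one reference. The names '_local' and '_current' and the dict-of-stacks layout are invented for the example and say nothing about how the prototype is built or how fast it is.

```python
import threading

_local = threading.local()   # one "current mapping" slot per thread

def _current():
    """Return this thread's current variable mapping."""
    try:
        return _local.vars
    except AttributeError:
        _local.vars = {}
        return _local.vars

def swap(new_vars):
    """Install new_vars as the current mapping; return the old one."""
    old_vars = _current()
    _local.vars = new_vars
    return old_vars

class Variable:
    def __init__(self, default=None):
        self.default = default

    def get(self):
        stack = _current().get(self)
        return stack[-1] if stack else self.default

    def push(self, value):
        _current().setdefault(self, []).append(value)

    def pop(self):
        return _current()[self].pop()

user = Variable(default="nobody")

# Each pseudothread (task) owns its own mapping.
task_a_vars, task_b_vars = {}, {}

main_vars = swap(task_a_vars)    # switch into task A
user.push("alice")
seen_in_a = user.get()

swap(task_b_vars)                # switch to task B: A's values vanish
seen_in_b = user.get()
user.push("bob")

swap(task_a_vars)                # back to task A: its values reappear
seen_back_in_a = user.get()
user.pop()

swap(main_vars)                  # restore the original context
seen_in_main = user.get()
```

Because a task's pushes land in the mapping that task owns, nothing one task does is visible to another, and a scheduler only has to call swap() on the way in and on the way out.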

All this sounds really great in theory.  But how often would you need to 
change context variables?  If it's done a lot, it would probably be a pain 
to push and pop all those variables.  So, I decided to grep the PEAK source 
for uses of 'offerAs', because 'offerAs' is a way of saying, "I'd like to 
set a new value for this configuration key in the context beneath me."

What I discovered shocked me.  Of the 38 uses of offerAs in the PEAK source 
tree, *18* were in tests, mostly to override a default service with a mock 
one for the test.  What's more, virtually all of the remaining uses were 
ones that would be unequivocally easier to use, understand, and extend if 
they used dynamic variables instead.  Some were transaction-related, but 
there was another edge case I hadn't thought of before the grep: commands.

You see, command objects use component context to get their stdin, stdout, 
argv, environ, etc.  As soon as I saw that in the grep, I recalled all of
the pain I went through getting that part of the running.commands framework 
to work correctly, and instantly saw that it would have been a piece of 
cake with dynamic variables, because they really "want" to be 
calling-context based.  It's only because the "real" (Python-supplied) 
versions of those variables aren't dynamic (thread/task-local) that any of 
the fancy footwork was needed in the first place!
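
As a point of comparison, Python 3.7 and later ship a standard 'contextvars' module with similar calling-context semantics. The sketch below uses it (rather than the proposed 'context' module, which it is not) to show a command whose stdout is context-based. The names 'stdout_var', 'redirected', 'hello_command' and 'run_captured' are made up for the example.

```python
import contextvars
import io
import sys
from contextlib import contextmanager

# Hypothetical stand-in for a command's "stdout" dynamic variable.
stdout_var = contextvars.ContextVar("stdout", default=sys.stdout)

@contextmanager
def redirected(var, value):
    """Set var for the duration of the block, then restore it."""
    token = var.set(value)
    try:
        yield value
    finally:
        var.reset(token)

def hello_command():
    # The command asks the calling context for its stdout instead of
    # looking it up through a component hierarchy.
    stdout_var.get().write("hello from the command\n")

def run_captured(command):
    """Run a command with its output captured in a string."""
    buf = io.StringIO()
    with redirected(stdout_var, buf):
        command()
    return buf.getvalue()

captured = run_captured(hello_command)
```

The command itself needs no configuration; whoever calls it decides where its output goes, and the previous stream is back in place once the call returns.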

Another new use case was the current IEventLoop or IMainLoop a task is 
running under.  There are aspects of that one that have bothered me for 
years!  And yet, the dynamic variable version of those is so simple and 
elegant and clean it makes me amazed I never thought of it before.  My 
prototype implementation doesn't use any Python features that weren't 
around in 2.2, so in principle I could have done this years ago.  (Granted, 
the 'with' statement coming in Python 2.5 provided some inspiration for the 
idea, and it will really be the easiest way to use dynamic variables, all 
things considered.)

So, it's gotten me quite curious, because apart from testing, *every* 
nontrivial use of offerAs in the PEAK core was a part of some sort of cruft 
that would've been incredibly better with dynamic variables.

So what uses do *you* have for offerAs?  Would they be clearer or less 
clear by using dynamic variables?  I'm intrigued by the possibility of 
simplifying the PEAK core quite a bit with this notion.  Most 
binding.Obtain() bindings would be replaced by simple direct reference to 
services, although component-local Obtains would still be useful.  All the
crufty bits of PEAK I found would become non-crufty.  All of the colorful 
configmaps and eigenvalues and all that would mostly melt away, with the 
.ini files just being used to set dynamic variables instead.

There are a lot of things that would have to be worked out to do that, of 
course.  Especially since the API, although improved, would likely be quite 
different.  I haven't even worked out the destination, let alone how the 
existing codebase would transition.  And the "when" of all this is quite 
questionable.  I have no idea when I'd even really start.

But in the meantime, I am quite curious.  If you have any uses for offerAs, 
please post about them.  I'm interested in whether there are any uses that 
really are better than using a dynamic variable for the same thing.