[PEAK] Untangling the running.IExecutable mess
Phillip J. Eby
pje at telecommunity.com
Tue Nov 16 15:03:50 EST 2004
After considerable study, I think I've found the broken bits of the current
command-line app framework.
First, running.ICmdLineAppFactory shouldn't subclass
binding.IComponentFactory, despite the fact that it is a compatible
extension of the interface. The problem is that any callable object can be
an ICmdLineAppFactory, but that doesn't make it a component factory!
Second, there needs to be a way for interpreters to handle errors better in
their constructors, in connection with their parent command. Specifically,
right now interpreters try to immediately parse their arguments and return
a target object in place of themselves. However, if there's an error, they
simply return themselves, hoping they'll be re-run, and they can output the
real error. This doesn't really work. They should instead immediately
output the command's help message, and raise an error.
The tricky part is handling that error. What should the error be? It
seems to me that it should be a SystemExit, since this will cause a nice
clean exit, whenver you attempt to use a broken command. The downside may
be that SystemExit gets trapped somewhere, and the output may be written
somewhere strange, but I don't know if there are really any alternatives,
there.
Finally, ICmdLineAppFactory should be redefined such that it is not
guaranteed to return an ICmdLineApp; and its users should be required to
adapt to ICmdLineApp if that is what they want. In essence, an
ICmdLineAppFactory is characterized more by the arguments it may receive,
than the precise nature of what it creates.
I am also sorely tempted to remove the adaptation from arbitrary callables
to ICmdLineAppFactory. It's only used to enable functions to be run as
applications, and could be replaced with e.g. a 'peak call'
interpreter. The only real downside to this is that the nice IntroToPeak
"hello world" function would no longer work.
On the other hand, we could narrow that behavior to only work with
functions. That would eliminate the current "silent failure" mode, where
specifying a class that has no __init__-time behavior, produces no
result. This would still support backward-compatibility with typical
'main()' functions. I think I'll take this approach unless somebody is
using the current behavior that supports classes, class methods, callable
instances, and instance methods as well.
There is actually another issue, wherein a Bootstrap interpreter always
expects an ICmdLineAppFactory, meaning that Bootstrap should not be used
when such an item is not expected. However, there is no standard for
initializing other kinds of objects as the target of a command/import
operation. For example, if the current CGIInterpreter command finds its
target is an IComponentFactory, it will invoke it to create a new instance.
It seems silly, though, to create a series of alternative interfaces like
ICGIFactory, ICmdLineAppFactory, and so forth, just to say, "hey, if you
find certain kinds of objects, give them an opportunity to instantiate
in-context."
So, what are some alternatives?
* We could create a new kind of 'cmdFactory()' protocol, akin to
'sequenceOf()', taking the protocol we expect the instantiated object to
provide. Then, each context like Bootstrap, CGIInterpreter, and so on,
would adapt to e.g. 'cmdFactory(ICmdLineApp)',
'cmdFactory(IRerunnableCGI)', etc.
* We could move the creation to the import URL. E.g., we could allow '()'
at the end of an import URL, meaning, "call and create an instance", so
that 'import:foo.Bar()' returns a new instance of the 'foo.Bar' class.
* We could have a reference type like 'ref:instance at import:foo.Bar'. This
would have the additional advantage of being applicable to any URL type,
not just imports, and implementation would consist only of adding one
class, and an entry to peak.ini.
* We could allow reference URLs to omit the '@url' part, so that
'ref:foo.Bar' is a valid URL. (Currently, you have to at least have
'ref:foo.Bar at x'). Using this approach, any to-be-instantiated class must
implement naming.IObjectFactory or naming.IState. But, you can configure
alternative implementations for a given dotted name, by defining properties
in 'peak.naming.factories'. Also, these "bare reference" URLs would be
shorter even than the 'import:foo.Bar()' syntax (compare against 'ref:foo.Bar'.
Of these, I think I lean towards the 'import:foo.Bar()' solution, as easy
to explain. On the other hand, it may have to be quoted in some shells.
There may be a better solution, though. Currently, the lookup process for
most commands already calls through the naming system. The naming system,
by default, will check whether the object being returned is an IState or
IObjectFactory, and if so, invoke them with the naming context being
used. So, there's already a perfectly good way to ensure that you get an
instance of something: just make sure that the thing you're referencing is
an IState or IObjectFactory.
The downside? Well, now you have no way to reference the object, instead
of a new instance of it. But so far, I've never had a use case for that in
PEAK. Sure, there are places where we use import strings (or
'importString()') to access a factory of some kind, but not 'import:'
URLs. As further proof, consider the fact that storage.ManagedConnection
subclasses all implement IObjectFactory, so you can't reference one of
those classes with any kind of URL, and not end up with an instance instead
of the class. The fact that this is not a problem in practice suggests
that this is a good idea.
But, I just tested making all Component classes do this, and it produces a
whopping 115 errors in the test suite. The "flaw in the ointment" is that
.ini files also use the object restoration protocols
(IState/IObjectFactory), so making every class one means that they try to
instantiate in that context.
I could fix *that*, by making component classes also be ISmartProperty
instances that return themselves. However, that just seems too convoluted
and kludgy.
I think that this is basically an area where, in the face of ambiguity, one
should refuse to guess. Therefore, whatever we do should be explicit, and
we should get rid of all the "implicit" command restoration techniques.
The downside to this approach in the command framework is that it has been
leading towards really elaborate commands like 'FastCGI fd.socket:stdin
WSGI import:foo.bar'. This is explicit, but very verbose. Also, the
CGI/WSGI runners *really* want to instantiate a target component
class. Even though 'peak.web' is moving towards sitemap references as the
way to run an application, the 'trivial_web' and 'trivial_cgi' example
applications, along with the DDT web runner, currently all need to
instantiate from a class.
Sigh. I think I'm going to go with my gut here, and implement the
'import:foo.Bar()' facility. Here's my reasoning:
* No use case has yet arisen for declaring instantiation, for anything
other than an import.
* Although it may cause shell quoting problems, this is a minor problem,
easily worked around, compared to some of the other issues I've run into here.
* Explicit is way better than implicit.
So, the plan here is to:
1. Add the '()' syntax to import: URLs.
2. Change CGIInterpreter and its descendants (e.g. 'peak launch') not to do
implicit instantiation
3. Fix all the current CGIInterpreter-run commands to use the '()' syntax
so they work again.
4. Drop support for treating arbitrary callables as ICmdLineAppFactory,
restricting this behavior to function objects only.
Any objections?
(In the meantime, I'm going to go ahead and check in changes for the items
mentioned in the first four paragraphs of this message.)
More information about the PEAK
mailing list