[PEAK] Command-line option parsing
Phillip J. Eby
pje at telecommunity.com
Fri Nov 12 19:42:05 EST 2004
The peak.running.commands framework currently doesn't handle command-line
options, which rather limits its utility for creating command-line
tools. For example, it would be nice if 'peak serve' let you specify
the port or host as a command-line argument.
While Python 2.3 includes a rather nice 'optparse' module for argument
processing, it (understandably) lacks several important features for
integration with PEAK. For example, it's not designed to allow tying
options to attribute descriptors.
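
For comparison, here is roughly what the plain 'optparse' route looks like for something like 'peak serve'. This is only a sketch: the '--port' and '--host' option names and their defaults are made up for illustration, not anything 'peak serve' defines. Note that the parsed values land on a separate values object rather than on the command's own attributes, which is the integration gap described above.

```python
from optparse import OptionParser

def parse_serve_args(argv):
    """Parse hypothetical 'serve' options; returns (options, leftover_args)."""
    parser = OptionParser(usage="%prog [options]")
    # Option names and defaults here are illustrative assumptions.
    parser.add_option("-p", "--port", type="int", default=8000,
                      help="port to listen on")
    parser.add_option("--host", default="localhost",
                      help="host name or address to bind to")
    return parser.parse_args(argv)

# The results live on 'options', not on any command object.
options, args = parse_serve_args(["--port", "8080", "extra"])
```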
So, what would a PEAK command-line option framework look like? Well, here
are some of the requirements I have:
* Inherit existing options from base class
* Ability to *not* inherit (i.e. delete) a specific option, or to avoid
  inheriting any options from a superclass
* Generate help message automatically
* In response to a specific option:
    - Set a value (or append to a list of values) with optional conversion
      from a string (and wrap any errors in an InvocationError)
    - Set a boolean true or false value
    - Increment or decrement a value
    - Perform an arbitrary action, with access to (and ability to modify)
      the remaining arguments
    - Raise InvocationError when a non-repeatable option (e.g. a "set"
      option) appears more than once
* Map non-option arguments to attributes or methods, both positionally and
  "*args" style, with optional type conversion from string. That is, you
  should be able to declare attributes or methods that receive either the
  Nth argument, or that receive arguments from the Nth argument on.
* Raise InvocationError for missing "required options" or required arguments
* Compact notation allowing easy association of an option with attributes
  or methods
So, with these requirements in mind, let's try some syntax
possibilities. I'm assuming this will live in 'peak.running.options', and
be used with an 'options' prefix. I'm also assuming that we will add a
metadata option to all binding attributes, and that it's one place where
option metadata can be defined. We can also use decorators to define
methods/actions.
So, we'll probably have some things like:
* Metadata for options:
    - options.Set(*option_names, help/value/type+metavar, repeatable=False)
    - options.Add(*option_names, help/value/type+metavar, repeatable=True)
    - options.Append(*option_names, help/value/type+metavar, repeatable=True)
* Metadata for non-option arguments:
    - options.Argument(argnum, metavar)
    - options.After(argnum, metavar)  (arguments following argument 'argnum')
* Other attribute metadata:
    - options.Required(message)  (indicates that if none of the
      options/arguments associated with this attribute was set, an error is
      raised at parse completion, with the supplied message)
* Function Decorators:
    - [options.argument_handler(help/type)]
      def _handler(self, argument, remaining_args):
    - [options.option_handler(*option_names, help/type+metavar,
          repeatable=False)]
      def _handler(self, parser, optname, optval, remaining_args):
* Class Advisors:
    - options.reject_inheritance(*option_names)  (reject named inherited
      options/required attrs, or reject all inherited options/required attrs
      if no names specified)
Items above that say 'help/value/type+metavar' would take keyword arguments
to describe the help string, or to determine the value or type being
set. Either a value or a type *must* be set. If it's a value, the option
is just a flag, but if it's a type, the option will accept an option
argument, and the type will be called to convert the argument string to the
value to be set, added, appended, or whatever. The handler decorators
allow an optional 'type' keyword as well, to allow conversion of arguments
or options from a string to something else. The 'type' invocation should
be wrapped in a handler that traps ValueError for conversion to an
InvocationError.
The 'metavar' keyword, if specified, indicates the placeholder name to be
used in "usage" output, for either option arguments or positional
arguments. The 'repeatable' keyword argument on various items indicates
that the option can occur multiple times.
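
To make the value-versus-type rule and the error wrapping concrete, here is a minimal standalone sketch. Nothing in it is an existing PEAK API: the 'Option' base class, its 'convert()' and 'apply()' methods, and the 'process()' helper are all hypothetical names, and 'InvocationError' is a local stand-in. It also assumes that exactly one of 'value' or 'type' is given, which is one reading of the rule above.

```python
class InvocationError(Exception):
    """Local stand-in for the framework's invocation error."""

_NOT_GIVEN = object()

class Option:
    """Hypothetical common base for the Set/Add/Append metadata."""
    repeatable = True

    def __init__(self, *option_names, value=_NOT_GIVEN, type=None,
                 help=None, metavar=None, repeatable=None):
        # Assumption: exactly one of 'value' (flag) or 'type' (takes an
        # option argument) must be supplied.
        if (value is _NOT_GIVEN) == (type is None):
            raise TypeError("specify exactly one of 'value' or 'type'")
        self.option_names = option_names
        self.value, self.type = value, type
        self.help, self.metavar = help, metavar
        if repeatable is not None:
            self.repeatable = repeatable

    def convert(self, optname, optval=None):
        """Return the value this option contributes."""
        if self.type is None:
            return self.value            # plain flag: fixed value
        try:
            return self.type(optval)     # convert the option argument
        except ValueError as exc:
            raise InvocationError("%s: %s" % (optname, exc))

class Set(Option):
    repeatable = False
    def apply(self, current, new):
        return new

class Add(Option):
    def apply(self, current, new):
        return current + new

class Append(Option):
    def apply(self, current, new):
        return current + [new]

def process(seen, initial):
    """Fold (option, optname, optval) triples into one attribute value."""
    value, used = initial, set()
    for option, optname, optval in seen:
        if not option.repeatable and option in used:
            raise InvocationError("%s may only be given once" % optname)
        used.add(option)
        value = option.apply(value, option.convert(optname, optval))
    return value

debug_foo = Add('--debug-foo', value=1)
debug_bar = Add('--debug-bar', value=2)
flags = process([(debug_foo, '--debug-foo', None),
                 (debug_bar, '--debug-bar', None)], 0)
```

With this shape, Set, Add and Append differ only in how they combine the old and new values, and in their default for 'repeatable'.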
Here's some example code showing some of these features in hypothetical use:
# ========
from peak.api import binding    # 'binding' is used below
from peak.running import commands, options

class SomeCommand(commands.AbstractCommand):

    first_arg = binding.Require(
        "First positional argument", [options.Argument(1)]
    )

    second_arg = binding.Require(
        "Second positional argument", [options.Argument(2)]
    )

    extra_args = binding.Make(list, [options.After(2)])

    verbose = binding.Attribute(
        [options.Set('-v','--verbose', value=True, help="Be talkative"),
         options.Set('-q','--quiet', value=False, help="Shhh!!")],
        defaultValue = False
    )

    debugFlags = binding.Attribute(
        [options.Add('--debug-foo', value=1, help="Enable foo debugging"),
         options.Add('--debug-bar', value=2, help="Enable bar debugging")],
        defaultValue = 0
    )

    configSources = binding.Make(list, [
        options.Append('-c', '--config', type=str,
                       help="Configuration source")
    ])

    [options.option_handler('--log-url', help="Logfile URL")]
    def _setLogFile(self, parser, optname, optval, remaining_args):
        try:
            self.logfile = self.lookupComponent(optval)
        except something...
            raise commands.InvocationError("blah")
# ==========
So, what are the open design issues here? Well, there is some potential
for conflicting/inconsistent metadata. For example, what happens if you
specify both some positional 'Argument()' attributes and an argument
handler? More than one attribute for the same 'Argument()'? How does
'After()' relate? What if you use an option name more than once?
I think we can safely ignore option name conflicts between a class and its
superclass, or more precisely, we can simply resolve them in favor of the
subclass. Conflicts *within* a class, however, should be considered
programmer error.
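
That resolution rule is easy to sketch. The following assumes, purely for illustration, that each class lists its own options in a hypothetical '_own_options' attribute of (attribute_name, option_names) pairs; the real thing would presumably get this from the binding metadata instead.

```python
def collect_options(cls):
    """Return {option_name: (defining_class, attr_name)} for 'cls'.

    Inherited names are overridden by subclasses; a name used twice
    within one class is treated as a programmer error.
    """
    registry = {}
    # Walk base classes first so that subclasses win on conflicts.
    for klass in reversed(cls.__mro__):
        seen = {}
        for attr_name, option_names in klass.__dict__.get('_own_options', ()):
            for name in option_names:
                if name in seen:
                    raise TypeError(
                        "%s: option %r used for both %r and %r"
                        % (klass.__name__, name, seen[name], attr_name))
                seen[name] = attr_name
        registry.update((name, (klass, attr)) for name, attr in seen.items())
    return registry

class Base:
    _own_options = [('verbose', ('-v', '--verbose'))]

class Sub(Base):
    _own_options = [('version', ('-v',))]    # takes over '-v' from Base

registry = collect_options(Sub)
```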
Hm. Maybe there's another way to handle arguments. Suppose it looked like
this:
    [options.argument_handler()]
    def _handle_args(self, foo, bar, *etc):
        # ...
The idea here is that non-option arguments are simply passed to the handler
positionally. If you have a '*' argument, you get any remaining arguments
supplied therein. If you don't, you accept only a limited number of
arguments. If you use function defaults, those positional arguments are
optional, and fall back to the declared defaults. Finally, the argument
names themselves can be used (altered as needed) to produce an automatic
usage message. For example, the above might be rendered as:
    usage: progname [options] foo bar etc...
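
As a rough illustration of how that usage line could be derived from the handler's signature, here is a sketch using today's 'inspect.signature'. The 'usage_from_handler' name is hypothetical, and rendering defaulted arguments in square brackets is an assumption of the sketch rather than a decided convention.

```python
import inspect

def usage_from_handler(handler, progname="progname"):
    """Build a usage line from an (unbound) handler's positional parameters."""
    parts = ["usage:", progname, "[options]"]
    params = list(inspect.signature(handler).parameters.values())
    for param in params[1:]:                      # skip 'self'
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            parts.append(param.name + "...")      # the '*args' slot
        elif param.kind is inspect.Parameter.POSITIONAL_OR_KEYWORD:
            if param.default is inspect.Parameter.empty:
                parts.append(param.name)          # required argument
            else:
                parts.append("[%s]" % param.name) # optional argument
    return " ".join(parts)

def _handle_args(self, foo, bar, *etc):
    pass

usage = usage_from_handler(_handle_args)
```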
Anyway, if we take this approach, we can get rid of the Argument/After
metadata, and thus reduce this to a question of whether or not a given
class has an argument handler, and consider multiple handlers per class to
be an error. I don't see a need to have a way to reject inheriting an
argument handler, because you just define a new one. If your command
doesn't take arguments, then:
    [options.argument_handler()]
    def _handle_args(self):
        # ...
is sufficient to cause an invocation error if any arguments are supplied.
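
A sketch of how the argument count check might work, again with hypothetical names ('call_argument_handler', and a local 'InvocationError' stand-in). It binds the arguments against the signature first, so that a TypeError raised inside the handler body isn't mistaken for a bad command line.

```python
import inspect

class InvocationError(Exception):
    """Local stand-in for the framework's invocation error."""

def call_argument_handler(handler, args):
    """Call handler(*args), reporting an arity mismatch as InvocationError."""
    try:
        # Check the argument count without running the handler.
        inspect.signature(handler).bind(*args)
    except TypeError as exc:
        raise InvocationError("bad arguments: %s" % exc)
    return handler(*args)

class NoArgsCommand:
    def _handle_args(self):
        return "ran"

cmd = NoArgsCommand()
result = call_argument_handler(cmd._handle_args, [])
# call_argument_handler(cmd._handle_args, ["oops"]) raises InvocationError
```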
Hm. Should the argument handler be invoked at argument parse time, or at
command run time? What is the difference? Actually, when does parsing
happen, period?
It seems to me that option parsing should take place relatively
early. This is because some PEAK command interpreters actually want to
replace themselves with another object, when they are being used as an
argument to some higher-level interpreter. For example, when running 'peak
CGI WSGI import:foo.bar', the 'WSGI' interpreter wants to substitute a
wrapped version of the 'import:foo.bar' object for itself, so that the
'CGI' command sees the wrapper when it goes to do "CGI things" to that object.
Currently, commands.AbstractInterpreter contains an ugly kludge to actually
attempt to parse arguments at __init__ time, in order to replace the
interpreter with the target object. This is kind of sick, to say the
least, and has led to quirky bugs in the past, not to mention various
kludges in the commands framework like the 'NoSuchSubcommand' crap.
I think, however, that the best thing to do here is to fix the kludginess,
such that commands that act on a subcommand always first ask the subcommand
if it wants to replace itself with a target object. Interpreters would
pass this request along to their subcommands, too, so that the parent
command always receives the "innermost" replacement possible. Hm. This
could probably be part of the 'getSubcommand()' method, actually, so that
all commands would inherit it. And, if I made AbstractInterpreter support
"replacing itself", then all existing command objects would still work, and
I could see about phasing out the quirky bits.
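
Here is a toy model of that "replaceable subcommand" idea. Only 'getSubcommand()' is a name from the discussion above; 'getReplacement()', 'Wrapper', and the class names are hypothetical, and the real interpreter would of course look its subcommand up from the command line rather than take it as a constructor argument.

```python
class Command:
    def getReplacement(self):
        # Ordinary commands stand for themselves.
        return self

class Wrapper:
    """Stands in for the wrapped version of a target object."""
    def __init__(self, target):
        self.target = target

class Interpreter(Command):
    def __init__(self, subcommand):
        self.subcommand = subcommand

    def getSubcommand(self):
        # Always ask the subcommand whether it wants to be replaced.
        return self.subcommand.getReplacement()

    def getReplacement(self):
        # Pass the request along, so the parent sees the innermost target.
        return self.getSubcommand()

class WrappingInterpreter(Interpreter):
    """Like the 'WSGI' case: substitutes a wrapper around its target."""
    def getReplacement(self):
        return Wrapper(self.getSubcommand())

target = Command()
outer = Interpreter(WrappingInterpreter(target))
sub = outer.getSubcommand()    # a Wrapper around 'target'
```

The point of the sketch is that nothing has to happen at __init__ time: the parent asks when it needs the subcommand, and each layer either answers for itself or delegates inward.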
Hm. One thing that would be really interesting would be if we could make
it so that '_run()' is normally the argument-processing method. Maybe we
can arrange things such that, if you haven't defined any option metadata,
your '_run()' method gets called. There might be some backward
compatibility issues, however, with command classes that inherit from
non-abstract PEAK commands (i.e., other than AbstractCommand and
AbstractInterpreter), and override '_run()' now. For example, many people
subclass commands.EventDriven, and if we added a '--stop-after' option,
and they overrode '_run()', there would be a conflict.
So, I think maybe we'll need another method name, like 'go', e.g.:
    [options.argument_handler()]
    def go(self, source, dest):
        # ...
And then the '_run()' in AbstractCommand can just run the option parser,
triggering the argument handler once everything's
parsed. AbstractInterpreter's 'go' might then look like:
    [options.argument_handler()]
    def go(self, cmd, *args):
        self.subCmdArgs = args
        return self.interpret(cmd).run()
I don't much care for 'go()' as a method name. Maybe 'cmd()'? Anybody got
any suggestions?
Anyway, looks like we've got the basic structure figured out, such that
existing commands should still work the way they do today, but there are
still some open design items:
* partial option name matches -- should we support this?
* what about tab-completion of command line options? (e.g. via
  'optcomplete', see http://furius.ca/optcomplete/ for info)
* precise definition of how usage messages are generated, including the
  ability for a command to participate in generating the help message (e.g.
  commands.BootStrap wants to list available subcommands)
* how to override default usage messages/help strings, etc. (e.g., can
  you override *just* an option or argument's help message from a base
  class, or do you have to just redefine the whole option?)
* should we allow "interspersed arguments", and if so, how do we control
  that on a per-class basis?
* should we even have the 'argument_handler()' decorator, or should we
  just declare type information for the arguments, and just let '_run()'
  call 'go(self,*args)', checking for the argument count mismatch if any?
* do we want to allow "option groups", a la 'optparse'?
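
For reference when weighing a few of these items, this is how stock 'optparse' already behaves with respect to partial matches, interspersed arguments, and option groups. The option names are just the ones from the earlier example, used for illustration.

```python
from optparse import OptionParser, OptionGroup

parser = OptionParser(prog="progname")
parser.add_option("--verbose", action="store_true", default=False)

# Option groups only affect how the help output is organized.
group = OptionGroup(parser, "Debugging options")
group.add_option("--debug-foo", action="store_true", default=False)
parser.add_option_group(group)

# Unambiguous prefixes of long options are accepted by default.
opts, args = parser.parse_args(["--verb", "source", "dest"])

# By default, options may follow positional arguments...
opts2, args2 = parser.parse_args(["source", "--debug-foo", "dest"])

# ...unless interspersed arguments are disabled, in which case parsing
# stops at the first positional argument.
parser.disable_interspersed_args()
opts3, args3 = parser.parse_args(["source", "--debug-foo", "dest"])
```

Note that 'disable_interspersed_args()' is a per-parser switch, so per-class control would come down to how each class's parser gets built.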
I'll have to tackle these questions in a follow-up post at a later time,
along with implementation issues such as whether we should use 'optparse'
to implement the underlying parse mechanism. There are advantages and
disadvantages that need to be weighed.
In the meantime, these prerequisite tasks can be started:
* Create a basic metadata registration facility for peak.binding, such that
  metadata given to bindings is invoked at class creation time, and told
  what class+attribute it's being applied to
* Allow 'binding.Attribute' to have a 'defaultValue' specified, so we have
  a simple way to define bindings with a default value, besides Make/Obtain.
* Fix the interpretation kludge in the commands framework, by defining a
  "replaceable subcommand" facility (and investigate how the IExecutable
  stuff might be cleaned up, too.)
* Investigate whether 'optparse' has sufficient hooks to allow it to drive
  the option parsing we want.
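
As a starting point for that last investigation: 'optparse' callback options do give the callback access to the parser and to 'parser.rargs', the list of arguments not yet consumed. The sketch below adapts a handler with the proposed (parser, optname, optval, remaining_args) signature to that hook. The 'make_callback' adapter and the 'Command' class are hypothetical, and mapping ValueError to 'OptionValueError' is just one possible way to report conversion errors.

```python
from optparse import OptionParser, OptionValueError

class Command:
    logfile = None
    remaining = None

    def _setLogFile(self, parser, optname, optval, remaining_args):
        self.logfile = optval
        self.remaining = list(remaining_args)   # snapshot for illustration

def make_callback(handler):
    """Adapt a proposed-style option handler to optparse's callback hook."""
    def callback(option, opt_str, value, parser):
        try:
            # parser.rargs is mutable, so a handler could also consume
            # or rewrite the remaining arguments.
            handler(parser, opt_str, value, parser.rargs)
        except ValueError as exc:
            raise OptionValueError("%s: %s" % (opt_str, exc))
    return callback

cmd = Command()
parser = OptionParser()
parser.add_option("--log-url", action="callback", type="string",
                  callback=make_callback(cmd._setLogFile),
                  help="Logfile URL")
opts, args = parser.parse_args(["--log-url", "some-url", "rest"])
```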
Finally, I'll recap the currently-anticipated, moderately well-defined APIs
that would be needed:
- options.Set(*option_names, help/value/type+metavar, repeatable=False)
- options.Add(*option_names, help/value/type+metavar, repeatable=True)
- options.Append(*option_names, help/value/type+metavar, repeatable=True)
- options.Required(message)
- [options.option_handler(*option_names, help/type+metavar,
      repeatable=False)]
  def _handler(self, parser, optname, optval, remaining_args):
- options.reject_inheritance(*option_names)
- maybe an 'options.Group(description, sortPosn=None)', that can then be
  used as a 'group' kwarg for other option definitions.
Hm. That doesn't look too bad, especially since
Set/Add/Append/option_handler are mostly trivial variants of each other,
while Required and reject_inheritance just set some simple
metadata. Really, most of the complexity will be buried in the parsing,
and in the assembling of a parser from the metadata.