[TransWarp] Configuration System Overview
Phillip J. Eby
pje at telecommunity.com
Thu Jul 4 13:16:23 EDT 2002
The purpose of the PEAK configuration system is to allow components to
expose hook points for binding to configuration sources from outside
themselves. PEAK's binding machinery already provides an
acquisition-by-name mechanism for such binding, but this is primarily
intended for relatively-local sharing of components that are configured in
source code.
The peak.running.config package, possibly in connection with functions or
classes made available via the binding package, is intended to allow hook
points to be configured from outside of program source code, through
external configuration files, OS environment variables, or other sources
such as LDAP directories, databases, CORBA services, etc.
A key issue in using these external resources for configuration is that
the resources available for doing configuration vary significantly between
platforms and operating environments. Further, the security issues
involved in using these resources can vary between platforms as well, and
the tools available for working with the configuration resources can be
equally varied.
Therefore, PEAK should not impose its own special set of configuration
mechanisms upon the application developer or administrator. Instead, it
should be possible for the application developer to specify configuration
policies that will be used. Ideally, these policies (which represent *how*
configuration data is obtained) will be specifiable in a very brief form,
suitable for adding to the beginning of a startup script, or to be
incorporated in a module that will be imported for all programs requiring
that configuration policy. Administrators should then have maximum
flexibility in configuring an application using the tools they have available.
Apart from a few useful "standard" configuration sources, the configuration
package will not focus so much on getting configuration data, as on
establishing an architecture within which configuration data can be
funnelled from configuration providers to configuration consumers.
Configuration Policy
--------------------
Configuration policy (or "meta configuration", as it's called in Zope 3)
consists of providing information about where and how to retrieve other
configuration information. In the case of PEAK, the configuration policies
will determine how the default configuration stack will be arranged: that
is, which sources will provide configuration information, and what
precedence order they will have relative to one another.
The basic API will be something like 'config.setup(namespace,
**keywordArgs)'. The namespace represents a prefix that will be added to
keywordArgs before they are saved in the root configuration space. So
you'd do something like:
    from peak.api import *
    import os

    config.setup('peak.config.policy',
        config_file_name = 'app.cfg',
        config_file_home = os.environ.get('HOME', os.getcwd())
    )
This would set the configuration properties
'peak.config.policy.config_file_name' and
'peak.config.policy.config_file_home'. Which brings us to the next point...
Configuration Namespaces
------------------------
Configuration variables must be named in a way that allows multiple
configuration namespaces to co-exist without interference. A logging
package and a python import utility may both want a 'path' configuration
variable that means something completely different. This means that any
usage of a configuration variable must be using a qualified name of some
sort. We will follow the Java (and, I believe, X Window System) approach of using
'.'-qualified names, since this also allows for resonance with Python
package names, where relevant, and is easily distinguished from component
path lookups (which use '/'-separated names).
Most configuration sources, however, will probably not have a notion of
namespaces, and many will not support '.'-qualified names at all. For
example, POSIX/Win32 environment variables have poor or non-existent
ability to handle '.' in a variable name. Qualified names are also tedious
to deal with in Python syntax, since keyword argument names can't contain
'.', and it would be tedious to type out the qualifications even if they
could be used. So the config system will need some utility methods to deal
with namespaces, similar to the previous example where a qualifier string
was used in conjunction with keyword arguments.
Configuration source objects will need namespace mapping capabilities, to
express that, say, an environment variable should be used to set certain
properties, or that a property should be looked up in a series of
environment variables until it is found. (In other words, namespace
mapping is potentially many-to-many.)
Such mappings are themselves tedious to produce and maintain, and so should
also be reusable, for example in a package. They should also not be part
of code, unless it is to provide defaults or set core policy. Namespace
mappings instead should themselves be defined via configuration namespaces.
Yes, my head is starting to spin, too. In practice, you won't need all
this flexibility - at least, not very often. Environment variables are
kind of a worst-case scenario: we need to support them, but they're a huge
shared namespace filled with all sorts of things, from the wonderful, to
the useless, to the downright dangerous. You don't want a broad mapping of
any configuration namespace, no matter how qualified, directly onto
environment variables, for security reasons if nothing else.
For dealing with environment variables, it may be best to simply not
support them directly at all! Instead, as in the example above,
configuration policy code could be used to explicitly read environment
variables and place them in the configuration stack. Or, perhaps an
"EnvironmentVars" configuration class could be used, as follows:
    ev = config.EnvironmentVars(
        PYTHON_PATH = [
            'something.that.wants.python.path',
            'something.else'
        ],
        HOME = ['where.config.files.go']
    )
The resulting object could be placed in the configuration stack, and any
attempt to access a property named in the string lists would be satisfied
from the corresponding environment variable. I think this is about as far as we
can/should go with environment support.
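One way such a class might behave is sketched below. Everything here is hypothetical: the class body, the injectable `environ` argument (added so the example does not depend on the real environment), and the choice to raise KeyError for anything not explicitly mapped.

```python
import os

class EnvironmentVars:
    """Hypothetical sketch: expose only chosen environment variables."""

    def __init__(self, environ=None, **mappings):
        # 'environ' is injectable for testing; it defaults to os.environ.
        self._environ = os.environ if environ is None else environ
        # Invert {ENV_VAR: [property names]} into {property name: ENV_VAR}.
        self._by_property = {}
        for var_name, property_names in mappings.items():
            for prop in property_names:
                self._by_property[prop] = var_name

    def __contains__(self, prop):
        var_name = self._by_property.get(prop)
        return var_name is not None and var_name in self._environ

    def __getitem__(self, prop):
        # Unmapped properties, and mapped ones whose variable is unset,
        # both raise KeyError, so unlisted variables stay invisible.
        return self._environ[self._by_property[prop]]

ev = EnvironmentVars(
    environ={'HOME': '/home/alice', 'SECRET': 'x'},
    HOME=['where.config.files.go'],
)
print(ev['where.config.files.go'])
```

Because the mapping is an explicit whitelist, a variable such as SECRET above can never leak into the configuration namespace by accident.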
The other kind of mapping (search in multiple places), might be spelled
like this:
    newConfig = config.Remapper(searchConfig, 'some.prefix',
        prop1 = ['place.1', 'place.2', ...],
        prop2 = ['place.2', 'place.1', ...]
    )
Items like this would be first-class configuration objects, capable of
being used to compose a configuration stack. Looking for
newConfig["some.prefix.prop1"] would cause the remapper to look for
searchConfig["place.1"], searchConfig["place.2"], and so on.
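The lookup order described here can be sketched as follows. This is only an illustration under stated assumptions: the `Remapper` body is invented, and a plain dictionary stands in for `searchConfig`.

```python
class Remapper:
    """Hypothetical sketch: look a property up under alternative names."""

    def __init__(self, search_config, prefix, **places):
        self._search = search_config
        # e.g. {'some.prefix.prop1': ['place.1', 'place.2']}
        self._places = {prefix + '.' + name: list(candidates)
                        for name, candidates in places.items()}

    def __getitem__(self, key):
        # A key that was never mapped raises KeyError right here.
        for candidate in self._places[key]:
            if candidate in self._search:
                return self._search[candidate]
        raise KeyError(key)

# A plain dict stands in for the configuration being searched.
search_config = {'place.2': 'fallback value'}
new_config = Remapper(search_config, 'some.prefix',
                      prop1=['place.1', 'place.2'],
                      prop2=['place.2', 'place.1'])

# 'place.1' is absent, so the search falls through to 'place.2'.
print(new_config['some.prefix.prop1'])
```

The first candidate that exists wins, so the order of each list is the precedence order for that property.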
This second kind of mapping is probably more likely to be used on the
component side (configuration consumer), rather than on the administration
side (configuration provider). It's more likely that as an administrator,
you'll want to have the variables and properties that make sense to your
operating environment, application model, etc., and have policy settings
that "push" them towards the namespaces of things that want them. There
will likely be two levels of policy here: "system-wide" type policies, and
app-specific policies. Probably the system-wide policy will be used to
specify where app-specific policies will be looked for, e.g. in the
application's home directory, or using the name of the __main__ script,
etc. App-specific policies may refer to system-level policies to set
defaults for their policies.
(I keep thinking this is all far too complicated, and it
is. Unfortunately, it's because the needs are complicated. Luckily, we
can apply heavy doses of STASCTAP via sensible defaults. We just need to
have the hook points available.)
Configuration Stacking
----------------------
It should be possible to assemble configuration objects in an ordered way,
such that higher-precedence configuration settings override
lower-precedence settings. Or, depending on your viewpoint, such that
lower-precedence settings provide defaults for higher-precedence
settings. This should be done through a simple API like:
    lowConfig.withOverrides(highConfig), or
    highConfig.withDefaults(lowConfig)
which both produce an identical configuration stack.
Originally, we thought of using addition ('+' operator) to assemble stacks
from configuration objects, but there is a certain ambiguity to the meaning
of the ordering. The notation above, while more verbose, is unequivocal
about what is happening. It can also be chained, as in:
    medium.withDefaults(low).withOverrides(high).withDefaults(extraLow)
The order of the resulting stack will be: high, medium, low,
extraLow. 'withDefaults()' always adds to the *bottom* of the stack, and
'withOverrides()' always adds to the top of the stack.
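A small sketch can confirm the ordering rules. The `ConfigStack` class, its `of()` and `order()` helpers, and the sample property names are all assumptions made for this example; only `withDefaults()` and `withOverrides()` come from the proposal itself.

```python
class ConfigStack:
    """Hypothetical sketch of an immutable, ordered configuration stack."""

    def __init__(self, layers):
        # Each layer is a (name, settings) pair, highest precedence first.
        self.layers = tuple(layers)

    @classmethod
    def of(cls, name, settings):
        return cls([(name, dict(settings))])

    def withDefaults(self, other):
        # 'other' is added to the bottom of the stack.
        return ConfigStack(self.layers + other.layers)

    def withOverrides(self, other):
        # 'other' is added to the top of the stack.
        return ConfigStack(other.layers + self.layers)

    def __getitem__(self, key):
        for _name, settings in self.layers:
            if key in settings:
                return settings[key]
        raise KeyError(key)

    def order(self):
        return [name for name, _settings in self.layers]

high = ConfigStack.of('high', {'log.level': 'DEBUG'})
medium = ConfigStack.of('medium', {'log.level': 'INFO', 'log.path': '/var/log/app'})
low = ConfigStack.of('low', {'log.path': '/tmp', 'log.format': 'plain'})
extraLow = ConfigStack.of('extraLow', {'log.rotate': 'weekly'})

stack = medium.withDefaults(low).withOverrides(high).withDefaults(extraLow)
print(stack.order())  # ['high', 'medium', 'low', 'extraLow']
```

Each call returns a new stack rather than mutating its receiver, which is one way to keep chained expressions free of side effects; whether the real API would do the same is an open design choice.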
There is much more to say about configuration stacks, including how their
"write many, read once" semantics will work, caching, configuration
schemas, "deferred" settings, etc., but I will leave that to another post,
as I think this is good enough for an initial overview.