[TransWarp] More configuration design: the "schema sandwich"
Phillip J. Eby
pje at telecommunity.com
Thu Jul 4 21:23:06 EDT 2002
One fundamentally frustrating aspect of designing a good configuration
system for our purposes is that the system must make it possible to
specify, without programming, things that are hard to do without
programming: for example, referencing pre-existing objects, performing
calculations based on other settings, and so on. Most of the handy
sources of configuration data are text-based, and offer little in the
way of data structure support.
Configuration consumers want objects, not text. But consumers don't know
anything about the underlying configuration providers, and shouldn't have
to. It's up to the providers to supply information that is of the correct
type, and it's up to the people who are setting configuration policies and
assembling the configuration stack to do so in such a way that the correct
data is provided.
However, if an administrator fails to set a configuration value correctly,
a system may fail with an unhelpful error message that does not show the
source of the bad data. Further, the failure may be difficult to reproduce
if the configuration setting in question is infrequently accessed by the
running program. Better to at least have a TypeError or
"ConfigurationError" raised with the configuration variable name and a
description of the problem, ideally with some kind of "reverse traceback"
that shows where the incorrect setting came from.
To do the checking and calculations, we need schemas for our settings. A
schema really just consists of a callable that's passed a name, a value,
and the place the value was found. It validates or transforms the value,
and returns it or raises an error. The value passed in may be a singleton,
"UNDEFINED", to indicate that no value for the name was found. In that
event, the schema may provide a default value, by looking up other settings
in the same "place" where the setting was considered undefined.
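As a rough illustration of the schema idea above, here is a minimal sketch
in modern Python. Everything named here is an assumption made for the
example, not TransWarp code: the Place class (a dict of settings plus a
label saying where they came from), the port_schema function, the
"instance" setting, and the exact shape of ConfigurationError.

```python
class ConfigurationError(Exception):
    """Hypothetical error carrying the setting name and where it came from."""

    def __init__(self, name, problem, place):
        super().__init__(f"{name}: {problem} (value came from {place.label})")
        self.name = name
        self.place = place


class _Undefined:
    """Singleton marker meaning 'no value was found for this name'."""

    def __repr__(self):
        return "UNDEFINED"


UNDEFINED = _Undefined()


class Place(dict):
    """Assumed stand-in for 'the place the value was found'."""

    def __init__(self, label, settings):
        super().__init__(settings)
        self.label = label


def port_schema(name, value, place):
    """A schema: a callable taking (name, value, place)."""
    if value is UNDEFINED:
        # Compute a default from another setting in the same place.
        return 8000 + int(place.get("instance", 0))
    try:
        port = int(value)
    except (TypeError, ValueError):
        raise ConfigurationError(
            name, f"expected an integer, got {value!r}", place
        ) from None
    if not 0 < port < 65536:
        raise ConfigurationError(name, f"{port} is not a valid port", place)
    return port
```

The schema either returns a usable object (an int, not text) or raises an
error that names both the setting and its source, which is the "reverse
traceback" information in its simplest form.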
We'll implement the configuration stack as a chain of filters, each of
which can introduce a collection of schemas (or values, or both). It's
likely that the schema-introducing filters will also automatically
interpret any LinkRef() instances as names to be looked up, with the
configuration stack from that filter down serving as the configuration
for the InitialContext used in the lookup. This ensures that the schema
checking will run against objects, not names.
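One way the filter chain might look, as a sketch only. The ValueFilter and
SchemaFilter classes are hypothetical names, and the LinkRef class here is
a bare stand-in that just holds a name. The real lookup would go through
an InitialContext configured by the stack below the filter; this sketch
simplifies that to looking the target name up in the stack below.

```python
UNDEFINED = object()  # marker for "no value found"


class LinkRef:
    """Stand-in for LinkRef(): holds a name to be resolved later."""

    def __init__(self, target):
        self.target = target


class ValueFilter:
    """A filter that introduces plain values (e.g. parsed from a file)."""

    def __init__(self, values, below=None, label="values"):
        self.values, self.below, self.label = values, below, label

    def lookup(self, name):
        """Return (value, place), where place is the filter that had it."""
        if name in self.values:
            return self.values[name], self
        if self.below is not None:
            return self.below.lookup(name)
        return UNDEFINED, self


class SchemaFilter:
    """A filter that introduces schemas and resolves LinkRefs first."""

    def __init__(self, schemas, below):
        self.schemas, self.below = schemas, below

    def lookup(self, name):
        value, place = self.below.lookup(name)
        if isinstance(value, LinkRef):
            # Simplification: resolve the reference against the stack
            # from this filter down, so the schema sees an object.
            value, place = self.below.lookup(value.target)
        schema = self.schemas.get(name)
        if schema is not None:
            value = schema(name, value, place)
        return value, place
```

Because the reference is resolved before the schema runs, a schema that
insists on a particular object type never has to know that the setting
was originally written as a name.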
A typical configuration stack will probably start with a collection of
schemas representing meta-configuration policies. These schemas will
simply be algorithms for producing base configuration defaults, using
environment variables or other contextual information to "prime the pump"
for loading more complex configuration files or other systems.
A second-tier configuration class or function will take this starter
meta-configuration and look things up in it, following the directions
supplied to build up the stack with configuration providers for config
files, lookups from databases, etc. Then, it'll finish off the top of the
stack with a schema filter based on a global schema registry, into which
imported modules will place their property schema definitions. Thus, any
access to configuration settings will pass through the appropriate schema
checking and calculation of defaults. Caching can occur at all levels,
even if the stack becomes a tree because different parts of the system
tee off their own top-level filters. But under most circumstances, few
if any levels will exist beyond the top of the "schema sandwich"
(meta-config schemas on the bottom, app-config schemas on the top,
configuration providers in the middle).
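The three layers of the sandwich could be assembled roughly as follows.
This is a sketch under several assumptions: layers are modelled as plain
lookup functions, the "place" handed to a schema is simply the lookup
function of the layer below, and the names SCHEMA_REGISTRY,
register_schema, build_stack, the APP_CONFIG variable, and the
/etc/app.ini default are all invented for illustration.

```python
UNDEFINED = object()  # marker for "no value found"

# Hypothetical global registry: imported modules add their schemas here.
SCHEMA_REGISTRY = {}


def register_schema(name, schema):
    SCHEMA_REGISTRY[name] = schema


def nothing(name):
    """Bottom of every stack: nothing is defined."""
    return UNDEFINED


def schema_layer(schemas, below):
    """Check or default whatever the layers below return."""
    def lookup(name):
        value = below(name)
        schema = schemas.get(name)
        return schema(name, value, below) if schema else value
    return lookup


def provider_layer(values, below):
    """Supply raw values, falling through to the layers below."""
    def lookup(name):
        return values[name] if name in values else below(name)
    return lookup


def build_stack(environ, load_provider):
    # Bottom slice: meta-config schemas that "prime the pump" from the
    # environment. APP_CONFIG and the default path are assumptions.
    meta = schema_layer(
        {"config_file": lambda n, v, place: environ.get("APP_CONFIG", "/etc/app.ini")},
        nothing,
    )
    # Middle: providers chosen by following the meta-configuration.
    middle = provider_layer(load_provider(meta("config_file")), meta)
    # Top slice: application schemas from the global registry.
    return schema_layer(SCHEMA_REGISTRY, middle)
```

In use, a module would call register_schema() at import time, and the
application would call something like build_stack(os.environ, loader)
once at startup, where loader is whatever function reads the named
configuration file into a dict. Since the top layer holds a reference to
the registry itself, schemas registered after the stack is built still
take effect.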
Most components in a tree will simply share their root object's
configuration stack, but there will be a facility whereby they can define a
method that will add schema information or other filters to the top of the
stack for their own use. The filtered stack will then be inherited by
their child components.
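The component side of this might be sketched as below. The Component
class, its extend_config() hook, and the Mailer subclass with its
"smtp_port" setting are hypothetical names, and the stack is again
modelled as a plain lookup function.

```python
UNDEFINED = object()  # marker for "no value found"


class Component:
    def __init__(self, parent=None, config=None):
        self.parent = parent
        self._root_config = config  # only meaningful on the root object
        self._stack = None

    def extend_config(self, stack):
        """Hook: return the stack, optionally wrapped in extra filters."""
        return stack

    @property
    def config(self):
        if self._stack is None:
            if self.parent is None:
                inherited = self._root_config
            else:
                inherited = self.parent.config
            # Children of this component inherit whatever is built here.
            self._stack = self.extend_config(inherited)
        return self._stack


class Mailer(Component):
    def extend_config(self, stack):
        # Add a schema-like filter on top of the inherited stack.
        def lookup(name):
            value = stack(name)
            if name == "smtp_port":
                return 25 if value is UNDEFINED else int(value)
            return value
        return lookup
```

A component that does not override the hook ends up sharing the very same
stack object as its parent, while anything below a Mailer sees the
filtered stack.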
Wow. That seemed easy enough to explain, considering it took me all day to
work it out. I decided to spare the list all the text I wrote in the
process of figuring it out, and just write a new letter (this one)
supplying the answer. :) It was a very complex process to work out
exactly where to place each of the functions of name resolution, type
checking/conversions, default calculations, etc. The end result is very
STASCTAP, though.
At this point there are still some questions in my mind about how certain
things will look, what they'll be called, etc., but I think I'm going to
have to actually start writing code to sort those issues out. My wrists
are starting to act up from all this typing over the last week or two, so I
think it's best if I stop for tonight. It may be a couple days before I'll
be able to resume posting or coding in volume.