[TransWarp] PROPOSAL: Use PEP 282 'logging' as the core of PEAK's log
management
Phillip J. Eby
pje at telecommunity.com
Fri Apr 18 19:10:07 EDT 2003
Overview
========
Why the PEP 282 'logging' package?
* Logging isn't a core competency for PEAK; let somebody else handle it
* Zope Pope-tested, BDFL-approved
* AFAICT, it does everything we could do with 'peak.running.logs' or
'AppUtils.Logging', as well as many things we thought about but never
implemented, or implemented but never used.
Why NOT the PEP 282 'logging' package?
* When we first reviewed it, the weird configuration subsystem gave us more
than a little pause (perhaps because it seemed like overkill for just a
logging system). Apparently, many other people felt the same way, and in
later versions the configuration subsystem was split into its own module,
and one need not ever import it. Therefore, we won't.
* Switching cost? (Not very high, unless there are a lot of people out
there using PEAK's LOG_DEBUG, etc. calls. PEAK doesn't do very many of
those itself. Switching from AppUtils.Logging was going to incur a similar
cost anyway, and PEP 282 includes some methods that have an almost
identical signature to the AppUtils.Logging.LoggingInterface, so conversion
should be fairly mechanical.)
* Ummm... anything else?
All that having been said, on to the gory details...
Architectural Comparisons
=========================
AppUtils.Logging
----------------
AppUtils' Logging package is heavily based on 'syslog' conventions. It
uses priority levels imported straight from the 'syslog' module, it uses
syslog's logging format, and it offers the ability to send data to syslog
(again via the syslog module). (I don't know if we've ever actually used
that ability.)
Usage of AppUtils.Logging is via simple calls on objects that implement
'LoggingInterface', that then either do something with the message, or not,
depending on its priority level. Usually these would be 'LogFile' objects
that simply write directly to a file, opening and closing it after every
message, in order to support external log rotation.
peak.running.logs
-----------------
PEAK's logging package is a bit more sophisticated, and in fact borrowed
some aspects of its design from PEP 282. In particular, PEAK uses "event"
objects that represent a log message, and these event objects "bubble up"
from a source location until they find a suitable destination. However,
unlike PEP 282, PEAK logging events handle their own formatting, rather
than having a separate "formatter" object, and they compute a variety of
data values lazily and once, rather than eagerly or more than once as the
PEP 282 implementation sometimes does.
The current PEAK logging architecture also has some potential performance
issues that tend to stop me from using logging as much as perhaps I
should. Although PEAK logging events compute most of their contents
lazily, the logging events themselves have to be created before there's a
check to see if the event will actually be published. Also, if you're
going to have a formatted message with string interpolation, this also must
be done in advance of knowing whether the event will be published. PEP 282
doesn't create a LoggingRecord or interpolate your message arguments until
it has verified that the logger or one of its parents will at least
consider publishing it (subject to filters on criteria other than the
event's priority level).
PEAK was attempting to unite the PEP 282, zLOG, and AppUtils.Logging
priority schemes. This is already silly because zLOG is not used by Zope
X3; PEP 282 and AppUtils.Logging are the only things we need support. To
handle the full syslog priority set with PEP 282, we need only make a few
'logging.addLevelName()' calls as follows:
logging.addLevelName('EMERG', 70)
logging.addLevelName('ALERT', 60)
logging.addLevelName('NOTICE', 25)
Once this is done, even PEP 282's 'SysLogHandler' will instantly support
all of the standard 'syslog' priorities (although they will only be
available via 'log(level,...)' calls, not as 'notice()', 'alert()', etc.
methods unless we also define and install a custom Logger class, which
doesn't seem worth it). The new 'whenImported()' hook in PEAK will let us
configure these levels to be installed as soon as the 'logging' module is used.
PEP 282
-------
The PEP 282 logging package is based on a notion of "loggers". A logger
is a thing that accepts and distributes log messages. Loggers are arranged
in a conceptual hierarchy using dotted names, similar to Python package
names or PEAK property names. By default, loggers propagate messages to
their "parent" loggers, so sending a message to logger 'foo.bar' will also
send it to logger 'foo'. Loggers inherit their priority filter from parent
loggers, so if you don't specify a priority level for 'foo.bar' it will
accept messages of the same priority as 'foo' will accept.
Conceptually, an infinite number of loggers exist. Practically, they don't
exist until you access them, and unless you explicitly configure them, they
will just send things to their parents.
Loggers don't actually do anything with log messages except decide whether
they should be published (by checking the priority), then creating a
LogRecord and passing it through any Filters. If the record is acceptable
to the filters, the record is then forwarded to the appropriate Handlers
(their own and those of their parents, if propagating to the parents).
Handlers actually do something with the message, like write it to a file,
send it on a socket, etc. Handlers can also have their own filters, and
each logger can have multiple handlers. It's one big happy
pipes-and-filters system. PEP 282 offers *many* fancy handlers, including
one that does automatic logfile rollover and renaming. (No compression,
but we could probably subclass for that if we needed it...) There's even
an SMTPHandler that mails out log messages!
All of this flexibility implies a certain complexity of
configuration/setup. But, if no options are specified, the defaults are
fairly reasonable: messages at WARNING or higher priority are written to
sys.stderr using the default format, all others are discarded.
The big architectural difference between 'logging' and 'peak.running.logs'
is that 'logging' is optimized for a *static* choice of "logger", while
PEAK assumes that this is dynamic, based on a component's position in a
component hierarchy. I see two ways to deal with this:
1. Use a static approach to the logs, putting 'logger =
logging.getLogger("foo.bar")' calls into a module's code and then just
using 'logger.warning(...)' calls in the code. This is how most examples
using the 'logging' package do it. Pro: output by a particular subsystem
is together. Con: might be a bit harder to control things related to a
specific component.
2. Use a dynamic approach using lazy binding. Add a 'logger:' URL scheme
to 'peak.naming', and use 'log=binding.bindTo("logger:foo.bar")' as a
component attribute. Calls would be made with 'self.log.warning(...)' et
al. Pro: fine-grained control can be had. Con: now you have to decide for
every darn component what log you want things to go to... but you could
still opt for a module-based scheme by default, and override just one
component if you need to.
I think I'm leaning towards approach 2, because approach 1 doesn't lend
itself to being changed by anything short of module inheritance. However,
any given module can make its own decision on this, and can change its mind
later. Also, the ability to have a variety of custom handlers, each with
their own filters, may be more useful than per-component message
routing. With our existing packages, per-component routing was the only
option.
One other issue we need to address is formatting. The PEP 282 default
format just dumps raw message text, with no timestamps or other info. We
could use a 'whenImported()' hook to change the format, but I'm not sure
everyone will agree with our format choice. Also, formatters are set per
output handler, not systemwide, so its somewhat moot anyway.
What I think we really need for dealing with logging configuration is a
ZConfig file format, and a PEAK configuration property for a ZConfig file
to be loaded at the time the logging package is imported. We can supply a
default version of that configuration file that sets up a few things for
PEAK, like a syslog-like default format. To do all this, we'll need to
create a ZConfig schema for the logging package, and write a few helper
routines. However, this is probably easier than creating our own scheme
and implementing and maintaining the rest of PEP 282. Also, Chris
McDonough's recent PyCon presentation on ZConfig included some examples
based on configuring the 'logging' package, so there's something to start
from. Who knows, maybe if we wait a little bit, somebody else will create
a ZConfig schema for 'logging'. Certainly, Zope X3 could use such a
thing. Perhaps we can share the effort with the Zope 3 developers.
Interestingly, because the 'logging' package is designed to have
configuration changes "on-the-fly", it's possible to meaningfully load
multiple configuration files during application startup or even later, so
our loading a single configuration file when logging is first used is
unlikely to create a problem -- except that multiple loads of the same
file, or the same information from two files, will result in duplicate
logging (if the same handler appears more than once). Of course, it should
probably be easy to notice that this is happening!
twisted.python.log
------------------
I almost forgot Twisted. Twisted has its own quirky little logging system,
very minimalistic and evidently designed with raw performance in
mind. Instead of having different logs or loggers, it instead has the
notion of who's the current "owner" of *the* log. The "owner" gets to
determine precisely what bytes will be written to the log, given an input
string. This is used mainly to prefix output lines with timestamp info, etc.
There is no priority scheme. You can write stuff to the log. Exceptions
go to the log. That's about it. Clearly, this architecture isn't even in
the same ballpark as PEAK, AppUtils, and PEP 282. Apparently, Twisted
automatically sets the "owner" of the log whenever it task-switches between
protocol handlers, so that each protocol handler doesn't have to know where
to find its log. Given Twisted's emphasis on persistent protocol handlers,
this makes a certain amount of sense. Putting the log in the object would
have added that to its persistent state.
Integrating this with *anything* else is going to be tricky, and I don't
think it's going to happen without the Twisted developers doing a
significant part of it. There isn't enough information (AFAICT) in the
Twisted log API to create PEP 282 logging records *from*, so although in
principle one could create a pseudo-logfile object to forward everything
from Twisted to a specified PEP 282 Logger, in practice all it would do is
give you the ability to "tee" Twisted log data to a variety of places, and
maybe do some regex filtering.
On the plus side, developers of both Twisted and Zope want those two
systems to share code and interoperate, so perhaps they'll be motivated to
address this. As a BDFL-blessed standard for Python, PEP 282 carries more
weight than anything we might develop as a logging package.
Anyway, PEAK doesn't rely on Twisted for anything as yet, so this is not an
issue we need to care very strongly about. If at a later time we start
doing networking code with PEAK based on Twisted, we may care a bit more --
or we may find we'd rather have Twisted's output going to a separate file
anyway.
Migration Plan
==============
So what does the move to PEP 282 entail for PEAK?
* We need to distribute the 'logging' package so that PEAK has "batteries
included" on Python 2.2 in the absence of a Zope X3 installation.
* The current contents of 'peak.running.logs' needs to be replaced with:
1. a Formatter implementation that supports the PEAK/AppUtils log format.
2. support code for setting up the "missing" log levels in the 'logging'
package, and exporting them for use by PEAK code.
3. support code for loading ZConfig files into the 'logging' package,
automatically and manually
4. any PEAK-specific Handler or Filter objects (probably none to start with)
* We add a 'logger:' naming scheme to make it easy to refer to
loggers. (We could make ones for handlers, too, but this seems better
"handled" directly by the config file format.)
* We create a ZConfig schema for 'logging', and add it to a peak.running
'component.xml' file. (This is actually a bit tricky, because we need two
schema components: one for all the built-in objects supplied by 'logging',
and one for PEAK-specific formatters, handlers, and filters.)
* The current LOG_XXX() functions in 'peak.api' need to be removed, and
their use expunged from the codebase. (So if you're using them now, get
started on moving to using the 'logging' package instead!)
Notice that for the most part, we won't really *do* anything with logging
in PEAK any more, except provide some useful front-end bits. Sounds like a
big win to me. What do you think?
More information about the PEAK
mailing list