[TransWarp] weakrefs, garbage collection, and persistence
Phillip J. Eby
pje at telecommunity.com
Wed Jul 31 20:27:37 EDT 2002
I'm contemplating dropping the use of weak references from the PEAK
component binding system. Not from caches, just for binding attributes and
references to parent components.
I originally used weak references for parent component links starting with
code that was based on Python 2.1, whose garbage collection capability was
disabled by default. Also, garbage collection was a relatively new feature
to Python, which didn't do much to relieve my accumulated paranoia
regarding circular references. (Even now, garbage collection is relatively
new, and there are still some "interesting" issues, such as behavior of the
__del__ method.)
But weak references have their own issues. They aren't picklable, and so
don't play well with persistence. They add performance overhead and code
complexity whether you use weakref.ref() or weakref.proxy(); the two just
shift the cost to different parts of your application.
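To make the pickling problem concrete, here is a minimal sketch. The
Component class is a hypothetical stand-in, not PEAK's actual component
API; the only point is that an instance holding a weakref.ref() to its
parent can't be pickled as-is.

```python
import pickle
import weakref


class Component(object):
    """Hypothetical stand-in for a component; not a PEAK class."""
    pass


parent = Component()
child = Component()
# The child points at its parent through a weak reference.
child.parent = weakref.ref(parent)

try:
    pickle.dumps(child)
    picklable = True
except (TypeError, pickle.PicklingError):
    # pickle refuses to serialize the weak reference stored on the child.
    picklable = False
```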
Until now, I've put up with these issues in order to guarantee an absence
of hard circular references. But there are use cases coming up which may
make the point moot.
First, persistent objects, and domain models in general. Circular
references are the rule, rather than the exception, in any moderately
complex application domain. Persistent objects would need to have
__getstate__ and __setstate__ convert to/from weak references for
pickling. But if we're using ZODB or the Persistence-SIG's future
persistence code, circular references will be managed by sweeping a cache
and deactivating the objects anyway.
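As a rough sketch of what that conversion would involve, assuming a
hypothetical Node class with a weak parent link (the names Node, _parent
and children are illustrative only, not taken from PEAK or ZODB):

```python
import pickle
import weakref


class Node(object):
    """Hypothetical domain object: strong links down, weak link up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = None
        if parent is not None:
            self._parent = weakref.ref(parent)
            parent.children.append(self)

    @property
    def parent(self):
        # Dereference the weak link; None if unset or already collected.
        return self._parent() if self._parent is not None else None

    def __getstate__(self):
        # Swap the weak reference for a strong one so pickle can store it.
        state = self.__dict__.copy()
        state['_parent'] = self.parent
        return state

    def __setstate__(self, state):
        # Turn the strong reference back into a weak one after loading.
        parent = state['_parent']
        state['_parent'] = weakref.ref(parent) if parent is not None else None
        self.__dict__.update(state)


root = Node('root')
Node('child', root)
clone = pickle.loads(pickle.dumps(root))
```

Every class with a weak link needs this pair of methods, which is the
code complexity I'd rather not carry if the persistence machinery is
going to manage cycles for us anyway.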
Second, namespace-to-class mapping. We want to extend the peak.naming
framework to make it easy to turn any naming context namespace into a class
object, with lazy loading of the class' descriptors. To do this,
descriptors will need to hang onto their naming contexts. The naming
contexts, in turn, need to hang onto their parent objects -- possibly a
parent naming context -- as long as the context is in use. But parent
component references are weak right now. This means that a context *must*
be held onto by its parent, in order for its children's references to
remain valid. This works well for the typical application component
objects, but it shows a "weakness" in our architecture, so to speak. :)
Essentially, it isn't valid to pass any component in a tree to some object
outside the tree that may use it past the lifetime of its parent. Or, in
other words, it's possible for a component's hierarchical context to change
over time through no intentional action on anybody's part. That seems wrong.
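A small sketch of the failure, again using a hypothetical Component class
rather than real PEAK code. The immediate loss of the parent shown here
relies on CPython's reference counting; other implementations may free
the parent later.

```python
import gc
import weakref


class Component(object):
    """Hypothetical component with a weak link to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self._parent = weakref.ref(parent) if parent is not None else None

    def getParent(self):
        # Returns None once the parent has been collected.
        return self._parent() if self._parent is not None else None


def make_context():
    root = Component('root')
    # Only the child escapes; nothing outside holds on to root.
    return Component('child', root)


child = make_context()
gc.collect()  # not needed on CPython, but harmless
orphaned = child.getParent() is None
```

The child started out with a parent and ends up without one, although no
code ever asked for that change.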
The natural solution would seem to be to get rid of weakrefs, and hope that
garbage collection takes care of things. One concern is that __del__ is
not fully supported by the garbage collector: a cycle that contains objects
with __del__ methods cannot be collected. This is probably not a big
issue; I believe that most objects that implement __del__ do so because
they reference external resources like file handles or sockets, and will
not themselves be part of object cycles. Some testing in the Python
interpreter has shown that the collector will handle cycles that reference
objects with __del__ methods, as long as the cycle itself doesn't contain
objects with __del__ methods. So I think this issue can be avoided,
through appropriate care.
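The kind of test I mean looks roughly like this. Resource and Node are
made-up names for the sketch; the Resource has a __del__ method but sits
outside the cycle, which is the arrangement described above.

```python
import gc

finalized = []


class Resource(object):
    """Made-up class standing in for a file handle or socket wrapper."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Record that the finalizer ran.
        finalized.append(self.name)


class Node(object):
    """Plain object with no __del__; safe to put in a cycle."""
    pass


def build_cycle():
    a = Node()
    b = Node()
    a.other = b          # a -> b
    b.other = a          # b -> a closes the cycle
    # The cycle refers to the Resource, but the Resource is not in it.
    a.resource = Resource('handle')


build_cycle()
gc.collect()  # collects the a/b cycle, which then releases the Resource
```

After the collection, finalized contains 'handle': the collector freed
the cycle and the Resource's __del__ ran as usual.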
A recent thread on the python-dev list seems to imply that garbage
collection will become a mandatory feature as of Python 2.3, in the sense
that there will no longer be an option to disable it. Apparently the core
developers now consider it robust enough for that.
Anyway, the main point of this rant is: what's your opinion? Do you
disable gc? Had any bad experiences with it? Any reasons why we shouldn't
just throw caution to the wind and unbuckle the weak reference safety
belt? As Dennis Miller says, "I want to know what you think, America",
except that I want to know what the non-Americans think as well. :)