[TransWarp] memory leak in peak.running
Phillip J. Eby
pje at telecommunity.com
Fri Aug 22 16:45:45 EDT 2003
At 10:43 PM 8/22/03 +0300, alexander smishlajev wrote:
>Phillip J. Eby wrote, at 22.08.2003 22:13:
>
>>>the following script is leaking memory:
>>Have you already tried adding a gc.collect() call to make sure that it's
>>not just buildup between collections?
>
>yes. we have tried to do gc.collect() and look at gc.garbage in our
>application. this did not help. the garbage list is always empty, but
>the application core grows from ~8M to more than 150M by one night -
>several Kb per second.
>
>i managed to narrow the problem down to the given example.
>
>i have studied the code of MainLoop, UntwistedReactor, TaskQueue and
>AdaptiveTask, and i must confess that i cannot see any possible leakage
>sources there. do you have any clues?
>
>just to be sure, i have just added gc.collect() to the doWork() of this
>Test class. this slowed things down a lot (also leakage is much slower),
>but the memory is still leaking.

I found the problem.  It was in the C version of adapt(), and had nothing
to do with any of the specific work the app was doing. Specifically, the
'pollInterval' attribute binding was trying to suggest a parent component
for the new poll interval on each execution of the task. In doing this, it
tried to adapt the poll interval to IAttachable, which then leaked a new
bound method for IAttachable.__adapt__ on each execution. Unfortunately,
gc.collect() doesn't catch refcount bugs. :(
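
To illustrate why gc.collect() and gc.garbage show nothing in a case like
this, here is a minimal sketch, assuming CPython. It is not the PEAK code:
'Payload' and 'leak_one_reference' are hypothetical names, and the missing
Py_DECREF is simulated from Python through ctypes rather than reproduced in
a real C extension.

```python
import ctypes
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any object handled by C code."""


def leak_one_reference(obj):
    # Simulate a C function that forgets a Py_DECREF: the reference count
    # goes up by one and nothing ever brings it back down.
    ctypes.pythonapi.Py_IncRef.restype = None
    ctypes.pythonapi.Py_IncRef(ctypes.py_object(obj))


def survives_collection(leak):
    obj = Payload()
    probe = weakref.ref(obj)   # lets us observe whether obj was freed
    if leak:
        leak_one_reference(obj)
    del obj                    # drop the only Python-visible reference
    gc.collect()               # only finds unreachable reference *cycles*
    return probe() is not None


if __name__ == "__main__":
    print("clean object still alive: ", survives_collection(leak=False))
    print("leaked object still alive:", survives_collection(leak=True))
```

The leaked object is not part of a cycle, so the collector has no reason to
look at it; its count simply never reaches zero.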

I found the problem by simple binary search... first I changed the test's
run() method to just loop, calling self.doWork(). Since that didn't leak,
I knew doWork() wasn't the problem. I then tried getWork() and doWork(),
which also didn't leak. That narrowed it down to something in __call__(),
and a little more testing showed it wasn't lockMe() or unlockMe(). After a
bit more of this binary searching I narrowed it down to the line where
__call__ set 'self.pollInterval = pi'. Looking through the code used by
binding descriptors to set attributes, I eventually noticed that
'suggestParentComponent()' was being called, so I changed 'pollInterval' to
set 'suggestParent=False', and the leak went away.
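
The bisection described above can be sketched as a small harness that runs
one candidate step in a tight loop and reports how much memory stayed
allocated. This is only an illustration under stated assumptions: it uses
the standard tracemalloc module from modern Python (not available in 2003),
and 'growth_in_bytes', 'clean_step' and 'leaky_step' are hypothetical names,
not anything from PEAK.

```python
import gc
import tracemalloc


def growth_in_bytes(step, iterations=2000):
    """Run step() repeatedly and return the net growth in traced memory."""
    step()                      # warm up so one-time allocations don't count
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        step()
    gc.collect()                # rule out ordinary garbage buildup
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


_kept = []                      # simulates references that are never released


def clean_step():
    bytes(512)                  # allocated and freed on every call


def leaky_step():
    _kept.append(bytes(512))    # allocated and kept on every call


if __name__ == "__main__":
    # Bisect by measuring each half of the suspect code path on its own.
    for name, step in [("clean_step", clean_step), ("leaky_step", leaky_step)]:
        print(name, growth_in_bytes(step), "bytes retained")
```

A step whose retained memory scales with the iteration count is the half to
split further; a step that stays flat can be ruled out.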

Finally, since all 'suggestParentComponent()' does when given a number is
call adapt() on it, it was then just a matter of combing through the
protocols._speedups source to find the refcount leak.  That, of course,
took a bit more time.
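
One way to confirm a suspected refcount leak from the Python side, before
reading the C source, is to watch sys.getrefcount() on an object the suspect
call touches. The sketch below is an assumption-laden illustration for
CPython: 'Protocol', 'leaky_adapt' and 'clean_adapt' are hypothetical
stand-ins, and the leak is simulated by stashing each bound method in a list
rather than by a missing Py_DECREF in C.

```python
import sys


class Protocol:
    """Hypothetical stand-in for an interface object such as IAttachable."""

    def __adapt__(self, obj):
        return None


_leaked = []  # plays the role of references the C code never released


def leaky_adapt(obj, protocol):
    method = protocol.__adapt__   # creates a new bound method object
    _leaked.append(method)        # simulated leak: the bound method lives on
    return method(obj)


def clean_adapt(obj, protocol):
    return protocol.__adapt__(obj)


def refcount_delta(target, call, iterations=100):
    """Return how much target's reference count changed over the calls."""
    before = sys.getrefcount(target)
    for _ in range(iterations):
        call()
    return sys.getrefcount(target) - before


if __name__ == "__main__":
    proto = Protocol()
    print("clean:", refcount_delta(proto, lambda: clean_adapt(1.5, proto)))
    print("leaky:", refcount_delta(proto, lambda: leaky_adapt(1.5, proto)))
```

Each bound method holds a reference to the object it is bound to, so a leak
of one bound method per call shows up as a steadily rising count on that
object.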