Python’s Weakref is weak sauce (but not poisonous)

A little while ago, I released an itty bitty Python module called di that caused quite the kerfuffle. di is simply the inverse of id; di(id(someObject)) returns someObject.
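For readers who have not seen the trick, here is a minimal sketch of how such an inverse can be written. It is not necessarily how the di module itself is implemented; it assumes CPython, where id() happens to return the object's memory address, and it uses ctypes to reinterpret that address as an object reference.

```python
import ctypes

def di(address):
    """Sketch of an id() inverse. CPython only: assumes id(obj) is obj's address.

    If the object at that address has already been freed, this will hand back
    garbage or crash the interpreter, which is exactly the danger described below.
    """
    return ctypes.cast(address, ctypes.py_object).value

some_object = ["still", "alive"]
print(di(id(some_object)) is some_object)
```

This is only safe while something else still holds a strong reference to the object.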

Of course, this is incredibly dangerous and a desperately stupid thing to do anywhere but in debugging code (as I noted). Why? Because it effectively adds all of the fragility of C-style pointers to Python. Yay! Bus Errors! SegFaults! Random Objects Where They Should Not Be! Whee!

Pretty much every criticism of di found in the comments section mentioned that Python’s weakref implementation could be used to accomplish the same thing with a little bit of bookkeeping code on the backend.

Nice. Except for the fact that weakref is not a general purpose solution for tracking objects, regardless of type.

Specifically, from the documentation:

Not all objects can be weakly referenced; those objects which can include class instances, functions written in Python (but not in C), methods (both bound and unbound), sets, frozensets, file objects, generators, type objects, DBcursor objects from the bsddb module, sockets, arrays, deques, and regular expression pattern objects. Changed in version 2.4: Added support for files, sockets, arrays, and patterns.

Ouch. In other words, the suggested weakref-based system for keeping track of objects will not be so simple (I was debugging memory leaks). I would have to special-case whatever objects can’t be weakref’d and either change their potential lifespan by creating a strong ref to ’em or simply ignore them (or use di(), which is what I did).
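The special-casing described above might look roughly like the following sketch. The track() helper and its ("weak" / "id") tags are hypothetical names made up for illustration; the only library behavior relied on is that weakref.ref raises TypeError for types that do not support weak references.

```python
import weakref

def track(obj):
    """Return a (kind, handle) pair that never extends obj's lifespan.

    kind is "weak" when a weak reference could be made, otherwise "id",
    in which case only the bare id() is recorded.
    """
    try:
        return ("weak", weakref.ref(obj))
    except TypeError:
        # dict, list, str, int and friends land here. Storing obj itself
        # would create a strong reference and change when it gets freed.
        return ("id", id(obj))

class Foo(object):
    pass

foo = Foo()
print(track(foo)[0])   # a class instance can be weakly referenced
print(track({})[0])    # a plain dict cannot
```

The "id" branch is where the bookkeeping gets ugly: a bare id says nothing about whether the object is still alive.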

So, what can’t be weakref’d (using Python 2.5)?

>>> from weakref import ref
>>> ref({})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'dict' object
>>> ref("")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'str' object
>>> ref([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'list' object
>>> ref(object())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'object' object
>>> ref(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'int' object
>>> class Foo(object): pass
... 
>>> ref(Foo())
<weakref at 0x6a1b0; dead>

Unfortunately, a lot of things can’t be weakref’d. In particular, instances of most of the classes used to store or represent data in applications (dict, list, str, and int) cannot be weakref’d.
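The same experiment can be run as a small script instead of by hand. This is a sketch assuming a current CPython 3 rather than the 2.5 used in the session above; can_weakref() is a hypothetical helper, and the sample list adds set and array.array because the documentation quoted earlier says those are supported.

```python
import array
import weakref

class Foo(object):
    pass

def can_weakref(obj):
    """True if weakref.ref accepts obj, False if it raises TypeError."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

samples = [{}, "", [], object(), 1, Foo(), set(), array.array("i")]
support = {type(sample).__name__: can_weakref(sample) for sample in samples}

for name, ok in sorted(support.items()):
    print(name, "weakref ok" if ok else "cannot be weakref'd")
```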

For my purposes, that pretty much meant that using weakref to build a generic memory monitoring subsystem was out of the question. Not outright impossible, by any means, just very limited in its general applicability without changing the lifespan of objects in the target app.

Ultimately, I don’t consider this some huge, glaring, gigantic flaw in Python’s implementation.

That weakref doesn’t work with such fundamental types as dictionaries, lists, and strings is really quite annoying, but not terribly surprising given the history of the language and the focus on backwards compatibility. It is rare that one would write code that needs completely type agnostic weak referencing capabilities.

To put that last statement more strongly: completely type agnostic containment is an extreme rarity within well designed software architectures.

One exception is debugging tools. More specifically, type agnostic containment without impacting object lifespan is quite valuable within memory debuggers (stupid name: “object population monitoring” or “object census & analysis” is more precise). Fortunately, Python offers some decent tools for interrogating the population of objects currently in existence. The lack of generic weakref makes generation-to-generation tracking more difficult, but not intractable.
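As a rough illustration of that kind of census, here is a sketch built on the standard gc module. The census() and growth() helpers are hypothetical names, not part of any released tool. Note that gc.get_objects() only reports objects tracked by the cyclic garbage collector, so atomic values such as str and int never show up in the counts.

```python
import gc
from collections import Counter

def census():
    """Count live, gc-tracked objects by type name without keeping them alive."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())

def growth(before, after):
    """Return {type name: increase} for every type whose population grew."""
    return {name: after[name] - before[name]
            for name in after
            if after[name] > before[name]}
```

Taking one census before a suspect operation and another after it, then passing both to growth(), gives a crude generation-to-generation diff. Because only counts are stored, the census itself does not change the lifespan of anything it observes, but it also cannot say which individual objects survived.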

Anyway, enough theoretical whining. I spent a bit of Labor Day (the other three-day weekend that I can’t keep straight) writing a Twisted memory debugger. I’ll drop it on the world in a post or two.


