Python: di

The id() function doesn’t have an inverse[1]. Now it does.

>>> k = object()
>>> print k
<object object at 0x9a3b8>
>>> id(k)
631736
>>> from di import di
>>> j = di(631736)
>>> j
<object object at 0x9a3b8>
>>> id(j)
631736

Stupid simple module available in my SVN repository.

1 For good reason. An object’s id() is just its address in memory. This hack is effectively exposing a pointer — a fragile, prone to disappear out from under you, nasty, C-ism — into Python. It is like running on marbles with scissors. No stability and potential for loss of precious runtime fluids.

In a debugging context, though, being able to turn an id() back into an object reference allows interrogation of objects where identification of “objects of interest” and the actual interrogation of said objects does not have to happen within a contiguous set of expressions.
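For the curious, here is a rough sketch of how such an inverse could be written in pure Python. It assumes CPython, where id() is the object’s address, and it is not the C extension from the repository: it leans on the standard ctypes module instead, and the docstring wording is mine.

```python
import ctypes

def di(obj_id):
    """Return the live object whose id() is obj_id (CPython only).

    Assumption: obj_id is the address of an object that is still alive.
    Any other value reads arbitrary memory and can crash the process.
    """
    return ctypes.cast(obj_id, ctypes.py_object).value

k = object()
j = di(id(k))  # k is still referenced, so its address is valid
print(j is k)
```

The same caveat applies as to the real module: nothing here checks that the address still holds the object you meant.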

Chris says:

I wish people wouldn’t post stuff like this. There is never any case where this is a useful operation – the only time it works is when the object you want to retrieve is still around anyway – and there’s never any reason to store the id of an object instead of the object reference (or a weakref) instead.

People asking how to do this isn’t uncommon, and being able to google for it will just lead people to doing it instead of being told that they need to fix the problem that leads to them wanting to do it instead.

“Never any case”? That is a bit unimaginative and could even be construed as arrogant. The only safe “never” is in the statement “never assume you know everything”.

Sure — if you pass a random number or the id of a no longer existing object to di(), your process is going to crash. Totally true and I have added some emphasis on exactly why this feature is not a part of python, nor should it be.

As for this tool and the dangers therein: So what? Don’t do that. Tools can be misused. If something hurts, don’t do it.

This tool has saved me hours and hours of engineering time, several hundred lines of fairly complex code, and having to make some fairly nasty & intrusive changes to several relatively complex client/server focused codebases. More subtly, it allowed me to vastly reduce memory leaks without having to instrument the code in ways that would have very likely changed the lifespan of objects.

Lame. Python’s weakrefs don’t support weak references to Dictionaries, Lists, Tuples, Strings, or None. Four out of five of these types are very often exactly the kind of thing I need to figure out why there are tons of ’em floating around that shouldn’t be. Bogus.
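The limitation is easy to check for yourself. This is a small sketch; supports_weakref and TrackedList are names made up for the illustration, and the subclass workaround is only shown for list.

```python
import weakref

def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

class TrackedList(list):
    """Hypothetical list subclass; instances can be weakly referenced."""

# The built-in types named above all refuse weak references.
for sample in ({}, [], (), "text", None):
    print(type(sample).__name__, supports_weakref(sample))

# A plain subclass of list accepts them.
print("TrackedList", supports_weakref(TrackedList()))
```

Subclassing helps only for objects you create yourself, which is little comfort when the lists and dicts you are chasing are built somewhere deep inside a library.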



20 Responses to “Python: di”

  1. Chris says:

    I wish people wouldn’t post stuff like this. There is never any case where this is a useful operation – the only time it works is when the object you want to retrieve is still around anyway – and there’s never any reason to store the id of an object instead of the object reference (or a weakref) instead.

    People asking how to do this isn’t uncommon, and being able to google for it will just lead people to doing it instead of being told that they need to fix the problem that leads to them wanting to do it instead.

  2. Lennon says:

    Chris is right: a WeakRef accomplishes exactly the same objective, without bringing the wonderful world of segfaults into the hands of the average Pythonista. Unless your goal is to deliberately generate SIGSEGV and then trap it, this will just bring you a world of pain.

    …and if you’re doing that, you deserve whatever you get, IMHO. Abuse of segfault traps is one of my pet peeves, after spending hundreds of hours trying to work around a codebase that used it extensively, and having my repeated attempts to switch to a clean exception-based approach reverted by the lead developer because exceptions “weren’t portable.”

  3. bbum says:

    Sorry. You are either both choosing to ignore context or you are holier-than-thou arrogant. This is a debugging tool, nothing more, nothing less.

    If someone is stupid enough to put this into a production system, that is their choice and I’m not here — like you are, apparently — to police their dumb asses. They will feel the wrath of the SIGSEGV.

    This little tool has saved me a ton of hours, a ton of changes, has worked flawlessly, and has yielded tremendous optimizations in an otherwise complex system.

    Lennon: How, exactly, do you log a weakref such that you can later parse said log and find the object that the weakref points to when interacting with the interpreter in a new context? I suppose I could build a big ass global hash of something kind of like the id() as key with weakref as value. But why should I go through that effort when di works perfectly to solve the debugging problem I have without requiring that I go and implement such a beast?

  4. Paddy3118 says:

    Should have waited until the first of April to post, like this.

    You might have also added a huge warning banner.

    – Paddy.

  5. Paul Prescod says:

    I can’t believe the censorious bastards giving you a hard time. At least as I’m reading the post now (dunno about a few hours ago), there are more than enough disclaimers on the page to allow people to get what they deserve if they abuse the tool. Thanks for sharing it! I can absolutely see how it would be useful for debugging and perhaps for a few other carefully planned applications.

    It isn’t as if the weakref cannot also be abused. After all, it throws an exception and a naive user might not catch the exception, thereby crashing their app just as destructively as the SIGSEGV. Yes, the exception can be caught, but that requires forethought, the same forethought that could be applied to NOT abusing di.

  6. Peter Fein says:

    Prescod: It isn’t as if the weakref cannot also be abused. After all, it throws an exception and a naive user might not catch the exception, thereby crashing their app just as destructively as the SIGSEGV

    Are you kidding me?

    When an unhandled exception occurs, the interpreter shuts down cleanly. finally: blocks are executed, atexit hooks are run, temp files cleaned up, __del__’s are called, etc., etc. Remaining threads (and therefore, the program as a whole) may continue to run.

    When a segfault occurs, the interpreter just shits itself.

    If unhandled exceptions are truly equally harmful as segfaults, we might as well all go back to programming in C.

    And FWIW, you’re confused on the behavior of weakrefs – you only get an exception when using a weakref.proxy(), which wouldn’t make sense here.

    As for the merits of the module itself for debugging purposes, I don’t quite see it. bbum says: How, exactly, do you log a weakref such that you can later parse said log and find the object that the weakref points to when interacting with the interpreter in a new context? How the heck does that work? It’s not like a subsequent run of your script is going to put objects at the same mem addresses.

    bbum: I suppose I could build a big ass global hash of something kind of like the id() as key with weakref as value.

    Come on, it’s not that hard. Here’s a free snippet I use exactly for this purpose:

    import weakref

    class MetaInstanceTracker(type):
        """Metaclass for L{InstanceTracker}"""
        def __new__(klass, name, bases, dic):
            cls = super(MetaInstanceTracker, klass).__new__(klass, name, bases, dic)
            # one weak registry per class: id(instance) -> instance
            cls.__instances__ = weakref.WeakValueDictionary()
            return cls

    class InstanceTracker(object):
        """a class that tracks its objects using weak references"""

        __metaclass__ = MetaInstanceTracker  # Python 2 metaclass syntax

        def __init__(self, *args, **kwargs):
            self.__instances__[id(self)] = self
            super(InstanceTracker, self).__init__(*args, **kwargs)

    Another free tip: try not to be so offended when the intarweb runs around screaming “My eyes! My eyes!” when looking at something that the author himself recognizes as an awful hack. 😉
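The point about weakref.ref versus weakref.proxy() can be sketched as follows. Node and dead_reference_behaviour are hypothetical names, and the sketch assumes CPython’s reference counting, which frees the object as soon as its last strong reference is deleted.

```python
import weakref

class Node(object):
    """Hypothetical class, used only for this illustration."""
    name = "node"

def dead_reference_behaviour():
    """Return (value of a dead weakref.ref, whether a dead proxy raised)."""
    target = Node()
    ref = weakref.ref(target)
    proxy = weakref.proxy(target)
    del target  # last strong reference; CPython frees the object right away

    proxy_raised = False
    try:
        proxy.name  # touching a dead proxy raises ReferenceError
    except ReferenceError:
        proxy_raised = True
    return ref(), proxy_raised  # calling a dead ref just returns None

print(dead_reference_behaviour())
```

Either way the failure is an ordinary Python value or exception, not a crash of the interpreter.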

  7. bbum says:

    Thanks for the code. While a step in the right direction, there would still be a lot of work left to make it do what I was able to do based upon di(). It is about 9 lines of code longer than what I needed. 😉

    In the end, my particular problem — stupid huge amounts of memory leaks due to overrooting of objects — has been solved very very quickly with minimal fuss and full cognizance of the limited applicability of the model.

    Sometimes a quick, nasty, intrusive, and fragile hack is exactly the right answer.

  8. Chris Hanson says:

    You should get in touch with Guido and see if something like this could be added to Python 3000 core language. It would be really useful in all sorts of situations!

  9. Shalabh says:

    Since ids are reused, how can you be sure the object you got from di is the same one you called id() on? In fact this happens easily on my Mac:

    >>> o = object()
    >>> id(o)
    5387360
    >>> del o
    >>> o2 = object()
    >>> id(o2)
    5387360
    >>>

    Also, can you tell us why a weakref didn’t work for you? For example, how did you pass the id of the object from the code where the object existed, to the code where the object needed to be ‘found’?

  10. Shalabh says:

    Ok, I saw the comment about logging an id and finding the object based on the id in the log file, which means the same instance of the interpreter is still running. But you may get a different object back as ids are reused. Peter Fein’s code has the same issue since it uses the id as the key. But it’s a simple enhancement to use (id, count) as the key.

  11. Peter Fein says:

    Shalabh: No, my code does not have an id-reuse problem. id() is called from the __init__. There’s no way you could end up with two live objects at the same address at the same time.
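A stripped-down sketch of the id-to-weakref registry under discussion may help here. Tracked, lookup and _registry are hypothetical names, the code uses a single module-level registry rather than the per-class one in the snippet above, and it again assumes CPython’s immediate reference counting.

```python
import weakref

# Hypothetical debugging registry: id(obj) -> weakly held obj.
_registry = weakref.WeakValueDictionary()

class Tracked(object):
    """Hypothetical base class whose instances register themselves."""
    def __init__(self):
        _registry[id(self)] = self

def lookup(obj_id):
    """Return the live tracked object with this id, or None if it is gone."""
    return _registry.get(obj_id)

first = Tracked()
first_id = id(first)
print(lookup(first_id) is first)

del first  # the object is freed and the registry drops its entry
print(lookup(first_id) is None)
```

So the registry never hands back a dead object. What it cannot rule out is that an id written to a log earlier later resolves to a different, newer object that happens to live at the same address, which is the case the (id, count) key is meant to cover.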

  12. DeanG says:

    Guess we’re not all Consenting Adults here. 😀

    A good compromise may be to put it in the same class of feature as assert: included, but not run when optimization is set at compile time. Would that be too restrictive for the various debugging environments?

  13. bbum’s weblog-o-mat » Blog Archive » Python’s Weakref is weak sauce. says:

    […] little bit ago, I released an itty bitty Python module called di that caused quite the kerfluffle. di is simply the inverse of id; di(id(someObject)) will simply […]

  14. bbum’s weblog-o-mat » Blog Archive » TwistedElephant: A Memory Debugger for Twisted Applications says:

    […] one-off bits of logging within the target application that used Python’s GC module and my di() hack such that I could explore the active object graph within the Twisted […]

  15. Tiran says:

    Here is just a friendly reminder from me. Python has something like di() already built in:

    >>> import _ctypes
    >>> _ctypes.PyObj_FromPtr(id(_ctypes))

    🙂

  16. Satoru says:

    Hi.
    I don’t know Python well. Is it possible to code like this without di?

    import di

    class Debug:
        @classmethod
        def show(self, *val_name):
            print "",
            print " ".join([m + ":" + str(di.di(id(eval(m)))) for m in val_name])

    if __name__ == '__main__':
        total = 0
        for n in range(10):
            total += n
        Debug.show("n", "total")

  17. martineau says:

    @Satoru: Yes, it would work fine (using Tiran’s solution):


    # di.py module
    import _ctypes

    def di(obj_id):
        """ reverse of id() function """
        return _ctypes.PyObj_FromPtr(obj_id)

    Sample usage:


    import di

    class Debug:
        @classmethod
        def show(self, *val_names):
            print "",
            print " ".join([m + ":" + str(di.di(id(eval(m)))) for m in val_names])

    if __name__ == '__main__':
        total = 0
        for n in range(10):
            total += n
        Debug.show("n", "total") # n:9 total:45

  18. Chris Barker says:

    Hey Bill,

    Regardless of those thoughts about how dangerous this is, I think it’s a very handy tool for debugging and exploring the ins and outs of CPython’s reference counting, etc.

    In fact, I found this because I am doing just that, and was looking for a way to get the reference count of an object caught in a circular reference that no longer has a reference, i.e.:

    l1 = [1,]
    l2 = [2,]
    l1.append(l2)
    l2.append(l1)

    This is a circular reference that will not get cleaned up by the reference counting system. But when I do:

    del l1, l2

    I no longer have a name for the objects, so I can’t check their reference counts with sys.getrefcount().

    Anyway, I liked this, so I grabbed it and added a function that returns the reference count of an object by its id. Of course, it crashes hard if you pass in an invalid id — but them’s the breaks — this is only intended to be a testing / debugging / exploring tool….

    You can find my version at:

    https://github.com/PythonCHB/di_refcount
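A refcount-by-id lookup along these lines can be sketched with ctypes alone. This is not the code from the repository above; refcount_by_id and cycle_refcounts are hypothetical names, and the sketch assumes a standard CPython build in which the reference count is the first field of every object (debug and free-threaded builds lay objects out differently).

```python
import ctypes
import gc

def refcount_by_id(obj_id):
    """Read the reference count of the live object at address obj_id.

    Assumption: a standard CPython build, where the count is stored at the
    start of the object. An invalid id reads arbitrary memory.
    """
    return ctypes.c_ssize_t.from_address(obj_id).value

def cycle_refcounts():
    """Build the two-list cycle, drop the names, and read both counts."""
    gc.disable()  # keep the cycle collector from freeing the lists mid-demo
    try:
        l1 = [1]
        l2 = [2]
        l1.append(l2)
        l2.append(l1)
        ids = (id(l1), id(l2))
        del l1, l2  # the names are gone, but each list still holds the other
        counts = tuple(refcount_by_id(i) for i in ids)
    finally:
        gc.enable()
        gc.collect()  # now let the collector reclaim the cycle
    return counts

print(cycle_refcounts())
```

The garbage collector is switched off only so that the addresses stay valid while they are read; once it runs, the two ids are dangling and must not be used again.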

  19. bbum says:

    Nice addition, Chris! Thank you.

  20. » Python:Python: Get object by id says:

    […] mentioning this module for completeness. This code by Bill Bumgarner includes a C extension to do what you want without looping throughout every object in […]
