Quantcast
Channel: Planet Plone - Where Developers And Integrators Write
Viewing all articles
Browse latest Browse all 3535

Lennart Regebro: Python: Implementing rich comparison the correct way.

$
0
0

Since quite some time the so called “rich comparison methods”, ie __lt__, __le__, __eq__, __ge__, __gt__ and __ne__, has been around in Python and preferred to the old __cmp__, and in Python 3 __cmp__ is gone, so you must use the rich methods. That’s six methods instead of one, so most people try to avoid doing that, and use some mixin or other technique to only implement one. But that is surprisingly tricky, and is in fact so tricky that the official Python documentation points to a recipe that is incorrect. Even worse, this incorrect recipe has been included into Python 2.7! (If I knew this was planned, I would have written this post earlier).

The recipe in question is the “Total Ordering” recipe, in Python 2.7 included as functools.total_ordering. And the fault it makes is that it flips the comparison around. If the class you implement doesn’t implement the greater than method, __gt__ it will add a method to your class that asks the other object to do a less than comparison. In practice it does this:

__gt__ = lambda self, other: other < self

That sounds extremely reasonable. The problem is that your comparison methods can indicate that they don’t know how to compare self to the other object. They do this by returning NotImplemented. Python will then ask the other object. In other words,  Python will do exactly this “reversal” of comparisons all by itself. What happens then if you use the total_ordering deocrator, and then compare self to an object that doesn’t know how to compare itself to you?

  1. You ask a if it is less than b.
  2. The total_ordering methods will ask b if it is greater than a.
  3. b will return NotImplemented, hence telling Python that is does not know how to compare itself to a.
  4. Python will then ask a if it is less than b.
  5. And you got infinite recursion!

Here is code that will demonstrate this in Python 2.7:

>>> class MinimalOrdering(object):
...    
...     def __init__(self, value):
...         self._value = value
...        
...     def __lt__(self, other):
...         try:
...             return self._value < other._value
...         except AttributeError:
...             # I don't know how to compare to other
...             return NotImplemented
...
>>> from functools import total_ordering
>>> @total_ordering
... class TotalOrder(object):
...    
...     def __init__(self, value):
...         self.value = value
...        
...     def __lt__(self, other):
...         self.value < other.value
...
>>> total = TotalOrder(1)
>>> mini = MinimalOrdering(2)
>>> total > mini
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/opt/python27/lib/python2.7/functools.py", line 56, in <lambda>
 '__lt__': [('__gt__', lambda self, other: other < self),
....waaaaaayyyyyyy later....
File "/opt/python27/lib/python2.7/functools.py", line 56, in <lambda>
 '__lt__': [('__gt__', lambda self, other: other < self),
RuntimeError: maximum recursion depth exceeded

In short: never, ever in your rich comparison methods, ask the other object to do the comparison.

Here is the recipe I use instead. (Or well, I don’t, I used something more complicated, but a comment from Martin von Löwis started a process in my head that ended up in this, much simplified version. So this is the recipe I *will* use, in the future). It’s a mixin. If you are allergic to multiple inheritance you can make it into a class decorator instead, but I like mixins.

class ComparableMixin(object):
    def _compare(self, other, method):
        try:
            return method(self._cmpkey(), other._cmpkey())
        except (AttributeError, TypeError):
            # _cmpkey not implemented, or return different type,
            # so I can't compare with "other".
            return NotImplemented

    def __lt__(self, other):
        return self._compare(other, lambda s,o: s < o)

    def __le__(self, other):
        return self._compare(other, lambda s,o: s <= o)

    def __eq__(self, other):
       return self._compare(other, lambda s,o: s == o)

    def __ge__(self, other):
        return self._compare(other, lambda s,o: s >= o)

    def __gt__(self, other):
        return self._compare(other, lambda s,o: s > o)

    def __ne__(self, other):
        return self._compare(other, lambda s,o: s != o)

Not so difficult. Note how instead of delegating to the other object, I delegate to a _cmpkey() method (that was Martins brilliant idea) and if the other object does not have a _cmpkey() method, or returns a different type from what your class does, it will return NotImplemented, and Python will then delegate to the other object. This is how it’s designed to work.

You use this by subclassing from the mixin and implementing the _cmpkey() method:

class Comparable(ComparableMixin):
    def __init__(self, value):
        self.value = value

    def _cmpkey(self):
        return self.value

Which is simple enough. You don’t even need to implement one comparison method, you just return a key that can be compared. Suddenly all need to think when implementing comparison is gone. :-)

Feedback on this is welcome!


Filed under: plone, python, python3 Tagged: comparison, total ordering

Viewing all articles
Browse latest Browse all 3535

Trending Articles