Channel: Planet Plone - Where Developers And Integrators Write

RedTurtle: Play these (Python) strings until my fingers are raw

In one project I had to subclass the Python string type (namely str) in order to get some additional features.

Why did I decide to do that?

Because I needed something:

  • supporting almost all the methods of the standard strings
  • with some custom attributes, additional methods
  • that could be compared and mixed with strings.

I had almost no choice. But subclassing str is a task that should be handled with special care, because str is a so-called immutable type.

I will show how to achieve this with a couple of examples.

Example 1: a lowercase string

Let's consider a simple use case that is helpful in many circumstances: the implementation of a "lowercase string" type.

To create such an object, a developer might write something like this:

class BrokenLowerCaseString(str):
    ''' This is going to fail!'''

    def __init__(self, value):
        ''' Return a string instance'''
        value = str(value).lower()
        str.__init__(self, value)

This code is going to silently fail in Python 2:

>>> BrokenLowerCaseString('Alice')
'Alice'

Even if the code runs smoothly, the string case is not lowered at all.

In Python 3 it will not even run:

>>> BrokenLowerCaseString('Alice')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test.py", line 27, in __init__
    str.__init__(self, value)
TypeError: object.__init__() takes no parameters

The right way to implement this type is to override the __new__ method instead of __init__.

This is generally true for all the immutable types [1].

class LowerCaseString(str):
    ''' Provides an object that is like a string
    but that will always be converted to lowercase'''

    def __new__(cls, value):
        ''' Return a string instance'''
        value = str(value).lower()
        return str.__new__(cls, value)

This time we got the expected result:

>>> LowerCaseString('Alice')
'alice'

This latter class works because the __new__ method returns a new instance of a string object created from an already lowercased string. The __init__ method, instead, tries to modify an immutable instance that has already been created.
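The order in which the two methods run can be made visible with a small sketch. TracedLowerCaseString, its calls log and its original attribute are hypothetical names introduced only for this illustration, and the snippet assumes Python 3:

```python
class TracedLowerCaseString(str):
    ''' Hypothetical variant of LowerCaseString that records which hooks run'''

    calls = []  # class-level log, used only for this illustration

    def __new__(cls, value):
        cls.calls.append('__new__')
        # The content is fixed here, while the instance is being created
        return str.__new__(cls, str(value).lower())

    def __init__(self, value):
        # The string content already exists and cannot be changed any more;
        # __init__ still receives the original, unmodified argument
        self.calls.append('__init__')
        self.original = value


name = TracedLowerCaseString('Alice')
print(name)                         # alice
print(name.original)                # Alice
print(TracedLowerCaseString.calls)  # ['__new__', '__init__']
```

By the time __init__ runs, the content of the string has already been decided, so __init__ can only attach extra attributes to the instance.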

Once we have got this concept we can give more "superpowers" to our subclassed types.
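One caveat before adding more features: the methods inherited from str return plain str objects, not instances of the subclass. The sketch below assumes Python 3 and repeats a compact LowerCaseString so that it runs on its own; the variable names are only illustrative:

```python
class LowerCaseString(str):
    ''' Compact copy of the class above, repeated to keep the sketch runnable'''

    def __new__(cls, value):
        return str.__new__(cls, str(value).lower())


name = LowerCaseString('Alice')

# Concatenation is inherited from str, so the result is a plain str
# and the lowercase guarantee is lost
mixed = name + ' BURTON'
print(type(mixed).__name__)  # str
print(mixed)                 # alice BURTON

# Wrapping the result again restores the behaviour
wrapped = LowerCaseString(mixed)
print(wrapped)               # alice burton
```

If the result of an operation has to keep the custom behaviour, it must be wrapped again or the relevant method has to be overridden.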

Example 2: an email string

The next example shows a simple Email type, a string with:

  • a constraint
  • new attributes
  • a property.

class Email(str):
    ''' Provides an object that is like a string
    but with additional attributes'''

    @staticmethod
    def _is_valid(value):
        ''' Very simple validation'''
        return '@' in str(value)

    def __new__(cls, value, firstname='', lastname=''):
        ''' Return a string instance'''
        if not cls._is_valid(value):
            raise ValueError(value)
        return str.__new__(cls, value)

    def __init__(self, value, firstname='', lastname=''):
        ''' Add some attributes to the instance'''
        self.firstname = str(firstname)
        self.lastname = str(lastname)

    @property
    def fullname(self):
        ''' This property returns the full name attached to the string'''
        return " ".join((self.firstname, self.lastname))

The static method _is_valid accepts any object whose string representation contains a '@', so we can pass any type of object to the constructor:

>>> Email(['@'])
"['@']"

Of course the validator could be improved, but for this post it is enough that it raises an error on invalid strings:

>>> Email('Sample string')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test.py", line 114, in __new__
    raise ValueError(value)
ValueError: Sample string

I will now construct an Email instance:

>>> alice_email = Email('alice.burton@example.com', 'Alice', 'Burton')

This instance is also an instance of a string:

>>> isinstance(alice_email, str)
True

Email instances have a property called fullname:

>>> alice_email.fullname
'Alice Burton'

And the additional attributes can be modified:

>>> alice_email.lastname = 'Cooper'
>>> alice_email.fullname
'Alice Cooper'

There are some things to bear in mind when subclassing a string in this way. The subclassed object compares equal to a plain string with the same content:

>>> alice_email == 'alice.burton@example.com'
True

The firstname and lastname attributes, in fact, are not taken into account during comparison:

>>> alice_email_noname = Email(u'alice.burton@example.com')
>>> alice_email_noname.fullname == alice_email.fullname
False
>>> alice_email_noname == alice_email
True

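If you also want to compare the custom attributes, you should implement a custom __cmp__ method [2]. Note that __cmp__ exists only in Python 2; in Python 3 you would override __eq__ (together with __ne__ and __hash__) instead. This is generally true when subclassing.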
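As a minimal sketch of such a comparison for Python 3, where __cmp__ is not available, the class below overrides __eq__, __ne__ and __hash__. ComparableEmail and its _key helper are hypothetical names, not part of the original code, and validation is left out for brevity:

```python
class ComparableEmail(str):
    ''' Hypothetical Email variant whose equality also checks the names'''

    def __new__(cls, value, firstname='', lastname=''):
        return str.__new__(cls, value)

    def __init__(self, value, firstname='', lastname=''):
        self.firstname = str(firstname)
        self.lastname = str(lastname)

    def _key(self):
        ''' Everything that should take part in the comparison'''
        return (str(self), self.firstname, self.lastname)

    def __eq__(self, other):
        if isinstance(other, ComparableEmail):
            return self._key() == other._key()
        # Plain strings are still compared by content only
        return str.__eq__(self, other)

    def __ne__(self, other):
        # str defines its own __ne__, so it has to be overridden as well
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ disables the inherited hash in Python 3;
    # hashing by content keeps equal objects hashing equally
    __hash__ = str.__hash__


alice = ComparableEmail('alice.burton@example.com', 'Alice', 'Burton')
noname = ComparableEmail('alice.burton@example.com')
print(alice == noname)                      # False
print(alice != noname)                      # True
print(alice == 'alice.burton@example.com')  # True
```

Whether two addresses with different names should really count as different is a design decision; this sketch only shows where that decision would be implemented.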

Conclusions and prospects

Subclassing strings (and other immutable types) has to be done in a peculiar way, but when you have to do it, it can give you a lot of power and functionality with very little code. In the next post I will show you production code, released on GitHub and PyPI, in which the same technique applied to the int type leads to a very elegant and simple solution for a complex problem.

Footnotes

[1] See this document for further details: http://python-history.blogspot.it/2010/06/inside-story-on-new-style-classes.html
[2] See this in the Python data model documentation: https://docs.python.org/2/reference/datamodel.html#object.__cmp__

Credits

The picture of David Gilmour playing strings is taken from Wikimedia.

The post title is inspired by the song "What Do You Want from Me" from the album The Division Bell.
