This chapter continues from chapter 1 and dives deeper into creating more Pythonic objects.

Object Representations

Python has two standard ways of getting a string representation of an object:

  1. repr() -> __repr__
  2. str() -> __str__

There are two additional special methods for alternative representations: __bytes__ (called by bytes()) and __format__ (called by format() and str.format()).
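As a quick illustration of how these representations differ, here is a minimal sketch using datetime.date from the standard library (the choice of type is just an assumption made for the example; any type that defines these methods would do):

```python
import datetime

d = datetime.date(2020, 1, 31)

# repr() aims at developers: ideally valid source code for the object
as_repr = repr(d)

# str() aims at end users: a readable display form
as_str = str(d)

# format() passes the format spec on to the type's __format__ method;
# for dates, the spec is interpreted as a strftime pattern
as_formatted = format(d, '%d/%m/%Y')

print(as_repr)
print(as_str)
print(as_formatted)
```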

Let's understand these better through a running example, the Vector2d class. It represents vectors in the Euclidean plane.

from array import array
import math

class Vector2d:
    typecode = 'd'  # this is a class attribute we use to convert Vector2d
                    # instances to/from bytes
    
    def __init__(self, x, y):
        self.x = float(x)  # catches errors early
        self.y = float(y)
        
    def __iter__(self):  # this makes `x, y = my_vector` work
        return (i for i in (self.x, self.y))
    
    def __repr__(self):
        class_name = type(self).__name__
        return '{}({!r}, {!r})'.format(class_name, *self)
    
    def __str__(self):
        return str(tuple(self))
    
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(array(self.typecode, self)))
    
    def __eq__(self, other):
        return tuple(self) == tuple(other)
    
    def __abs__(self):
        return math.hypot(self.x, self.y)
    
    def __bool__(self):
        return bool(abs(self))
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # first byte holds the typecode
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(*memv)
v1 = Vector2d(3, 4)
print(v1.x, v1.y)
3.0 4.0
x, y = v1
x, y
(3.0, 4.0)
v1
Vector2d(3.0, 4.0)
v1_clone = eval(repr(v1))
v1_clone == v1, repr(v1)
(True, 'Vector2d(3.0, 4.0)')
print(v1)
(3.0, 4.0)
octets = bytes(v1)
octets
b'd\x00\x00\x00\x00\x00\x00\x08@\x00\x00\x00\x00\x00\x00\x10@'
abs(v1)
5.0
bool(v1), bool(Vector2d(0, 0))
(True, False)

We have implemented all the basic methods. The one operation not demonstrated above is rebuilding a Vector2d from its binary representation, which is what the frombytes classmethod does.
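The round trip through bytes can be sketched as follows. This is a trimmed-down copy of the class, keeping only the methods the round trip needs; the variable name v1_restored is made up for the example.

```python
from array import array


class Vector2d:
    typecode = 'd'

    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def __iter__(self):
        return iter((self.x, self.y))

    def __eq__(self, other):
        return tuple(self) == tuple(other)

    def __bytes__(self):
        # one byte for the typecode, then the packed components
        return (bytes([ord(self.typecode)]) +
                bytes(array(self.typecode, self)))

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])                     # read the typecode back
        memv = memoryview(octets[1:]).cast(typecode)  # view the rest as floats
        return cls(*memv)                             # build a new instance


v1 = Vector2d(3, 4)
octets = bytes(v1)
v1_restored = Vector2d.frombytes(octets)
print(v1_restored == v1)
```

Because frombytes receives cls rather than a hard-coded class name, a subclass calling it gets an instance of the subclass.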

The classmethod decorator defines a method that operates on the class rather than on instances: it receives the class itself as its first argument. Its most common use is for alternative constructors.

staticmethod changes the method so that it receives no special first argument. It is a plain function that happens to live in the class body.

class Demo:
    
    @classmethod
    def klassmeth(*args):
        return args
    
    @staticmethod
    def statmeth(*args):
        return args
Demo.klassmeth()
(__main__.Demo,)
Demo.klassmeth('spam')
(__main__.Demo, 'spam')
Demo.statmeth()
()
Demo.statmeth('spam')
('spam',)

Formatted Displays

The format() built-in function and the str.format() method delegate the actual formatting to each type by calling its .__format__(format_spec) method. The format_spec is a formatting specifier, which can be either

  1. The second argument in format(my_obj, format_spec) or
  2. Whatever appears after the colon in a replacement field delimited with {} inside a format string used with str.format()
brl = 1/2.43
brl
0.4115226337448559
format(brl, '0.4f')
'0.4115'
'1 BRL = {rate:0.2f} USD'.format(rate=brl)
'1 BRL = 0.41 USD'

There is a lot more to the format mini-language and the format() function, but let's look at an example of what we want to build. Ideally, Vector2d should work like this:

>>> v1 = Vector2d(3, 4)
>>> format(v1)
'(3.0, 4.0)'
>>> format(v1, '.2f')
'(3.00, 4.00)'
>>> format(v1, '.3e')
'(3.000e+00, 4.000e+00)'

Additionally, it would be great to implement a polar format for the vector too: a format specifier ending in 'p' shows the magnitude and angle between angle brackets.

from array import array
import math

class Vector2d:
    typecode = 'd'  # this is a class attribute we use to convert Vector2d
                    # instances to/from bytes
    
    def __init__(self, x, y):
        self.x = float(x)  # catches errors early
        self.y = float(y)
        
    def __iter__(self):  # this makes `x, y = my_vector` work
        return (i for i in (self.x, self.y))
    
    def __repr__(self):
        class_name = type(self).__name__
        return '{}({!r}, {!r})'.format(class_name, *self)
    
    def __str__(self):
        return str(tuple(self))
    
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(array(self.typecode, self)))
    
    def __eq__(self, other):
        return tuple(self) == tuple(other)
    
    def __abs__(self):
        return math.hypot(self.x, self.y)
    
    def __bool__(self):
        return bool(abs(self))
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # first byte holds the typecode
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(*memv)
    
    def angle(self):
        return math.atan2(self.y, self.x)

    def __format__(self, fmt_spec=''):
        if fmt_spec.endswith('p'):
            fmt_spec = fmt_spec[:-1]
            coords = (abs(self), self.angle())
            outer_fmt = '<{}, {}>'
        else:
            coords = self
            outer_fmt = '({}, {})'
        components = (format(c, fmt_spec) for c in coords)
        return outer_fmt.format(*components)
format(Vector2d(1, 1), 'p')
'<1.4142135623730951, 0.7853981633974483>'
v = Vector2d(1, 1)
f'{v:p}'
'<1.4142135623730951, 0.7853981633974483>'
format(Vector2d(1, 1), '.3ep'), format(Vector2d(1, 1), '0.5f')
('<1.414e+00, 7.854e-01>', '(1.00000, 1.00000)')

A Hashable Vector2d

Right now our Vector2d object is not hashable, so it can't be used in a set or as a dict key. To allow that, we have to make it hashable.

Quick recap: to make an object hashable we have to

  1. Implement __eq__ (we have that)
  2. Implement __hash__
  3. Make it effectively immutable, so its hash never changes during its lifetime
v1 = Vector2d(3, 4)
hash(v1)

TypeError                                 Traceback (most recent call last)
<ipython-input-36-293f2e233a24> in <module>
      1 v1 = Vector2d(3, 4)
----> 2 hash(v1)

TypeError: unhashable type: 'Vector2d'
set([v1])

TypeError                                 Traceback (most recent call last)
<ipython-input-37-bc2432ceb71f> in <module>
----> 1 set([v1])

TypeError: unhashable type: 'Vector2d'
from array import array
import math

class Vector2d:
    typecode = 'd'
    
    def __init__(self, x, y): # makes x and y private
        self.__x = float(x)
        self.__y = float(y)
        
    @property
    def x(self):
        return self.__x
    
    @property
    def y(self):
        return self.__y
        
    def __iter__(self):
        return (i for i in (self.x, self.y))
    
    def __repr__(self):
        class_name = type(self).__name__
        return '{}({!r}, {!r})'.format(class_name, *self)
    
    def __str__(self):
        return str(tuple(self))
    
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(array(self.typecode, self)))
    
    def __eq__(self, other):
        return tuple(self) == tuple(other)
    
    def __abs__(self):
        return math.hypot(self.x, self.y)
    
    def __bool__(self):
        return bool(abs(self))
    
    # hashes of individual attributes are joined with XOR
    def __hash__(self):
        return hash(self.x) ^ hash(self.y)
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # first byte holds the typecode
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(*memv)
v1 = Vector2d(3, 4)
v1.x
3.0
v1.x = 4

AttributeError                            Traceback (most recent call last)
<ipython-input-45-13b35e4a2103> in <module>
----> 1 v1.x = 4

AttributeError: can't set attribute
dir(v1)
['_Vector2d__x',
 '_Vector2d__y',
 '__abs__',
 '__bool__',
 '__bytes__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'frombytes',
 'typecode',
 'x',
 'y']
v1 = Vector2d(3, 4)
v2 = Vector2d(3.1, 4.2)
hash(v1), hash(v2)
(7, 384307168202284039)
set([v1, v2])
{Vector2d(3.0, 4.0), Vector2d(3.1, 4.2)}

The __hash__ method should return an int and ideally take into account the hashes of the object attributes that are also used in the __eq__ method, because objects that compare equal should have the same hash. The __hash__ special method documentation suggests using the bitwise XOR operator (^) to mix the hashes of the components, so that's what we do.
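The relationship between __eq__ and __hash__ can be checked with a small sketch. Point is a hypothetical stand-in for Vector2d, reduced to the two methods that matter here. It also shows one property of XOR worth knowing: it is symmetric, so swapping the components gives the same hash. That is allowed (unequal objects may share a hash), but many such collisions would slow down sets and dicts.

```python
class Point:
    """Hypothetical two-component value type, for illustration only."""

    def __init__(self, x, y):
        self._x = float(x)
        self._y = float(y)

    def __eq__(self, other):
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # mix the component hashes with XOR
        return hash(self._x) ^ hash(self._y)


a = Point(3, 4)
b = Point(3.0, 4.0)
c = Point(4, 3)

# objects that compare equal must have equal hashes
equal_objects_equal_hashes = (a == b) and (hash(a) == hash(b))

# XOR is symmetric: a and c differ, yet their hashes collide
swapped_collides = (a != c) and (hash(a) == hash(c))

print(equal_objects_equal_hashes, swapped_collides)
```

If order sensitivity matters, one common alternative is to hash a tuple of the components, for example `hash((self._x, self._y))`.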

from array import array
import math

class Vector2d:
    typecode = 'd'  # this is a class attribute we use to convert Vector2d
                    # instances to/from bytes
    
    def __init__(self, x, y): # makes x and y private
        self.__x = float(x)
        self.__y = float(y)
        
    @property
    def x(self):
        return self.__x
    
    @property
    def y(self):
        return self.__y
        
    def __iter__(self):  # this makes `x, y = my_vector` work
        return (i for i in (self.x, self.y))
    
    def __repr__(self):
        class_name = type(self).__name__
        return '{}({!r}, {!r})'.format(class_name, *self)
    
    def __str__(self):
        return str(tuple(self))
    
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(array(self.typecode, self)))
    
    def __eq__(self, other):
        return tuple(self) == tuple(other)
    
    def __abs__(self):
        return math.hypot(self.x, self.y)
    
    def __bool__(self):
        return bool(abs(self))
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # first byte holds the typecode
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(*memv)
    
    def angle(self):
        return math.atan2(self.y, self.x)

    def __format__(self, fmt_spec=''):
        if fmt_spec.endswith('p'):
            fmt_spec = fmt_spec[:-1]
            coords = (abs(self), self.angle())
            outer_fmt = '<{}, {}>'
        else:
            coords = self
            outer_fmt = '({}, {})'
        components = (format(c, fmt_spec) for c in coords)
        return outer_fmt.format(*components)
    
    # hashes of individual attributes are joined with XOR
    def __hash__(self):
        return hash(self.x) ^ hash(self.y)

You don't have to implement methods you won't use just to make a class more Pythonic, but hopefully you now know what is possible.

Private and "Protected" Attributes in Python

Python has no explicit way to create private variables. Instead it uses a process called 'name mangling'. If we want to create a private attribute, we prefix its name with two underscores. Python stores the name in the instance __dict__ prefixed with a leading underscore and the class name. So in the above example __x becomes _Vector2d__x.

Note that this is about safety, not security. You can still access the attribute if you want to; nothing stops you from doing it.

Not all Pythonistas like this convention. Many prefer a single leading underscore, as in _varname, to signify a "protected" attribute. This only signals that the attribute should not be used directly from outside the class.

v1 = Vector2d(3, 4)
v1.__dict__
{'_Vector2d__x': 3.0, '_Vector2d__y': 4.0}
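A small sketch of name mangling in action, using a hypothetical Account class (not part of the chapter's running example). It shows that the double-underscore name is not reachable from outside the class under its original name, but remains reachable under its mangled name:

```python
class Account:
    """Hypothetical class used only to demonstrate name mangling."""

    def __init__(self, balance):
        self.__balance = balance  # stored as _Account__balance

    def balance(self):
        return self.__balance     # mangled the same way inside the class


acct = Account(100)

# the instance __dict__ only knows the mangled name
stored_names = list(acct.__dict__)

# outside the class body no mangling happens, so this lookup fails
try:
    acct.__balance
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# the mangled name is still reachable: safety, not security
via_mangled_name = acct._Account__balance

print(stored_names, direct_access_failed, via_mangled_name)
```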

__slots__ Class Attribute

This is a special attribute that can affect the internal storage of the object. By default, Python stores instance attributes in a per-instance dict named __dict__. As we saw in “Practical Consequences of How dict Works”, dictionaries have a significant memory overhead because of the underlying hash table used to provide fast access. If you are dealing with millions of instances with few attributes, the __slots__ class attribute can save a lot of memory, by letting the interpreter store the instance attributes in a tuple-like structure instead of a dict.

If we were to modify our Vector2d example to use __slots__, this is how it would look.

class Vector2d:
    __slots__ = ('__x', '__y')
    typecode = 'd'
    
    # methods follow (omitted in book listing)

But __slots__ has a few caveats:

  • You must remember to redeclare __slots__ in each subclass, because the inherited attribute is ignored by the interpreter.
  • Instances will only be able to have the attributes listed in __slots__, unless you include '__dict__' in __slots__ (but doing so may negate the memory savings).
  • Instances cannot be targets of weak references unless you remember to include '__weakref__' in __slots__.
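The caveats above can be observed with a short sketch. Pixel and OpenPixel are hypothetical classes invented for this example; the public slot names x and y are used instead of private ones to keep the demonstration short.

```python
import weakref


class Pixel:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class OpenPixel(Pixel):
    pass  # no __slots__ here, so instances get a __dict__ again


p = Pixel(1, 2)

# only the names listed in __slots__ can be set
try:
    p.color = 'red'
    extra_attr_allowed = True
except AttributeError:
    extra_attr_allowed = False

# there is no per-instance dict at all
has_dict = hasattr(p, '__dict__')

# without '__weakref__' in __slots__, weak references are refused
try:
    weakref.ref(p)
    weakref_allowed = True
except TypeError:
    weakref_allowed = False

# the subclass did not redeclare __slots__, so the restriction is gone
op = OpenPixel(1, 2)
op.color = 'red'
subclass_dict = op.__dict__

print(extra_attr_allowed, has_dict, weakref_allowed, subclass_dict)
```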

Overriding Class Attributes

Class attributes (like typecode in Vector2d) can effectively be used as default values for instance attributes.

You can also override a class attribute on a per-instance basis. E.g., in the Vector2d case, typecode specifies how the instance should be exported to bytes. 'd' means an 8-byte double-precision float, but we can change it for a particular instance to 'f' (4-byte single precision).

v1 = Vector2d(1.1, 2.2)
dumpd = bytes(v1)
dumpd, len(dumpd)
(b'd\x9a\x99\x99\x99\x99\x99\xf1?\x9a\x99\x99\x99\x99\x99\x01@', 17)
v1.typecode = 'f'
dumpd = bytes(v1)
dumpd, len(dumpd)
(b'f\xcd\xcc\x8c?\xcd\xcc\x0c@', 9)

Here Python creates a new instance attribute named typecode, but the original class attribute is left untouched. From then on, any reference to typecode on that instance resolves to the instance attribute. In effect, the instance attribute shadows the class attribute.

If you want to change it at the class level, the best way is to subclass, like this:
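The shadowing behaviour is easy to inspect directly. The sketch below uses a hypothetical Config class with a retries attribute (both names are made up for the example) and shows where each value actually lives:

```python
class Config:
    """Hypothetical class: retries acts as a default for all instances."""
    retries = 3


a = Config()
b = Config()

# assigning through the instance creates an instance attribute
a.retries = 5

a_instance_dict = dict(a.__dict__)  # the override lives on the instance
b_value = b.retries                 # other instances still see the default
class_value = Config.retries        # the class attribute is untouched

# deleting the instance attribute uncovers the class attribute again
del a.retries
a_after_delete = a.retries

print(a_instance_dict, b_value, class_value, a_after_delete)
```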

class ShortVector2d(Vector2d):
    typecode = 'f'
sv = ShortVector2d(1/11, 1/27)
sv, len(bytes(sv))
(ShortVector2d(0.09090909090909091, 0.037037037037037035), 9)