Decorators 101

A decorator is a callable that takes another function as its argument (the decorated function). It may process that function and then return it, or replace it with another function or callable object, as follows.

@decorator
def target():
    print('running target()')

# has the same effect as

def target():
    print('running target()')
    
target = decorator(target)

The first crucial fact about decorators is that they have the power to replace the decorated function with a different one.
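As a minimal sketch of this first fact (the names deco and inner are made up for illustration), the decorator below ignores the behaviour of the function it receives and returns a different one:

```python
def deco(func):
    # Discard the decorated function's behaviour and return another callable.
    def inner():
        return 'running inner()'
    return inner


@deco
def target():
    return 'running target()'


# The name target is now bound to inner, not to the original function.
print(target())           # running inner()
print(target.__name__)    # inner
```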

The second crucial fact is that they are executed immediately when a module is loaded. The decorator runs right after the decorated function is defined, which is usually at import time.

registry = []

def register(func):
    print('running register (%s)'%func)
    registry.append(func)
    return func

@register
def f1():
    print('running f1()')
    
@register
def f2():
    print('running f2()')
    
def f3():
    print('running f3()')
    
def main():
    print('running main()')
    print('register -> ', registry)
    f1()
    f2()
    f3()
running register (<function f1 at 0x7f1f2c435670>)
running register (<function f2 at 0x7f1f2c35b1f0>)
main()
running main()
register ->  [<function f1 at 0x7f1f2c435670>, <function f2 at 0x7f1f2c35b1f0>]
running f1()
running f2()
running f3()

This emphasizes the point that function decorators are executed as soon as the module is imported, but the decorated functions run only when they are explicitly called. This is the difference between what Pythonistas call import time and runtime.

Decorator-Enhanced Strategy Pattern

Decorators offer a neat way of implementing the best_promo functionality from the previous chapter. Here we use a decorator to register each promotion function, which minimises code duplication: no promotion has to be added by hand to a separate list.

promos = []

def promotion(promo_func):
    promos.append(promo_func)
    return promo_func

@promotion
def fidelity(order):
    """5% discount for customers with 1000 or more fidelity points"""
    return order.total() * .05 if order.customer.fidelity >= 1000 else 0

@promotion
def bulk_item(order):
    """10% discount for each LineItem with 20 or more units"""
    discount = 0
    for item in order.cart:
        if item.quantity >= 20:
            discount += item.total() * .1
    return discount

@promotion
def large_order(order):
    """7% discount for orders with 10 or more distinct items"""
    distinct_items = {item.product for item in order.cart}
    if len(distinct_items) >= 10:
        return order.total() * .07
    return 0

def best_promo(order):
    """Select best discount available
    """
    return max(promo(order) for promo in promos)
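To see the registration at work, here is a self-contained sketch. The Customer, LineItem and Order classes are hypothetical stand-ins for the ones from the previous chapter, reduced to what the promotions need, and only two of the promotions are repeated to keep it short.

```python
from collections import namedtuple

# Hypothetical stand-ins for the classes from the previous chapter.
Customer = namedtuple('Customer', 'name fidelity')


class LineItem:
    def __init__(self, product, quantity, price):
        self.product = product
        self.quantity = quantity
        self.price = price

    def total(self):
        return self.price * self.quantity


class Order:
    def __init__(self, customer, cart):
        self.customer = customer
        self.cart = list(cart)

    def total(self):
        return sum(item.total() for item in self.cart)


promos = []


def promotion(promo_func):
    # Register the promotion and return it unchanged.
    promos.append(promo_func)
    return promo_func


@promotion
def fidelity(order):
    """5% discount for customers with 1000 or more fidelity points"""
    return order.total() * .05 if order.customer.fidelity >= 1000 else 0


@promotion
def bulk_item(order):
    """10% discount for each LineItem with 20 or more units"""
    return sum(item.total() * .1 for item in order.cart if item.quantity >= 20)


def best_promo(order):
    """Select best discount available"""
    return max(promo(order) for promo in promos)


ann = Customer('Ann Smith', 1100)
order = Order(ann, [LineItem('banana', 30, 1), LineItem('apple', 10, 2)])

print([p.__name__ for p in promos])   # ['fidelity', 'bulk_item']
print(best_promo(order))              # the bulk_item discount wins for this order
```

Adding a new promotion only requires writing a function and decorating it with @promotion; best_promo picks it up without any other change.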

In the examples above the decorators return the same function they receive, but most decorators do change the function. This is usually done by defining an inner function and returning it in place of the decorated one. To understand how that works, let's look at closures and variable scope rules.

Variable Scope Rules

Below we define a function that reads two variables: a local variable a and a variable b that is not defined anywhere.

def f1(a):
    print(a)
    print(b)
    
f1(4)
4

NameErrorTraceback (most recent call last)
<ipython-input-5-876a9d93f0d6> in <module>
      3     print(b)
      4 
----> 5 f1(4)

<ipython-input-5-876a9d93f0d6> in f1(a)
      1 def f1(a):
      2     print(a)
----> 3     print(b)
      4 
      5 f1(4)

NameError: name 'b' is not defined
# but if we define a global variable b 

b = 10
f1(4)
4
10
def f2(a):
    print(a)
    print(b)
    b = 5
    
f2(10)
10

UnboundLocalErrorTraceback (most recent call last)
<ipython-input-7-44688ef5aa7c> in <module>
      4     b = 5
      5 
----> 6 f2(10)

<ipython-input-7-44688ef5aa7c> in f2(a)
      1 def f2(a):
      2     print(a)
----> 3     print(b)
      4     b = 5
      5 

UnboundLocalError: local variable 'b' referenced before assignment

Why does this happen?

When Python compiles the body of the function, it decides that b is a local variable because it is assigned within the function. The generated bytecode reflects this decision and will try to fetch b from the local environment. Later, when the call f2(10) is made, the body of f2 fetches and prints the value of the local variable a, but when trying to fetch the value of the local variable b it discovers that b is unbound.
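The compile-time decision can also be checked without reading bytecode. The sketch below (function names are made up for illustration) looks at the co_varnames attribute of each code object, which lists the names the compiler treats as local:

```python
b = 10  # module-level (global) variable


def reads_global(a):
    return a + b      # b is never assigned here, so it is looked up as a global


def assigns_local(a):
    try:
        return a + b  # b is local because of the assignment further down
    except UnboundLocalError as exc:
        return type(exc).__name__
    b = 5             # never reached, but it still makes b local at compile time


print('b' in reads_global.__code__.co_varnames)    # False
print('b' in assigns_local.__code__.co_varnames)   # True
print(reads_global(1))                             # 11
print(assigns_local(1))                            # UnboundLocalError
```

Note that the assignment does not even have to run: its mere presence in the function body is enough for the compiler to classify b as local.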

def f3(a):
    # use the global b
    global b
    print(a)
    print(b)
    b = 9
    
f3(3)
3
10
b
9
from dis import dis
dis(f1)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_FAST                0 (a)
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_GLOBAL              0 (print)
             10 LOAD_GLOBAL              1 (b)
             12 CALL_FUNCTION            1
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE
dis(f2)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_FAST                0 (a)
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_GLOBAL              0 (print)
             10 LOAD_FAST                1 (b)
             12 CALL_FUNCTION            1
             14 POP_TOP

  4          16 LOAD_CONST               1 (5)
             18 STORE_FAST               1 (b)
             20 LOAD_CONST               0 (None)
             22 RETURN_VALUE

Closures

A closure is a function with an extended scope that encompasses nonglobal variables referenced in the body of the function but not defined there.

To show this in practice, let's use a closure to build a function avg() that returns the running average of all the values it has been called with so far.

def make_averager():
    series = []
    
    def averager(new_value):
        series.append(new_value)
        total = sum(series)
        return total/len(series)
    
    return averager
avg = make_averager()
avg(10)
10.0
avg(11)
10.5
avg(12)
11.0

Look at the series variable. It is a local variable of make_averager, yet it is accessed inside averager(). By the time avg(10) is called, make_averager has already returned and its local scope is gone, but averager is still able to call series.append().

Within averager, series is what we technically call a free variable: a variable that is not bound in the local scope.

Let's inspect the function created by make_averager.

avg.__code__.co_varnames
('new_value', 'total')
avg.__code__.co_freevars
('series',)

The binding for series is kept in the __closure__ attribute of the returned function avg. Each item in avg.__closure__ corresponds to a name in avg.__code__.co_freevars. These items are cells, and they have an attribute called cell_contents where the actual values can be found.

avg.__code__.co_freevars
('series',)
avg.__closure__
(<cell at 0x7f3a548da890: list object at 0x7f3a548bfe10>,)
avg.__closure__[0].cell_contents
[10, 11, 12]

To summarize: a closure is a function that retains the bindings of the free variables that exist when the function is defined, so that they can be used later when the function is invoked and the defining scope is no longer available.

Note that the only situation in which a function may need to deal with external variables that are nonglobal is when it is nested in another function.
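A short sketch of two consequences of this, reusing make_averager from above (avg_a, avg_b and plain are illustrative names): every call to the outer function creates a fresh binding, and a function defined at module level has no closure at all.

```python
def make_averager():
    series = []

    def averager(new_value):
        series.append(new_value)
        return sum(series) / len(series)

    return averager


avg_a = make_averager()
avg_b = make_averager()

avg_a(10)
avg_a(20)
avg_b(100)

# Each closure keeps its own binding for series.
print(avg_a.__closure__[0].cell_contents)   # [10, 20]
print(avg_b.__closure__[0].cell_contents)   # [100]


# A module-level function has no free variables, hence no closure cells.
def plain(x):
    return x


print(plain.__closure__)                    # None
```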

nonlocal Declaration

Our implementation of make_averager is not optimal: it stores every value and recomputes the sum on each call. A better way would be to store only the running total and the number of values.

Let's first look at a broken implementation of that idea.

def make_average():
    total = 0
    count = 0
    
    def average(new_value):
        total += new_value
        count += 1
        return total/count
    
    return average

avg = make_average()
avg(10)
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-2-27f4f3e80f71> in <module>
     11 
     12 avg = make_average()
---> 13 avg(10)

<ipython-input-2-27f4f3e80f71> in average(new_value)
      4 
      5     def average(new_value):
----> 6         total += new_value
      7         count += 1
      8         return total/count

UnboundLocalError: local variable 'total' referenced before assignment

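Surprise! It errors out. Inside average(), the statement count += 1 means count = count + 1, and count is a number, an immutable type (yes, int is immutable). Because the statement assigns to count in the body of average, count becomes a local variable. The same problem affects total.

We didn't have this problem with series because we never assigned to it: we only called series.append(), taking advantage of the fact that lists are mutable. With immutable types such as numbers, strings, and tuples, all we can do is read the variable, never update it. Rebinding it, as in count = count + 1, implicitly creates a local variable, so it is no longer a free variable and is not saved in the closure.

To work around this, the nonlocal declaration was introduced in Python 3. It marks a variable as a free variable even when it is assigned a new value within the function.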

def make_averager():
    total = 0
    count = 0
    
    def averager(new_value):
        nonlocal count, total
        count += 1
        total += new_value
        return total/count
    
    return averager

avg = make_averager()
avg(10)
10.0
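To confirm that count and total really live in the closure once they are declared nonlocal, the following sketch pairs each name in co_freevars with the contents of its cell (the state dictionary is just an illustrative helper):

```python
def make_averager():
    total = 0
    count = 0

    def averager(new_value):
        nonlocal count, total
        count += 1
        total += new_value
        return total / count

    return averager


avg = make_averager()
avg(10)
avg(11)
result = avg(12)

# Map each free variable name to the current contents of its cell.
state = dict(zip(avg.__code__.co_freevars,
                 (cell.cell_contents for cell in avg.__closure__)))

print(result)           # 11.0
print(state['count'])   # 3
print(state['total'])   # 33
```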

Implementing a Simple Decorator

import time

def clock(func):
    def clocked(*args):
        t0 = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - t0
        name = func.__name__
        arg_str = ', '.join(repr(arg) for arg in args)
        print('[%0.8fs] %s(%s) -> %r'%(elapsed, name, arg_str, result))
        return result
    return clocked
import time

@clock
def snooze(seconds):
    time.sleep(seconds)
    
@clock
def factorial(n):
    """
    Calculates the factorial of n recursively.
    """
    return 1 if n < 2 else n*factorial(n-1)

if __name__=='__main__':
    print('*' * 40, 'Calling snooze(.123)')
    snooze(.123)
    print('*' * 40, 'Calling factorial(6)')
    print('6! =', factorial(6))
**************************************** Calling snooze(.123)
[0.12316652s] snooze(0.123) -> None
**************************************** Calling factorial(6)
[0.00000196s] factorial(1) -> 1
[0.00004121s] factorial(2) -> 2
[0.00026267s] factorial(3) -> 6
[0.00091569s] factorial(4) -> 24
[0.00099629s] factorial(5) -> 120
[0.00102192s] factorial(6) -> 720
6! = 720
factorial.__name__, factorial.__doc__
('clocked', None)

As you can see, the name factorial now points to the function returned by the decorator, and it is that function that does the work. This is evident from the __name__ attribute of factorial, which has changed to "clocked".

Our decorator has a few shortcomings: it does not handle keyword arguments, and it masks the __name__ and __doc__ of the decorated function. We therefore use the functools.wraps decorator to copy the relevant attributes from func to clocked, and accept **kwargs in the inner function.

import time
import functools

def clock(func):
    @functools.wraps(func)
    def clocked(*args, **kwargs):
        t0 = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        name = func.__name__
        arg_str = ', '.join(repr(arg) for arg in args)
        print('[%0.8fs] %s(%s) -> %r'%(elapsed, name, arg_str, result))
        return result
    return clocked

@clock
def factorial(n):
    """
    Calculates the factorial of n recursively.
    """
    return 1 if n < 2 else n*factorial(n-1)
factorial.__name__, factorial.__doc__
('factorial', '\n    Calculates the factorial of n recursively.\n    ')
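The version above accepts keyword arguments but still prints only the positional ones. One possible refinement, shown here as a sketch rather than the definitive implementation, also formats the keyword arguments; the power function is a made-up example.

```python
import functools
import time


def clock(func):
    @functools.wraps(func)
    def clocked(*args, **kwargs):
        t0 = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        # Build the argument list from positional and keyword arguments.
        arg_lst = [repr(arg) for arg in args]
        arg_lst.extend('%s=%r' % (k, v) for k, v in sorted(kwargs.items()))
        arg_str = ', '.join(arg_lst)
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, func.__name__, arg_str, result))
        return result
    return clocked


@clock
def power(base, exponent=2):
    """Raise base to exponent."""
    return base ** exponent


power(3, exponent=4)          # prints a line ending in: power(3, exponent=4) -> 81
print(power.__name__)         # power
print(power.__wrapped__(2))   # 4
```

functools.wraps also stores the original function in the __wrapped__ attribute, which makes it possible to call the undecorated function, as the last line does.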

Two of the most interesting decorators in the standard library are lru_cache and singledispatch.

Memoization with functools.lru_cache

lru_cache implements memoization, i.e. it stores the return values of expensive function calls so that they can be reused when the same arguments come up again. LRU stands for Least Recently Used, meaning that the growth of the cache is limited by discarding the entries that have not been read for a while. Let's see this in action!

@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-2) + fibonacci(n-1)

if __name__=='__main__':
    print(fibonacci(6))
[0.00000188s] fibonacci(0) -> 0
[0.00000524s] fibonacci(1) -> 1
[0.00081197s] fibonacci(2) -> 1
[0.00000195s] fibonacci(1) -> 1
[0.00003227s] fibonacci(0) -> 0
[0.00000245s] fibonacci(1) -> 1
[0.00052311s] fibonacci(2) -> 1
[0.00085437s] fibonacci(3) -> 2
[0.00176118s] fibonacci(4) -> 3
[0.00000189s] fibonacci(1) -> 1
[0.00000182s] fibonacci(0) -> 0
[0.00000189s] fibonacci(1) -> 1
[0.00011098s] fibonacci(2) -> 1
[0.00016867s] fibonacci(3) -> 2
[0.00000140s] fibonacci(0) -> 0
[0.00000188s] fibonacci(1) -> 1
[0.00023110s] fibonacci(2) -> 1
[0.00000140s] fibonacci(1) -> 1
[0.00000140s] fibonacci(0) -> 0
[0.00000147s] fibonacci(1) -> 1
[0.00002528s] fibonacci(2) -> 1
[0.00009254s] fibonacci(3) -> 2
[0.00073033s] fibonacci(4) -> 3
[0.00093189s] fibonacci(5) -> 5
[0.00273435s] fibonacci(6) -> 8
8

As you can see, there are a lot of wasted cycles: fibonacci(1) is called eight times, fibonacci(2) five times, and so on.

import functools 

@functools.lru_cache()
@clock
def fibonacci(n):
    if n < 2:
        return n
    else :
        return fibonacci(n-1) + fibonacci(n-2)
    
fibonacci(6)
[0.00000189s] fibonacci(1) -> 1
[0.00000237s] fibonacci(0) -> 0
[0.00014730s] fibonacci(2) -> 1
[0.00018312s] fibonacci(3) -> 2
[0.00021742s] fibonacci(4) -> 3
[0.00025066s] fibonacci(5) -> 5
[0.00028509s] fibonacci(6) -> 8
8

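Try it with a larger argument to see how much difference it makes (something like 30 already shows a huge gap).

Other than making silly recursive functions faster, lru_cache is also very helpful in applications that fetch information from the web.

lru_cache accepts two optional arguments. maxsize sets how many results are stored before the least recently used ones are discarded, and typed=True stores results for arguments of different types separately. The defaults are:

lru_cache(maxsize=128, typed=False)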
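A quick way to see what the cache is doing is the cache_info() method that lru_cache adds to the decorated function. The sketch below leaves out the clock decorator so that only the cache statistics are printed:

```python
import functools


@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(30))            # 832040
print(fibonacci.cache_info())   # CacheInfo(hits=28, misses=31, maxsize=128, currsize=31)

fibonacci.cache_clear()         # empty the cache and reset the counters
print(fibonacci.cache_info().currsize)   # 0
```

Each value from fibonacci(0) to fibonacci(30) is computed exactly once (31 misses), and every other call is answered from the cache. Because the cache is keyed on the arguments, all arguments passed to the decorated function must be hashable.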

Generic Functions with singledispatch

Now this one is a personal favorite of mine. Let's take an example: say you want to generate HTML displays for different Python objects.

import html

def htmlize(obj):
    content = html.escape(repr(obj))
    return '<pre>{}</pre>'.format(content)
htmlize({1, 2, 4}), htmlize(1)
('<pre>{1, 2, 4}</pre>', '<pre>1</pre>')

What we want to build is an htmlize that renders its output according to the type of the data passed in. Since Python has no function or method overloading, we cannot create a variation of htmlize with a different signature for each type. What we usually do is create a dispatch function with a chain of if/elif/else statements to handle it.

Another way is to use functools.singledispatch. A function decorated with it becomes a generic function: a group of functions that perform the same operation in different ways, depending on the type of the first argument. The example below shows how this is implemented.

from functools import singledispatch
from collections import abc
import numbers
import html

@singledispatch  # this marks the base function that handles the obj type.
def htmlize(obj):
    content = html.escape(repr(obj))
    return '<pre>{}</pre>'.format(content)

@htmlize.register(str)  # each special func is decorated using the base func
def _(text):  # here the name is irrelevant, hence _ is a good choice
    content = html.escape(text).replace('\n', '<br>\n')
    return '<p>{0}</p>'.format(content)

@htmlize.register(numbers.Integral)  # create new funcs for other types
def _(n):
    return '<pre>{0} (0x{0:x})</pre>'.format(n)

@htmlize.register(tuple)
@htmlize.register(abc.MutableSequence)  # stack together different types that support the same func
def _(seq):
    inner = '</li>\n<li>'.join(htmlize(item) for item in seq)
    return '<ul>\n<li>' + inner + '</li>\n</ul>'
print(htmlize('hai'))
<p>hai</p>
print(htmlize(1))
<pre>1 (0x1)</pre>
print(htmlize(('this', 'is', 'awesome')))
<ul>
<li><p>this</p></li>
<li><p>is</p></li>
<li><p>awesome</p></li>
</ul>

Tip: When possible, register the specialized functions to handle ABCs (abstract base classes) such as numbers.Integral and abc.MutableSequence instead of concrete implementations like int and list. This allows your code to support a greater variety of compatible types. For example, a Python extension can provide alternatives to the int type with fixed bit lengths as subclasses of numbers.Integral.

A notable feature is that additional mechanisms to handle some datatypes can be registered from anywhere in the system, in any module.
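As a sketch of that last point, the handler below is registered after the fact using the functional form of register, the way another module importing htmlize could do it. The choice of fractions.Fraction and the name htmlize_fraction are only for illustration.

```python
from functools import singledispatch
import fractions
import html


@singledispatch
def htmlize(obj):
    content = html.escape(repr(obj))
    return '<pre>{}</pre>'.format(content)


# Imagine this part living in a different module that imports htmlize.
def htmlize_fraction(frac):
    return '<pre>{}/{}</pre>'.format(frac.numerator, frac.denominator)


htmlize.register(fractions.Fraction, htmlize_fraction)   # functional form

print(htmlize(fractions.Fraction(3, 4)))   # <pre>3/4</pre>
print(htmlize(3.5))                        # <pre>3.5</pre>
print(htmlize.dispatch(fractions.Fraction) is htmlize_fraction)   # True
```

Types without a registered handler, such as float here, fall back to the base function, and dispatch() reports which implementation would be chosen for a given type.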

Stacked Decorators

When two or more decorators are applied one after another to the same function, they are said to be stacked.

@d1
@d2
def func():
    pass

# becomes ->
func = d1(d2(func))
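The order of application can be made visible with two toy decorators that wrap the result in their own name (a sketch; d1 and d2 here are illustrative implementations):

```python
def d1(func):
    def wrapper():
        return 'd1(' + func() + ')'
    return wrapper


def d2(func):
    def wrapper():
        return 'd2(' + func() + ')'
    return wrapper


@d1
@d2
def func():
    return 'func'


print(func())   # d1(d2(func))
```

The decorator closest to the def line (d2) is applied first, and the one on top (d1) wraps its result.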

Parameterized Decorators

Parameterized decorators are decorators that accept arguments. Let's start from the registration module and build from there.

registry = []

def register(func):
    print('running register(%s)' % func)
    registry.append(func)
    return func

@register
def f1():
    print('running f1()')
    
print('running main()')
print('registry ->', registry)
f1()
running register(<function f1 at 0x7fa0b82a5b90>)
running main()
registry -> [<function f1 at 0x7fa0b82a5b90>]
running f1()

Now let's make it easy to enable or disable the function registration by adding a parameter, active. Conceptually, the new register is not a decorator but a decorator factory. When called, it returns the actual decorator that will be applied to the target function.

registry = set()

def register(active=True):    # this is now a decorator factory
    def decorate(func):      
        print('running register(active=%s)->decorate(%s)'
              % (active, func))
        if active:
            registry.add(func)
        else:
            registry.discard(func)
        return func
    
    return decorate

@register(active=False)
def f1():
    print('running f1()')
    
@register()
def f2():
    print('running f2()')
    
def f3():
    print('running f3()')
running register(active=False)->decorate(<function f1 at 0x7f8b00b300d0>)
running register(active=True)->decorate(<function f2 at 0x7f8b00b30310>)
registry
{<function __main__.f2()>}
# without the @ syntax, the equivalent call looks like this
register()(f3)
running register(active=True)->decorate(<function f3 at 0x7f8b00b30040>)
<function __main__.f3()>
registry
{<function __main__.f2()>, <function __main__.f3()>}

Now let's revisit the clock decorator and add an extra feature: users may pass a format string to control the output of the decorated function.

import time

DEFAULT_FMT = '[{elapsed:0.8f}s] {name}({args}) -> {result}'

def clock(fmt=DEFAULT_FMT):
    def decorate(func):
        def clocked(*_args):
            t0 = time.time()
            _result = func(*_args)
            elapsed = time.time() - t0
            name = func.__name__
            args = ', '.join(repr(arg) for arg in _args)
            result = repr(_result)
            print(fmt.format(**locals()))
            return _result
        return clocked
    return decorate
@clock()
def snooze(seconds):
    time.sleep(seconds)
    
for i in range(3):
    snooze(1)
[1.00125384s] snooze(1) -> None
[1.00108457s] snooze(1) -> None
[1.00108957s] snooze(1) -> None
@clock('{name}: {elapsed}s')
def snooze(seconds):
    time.sleep(seconds)
    
for i in range(3):
    snooze(.123)
snooze: 0.12353754043579102s
snooze: 0.12320995330810547s
snooze: 0.12340807914733887s

Note: For non-trivial decorators, it's advisable to implement them as classes with a __call__ method rather than as nested functions the way it is shown in this notebook.
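As a rough sketch of what a class-based version could look like (the class name Clock and its layout are assumptions made for illustration, not a prescribed design), the format string is stored on the instance and __call__ plays the role of the decorate function:

```python
import functools
import time

DEFAULT_FMT = '[{elapsed:0.8f}s] {name}({args}) -> {result}'


class Clock:
    """Hypothetical class-based version of the parameterized clock decorator."""

    def __init__(self, fmt=DEFAULT_FMT):
        self.fmt = fmt

    def __call__(self, func):
        @functools.wraps(func)
        def clocked(*_args):
            t0 = time.perf_counter()
            _result = func(*_args)
            elapsed = time.perf_counter() - t0
            # Pass the fields explicitly instead of relying on locals().
            print(self.fmt.format(
                elapsed=elapsed,
                name=func.__name__,
                args=', '.join(repr(arg) for arg in _args),
                result=repr(_result),
            ))
            return _result
        return clocked


@Clock('{name}({args}) took {elapsed:0.3f}s')
def snooze(seconds):
    time.sleep(seconds)


snooze(0.01)   # prints a line starting with: snooze(0.01) took
```

Keeping the configuration in instance attributes avoids one level of nesting and gives a natural place for extra state or helper methods as the decorator grows.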