Take advantage of caching and the lru_cache decorator to relieve your Python functions of repetitive heavy lifting.
Python trades runtime speed for programmer convenience, and most of the time it’s a good tradeoff. You don’t typically need the raw speed of C for workaday applications. And when you need to squeeze more performance out of Python, you don’t always have to turn to C; there’s plenty within Python itself to aid with that.
One performance-enhancement technique common to many languages, and one Python can use too, is memoization: caching the results of a function call so that the result doesn’t have to be recomputed from scratch for future calls with the same inputs. Python provides a standard library utility, lru_cache, to do this.
Memoization basics
Here’s a simple example of a function that’s a good use case for memoization:
from math import sin

def sin_half(x):
    return sin(x) / 2
A function like this has two qualities that make it worth memoizing:
- The output of the function is deterministic. Whenever you provide a certain input, you get the same output every time. Functions that rely on something outside the function itself (e.g., a network call, or a read from disk) are harder to memoize, though it can still be done. But any function that is entirely deterministic is a good candidate.
- The function is computationally expensive, meaning it takes a relatively long time to return an answer. Any function involving math operations, especially in Python, tends to be expensive. Caching and reusing the results is often orders of magnitude faster than recomputing them each time.
lru_cache basics
To memoize a function in Python, we can use a utility supplied in Python’s standard library: the functools.lru_cache decorator.

lru_cache isn’t hard to use. The above example would be memoized with lru_cache like this:
from functools import lru_cache
from math import sin

@lru_cache
def sin_half(x):
    return sin(x) / 2
Now, every time you run the decorated function, lru_cache will check for a cached result for the inputs provided. If the result is in the cache, lru_cache will return it. If the result is not in the cache, lru_cache will run the function, cache the result, and return it.
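The payoff is easiest to see with a function whose naive version is very slow. A sketch (the recursive Fibonacci function is a stock demonstration, not the article’s example; using lru_cache without parentheses requires Python 3.8 or later):

```python
from functools import lru_cache

@lru_cache
def fib(n):
    # Without caching, this naive recursion takes exponential time;
    # with lru_cache, each value of n is computed exactly once.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # answers almost instantly thanks to the cache
```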
One tremendous advantage of using lru_cache is that it integrates with Python’s interpreter at a low level. Any calls that return a result from the cache don’t even generate a new stack frame, so there’s not only less computational overhead but less interpreter overhead as well.
lru_cache cache size
By default, lru_cache keeps up to 128 results for the function it wraps. If you pass maxsize=None, it caches every call, and the cache can grow endlessly during a program’s runtime. If your function receives a restricted range of arguments (say, only the integers 1 through 100), you probably don’t have to worry about the cache growing too large. But in some cases you may want to restrict the cache to the top X number of possibilities to avoid exhausting memory.
This is where the “lru” in lru_cache comes from. It stands for “least recently used,” and it describes how items in the cache are automatically cleared: once the cache is full, the items used least recently are discarded first to make room for new ones.
To set the size of the cache for your function, just supply a number with the decorator, like so:
@lru_cache(360)
def sin_half(x):
    return sin(x) / 2
This caches results for a maximum of 360 distinct inputs to sin_half.
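Eviction is easy to observe with a tiny cache. A sketch (the square function is illustrative; cache_info(), covered below, reports the cache statistics):

```python
from functools import lru_cache

@lru_cache(maxsize=3)  # keep at most three cached results
def square(x):
    return x * x

for n in (1, 2, 3, 4):
    square(n)  # caching 4 evicts 1, the least recently used entry

print(square.cache_info().currsize)  # stays at 3, never above maxsize
square(1)  # recomputed: its cached result was evicted
```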
Advanced use of lru_cache
lru_cache also lets you control the behavior of individual function caches programmatically, at runtime.
If you want to examine the statistics for a given function cache, use the cache_info() method on the decorated function (e.g., sin_half.cache_info()). This reports the total number of cache hits and misses, the maximum cache size, and the current cache size, in that order.
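For instance, reusing the article’s sin_half function:

```python
from functools import lru_cache
from math import sin

@lru_cache
def sin_half(x):
    return sin(x) / 2

sin_half(1.0)  # miss: computed and cached
sin_half(1.0)  # hit: served from the cache
print(sin_half.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)
```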
You can also manually clear the cache on a decorated function with the cache_clear() method. This is useful if you want to cache results for some function, but forcibly invalidate them if some condition changes. One possible use would be a function that provides a fragment of a page for a website, like a contextual user menu that makes many back-end data requests. The cached version would avoid having to re-query the back end every time the menu is generated. Then, if any user’s data changed (which may not happen very often), you could call cache_clear() to keep users from being served stale data.
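A sketch of that pattern (user_menu and its return value are hypothetical stand-ins for a real back-end call):

```python
from functools import lru_cache

@lru_cache
def user_menu(user_id):
    # Hypothetical stand-in for an expensive back-end query
    return f"menu for user {user_id}"

user_menu(42)
print(user_menu.cache_info().currsize)  # 1: the menu is cached

user_menu.cache_clear()  # e.g., after the user's data changes
print(user_menu.cache_info().currsize)  # 0: the next call re-queries
```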
Finally, if you want to access the original, undecorated version of the function, you can use the __wrapped__ attribute on the decorated function. This comes in handy if, for instance, you want to compare the cached output of a function against the uncached version as part of a test.
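Continuing with the article’s sin_half example:

```python
from functools import lru_cache
from math import sin

@lru_cache
def sin_half(x):
    return sin(x) / 2

uncached = sin_half.__wrapped__  # the original, undecorated function

# Both produce the same value, but only the decorated call touches the cache
assert sin_half(0.5) == uncached(0.5)
print(sin_half.cache_info())  # the direct call left no trace in the stats
```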
Serdar Yegulalp is a senior writer at InfoWorld, focused on machine learning, containerization, devops, the Python ecosystem, and periodic reviews.