Three Techniques for Catching Memory Leaks in Python and How to Write Smart Code
Eye catching chart showing memory utilization before and after the fix.
Some time ago I wrote a worker that periodically polls a third-party service for data. We started noticing that the worker process was getting killed by the kernel for reaching its memory limit. The container for the worker was given 512MB, which should be more than enough for the job it was doing. The amount of data it fetches can be anywhere from 25MB to 100MB, and it uses this data to sync some internal state of our systems with the data provided by the third party. I was able to find weird memory consumption patterns and refactor the code to take memory usage from ~50% to ~13% and stop the worker process from getting OOM killed. This post is about the tools I used to find memory problems in a Python application.
One of the first things to do is to start tracking memory allocation from within the process. Python comes with the resource module, which lets you ask how much memory a running process is consuming.
    import resource

    # ru_maxrss is the peak resident set size: kilobytes on Linux, bytes on macOS.
    print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
This system call returns various information about resource usage, not only memory. You can read more about it and the available parameters in the Linux man pages. Wrapping a critical section with this code block helps identify when and how much memory gets allocated. It's also a useful metric to report to your metrics aggregator to track memory usage over time.
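Here is a minimal sketch of wrapping a critical section, assuming a Unix system where the resource module is available. The names max_rss_mb and track_peak_memory are made up for this example and are not part of any library. Keep in mind that ru_maxrss is a peak value, so the sketch can only show how much the peak grew, not memory that was freed.

```python
import resource
import sys
from contextlib import contextmanager


def max_rss_mb():
    """Peak resident set size of this process, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


@contextmanager
def track_peak_memory(label, report=print):
    """Report how much the peak RSS grew while the wrapped block ran."""
    stats = {"label": label, "before_mb": max_rss_mb()}
    try:
        yield stats
    finally:
        stats["after_mb"] = max_rss_mb()
        stats["growth_mb"] = stats["after_mb"] - stats["before_mb"]
        report("%s: peak RSS %.1f MB (+%.1f MB)"
               % (label, stats["after_mb"], stats["growth_mb"]))


# Usage: the list comprehension stands in for the real critical section.
with track_peak_memory("build list") as usage:
    data = [str(i) * 10 for i in range(200000)]
```

The report argument defaults to print, but it could just as well be a function that sends the number to your metrics aggregator.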
The next tool under the belt is objgraph. You can install it with pip install objgraph. Objgraph lets you explore Python object graphs. It is very useful for finding dead objects and whatever still references them.
    import objgraph

    objgraph.show_most_common_types()  # List the most common object types
    objgraph.show_growth()  # Show the growth in object counts since the last call
show_growth can be used before/after a critical section to see what objects were allocated.
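To make the before/after idea concrete without installing anything, here is a rough standard-library approximation of it. This is not how objgraph is implemented; it is a sketch that counts objects by type with gc.get_objects, which only sees objects tracked by the garbage collector (plain strings and integers are not included). The Record class and the helper names are hypothetical.

```python
import gc
from collections import Counter


def type_counts():
    """Count live, GC-tracked objects by type name."""
    gc.collect()  # drop unreachable cycles before counting
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=10):
    """Return (type_name, total, delta) rows for the types that grew the most."""
    rows = [(name, after[name], after[name] - before[name])
            for name in after if after[name] > before[name]]
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


class Record(object):
    """Hypothetical stand-in for objects created by a critical section."""


before = type_counts()
leaked = [Record() for _ in range(1000)]  # the "critical section"
after = type_counts()

for name, total, delta in growth(before, after):
    print("%-20s %8d %+8d" % (name, total, delta))
```

With objgraph installed, calling objgraph.show_growth() once before and once after the critical section gives a similar table with far less code.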
Heapy from Guppy
Guppy is a toolchain for memory analysis and profiling. Heapy seems to be the most commonly referenced submodule when it comes to digging into memory issues. You can install it with pip install guppy. Heapy is a fairly complicated tool. There's a great tutorial on Heapy that you should check out. To take a diff of your heap you can do this:
    import guppy

    h = guppy.hpy()
    heap_snap1 = h.heap()
    # Critical section
    heap_snap2 = h.heap()
    heap_diff = heap_snap2 - heap_snap1
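If Heapy is not available in your environment, the same snapshot-and-diff idea can be sketched with tracemalloc from the Python 3 standard library. This is a different tool from Heapy, swapped in here only for illustration, and the list comprehension is a made-up stand-in for the critical section.

```python
import tracemalloc

tracemalloc.start()

snapshot_before = tracemalloc.take_snapshot()
# Critical section: a hypothetical stand-in for parsing a large response.
payload = ["record-%d" % i for i in range(100000)]
snapshot_after = tracemalloc.take_snapshot()

# Compare the two snapshots, grouped by source line, biggest change first.
top_stats = snapshot_after.compare_to(snapshot_before, "lineno")
for stat in top_stats[:5]:
    print(stat)

tracemalloc.stop()
```

Each entry reports the size and count difference for one source line, which points at where the new allocations came from rather than only at their types.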
Fixing Memory Usage
After digging around with the tools above, I noticed that the process allocates about 12MB of data after every fetch from the third-party service. Each iteration allocates tons of unicode and string objects after parsing the JSON response. This all makes sense, since the strings are fairly long and each fetch contains mostly unique data, so Python's string interning won't help much. The worker is a long-running process that periodically receives a job, fetches a big batch of data, and syncs internal systems. It seems that Python's garbage collector should kick in and clean up obsolete data, but invoking the garbage collector manually was of little help. It's unclear why a new chunk of heap was getting allocated instead of being reused.
I'm not a fan of refactoring code for the sake of refactoring; however, this was a good case to do so.
Here's the pseudocode for the data sync:
    for job in get_job():
        for internal_item in items_to_update():
            third_party_data = get_big_batch(internal_item.date)
            filter_and_process(third_party_data, internal_item)
The implementation above is really inefficient. get_big_batch fetches a large chunk of data for a given day, and the response includes a lot of information unrelated to internal_item. As mentioned before, third_party_data can be quite large, 12MB to 100MB. The optimization was to change the code to only retrieve data related to the internal_item:
    import gc

    for job in get_job():
        for internal_item in items_to_update():
            third_party_data = get_thirdparty_data_for_item(internal_item)
            process(third_party_data, internal_item)
            gc.collect()
This change is more efficient in terms of both resource consumption and processing time. Instead of getting large batches of data and filtering things out in the process, we are asking the third party for specific data. This code adds more calls to the external service. On the other hand, it's only about 10K of third-party data per item. The best part: there are no more OOM issues.
The problem was not exactly a memory leak, at least to my knowledge. However, this example serves as a great reminder that you should always think about the data you're fetching and only fetch what you need. Don't make Gilfoyle hate you.
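The trade-off can be illustrated with a toy simulation. Everything below is hypothetical: the in-memory "service", the function names, and the record sizes are invented for the sketch and have nothing to do with the real third-party API. It only shows that both approaches produce the same result while the batch approach holds far more data in memory at once.

```python
import tracemalloc

# Hypothetical in-memory stand-in for the third-party service:
# 200 items, each with 50 records for the same day.
ITEM_IDS = range(200)


def make_records(item_id, count=50):
    return [{"item": item_id, "value": "payload-%d-%d" % (item_id, n)}
            for n in range(count)]


def get_big_batch():
    """Old approach: return every record for the day, for all items."""
    return [rec for item_id in ITEM_IDS for rec in make_records(item_id)]


def get_data_for_item(item_id):
    """New approach: return only the records for one item."""
    return make_records(item_id)


def sync_with_batches(item_ids):
    totals = {}
    for item_id in item_ids:
        batch = get_big_batch()  # fetch everything, then filter
        totals[item_id] = len([rec for rec in batch if rec["item"] == item_id])
    return totals


def sync_per_item(item_ids):
    return {item_id: len(get_data_for_item(item_id)) for item_id in item_ids}


def peak_bytes(func, *args):
    """Run func and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


wanted = [3, 7, 11]
batch_result, batch_peak = peak_bytes(sync_with_batches, wanted)
item_result, item_peak = peak_bytes(sync_per_item, wanted)
print("same result:", batch_result == item_result)
print("peak bytes, batch vs per item:", batch_peak, item_peak)
```

The per-item version makes one call per item instead of one per day, which is the extra load on the external service mentioned above.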