Description
tl;dr: When I repeatedly create a large figure, save it, and close it, memory usage keeps growing.
Over at #3045, the discussion about when MPL should trigger garbage collection, @efiring had some lingering doubts about the chosen solution:

> It would certainly be good to have a clearer understanding of when, if ever in practice, it would lead to troublesome increases in memory consumption.
I ran into such a case today, when my batch job filled up 60 GB of RAM overnight.
I repeatedly create a large figure, save it, then close it. If I don't manually call `gc.collect()` after closing each figure, memory consumption saturates at around 10x what an individual figure needs. In my case, with several fairly complex figures, that was enough to fill a big machine.
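My understanding of why it saturates instead of growing forever (an assumption on my part, not something I traced through the matplotlib sources): figures sit in reference cycles, so after #3045 they are only reclaimed when CPython's generational collector happens to run, and collections are triggered by allocation counts, not by how much memory the cyclic garbage holds. The thresholds are easy to inspect:

```python
import gc

# CPython triggers a generation-0 collection after ~700 net new
# allocations; full generation-2 collections, the ones that actually
# reclaim big cycle-laden objects like closed figures, run far less often.
print(gc.get_threshold())  # default: (700, 10, 10)
print(gc.get_count())      # current allocation counts per generation
```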
Since this is not obvious from the docs, I think there should be an official way to opt back into more aggressive GC for cases like this, where the trade-off discussed in #3045 fails. Maybe `plt.close(force_gc=True)`?
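For reference, the workaround I'm running with in the batch job. Note that `force_gc` doesn't exist today; this helper just spells out what such a flag could do:

```python
import gc

import matplotlib.pyplot as plt

def close_and_collect(fig=None):
    """Close `fig` (or the current figure) and immediately reclaim cycles.

    Hypothetical helper, not matplotlib API: it emulates what a
    plt.close(force_gc=True) might look like.
    """
    plt.close(fig)
    gc.collect()
```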
Code for reproduction
```python
from memory_profiler import profile  # https://pypi.python.org/pypi/memory_profiler
from memory_profiler import memory_usage
import matplotlib.pyplot as plt
import numpy as np
import gc

N = 80

@profile
def do_plots():
    # Create, save, and close one large figure.
    fig = plt.figure()
    plt.plot(np.random.rand(50000))
    plt.savefig('/tmp/bla.png')
    plt.close(fig)

def default():
    for k in range(N):
        print(k)
        do_plots()

def manual_gc():
    for k in range(N):
        print(k)
        do_plots()
        gc.collect()

mem_manual_gc = memory_usage((manual_gc, [], {}))
mem_default = memory_usage((default, [], {}))

plt.plot(mem_manual_gc, label='gc.collect() after close')
plt.plot(mem_default, label='default behaviour')
plt.ylabel('MB')
plt.xlabel('time (in s * 0.1)')  # memory_usage samples every 100 ms
plt.legend()
plt.title('memory usage')
plt.show()
```
Matplotlib version
- Operating System: Ubuntu 16.10
- Matplotlib Version: 2.0.0
- Python Version: 3.5.2