Rerunning a cell with a DataFrame causes a memory leak #1391

Open
kelarMai opened this issue Apr 13, 2025 · 0 comments

env

I use VS Code with Remote-SSH to connect to a Debian server and run Jupyter there.

debian12
vscode=Version: 1.99.0
python=3.10
ipykernel=6.29.5
pandas=2.2.3

reproduce

Create a large DataFrame:

import pandas as pd
import numpy as np

np.random.seed(0)
num_rows = 10000000  
num_cols = 10  
# create random integers in [0, 100)
data = np.random.randint(0, 100, size=(num_rows, num_cols))
df = pd.DataFrame(data, columns=[f'col_{i}' for i in range(num_cols)])

Monitor memory usage:

%load_ext ipython_memory_usage
%imu_start

Each time the cell below is rerun, memory usage increases:

df_temp = df.copy(deep=True)
df_temp.head()

On my machine, the output looks like this:

[Out] In [4] used 763.6 MiB RAM in 0.55s (system mean cpu 40%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 1632.0 MiB

[Out] In [5] used 763.0 MiB RAM in 0.55s (system mean cpu 19%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 2394.9 MiB

[Out] In [6] used 763.1 MiB RAM in 0.55s (system mean cpu 25%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 3158.0 MiB
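The growth per rerun can be checked against the size of the DataFrame. The sketch below is a back-of-the-envelope calculation that assumes the values are stored as 8-byte integers (int64, the usual default for `np.random.randint` on Linux); the RAM figures are copied from the output above. If the assumption holds, every rerun retains roughly one full copy of the data.

```python
# Shape taken from the reproduction code above.
num_rows = 10_000_000
num_cols = 10
bytes_per_value = 8  # assumption: int64 values

# Expected size of one deep copy of the DataFrame's data, in MiB.
expected_bytes = num_rows * num_cols * bytes_per_value
expected_mib = expected_bytes / 2**20
print(f"one copy is about {expected_mib:.1f} MiB")

# "current RAM usage now" values reported for In [4], In [5], In [6].
ram_after_runs = [1632.0, 2394.9, 3158.0]

# Growth between consecutive reruns, to compare with the size of one copy.
growth = [round(b - a, 1) for a, b in zip(ram_after_runs, ram_after_runs[1:])]
print("growth per rerun (MiB):", growth)
```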

attempts to solve

After many tries, I found that the main cause is df_temp.head().
If I rerun only df_temp = df.copy(deep=True), the memory usage does not increase.

If I change the code to

df_snapshot_test = df_snapshot.copy(deep=True)
df_snapshot_test.head().copy(deep=True)

rerunning does not increase the memory usage either.

I have tried

import gc
gc.collect()  

or

from IPython.display import clear_output

Neither of them frees the memory.
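One possible explanation, not confirmed in this report, is that the displayed result of df_temp.head() is kept in the kernel's output history (IPython stores displayed values in `Out` and in `_`, `__`, `___`), and that this small result still references the full copy it was sliced from. That would be consistent with both observations above: copying the head breaks the link to the large object, and gc.collect() cannot free something that is still referenced. The sketch below illustrates that mechanism with plain Python objects only. `BigTable`, `HeadView`, `out_cache`, and `run_cell` are hypothetical stand-ins, not pandas or IPython APIs.

```python
import gc
import weakref


class BigTable:
    """Hypothetical stand-in for a large DataFrame."""

    def __init__(self, n_rows):
        self.rows = list(range(n_rows))

    def head(self, n=5):
        # The small result keeps a reference to the whole parent.
        return HeadView(self, n)


class HeadView:
    """Hypothetical stand-in for the object returned by head()."""

    def __init__(self, parent, n):
        self.parent = parent
        self.n = n


# Stand-in for the output history of an interactive kernel.
out_cache = {}


def run_cell(execution_count, source):
    temp = BigTable(len(source.rows))         # plays the role of df.copy(deep=True)
    out_cache[execution_count] = temp.head()  # the displayed result is cached
    return weakref.ref(temp)                  # lets us observe when temp is freed


source = BigTable(1000)
refs = [run_cell(i, source) for i in range(3)]

gc.collect()  # does not help: the cached views still reference the copies
alive_before = sum(r() is not None for r in refs)

out_cache.clear()  # dropping the cached outputs releases the copies
gc.collect()
alive_after = sum(r() is not None for r in refs)

print("copies alive before clearing the cache:", alive_before)
print("copies alive after clearing the cache:", alive_after)
```

If this is what happens here, clearing the output history (for example with IPython's `%reset -f out`) would be worth trying; I have not tested that in this setup.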

Another similar issue
