Python: variables pane updates for data frames with many columns are slow #2174

@wesm

Positron Version: git main

Steps to reproduce the issue:

  • Load data frame with over 100,000 rows
  • Transpose data frame into new variable (df2 = df.T)

For example:

import numpy as np
import pandas as pd
df = pd.DataFrame({'a': np.arange(1000000)})
df
df2 = df.T

Without Positron in the loop, df.T is virtually instantaneous:

In [4]: %timeit df.T
20.6 µs ± 76.6 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The variables pane takes over 10 seconds for the update to come through.
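To narrow down where the time goes, one option is to time a few operations whose cost may grow with the column count. The sketch below is only an illustration: `time_candidates` is a hypothetical helper, and the operations it times are guesses at what a variables pane might run, not Positron's confirmed code path. It uses a smaller frame than the one above so it finishes quickly.

```python
import time

import numpy as np
import pandas as pd


def time_candidates(df):
    """Time operations a variables pane might plausibly run (assumed, not confirmed)."""
    candidates = {
        "shape": lambda: df.shape,
        "dtypes": lambda: df.dtypes,
        "memory_usage": lambda: df.memory_usage(deep=False),
        # Python-level loop that touches every column individually.
        "per_column_loop": lambda: [df[name].dtype for name in df.columns],
    }
    timings = {}
    for label, operation in candidates.items():
        start = time.perf_counter()
        operation()
        timings[label] = time.perf_counter() - start
    return timings


# 10,000 columns instead of 1,000,000 so the per-column loop stays short.
wide = pd.DataFrame({"a": np.arange(10_000)}).T
timings = time_candidates(wide)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f} s")
```

Running this at a few different column counts would show which of these operations, if any, grows with the width of the frame.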

We should review the variables pane logic for data frames to make sure we avoid excess computation on data frames with more than 10,000 columns. For some users this case is less unusual than one might think, so we don't want the UI to be blocked in this scenario.
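One possible direction, sketched here under assumptions, is to cap how many columns are inspected when building the summary for a data frame. `summarize_frame`, `MAX_COLUMNS`, and the returned dictionary layout are all hypothetical names for illustration and do not reflect Positron's actual variables code.

```python
import numpy as np
import pandas as pd

MAX_COLUMNS = 100  # hypothetical cap on columns inspected per update


def summarize_frame(df, max_columns=MAX_COLUMNS):
    """Build a summary whose per-column work is bounded by max_columns."""
    n_rows, n_cols = df.shape  # does not iterate over the columns
    # Slice first, so dtypes are only computed for the columns we will show.
    head = df.iloc[:, :max_columns]
    children = [(str(name), str(dtype)) for name, dtype in head.dtypes.items()]
    return {
        "display_value": f"[{n_rows} rows x {n_cols} columns]",
        "children": children,
        "truncated": n_cols > max_columns,
    }


# Same shape as the frame in the report: 1 row, 1,000,000 columns.
df2 = pd.DataFrame({"a": np.arange(1_000_000)}).T
summary = summarize_frame(df2)
print(summary["display_value"], len(summary["children"]), summary["truncated"])
```

The trade-off is that columns beyond the cap are not listed until requested, which is why the sketch returns a `truncated` flag that a UI could use to offer loading more.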
