-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
Open
Labels
PerformanceMemory or execution speed performanceMemory or execution speed performance
Description
It would be convenient to have a canonical set of dataframes for use in testing and/or benchmarking. Ideally this would be a set of named dataframes that represented common forms of data like the following:
- Random floating point data
- Random integer data
- Strings with low entropy
- Strings with high entropy
- Mostly sorted datetimes
- ...
These could then be used either within Pandas or in other libraries for benchmarks. Having a consistent set of dataframes would probably aid consistent benchmarking.
Additionally if this was then separately arranged into pytest fixture we could imagine setting things up and tearing things down in a way that made benchmarking more consistent (such as controlling garbage collection), though this may be a separate endeavor. It would be nice to have access to the dataframes outside of the context of PyTest as well
micahjsmith
Metadata
Metadata
Assignees
Labels
PerformanceMemory or execution speed performanceMemory or execution speed performance