Description
Is your feature request related to a problem or challenge?
To measure the performance improvement of #11827, we need some extended queries that combine more complex UDAFs (like median, approx_median) with a high-cardinality group by (#12438).
But I found that such queries can't run to completion on my local machine. After debugging, I found this is due to their large intermediate results, which fill up memory rapidly and lead to swapping or OOM.
However, when I run them on a subset containing only 15% of the whole ClickBench dataset, they finish successfully and reflect the improvement: #11827 (comment)
I think we may need a ClickBench variant with a smaller dataset (like tpch 1, tpch 10...) in some situations.
Describe the solution you'd like
Support generating a smaller subset of the whole ClickBench dataset so that we can run queries on it.
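One possible way to build such a subset, sketched here only as an illustration: if the dataset is available as numbered partition files, a percentage of them could be selected and the benchmark pointed at just those files. The helper name `pick_partitions`, the default of 100 partitions, and the `hits_{}.parquet` naming are assumptions made for this sketch, not something the benchmark scripts are known to provide.

```python
def pick_partitions(percent, total_partitions=100, name_template="hits_{}.parquet"):
    """Return file names for an evenly spaced subset of partition files.

    `percent` is an integer in 1..100. The partition count and the file
    naming scheme are assumptions for this sketch.
    """
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Integer ceiling of total_partitions * percent / 100, at least one file.
    count = max(1, -(-total_partitions * percent // 100))
    # Spread the picks across the whole range so the subset is not
    # biased toward the first files.
    indices = [i * total_partitions // count for i in range(count)]
    return [name_template.format(i) for i in indices]


if __name__ == "__main__":
    # Roughly 15% of the data, matching the subset size mentioned above.
    for name in pick_partitions(15):
        print(name)
```

Selecting whole files keeps the subset cheap to produce, but the result is only representative if rows are distributed fairly evenly across partitions; row-level sampling would be an alternative if they are not.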
Describe alternatives you've considered
No response
Additional context
No response