Description
Is your feature request related to a problem or challenge?
Someone asked in Discord:
> I'm looking at Datafusion and Polars as potential solutions for calculating averages over a sliding window of events, where the window is bound by event time. I've just come across Datafusion, would anyone be able to clarify if it's suitable for this use case? In essence, I have events streaming in via RPC that I want to feed into a system that gives the above outcome.
I am pretty sure this is exactly the use case for `UNBOUNDED` tables with an explicitly defined `ORDER BY`, from Synnada, Arroyo, and others. However, when I went to look for the documentation, I couldn't find any mention of this use case or any documentation of unbounded tables.
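To make the requested computation concrete, here is a minimal sketch of a sliding-window average bounded by event time, written in plain Python rather than against DataFusion. The function name `sliding_averages`, the inclusive window boundary, and the sample events are illustrative assumptions, not DataFusion APIs. The sketch also assumes events arrive already sorted by event time, which is the property an explicitly declared ordering would provide.

```python
from collections import deque
from datetime import datetime, timedelta


def sliding_averages(events, window):
    """Yield (event_time, average) for each incoming event.

    The average covers every value whose event time lies in
    [event_time - window, event_time].

    Assumption: `events` yields (event_time, value) pairs sorted by
    event_time, so old entries can be evicted from the front.
    """
    buffer = deque()  # events currently inside the window
    total = 0.0       # running sum of the buffered values
    for event_time, value in events:
        buffer.append((event_time, value))
        total += value
        # Evict events that fell out of the window. The event just appended
        # is never evicted, so the buffer is never empty here.
        while buffer[0][0] < event_time - window:
            _, expired = buffer.popleft()
            total -= expired
        yield event_time, total / len(buffer)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    # Hypothetical sample data: (seconds after start, value)
    samples = [(0, 10), (30, 20), (60, 30), (90, 40)]
    events = [(start + timedelta(seconds=s), v) for s, v in samples]
    for event_time, average in sliding_averages(events, timedelta(seconds=60)):
        print(event_time.time(), average)
```

Because the input is ordered, each event is appended and evicted at most once, so the state stays bounded by the window size and results can be emitted incrementally without seeing the end of the stream.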
Describe the solution you'd like
I would like to help make it easier for people to use DataFusion for streaming use cases by:
- Documenting the `UNBOUNDED` keyword in the `CREATE EXTERNAL TABLE` documentation
- Adding a simple streaming example to https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples/examples (perhaps implementing some simple version of the use case described above)
- Adding a section to the library guide giving a basic overview
Describe alternatives you've considered
No response
Additional context
No response