Description
Is your feature request related to a problem or challenge?
If we want to have DataFusion used as the core of many new systems, we need it to be as easy as possible for someone to get their idea working on top of DataFusion.
I think the current user guide helps set up the basics of the project and get a "hello world" style program going, but then it leaves the reader in a "now what?" situation: https://arrow.apache.org/datafusion/user-guide/example-usage.html
Describe the solution you'd like
I would like a document, perhaps similar in style to the polars user guide: https://pola-rs.github.io/polars-book/user-guide/
> This User Guide is an introduction to the Polars DataFrame library. Its goal is to introduce you to Polars by going through examples and comparing it to other solutions. Some design choices are introduced here. The guide will also introduce you to optimal usage of Polars.
Basically, I am thinking of something that would have helped @bubbajoe get up to speed.
The examples directory holds a bunch of examples: https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples
Potential outline:
- Library Guide: Add SQL level user guide: #7302
- Library Guide: Add Working with Exprs #7304
- Library Guide: Add Using the DataFrame API #7305
- Library Guide: Building LogicalPlans #7306
- Catalogs: tables / schemas / catalogs (in Add library guide for table provider and catalog providers #7287)
- Library Guide: Adding User Defined Functions: Scalar/Window/Aggregate/ #7307
- Adding custom TableProviders (in Add library guide for table provider and catalog providers #7287)
- Library Guide: Extending DataFusion's operators: custom LogicalPlan and ExecutionPlans #7308
Describe alternatives you've considered
No response
Additional context
This idea was suggested by @MrPowers