Add a "Gentle Introduction to Arrow / Record Batches" #11336

@alamb

Part of #7013

Is your feature request related to a problem or challenge?

As @efredine notes in a comment on #11290:

The in-memory examples are concise and it's easy to get the gist of what's going on. But it also throws people into the deep end of the Arrow format, which lacks a gentle introduction IMO. The Arrow-rs documentation gets immediately into the weeds!

Describe the solution you'd like

Many users will likely never need to know about or access the Arrow format directly. They will just read and write CSV or Parquet.

I don't think this needs to change, but perhaps what's missing is a section on how and when to use the Arrow format: a gentler introduction to Record Batches.

Describe alternatives you've considered

Add a section to the user guide on "a gentle introduction to arrow"

Additional context

Here is a ticket tracking such a thing upstream: apache/arrow-rs#4071

I actually think the basic content and structure could be copied from https://jorgecarleitao.github.io/arrow2/main/guide/, with the examples updated to reflect arrow-rs.

Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request)
