Keynote presentation for SiMoD workshop at SIGMOD 2024 #10481

@alamb

I am giving an invited keynote talk at a workshop colocated with SIGMOD 2024 on Friday Jun 14, 2024 (after the main conference).

I need to prepare slides for this and figured people in the DataFusion community might be interested.

Title: DataFusion: The Case for Building Data Systems using Open Standards

Abstract: Andrew will discuss engineering tradeoffs made when building Apache DataFusion, an open source and extensible query engine used as the basis of many commercial and open source projects. These decisions (mostly) favored simplicity and worked better than initially expected. He will cover the rationale for which parts of DataFusion use pre-existing standards, such as Arrow and Parquet, and which parts are built “from scratch”, such as vectorized hashing and normalized sort keys. He will also discuss DataFusion’s design philosophy of extensible APIs paired with simple default implementations. Finally, he will offer lessons learned, covering what worked well and what could have been improved.

Labels: documentation
