Implement DynamicTableProvider in DataFusion Core #10986

@goldmedal

Description

Is your feature request related to a problem or challenge?

I discussed with @alamb how to support a dynamic file data source (select ... from 'data.parquet', as in #4805) in the core, as mentioned in #4850 (comment). However, after #10745, we found that it is not a good idea to move so many dependencies (e.g., S3-related ones) into the core crate.

Describe the solution you'd like

As @alamb proposed in #10745 (comment), we can focus first on the logic that interprets table names as potential object store locations: implement a struct DynamicTableProvider and a trait called UrlLookup that resolves the ObjectStore at runtime.

struct DynamicTableProvider {
  // ...
  /// A callback used to look up the object store for a URL
  url_lookup: Arc<dyn UrlLookup>,
}

/// Trait for looking up the correct object store instance based on URL
pub trait UrlLookup {
  fn lookup(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>;
}
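
To make the trait more concrete, here is a minimal sketch of one possible UrlLookup implementation that resolves stores by URL scheme. It is not the proposed design, only an illustration. To stay dependency-free, `Url`, `ObjectStore`, `LocalFileSystem`, and `Result` are simplified stand-ins for the `url` and `object_store` crate types and the DataFusion result type; `SchemeLookup`, `local_only`, and `register` are hypothetical names that do not appear in the proposal.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for DataFusion's `Result` (assumption: errors are plain strings here).
pub type Result<T> = std::result::Result<T, String>;

/// Stand-in for `object_store::ObjectStore`; the real trait has I/O methods.
pub trait ObjectStore: Send + Sync {
    fn name(&self) -> &str;
}

/// Stand-in for a local file system store.
pub struct LocalFileSystem;

impl ObjectStore for LocalFileSystem {
    fn name(&self) -> &str {
        "local"
    }
}

/// Stand-in for `url::Url` that only keeps the scheme.
pub struct Url {
    scheme: String,
}

impl Url {
    pub fn parse(input: &str) -> Result<Url> {
        match input.split_once("://") {
            Some((scheme, _)) if !scheme.is_empty() => Ok(Url {
                scheme: scheme.to_ascii_lowercase(),
            }),
            _ => Err(format!("not an absolute URL: {}", input)),
        }
    }

    pub fn scheme(&self) -> &str {
        &self.scheme
    }
}

/// The trait from the proposal, restated against the stand-in types.
pub trait UrlLookup {
    fn lookup(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>;
}

/// Hypothetical implementation: one store per URL scheme.
pub struct SchemeLookup {
    stores: HashMap<String, Arc<dyn ObjectStore>>,
}

impl SchemeLookup {
    /// Only `file://` URLs resolve, matching the default described in the issue.
    pub fn local_only() -> Self {
        let mut stores: HashMap<String, Arc<dyn ObjectStore>> = HashMap::new();
        stores.insert("file".to_string(), Arc::new(LocalFileSystem));
        Self { stores }
    }

    /// A caller such as datafusion-cli could add further schemes (e.g. "s3").
    pub fn register(&mut self, scheme: &str, store: Arc<dyn ObjectStore>) {
        self.stores.insert(scheme.to_ascii_lowercase(), store);
    }
}

impl UrlLookup for SchemeLookup {
    fn lookup(&self, url: &Url) -> Result<Arc<dyn ObjectStore>> {
        self.stores
            .get(url.scheme())
            .cloned()
            .ok_or_else(|| format!("no object store registered for scheme '{}'", url.scheme()))
    }
}
```

With this shape, the core crate would only need the local store, while heavier dependencies stay with whoever calls `register`.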

By default, DynamicTableProvider only supports querying local file paths like file:///.... The implementation of dynamic file queries in datafusion-cli might also be based on DynamicTableProvider, but it will load the common object storage dependencies by default.
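
As a rough illustration of interpreting a table name as a potential object store location under the local-only default, the sketch below classifies a name before any lookup happens. The enum `TableLocation`, the function `classify_table_name`, and the rule that a leading `/` marks an absolute local path are all assumptions made for this example. They are not part of the proposal or of any existing DataFusion API.

```rust
/// Hypothetical result of inspecting a table name.
#[derive(Debug, PartialEq)]
pub enum TableLocation {
    /// A local file, normalized to a `file://` URL.
    LocalFile(String),
    /// A URL whose scheme the default (local-only) provider does not handle.
    UnsupportedScheme(String),
    /// An ordinary table name that should go through normal catalog resolution.
    NotALocation,
}

/// Decide whether `name` looks like an object store location.
pub fn classify_table_name(name: &str) -> TableLocation {
    if let Some((scheme, _rest)) = name.split_once("://") {
        if scheme.eq_ignore_ascii_case("file") {
            // Already a local file URL: keep it unchanged.
            TableLocation::LocalFile(name.to_string())
        } else {
            // Schemes such as s3 need a store that the core crate does not ship.
            TableLocation::UnsupportedScheme(scheme.to_ascii_lowercase())
        }
    } else if name.starts_with('/') {
        // Assumption: a leading slash means an absolute local path.
        TableLocation::LocalFile(format!("file://{}", name))
    } else {
        TableLocation::NotALocation
    }
}
```

A name such as `s3://bucket/data.parquet` is reported as unsupported here. A caller that has loaded the object storage dependencies could resolve it through its own UrlLookup instead.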

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement
