Description
Is your feature request related to a problem or challenge?
Quoting from @westonpace on #13582
Many catalogs are remote (and/or disk based) and offer only asynchronous APIs. For example, Polaris, Unity, and Hive. Integrating with these catalogs is impossible: something like ctx.sql("SELECT * FROM db.schm.tbl") first enters an async context (sql), then a synchronous context (calling the catalog provider to resolve db), and then needs to enter an asynchronous context again to interact with the catalog. This async -> sync -> async path is generally forbidden.
This also came up in
I believe it is possible to interact with remote catalogs using DataFusion's non-async catalog APIs, but it is not obvious how to do so.
Describe the solution you'd like
I would like a clear, well-documented example of a DataFusion catalog that interacts with a remote catalog.
Describe alternatives you've considered
Another approach, the one taken by SessionContext::sql, is to:
Do an initial pass through the parse tree to find all table references (non async)
Then fetch all of those references (can be async)
Then do the planning (non async) with all the relevant references already resolved
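The three steps above can be sketched in plain Rust. This is only an illustration of the pattern under stated assumptions, not DataFusion's implementation and not its API: `TableMeta`, `collect_table_refs`, `fetch_remote`, `plan_query`, `block_on`, and `run` are all hypothetical names, the "parser" is a toy that looks at the token after FROM or JOIN, and the busy-polling `block_on` stands in for a real async runtime such as tokio.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical metadata that a remote catalog might return for a table.
#[derive(Debug, Clone, PartialEq)]
struct TableMeta {
    columns: Vec<String>,
}

/// Step 1 (sync): collect every table reference in the query.
/// A real implementation would walk the parsed statement; this toy version
/// takes the token that follows FROM or JOIN.
fn collect_table_refs(sql: &str) -> BTreeSet<String> {
    let tokens: Vec<&str> = sql.split_whitespace().collect();
    let mut refs = BTreeSet::new();
    for pair in tokens.windows(2) {
        if pair[0].eq_ignore_ascii_case("from") || pair[0].eq_ignore_ascii_case("join") {
            refs.insert(pair[1].trim_end_matches(';').to_string());
        }
    }
    refs
}

/// Step 2 (async): resolve all references against the remote catalog.
/// The body is a placeholder; a real client would await network calls here.
async fn fetch_remote(refs: &BTreeSet<String>) -> BTreeMap<String, TableMeta> {
    let mut resolved = BTreeMap::new();
    for name in refs {
        let meta = TableMeta { columns: vec!["id".to_string()] };
        resolved.insert(name.clone(), meta);
    }
    resolved
}

/// Step 3 (sync): plan using only the prefetched snapshot, so no async
/// call is needed while planning.
fn plan_query(sql: &str, resolved: &BTreeMap<String, TableMeta>) -> Result<Vec<String>, String> {
    let mut scans = Vec::new();
    for name in collect_table_refs(sql) {
        let meta = resolved
            .get(&name)
            .ok_or_else(|| format!("table not found: {name}"))?;
        scans.push(format!("scan {} [{}]", name, meta.columns.join(", ")));
    }
    Ok(scans)
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, used only to keep the sketch dependency-free.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Runs the three steps in order: sync collect, async fetch, sync plan.
fn run(sql: &str) -> Result<Vec<String>, String> {
    let refs = collect_table_refs(sql);
    let resolved = block_on(fetch_remote(&refs));
    plan_query(sql, &resolved)
}
```

Calling `run("SELECT * FROM db.schm.tbl")` resolves `db.schm.tbl` once, up front, and the planning step only reads from the in-memory map. The key design point is that all asynchronous work happens between the two synchronous passes, so the async -> sync -> async nesting never occurs.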
I don't think this approach is particularly well documented.
Additional context
No response