Description
This enhancement request stems from 1) the conversation in #293 and 2) emerging needs such as #296. The intent is to clean up the PerTableConfig.
OneTable's sync flow is based on PerTableConfig, which is the input configuration provided by the user. The sync process translates metadata of a single source table into one or more target tables. The user must provide this config for the translation process to be successful.
The image below illustrates the current structure of PerTableConfig.

However, the use cases have changed over time and now require more flexibility and compatibility with different table formats. The metadata of a target table may need to be generated in a different location, in which case the path to that location must also be provided. Additionally, the target table may be connected to another catalog instance. This means that a target table requires not just a format identifier, but also some of the configurations that are currently provided for the source table only.
The current configuration object includes some configurations that are specific to Iceberg and Hudi formats. These configurations should be wrapped by input configuration instances that are specific to each format.
The following image shows the proposed PerTableConfig.

- A better way to name a PerTableConfig is a TableSyncConfig, because it is a configuration for synchronizing a table.
- Separate the configurations for the sync task, common table configs, and configs specific to formats.
- Instead of using only a format identifier, represent target table formats as a table. Create a separate entity called ExternalTable that can be either a source table or a target table. The ExternalTable clearly differentiates between internal representation and external table.
A possible class structure for representing the table config is the topic I want to discuss.
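As a starting point for that discussion, here is a minimal sketch of one possible shape, using Java 16+ records. Only the names TableSyncConfig and ExternalTable come from the proposal above. Everything else (SyncMode, FormatConfig, HudiConfig, IcebergConfig, and every field name) is a hypothetical placeholder, not the existing OneTable API and not necessarily what the proposed diagram contains.

```java
import java.util.List;
import java.util.Map;

public class TableSyncConfigSketch {

    // Hypothetical: settings for the sync task itself, kept apart from table settings.
    enum SyncMode { FULL, INCREMENTAL }

    // Hypothetical marker for settings that only one table format understands.
    interface FormatConfig {}

    // Hypothetical Hudi-only settings (the field name is illustrative).
    record HudiConfig(String partitionFieldSpec) implements FormatConfig {}

    // Hypothetical Iceberg-only settings, e.g. how to reach a catalog instance.
    record IcebergConfig(String catalogName, Map<String, String> catalogOptions)
            implements FormatConfig {}

    // A table outside the internal representation; usable as a source or a target.
    record ExternalTable(
            String name,
            String format,
            String basePath,
            String metadataPath,
            FormatConfig formatConfig) {
        ExternalTable {
            if (name == null || format == null || basePath == null) {
                throw new IllegalArgumentException("name, format and basePath are required");
            }
            // Assumption: metadata is written next to the data unless a path is given.
            if (metadataPath == null) {
                metadataPath = basePath;
            }
        }
    }

    // Top-level config: one source table synced to one or more target tables.
    record TableSyncConfig(SyncMode syncMode, ExternalTable source, List<ExternalTable> targets) {
        TableSyncConfig {
            if (source == null || targets == null || targets.isEmpty()) {
                throw new IllegalArgumentException("a source and at least one target are required");
            }
            targets = List.copyOf(targets); // defensive, immutable copy
        }
    }

    // Builds an illustrative config; all values are made up for the example.
    static TableSyncConfig example() {
        ExternalTable source = new ExternalTable(
                "sales", "HUDI", "s3://bucket/sales", null, new HudiConfig("region:VALUE"));
        ExternalTable target = new ExternalTable(
                "sales", "ICEBERG", "s3://bucket/sales", "s3://bucket/sales-iceberg",
                new IcebergConfig("demo_catalog", Map.of("warehouse", "s3://bucket/warehouse")));
        return new TableSyncConfig(SyncMode.INCREMENTAL, source, List.of(target));
    }

    public static void main(String[] args) {
        TableSyncConfig config = example();
        System.out.println(config.source().format() + " -> " + config.targets().get(0).format());
    }
}
```

In this sketch each target carries its own metadata path and format-specific config, so a target can point at a different location or catalog than the source. Whether format-specific settings belong on ExternalTable or in a separate holder is one of the points open for discussion.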