A new Infinity Table Engine that seamlessly extends MergeTree Tables onto cheap, shared Iceberg Tables #990

@hodgesrm

Is your feature request related to a problem? Please describe.
MergeTree tables make data queryable immediately after ingest and offer high performance and efficient merging. What they don't do well is handle very large amounts of data. Replicated block storage is costly compared to object storage, ClickHouse compute does not scale independently of storage, and ClickHouse servers with large volumes of attached storage tend to become unstable under heavy load. Users need a simple way to combine the strengths of MergeTree storage for hot data with the scalability and low cost of object storage.

Describe the solution you'd like
We propose a new Infinity table engine that implements tiered storage across MergeTree and Iceberg tables and makes them appear to users as a single table. Conceptually it appears as follows:

(Image: conceptual diagram of the Infinity table)

The Infinity table has many similarities to a Distributed table. It has the following responsibilities.

  • Define the schema presented to users (similar to a Distributed table).
  • Define the location of data segments, which are MergeTree and/or Iceberg tables (similar to volumes in storage policies).
  • Track the ownership of partitions in each segment using a watermark, e.g., on a time-based column (similar to TTL MOVE).
  • Provide a way to "move" the watermark.
  • Issue subqueries to each segment with appropriate filters so that data are neither dropped nor duplicated (similar to a Distributed table).
  • Correctly map data from segments to the schema of the Infinity table (similar to a Distributed table).
  • Preserve the semantics of the MergeTree table on which it is based, so that users can enable Infinity tables simply by changing the table name, without rewriting the application (similar to a Distributed table).
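
To make the watermark filtering concrete, here is a minimal Python sketch of how subqueries could be split across two segments. It is only an illustration of the idea, not the proposed implementation: the names `Segment`, `segment_queries`, `owning_segment`, and the tables `events_hot` and `events_cold` are hypothetical, and it assumes a single watermark where the MergeTree segment owns rows at or after the watermark and the Iceberg segment owns everything before it. A real engine would push these filters down inside ClickHouse rather than build SQL strings.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Segment:
    """One backing table of a hypothetical Infinity table."""
    table: str  # illustrative table name, e.g. a MergeTree or Iceberg table
    hot: bool   # True: owns rows at or after the watermark


def segment_queries(columns, segments, watermark_column, watermark, user_filter=None):
    """Build one SELECT per segment with complementary watermark filters.

    The hot segment gets `column >= watermark` and the cold segment gets
    `column < watermark`, so each row is read from exactly one segment.
    """
    queries = []
    for seg in segments:
        op = ">=" if seg.hot else "<"
        predicate = f"{watermark_column} {op} '{watermark.isoformat()}'"
        if user_filter:
            # Keep the user's own filter and add the ownership filter to it.
            predicate = f"({user_filter}) AND {predicate}"
        queries.append(
            f"SELECT {', '.join(columns)} FROM {seg.table} WHERE {predicate}"
        )
    return queries


def owning_segment(segments, value, watermark):
    """Return the single segment that owns a given watermark-column value."""
    matches = [s for s in segments if (value >= watermark) == s.hot]
    if len(matches) != 1:
        raise ValueError("each value must be owned by exactly one segment")
    return matches[0]


if __name__ == "__main__":
    segments = [Segment("events_hot", hot=True), Segment("events_cold", hot=False)]
    for q in segment_queries(["ts", "user_id"], segments, "ts", date(2024, 1, 1)):
        print(q)
```

Because the two predicates are exact complements, a row sitting exactly on the watermark is read only from the hot segment, which is the "neither dropped nor duplicated" property described above. Moving the watermark then only changes the constant in both filters.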

Here are some non-responsibilities in the initial implementation.

  • Handling INSERTs. Those go to the default MergeTree segment.
  • Moving partitions between segments.
  • Moving the watermark.
  • Coordinating the transactions that move partitions, which need something close to ACID semantics. A service outside of Infinity tables will take care of this.

Describe alternatives you've considered

  • Extend MergeTree tiered storage. Tiered storage already works well within MergeTree, but it is baked into the engine and cannot easily be extended to allow segments to cross engine types.
  • Build it into the Distributed engine. This would be more complex to implement and would require constant remerging of changes to the Distributed table engine. For now, keeping things separate seems simplest.

Additional context
(To be added)
