Milestones

  • Miscellaneous backlog tickets for Rivulet workstream (multimodal dataset format)

    No due date
    1/2 issues closed
  • Create DeltaCAT V2 APIs, including (1) a native DeltaCAT Catalog implementation, (2) a native DeltaCAT CLI and corresponding Linux-FS-like APIs, and (3) Ray/Daft data source/sink adapters (to enable local and distributed reads/writes of DeltaCAT catalogs). The DeltaCAT Catalog implementation should also include all capabilities in `deltacat/storage/rivulet/dataset.py` (mostly at the table version level), including: (1) managing (multiple) schemas on a dataset, (2) importing data (e.g. from_csv), (3) exporting data (e.g. to webdataset), and (4) read and write methods (the DeltaCAT catalog currently has somewhat different read/write methods from Rivulet).

    No due date
    2/9 issues closed
  • The DeltaCAT V2 Metastore is defined by a working implementation of the DeltaCAT Storage Interface, which controls all metadata I/O. This milestone tracks the development of the DeltaCAT V2 Native Storage Implementation, which forms an abstraction layer over all lower-level code in `metafile.py` and `transaction.py` that operates directly on metafiles.

    No due date
    3/3 issues closed
  • Use the DeltaCAT Metastore format in Rivulet. Be able to express Rivulet concepts (e.g. multiple schemas) in the DeltaCAT Metastore. Clean up internal Rivulet classes that will no longer be needed.

    No due date
    1/7 issues closed
  • Implement all required DeltaCAT storage APIs and make any changes required to integrate LSM-based CDC on Ray with Iceberg. Proposal doc: https://docs.google.com/document/d/1kyyJp4masbd1FrIKUHF1ED_z1hTARL8bNoKCgb7fhSQ/edit

    No due date
    8/13 issues closed
  • Track enhancements to the existing DeltaCAT compactor that create and better leverage enhanced primary key indices.

    No due date
    2/5 issues closed