support millisecond timestamps in iceberg compatibility mode #6352
Conversation
Signed-off-by: Max Falk <[email protected]>
Please fix
Signed-off-by: Max Falk <[email protected]>
Done, I've also updated the docs.
@JingsongLi good to merge?
| "Paimon Iceberg compatibility only support timestamp type with precision from 4 to 6."); | ||
| timestampPrecision >= 3 && timestampPrecision <= 6, | ||
| "Paimon Iceberg compatibility only support timestamp type with precision from 3 to 6."); | ||
| return Timestamp.fromMicros(timestampLong); |
If precision is 3, this long should be in millis.
@JingsongLi great catch, added
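A minimal sketch of the fix being discussed: branch on precision so the raw long is interpreted as epoch milliseconds when precision is 3 and as microseconds otherwise. The method name `toTimestamp` and the surrounding scaffolding are hypothetical, not the actual Paimon code; `Timestamp.fromEpochMillis` and `Timestamp.fromMicros` are assumed from `org.apache.paimon.data.Timestamp`.

```java
import org.apache.paimon.data.Timestamp;

// Hypothetical sketch of the precision-aware read path; not the actual method.
static Timestamp toTimestamp(long timestampLong, int timestampPrecision) {
    if (timestampPrecision < 3 || timestampPrecision > 6) {
        throw new IllegalArgumentException(
                "Paimon Iceberg compatibility only support timestamp type with precision from 3 to 6.");
    }
    if (timestampPrecision == 3) {
        // For precision 3 the raw long is epoch milliseconds, not microseconds.
        return Timestamp.fromEpochMillis(timestampLong);
    }
    return Timestamp.fromMicros(timestampLong);
}
```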
* master: (162 commits)
  * [Python] Rename to BATCH_COMMIT_IDENTIFIER in snapshot.py
  * [Python] Suppport multi prepare commit in the same TableWrite (apache#6526)
  * [spark] Fix drop temporary view (apache#6529)
  * [core] skip validate main branch before orphan files cleaning (apache#6524)
  * [core][spark] Introduce upper transform (apache#6521)
  * [Python] Keep the variable names of Identifier consistent with Java (apache#6520)
  * [core] Remove hash lookup to simplify interface (apache#6519)
  * [core][format] Format Table plan partitions should ignore hidden & illegal dirs (apache#6522)
  * [hotfix] Print partition spec and type when error in InternalRowPartitionComputer
  * [hotfix] Add more informat to check partition spec in InternalRowPartitionComputer
  * [hotfix] Use deleteDirectoryQuietly in TempFileCommitter.clean
  * [core] format table: support write file in _temporary at first (apache#6510)
  * [core] Support non null column with write type (apache#6513)
  * [core][fix] Blob with rolling file failed (apache#6518)
  * [core][rest] Support schema validation and infer for external paimon table (apache#6501)
  * [hotfix] Correct visitors for TransformPredicate
  * [hotfix] Rename to copy from withNewInputs in TransformPredicate
  * [core][spark] Support push down transform predicate (apache#6506)
  * [spark] Implement SupportsReportStatistics for PaimonFormatTableBaseScan (apache#6515)
  * [docs] add docs for auto-clustering of historical partitions (apache#6516)
  * ...
Signed-off-by: Max Falk <[email protected]>
+1
* upstream/master: (51 commits)
  * [test] Fix unstable test: handle MiniCluster shutdown gracefully in collect method (apache#6913)
  * [python] fix ray dataset not lazy loading issue when parallelism = 1 (apache#6916)
  * [core] Refactor ExternalPathProviders abstraction
  * [spark] fix Merge Into unstable tests (apache#6912)
  * [core] Enable Entropy Inject for data file path to prevent being throttled by object storage (apache#6832)
  * [iceberg] support millisecond timestamps in iceberg compatibility mode (apache#6352)
  * [spark] Handle NPE for pushdown aggregate when a datasplit has a null max/min value (apache#6611)
  * [test] Fix unstable case testLimitPushDown
  * [core] Refactor row id pushdown to DataEvolutionFileStoreScan
  * [spark] paimon-spark supports row id push down (apache#6697)
  * [spark] Support compact_database procedure (apache#6328) (apache#6910)
  * [lucene] Fix row count in IndexManifestEntry
  * [test] Remove unstable test: AppendTableITCase.testFlinkMemoryPool
  * [core] Refactor Global index writer and reader for Btree
  * [core] Minor refactor to magic number into footer
  * [core] Support btree global index in paimon-common (apache#6869)
  * [spark] Optimize compact for data-evolution table, commit multiple times to avoid out of memory (apache#6907)
  * [rest] Add fromSnapshot to rollback (apache#6905)
  * [test] Fix unstable RowTrackingTestBase test
  * [core] Simplify FileStoreCommitImpl to extract some classes (apache#6904)
  * ...
Purpose
This builds on #4318 by allowing Paimon's millisecond-precision timestamp type to be converted to the canonical microsecond timestamps in Iceberg.
We ingest data from Kafka and other sources into Paimon and our Iceberg-based data lake. Kafka timestamps and some business data carry millisecond precision, and this change supports that data without requiring any type changes in the ingestion pipeline.
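As a rough illustration of the conversion this enables (the variable names below are hypothetical, not Paimon API): Iceberg's canonical timestamp is epoch microseconds, so a millisecond value only needs a factor-of-1000 scaling, which round-trips losslessly.

```java
// Illustration only: Iceberg stores timestamps as epoch microseconds, so a
// Paimon millisecond-precision value scales by a factor of 1000 each way.
long paimonMillis = 1_700_000_000_123L;                        // millisecond-precision value
long icebergMicros = Math.multiplyExact(paimonMillis, 1_000L); // canonical Iceberg micros
long roundTripped = Math.floorDiv(icebergMicros, 1_000L);      // == paimonMillis, lossless
```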
Tests
IcebergConversionsTimestampTest.java