
Conversation

@gmdfalk (Contributor) commented Sep 29, 2025

Purpose

This builds on #4318 by allowing Paimon's millisecond timestamp type to be converted to the canonical microsecond representation in Iceberg.

We ingest data from Kafka and other sources into Paimon and our Iceberg-based data lake. Kafka timestamps and some business data carry millisecond precision; this change supports that data without requiring any type changes in the ingestion pipeline.
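Since Iceberg's canonical timestamp unit is microseconds, converting a millisecond-precision value amounts to a factor-of-1000 scale-up. A minimal standalone sketch of that idea (plain Java, no Paimon or Iceberg classes; the method name and the precision buckets are illustrative assumptions, not the PR's actual code):

```java
public class MillisToMicrosSketch {

    /**
     * Scale a timestamp long to Iceberg's canonical microsecond unit.
     * Assumption for illustration: precision 3 means the long carries
     * epoch milliseconds, precision 4-6 means it already carries
     * microseconds.
     */
    static long toIcebergMicros(long value, int precision) {
        if (precision < 3 || precision > 6) {
            throw new IllegalArgumentException(
                    "Only precision 3 to 6 is supported, got: " + precision);
        }
        // millis -> micros needs *1000; micros pass through unchanged
        return precision == 3 ? value * 1_000L : value;
    }

    public static void main(String[] args) {
        System.out.println(toIcebergMicros(123L, 3)); // 123000
        System.out.println(toIcebergMicros(123_456L, 6)); // 123456
    }
}
```

The key point for the ingestion pipeline is that only the declared precision changes the interpretation of the stored long; no schema rewrite is needed.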

Tests

IcebergConversionsTimestampTest.java

@JingsongLi (Contributor) commented:

Please fix IcebergDataFieldTest.testTimestampPrecisionValidation.

@gmdfalk (Contributor, Author) commented Sep 30, 2025

> Please fix IcebergDataFieldTest.testTimestampPrecisionValidation.

Done, I've also updated the docs.

@gmdfalk (Contributor, Author) commented Sep 30, 2025

@JingsongLi good to merge?

Reviewed diff excerpt:

    -        "Paimon Iceberg compatibility only support timestamp type with precision from 4 to 6.");
    +        timestampPrecision >= 3 && timestampPrecision <= 6,
    +        "Paimon Iceberg compatibility only support timestamp type with precision from 3 to 6.");
    ...
             return Timestamp.fromMicros(timestampLong);
@JingsongLi (Contributor) commented:

If precision is 3, this long should be a millis.
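The reviewer's point: a long read back at precision 3 holds epoch milliseconds, so passing it to a micros-based constructor would shrink the timestamp by a factor of 1000. A standalone sketch of the precision-dependent branch (plain Java using `java.time.Instant`; `fromMillis`/`fromMicros` here are hypothetical stand-ins for the real conversion methods, not Paimon's API):

```java
import java.time.Instant;

public class PrecisionBranchSketch {

    // Hypothetical stand-in for a millis-based constructor.
    static Instant fromMillis(long millis) {
        return Instant.ofEpochMilli(millis);
    }

    // Hypothetical stand-in for a micros-based constructor.
    static Instant fromMicros(long micros) {
        return Instant.ofEpochSecond(micros / 1_000_000L,
                (micros % 1_000_000L) * 1_000L);
    }

    // The fix: choose the constructor based on the declared precision,
    // instead of always treating the long as microseconds.
    static Instant toInstant(long value, int precision) {
        return precision == 3 ? fromMillis(value) : fromMicros(value);
    }

    public static void main(String[] args) {
        long millis = 123_456L;
        // Misreading millis as micros: 0.123456 s after the epoch.
        System.out.println(fromMicros(millis));
        // Correct branch for precision 3: 123.456 s after the epoch.
        System.out.println(toInstant(millis, 3));
    }
}
```

The bug is silent because both interpretations produce a valid timestamp; only the magnitude is off, which is why a precision-aware branch (rather than a blanket `fromMicros`) is needed.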

@gmdfalk (Contributor, Author) replied:


@JingsongLi great catch, added.

gmdfalk and others added 4 commits November 5, 2025 09:45
* master: (162 commits)
  [Python] Rename to BATCH_COMMIT_IDENTIFIER in snapshot.py
  [Python] Suppport multi prepare commit in the same TableWrite  (apache#6526)
  [spark] Fix drop temporary view (apache#6529)
  [core] skip validate main branch before orphan files cleaning (apache#6524)
  [core][spark] Introduce upper transform (apache#6521)
  [Python] Keep the variable names of Identifier consistent with Java (apache#6520)
  [core] Remove hash lookup to simplify interface (apache#6519)
  [core][format] Format Table plan partitions should ignore hidden & illegal dirs (apache#6522)
  [hotfix] Print partition spec and type when error in InternalRowPartitionComputer
  [hotfix] Add more informat to check partition spec in InternalRowPartitionComputer
  [hotfix] Use deleteDirectoryQuietly in TempFileCommitter.clean
  [core] format table: support write file in _temporary at first (apache#6510)
  [core] Support non null column with write type (apache#6513)
  [core][fix] Blob with rolling file failed (apache#6518)
  [core][rest] Support schema validation and infer for external paimon table (apache#6501)
  [hotfix] Correct visitors for TransformPredicate
  [hotfix] Rename to copy from withNewInputs in TransformPredicate
  [core][spark] Support push down transform predicate (apache#6506)
  [spark] Implement SupportsReportStatistics for PaimonFormatTableBaseScan (apache#6515)
  [docs] add docs for auto-clustering of historical partitions (apache#6516)
  ...
Signed-off-by: Max Falk <[email protected]>
@JingsongLi (Contributor) commented:

+1

@JingsongLi JingsongLi merged commit 44b14ee into apache:master Dec 28, 2025
23 of 24 checks passed
jerry-024 added a commit to jerry-024/paimon that referenced this pull request Dec 29, 2025
* upstream/master: (51 commits)
  [test] Fix unstable test: handle MiniCluster shutdown gracefully in collect method (apache#6913)
  [python] fix ray dataset not lazy loading issue when parallelism = 1 (apache#6916)
  [core] Refactor ExternalPathProviders abstraction
  [spark] fix Merge Into unstable tests (apache#6912)
  [core] Enable Entropy Inject for data file path to prevent being throttled by object storage (apache#6832)
  [iceberg] support millisecond timestamps in iceberg compatibility mode (apache#6352)
  [spark] Handle NPE for pushdown aggregate when a datasplit has a null max/min value (apache#6611)
  [test] Fix unstable case testLimitPushDown
  [core] Refactor row id pushdown to DataEvolutionFileStoreScan
  [spark] paimon-spark supports row id push down (apache#6697)
  [spark] Support compact_database procedure (apache#6328) (apache#6910)
  [lucene] Fix row count in IndexManifestEntry
  [test] Remove unstable test: AppendTableITCase.testFlinkMemoryPool
  [core] Refactor Global index writer and reader for Btree
  [core] Minor refactor to magic number into footer
  [core] Support btree global index in paimon-common (apache#6869)
  [spark] Optimize compact for data-evolution table, commit multiple times to avoid out of memory (apache#6907)
  [rest] Add fromSnapshot to rollback (apache#6905)
  [test] Fix unstable RowTrackingTestBase test
  [core] Simplify FileStoreCommitImpl to extract some classes (apache#6904)
  ...
3 participants