@velo velo commented Oct 14, 2025

Summary

Adds the AWS SDK v2 dependencies that Iceberg's S3FileIO needs at runtime. This fixes the NoClassDefFoundError: software/amazon/awssdk/core/exception/SdkException thrown when using an Iceberg catalog with S3.

Problem

When creating an Iceberg catalog with S3 storage using io-impl = org.apache.iceberg.aws.s3.S3FileIO, the application fails with:

java.lang.NoClassDefFoundError: software/amazon/awssdk/core/exception/SdkException

This happens because Iceberg's S3FileIO requires AWS SDK v2, but the runtime JAR was included without its dependencies, so the AWS SDK classes are missing from the classpath.

Solution

Added specific AWS SDK v2 dependencies to the parent and flink-sql-runner POMs:

  • software.amazon.awssdk:s3 - S3 client
  • software.amazon.awssdk:sts - Security Token Service (for credential management)
  • software.amazon.awssdk:url-connection-client - HTTP client for AWS SDK
  • Plus transitive dependencies (auth, regions, sdk-core, protocol implementations, etc.)

Total size: 7.2MB (instead of 482MB if using the full AWS SDK bundle)

The specific modules and their transitive dependencies are automatically copied by the Maven dependency plugin and included in the Docker image.
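
For reviewers who want a picture of the change without opening the diff, the added dependency declarations might look roughly like the fragment below. This is a sketch only, not the exact POM content of this PR: the three artifact coordinates are the ones listed above, while the `aws.sdk.version` property name is an assumption and stands for whichever AWS SDK v2 release the build is pinned to.

```xml
<!-- Illustrative sketch; the actual POM changes in this PR may differ. -->
<dependencies>
  <!-- S3 client used by Iceberg's S3FileIO -->
  <dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <!-- aws.sdk.version is a hypothetical property name, assumed to be defined in the parent POM -->
    <version>${aws.sdk.version}</version>
  </dependency>
  <!-- Security Token Service, for credential management -->
  <dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>sts</artifactId>
    <version>${aws.sdk.version}</version>
  </dependency>
  <!-- HTTP client implementation for the AWS SDK -->
  <dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>url-connection-client</artifactId>
    <version>${aws.sdk.version}</version>
  </dependency>
</dependencies>
```

Modules such as sdk-core, auth, and regions are not declared explicitly in this sketch; Maven resolves them transitively from the three artifacts above, which is what keeps the footprint small compared with the full bundle.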

Related

Test Plan

  • Built project successfully with mvn clean install -Pfast
  • Verified all required AWS SDK JARs are copied to target directory and included in Docker image
  • Confirmed total AWS SDK dependency size is only 7.2MB

🤖 Generated with Claude Code

velo commented Oct 14, 2025

image

@velo velo requested a review from ferenc-csaky October 14, 2025 17:27

velo commented Oct 14, 2025

I left it running for a good few minutes, and it seems stable:

2025-10-14 17:39:52.868 [flink-pekko.actor.default-dispatcher-19] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - IcebergFilesCommitter -> IcebergSink mycatalog.default_database.Aggregation: Writer (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_8af46b81202ff240b39635087f68aa4e_0_0) switched from DEPLOYING to INITIALIZING.
2025-10-14 17:39:52.871 [flink-pekko.actor.default-dispatcher-19] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: _InputData[1] -> Calc[2] -> WatermarkAssigner[3] -> Calc[4] -> LocalWindowAggregate[5] (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to INITIALIZING.
2025-10-14 17:39:54.938 [flink-pekko.actor.default-dispatcher-15] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: _InputData[1] -> Calc[2] -> WatermarkAssigner[3] -> Calc[4] -> LocalWindowAggregate[5] (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from INITIALIZING to RUNNING.
2025-10-14 17:39:55.850 [flink-pekko.actor.default-dispatcher-15] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - GlobalWindowAggregate[7] -> (Calc[8] -> ConstraintEnforcer[9] -> StreamRecordTimestampInserter[9], Calc[10] -> ConstraintEnforcer[11] -> Sink: high_2[11]) (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_1f47e37d81ff856fd0d075813ec805ce_0_0) switched from INITIALIZING to RUNNING.
2025-10-14 17:39:57.047 [flink-pekko.actor.default-dispatcher-15] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - IcebergFilesCommitter -> IcebergSink mycatalog.default_database.Aggregation: Writer (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_8af46b81202ff240b39635087f68aa4e_0_0) switched from INITIALIZING to RUNNING.
2025-10-14 17:39:57.060 [flink-pekko.actor.default-dispatcher-15] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - IcebergStreamWriter (1/1) (7f0ccfdf72cd41e31fd13479e09f4da8_9772eb054cc729f24bef895461c2ebff_0_0) switched from INITIALIZING to RUNNING.
2025-10-14 17:40:48.907 [Checkpoint Timer] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 5 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1760463648883 for job d147883d3ef59d1ace9cb7a9ba92163c.
2025-10-14 17:40:52.004 [jobmanager-io-thread-3] INFO  org.apache.flink.fs.s3.common.writer.S3Committer - Committing checkpoints/24fdf152-40c2-4cc1-b44a-55ce122cb8a9/d147883d3ef59d1ace9cb7a9ba92163c/chk-5/_metadata with MPU ID 1XSdnQBI3eBHSSEUzmFDEtclGEsVmTf0.zUjFlOXXfRjc7HsH6DrlZLHCApErFVkA8ya3v0vhHtQhY0wyfZjnFV..GR7ZEWb0D5JxeuAY9TmJ_ZxuN.LJAdkdNZsZ1kBbqFM4IBx_Pznh.XyIBbb05.NX8.11ESk3I.na0B1M8E-
2025-10-14 17:40:52.553 [jobmanager-io-thread-3] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 5 for job d147883d3ef59d1ace9cb7a9ba92163c (35256 bytes, checkpointDuration=3293 ms, finalizationTime=377 ms).
2025-10-14 17:44:48.899 [Checkpoint Timer] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 6 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1760463888881 for job d147883d3ef59d1ace9cb7a9ba92163c.
2025-10-14 17:44:49.689 [jobmanager-io-thread-2] INFO  org.apache.flink.fs.s3.common.writer.S3Committer - Committing checkpoints/24fdf152-40c2-4cc1-b44a-55ce122cb8a9/d147883d3ef59d1ace9cb7a9ba92163c/chk-6/_metadata with MPU ID VM4iC3LHCElZGh9ZSCbKn0zAD06mlglsy9iBj2hDEr_M.EP84Yqiz5K4H_.r2SKDSsABdNnkea5DRmklYhJqZ73FnrfScdcLAoZ3Mg1z04RrvD.C7buki40UeQS8BfC_AMpuxszPeSAL_QP.mHohbFz4N2GKuMNjnpb31h4kDgg-
2025-10-14 17:44:50.241 [jobmanager-io-thread-2] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 6 for job d147883d3ef59d1ace9cb7a9ba92163c (48217 bytes, checkpointDuration=1048 ms, finalizationTime=312 ms).
2025-10-14 17:48:48.896 [Checkpoint Timer] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1760464128882 for job d147883d3ef59d1ace9cb7a9ba92163c.
2025-10-14 17:48:49.726 [jobmanager-io-thread-4] INFO  org.apache.flink.fs.s3.common.writer.S3Committer - Committing checkpoints/24fdf152-40c2-4cc1-b44a-55ce122cb8a9/d147883d3ef59d1ace9cb7a9ba92163c/chk-7/_metadata with MPU ID mwDBBBUEGw__yTcqmMtwjNZAi.uUW7UfpLCRo8VFsNn9ATL.AXvLEI3xvbdzp3Ugzi0G_r4i0au2kKEFle3RWZV3olpur7vPyna2ZiXf7F7UJvotPvwBKt2rs0E159wjDlQELAgrhN_SGN7lqLkvrZquoY2GJCsOJhFSJ6RsHTI-
2025-10-14 17:48:50.198 [jobmanager-io-thread-4] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7 for job d147883d3ef59d1ace9cb7a9ba92163c (61166 bytes, checkpointDuration=1019 ms, finalizationTime=297 ms).


velo commented Oct 14, 2025

Feel free to merge, @ferenc-csaky, if you're happy with the change.

@ferenc-csaky ferenc-csaky added this to the 0.8.2 milestone Oct 15, 2025
@ferenc-csaky ferenc-csaky merged commit 5ac167b into main Oct 15, 2025
2 checks passed
@ferenc-csaky ferenc-csaky deleted the fix/iceberg-aws-sdk-dependencies branch October 15, 2025 13:25
ferenc-csaky pushed a commit that referenced this pull request Oct 15, 2025