@ianton-ru ianton-ru commented Apr 23, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fixed format detection for table function iceberg (fixes #732)

Documentation entry for user-facing changes

With an empty Iceberg table, this query works:

select * from icebergS3('http://minio:9000/warehouse/data/', 'minio', 'minio123')

but this one does not:

select * from iceberg('http://minio:9000/warehouse/data/', 'minio', 'minio123')

Code: 715. DB::Exception: Received from localhost:9000. DB::Exception: The data format cannot be detected by the contents of the files, because there are no files with provided path in S3ObjectStorage or all files are empty. You can specify the format manually: The data format cannot be detected by the contents of the files. You can specify the format manually. (CANNOT_DETECT_FORMAT)

This happens because StorageIcebergConfiguration returns its own default-initialized fields instead of the fields of the specific configuration implementation (S3, Azure or HDFS). For a data lake the format is currently always 'Parquet', but when the field is not filled in properly, ClickHouse tries to detect the format from the source files, and that detection fails when the table is empty.

Technical changes: the fields in StorageObjectStorage::Configuration are now private and are accessed via getters and setters.
Logical changes: these getters and setters are overridden in StorageIcebergConfiguration to use the proper implementation.
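
The two changes above can be illustrated with a minimal, self-contained sketch. It is not the actual ClickHouse code: the class names `Configuration`, `S3Configuration` and `IcebergConfiguration`, the `impl` member, the `"auto"` default and the helper `needsFormatDetection` are simplified stand-ins assumed for illustration only. The point is that once the fields are private, the wrapper can only answer through virtual accessors, and those accessors can forward to the concrete implementation instead of exposing the wrapper's own default-initialized field.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for StorageObjectStorage::Configuration.
// The field is private, so every reader has to go through the virtual accessors.
class Configuration
{
public:
    virtual ~Configuration() = default;

    virtual const std::string & getFormat() const { return format; }
    virtual void setFormat(const std::string & value) { format = value; }

private:
    // Assumed default meaning "not set, detect from file contents".
    std::string format = "auto";
};

// Hypothetical stand-in for a concrete implementation (S3, Azure or HDFS).
class S3Configuration : public Configuration
{
};

// Hypothetical stand-in for StorageIcebergConfiguration: it wraps a concrete
// implementation and forwards the accessors to it, so its own inherited
// default-initialized field is never observed by callers.
class IcebergConfiguration : public Configuration
{
public:
    explicit IcebergConfiguration(std::shared_ptr<Configuration> impl_)
        : impl(std::move(impl_))
    {
    }

    const std::string & getFormat() const override { return impl->getFormat(); }
    void setFormat(const std::string & value) override { impl->setFormat(value); }

private:
    std::shared_ptr<Configuration> impl;
};

// Returns true when a caller would have to sniff the format from the files,
// which is the step that fails for an empty table.
inline bool needsFormatDetection(const Configuration & config)
{
    return config.getFormat() == "auto";
}
```

In this sketch, setting the format to `"Parquet"` on the wrapped `S3Configuration` is visible through `IcebergConfiguration::getFormat()`, so `needsFormatDetection` returns false and no content-based detection would be attempted.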

@ianton-ru ianton-ru force-pushed the feature/fix_configuration_format branch from eba1242 to 173d6a5 Compare April 23, 2025 09:48
@ianton-ru ianton-ru changed the title from "Make fields in object storage configuration private" to "Fix format, structure and compression method detection for DataLake" Apr 23, 2025
@Enmk Enmk merged commit fde55e4 into antalya May 5, 2025
245 of 309 checks passed
@svb-alt svb-alt added the antalya-25.2.2 Planned for 25.2.2 release label May 6, 2025
ianton-ru pushed a commit that referenced this pull request May 23, 2025
Fix format, structure and compression method detection for DataLake
Development

Successfully merging this pull request may close these issues.

Fail to read empty Iceberg table with iceberg table function
