Performance issue with range queries over a partitioned table.

I ran into a performance issue when querying an Iceberg table stored in S3 through the DataFusion table provider. The table was created with pyiceberg using the following schema:

```python
from pyiceberg.schema import Schema
from pyiceberg.types import (
    BooleanType, DateType, LongType, NestedField, StringType, TimestampType,
)

schema = Schema(
    NestedField(1, "id", LongType(), required=True),
    NestedField(2, "name", StringType(), required=False),
    NestedField(3, "b", BooleanType(), required=True),
    NestedField(4, "ts", TimestampType(), required=True),
    NestedField(5, "dt", DateType(), required=True),
)
```

The table is partitioned by day, derived from the `ts` column with a day transform:

```python
from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.transforms import DayTransform

partition_spec = PartitionSpec(
    PartitionField(
        source_id=4, field_id=1000, transform=DayTransform(), name="date"
    )
)
```

The table holds 10,000,000 records spread evenly across ~200 partitions, covering dates between 2023-01-01 and 2023-08-02.

I query the table with `iceberg-rust` through the DataFusion table provider, using range queries of the form:

```sql
select * from my_table where ts >= timestamp '2023-01-05T00:00:00' and ts < timestamp '2023-01-06T00:00:00'
```

I expect this query to be very efficient, since it only needs to read one partition. In practice it takes about as long as scanning the entire table with `select * from my_table` (approximately 10 seconds). It looks like predicate pushdown is not being applied here for some reason.
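To make the "one partition" expectation concrete, here is a minimal sketch of how the timestamp range in the query maps onto day-partition values. It assumes the day transform yields the number of days since 1970-01-01, as the Iceberg spec describes, and that timestamps have microsecond precision. The helper names `day_ordinal` and `matching_day_partitions` are hypothetical and only illustrate the arithmetic; this is not how `iceberg-rust` implements pruning.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def day_ordinal(ts: datetime) -> int:
    """Days since 1970-01-01 (assumed output of the day transform)."""
    return (ts - EPOCH).days


def matching_day_partitions(lower_inclusive: datetime, upper_exclusive: datetime) -> range:
    """Day-partition values that can hold rows with lower <= ts < upper."""
    # The upper bound is exclusive, so step back one microsecond
    # (assumed timestamp precision) to find the last day that can match.
    last = upper_exclusive - timedelta(microseconds=1)
    return range(day_ordinal(lower_inclusive), day_ordinal(last) + 1)


# Bounds taken from the query above.
partitions = matching_day_partitions(datetime(2023, 1, 5), datetime(2023, 1, 6))
print(len(partitions), "day partition(s) can match:", list(partitions))
```

Under these assumptions the range covers exactly one day value, so a scan that prunes on the partition field would need to read roughly 1/200 of the data files.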

Questions:
* Is this a performance issue in `iceberg-rust` or am I doing something wrong?
* Is there a better way to perform this query efficiently?

I am using the latest `main` branch of this repo.

Thanks in advance!

Performance issue with range queries over a partitioned table. #811

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Performance issue with range queries over a partitioned table. #811

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions