Storage: Bucket.list_blobs(max_results=n) does not behave as documented #19

@bz2

Description

The max_results parameter of list_blobs() is documented as controlling the maximum number of blobs returned in each page of results, but it actually limits the total number of results, as its name implies.

Compare the Bucket.list_blobs() documentation:
https://googleapis.dev/python/storage/latest/buckets.html#google.cloud.storage.bucket.Bucket.list_blobs

max_results (int) – The maximum number of blobs in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.

With the Iterator documentation:
https://googleapis.dev/python/google-api-core/latest/page_iterator.html#google.api_core.page_iterator.Iterator

max_results (int) – The maximum number of results to fetch.

In addition, the implementation of HTTPIterator, which list_blobs() uses internally, treats max_results as a hard limit on the total number of results (num_results):
https://github.com/googleapis/google-cloud-python/blob/master/api_core/google/api_core/page_iterator.py#L378
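To make the observed behavior concrete, here is a toy model of a paging iterator that treats max_results as a cap on the total. It is only a sketch: fetch_pages, server_page_size and the in-memory items list are hypothetical stand-ins, not the api_core code, and no network calls are involved.

```python
from typing import Iterator, List, Optional


def fetch_pages(
    items: List[int],
    server_page_size: int,
    max_results: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield pages of `items`, stopping once `max_results` items were yielded in total.

    Hypothetical model: `server_page_size` stands in for the page size the
    backend would choose on its own.
    """
    num_results = 0  # running total across all pages
    offset = 0
    while offset < len(items):
        limit = server_page_size
        if max_results is not None:
            remaining = max_results - num_results
            if remaining <= 0:
                return  # total cap reached, no further pages
            limit = min(limit, remaining)  # shrink the last page to fit the cap
        page = items[offset:offset + limit]
        offset += len(page)
        num_results += len(page)
        yield page


pages = list(fetch_pages(list(range(250)), server_page_size=40, max_results=100))
total = sum(len(page) for page in pages)
print([len(page) for page in pages], total)
```

In this model max_results never changes how large a full page is; it only decides when paging stops, which matches the Iterator documentation rather than the list_blobs() wording.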

Code example

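# Use a fresh iterator for each check: an Iterator can only be consumed once.
blobs = list(some_big_bucket.list_blobs(max_results=100))
assert len(blobs) > 100  # raises AssertionError: at most 100 blobs come back

pages = some_big_bucket.list_blobs(max_results=100).pages
assert sum(len(list(page)) for page in pages) > 100  # raises AssertionError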

Suggested resolution

Change the documentation to match what the parameter actually does. If supplying a paging size is required, a new argument to HTTPIterator could be added and exposed up through the list_blobs() interface.
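As a rough illustration of the second option, the sketch below separates the two concerns: a per-page size and a total cap. The names paginate and page_size are hypothetical and chosen only for this example; this is not an existing HTTPIterator or list_blobs() signature, and it pages over a local iterable rather than an HTTP API.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def paginate(
    source: Iterable[T],
    page_size: int,
    max_results: Optional[int] = None,
) -> Iterator[List[T]]:
    """Yield lists of at most `page_size` items, at most `max_results` items overall."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    it = iter(source)
    if max_results is not None:
        # Total cap: truncate the underlying stream before paging it.
        it = islice(it, max(max_results, 0))
    while True:
        page = list(islice(it, page_size))  # per-page cap
        if not page:
            return
        yield page


sizes = [len(page) for page in paginate(range(1000), page_size=30, max_results=100)]
print(sizes)
```

Keeping the two limits as separate arguments means callers can tune request granularity without changing how many results they receive in total.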

Metadata

    Labels

    api: storage (Issues related to the googleapis/python-storage API.)
    priority: p2 (Moderately-important priority. Fix may not be included in next release.)
    type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
