Storage: upload_from_string() with ifGenerationMatch=0 #16

@1fish2

The GCS HTTP protocol -- but not the Python API -- has the ability to set ifGenerationMatch when creating a storage object:

Makes the operation conditional on whether the object's current generation matches the given value. Setting to 0 makes the operation succeed only if there are no live versions of the object.
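For reference, here is a minimal sketch of how that precondition appears on the wire. It builds the URL for a JSON API simple upload with `ifGenerationMatch=0` in the query string. The helper name `conditional_create_url` and the bucket name `my-bucket` are made up for illustration, and the sketch only constructs the URL; it does not authenticate or send the request.

```python
from urllib.parse import quote, urlencode

# JSON API simple-upload endpoint; the bucket name is filled in below.
UPLOAD_ENDPOINT = "https://storage.googleapis.com/upload/storage/v1/b/{bucket}/o"


def conditional_create_url(bucket_name, object_name):
    """Return an upload URL carrying the ifGenerationMatch=0 precondition.

    Hypothetical helper for illustration only; not part of any library.
    """
    query = urlencode({
        "uploadType": "media",      # simple (single-request) upload
        "name": object_name,        # slashes are percent-encoded by urlencode
        "ifGenerationMatch": 0,     # 0 = succeed only if no live version exists
    })
    return UPLOAD_ENDPOINT.format(bucket=quote(bucket_name, safe="")) + "?" + query


url = conditional_create_url("my-bucket", "path/to/my/subdirectory/")
print(url)
```

As far as I understand it, a POST to such a URL that fails the precondition should come back as 412 Precondition Failed, which the caller can treat as "already exists".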

Why it's useful: With this feature, the client could create a directory placeholder entry (a 0-byte object with a name ending in '/') very efficiently like this:

blob = bucket.blob('path/to/my/subdirectory/')
blob.upload_from_string(b'', if_generation_match=0)

That one round trip creates the directory placeholder entry if it doesn't already exist. The alternatives are to first make a round trip to check if the entry exists or else to let the bucket accumulate identical placeholder entries (esp. for top level directories) by blindly creating them. [Or does GCS check if an uploaded object matches the current generation and optimize that case? -- Nope.]
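For comparison, here is a rough sketch of the check-then-create alternative using only calls the Python API has today (`Bucket.blob()`, `Blob.exists()`, `Blob.upload_from_string()`). The function name `ensure_dir_placeholder` is hypothetical, and `bucket` is assumed to be a `google.cloud.storage.Bucket` or anything with the same methods.

```python
def ensure_dir_placeholder(bucket, dir_path):
    """Create a 0-byte directory placeholder unless one already exists.

    Returns True if a placeholder was created, False if it was already there.
    Hypothetical helper; `bucket` is assumed to behave like
    google.cloud.storage.Bucket.
    """
    # Placeholder names end in '/', as described above.
    if not dir_path.endswith("/"):
        dir_path += "/"

    blob = bucket.blob(dir_path)

    # Round trip 1: ask the server whether the object exists.
    if blob.exists():
        return False

    # Round trip 2: create the empty placeholder. Another client could
    # create it between the check and this call, so the check is not atomic.
    blob.upload_from_string(b"")
    return True
```

This costs two round trips per directory and still leaves a window where two clients both create the entry, which is what a single upload with the precondition would avoid.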

Why that matters: Directory placeholders speed up gcsfuse by an order of magnitude. Without the placeholders, you have to use gcsfuse in --implicit-dirs mode, and such a mount is frustratingly slow for interactive work. E.g. it takes several seconds just to list a tiny directory containing 2 files. With the placeholders, you can run gcsfuse without --implicit-dirs, and that mount lists directories in a tenth or two of a second.

Proposal: I could create a Pull Request adding this feature if you like, with either the specific if_generation_match query parameter or a way to pass in additional query parameters.

Another alternative is to recommend that callers do something like subclassing Blob to override _add_query_parameters() to add the if_generation_match=0 name-value pair. That's ugly and fragile.

Is there a way to do this that I'm missing? Are there better alternatives?

Labels: api: storage, type: feature request
