Upload async file data #28307

Closed
aersam opened this issue Jan 12, 2023 · 8 comments · Fixed by #28472
Labels
Client: This issue points to a problem in the data-plane of the library.
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team.
question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
Service Attention: Workflow: This issue is responsible by Azure service team.
Storage: Storage Service (Queues, Blobs, Files)

Comments

@aersam

aersam commented Jan 12, 2023

Is your feature request related to a problem? Please describe.
Hi there! I'm trying to upload a block blob using an async source:

async with aiofiles.open(file=path, mode="rb") as data:
    await blob_client.upload_blob(data, metadata=tags, overwrite=True)

This doesn't work: the data is supposed to be bytes.

Describe the solution you'd like
I'd like to be able to use an async stream as source for the async Blob Client

Describe alternatives you've considered
My current workaround is to load the whole file into RAM:

async with aiofiles.open(file=path, mode="rb") as data:
    dt = await data.read()
    await blob_client.upload_blob(dt, metadata=tags, overwrite=True)

As long as my files don't get too large, this will work.
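To illustrate what reading from an async source without holding the whole file in RAM could look like, here is a minimal sketch using only the standard library. `AsyncReader` is a hypothetical stand-in for an async file object (it is not the aiofiles class), and `iter_chunks` and `CHUNK_SIZE` are illustrative names. It does not call the Azure SDK; it only shows the bounded-memory read loop a consumer of such a stream would need.

```python
import asyncio
import io

CHUNK_SIZE = 4  # deliberately tiny for the demo; real code would use megabytes


class AsyncReader:
    """Hypothetical stand-in for an async file object with an async read()."""

    def __init__(self, raw):
        self._raw = raw

    async def read(self, size=-1):
        # Run the blocking read in the default thread pool executor.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._raw.read, size)


async def iter_chunks(stream, chunk_size):
    """Yield successive chunks until the stream is exhausted."""
    while True:
        chunk = await stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


async def collect(stream, chunk_size):
    # Collecting into a list is only for demonstration; a real consumer
    # would process each chunk and discard it to keep memory bounded.
    return [chunk async for chunk in iter_chunks(stream, chunk_size)]


chunks = asyncio.run(collect(AsyncReader(io.BytesIO(b"0123456789")), CHUNK_SIZE))
```

At any moment only one chunk of at most `CHUNK_SIZE` bytes needs to be alive, which is the property the full `data.read()` workaround gives up.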

Additional context
Add any other context or screenshots about the feature request here.

@ghost ghost added customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jan 12, 2023
@github-actions github-actions bot added the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Jan 12, 2023
@xiangyan99 xiangyan99 added Storage Storage Service (Queues, Blobs, Files) Client This issue points to a problem in the data-plane of the library. CXP Attention and removed needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. labels Jan 12, 2023
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Jan 12, 2023
@ghost

ghost commented Jan 12, 2023

Thank you for your feedback. This has been routed to the support team for assistance.

@navba-MSFT navba-MSFT added Service Attention Workflow: This issue is responsible by Azure service team. and removed CXP Attention labels Jan 16, 2023
@ghost

ghost commented Jan 16, 2023

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

Issue Details

Author: aersam
Assignees: jalauzon-msft, vincenttran-msft
Labels:

Storage, question, Service Attention, Client, customer-reported, needs-team-attention

Milestone: -

@navba-MSFT
Contributor

@jalauzon-msft @vincenttran-msft Could you please look into this once you get a chance? Thanks in advance.

@jalauzon-msft
Member

Hi @aersam Adrian, thanks for reaching out. Could you please share which version of the Blob SDK you are using? We recently released support for streams with an async read method in version 12.14.0 which I believe should work with aiofiles. Another thing to check is to make sure you are using an async BlobClient which would be imported from the aio module. Async streams are only supported on the async clients.

from azure.storage.blob.aio import BlobClient

@navba-MSFT navba-MSFT added needs-author-feedback Workflow: More information is needed from author to address the issue. and removed needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team labels Jan 18, 2023
@aersam
Author

aersam commented Jan 19, 2023

Hi there
I'm using aiofiles 22.1 and azure-storage-blob 12.14.1

I am using the aio variants. Who does not? :)

Pyright already complains when doing so:

    async with aiofiles.open(file=fake_path or data.path, mode="rb") as fdata:
        await blob_client.upload_blob(fdata, metadata=data.tags, overwrite=True)

The error at runtime:

  File "c:\Projects\DataHub\BlobTagging\blobtagging\main.py", line 96, in do
    await upload_blob(client, data=data, target_root=target_root, fake_path=fake_path)
  File "c:\Projects\DataHub\BlobTagging\blobtagging\main.py", line 90, in upload_blob 
    await blob_client.upload_blob(fdata, metadata=data.tags, overwrite=True)
  File "c:\Projects\DataHub\BlobTagging\.venv\lib\site-packages\azure\core\tracing\decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "c:\Projects\DataHub\BlobTagging\.venv\lib\site-packages\azure\storage\blob\aio\_blob_client_async.py", line 405, in upload_blob
    return await upload_block_blob(**options)
  File "c:\Projects\DataHub\BlobTagging\.venv\lib\site-packages\azure\storage\blob\aio\_upload_helpers.py", line 79, in upload_block_blob        
    raise TypeError('Blob data should be of type bytes.')
TypeError: Blob data should be of type bytes.

@ghost ghost added needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team and removed needs-author-feedback Workflow: More information is needed from author to address the issue. labels Jan 19, 2023
@aersam
Author

aersam commented Jan 19, 2023

It's a bit strange: I looked at your helper code and it basically does a check with asyncio.iscoroutinefunction(data.read), which I'd expect to return True, but it returns False. I'm not too deep into this Python async stuff, so I don't understand why. My Python version is 3.9.
See also python/cpython#81371.

I was able to fix it; see my PR.
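A method can be awaitable and still fail a coroutine-function check when it is a generator-based coroutine (for example, one created with `types.coroutine`) rather than an `async def`. Whether aiofiles 22.1 builds its `read` this way is an assumption here, and the thread does not show what the PR changed. The sketch below reproduces the symptom with the standard library only, then shows one common alternative: call `read` and test the result with `inspect.isawaitable`. `LegacyAsyncReader`, `NativeAsyncReader`, and `read_any` are hypothetical names. `inspect.iscoroutinefunction` is used because `asyncio.iscoroutinefunction` builds on the same flag check.

```python
import asyncio
import inspect
import io
import types


class LegacyAsyncReader:
    """Hypothetical reader whose read() is a generator-based coroutine."""

    def __init__(self, payload):
        self._payload = payload

    @types.coroutine
    def read(self, size=-1):
        # Awaitable, but not declared with `async def`.
        yield from asyncio.sleep(0)
        return self._payload if size < 0 else self._payload[:size]


class NativeAsyncReader:
    """Hypothetical reader whose read() is a native coroutine function."""

    def __init__(self, payload):
        self._payload = payload

    async def read(self, size=-1):
        await asyncio.sleep(0)
        return self._payload if size < 0 else self._payload[:size]


async def read_any(stream, size=-1):
    """Read from a stream whose read() may be sync, native async, or legacy async."""
    result = stream.read(size)
    if inspect.isawaitable(result):
        result = await result
    return result


legacy = LegacyAsyncReader(b"abc")
native = NativeAsyncReader(b"abc")

# The function-level check only recognises the `async def` variant.
legacy_flag = inspect.iscoroutinefunction(legacy.read)
native_flag = inspect.iscoroutinefunction(native.read)

# Checking the returned object works for all three kinds of stream.
legacy_data = asyncio.run(read_any(legacy))
native_data = asyncio.run(read_any(native))
sync_data = asyncio.run(read_any(io.BytesIO(b"abc")))
```

The trade-off of this approach is that the stream's kind is only known after `read` has been called once, so it suits code that decides per read rather than up front.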

aersam pushed a commit to aersam/azure-sdk-for-python that referenced this issue Jan 19, 2023
@jalauzon-msft
Member

jalauzon-msft commented Jan 20, 2023

Hi @aersam Adrian, thanks for the PR! Strange that the aiofiles async read method does not work with iscoroutinefunction, but your proposed change does seem like a reasonable workaround. Next week, let me check with some other folks on my team who are more familiar with this async Python stuff, to make sure the change will work in all other scenarios.

If they approve, there are one or two other places we'd need to apply this change at the same time that I may ask you to do or just submit a PR myself if I have the time.

@jalauzon-msft
Member

Hi again @aersam Adrian, I discussed with the team, and we think your change looks good and should help to increase our compatibility with different stream types. There were a few more places we needed to make this change in addition to your PR, so I went ahead and created #28472 instead of making you track them down. I will close #28410 in favor of this one.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 25, 2023