
TransferManager downloading to a stream deletes the stream before I finish reading it #1029


Closed
webbnh opened this issue Nov 30, 2018 · 3 comments
Labels
guidance Question that needs advice or information.


webbnh commented Nov 30, 2018

I'm trying to use the TransferManager to download a file to a stream. I've rolled my own derivations of std::iostream and std::stringbuf that allow the download to be read while it's happening, synchronizing the reader's access to the downloaded data (the parts don't necessarily arrive in order, or, at least, my reader frequently tries to read stuff before it has arrived and has to be forced to wait). For the most part, I think it's working (although I had to do some hairy stuff).

The problem is, I've just hit the wrinkle described here: when the transfer completes, the ResponseStream implementation deletes my stream, even though I'm not done reading it yet!

Have I wandered down the wrong path?... If so, how was I supposed to have set things up using the TransferManager::DownloadFile() function so that the stream (or, the data, at least...) stays around after the download is complete? (Otherwise, what's the point of downloading to a stream if I can't then read from the stream??)

Thanks!

@marcomagdy
Contributor

typedef std::function<Aws::IOStream*(void)> CreateDownloadStreamCallback; is a bit misleading, because the stream in this case is used as an Aws::OStream (output stream) and is owned by TransferManager. Once the transfer is done, the stream is deleted, but the data has been written (and flushed) to the underlying device.

Your best bet to achieve what you're trying to do is to have the underlying device in this case be another stream that you can read from. This is not a hack; streams are meant to be composable.

The OStream will write to your true iostream, but it won't own it. So when the OStream goes away, you can still read from your iostream.
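As a rough illustration of that composition, here is a minimal sketch that uses only the standard library. `CreateStreamFn` and `fake_transfer` are hypothetical stand-ins for the SDK's callback type and for the transfer itself; they are not SDK code. The point is only that deleting a stream object does not delete the stream buffer it was constructed on.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical stand-in for the SDK's CreateDownloadStreamCallback.
using CreateStreamFn = std::function<std::iostream*()>;

// Hypothetical stand-in for the transfer: writes through the stream it was
// given, flushes, then deletes the stream, as described above.
void fake_transfer(const CreateStreamFn& create, const std::string& payload) {
    std::iostream* out = create();
    assert(out != nullptr);
    out->write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out->flush();
    delete out;  // the stream object goes away; the buffer it wrapped does not
}

// Runs the fake transfer, then reads the data back after the stream is gone.
std::string download_and_read(const std::string& payload) {
    std::stringbuf device;  // owned by the caller, outlives the transfer

    // The factory hands out a non-owning stream layered on the caller's buffer.
    fake_transfer([&device] { return new std::iostream(&device); }, payload);

    std::istream reader(&device);  // separate, non-owning view for reading
    return std::string(std::istreambuf_iterator<char>(reader),
                       std::istreambuf_iterator<char>());
}
```

With the real SDK the factory would return an `Aws::IOStream*` built the same way; the buffer's lifetime then has to be managed by whoever reads from it.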

I know how painful iostreams are.
The plan is to get rid of them in v2.0.

@webbnh
Author

webbnh commented Dec 6, 2018

Thanks for the confirmation, @marcomagdy .

In the interim, I had done what you suggested: I separated my implementation of the stream from the underlying streambuf, and created separate output and input streams for the TransferManager and my reader, respectively. What I've got seems to be working pretty well (my code calculates a checksum of the downloaded file, and it can get at least half of it done by the time the download completes).
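For readers who want a starting point, a shared stream buffer along these lines might look like the sketch below. It is not the code described above: `BlockingBuf`, `close()` and `drain()` are hypothetical names, and the sketch assumes the bytes arrive in order (out-of-order parts would need seek support on top of this). The writer appends under a mutex and the reader blocks in `underflow()` until the next byte exists or the writer signals the end.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <istream>
#include <iterator>
#include <mutex>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical buffer shared by one writer stream and one reader stream.
class BlockingBuf : public std::streambuf {
public:
    // Called by the producer once no more data will arrive.
    void close() {
        { std::lock_guard<std::mutex> lock(mu_); closed_ = true; }
        cv_.notify_all();
    }

protected:
    // Write side: append the bytes and wake a waiting reader.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        assert(n >= 0);
        { std::lock_guard<std::mutex> lock(mu_); data_.insert(data_.end(), s, s + n); }
        cv_.notify_all();
        return n;
    }
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        xsputn(&c, 1);
        return ch;
    }
    // Read side: block until the next byte exists or the buffer is closed.
    int_type underflow() override {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return read_pos_ < data_.size() || closed_; });
        if (read_pos_ >= data_.size()) return traits_type::eof();
        current_ = data_[read_pos_++];  // copy out, so growth of data_ is harmless
        setg(&current_, &current_, &current_ + 1);
        return traits_type::to_int_type(current_);
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::vector<char> data_;
    std::size_t read_pos_ = 0;
    bool closed_ = false;
    char current_ = 0;
};

// Reads until the writer has closed the buffer and all bytes are consumed.
std::string drain(BlockingBuf& buf) {
    std::istream in(&buf);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

In use, the download side would write through a `std::ostream` constructed on the buffer from one thread while `drain()` (or a checksum loop) runs on another; `close()` has to be called when the transfer finishes, or the reader will wait forever. Reading one byte per `underflow()` call keeps the sketch short at the cost of speed.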

For me, the painful part was not the I/O streams (OK, that's a lie, but most of the problems there were of my own making...); the painful part was synchronizing with the TransferManager. For instance, I need to know how big the resulting file is supposed to be, which I can get from TransferHandle::GetBytesTotalSize(), but not until after the transfer has started. So, I have to wait for the first call to TransferManagerConfiguration::transferStatusUpdatedCallback to grab and squirrel away the value. Fortunately, this happens before the call to CreateDownloadStreamCallback (and, fortunately, the creation of the download stream is deferred until after the start of the transfer...I'm now thinking that that was REALLY clever!). Nevertheless, it would have been very helpful if that callback included the size or the TransferHandle as an argument (as the status and download callbacks do) -- that would have removed the need for the gymnastics.
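The "grab and squirrel away" step can be sketched roughly as follows. `FakeHandle` and `TotalSizeCapture` are hypothetical stand-ins, not SDK types; only `GetBytesTotalSize()` is taken from the description above. The sketch assumes, as reported there, that the size is already valid on the first status update.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the SDK's TransferHandle; only the one accessor
// mentioned above is modelled.
struct FakeHandle {
    std::uint64_t total;
    std::uint64_t GetBytesTotalSize() const { return total; }
};

// Remembers the size reported by the first status update and ignores the rest.
class TotalSizeCapture {
public:
    // Intended to be called from the status-updated callback.
    void on_status_update(const FakeHandle& handle) {
        std::lock_guard<std::mutex> lock(mu_);
        if (!known_) {
            size_ = handle.GetBytesTotalSize();
            known_ = true;
        }
    }
    // Intended to be called later, e.g. from the stream-creation callback.
    bool known() const {
        std::lock_guard<std::mutex> lock(mu_);
        return known_;
    }
    std::uint64_t size() const {
        std::lock_guard<std::mutex> lock(mu_);
        assert(known_);
        return size_;
    }

private:
    mutable std::mutex mu_;
    bool known_ = false;
    std::uint64_t size_ = 0;
};
```

The mutex is there because the status callback and the code that consumes the size may run on different threads.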

Anyway, until V2.0 appears, it would be helpful to those coming after me if the documentation mentioned things like the fact that the object returned by CreateDownloadStreamCallback will be deleted when the transfer completes, and that the developer should make appropriate provisions to access the result by a separate mechanism. :-)

Just out of curiosity...in V2.0, how will you enable concurrent read of the downloaded data without using streams?

Thanks!

@marcomagdy
Contributor

Glad that worked out for you.

> how will you enable concurrent read of the downloaded data without using streams?

Callbacks on receive and on send. Basically, it will follow the pattern used in C programs, but I'll use safe C++ parameters like vector<unsigned char> rather than void * and size_t.
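To make the idea concrete, a receive callback of that shape could look something like the sketch below. This is not the v2.0 API, which is not shown in this thread; `OnReceive`, `fake_download` and `checksum_while_receiving` are hypothetical names, and the "download" is just a loop over an in-memory payload.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical callback type: each received chunk arrives as a vector.
using OnReceive = std::function<void(const std::vector<unsigned char>&)>;

// Hypothetical producer: hands the payload to the callback chunk by chunk.
void fake_download(const std::vector<unsigned char>& payload,
                   std::size_t chunk_size, const OnReceive& on_receive) {
    assert(chunk_size > 0);
    for (std::size_t off = 0; off < payload.size(); off += chunk_size) {
        const std::size_t end = std::min(payload.size(), off + chunk_size);
        const auto first = payload.begin() + static_cast<std::ptrdiff_t>(off);
        const auto last = payload.begin() + static_cast<std::ptrdiff_t>(end);
        on_receive(std::vector<unsigned char>(first, last));
    }
}

// Consumer: a simple additive checksum updated as each chunk arrives.
std::uint32_t checksum_while_receiving(const std::vector<unsigned char>& payload,
                                       std::size_t chunk_size) {
    std::uint32_t sum = 0;
    fake_download(payload, chunk_size,
                  [&sum](const std::vector<unsigned char>& part) {
                      for (unsigned char b : part) sum += b;
                  });
    return sum;
}
```

The consumer sees data as soon as each chunk is delivered, so no stream object needs to outlive the transfer.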

justnance added the guidance label (Question that needs advice or information.) and removed the question label on Apr 19, 2019