Multiple buffer writes to a single s3 object #1351
Can anyone provide an idea as to how to move forward? Appreciate it. Thanks.
In my code, I use the Transfer Manager. If you implement your own stream, the Transfer Manager will do its reads and writes through it.
Thanks @webbnh.
@kkbachu ,
@KaibaLopez ,
I tried a MyIOStream class derived from Aws::IOStream, but it doesn't get far. How can I substitute MyIOStream for Aws::IOStream?

[ERROR] 2020-03-31 22:24:47.016 TransferManager [140637072394112] Failed to read from input stream to upload file to bucket: xxx with key:

Looks like I need my own version of TransferManager that can read/write the proprietary filesystem instead of making regular fstream calls.
In my code, I created a class derived from the stream's underlying streambuf. You should be able to write your own version of the TransferManager -- I don't think it does anything that you couldn't do -- but I decided not to re-invent that particular wheel... it was more effective for me to use it as is and hook in at the bottom of the stream implementation.

If you follow my path, I have two caveats for you. First, be aware that the TransferManager may use multiple threads, so your implementation has to take the appropriate precautions to handle or guard against concurrent reentry. Second, when you get to the download side, be aware that the downloaded blocks do not necessarily arrive in order (because of concurrent I/O requests and vagaries of network flow), so, if your filesystem requires that the file be written linearly, your implementation should be prepared to buffer the blocks as they are received. (In my case, I needed to process the file linearly, so the callbacks weren't sufficient for me; also, IIRC, the callbacks didn't give me access to the data -- just to status -- so I needed hooks into the stream in order to process the data before the download was complete.)

Good luck!
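The buffering caveat above can be sketched with a minimal custom streambuf. This is standard C++ only -- the class and member names are illustrative, not from the AWS SDK: it honors absolute seeks so blocks can land out of order, and exposes a linear view of the assembled bytes afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Illustrative sketch (not AWS SDK code): a streambuf that accepts blocks
// written at absolute offsets -- possibly out of order -- and exposes the
// assembled bytes linearly once all blocks have arrived.
class BlockCollectingBuf : public std::streambuf {
public:
    const std::string& data() const { return data_; }  // linear view when done

protected:
    // Honor absolute seeks so each downloaded block can be placed at its offset.
    pos_type seekpos(pos_type pos, std::ios_base::openmode) override {
        pos_ = static_cast<std::size_t>(pos);
        return pos;
    }
    // Bulk writes: grow the backing store as needed, then copy the block in.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        if (pos_ + static_cast<std::size_t>(n) > data_.size())
            data_.resize(pos_ + static_cast<std::size_t>(n));
        std::copy(s, s + n, data_.begin() + static_cast<std::ptrdiff_t>(pos_));
        pos_ += static_cast<std::size_t>(n);
        return n;
    }
    // Single-character writes fall back to the bulk path.
    int_type overflow(int_type ch) override {
        if (ch == traits_type::eof()) return ch;
        char c = traits_type::to_char_type(ch);
        xsputn(&c, 1);
        return ch;
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
};
```

A std::ostream constructed over this buffer can be repositioned with seekp() before each block write; real code would also need locking, per the threading caveat above.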
Thanks @webbnh for sharing your insights. This is what I am doing as an experiment with the put_object_async.cpp example code:
Does this grant your wish? (The second UploadFile() overload takes an Aws::IOStream.)
It doesn't look like it. I did look at the second UploadFile() method that takes an Aws::IOStream as input, but that is basically a regular C++ iostream under the hood (it's typedef'ed).
Right, so if you pass it an instance of a class derived from Aws::IOStream, it should use your implementation.
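Since Aws::IOStream is a typedef for std::iostream, the shape of that approach can be sketched in plain standard C++: rather than deriving from the iostream itself, derive from std::streambuf and construct a stream over it. The class name here is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Illustrative sketch: wrap an existing application buffer in a streambuf so a
// plain standard stream (what Aws::IOStream aliases) can read from it, no copy.
class MemBuf : public std::streambuf {
public:
    MemBuf(char* data, std::size_t len) {
        setg(data, data, data + len);  // set the get area over the caller's buffer
    }
};
```

Usage: construct the buffer over your bytes, then build the stream over the buffer, e.g. `MemBuf mb(raw, n); std::istream in(&mb);` -- in SDK code the stream would be handed to the upload call.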
Thanks @webbnh for staying with me to help out. I tried a class derived from Aws::IOStream, but it doesn't get to my methods. Maybe I am doing something wrong.
I'm guessing that the problem is that the iostream member functions are not virtual, so your derived class's methods never get called; I'm guessing that that is why I hooked into the streambuf instead (its protected members are virtual). When you define your member functions, you should use the override keyword, so the compiler will verify that you are actually overriding a virtual function.
Compiler complains that …
I tried a single put-object with an in-memory buffer, as below. But when it's uploaded to S3 and I manually download the file from the S3 console to cross-check the validity of the content (a simple text file), it does not contain the new-line feeds '\n' or '\r'. I pretty much implemented parts of TransferManager.cpp's SinglePartUpload(), except that the source file is read from a proprietary filesystem.

Here is a hexdump of the buffer that I passed to the streambuf; '0a' is a line feed. What am I missing?
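For reference, a minimal way to build such an in-memory body -- assuming the usual pattern of handing a shared stream pointer to PutObjectRequest::SetBody; the helper name is hypothetical. Writing with write() is binary-safe, so line feeds survive in the stream itself:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical helper: wrap an application buffer in an in-memory stream.
// With the AWS SDK the return value would be passed to PutObjectRequest::SetBody
// (Aws::IOStream is a std::iostream typedef, so a stringstream qualifies).
std::shared_ptr<std::iostream> make_body(const char* data, std::size_t len) {
    auto body = std::make_shared<std::stringstream>();
    body->write(data, static_cast<std::streamsize>(len));  // binary-safe copy
    return body;
}
```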
Is there a parameter that specifies whether the stream is binary or text? (Text streams are typically read line by line and have their line terminators removed.)
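On the binary/text distinction: standard C++ streams take that flag at open time rather than as a request parameter. A sketch of reading a source file without newline translation (this matters mainly on Windows; on POSIX the two modes behave identically):

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Sketch: read a file byte-for-byte. std::ios::binary suppresses newline
// translation, so '\n' (0x0a) bytes reach the upload body unchanged.
std::string read_all_binary(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::ostringstream out;
    out << in.rdbuf();  // stream the whole file into the string buffer
    return out.str();
}
```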
Since I am using a streambuf and passing it to an iostream, I can't find anything specific to this issue. Need help.
Sorry, false alarm. When I was trying to open the file directly in the browser, the default viewer strips new lines, but if I open the text file with a text editor like WordPad, it has the new lines. Different question: I'm curious whether SetContentType() is important or not.
Although it's working for me to upload/download using application buffers instead of reading/writing a filestream, I have to do pretty much what TransferManager does. TransferManager has a lot of goodness that I would love to leverage instead of reinventing the wheel. Maybe we should add a feature request for TransferManager to support filestream-like callbacks?
This is passing out of my areas of expertise, but I think the answer is summarized as, "it's only important if something looks at it". That is, I believe that if you access the contents directly (e.g., using the SDK, as here), you won't know the difference unless you ask. However, if you use a RESTful interface (e.g., a web browser) to fetch it, the client will try to format and present the contents, and it will make choices based on what you set the type to.
That would have been great at one point... but that's all water over the dam for me now. :-)
Platform/OS/Hardware/Device
Ubuntu 18.04
Describe the question
My application reads a file one 1 MB block at a time from a proprietary filesystem, and I want to push one buffer at a time to a single S3 object. I couldn't find any documentation around this. I looked at #64, which is close to what I am looking for, but it still refers to one custom buffer per put-object request.
Is there a way to do this with the AWS SDK for C++ for S3?
The PutObject request and the TransferManager both refer to passing a file or a single buffer, but I couldn't find a way to write buffer by buffer to a single object, in a loop or something similar.
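For context, S3's native answer to buffer-by-buffer writes is the multipart upload API (CreateMultipartUpload / UploadPart / CompleteMultipartUpload), which TransferManager drives internally. The loop itself can be sketched with the SDK call abstracted behind a callback -- everything here is illustrative, not SDK code; note that S3 requires every part except the last to be at least 5 MB:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of the multipart-upload loop: take the source one block
// at a time and hand each block to an upload callback with its part number.
// With the real SDK, the callback would wrap S3Client::UploadPart, and the loop
// would be bracketed by CreateMultipartUpload / CompleteMultipartUpload.
using PartSink = std::function<void(int partNumber, const std::string& bytes)>;

void upload_in_parts(const std::string& source, std::size_t partSize, PartSink sink) {
    int partNumber = 1;  // S3 part numbers start at 1
    for (std::size_t off = 0; off < source.size(); off += partSize) {
        sink(partNumber++, source.substr(off, partSize));
    }
}
```

In a real application the source would be the proprietary filesystem reads (one 1 MB block per iteration) rather than an in-memory string, and part size would be at least 5 MB to satisfy S3's limits.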
Any help is appreciated.