Skip to content

feat(lambda-http): accept http_body::Body in responses #466

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

nmoutschen
Copy link
Contributor

@nmoutschen nmoutschen commented Apr 14, 2022

Issue #, if available:

Description of changes:

Allows using any http_body::Body implementation in the response for lambda_http.

⚠️ At the moment, this PR removes the ability to return a Response<String> or any into value that implements Into<aws_lambda_events::encodings::Body>.

By submitting this pull request

  • I confirm that my contribution is made under the terms of the Apache 2.0 license.
  • I confirm that I've made a best effort attempt to update all relevant documentation.

@nmoutschen nmoutschen changed the title feat!(lambda-http): accept http_body::Body in responses feat(lambda-http): accept http_body::Body in responses Apr 19, 2022
let (parts, body) = self.into_parts();
Response::from_parts(parts, body.into())

let fut = async { Response::from_parts(parts, body.into_body().await) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this have to be a future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, this is due to the .await on into_body(), which is needed to support http_body::Body implementations that stream the body back, as we need to collect the entire payload before we can return a Lambda response.

Copy link

@brainstorm brainstorm Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to collect partial payloads? Thinking HTTP Ranges or terminate early if a piece of data has been seen in the response as it streams instead of waiting for the whole payload? Not sure if this use case should be available through this API or just limited to a custom hack, but it would serve my interests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brainstorm Not immediately for this purpose, since here we just use this to transform it into a aws_lambda_events::encoding::Body enum, which we needs to generate the right response for API Gateway/ALB.

That said, if you're calling an API that returns an http_body::Body implementation, or some form of asynchronous stream, you could always implement that yourself in your function handler.

@nmoutschen
Copy link
Contributor Author

Marking this one as ready to review as there's nothing outstanding here and would help on smithy-lang/smithy-rs#1338

Copy link
Contributor

@calavera calavera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@calavera
Copy link
Contributor

calavera commented May 5, 2022

@nmoutschen will this fix the problem described in #438 ?

Comment on lines +95 to +96
fut_req: Pin<Box<dyn Future<Output = Result<R, E>> + 'a>>,
fut_res: Option<ResponseFuture>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since only one of these will be polled at a time, it might be better to model it as an enum with two variants.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some help here @calavera, @eduardomourar or @david-perez

The PR as it is now simply doesn't work. The lambdas using lambda_http get called (I can see my info logs with the event) but never return the response and time out.

I believe this poll function is at the core of the issue: https://github.com/awslabs/aws-lambda-rust-runtime/pull/466/files#diff-c62692e6b817476306b4d6966f541eec77a1768674e4dad5cf7f10c4faababa4R106-R119

I observed that the value of self.fut_res gets set then nothing else happens, the poll function never gets executed again with the value of self.fut_res set, it actually looks like it gets called once then that's it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok bear with me I think I know why this isn't working as-is: https://tokio.rs/tokio/tutorial/async#wakers

When a future returns Poll::Pending, it must ensure that the waker is signaled at some point. Forgetting to do this results in the task hanging indefinitely.

Forgetting to wake a task after returning Poll::Pending is a common source of bugs.

Looks like by returning Poll::Pending from Poll::Ready and doing nothing else we are breaking the contract we also need to take an additional step

Before returning Poll::Pending, we called cx.waker().wake_by_ref(). This is to satisfy the future contract. By returning Poll::Pending, we are responsible for signaling the waker.

The code should be:

Poll::Ready(Ok(resp)) => {
    self.fut_res = Some(resp.into_response());
    cx.waker().wake_by_ref()
    Poll::Pending
}

And I've confirmed now that this works correctly. Continuing on with testing.

Copy link
Contributor

@david-perez david-perez Jun 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hugobast Good catch! This is indeed a sharp edge of async Rust.

Typically only leaf futures are the ones that interact with wakers. Most futures uphold the poll contract by returning Poll::Pending only if some other future also returned Poll::Pending; in this way, it trivially fulfills the contract of poll since the inner future must follow that same contract.

In this particular case, we can do the same by polling the future that we just put in self.fut_res. That is, do here what is done in the first arm of the if statement too.

I think it's clearer if we model this as an enum, since we're only polling one of the futures at a time, and simply do self.poll() when the request future completes and we now want to poll the response future.

(Note though that your code is perfectly fine in terms of correctness)

Copy link
Contributor

@hugobast hugobast Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@david-perez something like this? hugobast@d58a1bc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that looks good to me

fn into_response(self) -> ResponseFuture {
Box::pin(async move {
Response::builder()
.status(self.0.try_into().expect("unable to transform status code"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if the conversion fails. I'm not sure if this convenience blanket impl is worth panicking. At the very least it should be documented.

// Can safely unwrap here as we do type validation in the 'if' statement
Box::pin(ready(*any_self.downcast::<Body>().unwrap()))
} else {
Box::pin(async move { Body::from(to_bytes(self).await.expect("unable to read bytes from body").to_vec()) })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always use the impl From<Vec<u8>> for Body, which always creates the Body::Binary variant. Is this correct? The body might just contain UTF-8-encoded text.

See https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-workflow.html, which makes my head hurt.

Copy link
Contributor

@bnusunny bnusunny Jun 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ALB integration use content-encoding and content-type headers to determine if the body should be base64 encoded. See it here: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/lambda-functions.html#receive-event-from-load-balancer

Perhaps we could add utility methods to convert hyper::Response or reqwest::Reponse to http::Response<lambda_http::Body>. These methods can convert the Body to correct lambda_http::Body variant based on the same rules used by ALB.

This means we should probably work on Response not Body.

@nmoutschen
Copy link
Contributor Author

@calavera Yep, I've also included a new example to fix that.

Fair points @david-perez ! I'm a bit busy with the Stockholm summit this week, but will update this PR either by the end of this week or the next with those changes.

Regarding panicking, this is to mimic the current behaviour, but while I'm making those changes, I might as well return a Result instead.

For the Body::Binary variant, yep, I was thinking of trying to transform into a String first, and only use Vec as a default. In the case of Lambda, it will always return a string (either proper string, or base64 encoded payload). Using Body::Binary basically does base64 encoding and set isBase64Encoded to true (see https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-output-format). My assumption was that Body::Binary would be handled correctly, but I might have to do a few more tests there to validate this just to ensure we don't break anything.

@david-perez
Copy link
Contributor

@nmoutschen Are you sure that always constructing the Body::Binary variant is correct? If I understand this table correctly, it effectively means that we're always in the last 6 rows of the second table.

My assumption was that Body::Binary would be handled correctly

My understanding is that if the body contains regular UTF-8 encoded text, and we construct the Body::Binary variant for it, we will base64-encode it before handing it off to the Lambda service. If the Lambda is behind an API Gateway and the client has e.g. application/json in its Accept header, then API Gateway will base64-decode it before responding. So things work nicely in this case. This case corresponds to row 10 of the second table. What about the other cases?

@eduardomourar
Copy link

Could this be merged and released, please?

@calavera
Copy link
Contributor

Could this be merged and released, please?

@eduardomourar unfortunately this has not been tested. @david-perez's comment still stands:

My understanding is that if the body contains regular UTF-8 encoded text, and we construct the Body::Binary variant for it, we will base64-encode it before handing it off to the Lambda service. If the Lambda is behind an API Gateway and the client has e.g. application/json in its Accept header, then API Gateway will base64-decode it before responding. So things work nicely in this case. This case corresponds to row 10 of the second table. What about the other cases?

Moreover, Nicolas has left AWS and he's no longer interested in completing this PR, which is understandable.

If you'd like to see these changes in a release, you're very welcome to help us verify that the changes work for all API GW responses. For now, we don't have much time to verify it ourselves.

@hugobast
Copy link
Contributor

Could this be merged and released, please?

@eduardomourar unfortunately this has not been tested. @david-perez's comment still stands:

My understanding is that if the body contains regular UTF-8 encoded text, and we construct the Body::Binary variant for it, we will base64-encode it before handing it off to the Lambda service. If the Lambda is behind an API Gateway and the client has e.g. application/json in its Accept header, then API Gateway will base64-decode it before responding. So things work nicely in this case. This case corresponds to row 10 of the second table. What about the other cases?

Moreover, Nicolas has left AWS and he's no longer interested in completing this PR, which is understandable.

If you'd like to see these changes in a release, you're very welcome to help us verify that the changes work for all API GW responses. For now, we don't have much time to verify it ourselves.

I'm more than happy to carry this forward, I'm having difficulties running the integration tests (probably due to my M1 Mac), I digress. What I am curious to know @calavera is what kind of test you are looking for, I thought I would add something to the existing integration test and run it as many times as there are API GW permutations to validate that we cover all responses from https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-workflow.html. It does seem toilsome to do though, am I on the right track? is there something easier I can do to test?

@calavera
Copy link
Contributor

@hugobast thanks a lot for looking into this! I don't really have any better ideas to be honest.

@hugobast
Copy link
Contributor

@calavera here's the PR, happy to discuss #491 please also check this commit that I have decided to drop since the API Gateway configuration to perform the tests per the table lead me into a dead-end (I could be wrong and I'm happy to pick it back up) hugobast@83fa77d

@calavera
Copy link
Contributor

Thanks @hugobast ! I'm going to close this PR in favor of yours, we can continue the conversation there.

@calavera calavera closed this Jun 21, 2022
@calavera calavera deleted the response-body branch June 21, 2022 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants