Added error handling for #241, interim, broken #244


Closed

wants to merge 21 commits into from

Conversation

rimutaka
Contributor

Issue #, if available: #241, relates to PR #242

Description of changes:

Added my changes from #242 on top of David's changes in this branch.

  1. The build is broken - lambda-http examples fail to compile.
  2. The lambda examples (basic, hello, etc) compile, run, but return an error as a result and DO NOT LOG ANYTHING at all.

For example, hello-without-macro-tracing.rs has these 2 lines:

simple_logger::init().unwrap();
info!("Hello!");

so it should at least log Hello. It doesn't.

This line returns an error for a test: Err(simple_error::SimpleError::new("simple error").into()).

Output:

{
  "errorType": "&alloc::boxed::Box<dyn std::error::Error+core::marker::Sync+core::marker::Send>",
  "errorMessage": "simple error"
}

Trace in CloudWatch:

START RequestId: 01e6ec13-33c2-40a7-b467-a37b1cbc17f0 Version: $LATEST
 TRACE hyper::proto::h1::conn: Conn::read_head
 DEBUG hyper::proto::h1::io: read 444 bytes
 TRACE hyper::proto::h1::role: Response.parse([Header; 100], [u8; 444])
 TRACE hyper::proto::h1::role: Response.parse Complete(418)
 DEBUG hyper::proto::h1::io: parsed 7 headers
 DEBUG hyper::proto::h1::conn: incoming body is content-length (26 bytes)
 TRACE hyper::proto::h1::decode: decode; state=Length(26)
 DEBUG hyper::proto::h1::conn: incoming body completed
 TRACE hyper::proto::h1::conn: maybe_notify; read_from_io blocked
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE hyper::client::pool: put; add idle connection for ("http", 127.0.0.1:9001)
 DEBUG hyper::client::pool: pooling idle connection for ("http", 127.0.0.1:9001)
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE hyper::client::pool: take? ("http", 127.0.0.1:9001): expiration = Some(90s)
 DEBUG hyper::client::pool: reuse idle connection for ("http", 127.0.0.1:9001)
 TRACE hyper::proto::h1::role: Client::encode method=POST, body=Some(Known(125))
 DEBUG hyper::proto::h1::io: flushed 306 bytes
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy }
 TRACE hyper::proto::h1::conn: Conn::read_head
 DEBUG hyper::proto::h1::io: read 130 bytes
 TRACE hyper::proto::h1::role: Response.parse([Header; 100], [u8; 130])
 TRACE hyper::proto::h1::role: Response.parse Complete(114)
 DEBUG hyper::proto::h1::io: parsed 3 headers
 DEBUG hyper::proto::h1::conn: incoming body is content-length (16 bytes)
 TRACE hyper::proto::h1::decode: decode; state=Length(16)
 DEBUG hyper::proto::h1::conn: incoming body completed
 TRACE hyper::proto::h1::conn: maybe_notify; read_from_io blocked
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE hyper::client::pool: put; add idle connection for ("http", 127.0.0.1:9001)
 DEBUG hyper::client::pool: pooling idle connection for ("http", 127.0.0.1:9001)
 TRACE want: signal: Want    
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
 TRACE hyper::proto::h1::dispatch: client tx closed
 TRACE hyper::proto::h1::conn: State::close_read()
 TRACE hyper::proto::h1::conn: State::close_write()
 TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Closed, writing: Closed, keep_alive: Disabled }
 TRACE hyper::proto::h1::conn: shut down IO complete
 TRACE mio::poll: deregistering handle with poller    
 TRACE want: signal: Closed    
 TRACE hyper::client::pool: pool closed, canceling idle interval
END RequestId: 01e6ec13-33c2-40a7-b467-a37b1cbc17f0
REPORT RequestId: 01e6ec13-33c2-40a7-b467-a37b1cbc17f0	Duration: 39.60 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 34 MB	Init Duration: 39.10 ms	
Unknown application error occurred
unhandled

As you can see, neither the info!("Hello!") message nor the "simple error" message is there.

I am not sure what other changes are intended for lib.rs and what state it is in, so it's better if @davidbarsky takes a look first.

TODO

  1. Make it compile
  2. Retest all examples
  3. Update examples/README.md with the new input/output
  4. Explain the relationship between log and tracing in the readme (a rough sketch of how the two crates can interact follows this list).
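
For reference, here is a rough, self-contained sketch (not part of this PR) of one way log and tracing can be wired together: the tracing-log crate's LogTracer forwards log records into whatever tracing subscriber is installed, so log::info! and tracing::info! end up in the same output. The crates and exact wiring used by this repo may differ.

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    // Forward `log` records into the tracing ecosystem.
    tracing_log::LogTracer::init()?;

    // Print tracing events (including the bridged log records) to stdout.
    let subscriber = tracing_subscriber::fmt()
        .with_ansi(false)
        .without_time()
        .finish();
    tracing::subscriber::set_global_default(subscriber)?;

    log::info!("recorded via the log facade");
    tracing::info!("recorded via tracing directly");
    Ok(())
}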

By submitting this pull request

  • I confirm that my contribution is made under the terms of the Apache 2.0 license.
  • I confirm that I've made a best effort attempt to update all relevant documentation.

Comment on lines 5 to 6
use simple_error;
use simple_logger;
Contributor

Suggested change
use simple_error;
use simple_logger;

This will generate a clippy warning: https://rust-lang.github.io/rust-clippy/master/#single_component_path_imports
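
For context, a small sketch of the clippy-friendly form, assuming the simple_logger and simple-error versions from the Cargo.toml hunk below: in the 2018 edition the crates are already in scope, so the single-component use lines can simply be dropped and the items referenced by path.

fn main() -> Result<(), Box<dyn std::error::Error>> {
    simple_logger::init()?; // no `use simple_logger;` needed
    log::info!("Hello!");
    // no `use simple_error;` needed either
    Err(simple_error::SimpleError::new("simple error").into())
}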

@@ -31,3 +31,5 @@ async-stream = "0.2"
tracing-subscriber = "0.2"
once_cell = "1.4.0"
simple_logger = "1.6.0"
log = "0.4"
simple-error = "0.2"
Contributor

Veetaha commented Jul 24, 2020

I suggest using anyhow instead of simple-error. simple-error has only 17 stars and was last updated a year ago...
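
If that route is taken, a rough sketch of what the error plumbing could look like with anyhow (this assumes anyhow = "1" were added to Cargo.toml; it is not part of this PR):

use anyhow::{anyhow, Result};

fn do_work(fail: bool) -> Result<()> {
    if fail {
        // anyhow! builds an ad-hoc error without declaring a custom error type.
        return Err(anyhow!("simple error"));
    }
    Ok(())
}

fn main() {
    if let Err(err) = do_work(true) {
        eprintln!("{}", err); // prints: simple error
    }
}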

@@ -131,7 +131,7 @@ impl<F, A, B, Error, Fut> Handler<A, B> for HandlerFn<F>
where
F: Fn(A, Context) -> Fut,
Fut: Future<Output = Result<B, Error>> + Send + Sync,
Error: Into<Box<dyn std::error::Error + Send + Sync + 'static>> + fmt::Debug,
Error: Into<Box<dyn std::error::Error + Send + Sync + 'static>> + fmt::Display,
Contributor

I am not sure why we have + fmt::Display/Debug here at all. std::error::Error requires Debug + Display as supertraits.
cc @davidbarsky

Contributor Author

If fmt::Display is removed as you suggested, rustc complains about error!("{}", e); and format!("{}", e) with this message:

`<F as Handler<A, B>>::Error` doesn't implement `std::fmt::Display`
the trait `std::fmt::Display` is not implemented for `<F as Handler<A, B>>::Error`
in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
required by `std::fmt::Display::fmt`rustc(E0277)

I read something about the tracing macros allowing something like error!("{}", %e); to force Display over Debug, but I can't find it any more. Even if that worked, it may be too risky to rely on that feature.
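
For what it's worth, tracing does have field-recording sigils that choose between Display and Debug; whether that is the feature being remembered here is a guess. A minimal sketch, assuming a tracing-subscriber fmt subscriber is installed:

use tracing::error;

fn main() {
    tracing_subscriber::fmt().with_ansi(false).init();

    let e = std::io::Error::new(std::io::ErrorKind::Other, "simple error");
    error!(err = %e, "handler failed"); // `%` records the field with Display
    error!(err = ?e, "handler failed"); // `?` records the field with Debug
}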

Contributor

The reason, it seems, is that the Error type is only Into an Error. You may need error!("{}", e.into());

@softprops
Contributor

You most definitely get Display with Error. You also get Debug via the format string. Again, I think the displayed message should use Display; Debug is not for display.
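
To make the Display/Debug distinction concrete, a tiny standalone example (the error type is illustrative, not from the crate):

use std::fmt;

#[derive(Debug)]
struct HandlerError {
    message: String,
}

impl fmt::Display for HandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl std::error::Error for HandlerError {}

fn main() {
    let err = HandlerError { message: "simple error".into() };
    println!("{}", err);   // Display: the user-facing message -> simple error
    println!("{:?}", err); // Debug: the developer-facing dump -> HandlerError { message: "simple error" }
}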

@davidbarsky
Contributor

The build is broken - lambda-http examples fail to compile.

That's my bad—I see the test failures. I'll work on fixing that and let you know when I've pushed the fix.

The lambda examples (basic, hello, etc) compile, run, but return an error as a result and DO NOT LOG ANYTHING at all.

lambda/examples/hello.rs doesn't set up a logger, so this is expected (but perhaps surprising!) behavior. You need to set up a logger in order to see any logging happen. This requirement holds whether you're using log or tracing.

For example, hello-without-macro-tracing.rs has these 2 lines:

simple_logger::init().unwrap();
info!("Hello!");

so it should at least log Hello. It doesn't.

Two questions:

  • are you referring to hello-with-macro-log.rs? hello-without-macro-tracing.rs doesn't import simple_logger.
  • Assuming you're referring to hello-with-macro-log.rs, I'm unable to reproduce this issue. Can you point me at a minimal reproduction?

@rimutaka
Contributor Author

@davidbarsky David, it looks like it logs from fn func, but not from fn main. Here is the exact code (a modified hello-without-macro-log.rs) I've just re-tested on AWS Lambda:

use lambda::{handler_fn, Context};
use serde_json::Value;
use tracing::{error, info, debug};

type Error = Box<dyn std::error::Error + Send + Sync + 'static>;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let func = handler_fn(func);
    simple_logger::init_with_level(log::Level::Info); // ::init().unwrap();

    error!("Error from main!");
    info!("Info from main!");
    debug!("Debug from main!");
    lambda::run(func).await?;
    Ok(())
}

async fn func(event: Value, _: Context) -> Result<Value, Error> {
    error!("Error from func!");
    info!("Info from func!");
    debug!("Debug from func!");
    //Ok(event)
    Err(simple_error::SimpleError::new("simple_logger::init_with_level(log::Level::Info);").into())
}

Expected action

CloudWatch log entries should contain output from:
error!("Error from main!"); + info!("Info from main!"); and error!("Error from func!"); + info!("Info from func!");.

Test input:

{
  "command": "do something"
}

Output:

The output is as expected.

{
  "errorType": "&alloc::boxed::Box<dyn std::error::Error+core::marker::Sync+core::marker::Send>",
  "errorMessage": "simple_logger::init_with_level(log::Level::Info);"
}

Cloudwatch log output:

Only error!("Error from func!"); + info!("Info from func!"); produced log entries:

START RequestId: b874a1ff-6be0-423f-b05c-5de08073b164 Version: $LATEST
2020-07-25 00:08:50,706 ERROR [hello_without_macro_log] Error from func! 
2020-07-25 00:08:50,706 INFO  [hello_without_macro_log] Info from func! 
END RequestId: b874a1ff-6be0-423f-b05c-5de08073b164
REPORT RequestId: b874a1ff-6be0-423f-b05c-5de08073b164	Duration: 31.54 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 27 MB	Init Duration: 31.28 ms	
Unknown application error occurred
unhandled

The other problem here is that the error message produced by the handler was not logged by the runtime's lib.rs at error_message: format!("{}", err).

Using tracing instead of log makes no difference.

This modified version of hello-without-macro-tracing.rs

use lambda::{handler_fn, Context};
use serde_json::Value;
use tracing::{info, error};

type Error = Box<dyn std::error::Error + Send + Sync + 'static>;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let func = handler_fn(func);
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        // this needs to be set to false, otherwise ANSI color codes will
        // show up in a confusing manner in CloudWatch logs.
        .with_ansi(false)
        // disabling time is handy because CloudWatch will add the ingestion time.
        .without_time()
        .init();

    info!("Info from main!");
    error!("Error from main!");
    lambda::run(func).await?;
    Ok(())
}

async fn func(event: Value, _: Context) -> Result<Value, Error> {
    //Ok(event)
    info!("Info from func!");
    error!("Error from func!");
    Err(simple_error::SimpleError::new("tracing::Level::INFO").into())
}

logs this in CloudWatch

START RequestId: d5051b76-e60b-4d89-97de-56e4c0ba047b Version: $LATEST
  INFO hello_without_macro_tracing: Info from func!
 ERROR hello_without_macro_tracing: Error from func!
END RequestId: d5051b76-e60b-4d89-97de-56e4c0ba047b
REPORT RequestId: d5051b76-e60b-4d89-97de-56e4c0ba047b	Duration: 56.18 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 33 MB	Init Duration: 41.11 ms	
Unknown application error occurred
unhandled

As you can see, nothing was logged from fn main or lib.rs.

I did a quick test with lib.rs by making this change (added tracing::error!("{}", err);):

            let req = match task.await {
                Ok(response) => match response.await {
                    Ok(response) => EventCompletionRequest {
                        request_id,
                        body: response,
                    }
                    .into_req(),
                    Err(err) => {
                        tracing::error!("{}", err); // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

                        EventErrorRequest {
                            request_id,
                            diagnostic: Diagnostic {
                                error_type: type_name_of_val(&err).to_owned(),
                                error_message: format!("{}", err),
                            },
                        }
                        .into_req()
                    }
                }

and got the error logged in CloudWatch

START RequestId: a41a414c-7fc6-4ade-a74a-10101eb3442a Version: $LATEST
  INFO hello_without_macro_tracing: Info from func!
 ERROR hello_without_macro_tracing: Error from func!
 ERROR lambda: tracing::Level::INFO4  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< that's from tracing::error!("{}", err);
END RequestId: a41a414c-7fc6-4ade-a74a-10101eb3442a
REPORT RequestId: a41a414c-7fc6-4ade-a74a-10101eb3442a	Duration: 55.08 ms	Billed Duration: 200 ms	Memory Size: 128 MB	Max Memory Used: 33 MB	Init Duration: 47.82 ms	
Unknown application error occurred
unhandled

Is this a case of EventErrorRequest being overwritten somewhere downstream?
Now it has stopped making any sense to me. :) I'll come back to it in a few hours.

@rimutaka
Contributor Author

OK, I think I understand where my confusion comes from now. I expected error_message from

EventErrorRequest {
    request_id,
    diagnostic: Diagnostic {
        error_type: type_name_of_val(&err).to_owned(),
        error_message: format!("{}", err),
    },
}

to be logged in CloudWatch and returned to the caller. It is returned to the caller, but is not logged in CloudWatch. Inserting tracing::error!("{}", err); next to where EventErrorRequest is constructed fixes that.

Sorry for wasting your time. I'll make a commit with the fix in a few hours.

@rimutaka
Contributor Author

Outstanding:

  1. lambda-http examples do not compile
  2. Lambda complains that "Process exited before completing request" and marks a successful execution as "fail".

request_id,
diagnostic: Diagnostic {
error_type: type_name_of_val(&err).to_owned(),
error_message: format!("{:?}", err),
error_message: "Lambda panicked!".to_owned(),
Contributor

I'd rather not lose the why here.

Contributor Author

Very good point! I wouldn't want my stack trace and all to go to the front end, though. How about a compromise: Display -> Response, Debug -> CloudWatch? I'm open to suggestions, but I would wait and see how it pans out in production with different error types.

                Err(err) if err.is_panic() => {
                    error!("{:?}", err); // inconsistent with other log record formats - to be reviewed
                    EventErrorRequest {
                        request_id,
                        diagnostic: Diagnostic {
                            error_type: type_name_of_val(&err).to_owned(),
                            error_message: format!("Lambda panicked: {}", err),
                        },
                    }
                    .into_req()
                }

Response

{
  "errorType": "Runtime.ExitError",
  "errorMessage": "RequestId: 2d579019-07f7-409a-a6e6-af7725253307 Error: Runtime exited with error: exit status 101"
}

CloudWatch:

START RequestId: 2d579019-07f7-409a-a6e6-af7725253307 Version: $LATEST
thread 'main' panicked at 'explicit panic', lambda/examples/error-handling.rs:87:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
END RequestId: 2d579019-07f7-409a-a6e6-af7725253307
REPORT RequestId: 2d579019-07f7-409a-a6e6-af7725253307	Duration: 43.40 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 27 MB	Init Duration: 23.15 ms	
RequestId: 2d579019-07f7-409a-a6e6-af7725253307 Error: Runtime exited with error: exit status 101
Runtime.ExitError

/// to be serialized into json. The runtime pays no attention
/// to the contents of the response payload.
#[derive(Serialize)]
struct Response {
Contributor

For readability, can we keep the type defs at the top of the file?

Contributor Author

error!("{}", e.into()); kept throwing errors no matter what I tried. I do not understand it enough to make a meaningful fix. Left as is for now.

/// The actual handler of the Lambda request.
pub(crate) async fn func(event: Value, ctx: lambda::Context) -> Result<Value, Error> {
// convert the JSON request to a struct
let req = serde_json::from_value::<Request>(event);
Contributor

softprops commented Jul 25, 2020

Protip. You can use serde_json::from_value::<Request>(event)?; above and remove one level of unpacking here.

Contributor Author

match serde_json::from_value::<Request>(event)?.event_type {
    EventType::SimpleError => {
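
A small standalone sketch of the pattern from the protip, with a hypothetical Request type (the real struct in the example may differ):

use serde::Deserialize;
use serde_json::Value;

type Error = Box<dyn std::error::Error + Send + Sync + 'static>;

#[derive(Deserialize)]
struct Request {
    command: String,
}

fn parse(event: Value) -> Result<Request, Error> {
    // `?` converts serde_json::Error into the boxed Error,
    // removing one level of explicit unpacking.
    Ok(serde_json::from_value::<Request>(event)?)
}

fn main() -> Result<(), Error> {
    let req = parse(serde_json::json!({ "command": "do something" }))?;
    println!("{}", req.command);
    Ok(())
}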

@rimutaka
Contributor Author

I'm no expert here, but it looks like exiting fn main causes this "Process exited before completing request" error. Should we wrap the handler invocation in a loop somewhere in lib.rs, after the runtime has been initialized, so that it never exits?

I looped it inside main for a test and the Process exited... errors stopped.

#[tokio::main]
async fn main() -> Result<(), Error> {
    loop {
        let func = handler_fn(func);
        lambda::run(func).await?;
    }
}

It actually worked OK on the master branch, somehow...

@softprops
Contributor

lambda::run is the loop. The runtime implements a polling protocol that will invoke your handler when an event is triggered.
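
For anyone following along, here is a heavily simplified sketch of that polling protocol. The HTTP routes are the documented Lambda Runtime API paths; everything else (the blocking reqwest client, the toy handler) is illustrative and not how this crate is implemented internally.

use std::env;

// Toy handler used only for this sketch: echo the event back.
fn handler(event: &str) -> Result<String, String> {
    Ok(event.to_owned())
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Lambda tells the process where the runtime API lives.
    let api = env::var("AWS_LAMBDA_RUNTIME_API")?;
    let client = reqwest::blocking::Client::new();

    // The process never exits on its own; it keeps polling until Lambda freezes or kills it.
    loop {
        // 1. Long-poll for the next invocation.
        let next = client
            .get(format!("http://{}/2018-06-01/runtime/invocation/next", api))
            .send()?;
        let request_id = next
            .headers()
            .get("Lambda-Runtime-Aws-Request-Id")
            .and_then(|v| v.to_str().ok())
            .unwrap_or_default()
            .to_owned();
        let event = next.text()?;

        // 2. Invoke the handler and report success or failure for this request id.
        match handler(&event) {
            Ok(body) => {
                client
                    .post(format!("http://{}/2018-06-01/runtime/invocation/{}/response", api, request_id))
                    .body(body)
                    .send()?;
            }
            Err(msg) => {
                client
                    .post(format!("http://{}/2018-06-01/runtime/invocation/{}/error", api, request_id))
                    .body(format!(r#"{{"errorMessage":"{}","errorType":"HandlerError"}}"#, msg))
                    .send()?;
            }
        }
    }
}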

@rimutaka
Contributor Author

Sure. What confused me was that the polling protocol doesn't seem to work, because async_stream::stream! in fn incoming returns None on the 2nd and subsequent calls to incoming.next().await.

I see it doesn't even try to execute the loop inside async_stream::stream! on the 2nd call and just returns None. I'll keep pondering, but I suspect it's above my pay grade for now. :)

@rimutaka
Contributor Author

Outstanding:

  • lambda-http examples fail to compile

@rimutaka
Contributor Author

@davidbarsky, @softprops, will you get a chance to look into the failed compilation of lambda-http this week? If not, I might give it a crack on Friday. I'm not sure I'm skilled enough to fix it, though.

@softprops
Contributor

I might be able to get to it this weekend. My ask here is really to make sure the breaking changes are worth it.

@rimutaka
Contributor Author

@softprops Doug, I spent all afternoon going in circles, but I cannot find the source of the Sync/Send problem with the HTTP crate. Can you give me any pointers?

@softprops
Contributor

Sorry about that. I've been busy with some family stuff recently. Is there any way to break this PR up into smaller pieces so the changes are easier to understand?

@rimutaka
Contributor Author

@softprops, I think I fixed it. Yay! :) Will tidy up the code and make a commit shortly.

@rimutaka
Contributor Author

@softprops, @davidbarsky, @Veetaha, can you review my latest changes?
I could not find an exact description of the API Gateway Context fields other than https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html, which doesn't tell me which fields belong to V1 and which to V2.

# Conflicts resolved in favour of davidbarsky/update-readme:
#	Cargo.lock
#	lambda/Cargo.toml
#	lambda/examples/README.md
#	lambda/examples/basic.rs
#	lambda/examples/error-handling.rs
#	lambda/src/client.rs
#	lambda/src/lib.rs
@rimutaka
Contributor Author

rimutaka commented Aug 19, 2020

I've just run into a new issue: we cannot use any async fns inside the handler unless the futures they return implement the Sync trait. E.g. Rusoto SQS SendMessage is not allowed:

future cannot be shared between threads safely
the trait `std::marker::Sync` is not implemented for `dyn std::future::Future<Output = std::result::Result<rusoto_sqs::generated::SendMessageResult, rusoto_core::error::RusotoError<rusoto_sqs::generated::SendMessageError>>> + std::marker::Send`

(fixed in rimutaka@19e6881#diff-d9a13770c705340766ae33d16aac07afR133)
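
A minimal, self-contained illustration of the problem (not the crate's code): an async fn that holds a non-Sync value such as a Cell across an .await produces a future that is Send but not Sync, so a Fut: Future + Send + Sync bound rejects it while a Send-only bound accepts it.

use std::cell::Cell;
use std::future::Future;

#[allow(dead_code)]
fn requires_send_sync<F: Future + Send + Sync>(_f: F) {}
fn requires_send_only<F: Future + Send>(_f: F) {}

async fn non_sync_handler() {
    // `Cell` is Send but not Sync; holding it across an await point
    // makes the whole future Send but not Sync.
    let counter = Cell::new(0);
    async {}.await;
    counter.set(counter.get() + 1);
}

fn main() {
    // requires_send_sync(non_sync_handler()); // does not compile: the future is not Sync
    requires_send_only(non_sync_handler()); // fine: Send is enough
}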

Err(err) => EventErrorRequest {
},
Err(err) if err.is_panic() => {
error!("{:?}", err); // inconsistent with other log record formats - to be reviewed
Contributor

There are a lot of changes in this diff, but wasn't this actually the original intended changeset: to make this a Display-able message like other runtimes?

Contributor Author

Yes, it was like 3 lines of code on top of the master branch. Most of the other changes were fixes to @davidbarsky's unfinished work.

@@ -331,7 +343,7 @@ where
.expect("Unable create runtime");

let client = &runtime.client;
let incoming = incoming(client).take(1);
let incoming = incoming(client);
Contributor

What's going on with this change?

Contributor Author

take(1) was breaking the loop, which resulted in an error message that the function had failed.
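
A small demonstration of the effect (not the runtime's code; it assumes the futures and tokio crates): take(1) yields a single item and then reports the stream as exhausted, so a consumer loop built on it handles one event and stops.

use futures::stream::{self, StreamExt};

#[tokio::main]
async fn main() {
    let events = stream::iter(vec!["event-1", "event-2", "event-3"]);
    let mut only_first = events.take(1);

    while let Some(event) = only_first.next().await {
        // Prints only "handled event-1"; the next poll returns None and the loop ends.
        println!("handled {}", event);
    }
}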

@softprops
Contributor

I've just run into a new issue: we cannot use any async fns inside the handler unless the futures they return implement the Sync trait. E.g. Rusoto SQS SendMessage is not allowed:

Why was Sync necessary again? I have an example of using lambda with rusoto here which works well.

Thank you for your work on this pull request, but it would be awesome if we could break this pull up into chunks that solve specific problems.

I see rustdocs added to existing types, readme additions, breaking API changes, and new examples all in the same pull. Perhaps if we split this pull into pulls that address each of those, we can make some movement on this.

@rimutaka
Contributor Author

Someone added + Sync somewhere and then it proliferated. I'll try to remove it. I had to add docs to keep the formatter happy. I agree the pull is too big. What I can do is just move the error handling changes back to master, which is like 3 lines of code, and let @davidbarsky finish the work he started.

@softprops
Contributor

What I can do is just move the error handling changes back to master, which is like 3 lines of code

👍

@rimutaka
Contributor Author

I think this PR was superseded by PR #284. Shall we close it?

@coltonweaver
Contributor

@rimutaka Good call, closing it now. Thanks!

Labels: bug