feat(replay): Use new afterSend hook to improve error linking #7390
if (hasHooks) {
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  (client as BaseClient<any>).on('afterSendErrorEvent', handleAfterSendError(replay));
Should this be a more general hook that is called afterSendEvent? Then in handleAfterSendError we can check for isErrorEvent.
Yeah, I wasn't entirely sure here 🤔 I can make it more generic. My main reasoning was to avoid the beforeSend/beforeSendTransaction naming mess we have, down the line 😅
+1 for generic -- we'll want to do something similar with traceIds, no?
I will make it generic, but please note that this has downsides too: as Event can also be a replay event or profile event, we'll have to guard this everywhere etc., even though as of now these events go through a different code path and can't even get in there.
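A minimal sketch of the generic variant discussed in this thread, under stated assumptions: SketchEvent, isErrorEvent and handleAfterSendEvent are simplified stand-ins written for illustration, not the SDK's actual definitions. It assumes error events are the ones without a type, and shows the guard a generic handler would need before touching replay state.

```typescript
// Simplified stand-in for the SDK's Event type (assumption: `type` is
// undefined for error events and set for every other kind of event).
interface SketchEvent {
  event_id?: string;
  type?: 'transaction' | 'replay_event' | 'profile';
}

// Hypothetical type guard: treat events without a `type` as error events.
function isErrorEvent(event: SketchEvent): boolean {
  return event.type === undefined;
}

// Hypothetical generic handler: it is registered for all events and ignores
// everything that is not an error event with an id.
function handleAfterSendEvent(ids: Set<string>): (event: SketchEvent) => void {
  return (event: SketchEvent): void => {
    if (!isErrorEvent(event) || !event.event_id) {
      return;
    }
    ids.add(event.event_id);
  };
}

const errorIds = new Set<string>();
const handler = handleAfterSendEvent(errorIds);
handler({ event_id: 'a1', type: 'transaction' }); // ignored: not an error event
handler({ event_id: 'b2' }); // recorded: no `type`, so treated as an error
```

The guard lives in one place here, which is the trade-off raised above: a generic hook keeps the naming simple, but every consumer has to filter by event type itself.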
size-limit report 📦
packages/integration-tests/suites/replay/errors/errorNotSent/test.ts
This looks good from my point of view!
/*
 * This scenario currently shows somewhat unexpected behavior from the PoV of a user:
 * The error is dropped, but the recording is started and continued anyway.
 * If folks only sample error replays, this will lead to a lot of confusion as the resulting replay
 * won't contain the error that started it (possibly none or only additional errors that occurred later on).
 *
 * This is because in error-mode, we start recording as soon as replay's eventProcessor is called with an error.
 * If later event processors or beforeSend drop the error, the recording is already started.
 *
 * We'll need a proper SDK lifecycle hook (WIP) to fix this properly.
 * TODO: Once we have lifecycle hooks, we should revisit this test and make sure it behaves as expected.
 * This means that the recording should not be started or stopped if the error that triggered it is not sent.
 */
👏
👏 👏 👏
Replay SDK metrics 🚀
develop |
Revision | LCP | CLS | CPU | JS heap avg | JS heap max | netTx | netRx | netCount | netTime |
---|---|---|---|---|---|---|---|---|---|
b278429 | -14.33 ms | +0.25 ms | +15.16 pp | +7.66 MB | +10.24 MB | +107.71 kB | -706 B | +4 | +278.27 ms |
ef6b3c7 | -7.70 ms | +0.25 ms | +16.28 pp | +7.64 MB | +10.02 MB | +107.46 kB | +57 B | +4 | +222.84 ms |
ef6b3c7 | -10.10 ms | +0.23 ms | +15.85 pp | +7.66 MB | +10.56 MB | +107.49 kB | +836 B | +4 | +209.40 ms |
b1ef00d | +5.17 ms | +0.26 ms | +15.95 pp | +7.59 MB | +10.18 MB | +107.02 kB | -220 B | +4 | +306.46 ms |
42e542e | +12.94 ms | +0.25 ms | +18.24 pp | +7.8 MB | +10.31 MB | +107.39 kB | -2.17 kB | +4 | +289.66 ms |
2f3c93c | +6.35 ms | +0.26 ms | +16.82 pp | +7.44 MB | +9.99 MB | +106.58 kB | +1.53 kB | +4 | +303.23 ms |
4bff5a9 | +0.60 ms | +0.26 ms | +15.70 pp | +7.44 MB | +10.2 MB | +106.93 kB | -2.04 kB | +4 | +163.67 ms |
ba99e7c | -0.34 ms | +0.23 ms | +16.45 pp | +7.17 MB | +10.05 MB | +106.7 kB | +2.46 kB | +4 | +119.03 ms |
a70376e | -7.64 ms | +0.25 ms | +17.86 pp | +6.5 MB | +10.21 MB | +106.82 kB | -35 B | +4 | +321.76 ms |
Previous results on branch: fn/afterSend
Revision | LCP | CLS | CPU | JS heap avg | JS heap max | netTx | netRx | netCount | netTime |
---|---|---|---|---|---|---|---|---|---|
7399b99 | -6.86 ms | +0.23 ms | +15.85 pp | +7.46 MB | +10 MB | +107.04 kB | +2.6 kB | +4 | +237.05 ms |
b278429 | -14.33 ms | +0.25 ms | +15.16 pp | +7.66 MB | +10.24 MB | +107.71 kB | -706 B | +4 | +278.27 ms |
ef6b3c7 | -7.70 ms | +0.25 ms | +16.28 pp | +7.64 MB | +10.02 MB | +107.46 kB | +57 B | +4 | +222.84 ms |
ef6b3c7 | -10.10 ms | +0.23 ms | +15.85 pp | +7.66 MB | +10.56 MB | +107.49 kB | +836 B | +4 | +209.40 ms |
b1ef00d | +5.17 ms | +0.26 ms | +15.95 pp | +7.59 MB | +10.18 MB | +107.02 kB | -220 B | +4 | +306.46 ms |
42e542e | +12.94 ms | +0.25 ms | +18.24 pp | +7.8 MB | +10.31 MB | +107.39 kB | -2.17 kB | +4 | +289.66 ms |
2f3c93c | +6.35 ms | +0.26 ms | +16.82 pp | +7.44 MB | +9.99 MB | +106.58 kB | +1.53 kB | +4 | +303.23 ms |
4bff5a9 | +0.60 ms | +0.26 ms | +15.70 pp | +7.44 MB | +10.2 MB | +106.93 kB | -2.04 kB | +4 | +163.67 ms |
ba99e7c | -0.34 ms | +0.23 ms | +16.45 pp | +7.17 MB | +10.05 MB | +106.7 kB | +2.46 kB | +4 | +119.03 ms |
a70376e | -7.64 ms | +0.25 ms | +17.86 pp | +6.5 MB | +10.21 MB | +106.82 kB | -35 B | +4 | +321.76 ms |
Last updated: Thu, 16 Mar 2023 13:04:53 GMT
5f06d61 to 467671f
Co-authored-by: Lukas Stracke <[email protected]>
Wish this was on the changelog because it took me a while to figure out why it stopped sending Replays since version 4.45 :C
Hmm, we should actually be handling this case - we fall back to the "old" behavior in that case. What exactly is happening for you? If replays are not being sent when errors happen, that would be a bug!
It's kinda hard to debug, but what I noticed is that the replay integration starts as usual but doesn't flush/send replays when the send function is [...]. The issue could be here [...], because I noticed that [...]. In this case I am only altering the Transport; I am not touching the default client/hub.
This replaces our current way of connecting errors & replays with a new hook, afterSendErrorEvent. This hook is called after transport.send() has completed, and receives the event & response as arguments.
When we get a successful response (i.e. a statusCode >= 200 and < 300), we add the event id to errorIds and eventually sample the error session.
This has one main wrinkle: when users provide a custom transport, it may not respond with the new TransportMakeRequestResponse but can also just return void. This will change in V8, but until then, a custom transport may not be implementing this.
To avoid never connecting an error with a replay in such a case, we detect whether the transport is the base transport (which is guaranteed to return a non-void response). If it is not, we circumvent this check and simply always connect the error. This may connect an error that has been dropped, but there is no way for us to know that for certain. IMHO this is an acceptable outcome.
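The decision described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: SketchResponse, shouldLinkError and the isBaseTransport flag are hypothetical names, and a transport returning void is modelled as undefined.

```typescript
// Simplified stand-in for the transport response (illustrative only).
interface SketchResponse {
  statusCode?: number;
}

// Decide whether an error event should be connected to the replay.
// `isBaseTransport` is a hypothetical flag; how the transport is detected
// is not shown here.
function shouldLinkError(
  sendResult: SketchResponse | undefined,
  isBaseTransport: boolean,
): boolean {
  // Custom transport: the response may be missing, so always connect the
  // error, accepting that a dropped error may be connected as well.
  if (!isBaseTransport) {
    return true;
  }
  const statusCode = sendResult ? sendResult.statusCode : undefined;
  // Base transport: connect only when the response is a 2xx.
  return statusCode !== undefined && statusCode >= 200 && statusCode < 300;
}

const linkedOnSuccess = shouldLinkError({ statusCode: 200 }, true);
const linkedOnRateLimit = shouldLinkError({ statusCode: 429 }, true);
const linkedOnCustomTransport = shouldLinkError(undefined, false);
```

In this sketch a rate-limited response through the base transport is not connected, while a custom transport that returns nothing still is.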
Note that for the case where a custom client is provided that does not implement hooks (unlikely, but possible), we can quite easily make this work by falling back to using the functionality in the global event processor.
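As a rough sketch of that fallback, assuming a minimal client shape: SketchClient, registerErrorLinking and addFallbackProcessor are hypothetical names used only for illustration, and the real registration code may look different.

```typescript
// Hypothetical minimal client shape: a custom client may not implement `on`.
interface SketchClient {
  on?: (hook: string, callback: () => void) => void;
}

type LinkingStrategy = 'hook' | 'eventProcessor';

// Register the error-linking callback via the hook when the client supports
// hooks; otherwise fall back to the (stand-in) global event processor path.
function registerErrorLinking(
  client: SketchClient | undefined,
  onAfterSend: () => void,
  addFallbackProcessor: (processor: () => void) => void,
): LinkingStrategy {
  if (client && typeof client.on === 'function') {
    client.on('afterSendErrorEvent', onAfterSend);
    return 'hook';
  }
  addFallbackProcessor(onAfterSend);
  return 'eventProcessor';
}
```

Returning the chosen strategy is only there to make the sketch easy to check; it is not something the description above requires.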
This should fix/improve the following behaviors:
- Errors dropped in beforeSend or a later-registered global event processor will not trigger an error session anymore
- [...] errorIds anymore