feat(replay): Use new afterSend hook to improve error linking #7390

mydea · 2023-03-09T12:01:32Z

This replaces our current way of connecting errors & replays with a new hook, afterSendErrorEvent.

This hook is called after transport.send() has completed, and receives the event & response as arguments.
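To illustrate the ordering, here is a minimal, self-contained sketch. `createMiniClient`, `SentEvent`, `SendResponse`, and `Transport` are hypothetical stand-ins invented for this example, not the SDK's actual classes or types; only the hook name and the idea of calling it after `transport.send()` come from this PR.

```typescript
// Hypothetical, simplified shapes; the real SDK types carry more fields.
interface SentEvent {
  event_id: string;
}
interface SendResponse {
  statusCode?: number;
}
interface Transport {
  send(event: SentEvent): Promise<SendResponse | void>;
}
type AfterSendCallback = (event: SentEvent, response: SendResponse | void) => void;

// Hypothetical stand-in for a client that supports hooks.
function createMiniClient(transport: Transport) {
  const hooks: { [name: string]: AfterSendCallback[] } = {};

  return {
    // Register a callback under a hook name.
    on(hook: string, callback: AfterSendCallback): void {
      (hooks[hook] = hooks[hook] || []).push(callback);
    },

    // Send the event, then notify listeners once the transport has resolved.
    // Listeners receive both the event and whatever the transport returned.
    sendEvent(event: SentEvent): Promise<void> {
      return transport.send(event).then(response => {
        (hooks['afterSendErrorEvent'] || []).forEach(callback => callback(event, response));
      });
    },
  };
}
```

In this sketch an event that is dropped before `sendEvent` is reached never triggers the callback, which is the property the replay integration relies on.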

When we get a successful response (e.g. statusCode >= 200 and < 300), we add the event ID to errorIds and eventually sample the error session.

This has one main wrinkle, which is that when users provide a custom transport, this may not respond with the new TransportMakeRequestResponse but can also just return void. This will change in V8, but until then, a custom transport may not be implementing this.

To avoid never connecting an error with a replay in such a case, we detect whether the transport is the base transport (which is guaranteed to return a non-void response). If it is not, we skip this check and simply always connect the error. This may connect an error that has been dropped, but there is no way for us to know that for certain. IMHO this is an acceptable outcome.
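A rough sketch of that decision, under stated assumptions: the `enforceStatusCode` flag, the `ReplayLike` shape, and the simplified event and response types are hypothetical names chosen for this example and may not match the code in the diff.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface SentEvent {
  event_id?: string;
}
interface SendResponse {
  statusCode?: number;
}
interface ReplayLike {
  errorIds: Set<string>;
}

// True for 2xx status codes.
function isSuccessful(statusCode: number): boolean {
  return statusCode >= 200 && statusCode < 300;
}

// `enforceStatusCode` is assumed to be true for the base transport (which
// always returns a response) and false for custom transports that may return void.
function handleAfterSendError(replay: ReplayLike, enforceStatusCode: boolean) {
  return (event: SentEvent, response: SendResponse | void): void => {
    if (!event.event_id) {
      return;
    }

    const statusCode = response && response.statusCode;

    // With the base transport, only link errors that were accepted.
    if (enforceStatusCode && (!statusCode || !isSuccessful(statusCode))) {
      return;
    }

    // With a custom transport we cannot tell, so the error is always linked.
    replay.errorIds.add(event.event_id);
  };
}
```

The trade-off is the one named above: with a custom transport, an error that was in fact dropped or rejected may still be linked.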

Note that for the case where a custom client is provided that does not implement hooks (unlikely, but possible), we can quite easily make this work by falling back to using the functionality in the global event processor.
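One way this fallback could look, as a sketch only: `registerErrorLinking`, `ClientLike`, and the injected `addProcessor` function are hypothetical names for this example, and the real registration code may be structured differently.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface SentEvent {
  event_id: string;
}
type LinkError = (event: SentEvent) => void;
type EventProcessor = (event: SentEvent) => SentEvent | null;
interface ClientLike {
  // Optional, because a custom client may not implement hooks.
  on?: (hook: string, callback: LinkError) => void;
}

// Prefer the hook when the client supports it; otherwise fall back to linking
// errors from inside a global event processor (the previous behavior).
function registerErrorLinking(
  client: ClientLike,
  addProcessor: (processor: EventProcessor) => void,
  linkError: LinkError,
): 'hook' | 'eventProcessor' {
  if (typeof client.on === 'function') {
    client.on('afterSendErrorEvent', linkError);
    return 'hook';
  }

  addProcessor(event => {
    linkError(event);
    // Pass the event through unchanged so later processors still see it.
    return event;
  });
  return 'eventProcessor';
}
```

The fallback path keeps the old limitation: it runs before sending, so it cannot know whether the error was later dropped.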

This should fix/improve the following behaviors:

- Errors that are dropped via e.g. beforeSend or a later-registered global event processor will not trigger an error session anymore
- Errors that have not been sent successfully to Sentry (e.g. rate limited, ...) will not trigger an error session anymore
- Errors that have not been sent successfully to Sentry (e.g. API error, ...) will not be added to errorIds anymore
- We do not depend on event processor order anymore, i.e. the linking always runs after any regular event processor (think deduping, ...)

Overall, this should avoid us sending error-sessions to Sentry that have no corresponding error in Sentry.

AbhiPrasad · 2023-03-09T12:19:53Z

packages/replay/src/util/addGlobalListeners.ts

+
+  if (hasHooks) {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    (client as BaseClient<any>).on('afterSendErrorEvent', handleAfterSendError(replay));


should this be a more general hook that is called afterSendEvent? Then in handleAfterSendError we can check for isErrorEvent.

Yeah, I wasn't entirely sure here 🤔 I can make it more generic. My main reasoning was to avoid the beforeSend/beforeSendTransaction naming mess we have, down the line 😅

+1 for generic -- we'll want to do something similar with traceIds, no?

I will make it generic, but please note that this has downsides too: as Event can also be a replay event or profile event, we'll have to guard this everywhere, even though as of now these events go through a different code path and can't even get in there.
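For reference, the guard a generic hook would need might look roughly like this. It is a sketch under assumptions: the `AnyEvent` shape and the rule that error events carry no `type` are simplifications made for this example, and `handleAfterSendEvent` here is not the code from the diff.

```typescript
// Assumption for this sketch: error events have no `type`,
// while every other kind of event sets one.
interface AnyEvent {
  event_id?: string;
  type?: 'transaction' | 'replay_event' | 'profile';
}

function isErrorEvent(event: AnyEvent): boolean {
  return !event.type;
}

// A generic afterSendEvent callback has to filter out non-error events itself.
function handleAfterSendEvent(errorIds: Set<string>) {
  return (event: AnyEvent): void => {
    if (!isErrorEvent(event) || !event.event_id) {
      return;
    }
    errorIds.add(event.event_id);
  };
}
```

Every consumer of the generic hook would need a similar check, which is the downside mentioned above.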

github-actions · 2023-03-09T12:55:09Z

size-limit report 📦

Path	Size
@sentry/browser - ES5 CDN Bundle (gzipped + minified)	20.46 KB (+0.28% 🔺)
@sentry/browser - ES5 CDN Bundle (minified)	63.42 KB (+0.2% 🔺)
@sentry/browser - ES6 CDN Bundle (gzipped + minified)	19.02 KB (+0.19% 🔺)
@sentry/browser - ES6 CDN Bundle (minified)	56.32 KB (+0.18% 🔺)
@sentry/browser - Webpack (gzipped + minified)	20.69 KB (+0.18% 🔺)
@sentry/browser - Webpack (minified)	67.54 KB (+0.16% 🔺)
@sentry/react - Webpack (gzipped + minified)	20.71 KB (+0.18% 🔺)
@sentry/nextjs Client - Webpack (gzipped + minified)	52.16 KB (+0.08% 🔺)
@sentry/browser + @sentry/tracing - ES5 CDN Bundle (gzipped + minified)	33.72 KB (+0.18% 🔺)
@sentry/browser + @sentry/tracing - ES6 CDN Bundle (gzipped + minified)	26.05 KB (+0.16% 🔺)
@sentry/replay ES6 CDN Bundle (gzipped + minified)	43.5 KB (+0.13% 🔺)
@sentry/replay - Webpack (gzipped + minified)	37.57 KB (+0.25% 🔺)
@sentry/browser + @sentry/tracing + @sentry/replay - ES6 CDN Bundle (gzipped + minified)	61.69 KB (+0.17% 🔺)
@sentry/browser + @sentry/replay - ES6 CDN Bundle (gzipped + minified)	54.75 KB (+0.17% 🔺)

packages/integration-tests/suites/replay/errors/errorNotSent/test.ts

billyvg · 2023-03-09T15:46:20Z

packages/replay/src/util/addGlobalListeners.ts

+
+  if (hasHooks) {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    (client as BaseClient<any>).on('afterSendErrorEvent', handleAfterSendError(replay));


+1 for generic -- we'll want to do something similar with traceIds, no?

packages/replay/src/coreHandlers/handleAfterSendError.ts

Lms24

This looks good from my point of view!

packages/core/src/transports/base.ts

packages/replay/src/coreHandlers/handleAfterSendEvent.ts

billyvg · 2023-03-14T14:40:05Z

packages/integration-tests/suites/replay/errors/droppedError/test.ts

-/*
- * This scenario currently shows somewhat unexpected behavior from the PoV of a user:
- * The error is dropped, but the recording is started and continued anyway.
- * If folks only sample error replays, this will lead to a lot of confusion as the resulting replay
- * won't contain the error that started it (possibly none or only additional errors that occurred later on).
- *
- * This is because in error-mode, we start recording as soon as replay's eventProcessor is called with an error.
- * If later event processors or beforeSend drop the error, the recording is already started.
- *
- * We'll need a proper SDK lifecycle hook (WIP) to fix this properly.
- * TODO: Once we have lifecycle hooks, we should revisit this test and make sure it behaves as expected.
- *       This means that the recording should not be started or stopped if the error that triggered it is not sent.
- */


packages/replay/src/coreHandlers/handleGlobalEvent.ts

billyvg · 2023-03-14T14:59:31Z

packages/replay/src/util/monkeyPatchRecordDroppedEvent.ts

👏 👏 👏

github-actions · 2023-03-14T15:09:13Z

Replay SDK metrics 🚀

Metric	Revision	Plain Value	+Sentry Value	+Sentry Diff	+Sentry Ratio	+Replay Value	+Replay Diff	+Replay Ratio
LCP	This PR `603d188`	59.66 ms	67.60 ms	+7.94 ms	+13.32 %	68.90 ms	+9.24 ms	+15.50 %
LCP	Baseline `7399b99`	70.49 ms	65.18 ms	-5.31 ms	-7.53 %	63.63 ms	-6.86 ms	-9.73 %
CLS	This PR `603d188`	0.00 ms	0.00 ms	0.00 ms	0.00 %	0.24 ms	+0.23 ms	+5026.54 %
CLS	Baseline `7399b99`	0.00 ms	0.00 ms	0.00 ms	0.00 %	0.24 ms	+0.23 ms	+5077.55 %
CPU	This PR `603d188`	17.65 %	20.62 %	+2.98 pp	+16.87 %	36.60 %	+18.95 pp	+107.41 %
CPU	Baseline `7399b99`	21.63 %	21.31 %	-0.32 pp	-1.50 %	37.48 %	+15.85 pp	+73.24 %
JS heap avg	This PR `603d188`	3.54 MB	6.9 MB	+3.37 MB	+95.11 %	11.02 MB	+7.48 MB	+211.51 %
JS heap avg	Baseline `7399b99`	3.52 MB	6.89 MB	+3.37 MB	+95.87 %	10.98 MB	+7.46 MB	+212.18 %
JS heap max	This PR `603d188`	3.89 MB	8.33 MB	+4.44 MB	+114.08 %	14.18 MB	+10.29 MB	+264.51 %
JS heap max	Baseline `7399b99`	3.88 MB	8.29 MB	+4.41 MB	+113.73 %	13.88 MB	+10 MB	+257.86 %
netTx	This PR `603d188`	0 B	360.46 kB	+360.46 kB	n/a	107.67 kB	+107.67 kB	n/a
netTx	Baseline `7399b99`	0 B	360.47 kB	+360.47 kB	n/a	107.04 kB	+107.04 kB	n/a
netRx	This PR `603d188`	17.79 kB	19.06 kB	+1.28 kB	+7.18 %	16.61 kB	-1.18 kB	-6.64 %
netRx	Baseline `7399b99`	16.4 kB	20.34 kB	+3.94 kB	+24.01 %	19 kB	+2.6 kB	+15.83 %
netCount	This PR `603d188`	1	2	+1	+100.00 %	5	+4	+400.00 %
netCount	Baseline `7399b99`	1	2	+1	+100.00 %	5	+4	+400.00 %
netTime	This PR `603d188`	333.60 ms	357.82 ms	+24.22 ms	+7.26 %	500.75 ms	+167.15 ms	+50.11 %
netTime	Baseline `7399b99`	372.87 ms	432.39 ms	+59.52 ms	+15.96 %	609.92 ms	+237.05 ms	+63.57 %

Baseline results on branch: `develop`

Revision	LCP	CLS	CPU	JS heap avg	JS heap max	netTx	netRx	netCount	netTime
`b278429`	-14.33 ms	+0.25 ms	+15.16 pp	+7.66 MB	+10.24 MB	+107.71 kB	-706 B	+4	+278.27 ms
`ef6b3c7`	-7.70 ms	+0.25 ms	+16.28 pp	+7.64 MB	+10.02 MB	+107.46 kB	+57 B	+4	+222.84 ms
`ef6b3c7`	-10.10 ms	+0.23 ms	+15.85 pp	+7.66 MB	+10.56 MB	+107.49 kB	+836 B	+4	+209.40 ms
`b1ef00d`	+5.17 ms	+0.26 ms	+15.95 pp	+7.59 MB	+10.18 MB	+107.02 kB	-220 B	+4	+306.46 ms
`42e542e`	+12.94 ms	+0.25 ms	+18.24 pp	+7.8 MB	+10.31 MB	+107.39 kB	-2.17 kB	+4	+289.66 ms
`2f3c93c`	+6.35 ms	+0.26 ms	+16.82 pp	+7.44 MB	+9.99 MB	+106.58 kB	+1.53 kB	+4	+303.23 ms
`4bff5a9`	+0.60 ms	+0.26 ms	+15.70 pp	+7.44 MB	+10.2 MB	+106.93 kB	-2.04 kB	+4	+163.67 ms
`ba99e7c`	-0.34 ms	+0.23 ms	+16.45 pp	+7.17 MB	+10.05 MB	+106.7 kB	+2.46 kB	+4	+119.03 ms
`a70376e`	-7.64 ms	+0.25 ms	+17.86 pp	+6.5 MB	+10.21 MB	+106.82 kB	-35 B	+4	+321.76 ms

Previous results on branch: `fn/afterSend`

Revision	LCP	CLS	CPU	JS heap avg	JS heap max	netTx	netRx	netCount	netTime
`7399b99`	-6.86 ms	+0.23 ms	+15.85 pp	+7.46 MB	+10 MB	+107.04 kB	+2.6 kB	+4	+237.05 ms
`b278429`	-14.33 ms	+0.25 ms	+15.16 pp	+7.66 MB	+10.24 MB	+107.71 kB	-706 B	+4	+278.27 ms
`ef6b3c7`	-7.70 ms	+0.25 ms	+16.28 pp	+7.64 MB	+10.02 MB	+107.46 kB	+57 B	+4	+222.84 ms
`ef6b3c7`	-10.10 ms	+0.23 ms	+15.85 pp	+7.66 MB	+10.56 MB	+107.49 kB	+836 B	+4	+209.40 ms
`b1ef00d`	+5.17 ms	+0.26 ms	+15.95 pp	+7.59 MB	+10.18 MB	+107.02 kB	-220 B	+4	+306.46 ms
`42e542e`	+12.94 ms	+0.25 ms	+18.24 pp	+7.8 MB	+10.31 MB	+107.39 kB	-2.17 kB	+4	+289.66 ms
`2f3c93c`	+6.35 ms	+0.26 ms	+16.82 pp	+7.44 MB	+9.99 MB	+106.58 kB	+1.53 kB	+4	+303.23 ms
`4bff5a9`	+0.60 ms	+0.26 ms	+15.70 pp	+7.44 MB	+10.2 MB	+106.93 kB	-2.04 kB	+4	+163.67 ms
`ba99e7c`	-0.34 ms	+0.23 ms	+16.45 pp	+7.17 MB	+10.05 MB	+106.7 kB	+2.46 kB	+4	+119.03 ms
`a70376e`	-7.64 ms	+0.25 ms	+17.86 pp	+6.5 MB	+10.21 MB	+106.82 kB	-35 B	+4	+321.76 ms

*) pp - percentage points - an absolute difference between two percentages.
Last updated: Thu, 16 Mar 2023 13:04:53 GMT

lucas-zimerman · 2023-04-27T14:13:06Z

This has one main wrinkle, which is that when users provide a custom transport, this may not respond with the new TransportMakeRequestResponse but can also just return void. This will change in V8, but until then, a custom transport may not be implementing this.

Wish this was on the changelog because it took me a while to figure out why it stopped sending Replays since version 4.45 :C

mydea · 2023-04-27T14:22:21Z

Hmm, we should actually be handling this case - we fall back to the "old" behavior in that case. What exactly is happening for you? If replays are not being sent when errors happen, that would be a bug!

lucas-zimerman · 2023-04-27T15:38:26Z

It's kinda hard to debug, but what I noticed is that the replay integration starts as usual but doesn't flush/send replays when the send function returns void.

The issue could be here
https://github.com/getsentry/sentry-javascript/pull/7390/files#diff-a9838d39c7470a876de4e928887c7f8fcab924f9aecf69acbe20d6fc9a8fbcdeR53-R55

Because I noticed that afterSendHandler was undefined in my case.

In this case I am only altering the Transport, I am not touching the default client/hub

mydea added the Package: replay Issues related to the Sentry Replay SDK label Mar 9, 2023

mydea requested review from billyvg, Lms24 and AbhiPrasad March 9, 2023 12:01

mydea self-assigned this Mar 9, 2023

AbhiPrasad reviewed Mar 9, 2023

mydea force-pushed the fn/afterSend branch from eb32159 to ee0b731 Compare March 9, 2023 14:12

billyvg reviewed Mar 9, 2023

mydea force-pushed the fn/afterSend branch from ee0b731 to 6b1bef1 Compare March 10, 2023 09:00

Lms24 approved these changes Mar 10, 2023

packages/core/src/transports/base.ts (resolved)

packages/replay/src/coreHandlers/handleAfterSendEvent.ts (outdated, resolved)

mydea force-pushed the fn/afterSend branch from e101ad0 to 8c3c8ec Compare March 13, 2023 15:15

AbhiPrasad approved these changes Mar 13, 2023

mydea added the CI-Overhead-Measurements label Mar 14, 2023

mydea force-pushed the fn/afterSend branch from 8c3c8ec to 4672b8e Compare March 14, 2023 14:52

billyvg approved these changes Mar 14, 2023

mydea force-pushed the fn/afterSend branch 2 times, most recently from 5f06d61 to 467671f Compare March 16, 2023 12:20

mydea and others added 10 commits March 16, 2023 13:47

feat(replay): Use new afterSend hook to improve error linking

5f709f7

ref: Make hook afterSendEvent

3e6bd4f

use force flush test util

5030d40

extract callback type to top

27f00ed

fix lint

1c45239

make less brittle

dc576da

Apply suggestions from code review

1219f4d

Co-authored-by: Lukas Stracke <[email protected]>

add comment for afterSendHandler

21bd94c

move tests to new dir

569b645

fix test

1d97cde

mydea force-pushed the fn/afterSend branch from 467671f to 1d97cde Compare March 16, 2023 12:47

mydea merged commit 7aa20d0 into develop Mar 16, 2023

mydea deleted the fn/afterSend branch March 16, 2023 13:19
