fix: associate batch with boundary #16621

Open. dummdidumm wants to merge 5 commits into main.

dummdidumm
Member

@dummdidumm dummdidumm commented Aug 14, 2025

This associates the current batch with the boundary when entering pending mode. That way, other async work associated with that boundary can also associate itself with that batch, even if it is no longer the current batch (e.g. because it has been flushed).

This fixes a `TypeError` (`batch is null`) that can occur when the batch is flushed before the next top-level await or async derived gets hold of the current batch, which is null by then.
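
To make the failure mode concrete, here is a minimal sketch of the idea. It is a simplified, hypothetical model and not the actual Svelte internals: `Batch`, `Boundary`, `enter_pending`, `get_batch` and the module-level `current_batch` are illustrative names, loosely based on the `#batch` fallback shown in the review snippet further down.

```javascript
// Hypothetical, simplified model. Not Svelte's real classes.
let current_batch = null;

class Batch {
	pending = 0;

	increment() {
		this.pending += 1;
	}
}

class Boundary {
	#batch = null;

	// Remember the batch that was current when the boundary went pending.
	enter_pending() {
		if (current_batch) this.#batch = current_batch;
	}

	// Prefer the live current batch, otherwise fall back to the remembered one.
	get_batch() {
		if (current_batch) this.#batch = current_batch;
		return this.#batch;
	}
}

const boundary = new Boundary();
current_batch = new Batch();
const remembered = current_batch;
boundary.enter_pending();

// Simulate a flush: the global pointer is cleared before later async work runs.
current_batch = null;

// Reading `current_batch` directly here would give null and `increment` would throw.
const batch = boundary.get_batch();
batch.increment();
```

In this model the later async work still finds a batch through the boundary, which is the behaviour the description above is aiming for.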

While doing so I discovered that we have a memory leak when branches are obsolete/outdated/superseded. In that case their pending count may never reach zero, so they are never removed from the batches set.

Related to this, I discovered that our "commit these DOM changes" logic was flawed: an obsolete branch that was neutered can still have pending callbacks that we need to commit. This can happen if, for example, an if block is destroyed by branch A, and branch B then supersedes branch A, which is neutered. The if block tells branch B "nothing changed, I'm already destroyed" and never adds a "remove me" callback to branch B; that callback only exists in branch A.

The fix for both is to compute which branches are now obsolete (a branch is obsolete if all of its sources are also covered by the current batch) and then to call all of their callbacks for effects that don't exist in the current one.
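
A rough sketch of that rule, under some assumptions: each batch is modelled as a plain object with a set of changed `sources` and a map from effect to DOM callback, and "branch" and "batch" are treated as the same thing. `make_batch`, `is_obsolete` and `commit` are hypothetical helpers for illustration, and the order in which callbacks run here is not meant to mirror the real implementation.

```javascript
// Hypothetical model: a batch tracks changed sources and per-effect DOM callbacks.
function make_batch(sources, callbacks) {
	return { sources: new Set(sources), callbacks: new Map(callbacks) };
}

// A batch is obsolete if every source it changed is also covered by `current`.
function is_obsolete(batch, current) {
	for (const source of batch.sources) {
		if (!current.sources.has(source)) return false;
	}
	return true;
}

// Commit `current`, plus callbacks of obsolete batches for effects that
// `current` has no callback for. Returns the batches that are still relevant.
function commit(current, others) {
	const remaining = [];
	for (const batch of others) {
		if (!is_obsolete(batch, current)) {
			remaining.push(batch);
			continue;
		}
		for (const [effect, callback] of batch.callbacks) {
			if (!current.callbacks.has(effect)) callback();
		}
	}
	for (const callback of current.callbacks.values()) callback();
	return remaining;
}

const log = [];
// Batch A destroyed the "a" if block and started creating "b".
const a = make_batch(['route'], [
	['if_a', () => log.push('remove a')],
	['if_b', () => log.push('stale create b')]
]);
// Batch B supersedes A; it only knows about the "b" and "c" if blocks.
const b = make_batch(['route'], [
	['if_b', () => log.push('remove b')],
	['if_c', () => log.push('create c')]
]);

const remaining = commit(b, [a]);
```

Here A's "remove a" callback still runs because B has no callback for that effect, while A's stale callback for the "b" block is skipped in favour of B's.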

Fixes #16596
Fixes sveltejs/kit#14124



changeset-bot bot commented Aug 14, 2025

🦋 Changeset detected

Latest commit: bde51cd

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
svelte Patch

Contributor

Playground

pnpm add https://pkg.pr.new/svelte@16621

@dummdidumm
Member Author

dummdidumm commented Aug 14, 2025

Hmm, there's a regression with this: in at least one case, where the batch is switched mid-pending, it would be wrong to associate a now-outdated batch with the boundary. Hmm.

Update: this led me to a memory leak: batches that are superseded are never cleaned up. They are put into the batches set but never removed from it, because their pending count never reaches 0.
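
The leak can be pictured roughly like this. It is a sketch under assumptions, not the real code: `batches`, `create_batch`, `settle` and `remove_superseded` are hypothetical names, and "superseded" is approximated as "all sources are covered by the newer batch".

```javascript
// Hypothetical registry of in-flight batches (names are illustrative).
const batches = new Set();

function create_batch(sources) {
	const batch = { sources: new Set(sources), pending: 0 };
	batches.add(batch);
	return batch;
}

// Normal path: a batch leaves the registry once its last pending promise settles.
function settle(batch) {
	batch.pending -= 1;
	if (batch.pending === 0) batches.delete(batch);
}

// Cleanup path: drop every other batch whose sources are all covered by `current`.
function remove_superseded(current) {
	for (const batch of batches) {
		if (batch === current) continue;
		const covered = [...batch.sources].every((source) => current.sources.has(source));
		if (covered) batches.delete(batch);
	}
}

const old_batch = create_batch(['route']);
old_batch.pending = 1; // superseded: `settle` is never called for it

const new_batch = create_batch(['route']);
new_batch.pending = 1;

const size_before = batches.size;
remove_superseded(new_batch); // without this step, old_batch would stay forever
settle(new_batch);
const size_after = batches.size;
```

Without the explicit cleanup step, `old_batch` would remain in the set indefinitely in this model, since nothing ever brings its pending count down to 0.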

@dummdidumm dummdidumm marked this pull request as draft August 14, 2025 15:03
@dummdidumm
Member Author

Fixed all but one test and resolved the memory leak. The one test that is still failing also fails on main if you run it in the playground; I'm not sure why it doesn't fail in the test suite.

In the test a router is simulated:

<script lang=ts>
	let route = $state('a');
	let ok = $state(false);

	function goto(r) {
		return Promise.resolve().then(() => {
			route = r;
			throw new Error('nope');
		});
	}
</script>

<h1>{route}</h1>
<button onclick={() => route = 'a'}>a</button>
<button onclick={() => route = 'b'}>b</button>
<button onclick={() => route = 'c'}>c</button>
<button onclick={() => ok = true}>ok</button>

<svelte:boundary>
	{#if route === 'a'}
		<p>a</p>
	{/if}

	{#if route === 'b'}
		{#if ok}
			<p>b</p>
		{:else}
			{await goto('c')}
		{/if}
	{/if}

	{#if route === 'c'}
		<p>c</p>
	{/if}

	{#snippet pending()}
		<p>pending...</p>
	{/snippet}

	{#snippet failed(error, reset)}
		<button onclick={reset}>retry</button>
	{/snippet}
</svelte:boundary>

When you click on b you go into the else branch, which "redirects" you to route c and throws an error. The redirect spawns a new batch, so the old one is now obsolete. It is also marked as neutered, which means its DOM operation callbacks will no longer be committed. That is a problem, though, because the new batch only states "destroy if block with route b and create route c", not "destroy if block with route a": from that batch's point of view this has already happened. As a result, <p>a</p> stays around when it shouldn't.

I'm unsure how to fix this. We need to know which operations of the obsolete batch should still be committed, or else make the if block dirty again and bypass the "if the value hasn't changed, return early" logic.

@dummdidumm
Member Author

Found a way to commit the correct callbacks of obsolete branches

@dummdidumm dummdidumm marked this pull request as ready for review August 15, 2025 21:03
if (current_batch) {
	this.#batch = current_batch;
}
return /** @type {Batch} */ (this.#batch);
Member Author

This is the only thing in this PR that gives me a bit of pause: we have no definite way to know which batch is the latest one for a given boundary, so this feels like "only" a good-enough approximation.

Comment on lines +408 to +413
this.#callbacks.clear();
this.#maybe_dirty_effects =
	this.#dirty_effects =
	this.#boundary_async_effects =
	this.#async_effects =
		[];
Member

shouldn't this stuff get GC'd along with the rest of the batch?

Member Author

As said in the meeting, my hunch is that obsolete batches can still be called at some point, and in that case we don't want to rerun anything for them.

Development

Successfully merging this pull request may close these issues.

TypeError: Can't access property "increment", batch is null
null exception if template markup and script contain remote functions
2 participants