Support gradient accumulation #1238

Conversation
Maybe it would also make sense to rename …
fegin left a comment
Thanks for the PR. I suggest that we don't let train_step() be aware of data_iterator. Please see the detail comments.
Also, this PR doesn't change the parallelization, which is not correct. We will have to call set_requires_gradient_sync if FSDP is applied. We can raise an exception if DDP is used and accumulation_steps > 1 for now.
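For context, the pattern fegin refers to, skipping gradient synchronization on all but the last microbatch, can be sketched as follows. `set_requires_gradient_sync` is FSDP2's toggle; the stub module and helper below are purely illustrative so the control flow runs standalone:

```python
class FakeFSDPModule:
    """Stand-in for an FSDP2-wrapped module; records the sync toggles."""
    def __init__(self):
        self.sync_history = []

    def set_requires_gradient_sync(self, requires_sync):
        self.sync_history.append(requires_sync)

def run_accumulated_step(model_parts, accumulation_steps):
    for micro_step in range(accumulation_steps):
        # Only the final microbatch should trigger the gradient
        # reduce-scatter/all-reduce communication.
        is_last = micro_step == accumulation_steps - 1
        for part in model_parts:
            part.set_requires_gradient_sync(is_last)
        # ... forward_backward_step(inputs, labels) would go here ...

model = FakeFSDPModule()
run_accumulated_step([model], accumulation_steps=4)
print(model.sync_history)  # [False, False, False, True]
```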
torchtitan/train.py (outdated):

    unwrapped_loss_fn = self.loss_fn

    @functools.wraps(unwrapped_loss_fn)
    def accumulated_loss_fn(*args, **kwargs):
We should just modify build_loss_fn to take accumulation_steps to let the loss function decide the usage.
I'm OK either way.
I think being more explicit about grad accumulation handling doesn't look bad.
Also if we go with explicit global_batch_size and implicit grad_accu_steps, then we'll need to do another check & computation in the loss function.
I moved the wrapping functionality to torchtitan.components.loss, called it rescale_accumulated_loss. Not quite like what you wanted, but that way we can re-use the Trainer.gradient_accumulation_step value more easily.
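A minimal sketch of what such a rescaling wrapper might look like (illustrative only; the actual `rescale_accumulated_loss` in `torchtitan.components.loss` may differ in signature and details):

```python
import functools

def rescale_accumulated_loss(unwrapped_loss_fn, accumulation_steps):
    """Divide a mean-reduced loss by the number of gradient accumulation
    steps, so that summing the microbatch gradients reproduces the
    gradient of a mean over the full (global) batch."""
    @functools.wraps(unwrapped_loss_fn)
    def accumulated_loss_fn(*args, **kwargs):
        return unwrapped_loss_fn(*args, **kwargs) / accumulation_steps
    return accumulated_loss_fn

# Plain-number stand-in for a mean-reduced loss function:
mse = lambda pred, target: (pred - target) ** 2
scaled = rescale_accumulated_loss(mse, 4)
print(scaled(3.0, 1.0))  # 1.0  (= (3 - 1)^2 / 4)
```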
tianyu-l left a comment
Thank you for adding this feature!
I left several comments. Please see if they make sense.
torchtitan/config_manager.py (outdated):

        loaded from this path instead of downloaded.
        """

    batch_size: int = 8
yeah let's call it local_batch_size
Did a rename across the codebase wherever JobConfig.training.batch_size or --training.batch_size was used. Not sure how you'd like me to handle the compatibility breakage this introduces.
torchtitan/train.py (outdated):

    if job_config.training.global_batch_size < 0:
        job_config.training.global_batch_size = (
            job_config.training.batch_size * dp_degree
        )
    assert job_config.training.global_batch_size > 0
    assert (
        job_config.training.global_batch_size
        % (job_config.training.batch_size * dp_degree)
        == 0
    ), (
        f"global batch size must be multiple of local batch size times "
        f"data-parallel degree ({job_config.training.global_batch_size} "
        f"% ({job_config.training.batch_size} * {dp_degree}) != 0)"
    )

    self.gradient_accumulation_steps = job_config.training.global_batch_size // (
        job_config.training.batch_size * dp_degree
    )
    assert self.gradient_accumulation_steps > 0
nit comment
Suggested change, replacing:

    if job_config.training.global_batch_size < 0:
        job_config.training.global_batch_size = (
            job_config.training.batch_size * dp_degree
        )
    assert job_config.training.global_batch_size > 0
    assert (
        job_config.training.global_batch_size
        % (job_config.training.batch_size * dp_degree)
        == 0
    ), (
        f"global batch size must be multiple of local batch size times "
        f"data-parallel degree ({job_config.training.global_batch_size} "
        f"% ({job_config.training.batch_size} * {dp_degree}) != 0)"
    )
    self.gradient_accumulation_steps = job_config.training.global_batch_size // (
        job_config.training.batch_size * dp_degree
    )
    assert self.gradient_accumulation_steps > 0

with:

    global_batch_size = job_config.training.global_batch_size
    if global_batch_size < 0:
        global_batch_size = job_config.training.batch_size * dp_degree
        self.gradient_accumulation_steps = 1
    else:
        assert global_batch_size > (job_config.training.batch_size * dp_degree)
        assert (
            job_config.training.global_batch_size
            % (job_config.training.batch_size * dp_degree)
            == 0
        ), (
            f"global batch size must be multiple of local batch size times "
            f"data-parallel degree ({global_batch_size} "
            f"% ({job_config.training.batch_size} * {dp_degree}) != 0)"
        )
        self.gradient_accumulation_steps = global_batch_size // (
            job_config.training.batch_size * dp_degree
        )
Don't really agree with not re-using the code that would become the else case here, but I can still change it to your recommendation. For now, I put the addition of the global_batch_size variable into its own commit, which probably already has the readability improvements you'd like. Also added a comment in the if case noting that this global batch size results in 1 gradient accumulation step.
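Distilled into a standalone function (plain Python; the function name is hypothetical), the derivation under discussion is just:

```python
def derive_grad_accum_steps(global_batch_size, local_batch_size, dp_degree):
    # A negative global batch size means "infer it from local settings",
    # which yields exactly one accumulation step (i.e., no accumulation).
    if global_batch_size < 0:
        global_batch_size = local_batch_size * dp_degree
    assert global_batch_size % (local_batch_size * dp_degree) == 0, (
        "global batch size must be a multiple of local batch size "
        "times data-parallel degree"
    )
    return global_batch_size // (local_batch_size * dp_degree)

print(derive_grad_accum_steps(-1, 8, 2))  # 1
print(derive_grad_accum_steps(32, 8, 2))  # 2
```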
torchtitan/train.py (outdated):

    self.loss_fn = self.train_spec.build_loss_fn(job_config)

    unwrapped_loss_fn = self.loss_fn
Let's put the self.gradient_accumulation_steps derivation code right before here, to group gradient accum logic together as much as possible.
I understand that it is desirable to fail early on infeasible global batch size, even before parallelism and other heavy things are applied. But I'd suggest we prioritize readability. What do you think?
Sounds fair! :)
Moved this.
torchtitan/train.py (outdated):

    # Keep these variables local to shorten the code as these are
    # the major variables that are used in the training loop.
    def batch_backward(self, input_dict: dict[str, torch.Tensor], labels: torch.Tensor):
can we call it forward_backward_step?
Done. By the way, if you'd prefer me to squash these changes into the previous commits, I'd be happy to clean up the commit chain.
torchtitan/train.py (outdated):

    model_parts = self.model_parts
    world_mesh = self.world_mesh
similarly, maybe not worth keeping these two
Done.
torchtitan/components/metrics.py (outdated):

    )
    self.ntokens_since_last_log = 0
    self.data_loading_times = []
    self.accumulated_losses = []
Since it represents a core training concept, rather than directly used for metrics logging, let's put this in Trainer, instead of MetricsProcessor.
Done. Also added the gradient_accumulation_steps attribute to the Trainer's dataclass attributes.
torchtitan/train.py (outdated):

    except StopIteration:
        # If data runs out during gradient accumulation, that
        # entire step will not be executed.
        return True
Instead of explicit return True, can we just call next and let the StopIteration exception propagate to train_step and catch over there?
I initially had it implemented this way, but thought the try block would encapsulate too much code. If anything else raised a StopIteration, it would make debugging much more difficult; hence the minimized try scope.
I would prefer to directly raise StopIteration and let the outer loop catch it. As mentioned in the discussion above, the original design keeps train_step() simple, without data dependency, so there is no other StopIteration afaik. If other places actually raise StopIteration, we should figure that out.
If we really want to avoid ambiguity, we can have a customized next(), like next_batch(), which raises a customized DataDepleteException().
That's considerate. I think it's quite unlikely other places would also raise StopIteration? Maybe microbatching in pipeline parallel? But over there the number of microbatches should be fixed ahead of time.
Anyways, if you think we need to deal with this explicitly, we should catch the StopIteration exception, and raise a customized DataloaderStopIteration exception to be caught by caller, instead of return True.
Went with a combination of these suggestions; a Trainer.next_batch method basically just calls next(data_iterator), but catches and re-raises its StopIteration as a new DataloaderStopIteration.
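The combination described here can be sketched like so (simplified; the real signatures and class placement in `torchtitan/train.py` may differ):

```python
class DataloaderStopIteration(StopIteration):
    """Indicates dataloader exhaustion, distinguishable from any other
    StopIteration that might surface during a training step."""

def next_batch(data_iterator):
    # Thin wrapper around next() that translates exhaustion into the
    # dedicated exception, keeping the try scope as small as possible.
    try:
        return next(data_iterator)
    except StopIteration as e:
        raise DataloaderStopIteration from e

batches = iter([("batch0", "labels0"), ("batch1", "labels1")])
print(next_batch(batches)[0])  # batch0
```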
torchtitan/train.py (outdated):

    self.step += 1
    self.gc_handler.run(self.step)
    - self.train_step(inputs, labels)
    + data_ran_out = self.train_step(data_iterator)
we can catch the StopIteration here and do different treatment on self.checkpointer.save in try vs. catch.
See above.
Has been changed, but we now simply break in case of the DataloaderStopIteration to avoid changing the checkpointing logic.
This does change the general logic (e.g., torch_profiler and memory_profiler won't be stepped anymore) compared to the previous code, but it is a bit nicer to read than adding an extra variable check in the while condition, IMO.
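The resulting control flow, reduced to its skeleton (stubbed so it runs standalone; the real loop also steps the gc handler, profilers, checkpointing, etc.):

```python
class DataloaderStopIteration(StopIteration):
    """Dataloader exhaustion signal raised by the training step."""

def train_step(data_iterator):
    # Stub: consume one batch, translating exhaustion into the signal.
    try:
        next(data_iterator)
    except StopIteration as e:
        raise DataloaderStopIteration from e

def run_loop(batches, total_steps):
    data_iterator = iter(batches)
    completed = 0
    step = 0
    while step < total_steps:
        step += 1
        try:
            train_step(data_iterator)
        except DataloaderStopIteration:
            break  # bail out instead of touching checkpointing logic
        completed += 1
    return completed

print(run_loop(range(3), 10))  # 3
```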
tianyu-l left a comment
> Also, this PR doesn't change the parallelization, which is not correct. We will have to call set_requires_gradient_sync if FSDP is applied.
@fegin For background please see #292 (comment)
I think for us we don't want the potential memory overhead and code complexity, although it can save some communications which could've been hidden anyway.
torchtitan/train.py (outdated):

    def train_step(
        self,
        data_iterator: Iterable[tuple[dict[str, torch.Tensor], torch.Tensor]],
    ) -> bool | None:
We should just return bool and change all the other returns to return False to keep the semantics consistent. This should be changed if we still keep the return value as the design option, but I prefer try/catch. See the response below.
Reverted this/refactored to try-catch solution as per other discussions. Return type is back to implicit None.
The review order looks pretty confusing, lol. The summary of some big discussions: …
cc. @tianyu-l
hey @janEbert how about we work a bit more on the PR. Sorry for the confusion in the reviews. I think we have agreed on the direction.
Please also add a test case in https://github.com/pytorch/torchtitan/blob/main/tests/integration_tests.py
@fegin said:
> TorchTitan currently doesn't perform force checkpoint if data is depleted. We can fix this but I suggest that we don't do this in this PR. (See pytorch#1238 (comment).)
I believe I have incorporated all the feedback. Let me know how you like the changes. FYI, I'm currently at a conference and on vacation from Friday, so it would be great to get this done before Friday, even if I may only sporadically find time. :)
Rebased because of …
tianyu-l left a comment
Looks almost good! Please address final comments.
Also, the addition of forward_backward_step breaks the FLUX model training.
Could you help refactor the train_step to forward_backward_step over there? Probably just:
- remove the optimizer.zero_grad
- remove https://github.com/pytorch/torchtitan/blob/main/torchtitan/experiments/flux/train.py#L152-L180
- return loss

For the eval step https://github.com/pytorch/torchtitan/blob/main/torchtitan/experiments/flux/train.py#L182: it should be done in Trainer.train(), but since we are not using grad accumulation in FLUX training, it is OK to leave it in forward_backward_step to accelerate landing of this PR, as long as CI tests pass. @wwwjn and I will work together on fixing it later.
tests/integration_tests.py (outdated):

    OverrideDefinitions(
        [
            [
                # Default local batch size = 8, and `ngpu=2`, so
Let's explicitly specify the local batch size as well, in case some future PR changes the default without changing the test here.
Done.
torchtitan/config_manager.py (outdated):

    """
    The size of each pipeline parallel microbatch (default 1).
    - This value is used to compute the total number of microbatches by dividing batch_size with
    + This value is used to compute the total number of microbatches by dividing local batch_size with
Suggested change:

    - This value is used to compute the total number of microbatches by dividing local batch_size with
    + This value is used to compute the total number of microbatches by dividing local_batch_size with
Great catch! I didn't see the underscore on my dirty screen lol
torchtitan/train.py (outdated):

    class DataloaderStopIteration(StopIteration):
        """An exception that indicates dataloader exhaustion."""

        pass
Done.
torchtitan/train.py (outdated):

    try:
        self.train_step(data_iterator)
    except DataloaderStopIteration:
        logger.info("Ran out of data; last step was canceled.")
Suggested change:

    - logger.info("Ran out of data; last step was canceled.")
    + logger.warning("Ran out of data; last step was canceled.")
Done.
torchtitan/train.py (outdated):

    # Keep these variables local to shorten the code as these are
    # the major variables that are used in the training loop.
    def next_batch(
This function sounds less necessary, especially when we already have dataloader and batch_generator. Given how short it is, it seems not too bad just running the try-catch in train_step?
To me, it makes the train_step look cleaner and it was nice to have it re-usable for the FLUX refactor. Does that change your mind? :)
I was thinking to patch the data iterator's __next__ method on-the-fly, to ensure the DataloaderStopIteration is raised, but didn't want to put too much black magic. It would require modifying the ParallelAwareDataloader.__iter__ method to apply the patch to the returned iterator. What do you think of that option?
I would suggest to keep the current implementation. Monkey patching is usually not a good idea. Also agree this function makes train_step cleaner.
Some future benefit, we may want to do data loader pipelining, which overlaps the to("cuda") with the computation. This function gives us a good place to implement it.
        job_config.training.local_batch_size * dp_degree
    )
    assert self.gradient_accumulation_steps > 0
    self.loss_fn = rescale_accumulated_loss(
This is a comment, not a suggestion:
The code sounds to me like it assumes the loss function must perform a "mean" reduction, instead of the "sum" reduction also available in e.g. cross entropy loss.
But I believe this assumption is also made in PyTorch DDP, FSDP, PP, and is universally accepted as the default now. So I think it's OK.
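A quick numeric sanity check of that assumption (plain Python, no torch): with mean-reduced microbatch losses, dividing each by the number of accumulation steps makes the accumulated total match a single full-batch mean; with a "sum" reduction, the same rescaling would not recover the full-batch sum:

```python
full_batch = [1.0, 3.0, 5.0, 7.0]
micro_batches = [full_batch[:2], full_batch[2:]]
steps = len(micro_batches)

# Mean reduction: rescaled microbatch means accumulate to the full mean.
full_mean = sum(full_batch) / len(full_batch)
accumulated = sum(sum(mb) / len(mb) / steps for mb in micro_batches)
assert abs(full_mean - accumulated) < 1e-12  # both 4.0

# Sum reduction: the division by `steps` has nothing to cancel it.
full_sum = sum(full_batch)                                       # 16.0
accumulated_sum = sum(sum(mb) / steps for mb in micro_batches)   # 8.0
assert accumulated_sum != full_sum
```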
Good point. I added a docstring to the function to explicitly mention this.
Yes, CP also assumes mean. A docstring will be nice, thanks!
PTAL.

Agree! The current change on the FLUX side looks good to me. In the future I will also test grad accumulation w/ FLUX. Ideally in the future I will move …
fegin left a comment
LGTM. There are some typing nits, but overall the implementation is clean.
torchtitan/train.py (outdated):

    def forward_backward_step(
        self, input_dict: dict[str, torch.Tensor], labels: torch.Tensor
    ):
Can we type the return value?
Done.
    def train_step(
        self, data_iterator: Iterable[tuple[dict[str, torch.Tensor], torch.Tensor]]
    ):
ditto, can we type the return value?
This returns an implicit None, so probably this one should remain as it is (i.e., not add the `-> None`)?
ye, it's minor, but I think it is generally better to explicitly type, even for None: https://peps.python.org/pep-0484/#using-none
I can't run CI because "This branch has conflicts that must be resolved".
Would you please rebase?
Had one more comment on the next_batch function. See if you agree.
torchtitan/train.py (outdated):

    # Keep these variables local to shorten the code as these are
    # the major variables that are used in the training loop.
    def next_batch(
I still think it's not necessary to create next_batch.
For the purpose of transforming the StopIteration exception, can we just do it in batch_generator? E.g., not doing a for loop, but while True and try-catch.

> it was nice to have it re-usable for the FLUX refactor.

I think FLUX can reuse all of train_step. For correctness right now, we can overload FluxTrainer.train_step() by calling super().train_step() and then doing eval.

> Some future benefit, we may want to do data loader pipelining, which overlaps the to("cuda") with the computation. This function gives us a good place to implement it.

For future benefits, let's add them only when the future comes.
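tianyu-l's suggestion (a while True/try-except inside batch_generator) could look roughly like this. One subtlety worth flagging: in this sketch the exception deliberately does not subclass StopIteration, because under PEP 479 a StopIteration escaping a generator body is replaced by a RuntimeError, which would mask the signal:

```python
class DataloaderStopIteration(Exception):
    """Dataloader exhaustion. Not a StopIteration subclass in this
    sketch: PEP 479 turns any StopIteration escaping a generator frame
    into RuntimeError, hiding the signal from the caller."""

def batch_generator(data_iterable):
    # while True + try/except instead of a plain for loop, so iterator
    # exhaustion can be translated into the dedicated exception.
    data_iterator = iter(data_iterable)
    while True:
        try:
            batch = next(data_iterator)
        except StopIteration as e:
            raise DataloaderStopIteration from e
        yield batch

gen = batch_generator([("inputs0", "labels0")])
print(next(gen)[0])  # inputs0
```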
Awesome idea!
PTAL.

One other thing I noticed and changed just now, which was an artifact from earlier versions: we don't need to keep the …
Previously `int | None`. Makes it possible to obtain the automatic calculation of it when it has already been set in a TOML config.
@fegin said:
> TorchTitan currently doesn't perform force checkpoint if data is depleted. We can fix this but I suggest that we don't do this in this PR. (See pytorch#1238 (comment).)
I.e., a new `DataloaderStopIteration` that inherits from `StopIteration`. Accordingly, we no longer return an optional `bool` to indicate depletion, and the remainder of the code is adapted to catch the new exception instead.
This concerns only renaming:
- `--training.batch_size` to `--training.local_batch_size`
- `job_config.training.batch_size` to `job_config.training.local_batch_size`
I.e., the method in `Trainer`.
Instead use a new helper variable `global_batch_size` for all logic. Improves readability.
Improve readability.
These were only used in 1 or 2 locations each.
... from `MetricsProcessor`.
... toward `forward_backward_step` design.
We now raise the `DataloaderStopIteration` from inside the `batch_generator` generator method. `next_batch` can thus be removed, as its only remaining purpose was raising the custom exception upon iterator exhaustion.
Move from dataclass attributes to method-local variable.
wwwjn left a comment
Thanks for doing such great work!
@janEbert Thank you very much for the elegant work!!!
Very kind words, thank you! Thank you also for all the patience and great reviews!
After #1238 landed, we could consolidate FLUX train_step() to reuse the main trainer's `train_step` function by removing the `eval_step()`. We will replace eval_step() with a `Validator` in the future to perform various validation methods.
First, the batched backward calculation is refactored into its own function. Then, gradient accumulation is implemented by moving the data iterator inside the `train_step` method and consuming data from it as necessary. I added some extra handling for non-infinite data iterators, but if you dislike that additional complexity, I can remove it to simplify the code.

The feature is enabled by giving an additional `--training.global_batch_size`, which has a sensible default of 1 gradient accumulation step (i.e., no actual accumulation).

@tianyu-l thanks for the ping.
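As a usage illustration (the concrete values are hypothetical), the feature would be driven from the training config, e.g.:

```toml
[training]
local_batch_size = 8     # per-rank microbatch size
global_batch_size = 32   # with dp_degree = 2: 32 / (8 * 2) = 2 accumulation steps
```

or, equivalently, via the `--training.local_batch_size` and `--training.global_batch_size` command-line overrides.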