Retry support for exceptions in transaction commit phase without re-executing reader #4114

Open
daanvdh opened this issue May 17, 2022 · 1 comment
Labels
status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter type: feature

Comments

@daanvdh

daanvdh commented May 17, 2022

Expected Behavior

By adding Throwables to a (not yet existing) configuration, the batch framework would catch and swallow these exceptions if they were thrown during the commit phase of the transaction, and would add the ChunkContext back to the attributeQueue so that the chunk can be retried without having to execute the reader again.

Current Behavior

Currently there is no retry mechanism in place where ChunkContext has status completed (correctly) and we only want to retry the processor and writer.

Work around implementation details

Implementing this feature currently requires copying and pasting several framework classes and making only minor changes to them. This will be difficult to maintain when the framework changes.

  • Copy and paste ChunkOrientedTasklet and remove the line "chunkContext.removeAttribute(INPUTS_KEY);".
  • Set our own version of ChunkOrientedTasklet by using TaskletStep::setTasklet in the jobConfiguration.
  • Copy and paste StepContextRepeatCallback and add a catch block for exceptions thrown by the transaction commit. In the catch block, if the exception is swallowed, store the ChunkContext in the attributeQueue and set its completed status to false.
  • Extend TaskletStep and override doExecute, only to use our own version of StepContextRepeatCallback.
  • Use our version of TaskletStep in the jobConfiguration.
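
To make the intent of these steps easier to follow, here is a minimal, framework-free sketch of the idea. It is not Spring Batch code: `Chunk`, `CommitFailedException`, `Committer` and `runOnce` are hypothetical stand-ins for ChunkContext, the configured Throwables, the transaction commit and one repeat iteration. It only illustrates that a chunk whose commit failed is put back on the queue with its inputs, so the next iteration re-runs the processor and writer without calling the reader.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Queue;

/** Simplified model of the workaround; none of these types are Spring Batch API. */
public class CommitRetrySketch {

    /** Hypothetical stand-in for ChunkContext: the items already read plus a completion flag. */
    static final class Chunk {
        final List<String> inputs;
        boolean complete;
        Chunk(List<String> inputs) { this.inputs = inputs; }
    }

    /** Hypothetical stand-in for an exception configured as retryable in the commit phase. */
    static final class CommitFailedException extends RuntimeException {
        CommitFailedException(String message) { super(message); }
    }

    /** Hypothetical stand-in for writer plus transaction commit. */
    interface Committer { void commit(List<String> processed); }

    final Queue<Chunk> attributeQueue = new ArrayDeque<>();
    int readCalls = 0;

    /** Runs one iteration; returns false once there is nothing left to do. */
    boolean runOnce(Iterator<String> reader, int chunkSize, Committer committer) {
        // Prefer a chunk that was parked after a failed commit; read only if there is none.
        Chunk chunk = attributeQueue.poll();
        if (chunk == null) {
            List<String> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
                readCalls++;
            }
            if (items.isEmpty()) {
                return false;
            }
            chunk = new Chunk(items);
        }
        // "Processor": always re-executed, also for a parked chunk.
        List<String> processed = new ArrayList<>();
        for (String item : chunk.inputs) {
            processed.add(item.toUpperCase(Locale.ROOT));
        }
        try {
            committer.commit(processed);
            chunk.complete = true;
        } catch (CommitFailedException e) {
            // Swallow, keep the inputs, and park the chunk for the next iteration.
            chunk.complete = false;
            attributeQueue.add(chunk);
        }
        return true;
    }

    public static void main(String[] args) {
        CommitRetrySketch sketch = new CommitRetrySketch();
        List<String> committed = new ArrayList<>();
        int[] attempts = {0};
        Iterator<String> reader = List.of("a", "b", "c").iterator();
        Committer failOnce = processed -> {
            if (attempts[0]++ == 0) {
                throw new CommitFailedException("simulated commit failure");
            }
            committed.addAll(processed);
        };
        while (sketch.runOnce(reader, 2, failOnce)) {
            // keep iterating until the reader and the queue are both empty
        }
        System.out.println(committed + " reads=" + sketch.readCalls + " commits=" + attempts[0]);
    }
}
```

Note that this sketch retries without any limit; a real implementation would presumably need a retry limit or back-off so that a permanently failing commit cannot loop forever.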

Context

Our use case is that we read multiple xml files concurrently; each file contains data to either create or update entities in a db. The system must be able to handle multiple xml files that may contain updates to the same entities. In the current design, a batch job reads a single xml file: the reader reads a chunk of xml data and creates a pojo per entity to create/update. These pojos are the input to the processor, which fetches the existing entity (if present) and updates it, or creates it otherwise. The writer then writes the entities to the db.

Currently, if 2 xml files that contain an update to the same entity are processed concurrently by 2 different batch jobs, an OptimisticLockingException can occur in the commit phase. That is because both processors fetched the existing entity from the db before either writer had written its changes. Simply re-executing the processor and writer on the same pojo fixes this.
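
The race described above can be illustrated with a small in-memory sketch. All names here (`Store`, `Entity`, `Update`, `StaleVersionException`, `processAndWriteWithRetry`) are hypothetical and only model a versioned row with an optimistic check; they are not taken from our code base or from any framework. The point is that re-running the processor re-fetches the entity at its new version, so the second write succeeds on the same pojo.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the optimistic locking race; all types are illustrative assumptions. */
public class OptimisticRetrySketch {

    /** Snapshot of a stored entity: its value and the version it was read at. */
    static final class Entity {
        final String value;
        final int version;
        Entity(String value, int version) { this.value = value; this.version = version; }
    }

    /** The pojo produced by the reader: which entity to touch and what to append. */
    static final class Update {
        final String id;
        final String suffix;
        Update(String id, String suffix) { this.id = id; this.suffix = suffix; }
    }

    /** Stand-in for the optimistic locking failure raised at commit time. */
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String id) { super("stale version for " + id); }
    }

    /** Versioned store: a write succeeds only if the expected version is still current. */
    static final class Store {
        private final Map<String, Entity> rows = new HashMap<>();

        Entity fetch(String id) {
            return rows.getOrDefault(id, new Entity("", 0));
        }

        void write(String id, Entity updated, int expectedVersion) {
            if (fetch(id).version != expectedVersion) {
                throw new StaleVersionException(id);
            }
            rows.put(id, new Entity(updated.value, expectedVersion + 1));
        }
    }

    /** "Processor": fetch the current entity and apply the update to it. */
    static Entity process(Store store, Update update) {
        Entity current = store.fetch(update.id);
        return new Entity(current.value + update.suffix, current.version);
    }

    /** Re-run processor and writer on the same pojo until the write succeeds or attempts run out. */
    static void processAndWriteWithRetry(Store store, Update update, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            Entity merged = process(store, update);
            try {
                store.write(update.id, merged, merged.version);
                return;
            } catch (StaleVersionException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Update fromFileA = new Update("e1", "A");
        Update fromFileB = new Update("e1", "B");

        // Both processors fetch the entity before either writer has written.
        Entity staleA = process(store, fromFileA);
        Entity staleB = process(store, fromFileB);
        store.write("e1", staleA, staleA.version);
        try {
            store.write("e1", staleB, staleB.version);
        } catch (StaleVersionException e) {
            // Same pojo, fresh fetch: the retry sees the other job's write.
            processAndWriteWithRetry(store, fromFileB, 3);
        }
        Entity result = store.fetch("e1");
        System.out.println(result.value + " v" + result.version);
    }
}
```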

@daanvdh daanvdh added status: waiting-for-triage Issues that we did not analyse yet type: feature labels May 17, 2022
@fmbenhassine
Contributor

Thank you for opening this issue.

Current Behavior
Currently there is no retry mechanism in place where ChunkContext has status completed (correctly) and we only want to retry the processor and writer.

I am not sure I understand the problem with this. If the ChunkContext is marked as completed correctly, why would one want or need to retry the chunk?

For this kind of non-trivial issue, and in order to address this ticket efficiently, we expect you to provide a minimal example that we can inspect to understand the problem you are describing and what you are requesting as a new feature. Thank you for your collaboration.

BTW, we are going to revisit the current implementation of the chunk-oriented processing model (see #3950), which might impact the way we introduce new features related to fault-tolerance and concurrency.

@fmbenhassine fmbenhassine added status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter and removed status: waiting-for-triage Issues that we did not analyse yet labels Apr 5, 2023