Limited to using long for entity ids #1317

Open
spring-projects-issues opened this issue Jun 10, 2014 · 3 comments

Comments

@spring-projects-issues
Collaborator

Rob Fletcher opened BATCH-2254 and commented

We're attempting to build a workflow orchestration system, and Spring Batch (primarily using Tasklets) seemed a good fit. However, one of the things we need to do is persist job status to a clustered environment (cross-region on Amazon's cloud) so that we can recover from instance outages, even region outages, and have job execution continue. Because Spring Batch's Entity class makes it impossible to use any id type other than long, we can't use a clustered storage solution such as Cassandra: we'd have to introduce some kind of blocking in order to reliably generate a unique long id without danger of collision. It would make sense for Entity to use Serializable as its key type. That wouldn't preclude the current strategy for non-clustered SQL stores, but it would open a path to using UUIDs.
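
To make the proposal concrete, here is a minimal sketch of an entity base class keyed by any Serializable type. It is not Spring Batch's actual Entity class: `GenericEntity`, `LongKeyedExecution`, and `UuidKeyedExecution` are hypothetical names used only for illustration.

```java
import java.io.Serializable;
import java.util.Objects;
import java.util.UUID;

// Hypothetical stand-in for an entity base class; NOT Spring Batch's Entity.
class GenericEntity<ID extends Serializable> implements Serializable {
    private static final long serialVersionUID = 1L;

    private final ID id;

    GenericEntity(ID id) {
        this.id = Objects.requireNonNull(id, "id must not be null");
    }

    ID getId() {
        return id;
    }

    // Two entities are equal when they have the same concrete type and id.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (other == null || getClass() != other.getClass()) return false;
        return id.equals(((GenericEntity<?>) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(getClass(), id);
    }
}

// Existing strategy: a long id, e.g. drawn from a database sequence.
class LongKeyedExecution extends GenericEntity<Long> {
    private static final long serialVersionUID = 1L;

    LongKeyedExecution(long id) {
        super(id);
    }
}

// Clustered strategy: a random UUID generated without coordination.
class UuidKeyedExecution extends GenericEntity<UUID> {
    private static final long serialVersionUID = 1L;

    UuidKeyedExecution(UUID id) {
        super(id);
    }

    static UuidKeyedExecution create() {
        return new UuidKeyedExecution(UUID.randomUUID());
    }
}
```

With this shape, a SQL-backed store could keep using `LongKeyedExecution`, while a Cassandra-style store could call `UuidKeyedExecution.create()` on any node without blocking.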


Affects: 3.0.1

1 vote, 4 watchers

@spring-projects-issues
Collaborator Author

Rob Fletcher commented

OK. Delving deeper into the codebase I can see that JSR-352 is built around long ids and Spring Batch would be unable to implement various types from that JSR if the id type changed to Serializable. I'm worried this may not be a realistically solvable problem.

@spring-projects-issues
Collaborator Author

Dave Syer commented

We should maybe work on this together a bit. I think that the JobRepository was basically designed to need a global lock when creating a new JobExecution, so the ids for that entity should be centrally generated anyway (it's a common problem and I'm sure we can find a solution). The natural key for all the other entities is actually the job execution id and a local identifier, so my instinct is that there should be a repository implementation that doesn't care how global the latter are. As long as you don't need to start loads of really small jobs it should work.
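
The "job execution id plus local identifier" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `ChildKey` and `LocalIdAllocator` are hypothetical names, the job execution id is assumed to be a centrally generated UUID, and a single process is assumed to own each job execution so that an in-memory counter is enough.

```java
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical composite key for child entities (e.g. step executions):
// globally unique because the job execution id is, even though localId is not.
final class ChildKey {
    private final UUID jobExecutionId;
    private final int localId;

    ChildKey(UUID jobExecutionId, int localId) {
        this.jobExecutionId = Objects.requireNonNull(jobExecutionId, "jobExecutionId");
        this.localId = localId;
    }

    UUID getJobExecutionId() {
        return jobExecutionId;
    }

    int getLocalId() {
        return localId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ChildKey)) return false;
        ChildKey that = (ChildKey) other;
        return localId == that.localId && jobExecutionId.equals(that.jobExecutionId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(jobExecutionId, localId);
    }
}

// Hypothetical allocator: hands out local ids 1, 2, 3, ... scoped to one
// job execution. Assumes one owning process per job execution.
final class LocalIdAllocator {
    private final UUID jobExecutionId;
    private final AtomicInteger counter = new AtomicInteger();

    LocalIdAllocator(UUID jobExecutionId) {
        this.jobExecutionId = Objects.requireNonNull(jobExecutionId, "jobExecutionId");
    }

    ChildKey nextKey() {
        return new ChildKey(jobExecutionId, counter.incrementAndGet());
    }
}
```

Two job executions can both hold a child with local id 1 without colliding, so only the job execution id itself needs central generation.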

@fmbenhassine fmbenhassine changed the title Limited to using long for entity ids [BATCH-2254] Limited to using long for entity ids Sep 27, 2023
@fmbenhassine
Contributor

fmbenhassine commented Oct 24, 2023

The impediment due to the JSR is removed as of v5 (#3894). As mentioned by Dave Syer, IDs should be generated centrally to prevent creating duplicate job executions. However, they do not have to be of type long. The only current usage of that type is when getting the last job execution (descending order by ID). Getting the last job execution is required in three places:

  • When starting new job instances: need to check the state of the last job execution if any
  • When starting next job instances in a sequence with an incrementer (CommandLineJobRunner with "-next" option and JobOperator#startNextInstance): need to get job parameters from the last job execution
  • When restarting failed jobs: need to get the execution context from the last job execution

Now since the creation of job executions should be done centrally anyway, I think the ordering could be based on creation time.
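
As a rough illustration of ordering by creation time instead of by id, the sketch below picks the most recently created execution from a list. `ExecutionRecord` and `LastExecutionFinder` are hypothetical names, not Spring Batch types, and the sketch assumes creation timestamps are assigned centrally so that they are comparable across nodes.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.UUID;

// Hypothetical, simplified job execution record with a non-numeric id.
final class ExecutionRecord {
    final UUID id;
    final Instant createTime;

    ExecutionRecord(UUID id, Instant createTime) {
        this.id = id;
        this.createTime = createTime;
    }
}

final class LastExecutionFinder {
    private LastExecutionFinder() {
    }

    // Returns the execution with the latest creation time, or empty if none exist.
    // If two executions share a timestamp, which one wins is unspecified here;
    // a real implementation would need an explicit tie-breaker.
    static Optional<ExecutionRecord> findLast(List<ExecutionRecord> executions) {
        return executions.stream()
                .max(Comparator.comparing((ExecutionRecord e) -> e.createTime));
    }
}
```

In a SQL-backed repository the equivalent would be ordering by the creation-time column rather than by the id column, which removes the last dependency on ids being numeric.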

--

Related issue: #877
