-
Notifications
You must be signed in to change notification settings - Fork 2.4k
JSON files are randomly not well formatted when multi-thread step is used #4708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thank you for raising this issue. This is actually due to the fact that What you can do is wrap it in a Here is an example: https://gist.github.com/fmbenhassine/2ed050f69d5472c6857ab9690e3715ab. The example is for the xml writer, but the same idea applies to the json writer as well. Can you try this and share your feedback? TIA |
Thank you for your suggestion, normally I already use the I was also using the My test has a chunk size of 1, and only 2 threads in parallel. So now the result is [
,
{"code":10002,"ref":"B2C3D4E5F6G7","type":11,"nature":6,"etat":2,"ref2":"B2C3D4E5F6G7"},
{"code":10004,"ref":"D4E5F6G7H8I9","type":13,"nature":8,"etat":4,"ref2":"D4E5F6G7H8I9"},
{"code":10006,"ref":"F6G7H8I9J0K1","type":15,"nature":10,"etat":6,"ref2":"F6G7H8I9J0K1"},
{"code":10008,"ref":"H8I9J0K1L2M3","type":17,"nature":12,"etat":8,"ref2":"H8I9J0K1L2M3"} {"code":10001,"ref":"A1B2C3D4E5F6","type":10,"nature":5,"etat":1,"ref2":"A1B2C3D4E5F6"},
{"code":10003,"ref":"C3D4E5F6G7H8","type":12,"nature":7,"etat":3,"ref2":"C3D4E5F6G7H8"},
{"code":10005,"ref":"E5F6G7H8I9J0","type":14,"nature":9,"etat":5,"ref2":"E5F6G7H8I9J0"},
{"code":10007,"ref":"G7H8I9J0K1L2","type":16,"nature":11,"etat":7,"ref2":"G7H8I9J0K1L2"},
{"code":10009,"ref":"I9J0K1L2M3N4","type":18,"nature":13,"etat":9,"ref2":"I9J0K1L2M3N4"},
{"code":10010,"ref":"J0K1L2M3N4O5","type":19,"nature":14,"etat":10,"ref2":"J0K1L2M3N4O5"},
{"code":10011,"ref":"K1L2M3N4O5P6","type":20,"nature":15,"etat":11,"ref2":"K1L2M3N4O5P6"},
{"code":10012,"ref":"L2M3N4O5P6Q7","type":21,"nature":16,"etat":12,"ref2":"L2M3N4O5P6Q7"},
{"code":10013,"ref":"M3N4O5P6Q7R8","type":22,"nature":17,"etat":13,"ref2":"M3N4O5P6Q7R8"},
{"code":10014,"ref":"N4O5P6Q7R8S9","type":23,"nature":18,"etat":14,"ref2":"N4O5P6Q7R8S9"},
{"code":10015,"ref":"O5P6Q7R8S9T0","type":24,"nature":19,"etat":15,"ref2":"O5P6Q7R8S9T0"},
{"code":10016,"ref":"P6Q7R8S9T0U1","type":25,"nature":20,"etat":16,"ref2":"P6Q7R8S9T0U1"},
{"code":10017,"ref":"Q7R8S9T0U1V2","type":26,"nature":21,"etat":17,"ref2":"Q7R8S9T0U1V2"},
{"code":10018,"ref":"R8S9T0U1V2W3","type":27,"nature":22,"etat":18,"ref2":"R8S9T0U1V2W3"},
{"code":10019,"ref":"S9T0U1V2W3X4","type":28,"nature":23,"etat":19,"ref2":"S9T0U1V2W3X4"}
] where it should be [
{"code":10001,"ref":"A1B2C3D4E5F6","type":10,"nature":5,"etat":1,"ref2":"A1B2C3D4E5F6"},
{"code":10003,"ref":"C3D4E5F6G7H8","type":12,"nature":7,"etat":3,"ref2":"C3D4E5F6G7H8"},
{"code":10005,"ref":"E5F6G7H8I9J0","type":14,"nature":9,"etat":5,"ref2":"E5F6G7H8I9J0"},
{"code":10007,"ref":"G7H8I9J0K1L2","type":16,"nature":11,"etat":7,"ref2":"G7H8I9J0K1L2"},
{"code":10002,"ref":"B2C3D4E5F6G7","type":11,"nature":6,"etat":2,"ref2":"B2C3D4E5F6G7"},
{"code":10004,"ref":"D4E5F6G7H8I9","type":13,"nature":8,"etat":4,"ref2":"D4E5F6G7H8I9"},
{"code":10006,"ref":"F6G7H8I9J0K1","type":15,"nature":10,"etat":6,"ref2":"F6G7H8I9J0K1"},
{"code":10008,"ref":"H8I9J0K1L2M3","type":17,"nature":12,"etat":8,"ref2":"H8I9J0K1L2M3"},
{"code":10009,"ref":"I9J0K1L2M3N4","type":18,"nature":13,"etat":9,"ref2":"I9J0K1L2M3N4"},
{"code":10010,"ref":"J0K1L2M3N4O5","type":19,"nature":14,"etat":10,"ref2":"J0K1L2M3N4O5"},
{"code":10011,"ref":"K1L2M3N4O5P6","type":20,"nature":15,"etat":11,"ref2":"K1L2M3N4O5P6"},
{"code":10012,"ref":"L2M3N4O5P6Q7","type":21,"nature":16,"etat":12,"ref2":"L2M3N4O5P6Q7"},
{"code":10013,"ref":"M3N4O5P6Q7R8","type":22,"nature":17,"etat":13,"ref2":"M3N4O5P6Q7R8"},
{"code":10014,"ref":"N4O5P6Q7R8S9","type":23,"nature":18,"etat":14,"ref2":"N4O5P6Q7R8S9"},
{"code":10015,"ref":"O5P6Q7R8S9T0","type":24,"nature":19,"etat":15,"ref2":"O5P6Q7R8S9T0"},
{"code":10016,"ref":"P6Q7R8S9T0U1","type":25,"nature":20,"etat":16,"ref2":"P6Q7R8S9T0U1"},
{"code":10017,"ref":"Q7R8S9T0U1V2","type":26,"nature":21,"etat":17,"ref2":"Q7R8S9T0U1V2"},
{"code":10018,"ref":"R8S9T0U1V2W3","type":27,"nature":22,"etat":18,"ref2":"R8S9T0U1V2W3"},
{"code":10019,"ref":"S9T0U1V2W3X4","type":28,"nature":23,"etat":19,"ref2":"S9T0U1V2W3X4"}
] So my opinion is that the first thread must race to be the first to get the semaphore in I start to think on a way to identify the thread that writes the first chunk, it should be possible based on the |
oh sorry i did not check the code you shared, my bad. In that case, I need to dig deeper. I will add a comment here when done. |
…unk to be written by first thread
To "dig deeper", you're sooooo right ! An improvement could be to make this latch optional (by default set to 0 to deactivate it) as this is not necessary for output formats without record separator (XML, CSV). |
🤣 Welcome to the deepest rabbit hole of Spring Batch 😅
I really appreciate your time and effort to open a workaround PR, but I just wanted to let you know that we are working on a complete rewrite of the implementation, see #3950. We have patched things around for years and this has to stop at some point. For this reason, we are not going to invest time on working around issues anymore. Rather, we are focusing on testing the initial prototype of the new implementation. If you want to help, I would appreciate if you give that a try and share your feedback. |
oh no, I was hoping at least a 5.1.3 or 5.2 😞 it's so hard to follow the execution line as you unfolded in #3950 , that I only can understand and approve the need of rework, sometimes it's better to restart from blank page I started to read the new implementation, will try to test it when free time on the project and will come back to you for feedbacks, and probably questions notably related on differences between |
While I agree with much of the assessments on the current implementation of multi-threaded steps, I don't think it's actually at fault here. 😄 @glelarge I think your analysis of the problem is correct: The root cause is that the But I believe the solution is to just disable the
The writer will then flush within the scope of the synchronization of Disabling the |
@hpoettker Thank you ! With the transactional to |
Thank you for the feedback @glelarge ! and thank you for the follow up on this @hpoettker ! Based on the last comments, I am closing this issue and the corresponding PR. |
Bug description
Spring Batch offers the possibility to configure a chunk-oriented step as multi-threaded by configuring a
TaskExecutor
in the Step configuration.When configured with multi-thread, the produced JSON is sometimes not correctly formatted.
Detected with Spring Batch 5.1.2 and Spring Framework 6.1.13.
Wrong result
Sometime the produced JSON is misformatted :
Expected result
When the file is not well formatted, the error is always the same :
Minimal Complete Reproducible example
https://github.com/glelarge/spring-batch-json-multithread
git clone https://github.com/glelarge/spring-batch-json-multithread cd spring-batch-json-multithread mvn package ./run.sh
If the issue is related to the configuration, would it be possible to update the repo to fix the configuration ?
The text was updated successfully, but these errors were encountered: