Fix for memory leak in JVMObjectTracker #801
… to release JVM objects during shutdown.
...scala/microsoft-spark-2-3/src/test/scala/org/apache/spark/api/dotnet/DotnetBackendTest.scala
@imback82 I think the fix this PR addresses and the fix for issue #792 / PR #793 are similar. We can either leave This PR takes the first approach, and #793 does the second, but I think we should try to be consistent with both fixes. I'm partial to the 2nd approach and
Just to let you know: the fix I've created solves the memory leak, but it causes job isolation issues, because all tracked objects for one job may be cleaned up during the shutdown of another job. The only sensible way to fix that is to make
Sounds like a good plan. Looking forward to the updated PR.
…aru doesn't contain such method.
Hi @suhsteve, the PR was updated, so currently Looking forward to any feedback.
FYI, I've done a little regression testing in my QA environment using the latest fixes from the current PR. All batch jobs that previously suffered from the memory leak are now running correctly without any memory issues. It would be great to start reviewing the changes and move towards integrating them into master.
Thanks @spzSource for the update. I will get to this PR this week, sorry for the delay.
Looking forward to this PR going into master.
Few minor/nit comments, but generally looking great to me. Thanks @spzSource for working on this!
src/scala/microsoft-spark-2-3/src/main/scala/org/apache/spark/api/dotnet/SerDe.scala
src/scala/microsoft-spark-2-3/src/test/scala/org/apache/spark/api/dotnet/SerDeTest.scala
src/scala/microsoft-spark-2-3/src/test/scala/org/apache/spark/api/dotnet/Extensions.scala
...scala/microsoft-spark-2-3/src/test/scala/org/apache/spark/api/dotnet/DotnetBackendTest.scala
src/scala/microsoft-spark-2-3/src/main/scala/org/apache/spark/api/dotnet/DotnetBackend.scala
…to jvmObjectTracker-memory-leak
It seems that the latest build failed due to a NuGet feed issue:
Does anybody have an idea what happened to this feed?
Looks like the feed got deprecated recently. I will create a PR to fix this. Thanks!
The build issue is fixed in #807. I pushed the changes to your branch.
LGTM, thanks @spzSource!
Fixes #799
As explained in the linked issue, there is a memory leak in `JVMObjectTracker`. Because `JVMObjectTracker` is a singleton by nature, it retains references to JVM objects throughout the entire life cycle of the Spark driver node.
All of this in turn causes a massive memory leak when running `DotnetRunner` multiple times on the same driver node. For instance, this happens when using the SetJar approach in Databricks.
To fix the memory leak, `JVMObjectTracker` is cleaned up right before `DotnetBackend` shutdown, so it releases its references to tracked JVM objects, which allows these objects to be garbage collected.
Heap statistics using the `jmap` tool after the fix was applied:
Old gen space usage is 12%, compared with 99.9% before the fix.
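The idea of releasing tracked references at shutdown can be illustrated with a minimal sketch. This is not the actual code from this PR: `SketchTracker` and `SketchBackend` are hypothetical, simplified stand-ins for a process-wide tracker and a backend, used only to show why a singleton map keeps objects alive and how clearing it makes them collectable.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a process-wide (singleton) object tracker.
object SketchTracker {
  // Strong references held here keep the tracked objects alive for the
  // lifetime of the JVM unless they are explicitly removed.
  private val objects = mutable.HashMap.empty[String, AnyRef]
  private var counter = 0L

  // Registers an object and returns the id used to look it up later.
  def put(obj: AnyRef): String = synchronized {
    counter += 1
    val id = counter.toString
    objects(id) = obj
    id
  }

  def get(id: String): Option[AnyRef] = synchronized { objects.get(id) }

  def size: Int = synchronized { objects.size }

  // Drops every tracked reference so the objects become eligible for GC.
  def clear(): Unit = synchronized { objects.clear() }
}

// Hypothetical backend whose shutdown releases the tracked references.
class SketchBackend {
  def close(): Unit = {
    // (a real backend would also stop its network resources here)
    SketchTracker.clear()
  }
}
```

Without the `clear()` call in `close()`, every run on the same driver would keep adding entries to the map, which is the leak pattern described above.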
Point to discuss:
My first attempt was to make `JVMObjectTracker` non-static, which could provide better protection from leaks in the future. But making it non-static requires modifying multiple files; moreover, it looks like it will be pretty hard to make `DotnetForeachBatchHelper` work with a non-static version of `JVMObjectTracker`.
Given all of the above, I've decided to implement the simplest possible solution, but I absolutely do not mind discussing other approaches.
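For discussion purposes, the non-static alternative could look roughly like the sketch below. The names `InstanceTracker` and `IsolatedBackend` are hypothetical and do not correspond to classes in this repository; the sketch only shows that when each backend owns its tracker, shutting down one backend cannot clear objects that belong to another.

```scala
import scala.collection.mutable

// Hypothetical instance-scoped tracker: each backend owns its own map.
class InstanceTracker {
  private val objects = mutable.HashMap.empty[String, AnyRef]
  private var counter = 0L

  // Registers an object and returns the id used to look it up later.
  def put(obj: AnyRef): String = synchronized {
    counter += 1
    val id = counter.toString
    objects(id) = obj
    id
  }

  def get(id: String): Option[AnyRef] = synchronized { objects.get(id) }

  // Releases only the references held by this tracker instance.
  def clear(): Unit = synchronized { objects.clear() }
}

// Hypothetical backend that creates and clears only its own tracker.
class IsolatedBackend {
  val tracker = new InstanceTracker
  def close(): Unit = tracker.clear()
}
```

The trade-off is the one mentioned above: every component that needs to resolve object ids must be handed the right tracker instance, which is what makes helpers relying on static access harder to adapt.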