Skip to content

Try harder to free memory for TString.testNoLeaks() #281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2021

Conversation

saudet
Copy link
Contributor

@saudet saudet commented Apr 8, 2021

This makes the test run much more slowly, but it still takes less than a minute, and it's not flaky for me anymore.

@Craigacp
Copy link
Collaborator

Craigacp commented Apr 8, 2021

Does this test work on Java 11? Or on G1GC, ZGC or Shenandoah?

@saudet
Copy link
Contributor Author

saudet commented Apr 8, 2021

I don't know, I don't care. If you care that much, please try to get Google to cooperate by getting them to give us their CppMemoryChecker that I mentioned at #266 (comment).

Go Oracle! Do something useful :)

@Craigacp
Copy link
Collaborator

Craigacp commented Apr 8, 2021

Given our CI failed because Github Actions silently upgraded their default JVM from Java 8 to Java 11, I don't want more tests that flake out or are sensitive to JVM configuration settings that people frequently set. Plus the default GC algorithm does change from time to time, it changed to G1GC in 9, and G1GC has seen significant improvements in newer releases.

@cowwoc
Copy link

cowwoc commented Apr 8, 2021

A counter-proposal, if I may: https://stackoverflow.com/a/868921/14731

Instead of Thread.sleep() or System.gc() which are slow and/or non-deterministic, you could:

  • Run a for-loop that allocates instances without closing them.
  • Count the number of iterations needed to trigger OutOfMemoryError.
  • Run a for-loop with double the number of iterations but this time allocate and close instances on every iteration. Check that OutOfMemoryError is not thrown.

If JUnit does not allow tests to throw OutOfMemoryErrors without blowing up the JVM then you could skip the first two steps and instead hard-code a "large enough" iteration constant. You can probably also specify a -Xmx parameter to be used when running unit tests.

@karllessard
Copy link
Collaborator

I think @cowwoc suggestion is worth trying, I would also prefer a more deterministic approach to run that test. My suspicions is that an OOM will blow up the whole Maven surefire process and we might have trouble to fine the right -Xmx to run all tests including this one but we'll see, I'll make some tests and let you know.

Copy link
Collaborator

@karllessard karllessard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm merging this now, at least for the sake of restoring a successful CI build, but we might replace it with a deterministic approach if that works out (and while being there, extend the scope of memory leak detection beyond the string tensors)

@karllessard karllessard merged commit d570edb into tensorflow:master Apr 8, 2021
@saudet
Copy link
Contributor Author

saudet commented Apr 9, 2021

@cowwoc What we're trying to detect is not memory leaks in the Java heap, but from off-heap allocations in TensorFlow. The JVM is not going to go OOM when the latter happens.

@cowwoc
Copy link

cowwoc commented Apr 9, 2021

@saudet I see, but wouldn't you get the native equivalent of OOM? At best, you should get some sort of exception. At worst, you should get a core dump. Either way, the test should fail if memory gets leaked. Am I wrong?

@saudet
Copy link
Contributor Author

saudet commented Apr 9, 2021

We'd have to exhaust the native heap, and that depends on the settings of the operating system, so we'd need to set those somehow for each platform, and then figure out how to hack that with surefire, for each platform, but then that doesn't work too well when you start having different loads, like large models for integration tests, but sure, anything's possible. Anyway, the point is Google has already done all the work for that internally. If we had an actual budget for this, we could of course reinvent the wheel and everything, but personally, I have more interesting things to do. If you're interested to work on that in your free time though, please go ahead!

@cowwoc
Copy link

cowwoc commented Apr 9, 2021

@saudet I was only trying to help. Okay, I'll leave it up to you to track down CppMemoryChecker.

To your point, I'd rather not dig into my free time unless I have to. I'd much rather spend that time with my family :)

Be well.

@saudet
Copy link
Contributor Author

saudet commented Apr 9, 2021

Thanks @cowwoc. BTW, I do appreciate the concerns expressed by @Craigacp, I just think it's something that should be thought about more carefully if we want to do it right. If this unit test is giving us problems though, I would just @Ignore it by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants