cmd/compile: automatically stack-allocate small non-escaping slices of dynamic size #27625
Comments
I think it's almost always a win to start on the stack, if we can. This example is tricky, and probably very common:
How do we preallocate some space for
But that isn't right if |
I wonder if always applying this transformation, even if it never hurt performance, would make binaries noticeably bigger. Would this be done for all allocations of small non-escaping slices? Or only for those where the capacity is known at compile time to be small?
We already allocate small non-escaping slices on the stack if their capacity is known at compile time.
Is this #20533?
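To make the distinction concrete, here is a minimal sketch of the two cases. The function names `sumConst` and `sumDynamic` are hypothetical and only for illustration. Building with `go build -gcflags=-m` prints the compiler's escape-analysis decisions, though the exact diagnostics and allocation behavior depend on the Go version.

```go
package main

import "fmt"

// sumConst builds its scratch slice with a constant capacity. Because the
// capacity is known at compile time and the slice does not escape, the
// compiler can place the backing array on the stack.
func sumConst(n int) int {
	buf := make([]byte, 0, 64) // capacity is a compile-time constant
	for i := 0; i < n && i < 64; i++ {
		buf = append(buf, byte(i))
	}
	total := 0
	for _, b := range buf {
		total += int(b)
	}
	return total
}

// sumDynamic sizes its scratch slice from a run-time value. This is the
// case discussed in this issue: the slice does not escape, but its size
// is not known at compile time. Assumes n >= 0.
func sumDynamic(n int) int {
	buf := make([]byte, n) // size only known at run time
	for i := range buf {
		buf[i] = byte(i)
	}
	total := 0
	for _, b := range buf {
		total += int(b)
	}
	return total
}

func main() {
	fmt.Println(sumConst(10), sumDynamic(10))
}
```

Both functions compute the same result; they differ only in where the backing array may live.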
It's definitely similar. #20533 is going more down the road of really allocating |
Transforming

    var a []byte
    if n < 64 {
        a = make([]byte, n, 64) // stack allocation
    } else {
        a = make([]byte, n) // heap allocation
    }

can surely have some code size impact. In some cases, maybe prove is able to remove one of the two branches, but I'm not holding my breath on that. I wonder if it's still worth it, performance-wise. We should also explore doing this for slices of different types (while keeping total stack allocation within a certain limit).
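For comparison, the same idea can be written by hand today. The following is a minimal sketch of the manual version, not the compiler's implementation: the names `smallCap` and `checksum` are hypothetical, and the threshold of 64 is simply the value used in this comment. The fixed-size array serves as the backing store for the small case, and it stays on the stack only as long as escape analysis proves that the slice does not escape.

```go
package main

import "fmt"

// smallCap is a hypothetical threshold, taken from the value 64 used in
// the discussion. It is an assumption, not a compiler constant.
const smallCap = 64

// checksum fills a scratch slice of length n and returns the sum of its
// bytes. The slice never leaves the function. Assumes n >= 0.
func checksum(n int) int {
	var stackBuf [smallCap]byte // fixed-size array, eligible for the stack
	var a []byte
	if n <= smallCap {
		a = stackBuf[:n] // small case: reuse the array as backing store
	} else {
		a = make([]byte, n) // large case: ordinary allocation
	}
	for i := range a {
		a[i] = byte(i % 251)
	}
	sum := 0
	for _, b := range a {
		sum += int(b)
	}
	return sum
}

func main() {
	fmt.Println(checksum(10), checksum(300))
}
```

The cost mentioned above is visible here: both branches are present in the compiled code, and every call reserves `smallCap` bytes of stack even when the large branch is taken.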
This commit: 95a11c7 shows a real-world performance gain triggered by moving a small non-escaping slice to the stack. It is my understanding that the Go compiler always allocated the slice on the heap because the length was not known at compile time.
Would it make sense to attempt a similar code transformation for many or all non-escaping slices? What would be the downsides? Any suggestions on how to identify which slices could benefit from this transformation and which would just add overhead?