Skip to content

runtime: Simplify slice growing/appending code #4287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 31, 2024

Conversation

lpereira
Copy link
Contributor

@lpereira lpereira commented Jun 7, 2024

Refactor the slice appending function to rely on the slice growing function, and remove branches/loops to use a branchfree variant.

CC @dgryski

@lpereira lpereira force-pushed the slice-mild-over-allocation branch from 90f998f to ec37a27 Compare June 11, 2024 21:55
@dgryski
Copy link
Member

dgryski commented Jun 26, 2024

Some quick testing shows this makes the multiplication factor 1.125. However the expression doesn't seem to work for newCap <= 8 so I think we need to make sure the edge cases are handled properly.

@dgryski
Copy link
Member

dgryski commented Jul 22, 2024

I'm not sure this actually saves us anything. I threw together a quick simulation and it seems that the new algorithm requires more calls to "grow" (since we're not growing as much, we need to do it more frequently, requiring more CPU) and ends up generating more garbage, because the additional calls to grow means we have to throw more stuff away.

You can play with my sample code here: https://go.dev/play/p/QcHiMqnGlym

@lpereira
Copy link
Contributor Author

lpereira commented Jul 22, 2024 via email

@lpereira lpereira force-pushed the slice-mild-over-allocation branch from ec37a27 to fca77ff Compare July 22, 2024 20:04
@lpereira lpereira changed the title runtime: Mildly over-allocate when growing slices runtime: Simplify slice growing/appending code Jul 22, 2024
@lpereira lpereira force-pushed the slice-mild-over-allocation branch from fca77ff to a4b6cfe Compare July 22, 2024 20:07
@lpereira
Copy link
Contributor Author

Alright, @dgryski, pushed a new version, rebased on the current dev branch, that goes back to the "align to the next power of two" code, but using a branchfree version instead.

@lpereira lpereira force-pushed the slice-mild-over-allocation branch 3 times, most recently from ec47ad5 to 06eb710 Compare July 23, 2024 19:43
@lpereira
Copy link
Contributor Author

The failure inside the reflect test is very weird.

@lpereira lpereira force-pushed the slice-mild-over-allocation branch from 06eb710 to 4e2bdb8 Compare July 23, 2024 22:28
@cee-dub
Copy link

cee-dub commented Jul 25, 2024

The Index field of "want" for the reflect field F looks incorrect, it is field 1, but want is 0,0.

Copy link
Contributor

@ldemailly ldemailly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. nice to share the code in grow. just some (non merge blocking) questions


buf := alloc(oldCap*elemSize, nil)
buf := alloc(newCap*elemSize, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this have some overflow check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should. Even calculating the new newCap can overflow. However, handling overflow conditions doesn't seem to be something that's done pervasively in TinyGo, so I didn't do it here so this is closer to the rest of the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is very unlikely to overflow in practice, but yes it would be a good idea to have some sort of check for this.
For 32-bit and 64-bit systems it would be possible to use math/bits for this: https://pkg.go.dev/math/bits#Add32
But maybe there are universal expressions for detecting such an overflow condition efficiently.

@lpereira
Copy link
Contributor Author

The Index field of "want" for the reflect field F looks incorrect, it is field 1, but want is 0,0.

Right, and I have no idea why! I've spent some time in the reflect code, and @dgryski too, but we don't yet know what's happening here. Our best guesses is that there's either memory corruption somewhere, or that, although the code seems correct, its behavior is different from before, and the reflect code is probably depending on that somehow.

@cee-dub
Copy link

cee-dub commented Jul 25, 2024

Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.

@lpereira
Copy link
Contributor Author

Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.

If I understood the test correctly, it seems correct. The index is a slice of indices, starting with the current type, and following any nested type. So an index of []int{0} means the field is the first in that struct; and an index of []int{0, 1} means it's the second of the first of that struct.

The test defines two structs: Rec1 and Rec2; Rec1 has a pointer to Rec2, and Rec2 has a F string and a pointer to Rec1. To find F from Rec1, you need to go through the first element, and then find the first element ([]int{0, 0}), which seems to agree with the test.

I might be mistaken about how this works, though.

@cee-dub
Copy link

cee-dub commented Jul 25, 2024

Reading more of #tinygo-dev, feel free to ignore me. I didn't look closely enough at the Index type in the test.

@lpereira
Copy link
Contributor Author

So I copied the failing test to its own file, and reflect.VisibleFields() gives the right output, agreeing with Big Go. So the problem seems to be in the actual test driver somewhere.

@lpereira
Copy link
Contributor Author

Fixed the reflect test; let's see if CI likes it.

@lpereira
Copy link
Contributor Author

Alright, archive/zip failing is probably the reflect.DeepEqual() calls inside the tests, most likely some unbounded recursion or something of the sort.

@dgryski
Copy link
Member

dgryski commented Jul 26, 2024

@lpereira Ok, so it does seem like the bug is somewhere in reflect then. I'm going to try to hunt down the bug looking at that function.

@lpereira lpereira force-pushed the slice-mild-over-allocation branch from e0e9a35 to a3c5eb5 Compare July 26, 2024 18:10
@dgryski
Copy link
Member

dgryski commented Jul 26, 2024

This fixes the reflect test:

diff --git a/src/reflect/type.go b/src/reflect/type.go
index 1356f67c..32bf68d7 100644
--- a/src/reflect/type.go
+++ b/src/reflect/type.go
@@ -774,7 +774,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [
                                if match(name) {
                                        found = append(found, result{
                                                rawStructFieldFromPointer(descriptor, field.fieldType, data, flagsByte, name, offset),
-                                               append(ll.index, int(i)),
+                                               append(ll.index[:len(ll.index):len(ll.index)], int(i)),
                                        })
                                }

@@ -787,7 +787,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [

                                        nextlevel = append(nextlevel, fieldWalker{
                                                t:     embedded,
-                                               index: append(ll.index, int(i)),
+                                               index: append(ll.index[:len(ll.index):len(ll.index)], int(i)),
                                        })
                                }

I'll get this in as another PR and you can rebase against that with only the slice allocation changes.

@dgryski
Copy link
Member

dgryski commented Jul 26, 2024

#4367

@lpereira
Copy link
Contributor Author

I rebased this on top of #4387 (please merge it first!), and removed the commit changing reflect. CI is still failing with some error I was unable to track down.

@dgryski
Copy link
Member

dgryski commented Jul 27, 2024

I'll look at the archive/zip failure this evening.

@dgryski
Copy link
Member

dgryski commented Jul 27, 2024

The archive/zip failure seems to be related to the stepped doubling logic function.

@dgryski
Copy link
Member

dgryski commented Jul 27, 2024

Nope; failure seems intermittent.

@dgryski
Copy link
Member

dgryski commented Jul 31, 2024

This is due to memory fragmentation which (annoyingly) has different behaviour with the new growth algorithm. A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.

dgryski and others added 5 commits July 31, 2024 11:10
Refactor the slice appending function to rely on the slice growing
function, and remove branches/loops to use a branchfree variant.

Signed-off-by: L. Pereira <[email protected]>
Both branches were equivalent, so guard the overall logic in
sliceAppend() with the more general condition.

Signed-off-by: L. Pereira <[email protected]>
Use `bits.Len()` rather than `32 - bits.LeadingZeros32()`.  They're
equivalent, but the Len version is a bit easier to read.

Signed-off-by: L. Pereira <[email protected]>
sliceGrow() will return the old slice if its capacity is large enough.

Signed-off-by: L. Pereira <[email protected]>
@lpereira lpereira force-pushed the slice-mild-over-allocation branch from 38095eb to 61c1425 Compare July 31, 2024 18:10
@lpereira
Copy link
Contributor Author

A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.

Agreed. I've rebased on the current dev branch and changed the code to have only one overallocation strategy until we have a better allocator. This won't reduce the amount of memory used by slices, but it simplifies the code quite a bit, which is already a gain IMHO.

@ldemailly
Copy link
Contributor

fragmentation because of not taking headers/meta into account was my point in #4287 (comment)

@lpereira
Copy link
Contributor Author

sizediff failure is unrelated.

Copy link
Member

@dgryski dgryski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dgryski
Copy link
Member

dgryski commented Jul 31, 2024

Merging. Will close my PR that has the reflect fix.

@dgryski dgryski merged commit 417a26d into tinygo-org:dev Jul 31, 2024
16 of 17 checks passed
append(ll.index, int(i)),
append(ll.index[:len(ll.index):len(ll.index)], int(i)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will force a new allocation on every append, is that intentional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This fixes a test failure in reflect that was triggered by the new slice allocation code which made this sometimes be a copy and sometimes not. (The old growth code made this always a copy in the test, but the new growth code expanded it more and so sometimes reused the old slice, which then had entries in the slice overwritten by later loop iterations, causing the faiure.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants