Skip to content

Conversation

@pgavlin
Copy link
Contributor

@pgavlin pgavlin commented Mar 25, 2021

Under certain circumstances, the existing rules for bit operations can
produce code that writes beyond its intended bounds. For example,
consider the following code:

func repro(b []byte, addr, bit int32) {
    _ = b[3]
    v := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24 | 1<<(bit&31)
    b[0] = byte(v)
    b[1] = byte(v >> 8)
    b[2] = byte(v >> 16)
    b[3] = byte(v >> 24)
}

Roughly speaking:

  1. The expression 1 << (bit & 31) is rewritten into (SHLL 1 bit)
  2. The expression uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24 is rewritten into (MOVLload &b[0])
  3. The statements b[0] = byte(v) ... b[3] = byte(v >> 24) are
    rewritten into (MOVLstore &b[0], v)
  4. (ORL (SHLL 1, bit) (MOVLload &b[0])) is rewritten into
    (BTSL (MOVLload &b[0]) bit). This is a valid transformation because
    the destination is a register: in this case, the bit offset is masked
    by the number of bits in the destination register. This is identical
    to the masking performed by SHL.
  5. (MOVLstore &b[0] (BTSL (MOVLload &b[0]) bit)) is rewritten into
    (BTSLmodify &b[0] bit). This is an invalid transformation because
    the destination is memory: in this case, the bit offset is not
    masked, and the chosen instruction may write outside its intended
    32-bit location.

These changes fix the invalid rewrite performed in step (5) by
explicitly maksing the bit offset operand to BT(S|R|C)(L|Q)modify. In
the example above, the adjusted rules produce
(BTSLmodify &b[0] (ANDLconst [31] bit)) in step (5).

These changes also add several new rules to rewrite bit sets, toggles,
and clears that are rooted at (OR|XOR|AND)(L|Q)modify operators into
appropriate BT(S|R|C)(L|Q)modify operators. These rules catch cases
where MOV(L|Q)store ((OR|XOR|AND)(L|Q) ...) is rewritten to
(OR|XOR|AND)(L|Q)modify before the (OR|XOR|AND)(L|Q) ... can be
rewritten to BT(S|R|C)(L|Q) ....

Overall, compilecmp reports small improvements in code size on
darwin/amd64 when the changes to the compiler itself are exlcuded:

file before after Δ %
runtime.s 536464 536412 -52 -0.010%
bytes.s 32629 32593 -36 -0.110%
strings.s 44565 44529 -36 -0.081%
os/signal.s 7967 7959 -8 -0.100%
cmd/vendor/golang.org/x/sys/unix.s 81686 81678 -8 -0.010%
math/big.s 188235 188253 +18 +0.010%
cmd/link/internal/loader.s 89295 89056 -239 -0.268%
cmd/link/internal/ld.s 633551 633232 -319 -0.050%
cmd/link/internal/arm.s 18934 18928 -6 -0.032%
cmd/link/internal/arm64.s 31814 31801 -13 -0.041%
cmd/link/internal/riscv64.s 7347 7345 -2 -0.027%
cmd/compile/internal/ssa.s 4029173 4033066 +3893 +0.097%
total 21298280 21301472 +3192 +0.015%

@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no Used by googlebot to label PRs as having an invalid CLA. The text of this label should not change. label Mar 25, 2021
@google-cla google-cla bot added cla: yes Used by googlebot to label PRs as having a valid CLA. The text of this label should not change. and removed cla: no Used by googlebot to label PRs as having an invalid CLA. The text of this label should not change. labels Mar 25, 2021
@pgavlin
Copy link
Contributor Author

pgavlin commented Mar 25, 2021

@googlebot I fixed it.

@gopherbot
Copy link
Contributor

This PR (HEAD: 0198052) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@gopherbot
Copy link
Contributor

Message from Go Bot:

Patch Set 1:

Congratulations on opening your first change. Thank you for your contribution!

Next steps:
A maintainer will review your change and provide feedback. See
https://golang.org/doc/contribute.html#review for more info and tips to get your
patch through code review.

Most changes in the Go project go through a few rounds of revision. This can be
surprising to people new to the project. The careful, iterative review process
is our way of helping mentor contributors and ensuring that their contributions
have a lasting impact.

During May-July and Nov-Jan the Go project is in a code freeze, during which
little code gets reviewed or merged. If a reviewer responds with a comment like
R=go1.11 or adds a tag like "wait-release", it means that this CL will be
reviewed as part of the next development cycle. See https://golang.org/s/release
for more details.


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 1:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

This PR (HEAD: 42e795b) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 1:

(2 comments)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 2: Run-TryBot+1 Code-Review+2

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go Bot:

Patch Set 2:

TryBots beginning. Status page: https://farmer.golang.org/try?commit=b6ed9729


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go Bot:

Patch Set 2: TryBot-Result+1

TryBots are happy.


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 2:

(2 comments)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

This PR (HEAD: f12a621) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 3: Code-Review+2

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

This PR (HEAD: 7e92885) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 5:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Josh Bleecher Snyder:

Patch Set 5: Trust+1

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 5: Code-Review+2

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 5:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

This PR (HEAD: ab94ede) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@randall77
Copy link
Contributor

@gopherbot please open backport issues.

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 6: Code-Review+2

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 6:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

Under certain circumstances, the existing rules for bit operations can
produce code that writes beyond its intended bounds. For example,
consider the following code:

    func repro(b []byte, addr, bit int32) {
	    _ = b[3]
	    v := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24 | 1<<(bit&31)
	    b[0] = byte(v)
	    b[1] = byte(v >> 8)
	    b[2] = byte(v >> 16)
	    b[3] = byte(v >> 24)
    }

Roughly speaking:

1. The expression `1 << (bit & 31)` is rewritten into `(SHLL 1 bit)`
2. The expression `uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 |
   uint32(b[3])<<24` is rewritten into `(MOVLload &b[0])`
3. The statements `b[0] = byte(v) ... b[3] = byte(v >> 24)` are
   rewritten into `(MOVLstore &b[0], v)`
4. `(ORL (SHLL 1, bit) (MOVLload &b[0]))` is rewritten into
   `(BTSL (MOVLload &b[0]) bit)`. This is a valid transformation because
   the destination is a register: in this case, the bit offset is masked
   by the number of bits in the destination register. This is identical
   to the masking performed by `SHL`.
5. `(MOVLstore &b[0] (BTSL (MOVLload &b[0]) bit))` is rewritten into
   `(BTSLmodify &b[0] bit)`. This is an invalid transformation because
   the destination is memory: in this case, the bit offset is not
   masked, and the chosen instruction may write outside its intended
   32-bit location.

These changes fix the invalid rewrite performed in step (5) by
explicitly maksing the bit offset operand to `BT(S|R|C)(L|Q)modify`. In
the example above, the adjusted rules produce
`(BTSLmodify &b[0] (ANDLconst [31] bit))` in step (5).

These changes also add several new rules to rewrite bit sets, toggles,
and clears that are rooted at `(OR|XOR|AND)(L|Q)modify` operators into
appropriate `BT(S|R|C)(L|Q)modify` operators. These rules catch cases
where `MOV(L|Q)store ((OR|XOR|AND)(L|Q) ...)` is rewritten to
`(OR|XOR|AND)(L|Q)modify` before the `(OR|XOR|AND)(L|Q) ...` can be
rewritten to `BT(S|R|C)(L|Q) ...`.

Overall, compilecmp reports small improvements in code size on
darwin/amd64 when the changes to the compiler itself are exlcuded:

file                               before   after    Δ       %
runtime.s                          536464   536412   -52     -0.010%
bytes.s                            32629    32593    -36     -0.110%
strings.s                          44565    44529    -36     -0.081%
os/signal.s                        7967     7959     -8      -0.100%
cmd/vendor/golang.org/x/sys/unix.s 81686    81678    -8      -0.010%
math/big.s                         188235   188253   +18     +0.010%
cmd/link/internal/loader.s         89295    89056    -239    -0.268%
cmd/link/internal/ld.s             633551   633232   -319    -0.050%
cmd/link/internal/arm.s            18934    18928    -6      -0.032%
cmd/link/internal/arm64.s          31814    31801    -13     -0.041%
cmd/link/internal/riscv64.s        7347     7345     -2      -0.027%
cmd/compile/internal/ssa.s         4029173  4033066  +3893   +0.097%
total                              21298280 21301472 +3192   +0.015%
@gopherbot
Copy link
Contributor

This PR (HEAD: 9a42bd2) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/304869 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 7:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Keith Randall:

Patch Set 8: Code-Review+2

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Pat Gavlin:

Patch Set 8:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/304869.
After addressing review feedback, remember to publish your drafts!

gopherbot pushed a commit that referenced this pull request Mar 26, 2021
Under certain circumstances, the existing rules for bit operations can
produce code that writes beyond its intended bounds. For example,
consider the following code:

    func repro(b []byte, addr, bit int32) {
	    _ = b[3]
	    v := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24 | 1<<(bit&31)
	    b[0] = byte(v)
	    b[1] = byte(v >> 8)
	    b[2] = byte(v >> 16)
	    b[3] = byte(v >> 24)
    }

Roughly speaking:

1. The expression `1 << (bit & 31)` is rewritten into `(SHLL 1 bit)`
2. The expression `uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 |
   uint32(b[3])<<24` is rewritten into `(MOVLload &b[0])`
3. The statements `b[0] = byte(v) ... b[3] = byte(v >> 24)` are
   rewritten into `(MOVLstore &b[0], v)`
4. `(ORL (SHLL 1, bit) (MOVLload &b[0]))` is rewritten into
   `(BTSL (MOVLload &b[0]) bit)`. This is a valid transformation because
   the destination is a register: in this case, the bit offset is masked
   by the number of bits in the destination register. This is identical
   to the masking performed by `SHL`.
5. `(MOVLstore &b[0] (BTSL (MOVLload &b[0]) bit))` is rewritten into
   `(BTSLmodify &b[0] bit)`. This is an invalid transformation because
   the destination is memory: in this case, the bit offset is not
   masked, and the chosen instruction may write outside its intended
   32-bit location.

These changes fix the invalid rewrite performed in step (5) by
explicitly maksing the bit offset operand to `BT(S|R|C)(L|Q)modify`. In
the example above, the adjusted rules produce
`(BTSLmodify &b[0] (ANDLconst [31] bit))` in step (5).

These changes also add several new rules to rewrite bit sets, toggles,
and clears that are rooted at `(OR|XOR|AND)(L|Q)modify` operators into
appropriate `BT(S|R|C)(L|Q)modify` operators. These rules catch cases
where `MOV(L|Q)store ((OR|XOR|AND)(L|Q) ...)` is rewritten to
`(OR|XOR|AND)(L|Q)modify` before the `(OR|XOR|AND)(L|Q) ...` can be
rewritten to `BT(S|R|C)(L|Q) ...`.

Overall, compilecmp reports small improvements in code size on
darwin/amd64 when the changes to the compiler itself are exlcuded:

file                               before   after    Δ       %
runtime.s                          536464   536412   -52     -0.010%
bytes.s                            32629    32593    -36     -0.110%
strings.s                          44565    44529    -36     -0.081%
os/signal.s                        7967     7959     -8      -0.100%
cmd/vendor/golang.org/x/sys/unix.s 81686    81678    -8      -0.010%
math/big.s                         188235   188253   +18     +0.010%
cmd/link/internal/loader.s         89295    89056    -239    -0.268%
cmd/link/internal/ld.s             633551   633232   -319    -0.050%
cmd/link/internal/arm.s            18934    18928    -6      -0.032%
cmd/link/internal/arm64.s          31814    31801    -13     -0.041%
cmd/link/internal/riscv64.s        7347     7345     -2      -0.027%
cmd/compile/internal/ssa.s         4029173  4033066  +3893   +0.097%
total                              21298280 21301472 +3192   +0.015%

Change-Id: I2e560548b515865129e1724e150e30540e9d29ce
GitHub-Last-Rev: 9a42bd2
GitHub-Pull-Request: #45242
Reviewed-on: https://go-review.googlesource.com/c/go/+/304869
Reviewed-by: Keith Randall <[email protected]>
Trust: Josh Bleecher Snyder <[email protected]>
@gopherbot
Copy link
Contributor

This PR is being closed because golang.org/cl/304869 has been merged.

@gopherbot gopherbot closed this Mar 26, 2021
gopherbot pushed a commit that referenced this pull request Mar 31, 2021
…MD64

Under certain circumstances, the existing rules for bit operations can
produce code that writes beyond its intended bounds. For example,
consider the following code:

    func repro(b []byte, addr, bit int32) {
	    _ = b[3]
	    v := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24 | 1<<(bit&31)
	    b[0] = byte(v)
	    b[1] = byte(v >> 8)
	    b[2] = byte(v >> 16)
	    b[3] = byte(v >> 24)
    }

Roughly speaking:

1. The expression `1 << (bit & 31)` is rewritten into `(SHLL 1 bit)`
2. The expression `uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 |
   uint32(b[3])<<24` is rewritten into `(MOVLload &b[0])`
3. The statements `b[0] = byte(v) ... b[3] = byte(v >> 24)` are
   rewritten into `(MOVLstore &b[0], v)`
4. `(ORL (SHLL 1, bit) (MOVLload &b[0]))` is rewritten into
   `(BTSL (MOVLload &b[0]) bit)`. This is a valid transformation because
   the destination is a register: in this case, the bit offset is masked
   by the number of bits in the destination register. This is identical
   to the masking performed by `SHL`.
5. `(MOVLstore &b[0] (BTSL (MOVLload &b[0]) bit))` is rewritten into
   `(BTSLmodify &b[0] bit)`. This is an invalid transformation because
   the destination is memory: in this case, the bit offset is not
   masked, and the chosen instruction may write outside its intended
   32-bit location.

These changes fix the invalid rewrite performed in step (5) by
explicitly maksing the bit offset operand to `BT(S|R|C)(L|Q)modify`. In
the example above, the adjusted rules produce
`(BTSLmodify &b[0] (ANDLconst [31] bit))` in step (5).

These changes also add several new rules to rewrite bit sets, toggles,
and clears that are rooted at `(OR|XOR|AND)(L|Q)modify` operators into
appropriate `BT(S|R|C)(L|Q)modify` operators. These rules catch cases
where `MOV(L|Q)store ((OR|XOR|AND)(L|Q) ...)` is rewritten to
`(OR|XOR|AND)(L|Q)modify` before the `(OR|XOR|AND)(L|Q) ...` can be
rewritten to `BT(S|R|C)(L|Q) ...`.

Overall, compilecmp reports small improvements in code size on
darwin/amd64 when the changes to the compiler itself are exlcuded:

file                               before   after    Δ       %
runtime.s                          536464   536412   -52     -0.010%
bytes.s                            32629    32593    -36     -0.110%
strings.s                          44565    44529    -36     -0.081%
os/signal.s                        7967     7959     -8      -0.100%
cmd/vendor/golang.org/x/sys/unix.s 81686    81678    -8      -0.010%
math/big.s                         188235   188253   +18     +0.010%
cmd/link/internal/loader.s         89295    89056    -239    -0.268%
cmd/link/internal/ld.s             633551   633232   -319    -0.050%
cmd/link/internal/arm.s            18934    18928    -6      -0.032%
cmd/link/internal/arm64.s          31814    31801    -13     -0.041%
cmd/link/internal/riscv64.s        7347     7345     -2      -0.027%
cmd/compile/internal/ssa.s         4029173  4033066  +3893   +0.097%
total                              21298280 21301472 +3192   +0.015%

Fixes #45253

Change-Id: I2e560548b515865129e1724e150e30540e9d29ce
GitHub-Last-Rev: ab94ede
GitHub-Pull-Request: #45242
Reviewed-on: https://go-review.googlesource.com/c/go/+/305069
Trust: Emmanuel Odeke <[email protected]>
Run-TryBot: Emmanuel Odeke <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
gopherbot pushed a commit that referenced this pull request May 8, 2021
These operations (BT{S,R,C}{Q,L}modify) are quite a bit slower than
other ways of doing the same thing.

Without the BTxmodify operations, there are two fallback ways the compiler
performs these operations: AND/OR/XOR operations directly on memory, or
load-BTx-write sequences. The compiler kinda chooses one arbitrarily
depending on rewrite rule application order. Currently, it uses
load-BTx-write for the Const benchmarks and AND/OR/XOR directly to memory
for the non-Const benchmarks. TBD, someone might investigate which of
the two fallback strategies is really better. For now, they are both
better than BTx ops.

name              old time/op  new time/op  delta
BitSet-8          1.09µs ± 2%  0.64µs ± 5%  -41.60%  (p=0.000 n=9+10)
BitClear-8        1.15µs ± 3%  0.68µs ± 6%  -41.00%  (p=0.000 n=10+10)
BitToggle-8       1.18µs ± 4%  0.73µs ± 2%  -38.36%  (p=0.000 n=10+8)
BitSetConst-8     37.0ns ± 7%  25.8ns ± 2%  -30.24%  (p=0.000 n=10+10)
BitClearConst-8   30.7ns ± 2%  25.0ns ±12%  -18.46%  (p=0.000 n=10+10)
BitToggleConst-8  36.9ns ± 1%  23.8ns ± 3%  -35.46%  (p=0.000 n=9+10)

Fixes #45790
Update #45242

Change-Id: Ie33a72dc139f261af82db15d446cd0855afb4e59
Reviewed-on: https://go-review.googlesource.com/c/go/+/318149
Trust: Keith Randall <[email protected]>
Run-TryBot: Keith Randall <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Ben Shi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Used by googlebot to label PRs as having a valid CLA. The text of this label should not change.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants