Closed
Description
What version of Go are you using (go version)?
1.16.6
Does this issue reproduce with the latest release?
I have not tried 1.17.x
What operating system and processor architecture are you using (go env)?
linux on amd64
What did you do?
I've been doing my own rune case folding (due to bugs in x/text/cases). I have this function:
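// MyCaseFold needs the "bytes" and "unicode/utf8" packages; foldMap is
// defined elsewhere (not shown).
func MyCaseFold(name []byte) []byte {
	var b bytes.Buffer
	b.Grow(len(name))
	for i := 0; i < len(name); {
		r, w := utf8.DecodeRune(name[i:])
		// A real decode error has width 0 or 1; an encoded U+FFFD has width 3.
		if r == utf8.RuneError && w < 2 {
			return name // invalid UTF-8: return the input unchanged
		}
		replacements := foldMap[r]
		if len(replacements) > 0 {
			for j := range replacements {
				b.WriteRune(replacements[j])
			}
		} else {
			b.WriteRune(r)
		}
		i += w
	}
	return b.Bytes()
}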
What did you expect to see?
I had expected that I could do the same with for _, r := range string(name). However, I happened to have this Unicode string (in hex): 43efbfbd. Its last three bytes are valid UTF-8 (and utf8.Valid agrees), and they decode to U+FFFD, which is the same value as utf8.RuneError. I had to add the w < 2 check, which I arrived at after reading the DecodeRune source, where I saw that RuneError is only returned on the error paths with a byte width of 0 or 1.
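To make the width distinction concrete, here is a minimal sketch comparing the string from this report with a deliberately invalid one. The helper names isDecodeError and lastRune are made up for this illustration and are not part of unicode/utf8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isDecodeError (hypothetical helper) reports whether the first rune in p
// failed to decode. A genuinely encoded U+FFFD is three bytes wide, so the
// width separates it from the error case (width 0 or 1).
func isDecodeError(p []byte) bool {
	r, w := utf8.DecodeRune(p)
	return r == utf8.RuneError && w < 2
}

// lastRune (hypothetical helper) returns the final rune produced by ranging
// over s, looking only at the rune value as a plain range loop would.
func lastRune(s string) rune {
	var last rune
	for _, r := range s {
		last = r
	}
	return last
}

func main() {
	valid := []byte("C\xef\xbf\xbd") // hex 43efbfbd: 'C' then an encoded U+FFFD
	invalid := []byte("C\xff")       // 0xff never appears in valid UTF-8

	fmt.Println(utf8.Valid(valid), utf8.Valid(invalid))

	// DecodeRune returns the same rune for both tails; only the width differs.
	fmt.Println(isDecodeError(valid[1:]), isDecodeError(invalid[1:]))

	// The rune value alone cannot tell the two inputs apart.
	fmt.Println(lastRune(string(valid)) == lastRune(string(invalid)))
}
```

Both inputs end in a rune equal to utf8.RuneError, so a loop that inspects only the rune value sees no difference between them.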
utf8.RuneError should be a value that cannot be produced by any of the non-error code paths in DecodeRune.
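One possible way to keep the range loop, offered as a sketch and not as a fix for the issue itself, is to validate the input once with utf8.Valid and only then iterate. The foldMap below is a hypothetical stand-in (assumed here to be a map[rune][]rune); the real table is not shown in this report. The name foldWithRange is likewise made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// foldMap is a hypothetical stand-in for the table used in the report.
var foldMap = map[rune][]rune{
	'C': {'c'},
	'ß': {'s', 's'},
}

// foldWithRange (hypothetical name) checks validity up front, so every
// U+FFFD seen inside the loop must have been present in the input.
func foldWithRange(name []byte) []byte {
	if !utf8.Valid(name) {
		return name // same fallback as MyCaseFold: leave invalid input alone
	}
	var b bytes.Buffer
	b.Grow(len(name))
	for _, r := range string(name) {
		if repl := foldMap[r]; len(repl) > 0 {
			for _, fr := range repl {
				b.WriteRune(fr)
			}
			continue
		}
		b.WriteRune(r)
	}
	return b.Bytes()
}

func main() {
	fmt.Printf("%x\n", foldWithRange([]byte("C\xef\xbf\xbd"))) // 63efbfbd
	fmt.Printf("%x\n", foldWithRange([]byte("C\xff")))         // 43ff
}
```

This scans the input twice, so it trades some speed for a simpler loop; whether that is acceptable depends on the caller.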