Description
Character range matching is conceptually `(range_start..range_end).any(|c| c == input_char)`, but as an optimization it is implemented as `range_start <= input_char && input_char <= range_end`. This is fine.
Case-insensitive matching is implemented as `uppercase(c) == uppercase(input_char)`. This is fine (modulo #55).
So case-insensitive range matching is conceptually `(range_start..range_end).any(|c| uppercase(c) == uppercase(input_char))`. It is currently implemented as `uppercase(range_start) <= uppercase(input_char) && uppercase(input_char) <= uppercase(range_end)`, which is not equivalent.
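To make the mismatch concrete, here is a minimal sketch comparing the two formulations on an ASCII range. The names `upper`, `conceptual_match` and `bounds_only_match` are hypothetical and not taken from the library. `upper` approximates the `uppercase` function above by taking the first character that `char::to_uppercase` yields, and both bounds are treated as inclusive.

```rust
/// Hypothetical stand-in for `uppercase`: takes the first char of the
/// standard library's uppercase mapping (an assumption for this sketch).
fn upper(c: char) -> char {
    c.to_uppercase().next().unwrap_or(c)
}

/// Conceptual definition: some char in the (inclusive) range equals the
/// input once both are uppercased.
fn conceptual_match(start: char, end: char, input: char) -> bool {
    (start..=end).any(|c| upper(c) == upper(input))
}

/// Shortcut that uppercases only the bounds and the input.
fn bounds_only_match(start: char, end: char, input: char) -> bool {
    upper(start) <= upper(input) && upper(input) <= upper(end)
}

fn main() {
    // The range 'Z'..='a' contains '_', so the conceptual check accepts it.
    assert!(conceptual_match('Z', 'a', '_'));
    // Uppercasing the bounds gives 'Z'..'A', an empty interval, so the
    // shortcut rejects the same input.
    assert!(!bounds_only_match('Z', 'a', '_'));
    // For a range such as 'a'..='z' the two happen to agree.
    assert!(conceptual_match('a', 'z', 'Q'));
    assert!(bounds_only_match('a', 'z', 'Q'));
    println!("the two formulations disagree on [Z-a] with input '_'");
}
```

The shortcut works only when uppercasing maps the whole range onto the interval between its uppercased bounds, which does not hold in general.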
One of the tests currently passing is that `(?i)\p{Lu}+` matches `ΛΘΓΔα` entirely. That is, Greek letters (both uppercase and lowercase) all match the category of uppercase letters when matched case-insensitively. But the same test with `\p{Ll}` (category of lowercase letters) instead of `\p{Lu}` currently fails because of this issue. (`\p{Lu}` and `\p{Ll}` expand to large unions of character ranges.)
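The sketch below shows how a single lowercase Greek range can produce this failure. It assumes, purely for illustration, that one of the ranges in the `\p{Ll}` union is U+03AC..U+03CE (`ά` through `ώ`). The actual tables used by the library may split the ranges differently. The helper names are hypothetical, and `upper` again takes the first character yielded by `char::to_uppercase`.

```rust
/// Hypothetical stand-in for `uppercase` (first char of the std mapping).
fn upper(c: char) -> char {
    c.to_uppercase().next().unwrap_or(c)
}

/// Conceptual case-insensitive range match over an inclusive range.
fn conceptual_match(start: char, end: char, input: char) -> bool {
    (start..=end).any(|c| upper(c) == upper(input))
}

/// Shortcut that compares only the uppercased bounds.
fn bounds_only_match(start: char, end: char, input: char) -> bool {
    upper(start) <= upper(input) && upper(input) <= upper(end)
}

fn main() {
    // Assumed example range of lowercase Greek letters: 'ά'..='ώ'.
    let (start, end) = ('\u{03AC}', '\u{03CE}');

    // Uppercasing the bounds gives 'Ά' (U+0386) and 'Ώ' (U+038F), an
    // interval that excludes the plain capitals Α..Ω (U+0391..U+03A9).
    for input in "ΛΘΓΔα".chars() {
        // Each letter has a case variant inside the lowercase range...
        assert!(conceptual_match(start, end, input));
        // ...but its uppercase form lies outside the uppercased bounds.
        assert!(!bounds_only_match(start, end, input));
    }
    println!("bounds-only check rejects every letter of ΛΘΓΔα for this range");
}
```

Under this assumption, the accented endpoints uppercase to code points below the unaccented capitals, so the shortcut interval misses the very letters the range should cover.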