Remove [whitespace character] and use [unicode whitespace character] instead #343

@zudov

At the moment we have

  • whitespace character -- a space (U+0020), tab (U+0009), newline (U+000A), line tabulation (U+000B), form feed (U+000C), or carriage return (U+000D).
  • unicode whitespace character -- any code point in the Unicode Zs general category, or a tab (U+0009), carriage return (U+000D), newline (U+000A), or form feed (U+000C).

These two are very close to each other; is there a need to have both of them? As I understand it, the primary function of [whitespace character] is to exclude the various kinds of non-ASCII spaces. I looked through the places where [whitespace character] is used, but I didn't understand how such a space would do any harm there.
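
To make the comparison concrete, here is a minimal sketch of the two definitions quoted above as Python predicates. The function names are my own, not anything from the spec or an implementation, and I am assuming that Python's `unicodedata.category(ch) == "Zs"` matches what the spec means by the Zs class (the exact membership depends on the Unicode version bundled with the interpreter).

```python
import unicodedata

# Hypothetical helper names; the code point sets are copied from the two
# definitions quoted in this issue.
ASCII_WHITESPACE = {"\u0020", "\u0009", "\u000A", "\u000B", "\u000C", "\u000D"}
UNICODE_EXTRAS = {"\u0009", "\u000D", "\u000A", "\u000C"}


def is_whitespace_character(ch: str) -> bool:
    """[whitespace character]: space, tab, newline, line tabulation, form feed, CR."""
    return ch in ASCII_WHITESPACE


def is_unicode_whitespace_character(ch: str) -> bool:
    """[unicode whitespace character]: any Zs code point, or tab, CR, newline, form feed."""
    return unicodedata.category(ch) == "Zs" or ch in UNICODE_EXTRAS


def only_in_whitespace_character() -> list[str]:
    """Code points accepted by the first definition but not by the second."""
    return [
        f"U+{cp:04X}"
        for cp in range(0x110000)
        if is_whitespace_character(chr(cp))
        and not is_unicode_whitespace_character(chr(cp))
    ]
```

Under these assumptions, `is_unicode_whitespace_character("\u00a0")` is true while `is_whitespace_character("\u00a0")` is false, which is the kind of difference I am asking about. One more thing the sketch shows: line tabulation (U+000B) is accepted by [whitespace character] only, so the second definition is not a strict superset of the first, and a merge would have to decide what to do with it.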

If we don't actually need this distinction, I propose removing [whitespace character] and using [unicode whitespace character] in those places. Or, better, go further: drop the name [unicode whitespace character] and change the definition of [whitespace character] to the one that [unicode whitespace character] has at the moment.
