Correct and deduplicate docs on "printable" characters #82045

Closed
gnprice opened this issue Aug 15, 2019 · 2 comments
Labels
docs (Documentation in the Doc dir), topic-unicode

Comments

@gnprice
Contributor

gnprice commented Aug 15, 2019

BPO 37864
Nosy @vstinner, @ezio-melotti, @gnprice
PRs
  • bpo-37864: Correct and deduplicate "isprintable" docs; add test. #15300

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-08-15.04:40:26.339>
    labels = ['3.8', 'expert-unicode']
    title = 'Correct and deduplicate docs on "printable" characters'
    updated_at = <Date 2019-08-15.04:43:18.597>
    user = 'https://github.com/gnprice'

    bugs.python.org fields:

    activity = <Date 2019-08-15.04:43:18.597>
    actor = 'Greg Price'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Unicode']
    creation = <Date 2019-08-15.04:40:26.339>
    creator = 'Greg Price'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37864
    keywords = ['patch']
    message_count = 1.0
    messages = ['349792']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'ezio.melotti', 'Greg Price']
    pr_nums = ['15300']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue37864'
    versions = ['Python 3.8']

    @gnprice
    Contributor Author

    gnprice commented Aug 15, 2019

    While working on bpo-36502 and then bpo-18236, both about the definition and docs of str.isspace(), I also looked closely at its neighbor, str.isprintable().

    It turned out that we have the definition of what makes a character "printable" documented in three places, giving two different definitions.

    The definition in the comment on _PyUnicode_IsPrintable is inverted, so that's an easy small fix.

    With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.
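The claim about the category counts can be checked directly. The following is a minimal sketch, not part of the original report, that scans every code point with the standard `unicodedata` module and collects the "Other" (`C*`) and "Separator" (`Z*`) general categories that actually occur. The variable names are illustrative only.

```python
import sys
import unicodedata

# Collect every general category that occurs in the database,
# grouped by major class: "C" (Other) and "Z" (Separator).
seen = {"C": set(), "Z": set()}
for codepoint in range(sys.maxunicode + 1):
    category = unicodedata.category(chr(codepoint))
    if category[0] in seen:
        seen[category[0]].add(category)

other_categories = sorted(seen["C"])
separator_categories = sorted(seen["Z"])

print(other_categories)      # ['Cc', 'Cf', 'Cn', 'Co', 'Cs']
print(separator_categories)  # ['Zl', 'Zp', 'Zs']
```

Because these are the only categories in the two major classes, a definition that lists them one by one and a definition that says "Other or Separator" describe the same set of characters.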

    I've taken a crack at writing some improved docs text for a single definition, borrowing ideas from the C comment as well as the existing docs text; and then pointing there from the other places we'd had definitions. PR coming shortly.

    @ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel added the docs (Documentation in the Doc dir), 3.11 (only security fixes), 3.10 (only security fixes), and 3.12 (only security fixes) labels and removed the 3.8 (EOL, end of life) label Apr 4, 2023
    encukou pushed a commit that referenced this issue Feb 14, 2025
    …30118)
    
    We had the definition of what makes a character "printable" documented in three places, giving two different definitions.
    The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that.
    
    With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database.  That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.
    
    Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.
    
    Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.
    
    Also add a thorough test that the implementation agrees with this definition.
    
    Author:    Greg Price <[email protected]>
    
    Co-authored-by: Greg Price <[email protected]>
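For readers who want to see what "the implementation agrees with this definition" could look like, here is a hedged sketch of such a check. It is an illustration written for this thread, not the test from the commit, and `is_printable_by_definition` is a hypothetical helper name. It assumes the documented rule: a character is printable unless its general category is in the "Other" or "Separator" classes, with the ASCII space as the one exception.

```python
import sys
import unicodedata


def is_printable_by_definition(char):
    """Hypothetical helper: the documented rule, spelled out.

    The ASCII space is printable; otherwise a character is printable
    exactly when its general category is neither Other (C*) nor
    Separator (Z*).
    """
    if char == " ":
        return True
    return unicodedata.category(char)[0] not in ("C", "Z")


# Compare str.isprintable() with the rule for every code point.
mismatches = [
    codepoint
    for codepoint in range(sys.maxunicode + 1)
    if chr(codepoint).isprintable() != is_printable_by_definition(chr(codepoint))
]

print(f"{len(mismatches)} code points disagree with the definition")
```

An exhaustive scan is affordable here because there are only 0x110000 code points, so no sampling is needed.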
    StanFromIreland added a commit to StanFromIreland/cpython that referenced this issue Feb 14, 2025
    …pythonGH-130118)
    
    Author: Stan Ulbrych <[email protected]>
    
    Co-authored-by: Greg Price <[email protected]>
    (cherry picked from commit 3402e13)
    StanFromIreland added a commit to StanFromIreland/cpython that referenced this issue Feb 14, 2025
    …pythonGH-130118)
    
    Author:    Greg Price <[email protected]>
    
    Co-authored-by: Greg Price <[email protected]>
    (cherry picked from commit 3402e13)
    @picnixz removed the 3.11 (only security fixes), 3.10 (only security fixes), and 3.12 (only security fixes) labels Feb 14, 2025
    encukou pushed a commit that referenced this issue Feb 17, 2025
    GH-130125)
    
    Co-authored-by: Greg Price <[email protected]>
    (cherry picked from commit 3402e13)
    encukou pushed a commit that referenced this issue Feb 17, 2025
    GH-130127)
    
    Author:    Greg Price <[email protected]>
    
    Co-authored-by: Greg Price <[email protected]>
    (cherry picked from commit 3402e13)
    @encukou
    Member

    encukou commented Feb 17, 2025

    Thank you for the patch, and sorry for the wait!

    @encukou closed this as completed Feb 17, 2025
    4 participants