Skip to content

Critical deficiencies in Tools/c-globals/check-c-globals.py. #47

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
17 of 29 tasks
ericsnowcurrently opened this issue Jun 21, 2019 · 0 comments
Open
17 of 29 tasks
Assignees
Labels
complexity: medium moderately hard to figure out P0 high priority size: medium requires a moderate amount of changes type: code health URGENT must be dealt with ASAP X [isolation] improved interpreter isolation X [runtime] CPython runtime (lifecycle, state)

Comments

@ericsnowcurrently
Copy link
Owner

ericsnowcurrently commented Jun 21, 2019

(see #33)

The existing Tools/c-globals/check-c-globals.py script is an important tool for achieving a per-interpreter GIL (and generally for the health of the CPython code base). However, it has a number of problems that keep it from being helpful in reaching that objective.

First we'll review the characteristics of the tool. Then we'll look at the deficiencies. Finally, we'll cover the specific tasks that will address those problems.

Purpose

The script has the following 2 core purposes:

  • make sure there are no "unsupported" static variables in CPython
  • help address failures when such static variables are found
    • including making it easy to see the current related state of the CPython code base

(More or less, "unsupported" means the variable holds per-interpreter runtime state.)

Constraints

The script also has the following core constraints:

  • produce correct results (identify exactly all unsupported static variables)
  • be up-to-date (relative to ignored variables and to what "unsupported" means)
  • have low maintenance cost (few defects, easy to understand, easy to update/extend)

Deficiencies

In context of those purposes and constraints, the script has some critical deficiencies (in rough order of priority):

  • not actually used by core developers (so statics keep getting added)
  • static locals are not detected
  • detection relies on a platform-specific tool (nm on linux)
  • the code is a mess (impacting maintainability)
  • no tests (impacting maintainability)
  • relies on naive hard-coded rules to ignore many statics (impacting correctness & maintainability)
  • relies on a simple large list of names (from a basic flat text file) to ignore the remaining statics (impacting correctness & maintainability)

(Note that non-critical deficiencies (and nice-to-haves) are covered in #49.)

Solution

(Reminder: we're only aiming to addresss critical deficiencies here.)

High-level Plan

  • rename Tools/c-globals to Tools/c-analyzer
  • add c-statics.py
    • the "full-featured" replacement for check-c-statics.py
    • only provide minimal necessary functionality to meet core purposes (and be fast enough)
    • correct the critical deficiencies
    • (detailed tasks below)
  • rewrite check-c-globals.py as a single-purpose script named check-c-statics.py
    • no args
    • share code with c-statics.py (i.e. call Tools/c-analyzer/c_statics/__main__.py:main())
    • refer to the c-statics.py script in output
  • support ignoring some "unsupported" statics
    • add Tools/c-analyzer/c_statics/ignored.tsv
    • add a --ignore option to the c-statics.py script
  • add Lib/test/test_check_c_statics.py
    • 1 test that effectively runs the check-c-statics.py script
    • ignore all statics in ignored.tsv for now (will still catch any new statics)
    • ensures check-c-statics.py is run at some point in CI
  • make sure it's clear how to resolve "unsupported" statics

At that point we can use the tool sans nearly all deficiencies. The remaining tasks would then be:

Directory Structure

Tools/c-analyzer/
    c_statics/
        README (including "Resolving Failures" section)
        __main__.py
        find.py
        show.py
        known.py
        supported.py
        ignored.tsv (temporary)
    c_parser/
        files.py
        info.py
        statics.py
        declarations.py
    c-statics.py
    check-c-statics.py
Lib/test/
    test_tools/test_tool_c_analyzer/
        testdata/...
        test_c_statics/...
        test_c_parser/...
    test_check_c_statics.py

c-statics.py CLI

The c-statics.py script will initially have the following basic CLI:

  • show [--ignored FILE] [--known FILE] [DIR,...]
    • print table of static variables
  • check [--ignored FILE] [--known FILE] [DIR,...]
    • fail if unsupported static vars found (and print simple list)

Specific Tasks for c-globals.py

(Each task includes adding sufficient tests.)

  • Tools/c-globals/_cg/: add __init__.py
  • Lib/test/test_toolcglobals.py: frame out the tests
  • _cg/__main__.py: add a basic parse_args(), main(), and if __name__ == '__main__:`
  • __main__.py: stub out a "show" command w/ file args
  • __main__.py: stub out a "check" command w/ file args
  • Tools/c-globals/c-globals.py: copy if __name__ == '__main__:from_cg/main.py`
  • ...
  • info.py: add StaticVar (filename, funcname, name, vartype)
  • find.py: add a fake "statics()" func (w/ hard-coded [StaticVar, supported] list)
  • show.py: add show_basic() (group by supported / unsupported)
  • __main__.py: implement the "show" command
  • ...
  • __main__.py: implement the "check" command
  • ...
  • scan.py: add "iter_statics(file)" with a fake list of StaticVar
  • supported.py: add "is_supported(var)" with a simple testable heuristic
  • find.py: implement "statics()"
  • ...
  • scan.py: add "normalize_vartype(text)"
  • ...
  • scan.py: implement "iter_statics()"
  • ...
  • known.py: add "self_check()"
    • make sure "known" data is up-to-date with search dirs
    • check "ignored" variables
  • ...
  • Lib/test/test_toolcglobals.py: add black-box system tests for c-globals.py (see below)
  • Lib/test/test_toolcglobals.py: add a "self-check" system test

System Tests

  1. "self-check":
    • run normal check
    • ensure known macros & vartypes aren't hiding unknown local types
    • compare results with nm output
  2. run c-globals.py show against dummy files
    • check exact output
    • drive results via "known" and "ignored" files
    • static vars are a mix of POTS, stdlib, 3rd-party, and local types
    • no known macros or vartypes unless otherwise indicated
    • tests (none ignored):
      • all known -> supported
      • only local known -> supported
      • some local unknown -> unsupported
      • some vartypes known but not their nested local type -> supported (fail self-check?)
      • some macros known but not their nested local type -> supported (fail self-check?)
  3. run c-globals.py check against dummy files:
    • make sure a passing "check" really means everything is okay
    • ...

dummy files:

  • static vars for all POTS
  • static vars for stdlib (as much as possible)
  • many static vars and many non-static
  • a good mix of global and local
  • at least 1 macro that produces a static
Lib/test/toolcglobalsdata/
    test*/
        Src1/
            spam.c
            eggs.c
        Src2/
            ham.c
        Src3/
            beans.c
            beans_extra.h
            tartar.c
        Include/
            spam.h
            beans.h
           tartar.h
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
complexity: medium moderately hard to figure out P0 high priority size: medium requires a moderate amount of changes type: code health URGENT must be dealt with ASAP X [isolation] improved interpreter isolation X [runtime] CPython runtime (lifecycle, state)
Projects
None yet
Development

No branches or pull requests

1 participant