
Consider applying flags for warnings about potential security issues #112301


Open · mdboom opened this issue Nov 21, 2023 · 27 comments
Labels: build (The build process and cross-build), performance (Performance or resource usage), type-feature (A feature request or enhancement), type-security (A security issue)

Comments

@mdboom
Contributor

mdboom commented Nov 21, 2023

Feature or enhancement

Proposal:

At a recent meeting of OpenSSF's Memory Safety SIG, I became aware of the C/C++ hardening guide they are putting together.

At a high level, they recommend compiling with the following flags:

-O2 -Wall -Wformat=2 -Wconversion -Wtrampolines -Wimplicit-fallthrough \
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 \
-D_GLIBCXX_ASSERTIONS \
-fstrict-flex-arrays=3 \
-fstack-clash-protection -fstack-protector-strong \
-Wl,-z,nodlopen -Wl,-z,noexecstack \
-Wl,-z,relro -Wl,-z,now \
-fPIE -pie -fPIC -shared

(-shared doesn't really make sense as a global CFLAG, so I removed it.)

When compiling on most x86 architectures (amd64, i386 and x32), add:

-fcf-protection=full
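To make the runtime effect of the fortification flag concrete, here is a minimal, hedged sketch (hypothetical code, not from CPython) of the kind of buffer overflow that `-D_FORTIFY_SOURCE` turns into a controlled abort rather than silent memory corruption. Level 3 needs a recent GCC and glibc; level 2 behaves similarly for this case.

```c
#include <string.h>
#include <stdio.h>

/* Hypothetical demo, not CPython code.
 * Build with: gcc -O2 -D_FORTIFY_SOURCE=3 fortify_demo.c
 * Because the destination size is known, memcpy is rewritten to
 * __memcpy_chk, which detects the overflow and aborts the process
 * instead of silently corrupting adjacent memory. */
int main(void)
{
    char dst[8];
    const char src[32] = "this string is longer than dst";

    memcpy(dst, src, sizeof(src));   /* 32 bytes into an 8-byte buffer */
    printf("%s\n", dst);             /* not reached when fortified */
    return 0;
}
```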

At @sethmlarson's urging, I compiled CPython on Linux/x86_64/gcc with these flags. From the complete build log, there are 3,084 warnings, but otherwise the result builds and passes all unit tests.

The warnings are of these types: (EDIT: Table updated to not double count the same line)

| warning type | count |
| --- | ---: |
| sign-conversion | 2,341 |
| conversion | 595 |
| array-bounds= | 131 |
| format-nonliteral | 11 |
| stringop-overflow= | 2 |
| float-conversion | 2 |
| stringop-overread | 1 |
| maybe-uninitialized | 1 |
| total | 3,084 |
**Top warnings per file.**
| filename | count |
| --- | ---: |
| ./Modules/binascii.c | 208 |
| Objects/unicodeobject.c | 142 |
| ./Include/internal/pycore_runtime_init.h | 128 |
| Parser/parser.c | 114 |
| ./Modules/_decimal/libmpdec/mpdecimal.c | 94 |
| ./Modules/posixmodule.c | 85 |
| ./Modules/socketmodule.c | 76 |
| ./Modules/_pickle.c | 75 |
| Objects/longobject.c | 65 |
| ./Modules/arraymodule.c | 49 |
| total (all files) | 3,084 |

I am not a security expert, so I don't know a good way to assess how many of these are potentially exploitable, and how many are harmless false positives. Some are probably unresolvable (format-nonliteral is pretty hard to avoid when wrapping sprintf, for example).

At a high level, I think the process to address these and make incremental progress maybe looks something like:

  • Pick one of the warning types, and assess how many false positives it gives and how onerous it is to fix them. From this, build consensus about whether it's worth addressing.
  • Fix all of the existing instances.
  • Turn that specific warning into an error so it doesn't creep back in.

But this is just to start the discussion about how to move forward.
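To make the second step concrete, here is a hedged sketch of what fixing a single benign `-Wconversion`/`-Wsign-conversion` instance might look like (hypothetical code, not an actual CPython change): check the range once, then convert explicitly so the narrowing is visibly intentional.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not an actual CPython change: narrow a size_t to
 * int only after proving it fits.  An implicit `int n = some_size_t;` is
 * what -Wconversion / -Wsign-conversion would flag; the explicit, checked
 * version below compiles cleanly under both warnings. */
static int checked_narrow(size_t len)
{
    assert(len <= (size_t)INT_MAX);   /* real code would report an error instead */
    return (int)len;
}
```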

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

mdboom added the type-feature, type-security, and build labels on Nov 21, 2023
@colesbury
Contributor

I don't think we want -fstrict-flex-arrays=3. We need flexible array members and we need C++ support, so we're forced to rely on the (widely supported) compiler extension of using field[0] or field[1] as a flexible array member.
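For anyone unfamiliar with the distinction, here is an illustrative sketch (the struct names are made up, not actual CPython declarations) of the two forms and why `-fstrict-flex-arrays=3` conflicts with the extension CPython relies on:

```c
#include <stddef.h>

/* C99 flexible array member: the form -fstrict-flex-arrays=3 expects,
 * but it is not valid in standard C++, which some CPython headers must
 * stay compatible with. */
struct fam_c99 {
    size_t len;
    int items[];        /* true flexible array member */
};

/* The widely supported extension used instead: a trailing one-element
 * array that is over-allocated at run time.  Under -fstrict-flex-arrays=3
 * the compiler treats items as having exactly one element, so
 * -Warray-bounds and _FORTIFY_SOURCE would flag intentional accesses
 * such as obj->items[5]. */
struct fam_ext {
    size_t len;
    int items[1];       /* "fake" flexible array member */
};
```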

@sobolevn
Member

sobolevn commented Nov 22, 2023

Some are probably unresolvable (format-nonliteral is pretty hard to avoid when wrapping sprintf, for example)

These warnings do not make much sense in the current use-cases:

Objects/unicodeobject.c:2592:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2592 |                     sprintf(buffer, fmt, va_arg(*vargs, long)) :
      |                     ^~~~~~~
Objects/unicodeobject.c:2593:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2593 |                     sprintf(buffer, fmt, va_arg(*vargs, unsigned long));
      |                     ^~~~~~~
Objects/unicodeobject.c:2597:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2597 |                     sprintf(buffer, fmt, va_arg(*vargs, long long)) :
      |                     ^~~~~~~
Objects/unicodeobject.c:2598:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2598 |                     sprintf(buffer, fmt, va_arg(*vargs, unsigned long long));
      |                     ^~~~~~~
Objects/unicodeobject.c:2602:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2602 |                     sprintf(buffer, fmt, va_arg(*vargs, Py_ssize_t)) :
      |                     ^~~~~~~
Objects/unicodeobject.c:2603:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2603 |                     sprintf(buffer, fmt, va_arg(*vargs, size_t));
      |                     ^~~~~~~
Objects/unicodeobject.c:2606:17: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2606 |                 len = sprintf(buffer, fmt, va_arg(*vargs, ptrdiff_t));
      |                 ^~~
Objects/unicodeobject.c:2610:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2610 |                     sprintf(buffer, fmt, va_arg(*vargs, intmax_t)) :
      |                     ^~~~~~~
Objects/unicodeobject.c:2611:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2611 |                     sprintf(buffer, fmt, va_arg(*vargs, uintmax_t));
      |                     ^~~~~~~
Objects/unicodeobject.c:2615:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2615 |                     sprintf(buffer, fmt, va_arg(*vargs, int)) :
      |                     ^~~~~~~
Objects/unicodeobject.c:2616:21: warning: format not a string literal, argument types not checked [-Wformat-nonliteral]
 2616 |                     sprintf(buffer, fmt, va_arg(*vargs, unsigned int));
      |                     ^~~~~~~

I think that they should be silenced / ignored.
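If silencing is the route taken, one possible mechanism (a hedged sketch, not the actual unicodeobject.c code; the helper name is made up) is to disable just this diagnostic around the offending calls, so `-Wformat-nonliteral` stays active everywhere else:

```c
#include <stdio.h>
#include <stdarg.h>

/* Hypothetical helper: the format string is built at run time, so
 * -Wformat-nonliteral cannot check it.  The push/ignored/pop pragmas
 * silence only this call site; the warning stays enabled for the rest
 * of the file. */
static int format_long(char *buffer, const char *fmt, va_list *vargs)
{
    int len;
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wformat-nonliteral"
#endif
    len = sprintf(buffer, fmt, va_arg(*vargs, long));
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
    return len;
}
```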

@sethmlarson
Contributor

@mdboom Are you okay with me editing your topic to create a checklist style table with links to either why we're not implementing or the actual implementation? My guess is we'll be adopting these one by one :)

@mdboom
Contributor Author

mdboom commented Nov 22, 2023

@mdboom Are you okay with me editing your topic to create a checklist style table with links to either why we're not implementing or the actual implementation? My guess is we'll be adopting these one by one :)

@sethmlarson: Good idea.

@hugovk
Member

hugovk commented Nov 23, 2023

At a high level, I think the process to address these and make incremental progress maybe looks something like:

  • Pick one of the warning types, and assess how many false positives it gives and how onerous it is to fix them. From this, build consensus about whether it's worth addressing.
  • Fix all of the existing instances.
  • Turn that specific warning into an error so it doesn't creep back in.

Sounds like a good approach.

To share another method that could additionally help: as part of #101100, we're working through a lot of docs "nit-picky" warnings.

When building the docs, we only allow warnings to occur in files that already have warnings and are listed in a .nitignore file. Once a file has been cleaned, we remove it from the list to prevent regressions.

We also fail the docs build if we "accidentally" clean a file: if warnings no longer occur in a file where we previously expected them, the file must also be removed from the list, again to prevent regressions.

This does need some custom tooling, but it's helped us make gradual progress, and we've fixed 40% so far.

@carsonRadtke
Contributor

RE: @hugovk's .nitignore

I am in favor of a solution like this. It would not require any custom tooling, as we could change the build arguments to whatever we find consensus on and then silence compiler warnings for offending lines until somebody comes along and fixes them.

This also allows us to silence errors locally, but enforce them globally. That way we could still have -Wformat-nonliteral, but allow non-compliance during the compilation of 'unicodeobject.c'. (I am not advocating for this flag, just using it as an example)

@nohlson
Contributor

nohlson commented Jun 4, 2024

Hello all, I have been selected by GSoC to work on this!

@mdboom I am curious how you were able to get the unit tests to pass with the linker option -Wl,-z,nodlopen. I am testing options out on my own machine (Linux/x86_64/gcc).

configure determines that, because dlopen() is available, dynload_shlib.o should be linked, and that object uses dlopen(); so later on, when importing modules, I understandably get an error that shared objects cannot be dlopen()ed. I'm not aware of any alternative, so I was curious how you were able to avoid dlopen() and still pass the tests that load modules.
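For context on why the flag and extension imports collide, here is a generic, hedged sketch of dlopen()-based extension loading (not the actual Python/dynload_shlib.c code; the names are made up). A shared object linked with `-Wl,-z,nodlopen` carries the `DF_1_NOOPEN` flag, so the dlopen() call refuses to load it and the import fails:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Generic sketch of dlopen()-based extension loading. */
typedef void *(*initfunc_t)(void);

void *load_extension(const char *path, const char *init_symbol)
{
    /* A shared object linked with -Wl,-z,nodlopen is marked DF_1_NOOPEN,
     * so this dlopen() call fails and the module cannot be imported. */
    void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    initfunc_t init = (initfunc_t)dlsym(handle, init_symbol);
    if (init == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }
    return init();   /* e.g. a PyInit_<modname>-style entry point */
}
```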

@mdboom
Contributor Author

mdboom commented Jun 5, 2024

Welcome, @nohlson! I was really excited to hear about this GSoC project at PyCon.

This whole investigation for me was a quick afternoon hack. I only ever got as far as getting the build to complete -- I never even ran Python, let alone its test suite. I just got as far as thinking "someone with more time should work on this", and here you are ;)

Also looking at this again, I see I didn't set the linker flags on LDFLAGS, only CFLAGS, and therefore it seems they had no effect. So there's nothing magical about it working for me and not you.

IMHO, this seems like a hard flag to support. "Python without dlopen" would be a different beast -- maybe some very security-conscious people would want that, but it would require tooling to statically link in all expected extension modules (some of that tooling already exists elsewhere). So, personally, I'd defer solving that one for now (but I'm not the GSoC mentor, that's just my opinion).

@nohlson
Contributor

nohlson commented Jun 14, 2024

I would like to get some discussion going about the performance impacts of enabling options and how much we would be willing to concede in performance for safety.

Here is an example of a CPython baseline pyperformance benchmark vs. a build with multiple performance-impacting options enabled, broken down by benchmark category:

| Benchmark tag | Geometric mean |
| --- | --- |
| apps | 1.03x slower |
| asyncio | 1.04x slower |
| math | 1.03x slower |
| regex | 1.00x faster |
| serialize | 1.08x slower |
| startup | 1.03x slower |
| template | 1.03x slower |
| all benchmarks (overall) | 1.04x slower |

To see all the benchmarks run and the comparison between the builds, here is more detail:

Benchmark comparison (baseline vs. hardened build)

Benchmarks with tag 'apps':

2to3: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 306 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 314 ms +- 1 ms: 1.03x slower
docutils: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 2.72 sec +- 0.01 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 2.83 sec +- 0.02 sec: 1.04x slower
html5lib: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 73.9 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 76.8 ms +- 0.5 ms: 1.04x slower
tornado_http: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 117 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 119 ms +- 2 ms: 1.02x slower

Geometric mean: 1.03x slower

Benchmarks with tag 'asyncio':

async_tree_none: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 443 ms +- 22 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 460 ms +- 22 ms: 1.04x slower
async_tree_cpu_io_mixed: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 707 ms +- 36 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 736 ms +- 37 ms: 1.04x slower
async_tree_cpu_io_mixed_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 666 ms +- 60 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 693 ms +- 61 ms: 1.04x slower
async_tree_eager: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 137 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 146 ms +- 1 ms: 1.07x slower
async_tree_eager_cpu_io_mixed: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 448 ms +- 9 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 468 ms +- 8 ms: 1.04x slower
async_tree_eager_cpu_io_mixed_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 394 ms +- 9 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 411 ms +- 8 ms: 1.04x slower
async_tree_eager_io_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.38 sec +- 0.07 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.42 sec +- 0.07 sec: 1.03x slower
async_tree_eager_memoization: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 271 ms +- 23 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 284 ms +- 23 ms: 1.05x slower
async_tree_eager_memoization_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 214 ms +- 8 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 222 ms +- 8 ms: 1.04x slower
async_tree_eager_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 96.1 ms +- 0.8 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 102 ms +- 1 ms: 1.06x slower
async_tree_io: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.08 sec +- 0.08 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.12 sec +- 0.08 sec: 1.04x slower
async_tree_io_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.12 sec +- 0.04 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.16 sec +- 0.04 sec: 1.03x slower
async_tree_memoization: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 567 ms +- 54 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 587 ms +- 53 ms: 1.04x slower

Benchmark hidden because not significant (3): async_tree_eager_io, async_tree_memoization_tg, async_tree_none_tg

Geometric mean: 1.04x slower

Benchmarks with tag 'math':

float: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 86.6 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 92.2 ms +- 0.9 ms: 1.06x slower
nbody: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 85.9 ms +- 1.0 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 89.2 ms +- 0.6 ms: 1.04x slower
pidigits: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 171 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 170 ms +- 0 ms: 1.00x faster

Geometric mean: 1.03x slower

Benchmarks with tag 'regex':

regex_compile: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 155 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 158 ms +- 1 ms: 1.02x slower
regex_dna: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 163 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 162 ms +- 1 ms: 1.01x faster
regex_effbot: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 3.04 ms +- 0.08 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 3.11 ms +- 0.07 ms: 1.02x slower
regex_v8: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 26.7 ms +- 0.3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 25.4 ms +- 0.2 ms: 1.05x faster

Geometric mean: 1.00x faster

Benchmarks with tag 'serialize':

json_dumps: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 13.9 ms +- 0.2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 15.6 ms +- 0.2 ms: 1.12x slower
json_loads: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 31.5 us +- 0.3 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 35.6 us +- 0.6 us: 1.13x slower
pickle: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 13.7 us +- 0.1 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 15.3 us +- 0.1 us: 1.11x slower
pickle_dict: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 28.7 us +- 0.5 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 39.4 us +- 0.1 us: 1.37x slower
pickle_list: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 4.42 us +- 0.08 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 4.23 us +- 0.13 us: 1.04x faster
pickle_pure_python: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 357 us +- 3 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 370 us +- 2 us: 1.04x slower
tomli_loads: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 2.48 sec +- 0.02 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 2.63 sec +- 0.02 sec: 1.06x slower
unpickle: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 18.6 us +- 0.2 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 20.3 us +- 0.3 us: 1.09x slower
unpickle_list: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 5.32 us +- 0.15 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 5.93 us +- 0.15 us: 1.11x slower
unpickle_pure_python: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 266 us +- 2 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 258 us +- 2 us: 1.03x faster
xml_etree_parse: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 157 ms +- 2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 164 ms +- 2 ms: 1.05x slower
xml_etree_iterparse: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 106 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 113 ms +- 1 ms: 1.07x slower
xml_etree_generate: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 116 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 123 ms +- 2 ms: 1.06x slower
xml_etree_process: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 78.4 ms +- 0.5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 83.2 ms +- 0.8 ms: 1.06x slower

Geometric mean: 1.08x slower

Benchmarks with tag 'startup':

python_startup: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 10.3 ms +- 0.0 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 10.6 ms +- 0.0 ms: 1.03x slower
python_startup_no_site: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 7.07 ms +- 0.02 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 7.31 ms +- 0.04 ms: 1.03x slower

Geometric mean: 1.03x slower

Benchmarks with tag 'template':

genshi_text: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 26.9 ms +- 0.2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 27.6 ms +- 0.2 ms: 1.02x slower
genshi_xml: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 60.8 ms +- 0.5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 62.3 ms +- 0.4 ms: 1.02x slower
mako: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 12.8 ms +- 0.1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 13.3 ms +- 0.1 ms: 1.04x slower

Geometric mean: 1.03x slower

All benchmarks:

2to3: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 306 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 314 ms +- 1 ms: 1.03x slower
async_generators: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 465 ms +- 3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 536 ms +- 4 ms: 1.15x slower
async_tree_none: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 443 ms +- 22 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 460 ms +- 22 ms: 1.04x slower
async_tree_cpu_io_mixed: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 707 ms +- 36 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 736 ms +- 37 ms: 1.04x slower
async_tree_cpu_io_mixed_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 666 ms +- 60 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 693 ms +- 61 ms: 1.04x slower
async_tree_eager: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 137 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 146 ms +- 1 ms: 1.07x slower
async_tree_eager_cpu_io_mixed: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 448 ms +- 9 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 468 ms +- 8 ms: 1.04x slower
async_tree_eager_cpu_io_mixed_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 394 ms +- 9 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 411 ms +- 8 ms: 1.04x slower
async_tree_eager_io_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.38 sec +- 0.07 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.42 sec +- 0.07 sec: 1.03x slower
async_tree_eager_memoization: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 271 ms +- 23 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 284 ms +- 23 ms: 1.05x slower
async_tree_eager_memoization_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 214 ms +- 8 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 222 ms +- 8 ms: 1.04x slower
async_tree_eager_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 96.1 ms +- 0.8 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 102 ms +- 1 ms: 1.06x slower
async_tree_io: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.08 sec +- 0.08 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.12 sec +- 0.08 sec: 1.04x slower
async_tree_io_tg: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.12 sec +- 0.04 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.16 sec +- 0.04 sec: 1.03x slower
async_tree_memoization: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 567 ms +- 54 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 587 ms +- 53 ms: 1.04x slower
asyncio_tcp_ssl: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.52 sec +- 0.01 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.54 sec +- 0.00 sec: 1.01x slower
chaos: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 71.4 ms +- 0.6 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 75.7 ms +- 1.0 ms: 1.06x slower
comprehensions: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 18.5 us +- 0.1 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 19.3 us +- 0.1 us: 1.04x slower
bench_mp_pool: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 20.8 ms +- 7.3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 17.8 ms +- 5.2 ms: 1.17x faster
bench_thread_pool: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.10 ms +- 0.01 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.11 ms +- 0.01 ms: 1.02x slower
coroutines: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 24.9 ms +- 0.3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 25.8 ms +- 0.3 ms: 1.04x slower
coverage: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 120 ms +- 4 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 122 ms +- 3 ms: 1.01x slower
crypto_pyaes: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 79.2 ms +- 1.0 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 88.2 ms +- 1.1 ms: 1.11x slower
dask: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 387 ms +- 14 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 399 ms +- 12 ms: 1.03x slower
deepcopy: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 432 us +- 4 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 451 us +- 3 us: 1.04x slower
deepcopy_reduce: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 4.19 us +- 0.06 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 4.50 us +- 0.04 us: 1.08x slower
deepcopy_memo: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 42.4 us +- 0.6 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 45.0 us +- 0.4 us: 1.06x slower
deltablue: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 3.68 ms +- 0.03 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 3.98 ms +- 0.03 ms: 1.08x slower
docutils: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 2.72 sec +- 0.01 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 2.83 sec +- 0.02 sec: 1.04x slower
fannkuch: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 428 ms +- 4 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 460 ms +- 4 ms: 1.08x slower
float: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 86.6 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 92.2 ms +- 0.9 ms: 1.06x slower
create_gc_cycles: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.18 ms +- 0.00 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.20 ms +- 0.01 ms: 1.02x slower
gc_traversal: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 3.16 ms +- 0.03 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 3.38 ms +- 0.15 ms: 1.07x slower
generators: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 29.1 ms +- 0.3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 29.7 ms +- 0.2 ms: 1.02x slower
genshi_text: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 26.9 ms +- 0.2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 27.6 ms +- 0.2 ms: 1.02x slower
genshi_xml: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 60.8 ms +- 0.5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 62.3 ms +- 0.4 ms: 1.02x slower
go: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 149 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 154 ms +- 1 ms: 1.04x slower
hexiom: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 6.79 ms +- 0.03 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 7.04 ms +- 0.05 ms: 1.04x slower
html5lib: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 73.9 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 76.8 ms +- 0.5 ms: 1.04x slower
json_dumps: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 13.9 ms +- 0.2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 15.6 ms +- 0.2 ms: 1.12x slower
json_loads: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 31.5 us +- 0.3 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 35.6 us +- 0.6 us: 1.13x slower
logging_format: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 7.57 us +- 0.07 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 8.25 us +- 0.14 us: 1.09x slower
logging_silent: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 125 ns +- 1 ns -> [two_baselines_and_tldr/config_3/pyperf_output.json] 117 ns +- 2 ns: 1.07x faster
logging_simple: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 6.89 us +- 0.05 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 7.22 us +- 0.09 us: 1.05x slower
mako: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 12.8 ms +- 0.1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 13.3 ms +- 0.1 ms: 1.04x slower
meteor_contest: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 102 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 109 ms +- 1 ms: 1.07x slower
nbody: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 85.9 ms +- 1.0 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 89.2 ms +- 0.6 ms: 1.04x slower
nqueens: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 105 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 113 ms +- 1 ms: 1.07x slower
pathlib: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 22.2 ms +- 0.1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 22.9 ms +- 0.1 ms: 1.03x slower
pickle: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 13.7 us +- 0.1 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 15.3 us +- 0.1 us: 1.11x slower
pickle_dict: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 28.7 us +- 0.5 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 39.4 us +- 0.1 us: 1.37x slower
pickle_list: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 4.42 us +- 0.08 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 4.23 us +- 0.13 us: 1.04x faster
pickle_pure_python: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 357 us +- 3 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 370 us +- 2 us: 1.04x slower
pidigits: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 171 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 170 ms +- 0 ms: 1.00x faster
pprint_safe_repr: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 983 ms +- 10 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.01 sec +- 0.01 sec: 1.03x slower
pprint_pformat: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 2.00 sec +- 0.02 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 2.06 sec +- 0.02 sec: 1.03x slower
pyflate: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 468 ms +- 5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 481 ms +- 4 ms: 1.03x slower
python_startup: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 10.3 ms +- 0.0 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 10.6 ms +- 0.0 ms: 1.03x slower
python_startup_no_site: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 7.07 ms +- 0.02 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 7.31 ms +- 0.04 ms: 1.03x slower
raytrace: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 304 ms +- 2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 323 ms +- 2 ms: 1.07x slower
regex_compile: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 155 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 158 ms +- 1 ms: 1.02x slower
regex_dna: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 163 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 162 ms +- 1 ms: 1.01x faster
regex_effbot: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 3.04 ms +- 0.08 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 3.11 ms +- 0.07 ms: 1.02x slower
regex_v8: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 26.7 ms +- 0.3 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 25.4 ms +- 0.2 ms: 1.05x faster
richards: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 55.7 ms +- 0.5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 59.9 ms +- 0.9 ms: 1.07x slower
richards_super: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 64.2 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 68.0 ms +- 0.8 ms: 1.06x slower
scimark_fft: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 395 ms +- 4 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 413 ms +- 3 ms: 1.05x slower
scimark_lu: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 137 ms +- 2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 142 ms +- 2 ms: 1.04x slower
scimark_monte_carlo: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 76.0 ms +- 0.7 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 79.5 ms +- 1.1 ms: 1.05x slower
scimark_sor: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 145 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 153 ms +- 1 ms: 1.05x slower
scimark_sparse_mat_mult: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 5.76 ms +- 0.07 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 6.17 ms +- 0.11 ms: 1.07x slower
spectral_norm: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 120 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 139 ms +- 1 ms: 1.16x slower
sqlglot_normalize: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 145 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 151 ms +- 1 ms: 1.04x slower
sqlglot_optimize: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 69.0 ms +- 0.4 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 71.0 ms +- 0.3 ms: 1.03x slower
sqlglot_parse: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.44 ms +- 0.01 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.50 ms +- 0.01 ms: 1.04x slower
sqlglot_transpile: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 1.76 ms +- 0.01 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 1.82 ms +- 0.01 ms: 1.04x slower
sqlite_synth: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 3.45 us +- 0.04 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 3.57 us +- 0.05 us: 1.04x slower
telco: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 10.9 ms +- 0.1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 11.6 ms +- 0.1 ms: 1.06x slower
tomli_loads: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 2.48 sec +- 0.02 sec -> [two_baselines_and_tldr/config_3/pyperf_output.json] 2.63 sec +- 0.02 sec: 1.06x slower
tornado_http: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 117 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 119 ms +- 2 ms: 1.02x slower
typing_runtime_protocols: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 212 us +- 3 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 224 us +- 4 us: 1.05x slower
unpack_sequence: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 38.4 ns +- 2.1 ns -> [two_baselines_and_tldr/config_3/pyperf_output.json] 41.2 ns +- 0.4 ns: 1.07x slower
unpickle: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 18.6 us +- 0.2 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 20.3 us +- 0.3 us: 1.09x slower
unpickle_list: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 5.32 us +- 0.15 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 5.93 us +- 0.15 us: 1.11x slower
unpickle_pure_python: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 266 us +- 2 us -> [two_baselines_and_tldr/config_3/pyperf_output.json] 258 us +- 2 us: 1.03x faster
xml_etree_parse: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 157 ms +- 2 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 164 ms +- 2 ms: 1.05x slower
xml_etree_iterparse: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 106 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 113 ms +- 1 ms: 1.07x slower
xml_etree_generate: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 116 ms +- 1 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 123 ms +- 2 ms: 1.06x slower
xml_etree_process: Mean +- std dev: [two_baselines_and_tldr/config_1/pyperf_output.json] 78.4 ms +- 0.5 ms -> [two_baselines_and_tldr/config_3/pyperf_output.json] 83.2 ms +- 0.8 ms: 1.06x slower

Benchmark hidden because not significant (6): async_tree_eager_io, async_tree_memoization_tg, async_tree_none_tg, asyncio_tcp, asyncio_websockets, mdp

Geometric mean: 1.04x slower

I am putting together some analysis of how individual options affect performance, which I will share, but I would like to get some opinions concerning which benchmarks can't afford to take a performance hit. For example, I would be less concerned about the startup and docutils benchmarks than about the regex and math benchmarks, which could be used at high frequency in applications.

mdboom added the performance label on Jun 14, 2024
@mdboom
Contributor Author

mdboom commented Jun 14, 2024

I think, unfortunately, the answer to that is "it depends". Startup really matters for some applications, and not others, for example. Likewise, security really matters in some contexts, but not others. It's hard to speculate at the beginning of this project, but maybe the end result will be to make it easy to make a security-hardened build at the expense of performance when the end user wants to make that tradeoff.

I like the idea of breaking this out by individual flags, so we can see which have the most impact. It might also be possible that we can reduce the impact of some of the options by changing how some code is written in CPython, i.e. if an option makes some unsafe C feature slower, maybe we try to stop using that unsafe C feature if we can ;)

We have a whole set of standard benchmarking machines at Microsoft that are set up to get the results as-reproducible-as-possible. If you create a branch on your fork of CPython with some proposed changes, you can ping me and I can kick off a run, and the results will show up automatically on iscpythonfastyet.com. Unfortunately, we can't automate triggering those runs for security reasons, but it's really easy for me so don't hesitate to ask me and I can get to it pretty quickly during my working hours.

@corona10
Member

corona10 commented Jun 26, 2024

@nohlson By the way, the fall-through warning emits a lot of warnings from the CPython build (we try to remove compiler warnings as much as possible), and some of them are intentional fall-throughs. Do you have any plans for this?

example

./Modules/_testcapi/exceptions.c:38:9: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
        case 2:
        ^
./Modules/_testcapi/exceptions.c:38:9: note: insert '__attribute__((fallthrough));' to silence this warning
        case 2:
        ^
        __attribute__((fallthrough)); 
./Modules/_testcapi/exceptions.c:38:9: note: insert 'break;' to avoid fall-through
        case 2:
        ^
        break; 
./Modules/_testcapi/exceptions.c:42:9: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
        case 1:
        ^
./Modules/_testcapi/exceptions.c:42:9: note: insert '__attribute__((fallthrough));' to silence this warning
        case 1:
        ^
        __attribute__((fallthrough)); 
./Modules/_testcapi/exceptions.c:42:9: note: insert 'break;' to avoid fall-through
        case 1:
        ^
        break; 
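For the intentional cases, the annotation the compiler suggests looks roughly like this (a hedged sketch, not the actual _testcapi code):

```c
/* Hypothetical switch showing an annotated, intentional fall-through.
 * The attribute (or C23's [[fallthrough]]) tells -Wimplicit-fallthrough
 * that the missing break is deliberate. */
static int describe(int level, int flags)
{
    switch (level) {
    case 2:
        flags |= 0x2;
        __attribute__((fallthrough));   /* deliberately continue into case 1 */
    case 1:
        flags |= 0x1;
        break;
    default:
        flags = 0;
        break;
    }
    return flags;
}
```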

And for the expat module, you should send patches upstream.
https://github.com/libexpat/libexpat cc @hartwork

@nohlson
Contributor

nohlson commented Jun 26, 2024

@corona10 Yes I am going to be implementing some tooling to keep track of new warnings that are generated by enabling these new flags, which is the next step of this process.

I had actually overlooked those warnings until I saw them in some of the buildbot compile logs. I had intended for the first round to be warning-free.

We can start by deciding if we should ignore the intended fall-through warnings in the tooling or add the attributes.

@corona10
Member

corona10 commented Jun 26, 2024

We can start by deciding if we should ignore the intended fall-through warnings in the tooling or add the attributes.

Then, how about we revert the fall-through warning change and just test it again once your new tool is implemented?
(I like your idea, but we need to separate tasks as much as possible.)
Until then, the emitted warnings will be stressful for some core devs. And, as I said, some of the code is in vendored modules, not ours; we should submit patches upstream first, so the fix may be delayed until they release a new version.

@nohlson
Contributor

nohlson commented Jun 26, 2024

@corona10 I agree, let's remove the fall-through warning option.

I will also look into why I am not seeing those warnings when I build locally.

@nohlson
Contributor

nohlson commented Jun 26, 2024

@corona10 PR to remove fallthrough warning option: #121041

Just from browsing the builds from #121030, it seems that only clang is emitting the fall-through warnings.

Here are builds with warnings:
https://buildbot.python.org/all/#/builders/721/builds/1465 (macos)
https://buildbot.python.org/all/#/builders/111/builds/1387 (Fedora with clang)

And others don't (Fedora/RHEL w/ gcc):
https://buildbot.python.org/all/#/builders/816/builds/977
https://buildbot.python.org/all/#/builders/745/builds/987
https://buildbot.python.org/all/#/builders/115/builds/1398

I will pay extra close attention to these compiler nuances when working on the tooling.

@nohlson
Contributor

nohlson commented Jul 8, 2024

@hugovk For the warning tooling, I had initially considered whether it would be feasible to add the warning checks to the pipeline for each of the buildbots, maybe even as a unit test, but keeping track of the warnings for each platform/compiler might be too complicated. Instead, I was considering making a couple of GitHub Actions jobs that run the warning-check tooling for macOS, Ubuntu, and Windows, which would be representative, just as was done for the docs warning tracker.

Are there any thoughts on the latter approach?

@hugovk
Member

hugovk commented Jul 9, 2024

Sounds like a good idea to just test on a subset. Using GitHub Actions means we can run as part of all PRs, and also people can test on their forks without too much bother.

Things to consider: do we want warnings to fail the build, so that people can't merge if they introduce new warnings? If not now, we can consider this for later.

The next level is to allow warnings to report as a failure, but not let that failure block a merge. A downside of this is that the check still shows as red even when people can merge, which they find annoying.

Then the next level is just to output the warnings in the log, and the job always reports as passed. The downside is you need to dig into the logs to see what happened, and most people won't do that.

If we're still at the investigation phase, we probably don't want to fail PRs just yet, but I imagine at some point we will.

Another thing to consider: will this be new jobs, or something added to the existing build?

For the docs warnings, we added it to an existing build: output all warnings to log, but otherwise build as usual, then in a later step of the same job, run a script to analyse the logs and decide when to fail.

@nohlson
Contributor

nohlson commented Jul 10, 2024

@hugovk Awesome thank you for the input! I am moving forward with a GitHub Actions solution.

Things to consider: do we want warnings to fail the build, so that people can't merge if they introduce new warnings? If not now, we can consider this for later.

The first iteration I will introduce will have options at the script level for failing on regression or improvement, just as the docs version has, but will have both disabled in the GitHub Actions configuration. The results of the checks will just be printed out as a "warning" about warnings.

The next level is allow warnings to report as a failure, but not let that failure block a merge. A downside of this is that it can still shows as red, even if they can merge, which people find annoying.

Once the tooling has been introduced we can create a new PR that enables an option that does produce a small number of new warnings. At this point we could enable fail on regression. Then we can decide if we are going to add to the ignore list or fix the manageable number of warnings.

Another thing to consider: will this be new jobs, or something added to the existing build?

Currently I am going to try to fit this into the existing Ubuntu build job. Then I will take a look at the macOS and Windows jobs afterwards. The design is very similar to the docs warning checks.

@nohlson
Contributor

nohlson commented Jul 23, 2024

Consider adding extra flags to CFLAGS_NODIST, so they only affect CPython, not third-party extensions built for it. Those might have conflicting requirements.

I agree with limiting the scope of these options to just CPython. After testing with buildbots we can make this change as suggested by @corona10 here

hugovk added a commit that referenced this issue Jul 27, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <[email protected]>
zware pushed a commit that referenced this issue Jul 30, 2024
…nings (GH-122465)

Also remove superfluous shebang from the warning check script
@encukou
Member

encukou commented Aug 1, 2024

I was redirected here: when adding configure options, I think it would be appropriate to have a wider discussion on Discourse rather than just on GitHub issues.

As I'm not following this effort very closely, I feel lost in the discussion and decisions :(

Practically, I think

  • the new configure options should be added to What's New
  • if some of the options can conflict with other options set by users, there should be docs “somewhere” to guide someone from seeing a compiler/linker error to solving it.
  • the docs should link to the specific document that is being implemented, rather than the OpenSSF home page.

@sethmlarson
Contributor

Apologies @encukou, agreed that these should get discussed to make sure they're the best method of handling this. The new options got added as a way to opt in to performance-affecting compilation options or opt out of options that aren't supported on your platform. See: #121996

I'll work with @nohlson to create this topic and issues for documenting them properly.

@nohlson
Contributor

nohlson commented Aug 5, 2024

I started a thread to have more discussion related to this topic: https://discuss.python.org/t/new-default-compiler-options-for-safety/60057/2

Attn: @encukou @sethmlarson

hugovk pushed a commit that referenced this issue Aug 6, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
brandtbucher pushed a commit to brandtbucher/cpython that referenced this issue Aug 7, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
AA-Turner added a commit that referenced this issue Aug 8, 2024
…fety`` and ``--enable-slower-safety``) (#122758)

Co-authored-by: Adam Turner <[email protected]>
hugovk added a commit that referenced this issue Aug 14, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <[email protected]>
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…ck warnings (pythonGH-122465)

Also remove superfluous shebang from the warning check script
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…ble-safety`` and ``--enable-slower-safety``) (python#122758)

Co-authored-by: Adam Turner <[email protected]>
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…22711)

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <[email protected]>
hugovk added a commit that referenced this issue Sep 13, 2024
…123020)

Co-authored-by: Hugo van Kemenade <[email protected]>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
hugovk added a commit to hugovk/cpython that referenced this issue Sep 13, 2024
@colesbury
Contributor

-Wconversion and -Wsign-conversion add a huge number of warnings. Is the security benefit commensurate with the amount of hassle and work that enabling those warnings will create? What security bugs would they have prevented?
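For readers unfamiliar with the warning, the bug class it targets typically looks like the sketch below (illustrative only; whether CPython has any exploitable instances is exactly the open question here). A signed, possibly attacker-influenced length is implicitly converted to `size_t`, so a negative value wraps to a huge one:

```c
#include <string.h>

/* Illustrative sketch of the bug class -Wsign-conversion targets;
 * hypothetical function, not CPython code. */
void copy_payload(char *dst, const char *src, int reported_len)
{
    /* If reported_len is negative, the implicit int -> size_t conversion
     * below wraps to an enormous value and memcpy runs far out of bounds.
     * -Wsign-conversion flags exactly this conversion; the fix is to
     * reject negative lengths and cast explicitly before copying. */
    memcpy(dst, src, reported_len);
}
```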
