gh-128641: Fix ConfigParser.read Performance Regression #129596

2trvl · 2025-02-02T20:34:49Z

Issue: Significant Configparser Performance Regression #128641

ghost · 2025-02-02T20:34:51Z

All commit authors signed the Contributor License Agreement.

bedevere-app · 2025-02-02T20:34:52Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Lib/configparser.py

Misc/NEWS.d/next/Library/2025-02-02-23-47-35.gh-issue-128641.GFs673.rst

jaraco

Thanks for this. The changes look sound and certainly help elucidate the reported performance degradation.

The biggest problem with this change is that it doesn't document which parts are performance-sensitive, so a future contributor (including myself, having forgotten these details) might refactor the code back to what it was, for example to reduce the number of logical branches or attribute accesses.

When I've encountered performance-sensitive parts of the code, I've done my best to (a) encapsulate the performance-sensitive aspects so they're identifiable and (b) include comments or docstrings to articulate their purpose. For example, in zipfile.Path, I added FastLookup:

cpython/Lib/zipfile/_path/__init__.py

Lines 177 to 193 in c537301

    class FastLookup(CompleteDirs):
        """
        ZipFile subclass to ensure implicit
        dirs exist and are resolved rapidly.
        """

        def namelist(self):
            with contextlib.suppress(AttributeError):
                return self.__names
            self.__names = super().namelist()
            return self.__names

        def _name_set(self):
            with contextlib.suppress(AttributeError):
                return self.__lookup
            self.__lookup = super()._name_set()
            return self.__lookup

This class encapsulates the performance optimizations while leaving the concerns of the non-optimized version exposed separately (CompleteDirs). With the two concerns disentangled, it becomes more obvious why each is there and why it is implemented that way. I'm not necessarily suggesting this pattern is appropriate here, though it could be.
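To make the pattern concrete outside of zipfile, here is a minimal sketch of the same idea. The names `Inventory`, `FastInventory`, and the `scans` counter are hypothetical and exist only for illustration; the caching idiom mirrors the `contextlib.suppress(AttributeError)` approach in the snippet above.

```python
import contextlib


class Inventory:
    """Plain, unoptimized behavior: rebuild the set on every call."""

    def __init__(self, names):
        self._names = list(names)
        self.scans = 0  # counts how often the expensive path runs

    def name_set(self):
        self.scans += 1
        return set(self._names)


class FastInventory(Inventory):
    """
    Inventory subclass that exists only for speed: it memoizes
    name_set() so repeated membership checks do not rescan.
    """

    def name_set(self):
        # The first call raises AttributeError (nothing cached yet),
        # which is suppressed so execution falls through to the slow path.
        with contextlib.suppress(AttributeError):
            return self.__lookup
        self.__lookup = super().name_set()
        return self.__lookup


inv = FastInventory(["a.txt", "b/", "b/c.txt"])
assert "b/" in inv.name_set()
assert "a.txt" in inv.name_set()
assert inv.scans == 1  # the base-class scan ran only once
```

Because the cache lives in its own class with a docstring stating its purpose, a later refactor of `Inventory` is less likely to remove the optimization by accident.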

The other thing I've done in these projects where possible is to include a regression test or benchmark that captures the performance expectations (example). Unfortunately, often these sorts of tests require sophistication that isn't available in the stdlib. Sometimes, however, it's possible to put in a test that will capture serious performance issues, such as hanging on an O(n^2) operation.
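As an illustration of the kind of coarse guard described above, the following sketch times `ConfigParser.read_string` at two input sizes and checks that the cost grows roughly linearly. The helper names, the input sizes, and the threshold are assumptions chosen for this example, not anything taken from the CPython test suite.

```python
import configparser
import time


def parse_seconds(n_options, repeats=3):
    """Best-of-N wall time to parse one section with n_options keys."""
    text = "[section]\n" + "".join(
        f"key{i} = value{i}\n" for i in range(n_options)
    )
    best = float("inf")
    for _ in range(repeats):
        parser = configparser.ConfigParser()
        start = time.perf_counter()
        parser.read_string(text)
        best = min(best, time.perf_counter() - start)
    assert len(parser["section"]) == n_options
    return best


def growth_ratio(small=2_000, factor=4):
    """Ratio of parse times when the input grows by `factor`."""
    return parse_seconds(small * factor) / parse_seconds(small)


def test_parse_scales_roughly_linearly():
    # Linear parsing should give a ratio near 4; quadratic behavior
    # would approach 16. The bound is loose to tolerate noisy machines.
    ratio = growth_ratio()
    assert ratio < 12, f"parse time grew {ratio:.1f}x for 4x the input"
```

A guard like this will not notice a 50% slowdown, but it should fail clearly if parsing becomes quadratic in the number of lines.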

I've left some comments to specific lines of the code indicating areas where I'm uneasy with the change (or might have liked to have done things differently). Because of this uneasiness, I'd like to explore other options. I'm not expecting you to simply address the comments.

I'm happy to help explore other options, but currently, it seems we don't have a good reproducer, so I'll ask for that in the ticket.

Lib/configparser.py

2trvl · 2025-02-14T10:49:39Z

@jaraco Please see my replies to your review comments.

The three blocks that built inline_comment are now combined into one, and the comprehension is broken down using a lambda.

News entry now reflects a return to previous performance rather than being misleading.

jaraco

With these adjustments, I'm much happier with the approach: it recovers most of the lost performance with a smaller diff while keeping the logic mostly functional in style.
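Two of the adjustments named in the commit log, keeping the line type as a `str` subclass and using `__slots__` to avoid a per-instance `__dict__`, can be illustrated with a small sketch. The class below and the `drop_hash` helper are hypothetical stand-ins written for this example; they are not copied from `Lib/configparser.py`.

```python
class Line(str):
    """A str that also carries its comment-stripped form."""

    # Declaring slots means instances get no __dict__, which keeps
    # each per-line object small and cheap to create.
    __slots__ = ("clean", "has_comments")

    def __new__(cls, text, strip_comments):
        # str is immutable, so the string value is fixed in __new__.
        return super().__new__(cls, text)

    def __init__(self, text, strip_comments):
        trimmed = text.strip()
        self.clean = strip_comments(trimmed)
        self.has_comments = trimmed != self.clean


def drop_hash(text):
    """Toy stripper: cut everything from the first '#' onward."""
    return text.split("#", 1)[0].rstrip()


line = Line("  key = value  # note\n", drop_hash)
assert isinstance(line, str) and line.startswith("  key")
assert line.clean == "key = value"
assert line.has_comments
assert not hasattr(line, "__dict__")  # __slots__ suppresses it
```

The derived attributes are computed once at construction, so code that later inspects the line does not redo the stripping work.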

2trvl · 2025-02-17T00:42:48Z

@jaraco

Timing for reading 190 files:

Main: 94 - 100 ms
Mine: 55 - 59 ms
Yours: 61 - 63 ms
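The comment does not say how these numbers were collected. One plausible way to produce a comparable min/max range is sketched below; the generated file contents, the 50 options per file, and the repeat count are all assumptions, so absolute timings will differ from those quoted.

```python
import configparser
import pathlib
import tempfile
import time


def make_files(directory, count, options_per_file=50):
    """Write `count` small INI files and return their paths."""
    paths = []
    for n in range(count):
        path = pathlib.Path(directory) / f"sample{n}.ini"
        body = "".join(
            f"opt{i} = {i} ; inline note\n" for i in range(options_per_file)
        )
        path.write_text(
            f"[file{n}]\n# full-line comment\n{body}", encoding="utf-8"
        )
        paths.append(path)
    return paths


def time_read(paths, repeats=5):
    """Return (min, max) milliseconds over `repeats` calls to read()."""
    samples = []
    for _ in range(repeats):
        parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
        start = time.perf_counter()
        loaded = parser.read(paths, encoding="utf-8")
        samples.append((time.perf_counter() - start) * 1000)
        assert len(loaded) == len(paths)  # every file was parsed
    return min(samples), max(samples)


with tempfile.TemporaryDirectory() as tmp:
    low, high = time_read(make_files(tmp, count=190))
    print(f"190 files: {low:.0f} - {high:.0f} ms")
```

Running the same script against each branch's interpreter build gives ranges that can be compared side by side.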

You indeed applied the main optimizations and made the structure more logical. Now _LineParser does not pretend to be a line. Also, you combined full and inline strip.
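The combined full and inline strip refers to the commit that relies on `re.sub` for both kinds of comment prefix. A rough sketch of that idea follows; the class name `CommentStripper` and its exact patterns are assumptions made for this example and may differ from what was merged.

```python
import itertools
import re


class CommentStripper:
    """Hypothetical helper: one compiled pattern for all comment prefixes."""

    def __init__(self, full_prefixes=("#", ";"), inline_prefixes=()):
        # A full-line prefix only counts at the start of the line.
        full = (rf"^\s*{re.escape(p)}.*" for p in full_prefixes)
        # An inline prefix counts at the start or after whitespace.
        inline = (rf"(?:^|\s){re.escape(p)}.*" for p in inline_prefixes)
        self.pattern = re.compile("|".join(itertools.chain(full, inline)))

    def strip(self, text):
        """Remove any comment from a single line and trim whitespace."""
        return self.pattern.sub("", text).strip()


comments = CommentStripper(full_prefixes=("#", ";"), inline_prefixes=(";",))
assert comments.strip("key = value ; note") == "key = value"
assert comments.strip("# whole line") == ""
# '#' is not an inline prefix here, so a mid-line '#' is kept.
assert comments.strip("url = http://x#frag") == "url = http://x#frag"
```

Compiling the pattern once per parser moves the per-prefix loop out of the per-line hot path, which is the kind of saving the timings above are measuring.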

It might be better to rename the load method to something like:

| class | object |
| --- | --- |
| `CommentSpec.wrap(text)` | `comments.wrap(text)` |
| `CommentSpec.pack(text)` | `comments.pack(text)` |
| `CommentSpec.mark(text)` | `comments.mark(text)` |
| `CommentSpec.enclose(text)` | `comments.enclose(text)` |

jaraco · 2025-02-23T16:41:32Z

> It might be better to rename the load method to something like:

As I read those options, I'm struggling to choose a best one. Pick the one you like best and let's go with that and get this merged.

2trvl · 2025-02-23T23:57:52Z

Ready to merge.

…129596) --------- Co-authored-by: Jason R. Coombs <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>

perf: replace _Line with _LineParser

a3a696b

2trvl requested a review from jaraco as a code owner February 2, 2025 20:34

bedevere-app bot added the awaiting review label Feb 2, 2025

2trvl changed the title ~~Fix ConfigParser Performance Regression~~ gh-128641: Fix ConfigParser.read Performance Regression Feb 2, 2025

bedevere-app bot mentioned this pull request Feb 2, 2025

Significant Configparser Performance Regression #128641

Closed

docs: add news entry

7100857

2trvl force-pushed the configparser_perf branch from 1c339ab to 7100857 Compare February 2, 2025 21:42

eendebakpt reviewed Feb 2, 2025

2trvl added 4 commits February 3, 2025 02:20

refactor: apply requested changes

cff9f44

Merge branch 'python:main' into configparser_perf

4afca4f

docs: use ConfigParser instead of RawConfigParser

45c7a0e

Merge branch 'configparser_perf' of https://github.com/2trvl/cpython …

fa11e55

…into configparser_perf

jaraco self-assigned this Feb 9, 2025

jaraco reviewed Feb 9, 2025

refactor: apply requested changes

dc751b8

2trvl requested a review from jaraco February 13, 2025 18:43

Merge branch 'main' into configparser_perf

b005e50

jaraco added 4 commits February 15, 2025 12:43

Move comment handling into a _CommentSpec class.

a01e0a9

Rely on re.sub to perform the substitutions in a unified way across full and inline prefixes.

Remove the optimization, as it has no effect.

f74543c

Rename variable from string to text. Use 'trimmed' to avoid masking t…

a098e69

…he variable name.

Restored Line as str subclass.

218db85

Use 'slots' to avoid '__dict__'.

jaraco force-pushed the configparser_perf branch from 40b4f1c to 218db85 Compare February 15, 2025 17:43

picnixz reviewed Feb 15, 2025

jaraco and others added 2 commits February 15, 2025 13:17

Normalize whitespace in diff.

8610c71

Co-authored-by: Bénédikt Tran <[email protected]>

Merge branch 'main' into configparser_perf

f44d663

jaraco approved these changes Feb 15, 2025

bedevere-app bot removed the awaiting review label Feb 15, 2025

bedevere-app bot added the awaiting merge label Feb 15, 2025

refactor: comments wrap

1f70178

jaraco enabled auto-merge (squash) February 23, 2025 23:59

jaraco merged commit cd6abe2 into python:main Feb 24, 2025
39 checks passed

bedevere-app bot removed the awaiting merge label Feb 24, 2025

2trvl deleted the configparser_perf branch February 24, 2025 01:15
