Commit 44b48bd

gh-118761: Reduce import time of gettext.py by delaying re import
gettext is often imported in programs that may not end up translating anything. In fact, the import of the `struct` module is already delayed until GNUTranslations parses a file, to speed up the case where there are no .mo files. The re module is used in the same situation, but behind a function chain only called by GNUTranslations.

Cache the compiled regex globally the first time it is used. The finditer function can be converted to a method call on the compiled object (it always could), which is slightly more efficient and is necessary for the conditional re import.
1 parent d05140f commit 44b48bd

2 files changed, +19 -14 lines changed


Lib/gettext.py

+18 -14
@@ -70,22 +70,26 @@
 # https://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms
 # http://git.savannah.gnu.org/cgit/gettext.git/tree/gettext-runtime/intl/plural.y
 
-_token_pattern = re.compile(r"""
-        (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
-        (?P<NUMBER>[0-9]+\b)                       | # decimal integer
-        (?P<NAME>n\b)                              | # only n is allowed
-        (?P<PARENTHESIS>[()])                      |
-        (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -, <, >,
-                                                     # <=, >=, ==, !=, &&, ||,
-                                                     # ? :
-                                                     # unary and bitwise ops
-                                                     # not allowed
-        (?P<INVALID>\w+|.)                           # invalid token
-    """, re.VERBOSE|re.DOTALL)
-
+_token_pattern = None
 
 def _tokenize(plural):
-    for mo in re.finditer(_token_pattern, plural):
+    global _token_pattern
+    if _token_pattern is None:
+        import re
+        _token_pattern = re.compile(r"""
+                (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
+                (?P<NUMBER>[0-9]+\b)                       | # decimal integer
+                (?P<NAME>n\b)                              | # only n is allowed
+                (?P<PARENTHESIS>[()])                      |
+                (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -, <, >,
+                                                             # <=, >=, ==, !=, &&, ||,
+                                                             # ? :
+                                                             # unary and bitwise ops
+                                                             # not allowed
+                (?P<INVALID>\w+|.)                           # invalid token
+            """, re.VERBOSE|re.DOTALL)
+
+    for mo in _token_pattern.finditer(plural):
         kind = mo.lastgroup
         if kind == 'WHITESPACES':
             continue
@@ -0,0 +1 @@
+Reduce import time of :mod:`gettext`. Patch by Eli Schwartz.
