@@ -106,6 +106,56 @@ stream by opening a file in binary mode with buffering disabled::
The raw stream API is described in detail in the docs of :class:`RawIOBase`.

+ .. _io-text-encoding:
+
+ Text Encoding
+ -------------
+
+ The default encoding of :class:`TextIOWrapper` and :func:`open` is
+ locale-specific (:func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>`).
+
+ However, many developers forget to specify the encoding when opening text files
+ encoded in UTF-8 (e.g. JSON, TOML, Markdown) since most Unix
+ platforms use a UTF-8 locale by default. This causes bugs because the locale
+ encoding is not UTF-8 for most Windows users. For example::
+
+    # May fail on Windows when the file contains non-ASCII characters.
+    with open("README.md") as f:
+        long_description = f.read()
+
+ Additionally, while there is no concrete plan as of yet, Python may change
+ the default text file encoding to UTF-8 in the future.
+
+ Accordingly, it is highly recommended that you specify the encoding
+ explicitly when opening text files. If you want to use UTF-8, pass
+ ``encoding="utf-8"``. To use the current locale encoding,
+ ``encoding="locale"`` is supported since Python 3.10.
+
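As a minimal sketch of the recommendation above, the snippet below round-trips non-ASCII text through a temporary file while passing ``encoding="utf-8"`` on both the write and the read. The helper name ``write_then_read`` and the sample string are illustrative assumptions, not part of the ``io`` module.

```python
import os
import tempfile


def write_then_read(text, encoding="utf-8"):
    """Round-trip *text* through a temporary file with an explicit encoding.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        # Passing the encoding explicitly makes the result independent of
        # the platform's locale encoding.
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


result = write_then_read("naïve café ✓")
```

On Python 3.10 and later, passing ``encoding="locale"`` instead would request the locale encoding explicitly; in that case the sample text may not be encodable on every platform.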
+ When you need to run existing code on Windows that attempts to open
+ UTF-8 files using the default locale encoding, you can enable the UTF-8
+ mode. See :ref:`UTF-8 mode on Windows <win-utf8-mode>`.
+
+ .. _io-encoding-warning:
+
+ Opt-in EncodingWarning
+ ^^^^^^^^^^^^^^^^^^^^^^
+
+ .. versionadded:: 3.10
+    See :pep:`597` for more details.
+
+ To find where the default locale encoding is used, you can enable
+ the ``-X warn_default_encoding`` command line option or set the
+ :envvar:`PYTHONWARNDEFAULTENCODING` environment variable, which will
+ emit an :exc:`EncodingWarning` when the default encoding is used.
+
+ If you are providing an API that uses :func:`open` or
+ :class:`TextIOWrapper` and passes ``encoding=None`` as a parameter, you
+ can use :func:`text_encoding` so that callers of the API will emit an
+ :exc:`EncodingWarning` if they don't pass an ``encoding``. However,
+ please consider using UTF-8 by default (i.e. ``encoding="utf-8"``) for
+ new APIs.
+
+
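One way to try the opt-in warning is to run a snippet in a child interpreter started with ``-X warn_default_encoding`` and inspect its standard error. The sketch below assumes Python 3.10 or later for the warning to appear (older interpreters are expected to ignore the unknown ``-X`` option), and the helper name ``run_with_encoding_warning`` is hypothetical.

```python
import subprocess
import sys


def run_with_encoding_warning(code):
    """Run *code* in a child interpreter with -X warn_default_encoding.

    Hypothetical helper for illustration; returns the child's stderr text.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "warn_default_encoding", "-c", code],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
    )
    return proc.stderr


# The child opens a file in text mode without passing an encoding, which is
# the situation the warning is meant to flag.
snippet = "import os; open(os.devnull).close()"
stderr_text = run_with_encoding_warning(snippet)
```

Setting the :envvar:`PYTHONWARNDEFAULTENCODING` environment variable for the child process should have the same effect as the command line option.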
High-level Module Interface
---------------------------
@@ -143,6 +193,32 @@ High-level Module Interface
.. versionadded:: 3.8

+ .. function:: text_encoding(encoding, stacklevel=2)
+
+    This is a helper function for callables that use :func:`open` or
+    :class:`TextIOWrapper` and have an ``encoding=None`` parameter.
+
+    This function returns *encoding* if it is not ``None`` and ``"locale"`` if
+    *encoding* is ``None``.
+
+    This function emits an :class:`EncodingWarning` if
+    :data:`sys.flags.warn_default_encoding <sys.flags>` is true and *encoding*
+    is ``None``. *stacklevel* specifies where the warning is emitted.
+    For example::
+
+       def read_text(path, encoding=None):
+           encoding = io.text_encoding(encoding)  # stacklevel=2
+           with open(path, encoding=encoding) as f:
+               return f.read()
+
+    In this example, an :class:`EncodingWarning` is emitted for the caller of
+    ``read_text()``.
+
+    See :ref:`io-text-encoding` for more information.
+
+    .. versionadded:: 3.10
+
+
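The following is a runnable sketch of the ``read_text()`` pattern described above, exercised against a temporary file. It assumes a pass-through fallback named ``_text_encoding`` for interpreters older than 3.10, where ``io.text_encoding()`` does not exist; that fallback and the demo file are illustrative additions, not part of the module.

```python
import io
import os
import tempfile

# Assumption: on interpreters older than 3.10, io.text_encoding() is missing,
# so fall back to a function that returns the encoding unchanged.
_text_encoding = getattr(
    io, "text_encoding", lambda encoding, stacklevel=2: encoding
)


def read_text(path, encoding=None):
    # With the default stacklevel=2, any EncodingWarning is attributed to
    # the caller of read_text() rather than to this line.
    encoding = _text_encoding(encoding)
    with open(path, encoding=encoding) as f:
        return f.read()


# Write a small UTF-8 file, then read it back with an explicit encoding.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("héllo")
    content = read_text(demo_path, encoding="utf-8")
finally:
    os.remove(demo_path)

# An explicit encoding is returned unchanged.
explicit = _text_encoding("utf-8")
```

Note that ``encoding`` is passed to :func:`open` by keyword; passed positionally, it would be interpreted as the *mode* argument.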
.. exception:: BlockingIOError

   This is a compatibility alias for the builtin :exc:`BlockingIOError`
@@ -869,6 +945,8 @@ Text I/O
*encoding* gives the name of the encoding that the stream will be decoded or
encoded with. It defaults to
:func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>`.
+ ``encoding="locale"`` can be used to specify the current locale's encoding
+ explicitly. See :ref:`io-text-encoding` for more information.

*errors* is an optional string that specifies how encoding and decoding
errors are to be handled. Pass ``'strict'`` to raise a :exc:`ValueError`
@@ -920,6 +998,9 @@ Text I/O
locale encoding using :func:`locale.setlocale`, use the current locale
encoding instead of the user preferred encoding.

+ .. versionchanged:: 3.10
+    The *encoding* argument now supports the ``"locale"`` dummy encoding name.
+
:class:`TextIOWrapper` provides these data attributes and methods in
addition to those from :class:`TextIOBase` and :class:`IOBase`: