Commit 9871983

gh-99726: Add 'fast' argument to os.[l]stat for faster calculation
When *fast* is passed as True, only the file type bits of st_mode, st_size and st_mtime[_ns] are guaranteed to be set. Other fields may also be set for a given Python version on a given platform version, but that may change without warning when the operating system changes (Python will try to keep them stable). This first implementation uses a new Windows API that is significantly faster, provided the volume identifier is not required. Other optimizations may be added later.
1 parent 55bad19 commit 9871983
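
A minimal sketch of how a caller might use the option described in the commit message. It assumes the `fast` keyword proposed by this commit; on an interpreter whose `os.stat` does not accept it, the hypothetical helper `stat_fast` falls back to a regular call. The helper name and the temporary file are illustrative and not part of the commit.

```python
import os
import stat
import tempfile


def stat_fast(path, *, follow_symlinks=True):
    """Hypothetical helper: request a fast stat, fall back if unsupported."""
    try:
        # 'fast' is the keyword proposed by this commit; it may not exist.
        return os.stat(path, follow_symlinks=follow_symlinks, fast=True)
    except TypeError:
        # This interpreter's os.stat() has no 'fast' parameter.
        return os.stat(path, follow_symlinks=follow_symlinks)


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    sample_path = tmp.name

try:
    quick = stat_fast(sample_path)
    full = os.stat(sample_path)
    # Only these fields are described as guaranteed under fast=True.
    same_type = stat.S_IFMT(quick.st_mode) == stat.S_IFMT(full.st_mode)
    same_size = quick.st_size == full.st_size
    same_mtime = quick.st_mtime == full.st_mtime
finally:
    os.unlink(sample_path)
```

The comparison covers the same fields as the commit's `test_stat_result_fast`: the type bits, `st_size` and `st_mtime`.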

11 files changed, +375 -55 lines changed

Doc/library/os.rst

+50-3
@@ -2175,7 +2175,7 @@ features:
       Accepts a :term:`path-like object`.
 
 
-.. function:: lstat(path, *, dir_fd=None)
+.. function:: lstat(path, *, dir_fd=None, fast=False)
 
    Perform the equivalent of an :c:func:`lstat` system call on the given path.
    Similar to :func:`~os.stat`, but does not follow symbolic links. Return a
@@ -2184,8 +2184,15 @@ features:
    On platforms that do not support symbolic links, this is an alias for
    :func:`~os.stat`.
 
+   Passing *fast* as ``True`` may omit some information on some platforms
+   for the sake of performance. These omissions are not guaranteed (that is,
+   the information may be returned anyway), and may change between Python
+   releases without a deprecation period or due to operating system updates
+   without warning. See :class:`stat_result` documentation for the fields
+   that are guaranteed to be present under this option.
+
    As of Python 3.3, this is equivalent to ``os.stat(path, dir_fd=dir_fd,
-   follow_symlinks=False)``.
+   follow_symlinks=False, fast=fast)``.
 
    This function can also support :ref:`paths relative to directory descriptors
    <dir_fd>`.
@@ -2209,6 +2216,9 @@ features:
       Other kinds of reparse points are resolved by the operating system as
       for :func:`~os.stat`.
 
+   .. versionchanged:: 3.12
+      Added the *fast* parameter.
+
 
 .. function:: mkdir(path, mode=0o777, *, dir_fd=None)
 
@@ -2781,7 +2791,7 @@ features:
       for :class:`bytes` paths on Windows.
 
 
-.. function:: stat(path, *, dir_fd=None, follow_symlinks=True)
+.. function:: stat(path, *, dir_fd=None, follow_symlinks=True, fast=False)
 
    Get the status of a file or a file descriptor. Perform the equivalent of a
    :c:func:`stat` system call on the given path. *path* may be specified as
@@ -2806,6 +2816,13 @@ features:
    possible and call :func:`lstat` on the result. This does not apply to
    dangling symlinks or junction points, which will raise the usual exceptions.
 
+   Passing *fast* as ``True`` may omit some information on some platforms
+   for the sake of performance. These omissions are not guaranteed (that is,
+   the information may be returned anyway), and may change between Python
+   releases without a deprecation period or due to operating system updates
+   without warning. See :class:`stat_result` documentation for the fields
+   that are guaranteed to be present under this option.
+
    .. index:: module: stat
 
    Example::
@@ -2838,19 +2855,32 @@ features:
       returns the information for the original path as if
       ``follow_symlinks=False`` had been specified instead of raising an error.
 
+   .. versionchanged:: 3.12
+      Added the *fast* parameter.
+
 
 .. class:: stat_result
 
    Object whose attributes correspond roughly to the members of the
    :c:type:`stat` structure. It is used for the result of :func:`os.stat`,
    :func:`os.fstat` and :func:`os.lstat`.
 
+   When the *fast* argument to these functions is passed ``True``, some
+   information may be reduced or omitted. Those attributes that are
+   guaranteed to be valid, and those currently known to be omitted, are
+   marked in the documentation below. If not specified and you depend on
+   that field, explicitly pass *fast* as ``False`` to ensure it is
+   calculated.
+
    Attributes:
 
    .. attribute:: st_mode
 
       File mode: file type and file mode bits (permissions).
 
+      When *fast* is ``True``, only the file type bits are guaranteed
+      to be valid (the mode bits may be zero).
+
    .. attribute:: st_ino
 
       Platform dependent, but if non-zero, uniquely identifies the
@@ -2865,6 +2895,8 @@ features:
 
       Identifier of the device on which this file resides.
 
+      On Windows, when *fast* is ``True``, this may be zero.
+
    .. attribute:: st_nlink
 
       Number of hard links.
@@ -2883,6 +2915,8 @@ features:
       The size of a symbolic link is the length of the pathname it contains,
       without a terminating null byte.
 
+      This field is guaranteed to be filled when specifying *fast*.
+
    Timestamps:
 
    .. attribute:: st_atime
@@ -2893,6 +2927,8 @@ features:
 
       Time of most recent content modification expressed in seconds.
 
+      This field is guaranteed to be filled when specifying *fast*.
+
    .. attribute:: st_ctime
 
       Platform dependent:
@@ -2909,6 +2945,9 @@ features:
       Time of most recent content modification expressed in nanoseconds as an
       integer.
 
+      This field is guaranteed to be filled when specifying *fast*, subject
+      to the note below.
+
    .. attribute:: st_ctime_ns
 
       Platform dependent:
@@ -2998,12 +3037,16 @@ features:
       :c:func:`GetFileInformationByHandle`. See the ``FILE_ATTRIBUTE_*``
       constants in the :mod:`stat` module.
 
+      This field is guaranteed to be filled when specifying *fast*.
+
    .. attribute:: st_reparse_tag
 
       When :attr:`st_file_attributes` has the ``FILE_ATTRIBUTE_REPARSE_POINT``
       set, this field contains the tag identifying the type of reparse point.
       See the ``IO_REPARSE_TAG_*`` constants in the :mod:`stat` module.
 
+      This field is guaranteed to be filled when specifying *fast*.
+
    The standard module :mod:`stat` defines functions and constants that are
    useful for extracting information from a :c:type:`stat` structure. (On
    Windows, some items are filled with dummy values.)
@@ -3039,6 +3082,10 @@ features:
       files as :const:`S_IFCHR`, :const:`S_IFIFO` or :const:`S_IFBLK`
       as appropriate.
 
+   .. versionchanged:: 3.12
+      Added the *fast* argument and defined the minimum set of returned
+      fields.
+
 .. function:: statvfs(path)
 
    Perform a :c:func:`statvfs` system call on the given path. The return value is
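
The documentation change above advises passing *fast* as ``False`` whenever a field outside the guaranteed set is needed. A rough sketch of that decision follows. The names `GUARANTEED_WITH_FAST`, `can_use_fast` and `stat_for` are hypothetical, the `fast` keyword is the one proposed by this commit, and the sketch falls back to a plain call on interpreters that do not accept it.

```python
import os
import tempfile

# Fields the documentation diff marks as guaranteed under fast=True.
# st_mode is left out because only its file type bits are guaranteed, and the
# Windows-only st_file_attributes / st_reparse_tag are omitted for portability.
GUARANTEED_WITH_FAST = frozenset({"st_size", "st_mtime", "st_mtime_ns"})


def can_use_fast(needed_fields):
    """Return True if every needed field is guaranteed under fast=True."""
    return set(needed_fields) <= GUARANTEED_WITH_FAST


def stat_for(path, needed_fields):
    """Hypothetical helper: use fast=True only when it is safe to do so."""
    fast = can_use_fast(needed_fields)
    try:
        # 'fast' is the keyword proposed by this commit; it may not exist.
        return os.stat(path, fast=fast)
    except TypeError:
        # This interpreter's os.stat() has no 'fast' parameter.
        return os.stat(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    sample = os.path.join(tmp_dir, "data.bin")
    with open(sample, "wb") as fh:
        fh.write(b"x" * 10)
    # Only st_size is needed, so the fast path is acceptable.
    size = stat_for(sample, ["st_size"]).st_size
    # st_nlink is not in the guaranteed set, so a full stat is requested.
    links = stat_for(sample, ["st_nlink", "st_size"]).st_nlink
```

Keeping the guaranteed set in one place means a caller states which fields it reads, and the helper decides whether the reduced result is enough.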
@@ -0,0 +1,77 @@
+#ifndef Py_INTERNAL_FILEUTILS_WINDOWS_H
+#define Py_INTERNAL_FILEUTILS_WINDOWS_H
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef Py_BUILD_CORE
+# error "Py_BUILD_CORE must be defined to include this header"
+#endif
+
+#ifdef MS_WINDOWS
+
+#if !defined(NTDDI_WIN10_NI) || !(NTDDI_VERSION >= NTDDI_WIN10_NI)
+typedef struct _FILE_STAT_BASIC_INFORMATION {
+    LARGE_INTEGER FileId;
+    LARGE_INTEGER CreationTime;
+    LARGE_INTEGER LastAccessTime;
+    LARGE_INTEGER LastWriteTime;
+    LARGE_INTEGER ChangeTime;
+    LARGE_INTEGER AllocationSize;
+    LARGE_INTEGER EndOfFile;
+    ULONG FileAttributes;
+    ULONG ReparseTag;
+    ULONG NumberOfLinks;
+    ULONG DeviceType;
+    ULONG DeviceCharacteristics;
+} FILE_STAT_BASIC_INFORMATION;
+
+typedef enum _FILE_INFO_BY_NAME_CLASS {
+    FileStatByNameInfo,
+    FileStatLxByNameInfo,
+    FileCaseSensitiveByNameInfo,
+    FileStatBasicByNameInfo,
+    MaximumFileInfoByNameClass
+} FILE_INFO_BY_NAME_CLASS;
+#endif
+
+typedef BOOL (WINAPI *PGetFileInformationByName)(
+    PCWSTR FileName,
+    FILE_INFO_BY_NAME_CLASS FileInformationClass,
+    PVOID FileInfoBuffer,
+    ULONG FileInfoBufferSize
+);
+
+static inline BOOL GetFileInformationByName(
+    PCWSTR FileName,
+    FILE_INFO_BY_NAME_CLASS FileInformationClass,
+    PVOID FileInfoBuffer,
+    ULONG FileInfoBufferSize
+) {
+    static PGetFileInformationByName GetFileInformationByName = NULL;
+    static int GetFileInformationByName_init = -1;
+
+    if (GetFileInformationByName_init < 0) {
+        HMODULE hMod = LoadLibraryW(L"api-ms-win-core-file-l2-1-4");
+        GetFileInformationByName_init = 0;
+        if (hMod) {
+            GetFileInformationByName = (PGetFileInformationByName)GetProcAddress(
+                hMod, "GetFileInformationByName");
+            if (GetFileInformationByName) {
+                GetFileInformationByName_init = 1;
+            } else {
+                FreeLibrary(hMod);
+            }
+        }
+    }
+
+    if (GetFileInformationByName_init <= 0) {
+        SetLastError(ERROR_NOT_SUPPORTED);
+        return FALSE;
+    }
+    return GetFileInformationByName(FileName, FileInformationClass, FileInfoBuffer, FileInfoBufferSize);
+}
+
+#endif
+
+#endif
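
The wrapper in the header above probes for `GetFileInformationByName` once and caches the outcome in a tri-state flag. The same pattern is restated below as a Python sketch. The class name `LazyOptional` and the two loader functions are hypothetical; only the flag logic (-1 not probed, 0 unavailable, 1 available) mirrors the C code, and `NotImplementedError` stands in for `ERROR_NOT_SUPPORTED`.

```python
class LazyOptional:
    """Resolve an optional function on first use and cache the outcome."""

    def __init__(self, loader):
        self._loader = loader   # callable returning the function, or None
        self._state = -1        # -1: not probed, 0: unavailable, 1: available
        self._func = None

    def __call__(self, *args, **kwargs):
        if self._state < 0:
            # Mark as probed before trying, so a failed probe is never retried.
            self._state = 0
            func = self._loader()
            if func is not None:
                self._func = func
                self._state = 1
        if self._state <= 0:
            # Plays the role of SetLastError(ERROR_NOT_SUPPORTED); return FALSE.
            raise NotImplementedError("optional API is not available")
        return self._func(*args, **kwargs)


probes = []  # records each probe so the single-lookup behaviour is visible


def load_missing():
    probes.append("missing")
    return None


def load_present():
    probes.append("present")
    return len


missing = LazyOptional(load_missing)
present = LazyOptional(load_present)
```

Calling `present("abc")` repeatedly runs `load_present` only once, and calling `missing()` raises `NotImplementedError` every time without probing again.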

Include/internal/pycore_global_objects_fini_generated.h

+1
Some generated files are not rendered by default.

Include/internal/pycore_global_strings.h

+1
@@ -389,6 +389,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(false)
         STRUCT_FOR_ID(family)
         STRUCT_FOR_ID(fanout)
+        STRUCT_FOR_ID(fast)
         STRUCT_FOR_ID(fd)
         STRUCT_FOR_ID(fd2)
         STRUCT_FOR_ID(fdel)

Include/internal/pycore_runtime_init_generated.h

+1
Some generated files are not rendered by default.

Include/internal/pycore_unicodeobject_generated.h

+2
Some generated files are not rendered by default.

Lib/test/test_os.py

+12
@@ -613,6 +613,18 @@ def test_stat_result_pickle(self):
             unpickled = pickle.loads(p)
             self.assertEqual(result, unpickled)
 
+    def test_stat_result_fast(self):
+        # Minimum guaranteed fields when requesting incomplete info
+        result_1 = os.stat(self.fname, fast=True)
+        result_2 = os.stat(self.fname, fast=False)
+        result_3 = os.stat(self.fname)
+        self.assertEqual(stat.S_IFMT(result_1.st_mode),
+                         stat.S_IFMT(result_2.st_mode))
+        self.assertEqual(result_1.st_size, result_2.st_size)
+        self.assertEqual(result_1.st_mtime, result_2.st_mtime)
+        # Ensure the default matches fast=False
+        self.assertEqual(result_2, result_3)
+
     @unittest.skipUnless(hasattr(os, 'statvfs'), 'test needs os.statvfs()')
     def test_statvfs_attributes(self):
         result = os.statvfs(self.fname)
@@ -0,0 +1,2 @@
+Adds `fast` argument to :func:`os.stat` and :func:`os.lstat` to enable
+performance optimizations by skipping some fields in the result.
