-
Notifications
You must be signed in to change notification settings - Fork 17
Support Python 3 #263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Python 3 #263
Conversation
57afd00
to
6cd8586
Compare
ea04777
to
41d40a7
Compare
My thoughts regarding the shebang: packpack/packpack-docker-images#68 (comment). |
d7adb6e
to
8cfb910
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see that the Python 3 support is closer to the reality!
I'll look through commits one by one and will place my comments for each commit in it's own GitHub's review. I guess it'll help us to navigate over the discussion (and also will help to deliver first comments earlier).
Don't get me wrong. I will not insist on fixing every line and every comment according to my preferences. No. However I want to share preferences and thoughts during review to learn something new from each other.
01/23 Explicit imports of needed objects
So called 'star import' (the first link) is irrelevant here. import foo
is not a star import. 'Import everything from a module' is not the same as 'import a module'.
The second link refers the question, where the accepted answer does not state that this approach 'allows to keep code clean for readers'. The answer author prefers the opposite (at least in the reduced context of their discussion -- regarding sys
) and explicitly said that it is the code style question. I would treat it as that there is no general rule of thumb (and brief looking around relevant SO discussions confirms this).
I guess you personally prefer this style and it is okay. Don't hesitate to declare your vision as the reason of the change.
What regarding me... I don't remember why exactly I did this change in the draft. I don't have a personal preference, but it seems that from foo import bar
is the preferred way within the test-run's code. Improving consistency looks as enough motivation for me in the scope of this project.
One more note: I tends to mark all commits with Part of #xxx
even when they're just side effect of looking around code to fix / implement #xxx
(and it is easier to keep it within the patchset). There are two reasons:
- Git (w/o GitHub) have no any mechanism to mark a consequent set of commits as part of one patch series (I don't count merge commits, because of my preference to keep a history linear).
- I like to read stories and don't like changes, which appears from nothing. When there is a linked issue, we can guess that an author found a bad code during looking into the issue and decided to fix it. That's the nice storyline.
Too much text, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
02/23 Remove unused variable
I tested it. flake8 complains regarding this line when called from Python 3, but does not complain when called from Python 2. So it is certainly part of #20.
My flake8 versions:
$ flake8 --version # Python 2, flake8 from pip
3.8.4 (mccabe: 0.6.1, pycodestyle: 2.6.0, pyflakes: 2.2.0) CPython 2.7.18 on
Linux
$ /usr/bin/flake8 --version # Python 3, flake8 from portage
3.8.4 (mccabe: 0.6.1, pycodestyle: 2.6.0, pyflakes: 2.2.0) CPython 3.9.1 on Linux
LGTM otherwise.
03/23 python3: make print calls compatible with Python 3.x
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
04/23 python3: fix path import lib/box_connection.py
This change is unclear for me: see comments below for details.
I see that I'm co-author here, so maybe the reason is the content of my experimental patch. But this fact alone does not advocate the change anyhow. Sorry if I unintentionally directed you to a wrong way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
05/23 python3: fix import of StringIO module
I don't get why do you moved one code block (see below).
I left some details regarding all those buffer implementations, but primarily for myself -- to better understand what would be an ideal solution in a common case like our one.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
06/23 python3: fix mode format in chmod() call
The change itself is okay, but the explanation is misleading. See below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
07/24 python3: fix import of SimpleQueue
This change looks useless. See details below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
08/24 python3: fix configparser module import
There are several comments regarding this change, see details below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
09/24 python3: replace xrange() with xrange()
The change is okay (the size
is small: it is always 3
), but there is a typo in the commit title: the last xrange()
should be range()
.
If you want to keep xrange()
usage on Python 2, that's easy. I would bother in code that tends to be universal, but here it is optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10/24 python3: using six to iteritems()
Some code that work on Python 2 and Python 3 was changed. I would not change it if we'll not going to add six-like iteritems()
and iterkeys()
functions to lib/utils.py
(see below).
New six
usages were added. I would avoid this: please, consider this as my little wish. Since we can easily eliminate the dependency I hope we'll have less problems with RPM / Deb dependencies in tarantool and a few connectors and modules that also use test-run.
There are few six
usages now and everything that we need is just the following code in lib/utils.py
:
if sys.version_info[0] == 3:
string_types = str,
integer_types = int,
from shlex import quote as shlex_quote
else:
string_types = basestring,
integer_types = (int, long)
from pipes import quote as shlex_quote
Aside of this we need to change .iteritems()
and .iterkeys()
to just .items()
and .keys()
. Or we can copy the corresponding functions from six
and use them (if someone really bother about invisible performance aberrations when running test-run on Python 2). In this case the lib/utils.py
file will gain 16 SLOC instead of 8 SLOC, that's negligible.
I commented four cases with more details. The rest is the same.
6f00b29
to
e20eb77
Compare
4dce9be
to
fdf2764
Compare
Unlike Python 2, Python 3 does not treat `import bar` from a module in a package `foo` as importing of `foo.bar`. Absolute import that is relative to a test-run root (like `import lib.utils`) still work, because `sys.path` contains a script directory (after symlinks resolution) on both Python versions. https://www.python.org/dev/peps/pep-0328/ Part of #20 Co-authored-by: Alexander Turenko <[email protected]>
What's New In Python 3.0 [1]: The function attributes named func_X have been renamed to use the __X__ form, freeing up these names in the function attribute namespace for user-defined attributes. To wit, func_closure, func_code, func_defaults, func_dict, func_doc, func_globals, func_name were renamed to __closure__, __code__, __defaults__, __dict__, __doc__, __globals__, __name__, respectively. 1. https://docs.python.org/3/whatsnew/3.0.html#operators-and-special-methods Part of #20
Fix Tarantool connection liveness and box-py/call.test.py test. Part of #20 Co-authored-by: Alexander Turenko <[email protected]>
gevent 1.2 and below does not support Python 3.7 and above, see [1]. 1. gevent/gevent#1297 Part of #20
eecbd70
to
bcba5f7
Compare
I see, several result files contains characters that are not in ASCII, but in UTF-8:
Don't know how I missed it. Thanks! |
Hey, not I know how I missed this :) The fail is due to non UTF-8 locale in our images. The commit message (PR #39) states the following:
So I have several questions here:
@avtikhon Can you clarify? (Anyway, I don't mind the change in test-run.) |
In Python 3 start method 'spawn' in multiprocessing module becomes default on Mac OS [1]. The 'spawn' method causes re-execution of some code, which is already executed in the main process. At least it is seen on the lib/__init__.py code, which removes the 'var' directory. Some other code may have side effects too, it requires investigation. The method also requires object serialization that doesn't work when objects use lambdas, whose for example used in class TestSuite (lib/test_suite.py). The latter problem is easy to fix, but the former looks more fundamental. So we stick to the 'fork' method now. The new start method is available on Python 3 only: Traceback (most recent call last): File "../../test/test-run.py", line 227, in <module> multiprocessing.set_start_method('fork') AttributeError: 'module' object has no attribute 'set_start_method' 1. https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods Fixes #265 Part of #20 Co-authored-by: Alexander Turenko <[email protected]>
So I should change the commit message of the 'python3: decouple bytes and strings' commit, because it states the following now:
Since we don't lean on this property anymore, we should reflect it here. |
In Python 2 default string type (`<str>`) is a binary string, non-unicode. We receive data from a socket, from a Popen stream, from a file as a string and operate on those strings without any conversions. Python 3 draws a line here. We usually operate on unicode strings in the code (because this is the default string type, `<str>`), but receive bytes from a socket and a Popen stream. We can use unicode or binary streams for files (unicode by default[^1]). This commit decouples bytes and strings. In most cases it means that we convert data from bytes to a string after receiving from a socket / Popen stream and convert it back from a string to bytes before writting to a socket. Those operations are no-op on Python 2. So, the general rule for our APIs is to accept and return `<str>` disregarding Python version. Not `<bytes>`, not `<unicode>`. The only non-trivial change is around `FilteredStream` and writes into `sys.stdout`. The `FilteredStream` instance replaces `sys.stdout` during execution of a test, so it should follow the usual convention and accept `<str>` in the `write()` method. This is both intuitive and necessary, because `*.py` tests rely on `print('bla bla')` to write into a result file. However the stream should also accept `<bytes>`, because we have a unit test (`unit/json.test`), which produces a binary output, which does not conform UTF-8 encoding. The separate `write_bytes()` method was introduced for this sake. UnittestServer and AppServer write tests output as bytes directly, TarantoolServer rely on the usual string output. We also use bytes directly, when write from one stream to another one: in `app_server.py` for stderr (writting to a log file), in `tarantool_server.py` for log destination property (because it is the destination for Popen). [^1]: Technically it depends on a Python version and a system locale. There are situations, when default text file streams encoding is not 'utf-8'. They will be handled in the next commit. Part of #20 Co-authored-by: Sergey Bronnikov <[email protected]>
The problem: test files, result files contain UTF-8 symbols, which are out of ASCII range. Just `open(file, 'r').read()` without the encoding='utf-8' argument fails to decode them, when the default encoding for file text streams is not 'utf-8'. We meet this situation on Python 3.6.8 (provided by CentOS 7 and CentOS 8), when the POSIX locale is set (`LC_ALL=C`). The solution is described in the code comment: replace `open()` built-in function and always set `encoding='utf-8'`. That's hacky way, but it looks better than change every `open()` call across the code and don't forget to do that in all future code (and keep Python 2 compatibility in the mind). But maybe we'll revisit the approach later. There is another way to hack the `open()` behaviour that works for me on Python 3.6.8: | import _bootlocale | _bootlocale.getpreferredencoding = (lambda *args: 'utf8') However it leans on Python internals and looks less reliable than the implemented one. Part of #20
When Tarantool unit tests runs using test-run error with output like below may happen: [027] small/small_class.test [027] Test.run() received the following error: [027] Traceback (most recent call last): [027] File "/__w/tarantool/tarantool/test-run/lib/test.py", line 192, in run [027] self.execute(server) [027] File "/__w/tarantool/tarantool/test-run/lib/unittest_server.py", line 20, in execute [027] proc = Popen(execs, cwd=server.vardir, stdout=PIPE, stderr=STDOUT) [027] File "/usr/lib/python3.5/subprocess.py", line 676, in __init__ [027] restore_signals, start_new_session) [027] File "/usr/lib/python3.5/subprocess.py", line 1282, in _execute_child [027] raise child_exception_type(errno_num, err_msg) [027] FileNotFoundError: [Errno 2] No such file or directory: '../test/small/small_class.test' [027] [ fail ] The root cause of error is changed behaviour of Popen in Python 3 in comparison to Python 2. One should explicitly set path to a current work dir: Python 2 [1]: "If cwd is not None, the child’s current directory will be changed to cwd before it is executed. Note that this directory is not considered when searching the executable, so you can’t specify the program’s path relative to cwd." Python 3 [2]: "If cwd is not None, the function changes the working directory to cwd before executing the child. <...> In particular, the function looks for executable (or for the first item in args) relative to cwd if the executable path is a relative path." 1. https://docs.python.org/2/library/subprocess.html#subprocess.Popen 2. https://docs.python.org/3/library/subprocess.html#subprocess.Popen Part of #20
To make test box-py/call.test.py compatible with Python 2 and Python 3 it must have the same output when running with both versions of interpeter. With running on Python 2.7 output of called commands concluded to round brackets while on Python 3 it is not. [001] @@ -20,7 +20,7 @@ [001] - true [001] - null [001] ... [001] -call f1 () [001] +('call ', 'f1', ((),)) [001] - 'testing' [001] - 1 [001] - False To fix it print should use formatted string in print() call. Part of #20
Related issues and changes: - Support Python 3 in Tarantool test suite tarantool/tarantool#5538 - Python 3 in tarantool-python tarantool/tarantool-python#181 tarantool/tarantool-python#186 - Use Python 3 in a test infrastructure tarantool/tarantool#5652 Closes #20
It appears that the problem is not only here. We read test files in the same way. We read result files when a hung worker is detected, which is not quite usual situation, so it'll not be covered by our usual CI run in tarantool. I decided to work the problem around for all text streams uniformly. So, I reverted the change from the 'python3: decouple bytes and strings' commit and slightly changed its wording: I added one more commit to handle the problem. It is named 'python3: set text file streams encoding to utf-8', I'll push it here soon. Footnotes
|
bcba5f7
to
e93d067
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks okay for me now.
I did some testing on Python 2 and Python 3, but not on full set of our platforms. I see that you tested the patchset thoroughly, so I don't worry.
I'll think about better order of pushing relevant test-run and tarantool patches and will proceed.
Updated the test-run submodule in tarantool in 2.8.0-127-gf203bf33a (part of PR #5751). Backports to 2.7, 2.6 and 1.10 are in progress. |
2.7 in 2.7.1-116-ga71248988. |
1.10 in 1.10.9-75-g949e86b7e. |
Less dependencies we have, less effort we need to keep RPM / Deb package dependencies and building intructions for various OSes up to date. It was mostly removed in [1]: this commit removed the last occurence and drops the dependency. [1]: #263
Less dependencies we have, less effort we need to keep RPM / Deb package dependencies and building intructions for various OSes up to date. It was mostly removed in [1]: this commit removed the last occurence and drops the dependency. [1]: #263
Less dependencies we have, less effort we need to keep RPM / Deb package dependencies and building intructions for various OSes up to date. It was mostly removed in [1]: this commit removed the last occurence and drops the dependency. [1]: #263
Uncommitted changes:
bump tarantool-python version in test-run to include patches (Update tarantool-python to 0.7.1-1-g8847b8c #264)python3: make json.dumps compatible with Python 2 tarantool-python#186python3: make conversion to string deterministic tarantool-python#187Testing status
vshard, Python 2
PR to bump test-run version in vshard - tarantool/vshard#268
Tarantool, Python 3
CI (enabled Python 3 in
test-run.py
, branchligurio/python3-support
in tarantool): everything is passed except job with FreeBSD, it requires manual update of golden image tarantool/tarantool#5751Tarantool, Python 2
We want to keep compatibility with Python 2, so I have prepared a special branches in tarantool (
ligurio/python3-support-py2
) and test-run (ligurio/gh-15-support-python-3-wip-py2
) repositories to check that tests passed with Python 2. See CI status in branchligurio/python3-support-py2
in tarantool).ligurio/python3-support-py2
contains all commits fromligurio/python3-support
except those that force using python3:ligurio/gh-15-support-python-3-wip-py2
contains all commits fromligurio/gh-15-support-python-3-wip-py2
with removed commit that changes shebang to Python 3 intest-run.py
.