select.select() consuming excessive process time on Ubuntu & MacOS #234

adde1 · 2020-06-02T09:56:06Z

Hi,

I have been using Stackless for some time, but now I am stuck and need to ask for help. In short, the call to select.select() consumes excessive processing time (equal to the wall-clock time) in some scenarios. It seems to happen when the system gets busy, but I have not been able to narrow it down further than that.

The behaviour is not consistent across platforms and versions of Stackless. When I started using Stackless back around 2.7.2 I did not have this performance problem. I first ran into problems on MacOS around 2.7.9, but since I was about to finish my then-big project anyway, I just switched to working on Ubuntu. Now I get similar symptoms on Ubuntu as well.

The core loop of my project has not changed significantly since the start. I also don't know what I could have done wrong on the Python side to make select.select() behave almost as if it were implemented as a busy loop (but only in some cases).

I would like to move to Conda because for my new project I need numpy, scipy, and pygame at the same time (as well as a Fortran compiler), but with the current issues I am kind of stuck.

The behaviour I get is as follows:

Ubuntu 12, 14, 16 - Stackless built locally - Intel 2500K

Unfortunately the machine died some time ago, and I don't remember the exact Stackless version (probably 2.7.2).
Running project test suite (multi-threaded): good performance, moderate CPU load
Running "empty loop" (framework only): low CPU load
Running "zita" (pygame application + framework): good performance, moderate CPU load that disappeared when idle

Ubuntu 18 - Conda environment - Ryzen 3700

Python 2.7.16 Stackless 3.1b3 060516 |Anaconda, Inc.| (default, Mar 23 2019, 22:01:13)
[GCC 7.3.0] on linux2
Running project test suite (multi-threaded): bad performance, high CPU load
Running "empty loop" (framework only): low CPU load
Running "zita" (pygame application + framework): decent performance, high % CPU load that persists when idle

Ubuntu 18 - Stackless built locally - Ryzen 3700

Python 2.7.16 Stackless 3.1b3 060516 (default, Aug 17 2019, 14:48:39)
[GCC 7.4.0] on linux2
Running project test suite (multi-threaded): bad performance, high CPU load
Running "empty loop" (framework only): low CPU load
Running "zita" (pygame application + framework): good performance, moderate CPU load that disappears when idle

MacOS - Conda environment - Intel Core i5 (ca. 2013)

Python 2.7.15 Stackless 3.1b3 060516 |Anaconda, Inc.| (default, Oct 5 2018, 08:25:48)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Running project test suite (multi-threaded): bad performance, high CPU load
Running "empty loop" (framework only): low CPU load
Running "zita" (pygame application + framework): poor graphics performance, high CPU load that disappears when idle

MacOS - Downloaded installer - Intel Core i5 (ca. 2013)

Python 2.7.9 Stackless 3.1b3 060516 (default, Oct 22 2016, 20:25:12)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Running project test suite (multi-threaded): bad performance, high CPU load
Running "empty loop" (framework only): low CPU load
Running "zita" (pygame application + framework): poor graphics performance, high CPU load that disappears when idle

Sorry for the vague error report, but I just don't have a lot to go on. Any help will be appreciated.

Thank you in advance and best regards,

Andreas

kristjanvalur · 2020-06-04T17:05:50Z

Hi there. So, you are using the plain old select.select(), I gather, and no special Stackless features? From your description the problem seems limited to Ubuntu 18 on conda, with which I am not familiar. Why do you think this problem is peculiar to Stackless? Does regular Python show the same problem?

adde1 · 2020-06-05T09:50:13Z

Hi Kristjan,

I am using select.select() to switch between sockets (for interprocess/intermachine communication) and Stackless channels/tasklets. There is also a scheduling function so I rely on the timeout of select.select() for it to wake up. My guess is that you would find something similar at the core of any framework supporting inter process communication and concurrency.

The framework makes a fair amount of use of tasklets and cooperative scheduling, enough so that running on standard python is not an option and migrating to a thread based approach would be a fairly steep investment.

Of course I cannot rule out that the problem is in Ubuntu, but given the fundamental nature of select.select() and that I see the same issues on both MacOS and Linux I think it is a less likely source.

Similarly with Python: I was assuming that standard Python implements a fairly direct call to the underlying select() and that there should not be many sources of bugs here. But I have not looked at the Python implementation (and to be honest it is probably beyond my skills in C anyway).

A few quick questions to try to pin down the problem:

1. Does the Stackless implementation do anything special that in any way affects the select.select() call in Python?
2. Is there any other, more modern way to combine sockets with Stackless for concurrency that does not involve a call to select.select()?
3. Back in the day, I remember seeing a reference implementation of the socket module for Stackless. Is that still around, or was it incorporated into the Stackless distribution?

Thank you in advance and best regards :-)

And oh, I used to maintain a Windows dev environment too that unfortunately died some time ago. I'll see if I can resurrect that and if the problem exists on Windows or not.

kristjanvalur · 2020-06-05T10:51:01Z

select.select() is unchanged in Stackless. It basically waits for file/socket IO and wakes up when these become readable or writable. From your description, it sounds like you are using select() to wait for socket IO, and then taking these messages and sending them into channels. If your CPU time is spent in the select() call, that points to some operating system issue, possibly Ubuntu on this particular platform.

A typical Stackless loop would be something like (pseudocode):

    while true:
        # run tasklets, look at custom Stackless timers for wakeup time if idle
        wakeup_time = perform_scheduling_and_find_next_wakeup_time()
        io = wait_for_io_until(wakeup_time)  # essentially a select()/poll() call

So you need to see whether it is the wait_for_io that is keeping the CPU high, or whether your sleep time is very low, possibly even 0, maybe because some delta-time computation is not being done correctly.

In short, select.select() is not something within the control of Stackless. Either a) the select() system call is very inefficient in this configuration, or b) something is wrong in the scheduling code and your timeout is too low, causing unnecessary spinning in the loop.

Regardless of all that, you should be using poll() rather than select() if possible.

kristjanvalur · 2020-06-05T10:52:23Z

Sorry, this should have been:

    while true:
        # run tasklets, look at custom Stackless timers for wakeup time if idle
        wakeup_time = perform_scheduling_and_find_next_wakeup_time()
        # essentially a select()/poll() call; this is the "idle" point in your program
        io = wait_for_io_until(wakeup_time)
        act_on_io(io)  # send messages to tasklets, etc.
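The pseudocode above might translate into plain Python roughly as follows. This is only a sketch under stated assumptions: it uses no Stackless APIs, a heap of timers stands in for tasklet scheduling, and `run_loop`, `handle_io`, `max_iterations` and `idle_timeout` are illustrative names that do not come from this thread.

```python
import heapq
import select
import time


def run_loop(sockets, timers, handle_io, max_iterations, idle_timeout=1.0):
    """Run due timers, then wait for socket IO until the next timer is due.

    `timers` is a heap of (due_time, sequence_number, callback) tuples.
    The sequence number only breaks ties so callbacks are never compared.
    All names here are hypothetical, not part of Stackless.
    """
    for _ in range(max_iterations):
        now = time.time()
        # Stand-in for "run tasklets": fire every timer that is already due.
        while timers and timers[0][0] <= now:
            _, _, callback = heapq.heappop(timers)
            callback()
        # Work out how long the loop may sleep before the next timer.
        if timers:
            timeout = max(0.0, timers[0][0] - time.time())
        else:
            timeout = idle_timeout
        # The "idle" point: block until IO arrives or the timeout expires.
        readable, _, _ = select.select(sockets, [], [], timeout)
        for sock in readable:
            handle_io(sock)
```

If `timeout` is logged on every iteration, a value that is repeatedly 0 (or close to it) would indicate case b) above: the loop is spinning rather than sleeping.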

adde1 · 2020-06-05T11:26:59Z

Thank you Kristjan,

Thank you for the confirmation that Stackless does not modify the select.select() call!

I'll try switching to poll, and dig around a bit more.

I'll keep this issue open a little while longer and report back my findings.

Again, thank you :-)

adde1 · 2020-06-07T17:57:15Z

Hi,

I have now:

1. Tested the old code on Windows 10, with Stackless from conda. It works perfectly (like it used to on the other platforms as well).
2. Instrumented the code to confirm that there was no bug in the delta-time calculation. I even get the same high CPU load when I lock the timeout to 0.5 seconds (resulting in 2 iterations per second when there is no communication).
3. Replaced select.select() with select.poll(). It did not make any noticeable difference: the problem still persists.
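For readers following along, swapping select.select() for select.poll() could look roughly like the sketch below. The function name `wait_readable` is hypothetical and not taken from the project discussed here. The main trap in such a swap is the unit of the timeout: poll() takes milliseconds, while select() takes seconds.

```python
import select
import socket


def wait_readable(sockets, timeout_seconds):
    """Return the sockets that report an event within timeout_seconds.

    Hypothetical helper: a poll()-based stand-in for
    select.select(sockets, [], [], timeout_seconds).
    """
    poller = select.poll()
    by_fd = {}
    for sock in sockets:
        # Register interest in "data available to read" only.
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    # poll() expects milliseconds, whereas select() expects seconds.
    events = poller.poll(int(timeout_seconds * 1000))
    return [by_fd[fd] for fd, _ in events]
```

Note that select.poll() is not available on Windows, so a cross-platform loop would still need a select() fallback there.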

The core loop looks pretty much exactly like the one Kristjan describes. And it has been working for years (until recently).

The only lead I have is that with the same version of Stackless (Python 2.7.16 Stackless 3.1b3 060516) I get different results depending on whether I use the build provided by conda or the one I built locally. They were built with different compilers (GCC 7.3.0 vs. GCC 7.4.0) and perhaps with some differences in the dependencies that got linked in. But I don't know what to make of that.

If anyone has any thoughts on what to try, please let me know.

Cheers,

Andreas

kristjanvalur · 2020-06-07T18:58:02Z

What happens if you create an artificial program that just does a select with a long timeout? Will it consume CPU? You can then compare different Pythons, with and without Stackless.
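A minimal version of such a test program might look like the sketch below. It assumes a platform where socket.socketpair() is available, and the helper names `cpu_seconds` and `measure_idle_select` are made up for this example. It blocks in select() on a socket that never receives data and compares wall-clock time with the CPU time the process consumed.

```python
import os
import select
import socket
import time


def cpu_seconds():
    """Return user + system CPU time consumed by this process so far."""
    t = os.times()
    return t[0] + t[1]


def measure_idle_select(timeout):
    """Block in select() on an idle socket; return (wall, cpu) in seconds."""
    # Nothing is ever written to the pair, so select() can only time out.
    a, b = socket.socketpair()
    try:
        wall_start = time.time()
        cpu_start = cpu_seconds()
        select.select([a], [], [], timeout)
        return time.time() - wall_start, cpu_seconds() - cpu_start
    finally:
        a.close()
        b.close()
```

Calling `measure_idle_select(10.0)` under each interpreter and comparing the two numbers would show whether an idle select() on its own burns CPU. On a healthy setup the CPU figure is expected to stay close to zero.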

adde1 · 2021-10-30T01:35:04Z

Hi,

After doing a bit of other work in C, I mustered up the courage to dig into the implementation of selectmodule.c.

At least on Debian, the problem with the excessive load was solved when I commented out Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. So I believe the problem is within Python and not the operating system.
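One thing that may be worth ruling out here (a hypothesis, not something confirmed in this thread): Py_BEGIN_ALLOW_THREADS releases the GIL so that other Python threads can run while select() blocks. In a multi-threaded program, CPU time burned by those other threads during that window is charged to the same process, and removing the macros would stop them from running at all while the main thread waits. The sketch below, with hypothetical helper names, measures process CPU time while the main thread sits in select(), with and without a busy background thread.

```python
import os
import select
import socket
import threading


def cpu_seconds():
    """Return user + system CPU time consumed by this process so far."""
    t = os.times()
    return t[0] + t[1]


def cpu_while_blocked_in_select(timeout, with_busy_thread):
    """Return process CPU seconds used while the main thread is in select()."""
    stop = threading.Event()

    def spin():
        # Deliberately wasteful: stands in for a thread that never sleeps.
        while not stop.is_set():
            pass

    worker = threading.Thread(target=spin)
    if with_busy_thread:
        worker.start()
    a, b = socket.socketpair()  # idle pair, so select() can only time out
    try:
        start = cpu_seconds()
        select.select([a], [], [], timeout)
        return cpu_seconds() - start
    finally:
        stop.set()
        if with_busy_thread:
            worker.join()
        a.close()
        b.close()
```

If the busy-thread case reports CPU time close to the timeout while the quiet case stays near zero, the load is coming from another thread rather than from select() itself, and the next step would be to find out which thread is spinning.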

I have not (yet) tried to track down why Python seems to go into some kind of infinite loop when the other threads are allowed to run. I am worried this may be over my head. But we will see...

Cheers
