select.select() consuming excessive process time on Ubuntu & MacOS #234
Comments
Hi there. So, you are using plain old select.select(), I gather, and no special Stackless features? From your description the problem seems limited to Ubuntu 18 on conda, with which I am not familiar. Why do you think this problem is peculiar to Stackless? Does regular Python show the same problem? |
Hi Kristjan, I am using select.select() to switch between sockets (for interprocess/intermachine communication) and Stackless channels/tasklets. There is also a scheduling function, so I rely on the timeout of select.select() to wake it up. My guess is that you would find something similar at the core of any framework supporting interprocess communication and concurrency.
The framework makes a fair amount of use of tasklets and cooperative scheduling, enough so that running on standard Python is not an option, and migrating to a thread-based approach would be a fairly steep investment.
Of course I cannot rule out that the problem is in Ubuntu, but given the fundamental nature of select.select() and that I see the same issues on both MacOS and Linux, I think it is a less likely source.
Similarly with Python, I was assuming that standard Python implements a fairly direct call to the underlying select() system call and that there should not be many sources of bugs here. But I have also not looked at the Python implementation (and to be honest it is probably beyond my skills in C anyway).
Three quick questions to help pin down the problem:
1. Does the Stackless implementation do anything special that in any way affects the select.select() call in Python?
2. Is there a more modern way to combine sockets with Stackless for concurrency that does not involve a call to select.select()?
3. Back in the day, I remember seeing a reference implementation of the socket module for Stackless. Is that still around, or was it incorporated into the Stackless distribution?
Thank you in advance and best regards :-) And oh, I used to maintain a Windows dev environment too, which unfortunately died some time ago. I'll see if I can resurrect it and check whether the problem exists on Windows or not. |
select.select() is unchanged in Stackless. It basically waits for
file/socket IO and wakes up if these become readable/writable.
From your description, it sounds like you are using select() to wait for
socket IO, and then taking these messages and sending them into channels. If
your CPU time is spent in the select() call, it points to some operating system
issue, possibly Ubuntu on this particular platform.
A typical Stackless loop would be something like (pseudocode):
while True:
    # run tasklets, look at custom Stackless timers for wakeup time if idle
    wakeup_time = perform_scheduling_and_find_next_wakeup_time()
    io = wait_for_io_until(wakeup_time)  # essentially a select()/poll() call
So, you need to see whether it is the wait_for_io that is causing the CPU to
remain high, or whether your sleep time is very low, possibly even 0,
maybe because some delta-time computation is not being done correctly.
In short, select.select() is not something within the control of Stackless.
Either a) the select() system call is very inefficient in this configuration or
b) something is wrong in the scheduling code and your timeout is too low,
causing unnecessary spinning in the loop.
Regardless of all that, you should be using poll() rather than select() if
possible.
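For reference, a minimal sketch of what waiting with poll() instead of select() could look like, using only the standard select module. The helper name wait_readable is made up for illustration and is not part of Stackless or of any existing framework. Note that poll() takes its timeout in milliseconds, so the sketch rounds up to avoid turning a sub-millisecond timeout into 0 (which would not block at all).

```python
import math
import select
import socket


def wait_readable(socks, timeout_seconds):
    """Return the sockets from socks that become readable within the timeout."""
    poller = select.poll()  # not available on every platform (e.g. Windows)
    by_fd = {}
    for sock in socks:
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    # poll() expects milliseconds, whereas select() expects seconds.
    # Round up so that a small positive timeout never becomes 0.
    timeout_ms = int(math.ceil(timeout_seconds * 1000))
    events = poller.poll(timeout_ms)
    return [by_fd[fd] for fd, flags in events if flags & select.POLLIN]
```

With an idle socket this should simply block for the timeout and return an empty list.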
|
Sorry, this should have been:
while True:
    # run tasklets, look at custom Stackless timers for wakeup time if idle
    wakeup_time = perform_scheduling_and_find_next_wakeup_time()
    # essentially a select()/poll() call; this is the "idle" point in your program
    io = wait_for_io_until(wakeup_time)
    act_on_io(io)  # send messages to tasklets, etc.
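The loop above can be sketched in runnable form with plain Python and no Stackless at all, which may help when checking the timeout handling in isolation. Everything here is an assumption made for illustration: run_loop, the (deadline, sequence, callback) timer tuples, and the idle_timeout default stand in for whatever the real framework does, and plain callbacks stand in for tasklets.

```python
import heapq
import select
import socket
import time


def run_loop(sock, timers, handle_io, max_iterations, idle_timeout=0.5):
    """Run due timers, then idle in select() until IO or the next deadline.

    timers is a heap of (deadline, sequence, callback) tuples; the sequence
    number only breaks ties so that callbacks are never compared.
    """
    for _ in range(max_iterations):
        # Scheduling step: run every timer whose deadline has passed.
        now = time.time()
        while timers and timers[0][0] <= now:
            _, _, callback = heapq.heappop(timers)
            callback()
        # Time until the next timer, clamped so it is never negative.
        # If this were stuck at 0, the loop would spin instead of idling.
        if timers:
            timeout = max(0.0, timers[0][0] - time.time())
        else:
            timeout = idle_timeout
        # Idle point: block until the socket is readable or the timeout expires.
        readable, _, _ = select.select([sock], [], [], timeout)
        for ready in readable:
            handle_io(ready)
```

Logging the computed timeout on each iteration is one way to see whether the loop is spinning because of the scheduling code or because the wait itself returns early.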
|
Thank you, Kristjan, for confirming that Stackless does not modify the select.select() call! I'll try switching to poll() and dig around a bit more. I'll keep this issue open a little while longer and report back my findings. Again, thank you :-) |
Hi, I have now:
1. Tested the old code on Windows 10, with Stackless from conda. It works perfectly (as it used to on the other platforms as well).
2. Instrumented the code to verify that there is no bug in the delta-time calculation. I even get the same high CPU load when I lock the timeout to 0.5 seconds (resulting in 2 iterations per second when there is no communication).
3. Tested replacing select.select() with select.poll(). It did not make any noticeable difference: the problem still persists.
The core loop looks pretty much exactly like Kristjan describes. And it has been working for years (until recently).
The only lead I have is that with the same version of Stackless (Python 2.7.16 Stackless 3.1b3 060516) I get different results depending on whether I use the build provided by conda or the build I built locally. They were built with different compilers (GCC 7.3.0 vs. GCC 7.4.0) and perhaps with some differences in the dependencies that got linked in. But I don't know what to make of that.
If anyone has any thoughts on what to try, please let me know.
Cheers, Andreas |
What happens if you create an artificial program that just does a select()
with a long timeout? Will it consume CPU? You can then compare different
Pythons with and without Stackless.
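A sketch of such an artificial test program, assuming nothing beyond the standard library so that it can be run unchanged under each interpreter being compared. The function name measure_idle_select is made up for this example. It blocks in select() on a socket that never receives data and reports wall-clock time next to the CPU time the process used meanwhile.

```python
import os
import select
import socket
import time


def measure_idle_select(timeout_seconds):
    """Block in select() on an idle socket; return (wall, cpu) in seconds."""
    a, b = socket.socketpair()  # nothing is ever written, so select() times out
    try:
        cpu_before = os.times()
        wall_start = time.time()
        select.select([a], [], [], timeout_seconds)
        wall = time.time() - wall_start
        cpu_after = os.times()
    finally:
        a.close()
        b.close()
    # Index 0 is user CPU time, index 1 is system CPU time for this process.
    cpu = (cpu_after[0] - cpu_before[0]) + (cpu_after[1] - cpu_before[1])
    return wall, cpu
```

Calling measure_idle_select(10.0) and printing the result should show a CPU time close to zero on a healthy build; a CPU time close to the wall time would reproduce the reported symptom outside the framework.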
|
Hi, After doing a bit of other work in C, I mustered up the courage to dig into the implementation of selectmodule.c. At least on Debian, the problem with the excessive load was solved when I commented out Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. So I believe the problem is within Python and not the operating system. I have not (yet) tried to track down why Python seems to go into some kind of infinite loop when other threads are allowed to run. I am worried this may be over my head. But we will see... Cheers
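Since Py_BEGIN_ALLOW_THREADS releases the GIL around the blocking call, one thing that might be worth checking (this is only a guess at a next diagnostic step, not a confirmed cause) is whether any other threads are alive in the process while select() is waiting. A minimal sketch with a made-up helper name, live_thread_names; threads created inside C extensions may not show up here.

```python
import threading


def live_thread_names():
    """Return the names of all threads the threading module knows about."""
    return [thread.name for thread in threading.enumerate()]
```

Printing this list just before the select() call in the core loop would show whether an imported library has started background threads.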
Hi,
I have been using Stackless for some time, but now I am stuck and need to ask for help. In short, the call to select.select() consumes excessive processing time (equal to the wall-clock time) in some scenarios. It seems to happen when the system gets busy, but I have not been able to narrow it down further than that.
The behaviour is not consistent across platforms and versions of Stackless. When I started using Stackless back around 2.7.2 I did not have this performance problem. I first got problems on MacOS around 2.7.9, but since I was about to finish up my then big project anyway, I just switched to working on Ubuntu. But now I get similar symptoms on Ubuntu as well.
The core loop of my project has not changed significantly since the start. I also don't know what I could have done wrong on the Python side to make select.select() behave almost as if it were implemented with a busy loop (but only in some cases).
I would like to move to Conda because for my new project I need numpy, scipy, and pygame at the same time (as well as a Fortran compiler), but with the current issues I am kind of stuck.
The behaviour I get is as follows:
Ubuntu 12, 14, 16 - Stackless built locally - Intel 2500K
Ubuntu 18 - Conda environment - Ryzen 3700
[GCC 7.3.0] on linux2
Ubuntu 18 - Stackless built locally - Ryzen 3700
[GCC 7.4.0] on linux2
MacOS - Conda environment - Intel Core i5 (ca. 2013)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
MacOS - Downloaded installer - Intel Core i5 (ca. 2013)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Sorry for the vague error report, but I just don't have a lot to go on. Any help will be appreciated.
Thank you in advance and best regards,
Andreas