-
Notifications
You must be signed in to change notification settings - Fork 1k
Local Echo Broadcast with Async Clients #1744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Local Echo Broadcast with Async Clients #1744
Conversation
janiversen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems you have missed out that local echo must work for both client and server, since transport is used in both.
|
I do not understand your racing condition...there are no "next transaction" before a response is received from the first transaction. "local echo" does not involve any serial/network communication but are purely at device driver level in the local computer, so there are no delay in receiving the data. Are you basing this PR on a real life problem ? if so please open an issue, with the debug logs, Your PR in general is well done, so I am sorry be a bit negative. |
janiversen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test!
When sending a broadcast that will result in an echo, the current code sets the send_buffer to what was previously sent. The broadcast packet does not wait for any response and returns immediately after the packet is sent. The user code continues while the packet is transmitting on the line, often before the local echo is fully received. If the user code sends another packet immediately after the broadcast packet, it will overwrite send_buffer with the new packet before the echo was received. When the echo of the broadcast packet is then received it is not seen as an echo because send_buffer contains the contents of the next packet. This results in communication failures seeing there's unexpected data on the bus.
I'm working with hardware that uses an MAX13487E with auto direction control to a half-duplex RS485 bus. The auto direction control is necessary seeing the microprocessor cannot control the direction quick enough to reliably receive the other devices' responses. Auto direction control automatically controls whether the transmitter is enabled but it doesn't do anything for the receiver, and thus everything sent on the bus will be received by the hardware. In this case then, the local echo is at the hardware level, cannot be disabled and is delayed for as long as it takes to send the buffer and receive it back through the kernel. Assuming that a local echo is always instant is wrong.
Yes, this is a real life problem. I'm not able to capture the issue with debug logs enabled seeing it changes the timing: it adds enough delay that the local echo is received before the next packet is sent. Sorry, I should have included more context in the PR. |
|
Broadcast do not exist in modbus, especially not in rs485 communication, you can set slave=0 meaning address all devices on the line. The client still looks for a response and waits until timeout, at least when you use the mixin methods. If you send multiple requests on the serial line without giving the devices time to respond (go from listen mode to write mode), and that breaks the modbus protocol. Part of the PR makes sense to me, especially since the added test shows a clear bug, but the future is not acceptable nor that transport logic goes into client base. The transport class is to be seen as a closed box with a few methods and a few callbacks, and it handles everything relevant to to the transport layer. You could e.g. add to send_buffer instead of overwriting it since everything will be echoed. |
|
Just made a test with client_async_calls, change SLAVE=0x00 (broadcast as you call it) and ran it on a serial line. All requests waited for a response....could the problem be, you forgotten to await the client call: "await client.read_.....", that would allow your code to continue, but break a lot of things in pymodbus. |
Broadcasts do exist in serial modbus:
(From MODBUS over serial line specification and implementation guide V1.02 -- page 7. Broadcasts are also referenced in other specifications.) In fact pymodbus supports this behaviour -- it is just optional. You must specify broadcast_enable=True when creating the client. This works, and does not wait for the response in the current pymodbus code. You can see this in the code I'm changing in async_execute(), it simply returns the text "Broadcast write sent - no response expected" and does not wait for any response. This implementation just has this bug I'm trying to solve when it is combined with handle_local_echo=True. I suspect the test you tried did not set broadcaste_enable in the client. Then you would see what you are seeing. I can tell you that I am definitely await-ing the calls correctly. Broadcast support on the server side is different. In order for a server to support broadcasts it has to handle requests with a slave ID of zero and not send the response for such requests. There is a similar broadcast_enable flag in the server context and is implemented here. The server side does not have an equivalent of the problem I'm trying to solve with handle_local_echo seeing it doesn't actually send anything for broadcast packets. If you prefer the approach of appending to send_buffer, I can try that. However I won't get to it until next week. |
|
you are right, and I stand corrected....however your idea about the server is a bit off, remember a server can sit on a RS485 with many other devices so it is quite a lot more complicated. Anyhow, my main point is, Keep transport clean and do not spread transport issues into the upper levels. We have worked hard the last few releases to clean especially the client code ! When you have it ready, then add me as reviewer. |
|
Just passed over transport, the solution is quite simple. All you need to do is in transport_send check if send_buffer is empty and wait if not. |
Ok, that sounds reasonable. Just a quick question: As it is now, transport_send() does not have any knowledge of whether it is a broadcast request or not. If it doesn't know it's a broadcast then we are adding an additional wait on every request which is unnecessary and reduces efficiency. Would you be ok with passing a broadcast flag to this function which adds the wait in when handle_local_echo is also true? |
|
It should not know about broadcast the is message layer. It only needs to wait if send_buffer contains something, for non broadcast that will not be the case so no problem |
send_buffer will be set for all messages when handle_local_echo is true, not just broadcast messages. In 99% of the cases it does not need to wait for the local echo seeing the code will eventually be waiting for a response which includes that. The wait is only required when broadcast packets are sent and handle_local_echo is true. send_buffer will still be set for non-broadcast cases. In my opinion adding a wait for all packets is just adding inefficiencies to common use cases when it's only really needed in the niche case. So I think it's worthwhile passing a broadcast flag to transport_send() or finding another way. |
|
you forget in all other cases, there are no extra send until data is received which will clear the send_buffer so there are no extra wait. But it's your PR so feel free to do what you want, I will review what you submit. To me it is important to keep the layers separated, and broadcast relate to message handling not transport, |
|
We are getting close to release time, so if you want this in please solve the outstanding issue, and request a new review. |
The local echo code lookes for the local echo inside a received datagram. However, this can be racy, especially with serial ports. It's always the case that the local echo should start the transaction, but it's possible for a partial echo to arrive. Handle partial echos by dropping a part of the sent_buffer and print an error if the local echo is not actually received when it is expected.
Test that a local echo received over two datagrams is handled correctly.
When using broadcast packets with the local echo, the echo is not waited for. If transport_send() gets called again before the echo is received then `sent_buffer` is overwritten and the echo that might be coming from the broadcast packet will be forgotten about. To fix this, always append the expected echo to `sent_buffer`. Thus if there is an existing echo that has not been received yet it is not forgotten about and subsequent receives will do the right thing.
Broadcast packets do not wait for a response and thus any local echo will be stuck in the recieve buffer. Test to ensure a subsequent packet can be sent after the broadcast and that the local echo is handled correctly. Three tests are done: one where the broadcast echo and the subsequent message echo are returned in two calls to datagram_recieved(), one where the entire response is in one call, and one where the echo for the message packet is split over two datagram_received() calls.
70883ee to
837750d
Compare
|
I attempted solving the problem in the way that we discussed, however, it was not possible. transport_send() is not an async function so awaiting on a future is not possible in that function. transport_send() also called in places that were not async functions so it was not possible to easily convert it to one. (The classic function colouring problem.) Instead, I've pushed a fix using the other solution I alluded to: append to |
janiversen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
Looking forward to see more PRs from you, this one was not an easy one and we are always looking for a helping hand.
|
Thanks! |
Using broadcast messages with local echo almost works with asynchronous clients but it is racy.
If the echo for a broadcast comes in after the next transaction has been sent, then the sent_buffer gets overridden and the echo of the broadcast is forgotten. This corrupts the data stream.
This PR has a couple commits to make it more robust: for one allow the echo to be split along multiple packets, then ensure a broadcast waits for the echo before returning from async_execute.
The last two commits add tests for both these issues.