Skip to content

ssh: Use paramiko instead of pxssh #450

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 6, 2020

Conversation

douglas-raillard-arm
Copy link
Collaborator

Here is an crude port of SshConnection to paramiko, so pxssh can rest in peace.

Quote from a pexpect maintainer in 2014 on DFE/MONK#108:

As a maintainer of pexpect, I highly recommend using paramiko instead of pexpect for ssh :) pexpect's ssh support is from long ago where no native-ssh option existed, it only remains for compatibility.
With paramiko, remote programs will not detect you as a terminal and will not likely write terminal sequences.

Note: it might be a better idea to just introduce a new class rather than ruthlessly modifying SshConnection, since TelnetConnection relies on pexpect and the recent addition of options parameter will not allow easily moving away from an actual ssh/scp command. (paramiko seems to have an OpenSSH config emulation, maybe that would allow that to work ...)

Note2: Some APIs that were not properly abstracted are also somewhat broken, like background() method that leaked a Popen object, rather than just giving a handle on stdin/stdout/stderr like paramiko does.

All in all, it took 2h to go from scratch to something that seem to work well enough for LISA, which is less than 10% of the time spent chasing TTY size issues (worked around but not really fixed AFAIR) and control sequences issues (#442). It also reduces by 2 the mount of code in SshConnection, mostly removing cruft related to interactive terminal management:

 2 files changed, 79 insertions(+), 159 deletions(-)

@douglas-raillard-arm
Copy link
Collaborator Author

A quick timeit benchmark on target.execute('hello world') on Juno R0 with SSH:

  • with pxssh (current devlib): 161 ms ± 456 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
  • with paramiko: 36.4 ms ± 4.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

That's 4.4 times faster with paramiko. This difference in speed allows a LISA test to go from 2min 30s to 1min 40s execution time.

@setrofim
Copy link
Collaborator

setrofim commented Dec 2, 2019

Looks good. Thanks for doing this!

Given the nature of the change, this will need to be thoroughly tested across a range of devices and applications before we merge.

@douglas-raillard-arm
Copy link
Collaborator Author

douglas-raillard-arm commented Dec 2, 2019

Of course, I'm currently testing the following setups:

  • debian + arm 64 + password auth
  • debian + arm + password auth
  • buildroot (OpenSSH_7.8p1) + arm 64 + password auth
  • buildroot (OpenSSH_7.8p1) + arm + password auth
  • buildroot (OpenSSH_7.8p1) + arm 64 + key auth
  • android (adb), which should not be impacted

Expected compat breakages:

  • usage of return value of target.background(): It's now a tuple stdin/stdout/stderr but currently is most of the time popen.Popen. After looking around the returned object already differed between Gem5Connection and SshConnection.
  • Users of the newly introduced options __init__ parameter.
  • Users expecting an interactive shell (unlikely but who knows ...). Paramiko has a exec_command(..., get_pty=False) parameters that might allow that. I would advise against doing that by default, since that's very unlikely to be needed and makes the user's life miserable in most cases.
  • cancel_running_command() is gone. It seemed to only be used in _execute_and_wait_for_prompt() to implement timeouts, and I cannot really think of another use case. If we want this kind of feature back for background(), it should be implemented by taking a handler on the specific thing returned by background, rather than some implicit ambiguous state of the target. Otherwise spawning 2 backgrounds command will only allow killing the last one, and spawning any non-background command in between will prevent from killing anything ...

TODO:

  • target.background() needs to allow the user to track the return code of the command, which can be done with the stdout object of paramiko but would need to be abstracted away.
  • see what can be done for options. Paramiko ships an OpenSSH config file parser, but it's up to the user to make use of the key/values. This means we need to know what people will use that for and translate it to parameters to SSHClient.connect(). If it's not too late, it might be better to remove options and replace it with specific supported options that can be implemented for both pxssh and paramiko.
  • Check that the exceptions I'm catching are the relevant ones. Paramiko has its own classes, but can also raise exceptions from the std socket module like socket.timeout.

Opened questions:

  • The current PR will raise an exception if sudo needs to be used when the user only authenticated with a keyfile, since we end up not having any password to feed to sudo. I'm not sure what's the current expected behavior in this case.
  • What was process_backspaces() designed for ? If it was only to cope with control sequences, we can safely remove it as I did, since we got a non-interactive shell. If it was there for other reasons, we might want to keep it.

@setrofim
Copy link
Collaborator

setrofim commented Dec 2, 2019

The current PR will raise an exception if sudo needs to be used when the user only authenticated with a keyfile, since we end up not having any password to feed to sudo. I'm not sure what's the current expected behavior in this case.

That is probably fine.

What was process_backspaces() designed for ? If it was only to cope with control sequences, we can safely remove it as I did, since we got a non-interactive shell. If it was there for other reasons, we might want to keep it.

Yeah, it's to deal with TTY weirdness that seemed to leave ^H characters in the stream on some devices. I'm happy to get rid of it on the assumption that paramiko won't present the same issues, and re-introduce this later if it becomes necessary.

Copy link
Collaborator

@marcbonnici marcbonnici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this seems like a good option.

Regarding the SSH options features @mpekatsoula would you be able to provide us with some of your use cases to see how they could be mapped onto this?

source = '{}@{}:{}'.format(self.username, self.host, source)
return self._scp(source, dest, timeout)
sftp = self._get_sftp(timeout)
sftp.get(source, dest)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The paramiko sftp implementation doesn't seem to be able to deal with pushing/pulling directories directly so we might need to add support for this manually for convenience.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the destination already exists when pushing a folder, should it be wiped or should the new content overlayed on the existing folder ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overlaying would make sense, this would at least preserve the existing behaviour of scp.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, that's also easier to implement :)

@douglas-raillard-arm
Copy link
Collaborator Author

Updated with:

  • fixed push/pull to work with folders (currently, pushing over an existing folder will write into it rather than wiping it first).
  • Fixed locking by using a threading.RLock rather than a threading.Lock, which allows the lock to be reentrant so we can call execute() from anywhere, including to compute properties potentially used by execute() itself.

@mpekatsoula
Copy link
Contributor

Regarding the SSH options features @mpekatsoula would you be able to provide us with some of your use cases to see how they could be mapped onto this?

@marcbonnici
We use these options when connecting to a device:
-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null

@setrofim
Copy link
Collaborator

setrofim commented Dec 3, 2019

-o StrictHostKeyChecking=no

This can be achieved by not loading the host keys (currently done here: https://github.com/ARM-software/devlib/pull/450/files#diff-d0a4376874b01b42b478cb7119010811R206)

-o UserKnownHostsFile=/dev/null

This handled by the AutoAddPolicy set here: https://github.com/ARM-software/devlib/pull/450/files#diff-d0a4376874b01b42b478cb7119010811R207

@douglas-raillard-arm
Copy link
Collaborator Author

Updating from the offline chat with @setrofim and @marcbonnici about background():

TODO:

  1. Implement BackgroundCommand() to abstract over popen.Popen() and whatever paramiko gives to control background commands. It should:
    • give a file-like stdout object (containing merged stdout and stderr)
    • give a file-like stdin object
    • give a way to send arbitrary POSIX signal
    • give a cancel() method with the usual behavior (sigint, then sigkill after a short timeout)
    • give a way to wait for completion and return exite code

1.1) Use BackgroundCommand.cancel() to implement SshConnection.cancel_running_command()

  1. Check that the exceptions I'm catching are the relevant ones. Paramiko has its own classes, but can also raise exceptions from the std socket module like socket.timeout.

  2. Add an SshConnection parameter equivalent to:

    -o UserKnownHostsFile=/dev/null

From @setrofim : This handled by the AutoAddPolicy set here: https://github.com/ARM-software/devlib/pull/450/files#diff-d0a4376874b01b42b478cb7119010811R207

  1. (optional) see if we can remove the connection locking when executing background() if that's reasonably straightforward to do.

This leaves us with the following user-visible change:

  • options either removed and replaced with ad-hoc parameters. That can be avoided by only using UserKnownHostsFile key and raising exception when any other option is found, but this means we will have to exactly match what ssh client from OpenSSH gives, otherwise users might (rightfully) complain. If paramiko cannot reproduce the exact behavior, we might run into issues.

  • Command will now be ran without a controlling terminal. Do we want to try to support get_pty=True ?

@douglas-raillard-arm
Copy link
Collaborator Author

background() question: Paramiko does not seem to allow redirecting stdout/stderr. This means we cannot currently honour these parameters of background() unless we spawn a thread that will read the outputs and re-write them into what the user provides. Should we go the side-thread way or should we just drop stdout and stderr parameters from background() ?

That's a bit of a tricky question since that means that the user will have to read stdout and stderr, otherwise the command might get blocked until the user does so, leading to a timeout triggering.

@setrofim
Copy link
Collaborator

setrofim commented Dec 5, 2019

Getting rid of them is not an option. Can Paramiko be monkey-patched to allow redirection? If not, then go with the thread. From the user's perspective, the behaviour of background() should not change from the existing implementation.

@setrofim
Copy link
Collaborator

setrofim commented Dec 5, 2019

OK, having had a brief look through Paramiko, monkey-patching isn't going to work, and given its implementation, subprocess.PIPEs don't really make much sense. However the blocking on read is potentially a problem, so having a thread drain the underlying channel into a local buffer and then presenting that as a stream to the calling code might be the way to go.

@douglas-raillard-arm
Copy link
Collaborator Author

After experimenting a bit, it appears that with the current default stdout=subprocess.PIPE already forces the user to consume the output. If that's not done, the command will block if it outputs an implementation-defined amount of data, so all users must consume the output. Since all the APIs I've seen so far force consumption of stdout/stderr, I would say it's reasonable to replicate that in devlib.

That would mean we only need to spawn a thread if someone gives one of these:

* DEVNULL (== read and discard)
* an existing file descriptor (a positive integer)
* an existing file object
* None (== sys.stdout or sys.stderr)

(list taken from https://docs.python.org/3/library/subprocess.html#subprocess.Popen)

@douglas-raillard-arm
Copy link
Collaborator Author

douglas-raillard-arm commented Dec 5, 2019

Updated with BackgroundCommand() abstract base class, mirroring a subset of subprocess.Popen API. It's implemented on top of both subprocess.Popen and paramiko objects.

remaining TODOs:
0) Implement custom stream redirection in SshConnection.background() with a companion thread to copy data from paramiko into another file-like object.
=> Done

  1. Use BackgroundCommand.cancel() to implement SshConnection.cancel_running_command()
    => Done
  2. Check that the exceptions I'm catching are the relevant ones. Paramiko has its own classes, but can also raise exceptions from the std socket module like socket.timeout.
    => Done
  3. Add option to implement -o UserKnownHostsFile=/dev/null
    => Done (implemented as SshConnectionBase(strict_host_check=True))
  4. (optional) see if we can remove the connection locking when executing background() if that's reasonably straightforward to do.
    => Done
  5. Make an AdbBackgroundCommand() that is able to send signals to the remote command, so it can be killed properly
    => Done
  6. Make sure that the command dies if the SSH connection is broken (not sure what will happen by default since we don't have a controlling terminal).
    => Does not seem to be doable. We can cancel the current background command on connection close() though, and if the channel is closed the command will get SIGPIPE if it tries to write on stdout/stderr. It won't receive a SIGHUP though, since there is no controlling terminal.
  7. Implement SshConnection.background(as_root=True)
    => Done

@douglas-raillard-arm douglas-raillard-arm force-pushed the _home_pr1 branch 4 times, most recently from d4611ca to 7e0b08b Compare December 12, 2019 15:00
@douglas-raillard-arm
Copy link
Collaborator Author

douglas-raillard-arm commented Dec 12, 2019

@marcbonnici @setrofim The PR should now be finished feature-wise and somewhat ready for testing. It will need some polishing for review (I'll try to break down the big commit in smaller ones), but the code itself can now be expected to work.

The main changes are:

  1. BackgroundCommand() offering a subprocess.Popen-like interface, and is expected to behave as such. Differences between an actual Popen should be looked into if they are fixable. Main differences are:
  • Actions done on the BackgroundCommand() are "transparent" so that they impact the remote command, not whatever local proxy command in use (ex adb shell).
  • sending a signal sends it to the whole process group rather than just one process. That makes the code work regardless of the nesting of sh/sudo calls, non-blocking & execution etc.
  • Closing stdout on an SSH remote command will not send SIGPIPE to the remote command when it tries to read it until stderr is closed as well. (if it was redirected to subprocess.DEVNULL it's considered closed). This is because of a limitation of SSH which uses one channel multiplexing stdin/stdout/stderr, and only closing that channel ends up in closing the writer end of the pipe of the remote process.
  1. options is gone, replaced by strict_host_check=True. When True, any unknown server will be rejected, when False unknown servers will be added to known_hosts.

  2. Any number of concurrent target.background() commands can be started at the same time. target.cancel_running_command() will call BackgroundCommand.cancel() on all the ones that are still alive, to send them SIGTERM then SIGKILL.

  3. Commands ran with SSH don't have a controlling terminal, so interactive programs won't work. sudo -S is used to read the password from stdin instead. Output lines will not wrap anymore and most commands will stop sending colored output.

Performance-wise, an target.execute('echo hello world') now takes ~30ms roundtrip compared to ~160ms with pxssh.

@douglas-raillard-arm douglas-raillard-arm changed the title [RFC] ssh: Use paramiko instead of pxssh ssh: Use paramiko instead of pxssh Jan 15, 2020
@douglas-raillard-arm
Copy link
Collaborator Author

@marcbonnici @setrofim I've split the PR into more contained commits. The last one is pretty big but there isn't really any alternative since it's almost all new code.

It should now be ready for review proper and testing so I removed the [RFC] tag on the PR name.

Copy link
Collaborator

@marcbonnici marcbonnici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some findings from initial testing with a chromebook.

@douglas-raillard-arm
Copy link
Collaborator Author

Hi @setrofim @marcbonnici, should I squash the WIP patches into the ones they are fixing up to prepare the final version of the branch ?

@setrofim
Copy link
Collaborator

setrofim commented Mar 5, 2020

Yes, please. I think we've gotten all the feedback we're going to for this; and I don't think anything more would need to be changed.

Use the timeout parameter added in Python 3.3, which removes the need for the
timer thread and avoids some weird issues in preexec_fn, as it's now documented
to sometimes not work when threads are involved.
Similar to a regular property(), with the following differences:
* Values are memoized and are threadlocal

* The value returned by the property needs to be called (like a weakref) to get
  the actual value. This level of indirection is needed to allow methods to be
  implemented in the proxy object without clashing with the value's methods.

* If the above is too annoying, a "sub property" can be created with the regular
  property() behavior (and therefore without the additional methods) using
  tls_property.basic_property .
This frees the connection to have to handle threading issues, since each thread
using the Target will have its own connection. The connection will be garbage
collected when the thread using it dies, avoiding connection leaks.
Update a command line to redirect standard streams as specified using the
parameters. This helper allows honoring streams specified in the same way as
subprocess.Popen, by doing it as much using shell redirections as possible.
* Unify the behavior of background commands in connections.BackgroundCommand().
  This implements a subset of subprocess.Popen class, with a unified behavior
  across all connection types

* Implement the SSH connection using paramiko rather than pxssh.
Check that executing the most basic command works without troubles or stderr
content. If that's not the case, raise a TargetStableError.
@douglas-raillard-arm
Copy link
Collaborator Author

@setrofim the branch is updated

@marcbonnici marcbonnici merged commit 7e682ed into ARM-software:master Mar 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants