Skip to content

Conversation

tanishqjasoria
Copy link

This aims to add support for hardware security modules, to generate signatures using keys stored in the modules.

Fixes issue #: tuf/issues/864

Description of the changes being introduced by the pull request:

  1. Add PKCS11LIB to securesystemslib/settings.py. It servers as the PATH of PKCS11 library for
    cryptographic tokens
  2. Add hsm.py, to support hardware security modules through the PKCS#11 standard. This module use
    PyKCS11, a python wrapper (SWIG) for PKCS#11 modules to communicate with the cryptographic
    tokens using CCID/PIV interface
  3. Add hsm_keys.py, to provide a high-level API for using hardware security modules for various
    cryptographic operations

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@lukpueh lukpueh self-requested a review June 19, 2019 15:51
@JustinCappos
Copy link

JustinCappos commented Jun 19, 2019 via email

Copy link
Contributor

@mnm678 mnm678 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few suggestions inline. In addition, could you add some tests for this? That will make the Travis build pass as well.

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, @tanishqjasoria, this is impressive work! :) Don't let the many review comments discourage you, most of them should be very easy to fix.

Please make sure to add tests all of your new functionality. I also think we should make PyKCS11 optional, similar to cryptography and pynacl (see extras_require in setup.py), since we want to provide a minimal python-only securesystemslib.

I'm hesitant whether it's a good idea to add libsofthsm2.so to the repo, since it isn't cross-platform. I'd rather have pointers to resources that explain how to install those native dependencies in our docs, i.e.:

We can also explain how to then use securesystemslib.settings.PKCS11LIB or the PYKCS11LIB envvar to point to those dependencies.

We also need to tell Travis to install swig and SoftHSMv2 in order to run the tests.

Let me know if any of my comments are unclear.

@tanishqjasoria tanishqjasoria force-pushed the hsm_support_signing branch 2 times, most recently from ce7d029 to 5e64921 Compare July 1, 2019 23:35
@lukpueh
Copy link
Member

lukpueh commented Jul 2, 2019

Thanks for the changes, @tanishqjasoria! Please ping me when you've fixed the tests and addressed all of my comments, and I'll re-review. And let me know if you need any help.

@lukpueh
Copy link
Member

lukpueh commented Aug 5, 2019

Hi @tanishqjasoria, are you planning on finalizing this PR? It would be very much appreciated. :)

@tanishqjasoria
Copy link
Author

Yeah sure, I would try to finalize this PR by this weekend!

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments, @tanishqjasoria! Besides my new inline comments, there are a few more things that we should tackle before merging:

  • Add PyKCS11 to extras_require in setup.py for easy optional installation
  • Remove libsofthsm2.so
  • Add documentation for local testing, i.e. how to install and configure softhsm (may be ticketized)
  • Add usage documentation (may be ticketized)

Regarding your questions (on slack):

  • It's okay to not have 100% coverage, we can add more tests in a follow-up PR.
  • It's okay if this does not work on older Python versions. However, we should
    • mention this in the usage documentation (may be ticketized),
    • add version markers to *-requirements.txt and setup.py (e.g. PyKCS11==1.5.5; python_version > '<X.X>'),
    • don't run HSM-related tests on those versions.

Let me know if you want do address these (or some of these) issues, otherwise I am happy to take over.

@lukpueh
Copy link
Member

lukpueh commented Sep 6, 2019

It would be good to rebase on top of master, because the recent changes enhance automated testing and coverage visualization.

I do suggest though to first address the inline comments in my last review, as GitHub understandably has a hard time mapping the comments to the code after a rebase. :)

@coveralls
Copy link

coveralls commented Sep 10, 2019

Coverage Status

Coverage remained the same at ?% when pulling 4ca1d2e on tanishqjasoria:hsm_support_signing into 2f2dfa9 on secure-systems-lab:master.

Copy link
Collaborator

@SantiagoTorres SantiagoTorres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor nits, but this is looking good :)

@trishankatdatadog
Copy link
Contributor

Finally --- good job, guys!!

setup.py Outdated
],
install_requires = ['six>=1.11.0', 'subprocess32; python_version < "3"',
'python-dateutil==2.8.0', 'colorama>=0.3.9'],
'python-dateutil==2.8.0', 'PyKCS11==1.5.5; python_version > "3"', 'colorama>=0.3.9'],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're pinning here a version of PyKCS11 that is not available anymore on PyPI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, maybe I'm missing something, but in general we should be using >= not ==, no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to change that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, you have the same version pinning in ci-requirements.txt and test-requirements.txt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. The various *-requirements.txt are due for revision (see #182).

Comment on lines +329 to +334
if PyKCS11.CKK[key_type] == 'CKK_RSA':
mechanism = PyKCS11.RSA_PSS_Mechanism(RSA_PSS_MECH,
RSA_PSS_HASH_SHA256, RSA_PSS_MGF_SHA256, RSA_PSS_SALT_LENGTH)

elif PyKCS11.CKK[key_type] == 'CKK_EC' or PyKCS11.CKK[key_type] == 'CKK_ECDSA':
mechanism = PyKCS11.Mechanism(PyKCS11.CKM_ECDSA)
Copy link

@Silvanoc Silvanoc Jan 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mechanism to be used should be negotiated with the slot to ensure that a supported one (provided by PyKCS11Lib.getMechanismList) gets used. Mine, for example, doesn't support any of the PSS mechanisms.

I'd rather provide a sorted list (being the first the preferred mechanism) and automatically select the first matching one.

BTW this is exactly the same code as in lines 384-389, right? I'd rather extract that code to a function, not only for code economy, but in order to ensure that the same negotiation algorithm gets used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get started with the work, We kept it simple at first (worked on the issue according to the mechanisms available with the YubiKey). I can make the required changes to get it supported with various HSM available.

How would we decide the list of mechanisms to be made available?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to restrict the list of supported mechanisms. Ideally we do this via some module level constants or setting. For other signing algorithms/schemes in this library we make the information available, e.g. to the verifier, via the key.

Comment on lines 39 to 46
# RSA-PSS with SHA256 hash to be used for signature generation.
RSA_PSS_MECH = PyKCS11.CKM_SHA256_RSA_PKCS_PSS
# SHA256 hash to be used to digest the data.
RSA_PSS_HASH_SHA256 = PyKCS11.CKM_SHA256
# Mask generating function for SHA256 Hash.
RSA_PSS_MGF_SHA256 = PyKCS11.CKG_MGF1_SHA256
# Length of salt to be used for hashing.
RSA_PSS_SALT_LENGTH = 32
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to various mechanisms available, I just declared the mechanism at the start for consistency within various operations.

@tanishqjasoria tanishqjasoria force-pushed the hsm_support_signing branch 3 times, most recently from 5bb604b to f7a3bbb Compare January 30, 2020 20:24
@lukpueh
Copy link
Member

lukpueh commented Jan 31, 2020

Hi @tanishqjasoria, you have a couple of unnecessary merge commits in your branch. The preferred way to sync a feature branch with upstream master is to regularly rebase on master (instead of merging master into the feature branch).

I cleaned up your commit history by cherry-picking your relevant commits on top of the up-to-date upstream master (resolving many conflicts). You can replace your branch with my cleaned up version using these commands (try to understand what they are doing):

# Fetch my version of your branch
git fetch [email protected]:lukpueh/securesystemslib.git hsm_support_signing-clean:hsm_support_signing-clean

# Go to your branch (if you're not already on it)
git checkout hsm_support_signing

# You can diff yours with mine to see that mine does not remove any of your changes
# You should only see additions that come from syncing with upstream master
git diff hsm_support_signing hsm_support_signing-clean

# Replace yours with mine
git reset --hard hsm_support_signing-clean

# Force push to update the branch on GitHub
# (you might have to change "origin" to whatever points to [email protected]:tanishqjasoria/securesystemslib.git)
git push origin hsm_support_signing -f

In case you are interested how I cleaned up your history, here's what I did. You don't need to do this, if you used above commands, but it helps to know these commands.

# Note down the relevant commit hashes from your PR
#  `--no-merges` shows the log without merge commits
# `--first-parent` shows the log without commits that you merged in
git log --oneline --no-merges  --first-parent master..hsm_support_signing

# Create a new branch on top of master
# (make sure that master is the laster version of upstream/master !!) 
git checkout -b hsm_support_signing-clean &&  git reset --hard master

# Now use `git cherry-pick <commit ID>` for all commits that you noted above, 
# starting at the bottom and going to the top.
# This is actually the hard part, because it requires fixing quite a few conflicts, due
# to upstream master having moved on, which in turn requires knowledge of the
# changes in upstream master.

For future references, please try to avoid merge commits on your feature branch! :)

@tanishqjasoria
Copy link
Author

Thanks a lot for your help @lukpueh!
I will keep this in mind from now on.

@joshuagl
Copy link
Collaborator

joshuagl commented Feb 6, 2020

In Issue #179 we're working to ensure that any functionality with a native dependency presents meaningful user feedback when the native dependency is not available.

It would be great if this PR could implement functionality similar to #200 to ensure the hsm_keys module can be imported even if the native dependency isn't available and that any functions which rely on that native dependency throw an UnsupportedLibraryError.

@trishankatdatadog
Copy link
Contributor

Also, how do we test that this works? Docker has tests that somehow tests Yubikeys, not sure how they do it, maybe using emulators.

"""

# Refresh the list of available slots for HSM
self.refresh()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a more general design question: what is the intended lifetime behavior of HSM objects?

My intuition as a consumer of this library is that the list of available HSMs available for use in an HSM instance is fixed at initialization time, but calling refresh here makes it possible that the list has changed between object creation and use. If the intended behavior is that any operation that needs to access HSMs always operates on a refreshed list, it might make sense to turn refresh into a decorator and explicitly document that as the behavior.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intended behaviour of the HSM object is something like, get the list of available HSM -> establish a session with the required HSM -> do the intended job or jobs -> close the session.
It's advisable to refresh the list before doing a series of operations as SlotID may change on removal and reinsertion of the HSM.
In hsm_keys, load_HSM calls refresh and get the list of available HSMs to the user. The user chooses from the list of slot available and then pass it as the argument HSM_info to various functions. Adding refresh as a decorator won't give the user the functionality to choose from the available HSMs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be feasible to make initialization and session handling transparent to the user of the interface? You already do this to some extent, i.e. by each time opening the session in load_public_keys and load_private_keys. Similarly you always call login before loading private keys.




def logout(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Do logout and close_session form part of the public API, or are they just abstracted here for calling in close? If the latter, it might make sense to document them as private and prefix them with underscores (e.g. _logout and _close_session) to discourage misuse.

@lukpueh
Copy link
Member

lukpueh commented Feb 21, 2020

I don't recall. I'll let @lukpueh and @SantiagoTorres chime in...

@lukpueh @SantiagoTorres Do you guys need a YubiHSM to test this with?

It would be nice if NYU could get one to test it. I only have a YubiKey 5 nano, but haven't tested it with this PR yet.

@tanishqjasoria
Copy link
Author

I have tested this PR using the Yubikey NEO.

@trishankatdatadog
Copy link
Contributor

It would be nice if NYU could get one to test it. I only have a YubiKey 5 nano, but haven't tested it with this PR yet.

Let me test this PR with a YubiHSM we have around here. NYU should get one of their own, it's not very expensive (~$650).

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool stuff, @tanishqjasoria!! I just started playing around with your interface and my YubiKey. Here are a few very superficial things I noticed. More feedback to come ...

"""
<Purpose>
To load a custom library to interact with the hardware tokens.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a description of the PKCS11LIB_USER argument here in the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the HSM class has more info. Maybe you can transfer/copy some of it to the public facing function, and flesh it out a bit.

@trishankatdatadog
Copy link
Contributor

Note to self: test code with YubiHSM sooner than later...

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the hard work you put into this PR, @tanishqjasoria. I finally had the time to play around with your code, using my YubiKey 5 nano. And I did manage to create/verify signatures with keys on the PIV certificate slots. 🎉 💯 😄

Here is my high-level feedback (more details inline):

Usage docs
My Yubikey didn't make this too easy for me, so it might be a good idea to put together some usage docs. We can ticketize this though. Until then, for anyone who wants to try this, use these keys creation instructions and don't forget to add a certificate. Apparently the pkcs11 interface does not find keys on the Yubikey if they don't have a certificate attached. Furthermore, the latests opensc-pkcs11.so from the OpenSC project did not work for me, instead I used ykcs11, which is available via the yubico-piv-tool.
(Btw. both of these rather relevant pieces of information are hidden in Yubico's "Using PIV for SSH through PKCS #11" doc.)

Architectural decisions
I appreciate your motivation to keep the high-level user interface (hsm_keys.py) separated from the low-level code (hsm.py) that interacts with PyKCS11 . But I still think the code would be easier to understand, maintain, and even refactor (e.g. if we replaced PyKCS11 with python-pkcs11) without those two levels of abstraction. Right now I have to keep track of the global smartcard in hsm_keys.py and also of the attributes (most notably session) inside the HSM container pointed to by smartcard. Maybe we can simplify this?

Tests
Removing one level of abstraction will also make testing easier. Instead of testing two create_signature functions you only have to test one. In return you can focus on actually testing their functionality (see inline comments).

Let me know if my comments make sense.

# To carry out the tests even when the hardware token is not connected,
# we would be emulating the hardware token using softHSM 2.0.
# To carry out all the tests, SoftHSM needs to be initialized and
# RSD, ECDSA key pairs must be generated on the SoftHSM.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/RSD/RSA ?


install:
- pip install -U tox coveralls
- wget https://dist.opendnssec.org/source/softhsm-2.3.0.tar.gz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@@ -1,5 +1,6 @@
# test runtime dependencies (see 'tests_require' field in setup.py)
mock; python_version < "3.3"
PyKCS11>=1.5.5; python_version > '3'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#209 shuffled the requirements files quite a bit, so there will be a few conflicts on a (necessary) rebase. I can help you with that if you want.

Copy link
Author

@tanishqjasoria tanishqjasoria Mar 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already did a rebase and resolved the conflicts.

"""
try:
self.session = self.PKCS11.openSession(slot_info['slot_id'],
PyKCS11.CKF_SERIAL_SESSION | PyKCS11.CKF_RW_SESSION)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we generally need write access. Probably only to create test keys.

self.session.login(user_pin)
except PyKCS11.PyKCS11Error as error:
if PyKCS11.CKR[error.value] == "CKR_USER_ALREADY_LOGGED_IN":
logger.warning('Already logged in as CKU_USER.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to info or debug level log message. I unexpectedly see this message when calling load_private_keys more than once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this.

self.SMARTCARD.login(_USER_PIN)

self.SMARTCARD.logout()
self.SMARTCARD.close_session()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above in test_close_session

"""

# Refresh the list of available slots for HSM
self.refresh()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be feasible to make initialization and session handling transparent to the user of the interface? You already do this to some extent, i.e. by each time opening the session in load_public_keys and load_private_keys. Similarly you always call login before loading private keys.

Comment on lines +329 to +334
if PyKCS11.CKK[key_type] == 'CKK_RSA':
mechanism = PyKCS11.RSA_PSS_Mechanism(RSA_PSS_MECH,
RSA_PSS_HASH_SHA256, RSA_PSS_MGF_SHA256, RSA_PSS_SALT_LENGTH)

elif PyKCS11.CKK[key_type] == 'CKK_EC' or PyKCS11.CKK[key_type] == 'CKK_ECDSA':
mechanism = PyKCS11.Mechanism(PyKCS11.CKM_ECDSA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to restrict the list of supported mechanisms. Ideally we do this via some module level constants or setting. For other signing algorithms/schemes in this library we make the information available, e.g. to the verifier, via the key.

except:
logger.debug('The public key object does not contain the DER encoded value'
'It needs to be calculated from the Modulus and Exponent.'
'But this functionality is not yet available!')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as in test_hsm.test_get_public_key_value

"""

# Use the HSM with the corresponding 'slot_info'
smartcard.get_HSM_session(HSM_info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call this function before load_library you will get NameError: name 'smartcard' is not defined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment applies to other functions that call smartcard

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a particular flow that needs to be followed while using this.
It should give a more meaningful error to the user, I would change that.

@tanishqjasoria
Copy link
Author

tanishqjasoria commented Mar 3, 2020

@lukpueh I agree on the change of the architecture. We should remove the two levels of abstraction. Apart from all the other benefits, it would make the interaction with HSM more transparent to the user.
We can use python-pkcs11 to make it easier to understand.

@lukpueh
Copy link
Member

lukpueh commented Mar 9, 2020

Notes from an out-of-band conversation with @tanishqjasoria about the architecture:


The API should look similar to securesystemslib.functions.gpg , that is:

  • create_signature -- Create a signature using a key from the smartcard (via pykcs11), and return it in a securesystemslib-like format (see securesystemslib.formats.*SIGNATURE_SCHEMA)
  • export_pubkey -- Export a pubkey from the smartcard (via pykcs11), and return it in an securesystemslib-like format (see securesystemslib.formats.*PUBKEY_SCHEMA)
  • verify_signature -- Take the signed data, and a pubkey and signature in securesystemslib-like format, and verify it (via cryptography). This should not require a smartcard or pykcs11.

In addition we will need a public function that lists all the available keys from all the available hsms that function should give the user the necessary hsm/key identifiers required by create_signature or export_pubkey.
I suggest that each call to those functions operates in a separate hsm session, that is they should be wrapped by setup and teardown (including login+authentication/logout if necessary). This could be implemented in a decorator but it is okay to just call some private functions, i.e. _create_session, _login on top, and _logout, _destroy_session on the bottom of those functions.

We might want to check how costly (computationally, memory-wise) these operations are, but I guess it’s feasible and definitely makes the interaction a lot nicer for the user.

Similarly, loading the pkcs11.so library could be done automatically on import of securesystemslib.hsm. PyKCS11 just defaults to the PYKCS11LIB environment variable if no path is passed. I suggest we just try using that. We can assign the library object to a module level global and use it in the functions where we need it. If PYKCS11LIB is not set or the load fails for a different reason, we don’t do anything right away. But later, if the user calls a function for which the library is needed we inform them to either set the environment variable or load the library manually.

I picture something like this:

# in module securesystemslib.hsm
import PyKCS11
PKCS11 = PyKCS11.PyKCS11Lib()
# Per default we try loading from PYKCS11LIB environment variable
try:
  PKCS11.load()
except PyKCS11.PyKCS11Error as e:
  # TODO: maybe add some more debug info
  log.debug(e)

def load_pkcs11_library(path=None):
  # TODO: Wrap in try/except and add nice error message/user feedback,
  # and re-raise as suited exception
  PKCS11.load(path)

def create_signature():
  # TODO: figure out a way to tell that PKCS11 has not been loaded
  if not PKCS11.loaded:
    # TODO: Raise suited exception informing user about the PYKCS11LIB envvar
    # or the option to use `load_pkcs11_library`
    raise
  # TODO:
  # open session
  # login
  # sign
  # logout
  # close session

# TODO: add other functions

@trishankatdatadog
Copy link
Contributor

Looks great to me, thanks, Lukas and @tanishqjasoria!

# For all the available HSMs available, add relevant information
# to the slots dictionary
for slot in slot_list:
slot_dict = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More of a style nit, but literal construction is a little easier to read and less repetitive:

slot_dict = {
    'slot_id': slot,
    'slot_info': self.PKCS11.getSlotInfo(slot),
    # ...
}
slot_info_list.append(slot_dict)

@lukpueh
Copy link
Member

lukpueh commented Apr 6, 2020

Superseded by #229. Thanks for the solid groundwork, @tanishqjasoria! Closing here...

@lukpueh lukpueh closed this Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants