-
Notifications
You must be signed in to change notification settings - Fork 108
Further cleanup for the auth
module
#3114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further cleanup for the auth
module
#3114
Conversation
32edeaf
to
74e6fb7
Compare
0259abc
to
f389c0d
Compare
586d2fe
to
92a82af
Compare
0a38abf
to
b1d84d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready for official review!
f866ecf
to
69090cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few changes to address additional Google Doc string format usage, or lack there-of, in the users model.
602c191
to
031cb69
Compare
This work is now dependent on PR #3211 since I did not want to spend lots of time waiting for CI jobs to complete to find out how the functional tests failed. |
ffcea7f
to
ab121de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I acknowledge that this PR is still in Draft state, but, in the interest of keeping things moving, I went ahead and reviewed it (although, I didn't get to lib/pbench/test/unit/server/auth/test_auth.py
). I have a bunch of comments, but they are all small stuff, I think.
@@ -88,6 +88,7 @@ def run_gunicorn(server_config: PbenchServerConfig, logger: Logger) -> int: | |||
workers = str(server_config.get("pbench-server", "workers")) | |||
worker_timeout = str(server_config.get("pbench-server", "worker_timeout")) | |||
crontab_dir = server_config.get("pbench-server", "crontab-dir") | |||
server_config.get("flask-app", "secret-key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "interesting" -- we're making a "get" call and then ignoring what it returns. I presume that the intention here is to fail early if it was omitted from the configuration, which is good, but this use of get()
is slightly odd and perhaps deserves a comment. Also, it seems to impose the requirement that the Pbench Server be configured with a "secret-key"
even if the Server is not configured to use an OIDC broker...is that intentional?
If so, then, instead of passing the whole server_config
at line 106, consider doing the server_config.get("authentication", "server_url")
call here and passing just the URL value to wait_for_oidc_server()
there. If appropriate, we can specify a fallback value of None
here and test for that at line 105 instead of having it recover from the absence.
openid-connect: A JSON object containing the OpenID Connect parameters | ||
required for the web client to use OIDC authentication. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably worth pointing out that openid-connect
will be omitted if the OIDC server is not configured.
Args: | ||
method : The API HTTP method | ||
path : Path for the request. | ||
data : Json data to send with the request in case of the POST | ||
kwargs : Additional keyword args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description for headers
is missing.
final_headers = self.headers.copy() | ||
if headers is not None: | ||
final_headers.update(headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternately,
final_headers = self.headers | (headers if headers else {})
That is, the |
operator will return a new dictionary consisting of self.headers
overlaid with the entries from headers
.
(I liked this better before I had to add the ternary to handle the headers is None
case, but I figured I would toss it out there anyway.)
kwargs = dict( | ||
params=kwargs, | ||
data=data, | ||
headers=final_headers, | ||
verify=self.verify, | ||
) | ||
try: | ||
response = self._connection.request(method, url, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stylistically, are you sure that you want to be overwriting the input parameter, kwargs
? Shouldn't we be instantiating a new local variable and using that in the call, instead?
FIXME - This method appears unused in the rest of the code. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you intend to act on this FIXME
prior to merging?
AIUI, the information referenced here is currently provided by the User
database table; once that is removed, this function will provide a "bridge" for any existing code which still requires that info. The question is, will there be any such code remaining?
I concur that we should avoid carrying dead code. On the other hand, it's not hard to maintain this code during the transition, and doing so while we have all the OIDC stuff in the front of our heads is probably a good idea (that is, I think it will be cheaper to carry it than to reconstruct it). The issue is, when we turn our attention to other things, what is going prompt us to remove this code if, in fact, it winds up being unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we would ever going to call the userinfo
endpoint as we get pretty much all the user information when we decode the tokens. I added this when I did some experimentation with Keycloak APIs and I thought this might come handy but I realized token decode is enough to get all the user information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, on that basis, let's rip out this function, then (which will get rid of the FIXME
😉).
headers = {"Authorization": f"Bearer {token}"} if token else None | ||
return self._connection.get(self._userinfo_endpoint, headers=headers).json() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@npalaska, this function is set up with a default value of None
for token
, and it even has some special-case handling for when the token is omitted...but is there any point to trying to make this call without a token? (Don't we know what Keycloak's response is going to be in that case??)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the token is None
then this one is going to return UNAUTHORIZED
. However, I don't think Peter made this change to the function, and I think updating the guts of this function should not be a part of this PR.
I added this functionality long back when I was doing some experiments with OIDC APIs but I don't think we ever going to need to call userinfo
endpoint as we pretty much get all the required information about the user when we decode the token so there is no need to call this endpoint at all. I will work on it in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the history of this PR, I really cannot say who made the change, but, looking at the diffs here, someone replaced an if
with a ternary...which I would normally support (😉) except that, in this particular case, it seems like immediately returning an error would be better than issuing the request (since we know that that is what will/should be the ultimate result).
Whether we should do that in this PR or not is the next question....
# Module private constants | ||
_TOKEN_ALG_INT = "HS256" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this definition relate to the definition of OpenIDClient._TOKEN_ALG
? Should they have the same value? Should they be referencing the same definition??
If not, then it would be good to have an explanatory comment here (I presume that we're only using this value temporarily during the transition).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current mechanism to encode the auth token is HS256
and I don't think the PR is trying to change the way we currently encode the tokens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. It's just that the respective similarities between _TOKEN_ALG
/_TOKEN_ALG_INT
and HS256
/RS256
seem like they could easily be confused, and, if we're stuck with those names, then it would be good to have come good comments in the code to make sure that readers don't gloss over the differences.
else InternalUser.create( | ||
# FIXME - `client_id` is the value pulled from the Pbench Server | ||
# "openid-connect" "client" field in the configuration file. This | ||
# needs to be an ID from the OpenID Connect response payload (online | ||
# case) or decoded token (offline case). | ||
client_id=oidc_client.client_id, | ||
token_payload=token_payload, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this is a draft PR, but, in case you've overlooked it, here's another FIXME
.
if pbench_client_roles: | ||
payload["resource_access"].update( | ||
{"pbench-client": {"roles": pbench_client_roles}} | ||
) | ||
token_str = jwt.encode(payload, jwt_secret, algorithm="HS256") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a reference to Auth._TOKEN_ALG_INT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI job failed to due race condition to be solved by PR #3232. Restarted. |
@@ -253,19 +253,25 @@ def test_create_w_roles(self): | |||
|
|||
|
|||
@pytest.fixture(scope="session") | |||
def rsa_keys() -> Dict[str, str]: | |||
def rsa_keys() -> Dict[str, Union[rsa.RSAPrivateKey, str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You didn't change this, but changing the line highlights it ... now that we're using Python 3.9 we can use dict[...]
instead of typing.Dict
. I've been trying to gradually fix type hints as I hit them.
@@ -29,7 +29,7 @@ uri = postgresql://pbenchcontainer:pbench@localhost:5432/pbenchcontainer | |||
secret-key = "pbench-in-a-can secret shhh" | |||
|
|||
[openid-connect] | |||
server_url = http://localhost:8090 | |||
#server_url = http://localhost:8090 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While your justification is valid, depending on Nikhil's #3235 this may be short-lived. 😆 Timing may not be everything, but a lot is happening all at once ...
lib/pbench/server/auth/__init__.py
Outdated
raise OpenIDClientError( | ||
HTTPStatus.BAD_GATEWAY, | ||
f"Failure to connect to the OpenID client, {self!r}: {exc}", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenIDCClientError
(and others) should probably extend APIAbort
so they're handled cleanly in _dispatch
instead of going to the catch-all.
Also, where you use HTTPStatus.INTERNAL_SERVER_ERROR
it'd be better to use APIInternalError
so we generate the logger uuid right away rather than, again, going through the Exception
catch-all in _dispatch
.
The refactoring approach taken here covers two broad areas. The first being the removal of the `Auth` class replacing it with module interfaces. This change removes the need to pass the `Auth` instance around between modules and objects, where a simple import of the `Auth` module is sufficient. The second area of refactoring is simplifying the authorization module interfaces to hide knowledge of "internal tokens" vs OpenID Connect tokens. The refactoring solves a problem facing the code prior to this change where one needs a working OpenID Connect instance to function, but the fallback doesn't work to the internal token mechanism. With these changes, if it is possible to talk to an OpenID Connect instance, only OpenID Connect tokens will be accepted for authorization. If that is not possible, only "internal" tokens will be accepted. This prevents the two code paths from having to co-exist seamlessly, such that the unit test environment can explicitly operate in one or the other. For now, the unit test environment assumes internal tokens only. Unit tests for the OIDC code paths (with 100% coverage) are provided separately. Further, a number of other changes are worth noting explicitly: * The "authentication" section in the configuration file has been renamed to "openid-connect" There are four required options in that section required to use OpenID Connect for tokens: "server_url", "realm", "client", and "secret". If the section or any of those options are missing then "internal" tokens will be expected. The `pbench-server-default.cfg` file now contains that section with the "server_url" option commented out since we are defaulting to "internal" tokens for the time being. The Pbench-in-a-Can `pbench-server.cfg` file was updated to use the OpenID Connect instance (KeyCloak) by defining the "server_url" and providing a "secret". * A new "flask-app" section was added to provide the "secret-key" option for populating the `Flask` app's `secret_key` property * The `Auth` module's `set_oidc_client` was moved to the OpenIDClient `create` class factory method to improve OIDC encapsulation * The `Auth` module's `get_user_id()` was renamed to `get_current_user_id()` * The `Auth` module's `verify_auth()` method calls one of two helper methods based on whether OpenID Connect or internal tokens are to be used The presence of the "openid-connect" section with all the options configured causes OpenID Connect to be used vs internal tokens * Knowledge of OIDC tokens pushed out of the `Auth` module and into the `OpenIDClient` class We did not make a separate module for internal tokens since the intention is to remove that code entire once we can depend on an OpenID Connect instance. * The `OpenIDClient` was refactored to further its encapsulation and to facilitate easier testing via its new `Connection` sister class * Finally, the unit tests for the `Auth` module proper were rewritten to leverage the new `Connection` class to make the test behaviors
Until we can support functional tests that use a token from a user provided by OpenID Connect (i.e. a KeyCloak instance) don't configure the functional tests to use OpenID Connect. We also correct `load_keycloak.sh` to use the same values in the Pbench Server configuration.
Unless there is some major objection, I'd like to merge this code as is, and follow-up with a separate PR(s) for some of the comments made. I'd like to unblock further in this area. |
If you have particular changes that you'd like to see made, feel free to follow-up with your own PRs as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless there is some major objection, I'd like to merge this code as is, and follow-up with a separate PR(s) for some of the comments made. I'd like to unblock further in this area.
Well, this has been going on for a long time and I don't see any show-stoppers even though there's a lot of possible cleanup. So, sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
And if you leave the comments unresolved it will help with future work, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I withdrawal my objection.
Supersedes PR #3094, while including its commit with authorship.
While PR #3094 attempted to cleanup the
auth
module a bit, this PR dives in a bit further.Suggested commit message to use: