MSC2918: Refresh tokens #2918
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,163 @@ | ||
| # MSC2918: Refresh tokens | ||
|
|
||
| In Matrix, requests to the Client-Server API are currently authenticated using non-expiring, revocable access tokens. | ||
| An access token might leak for various reasons, including: | ||
|
|
||
| - leaking from the server database (and its backups) | ||
| - intercepting it with a man-in-the-middle attack | ||
| - leaking from the client storage (and its backups) | ||
|
|
||
| In the OAuth 2.0 world, this vector of attack is partly mitigated by having expiring access tokens with short lifetimes and rotating refresh tokens to renew them. | ||
| This MSC adds support for expiring access tokens and introduces refresh tokens to renew them. | ||
| A more [detailed rationale](#detailed-rationale) of what kind of attacks it mitigates lives at the end of this document. | ||
|
|
||
| ## Proposal | ||
|
|
||
| Homeservers can choose to have access tokens expire after a short amount of time, forcing the client to renew them with a refresh token. | ||
| A refresh token is issued on login and rotates on each usage. | ||
|
|
||
| This allows homeservers to opt for signed, non-revocable access tokens (JWTs, macaroons, etc.) for performance reasons, provided their lifetime is short enough (less than 5 minutes). | ||
|
|
||
| It is strongly recommended that clients support refreshing tokens for additional security. | ||
| They can advertise their support by adding a `"refresh_token": true` field in the request body on the `/login` and `/register` APIs. | ||
|
|
||
| Handling of clients that do *not* support refreshing access tokens is up to individual homeserver deployments. | ||
| For example, server administrators may choose to support such clients for backwards-compatibility, or to expire access tokens anyway for improved security at the cost of inferior user experience in legacy clients. | ||
|
|
||
| If a client uses an access token that has expired, the server will respond with an `M_UNKNOWN_TOKEN` error, preferably with the `soft_logout` parameter set to `true` to improve the user experience in legacy clients. | ||
| Thus, if a client receives an `M_UNKNOWN_TOKEN` error, and it has a refresh token available, it should no longer assume that it has been logged out, and instead attempt to refresh the token. | ||
| If the client was in fact logged out, then the server will respond with an `M_UNKNOWN_TOKEN` error to the token refresh request, possibly with the `soft_logout` parameter set. | ||
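The client-side handling described above can be sketched as a small decision helper. This is a minimal illustration, not part of the proposal: the function names and the returned action strings are hypothetical, and only the `M_UNKNOWN_TOKEN` error code and the `soft_logout` flag come from the text.

```python
from typing import Optional


def action_after_request_error(errcode: str, refresh_token: Optional[str]) -> str:
    """Decide what to do after a normal API request fails (hypothetical helper)."""
    if errcode != "M_UNKNOWN_TOKEN":
        return "handle_other_error"
    # With a refresh token available, an unknown-token error no longer
    # implies a logout: try to refresh first.
    if refresh_token is not None:
        return "refresh"
    return "logged_out"


def action_after_refresh_error(errcode: str, soft_logout: bool) -> str:
    """Decide what to do after the refresh request itself fails."""
    if errcode != "M_UNKNOWN_TOKEN":
        return "retry_later"
    # The session really is gone; soft_logout tells the client whether
    # it may keep local state and simply re-authenticate.
    return "soft_logout" if soft_logout else "hard_logout"
```

A legacy client without a refresh token falls straight through to `"logged_out"`, which matches its pre-MSC behaviour.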
|
|
||
| ### Login API changes | ||
|
|
||
| The login API returns two additional fields: | ||
|
|
||
| - `expires_in_ms`: The lifetime in milliseconds of the access token. | ||
| - `refresh_token`: The refresh token, which can be used to obtain new access tokens. | ||
|
|
||
| This also applies to logins done by application services. | ||
|
|
||
| Both fields are optional. | ||
| If `expires_in_ms` is missing, the client can assume the access token won't expire. | ||
| If `refresh_token` is missing but `expires_in_ms` is present, the client can assume the access token will expire and that it has no way to renew it other than logging in again. | ||
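The three cases above can be made explicit with a small classifier over the login response. This is a sketch under the assumption that the response has already been parsed into a dictionary; the function name and the returned labels are hypothetical.

```python
from typing import Any, Dict


def token_policy(login_response: Dict[str, Any]) -> str:
    """Classify a login response by how its access token must be handled."""
    if "expires_in_ms" not in login_response:
        # No lifetime advertised: assume the access token never expires.
        return "non_expiring"
    if "refresh_token" in login_response:
        return "expiring_refreshable"
    # The token expires, but only a fresh login can replace it.
    return "expiring_relogin_required"
```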
|
|
||
| Clients advertise their support for refreshing tokens by setting the `refresh_token` field to `true` in the request body. | ||
|
|
||
| ### Account registration API changes | ||
|
|
||
| Unless `inhibit_login` is `true`, the account registration API returns two additional fields: | ||
Shouldn't this be if it's false? I.e. we include the params when we return a valid access token?
The spec says:
So that sentence seems correct? If
It's not the easiest thing to grok, but @sandhose is right, and @erikjohnston is confused.
|
|
||
| - `expires_in_ms`: The lifetime in milliseconds of the access token. | ||
| - `refresh_token`: The refresh token, which can be used to obtain new access tokens. | ||
Does the refresh token expire? How does one manually expire it?
Also would be interested to see how this plays with soft logout: if the access token expires, but the refresh token is still live, should the server be using
Refresh tokens don't expire, but they get invalidated on use.
I was not aware of how
This particular piece is a bit concerning, as it means that refresh tokens are hanging around waiting to give access back to the account. On the other hand, this somewhat fixes the scripts use case as it can then store the refresh token and use that on the next run.
It still heavily mitigates the impact of token leakage, since they are rotating. tl;dr: if there is an attempt to use an old refresh token, there might be a token leak somewhere and the whole session should be invalidated. This could be mentioned in the MSC and/or in the spec, and implemented in Synapse if you think it makes sense.
Also clarified about
Ah ha, so the server would revoke the access token when a refresh token is used twice?
It could, although not strictly enforced by this MSC (and the current implementation in Synapse does not do that).
@turt2live is this clear enough now?
|
|
||
| This also applies to registrations done by application services. | ||
|
|
||
| As in the login API, both fields are optional. | ||
|
|
||
| Clients advertise their support for refreshing tokens by setting the `refresh_token` field to `true` in the request body. | ||
|
|
||
| ### Token refresh API | ||
|
|
||
| This API lets the client refresh the access token. | ||
| A new refresh token is also issued. | ||
| The existing refresh token remains valid until the new access token (or refresh token) is used, at which point it is revoked. | ||
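| This tolerates a refresh exchange getting lost in flight: if the client never receives the new tokens, it can retry with the refresh token it still holds. | ||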
| The Matrix server can revoke the old access token right away, but does not have to since its lifetime is short enough that it will expire anyway soon after. | ||
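The rotation rule above (the old refresh token stays valid until one of the newly issued tokens is first used) can be sketched as a tiny in-memory store for a single session. Everything here is an illustrative assumption: the class and method names are hypothetical, access-token expiry is not modelled, and tokens issued by an abandoned retry are not cleaned up.

```python
import secrets
from typing import Dict, Optional, Set, Tuple


class SessionTokens:
    """In-memory sketch of refresh-token rotation for a single session."""

    def __init__(self) -> None:
        self.valid_refresh: Set[str] = set()
        # Maps a newly issued token (access or refresh) to the refresh
        # token that gets revoked once the new token is first used.
        self.supersedes: Dict[str, str] = {}

    def login(self) -> str:
        """Issue the initial refresh token."""
        token = secrets.token_urlsafe(16)
        self.valid_refresh.add(token)
        return token

    def refresh(self, refresh_token: str) -> Optional[Tuple[str, str]]:
        """Return (access_token, new_refresh_token), or None if invalid."""
        if refresh_token not in self.valid_refresh:
            return None
        # Presenting a refresh token counts as using it, which revokes
        # the token it replaced (if any).
        self.mark_used(refresh_token)
        access = secrets.token_urlsafe(16)
        new_refresh = secrets.token_urlsafe(16)
        self.valid_refresh.add(new_refresh)
        self.supersedes[access] = refresh_token
        self.supersedes[new_refresh] = refresh_token
        return access, new_refresh

    def mark_used(self, token: str) -> None:
        """Record the first use of a token and revoke its predecessor."""
        old = self.supersedes.pop(token, None)
        if old is not None:
            self.valid_refresh.discard(old)
```

A server would call `mark_used` from its normal access-token validation path, so that the first authenticated request made with the new access token retires the old refresh token.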
|
|
||
| `POST /_matrix/client/r0/refresh` | ||
|
|
||
| ```json | ||
| { | ||
| "refresh_token": "aaaabbbbccccdddd" | ||
| } | ||
| ``` | ||
|
|
||
| response: | ||
|
|
||
| ```json | ||
| { | ||
| "access_token": "xxxxyyyyzzz", | ||
| "expires_in_ms": 60000, | ||
| "refresh_token": "eeeeffffgggghhhh" | ||
| } | ||
| ``` | ||
|
|
||
| If the `refresh_token` is missing from the response, the client can assume the refresh token has not changed and use the same token in subsequent token refresh API requests. | ||
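On the client side, applying a refresh response could look like the sketch below. It assumes the response body is already parsed into a dictionary and treats `expires_in_ms` as optional, as in the login response; the function name is hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def apply_refresh_response(
    current_refresh_token: str, response: Dict[str, Any]
) -> Tuple[str, str, Optional[int]]:
    """Return (access_token, refresh_token, expires_in_ms) to store."""
    access_token = response["access_token"]
    # A missing refresh_token means the server did not rotate it:
    # keep using the one that was just sent.
    refresh_token = response.get("refresh_token", current_refresh_token)
    expires_in_ms = response.get("expires_in_ms")
    return access_token, refresh_token, expires_in_ms
```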
|
|
||
| The `refresh_token` parameter can be invalid for two reasons: | ||
|
|
||
| - if it does not exist | ||
| - if it was already used once | ||
|
|
||
| In both cases, the server must reply with a `401` HTTP status code and an `M_UNKNOWN_TOKEN` error code. | ||
| This new use case of the `M_UNKNOWN_TOKEN` error code must be reflected in the spec. | ||
| As with other endpoints, the server can include an extra `soft_logout` parameter in the response to signal to the client that it should do a soft logout. | ||
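For illustration, the error reply for an invalid refresh token might be built as follows. This sketch assumes the standard Matrix error body with `errcode` and `error` fields; the function name and the human-readable message are hypothetical.

```python
import json
from typing import Tuple


def unknown_token_response(soft_logout: bool = False) -> Tuple[int, str]:
    """Build the (HTTP status, JSON body) pair for an invalid refresh token."""
    body = {
        "errcode": "M_UNKNOWN_TOKEN",
        # Illustrative message only; the wording is up to the server.
        "error": "Unknown or already used refresh token",
    }
    if soft_logout:
        body["soft_logout"] = True
    return 401, json.dumps(body)
```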
|
|
||
| This new API should be rate-limited and does not require authentication since only the `refresh_token` parameter is needed. | ||
| Identity assertion via the `user_id` query parameter as defined by the Application Service API specification is disabled on this endpoint. | ||
|
|
||
| ### Device handling | ||
|
|
||
| The current spec states that "Matrix servers should record which device each access token is assigned to". | ||
| This must be updated to reflect that devices are bound to a session, which is created during login and stays the same after refreshing the token. | ||
|
|
||
| ## Potential issues | ||
|
|
||
| Rotating the refresh token on each refresh is strongly recommended in the OAuth 2.0 world for unauthenticated clients, to avoid token replay attacks. | ||
| This can however make the deployment of CLI tools for Matrix a bit harder, since the credentials can no longer be statically defined. | ||
| This is not an issue in OAuth 2.0 because CLI tools usually use the client credentials flow, also known as service accounts. | ||
| An alternative would be to make the refresh token non-rotating for now, recommend that clients support rotation of refresh tokens, and enforce rotation later on. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| This MSC defines a new endpoint for token refresh, but it could also be integrated as a new authentication mechanism. | ||
To expand on this, and the "potential issues" section above, what are the concerns with introducing it as some form of opt-in (or opt-out) mechanism for things like long-lived bots or scripts which do not easily have a refresh opportunity? For example, a nightly batch job to prune rooms/events/etc could use a static access token instead of having to log in, do the work, then log out again, which would put the password near the script rather than a single revocable token.
I think for both use cases (bots and scripts) I'd rather make use of the
Fair, that sounds reasonable. Just wanted to expand on the potential use case, but agreed that scripts can find other ways to authenticate (or better yet: be replaced by features within the protocol/homeserver implementation).
I think an authentication option for scripts needs to be in the spec. I have a lot of scripts that push notifications or upload files from CI jobs for example. Those use access tokens, because CI jobs do sometimes get compromised (happened once because of codecov) and that way the access token can be easily rotated without being a homeserver admin. If the script used username and password instead, an attacker would have been able to get past UIA and change the password and just in general do much more nasty stuff than with an access token.
The jobs also can't refresh the access token, since they may be running concurrently and can't change CI variables. What would be my alternative for that use case, that works independent of the specific homeserver implementation?
There is an ongoing effort to rework the whole authentication process, with use cases like scripts running in CI in mind. This MSC is also done to prepare clients for the eventual migration to this new authentication stack without forcing them to log out of all their existing sessions. The login API with non-expiring tokens will hopefully stay until this new auth stack is ready, so when you need to migrate you will have a proper alternative. In the meantime, if you still want to adopt refresh tokens and you are admin of your homeserver, I suggest you look into the
|
|
||
| ## Security considerations | ||
|
|
||
| The time to live (TTL) of access tokens is not mandated by this MSC, but it is advisable to keep it relatively short. | ||
| Servers might choose to have stateless, digitally signed access tokens (JWTs are a good example of this), which makes them non-revocable. | ||
| The TTL of access tokens should be around 15 minutes if they are revocable and should not exceed 5 minutes if they are not. | ||
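A homeserver could check its configured lifetime against this guidance at startup. The sketch below is an assumption-laden illustration: the names are hypothetical, and it reads "around 15 minutes" as an upper bound for revocable tokens, which the text does not strictly say.

```python
# Guidance from the paragraph above, expressed in milliseconds.
REVOCABLE_TTL_MS = 15 * 60 * 1000
NON_REVOCABLE_MAX_TTL_MS = 5 * 60 * 1000


def ttl_within_guidance(ttl_ms: int, revocable: bool) -> bool:
    """Return True if an access-token lifetime follows the recommendation."""
    if ttl_ms <= 0:
        return False
    # Assumption: the revocable figure is treated as a ceiling, too.
    limit = REVOCABLE_TTL_MS if revocable else NON_REVOCABLE_MAX_TTL_MS
    return ttl_ms <= limit
```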
|
|
||
| ## Unstable prefix | ||
|
|
||
| While this MSC is not in a released version of the specification, clients should use the `org.matrix.msc2918.refresh_token` field in place of the `refresh_token` field in requests to the login and registration endpoints. | ||
| The refresh token endpoint should be served and used using the unstable prefix: `POST /_matrix/client/unstable/org.matrix.msc2918/refresh`. | ||
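A client that supports both the unstable and the stable names might keep them in one place, roughly as sketched here. The field names and paths are the ones given in this document; the helper names and the `use_unstable` switch are hypothetical.

```python
from typing import Any, Dict

STABLE = {"field": "refresh_token", "path": "/_matrix/client/r0/refresh"}
UNSTABLE = {
    "field": "org.matrix.msc2918.refresh_token",
    "path": "/_matrix/client/unstable/org.matrix.msc2918/refresh",
}


def names(use_unstable: bool) -> Dict[str, str]:
    """Pick the request field name and refresh endpoint path."""
    return UNSTABLE if use_unstable else STABLE


def with_refresh_support(login_body: Dict[str, Any], use_unstable: bool) -> Dict[str, Any]:
    """Return a copy of a login/register body that advertises refresh support."""
    body = dict(login_body)
    body[names(use_unstable)["field"]] = True
    return body
```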
|
|
||
| ## Detailed rationale | ||
|
|
||
| This MSC does not aim to protect against a completely compromised client. | ||
| More specifically, it does not protect against an attacker that managed to distribute an alternate, compromised version of the client to users. | ||
| In contrast, it protects against a whole range of attacks where the access token and/or refresh token get leaked but the client isn't completely compromised. | ||
|
|
||
| For example, those tokens can leak from user backups (a user backs up their device to a NAS, the NAS gets compromised and leaks a backup of the client's secret storage), but one can assume those backups are at least 5 minutes old. | ||
| If the leak only includes the access token, it is useless to the attacker since it would have expired. | ||
| If it also includes the refresh token, it is useless *if* the token was refreshed before (which will happen if the user just opens their Matrix client in between). | ||
|
|
||
| In the worst case, the leaked refresh token is still valid: the attacker would then consume the refresh token to get a valid access token, but when the original client tries to use the same refresh token, the homeserver can detect it, consider the session compromised, end the session and warn the user. | ||
|
|
||
| This kind of attack also applies to leakage from the server, which could happen from database backups, for example. | ||
|
|
||
| The important thing here is that, while it does not completely prevent attacks in case of a token leak, it does make this range of attacks a lot more time-sensitive and detectable. | ||
| A homeserver will notice if a refresh token is being used twice. | ||
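Reuse detection of this kind can be sketched with two lookup tables. This is an illustration only: the MSC does not require servers to end the session on reuse, the class and its return values are hypothetical, and the grace period described in the token refresh API section is left out for brevity.

```python
from typing import Dict, Set


class ReuseDetector:
    """Track live and spent refresh tokens to spot a token used twice."""

    def __init__(self) -> None:
        self.current: Dict[str, str] = {}  # live refresh token -> session id
        self.spent: Dict[str, str] = {}    # used refresh token -> session id
        self.ended: Set[str] = set()       # sessions ended after reuse

    def issue(self, session_id: str, token: str) -> None:
        self.current[token] = session_id

    def redeem(self, token: str, replacement: str) -> str:
        """Exchange a refresh token; returns 'ok', 'reuse_detected' or 'unknown'."""
        if token in self.current:
            session_id = self.current.pop(token)
            self.spent[token] = session_id
            self.current[replacement] = session_id
            return "ok"
        if token in self.spent:
            session_id = self.spent[token]
            # A spent token came back: assume a leak and end the session
            # by dropping every live token that belongs to it.
            for live in [t for t, s in self.current.items() if s == session_id]:
                del self.current[live]
            self.ended.add(session_id)
            return "reuse_detected"
        return "unknown"
```

In the backup-leak scenario, whichever party redeems the token second triggers the detection, and the tokens obtained by the first party stop working as well.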
|
|
||
| The IETF has interesting [guidelines for refresh tokens](https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics#section-4.13.2). | ||
| They recommend that either: | ||
|
|
||
| - the refresh tokens are sender-bound and require client authentication (making a token leak completely useless if the client credentials are not leaked at the same time) | ||
| - or the refresh tokens rotate, which makes the attack a lot harder, as described just above. | ||
|
|
||
| Since all clients are "public" in the Matrix world, there are no client-bound credentials that could be used, hence the rotation of refresh tokens. | ||
|
|
||
| --- | ||
|
|
||
| The other kind of scenario where this change makes sense is easing future changes in homeservers. | ||
| A good, recent example is Synapse v1.34.0, which [moved away from macaroons for access tokens](https://github.com/matrix-org/synapse/pull/5588) to shorter, random tokens saved in the database, similar to [what GitHub did recently](https://github.blog/2021-04-05-behind-githubs-new-authentication-token-formats/). | ||
|
|
||
| Because there is no refresh token mechanism in the C2S API, most Synapse instances now have a mix of the two token formats, and will for a long time. | ||
| This makes it impossible to enforce the new token format without invalidating all existing sessions, which in turn rules out changes such as a web-application firewall in front of Synapse that verifies the shape and checksum of tokens before they even reach Synapse. | ||
|
|
||
| --- | ||
|
|
||
| Lastly, expiring tokens already exist in Synapse (via the `session_lifetime` configuration parameter). | ||
| Before this MSC, clients had no idea when the session would end and relied on the server replying with a 401 error with `soft_logout: true` in the response on a random request to trigger a soft logout and go through the authentication process again. | ||
| A side effect of this MSC (although it could have been introduced separately) is that login responses can now include an `expires_in_ms` field to inform clients when the token will expire. | ||
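With `expires_in_ms` available, a client can refresh ahead of time instead of waiting for a 401. The helper below is a minimal sketch: its name, the millisecond timestamps and the 10-second safety margin are all assumptions, not something this MSC prescribes.

```python
from typing import Optional


def refresh_deadline(
    issued_at_ms: int, expires_in_ms: Optional[int], margin_ms: int = 10_000
) -> Optional[int]:
    """Return the timestamp (ms) at which to refresh, or None if never."""
    if expires_in_ms is None:
        # No lifetime advertised: the token is assumed not to expire.
        return None
    # Refresh slightly early, but never before the token was issued.
    return issued_at_ms + max(expires_in_ms - margin_ms, 0)
```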