|  | 
|  | 1 | +# MSC3030: Jump to date API endpoint | 
|  | 2 | + | 
|  | 3 | +Add an API that makes it easy to find the closest messages for a given | 
|  | 4 | +timestamp. | 
|  | 5 | + | 
|  | 6 | +The goal of this change is to have clients be able to implement a jump to date | 
|  | 7 | +feature in order to see messages back at a given point in time. Pick a date from | 
|  | 8 | +a calendar, heatmap, or paginate next/previous between days and view all of the | 
|  | 9 | +messages that were sent on that date. | 
|  | 10 | + | 
|  | 11 | +Alongside the [roadmap of feature parity with | 
|  | 12 | +Gitter](https://github.com/vector-im/roadmap/issues/26), we're also interested | 
|  | 13 | +in using this for a new better static Matrix archive. Our idea is to server-side | 
|  | 14 | +render [Hydrogen](https://github.com/vector-im/hydrogen-web) and this new | 
|  | 15 | +endpoint would allow us to jump back on the fly without having to paginate and | 
|  | 16 | +keep track of everything in order to display the selected date. | 
|  | 17 | + | 
|  | 18 | +This is also useful for archiving and backup use cases. This new endpoint can be | 
|  | 19 | +used to slice the messages by day and persist them to a file. | 
|  | 20 | + | 
|  | 21 | +Related issue: [*URL for an arbitrary day of history and navigation for next and | 
|  | 22 | +previous days* | 
|  | 23 | +(vector-im/element-web#7677)](https://github.com/vector-im/element-web/issues/7677) | 
|  | 24 | + | 
|  | 25 | + | 
|  | 26 | +## Problem | 
|  | 27 | + | 
|  | 28 | +These types of use cases are not supported by the current Matrix API because it | 
|  | 29 | +has no way to fetch or filter older messages besides a manual brute force | 
|  | 30 | +pagination from the most recent event in the room. Paginating this way is | 
|  | 31 | +time-consuming, and processing every event along the way is expensive (not practical for clients). | 
|  | 32 | +Imagine wanting to get a message from 3 years ago 😫 | 
|  | 33 | + | 
|  | 34 | + | 
|  | 35 | +## Proposal | 
|  | 36 | + | 
|  | 37 | +Add new client API endpoint `GET | 
|  | 38 | +/_matrix/client/v1/rooms/{roomId}/timestamp_to_event?ts=<timestamp>&dir=[f|b]` | 
|  | 39 | +which fetches the closest `event_id` to the given timestamp `ts` query parameter | 
|  | 40 | +in the direction specified by the `dir` query parameter. The direction `dir` | 
|  | 41 | +query parameter accepts `f` for forward-in-time from the timestamp and `b` for | 
|  | 42 | +backward-in-time from the timestamp. This endpoint also returns | 
|  | 43 | +`origin_server_ts` to make it easy to do a quick comparison to see if the | 
|  | 44 | +`event_id` fetched is too far out of range to be useful for your use case. | 
|  | 45 | + | 
|  | 46 | +When an event can't be found in the given direction, the endpoint returns a 404 | 
|  | 47 | +with `"errcode":"M_NOT_FOUND"` (example error message `"error":"Unable to find event | 
|  | 48 | +from 1672531200000 in direction f"`). | 
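
As a rough illustration of the client request described above, here is a minimal Python sketch that only builds the request URL and measures how far the returned event is from the requested timestamp. The function names and the `https://hs.example` base URL are hypothetical and not part of this proposal; no network call is made.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def timestamp_to_event_url(base_url: str, room_id: str, ts_ms: int, direction: str) -> str:
    """Build the client URL for /timestamp_to_event (hypothetical helper)."""
    if direction not in ("f", "b"):
        raise ValueError("dir must be 'f' (forward) or 'b' (backward)")
    # Room IDs contain characters such as '!' and ':' that must be percent-encoded.
    path = f"/_matrix/client/v1/rooms/{quote(room_id, safe='')}/timestamp_to_event"
    query = urlencode({"ts": ts_ms, "dir": direction})
    return f"{base_url.rstrip('/')}{path}?{query}"


def distance_ms(response: dict, ts_ms: int) -> int:
    """How far the returned event's origin_server_ts is from the requested ts."""
    return abs(response["origin_server_ts"] - ts_ms)
```

A caller could compare `distance_ms(...)` against whatever threshold suits its use case before deciding that the returned `event_id` is close enough to be useful.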
|  | 49 | + | 
|  | 50 | +In order to solve the problem where a homeserver does not have all of the history in a | 
|  | 51 | +room and no suitably close event, we also add a server API endpoint `GET | 
|  | 52 | +/_matrix/federation/v1/timestamp_to_event/{roomId}?ts=<timestamp>&dir=[f|b]` which other | 
|  | 53 | +homeservers can use to ask about their closest `event_id` to the timestamp. This | 
|  | 54 | +endpoint also returns `origin_server_ts` to make it easy to do a quick comparison to see | 
|  | 55 | +if the remote `event_id` fetched is closer than the local one. After the local | 
|  | 56 | +homeserver receives a response from the federation endpoint, it probably should | 
|  | 57 | +try to backfill this event via the federation `/event/<event_id>` endpoint so that it's | 
|  | 58 | +available to query with `/context` from a client in order to get a pagination token. | 
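
The "quick comparison" between a local and a remote result could look something like the following sketch. It assumes both results are plain dictionaries shaped like the endpoint response, and it prefers the local event on a tie; both the helper name and the tie-breaking rule are assumptions for illustration, not something this proposal specifies.

```python
from typing import Optional


def pick_closer(local: Optional[dict], remote: Optional[dict], ts_ms: int) -> Optional[dict]:
    """Return whichever result has origin_server_ts closest to ts_ms.

    Either argument may be None when that side found nothing.
    On a tie the local event wins (an arbitrary choice for this sketch).
    """
    candidates = [event for event in (local, remote) if event is not None]
    if not candidates:
        return None
    # min() returns the first minimal item, so listing `local` first breaks ties locally.
    return min(candidates, key=lambda event: abs(event["origin_server_ts"] - ts_ms))
```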
|  | 59 | + | 
|  | 60 | +The heuristics for deciding when to ask another homeserver for a closer event, if | 
|  | 61 | +your homeserver doesn't have something close, are left up to the homeserver | 
|  | 62 | +implementation, although the heuristics will probably be based on whether the | 
|  | 63 | +closest event is a forward/backward extremity indicating it's next to a gap of | 
|  | 64 | +events which are potentially closer. | 
|  | 65 | + | 
|  | 66 | +A good heuristic for which servers to try first is to sort by servers that have | 
|  | 67 | +been in the room the longest because they're most likely to have anything we ask | 
|  | 68 | +about. | 
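
One way to express the "longest in the room first" ordering is sketched below. It assumes the homeserver can already produce a mapping from server name to the timestamp at which that server first joined the room; how that mapping is obtained is implementation-specific and not covered here, and the function name is hypothetical.

```python
from typing import Dict, List


def order_servers_by_tenure(first_joined_ms: Dict[str, int], own_server: str) -> List[str]:
    """Sort remote servers so that the ones in the room longest come first.

    `first_joined_ms` maps server name -> timestamp (ms) of its earliest join,
    which is an assumed input for this sketch.
    """
    remotes = [server for server in first_joined_ms if server != own_server]
    return sorted(remotes, key=lambda server: first_joined_ms[server])
```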
|  | 69 | + | 
|  | 70 | +These endpoints are authenticated and should be rate-limited like similar client | 
|  | 71 | +and federation endpoints to prevent resource exhaustion abuse. | 
|  | 72 | + | 
|  | 73 | +``` | 
|  | 74 | +GET /_matrix/client/v1/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction> | 
|  | 75 | +{ | 
|  | 76 | +    "event_id": ... | 
|  | 77 | +    "origin_server_ts": ... | 
|  | 78 | +} | 
|  | 79 | +``` | 
|  | 80 | + | 
|  | 81 | +Federation API endpoint: | 
|  | 82 | +``` | 
|  | 83 | +GET /_matrix/federation/v1/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction> | 
|  | 84 | +{ | 
|  | 85 | +    "event_id": ... | 
|  | 86 | +    "origin_server_ts": ... | 
|  | 87 | +} | 
|  | 88 | +``` | 
|  | 89 | + | 
|  | 90 | +--- | 
|  | 91 | + | 
|  | 92 | +In order to paginate `/messages`, we need a pagination token which we can get | 
|  | 93 | +using `GET /_matrix/client/r0/rooms/{roomId}/context/{eventId}?limit=0` for the | 
|  | 94 | +`event_id` returned by `/timestamp_to_event`. | 
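
The two-step flow can be sketched as follows. To keep the example self-contained, the HTTP layer is passed in as a `get_json(path, params)` callable, which is a hypothetical stand-in for whatever authenticated HTTP client is in use. The `start` and `end` fields read from the `/context` response are the pagination tokens defined by the existing client-server API.

```python
from typing import Callable, Tuple
from urllib.parse import quote


def pagination_tokens_for_timestamp(
    get_json: Callable[[str, dict], dict],
    room_id: str,
    ts_ms: int,
    direction: str,
) -> Tuple[str, str, str]:
    """Resolve a timestamp to (event_id, start_token, end_token).

    `get_json` is an assumed helper that performs an authenticated GET
    and returns the decoded JSON body.
    """
    room = quote(room_id, safe="")
    # Step 1: find the closest event to the timestamp in the given direction.
    found = get_json(
        f"/_matrix/client/v1/rooms/{room}/timestamp_to_event",
        {"ts": ts_ms, "dir": direction},
    )
    event = quote(found["event_id"], safe="")
    # Step 2: ask /context for that event with limit=0 to obtain pagination tokens only.
    context = get_json(f"/_matrix/client/r0/rooms/{room}/context/{event}", {"limit": 0})
    return found["event_id"], context["start"], context["end"]
```

The returned tokens can then be passed as `from` to `/messages` to paginate in either direction from the event.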
|  | 95 | + | 
|  | 96 | +We can always iterate on `/timestamp_to_event` later and return a pagination | 
|  | 97 | +token directly in another MSC ⏩ | 
|  | 98 | + | 
|  | 99 | + | 
|  | 100 | +## Potential issues | 
|  | 101 | + | 
|  | 102 | +### Receiving a rogue random delayed event ID | 
|  | 103 | + | 
|  | 104 | +Since `origin_server_ts` is not enforceably accurate, we can only hope that an event's | 
|  | 105 | +`origin_server_ts` is relevant enough to its `prev_events` and descendants. | 
|  | 106 | + | 
|  | 107 | +If you ask for "the message with `origin_server_ts` closest to Jan 1st 2018" you | 
|  | 108 | +might actually get a rogue random delayed one that was backfilled from a | 
|  | 109 | +federated server, but the human can figure that out by trying again with a | 
|  | 110 | +slight variation on the date. | 
|  | 111 | + | 
|  | 112 | +Since there isn't a good or fool-proof way to combat this, it's probably best to just go | 
|  | 113 | +with `origin_server_ts` and not let perfect be the enemy of good. | 
|  | 114 | + | 
|  | 115 | + | 
|  | 116 | +### Receiving an unrenderable event ID | 
|  | 117 | + | 
|  | 118 | +Another issue is that clients could land on an event they can't/won't render, | 
|  | 119 | +such as a reaction, then they'll be forced to desperately seek around the | 
|  | 120 | +timeline until they find an event they can do something with. | 
|  | 121 | + | 
|  | 122 | +Eg: | 
|  | 123 | + - Client wants to jump to January 1st, 2022 | 
|  | 124 | + - Server says there's an event on January 2nd, 2022 that is close enough | 
|  | 125 | + - Client finds out there's a ton of unrenderable events like memberships, poll responses, reactions, etc at that time | 
|  | 126 | + - Client starts paginating forwards, finally finding an event on January 27th it can render | 
|  | 127 | + - Client wasn't aware that the actual nearest neighbouring event was backwards on December 28th, 2021 because it didn't paginate in that direction | 
|  | 128 | + - User is confused that they are a month past the target date when the message is *right there*. | 
|  | 129 | + | 
|  | 130 | +Clients can be smarter here though. Clients can see when events were sent as | 
|  | 131 | +they paginate and if they see they're going more than a couple days out, they | 
|  | 132 | +can also try the other direction before going further and further away. | 
|  | 133 | + | 
|  | 134 | +Clients can also just explain to the user what happened with a little toast: "We | 
|  | 135 | +were unable to find an event to display on January 1st, 2022. The closest event | 
|  | 136 | +after that date is on January 27th." | 
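
A client-side version of the "try the other direction" idea might look like this sketch. The two-day drift threshold mirrors the "more than a couple days out" wording above but is otherwise an arbitrary assumption, as are the function names and the choice to prefer the forward event on a tie.

```python
from typing import Optional, Tuple

DAY_MS = 24 * 60 * 60 * 1000


def drifted_too_far(target_ts: int, event_ts: int, max_drift_ms: int = 2 * DAY_MS) -> bool:
    """True when pagination has moved further from the target than the allowed drift."""
    return abs(event_ts - target_ts) > max_drift_ms


def nearest_renderable(
    target_ts: int,
    forward_ts: Optional[int],
    backward_ts: Optional[int],
) -> Optional[Tuple[str, int]]:
    """Pick the renderable event nearest the target from the two directions.

    Each argument is the timestamp of the first renderable event found in that
    direction, or None if none was found. Ties go to the forward event.
    """
    options = []
    if forward_ts is not None:
        options.append(("f", forward_ts))
    if backward_ts is not None:
        options.append(("b", backward_ts))
    if not options:
        return None
    return min(options, key=lambda option: abs(option[1] - target_ts))
```

With the dates from the example above (target January 1st, forward hit January 27th, backward hit December 28th), the forward hit counts as drifted and the backward event is selected.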
|  | 137 | + | 
|  | 138 | + | 
|  | 139 | +### Abusing the `/timestamp_to_event` API to get the `m.room.create` event  | 
|  | 140 | + | 
|  | 141 | +Although it's possible to jump to the start of the room and get the first event in the | 
|  | 142 | +room (`m.room.create`) with `/timestamp_to_event?dir=f&ts=0`, clients should still use | 
|  | 143 | +`GET /_matrix/client/v3/rooms/{roomId}/state/m.room.create/` to get the room creation | 
|  | 144 | +event. | 
|  | 145 | + | 
|  | 146 | +In the future, with things like importing history via | 
|  | 147 | +[MSC2716](https://github.com/matrix-org/matrix-spec-proposals/pull/2716), the first | 
|  | 148 | +event you encounter with `/timestamp_to_event?dir=f&ts=0` could be an imported event before | 
|  | 149 | +the room was created. | 
|  | 150 | + | 
|  | 151 | + | 
|  | 152 | +## Alternatives | 
|  | 153 | + | 
|  | 154 | +We chose the current `/timestamp_to_event` route because it sounded like the | 
|  | 155 | +easiest path forward to bring it to fruition and get some real-world experience. | 
|  | 156 | +It was also on our mind during the [initial discussion](https://docs.google.com/document/d/1KCEmpnGr4J-I8EeaVQ8QJZKBDu53ViI7V62y5BzfXr0/edit#bookmark=id.qu9k9wje9pxm) because there was some prior art with a [WIP | 
|  | 157 | +implementation](https://github.com/matrix-org/synapse/pull/9445/commits/91b1b3606c9fb9eede0a6963bc42dfb70635449f) | 
|  | 158 | +from @erikjohnston. The alternatives haven't been thrown out for a particular | 
|  | 159 | +reason and we could still go down those routes depending on how people like the | 
|  | 160 | +current design. | 
|  | 161 | + | 
|  | 162 | + | 
|  | 163 | +### Paginate `/messages?around=<timestamp>` from timestamp | 
|  | 164 | + | 
|  | 165 | +Add the `?around=<timestamp>` query parameter to the `GET | 
|  | 166 | +/_matrix/client/r0/rooms/{roomId}/messages` endpoint. This will start the | 
|  | 167 | +response at the message with `origin_server_ts` closest to the provided `around` | 
|  | 168 | +timestamp. The direction is determined by the existing `?dir` query parameter. | 
|  | 169 | + | 
|  | 170 | +Use topological ordering, just as Element would use if you follow a permalink. | 
|  | 171 | + | 
|  | 172 | +This alternative could be confusing to the end-user around how this plays with | 
|  | 173 | +the existing query parameters | 
|  | 174 | +`/messages?from={paginationToken}&to={paginationToken}` which also determine | 
|  | 175 | +what part of the timeline to query. Those parameters could be extended to accept | 
|  | 176 | +timestamps in addition to pagination tokens but then could get confusing again | 
|  | 177 | +when you start mixing timestamps and pagination tokens. The homeserver also has | 
|  | 178 | +to disambiguate what a pagination token looks like vs a unix timestamp. Since | 
|  | 179 | +pagination tokens don't follow a certain convention, some homeserver | 
|  | 180 | +implementations may already be using arbitrary number tokens, which would | 
|  | 181 | +be impossible to distinguish from a timestamp. | 
|  | 182 | + | 
|  | 183 | +A related alternative is to use `/messages` with `from_time`/`to_time` (or | 
|  | 184 | +`from_ts`/`to_ts`) query parameters that only accept timestamps, which solves the | 
|  | 185 | +confusion and disambiguation problem of trying to re-use the existing `from`/`to` | 
|  | 186 | +query parameters. Re-using `/messages` would reduce the number of round-trips and | 
|  | 187 | +potentially client-side implementations for the use case where you want to fetch | 
|  | 188 | +a window of messages from a given time. However, it has the same round-trip problem if | 
|  | 189 | +you want to use the returned `event_id` with `/context` or another endpoint | 
|  | 190 | +instead. | 
|  | 191 | + | 
|  | 192 | + | 
|  | 193 | +### Filter by date in `RoomEventFilter` | 
|  | 194 | + | 
|  | 195 | +Extend `RoomEventFilter` to be able to specify a timestamp or a date range. The | 
|  | 196 | +`RoomEventFilter` can be passed via the `?filter` query param on the `/messages` | 
|  | 197 | +endpoint. | 
|  | 198 | + | 
|  | 199 | +This suffers from the same end-user confusion about how it plays with | 
|  | 200 | +`/messages?from={paginationToken}&to={paginationToken}`, which | 
|  | 201 | +also determines what part of the timeline to query. | 
|  | 202 | + | 
|  | 203 | + | 
|  | 204 | +### Return the closest event in any direction | 
|  | 205 | + | 
|  | 206 | +We considered omitting the `dir` parameter (or allowing `dir=c`) to have the server | 
|  | 207 | +return the closest event to the timestamp, regardless of direction. However, this seems | 
|  | 208 | +to offer little benefit. | 
|  | 209 | + | 
|  | 210 | +Firstly, for some use cases (such as archive viewing, where we want to show all the | 
|  | 211 | +messages that happened on a particular day), an explicit direction is important, so this | 
|  | 212 | +would have to be optional behaviour. | 
|  | 213 | + | 
|  | 214 | +For a regular messaging client, "directionless" search also offers little benefit: it is | 
|  | 215 | +easy for the client to repeat the request in the other direction if the returned event | 
|  | 216 | +is "too far away", and in any case it needs to manage an iterative search to handle | 
|  | 217 | +unrenderable events, as discussed above. | 
|  | 218 | + | 
|  | 219 | +Implementing a directionless search on the server carries a performance overhead, since | 
|  | 220 | +it must search both forwards and backwards on every request. In short, there is little | 
|  | 221 | +reason to expect that a single `dir=c` request would be any more efficient than a pair of | 
|  | 222 | +requests with `dir=b` and `dir=f`. | 
|  | 223 | + | 
|  | 224 | +### New `destination_server_ts` field | 
|  | 225 | + | 
|  | 226 | +Add a new field and index on messages called `destination_server_ts` which | 
|  | 227 | +indicates when the message was received from federation. This gives a more | 
|  | 228 | +"real" time for how someone would actually consume those messages. | 
|  | 229 | + | 
|  | 230 | +The contract of the API is "show me messages my server received at time T" | 
|  | 231 | +rather than the messy confusion of showing a delayed message which happened to | 
|  | 232 | +originally be sent at time T. | 
|  | 233 | + | 
|  | 234 | +We've decided against this approach because the backfill from federated servers | 
|  | 235 | +could be horribly late. | 
|  | 236 | + | 
|  | 237 | +--- | 
|  | 238 | + | 
|  | 239 | +Related issue around `/sync` vs `/messages`, | 
|  | 240 | +https://github.com/matrix-org/synapse/issues/7164 | 
|  | 241 | + | 
|  | 242 | +> Sync returns things in the order they arrive at the server; backfill returns | 
|  | 243 | +> them in the order determined by the event graph. | 
|  | 244 | +> | 
|  | 245 | +> *-- @richvdh, https://github.com/matrix-org/synapse/issues/7164#issuecomment-605877176* | 
|  | 246 | + | 
|  | 247 | +> The general idea is that, if you're following a room in real-time (ie, | 
|  | 248 | +> `/sync`), you probably want to see the messages as they arrive at your server, | 
|  | 249 | +> rather than skipping any that arrived late; whereas if you're looking at a | 
|  | 250 | +> historical section of timeline (ie, `/messages`), you want to see the best | 
|  | 251 | +> representation of the state of the room as others were seeing it at the time. | 
|  | 252 | +> | 
|  | 253 | +> *-- @richvdh , https://github.com/matrix-org/synapse/issues/7164#issuecomment-605953296* | 
|  | 254 | + | 
|  | 255 | + | 
|  | 256 | +## Security considerations | 
|  | 257 | + | 
|  | 258 | +We're only going to expose messages according to the existing message history | 
|  | 259 | +setting in the room (`m.room.history_visibility`). No extra data is exposed, | 
|  | 260 | +just a new way to sort through it all. | 
|  | 261 | + | 
|  | 262 | + | 
|  | 263 | + | 
|  | 264 | +## Unstable prefix | 
|  | 265 | + | 
|  | 266 | +While this MSC is not considered stable, the endpoints are available at `/unstable/org.matrix.msc3030` instead of their `/v1` description from above. | 
|  | 267 | + | 
|  | 268 | +``` | 
|  | 269 | +GET /_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction> | 
|  | 270 | +{ | 
|  | 271 | +    "event_id": ... | 
|  | 272 | +    "origin_server_ts": ... | 
|  | 273 | +} | 
|  | 274 | +``` | 
|  | 275 | + | 
|  | 276 | +``` | 
|  | 277 | +GET /_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction> | 
|  | 278 | +{ | 
|  | 279 | +    "event_id": ... | 
|  | 280 | +    "origin_server_ts": ... | 
|  | 281 | +} | 
|  | 282 | +``` | 
|  | 283 | + | 
|  | 284 | +Servers will indicate support for the new endpoint via a non-empty value for feature flag | 
|  | 285 | +`org.matrix.msc3030` in `unstable_features` in the response to `GET | 
|  | 286 | +/_matrix/client/versions`. | 
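
A client could check for support before calling the unstable endpoints with something like the sketch below. It treats any truthy value for the flag as "supported", which is one reading of "non-empty value"; the function name is hypothetical and the input is assumed to be the already-decoded JSON body of `GET /_matrix/client/versions`.

```python
def supports_msc3030(versions_response: dict) -> bool:
    """Check the decoded /versions response for the org.matrix.msc3030 feature flag."""
    features = versions_response.get("unstable_features") or {}
    # A missing flag, False, or an empty value all count as "not supported" here.
    return bool(features.get("org.matrix.msc3030"))
```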