Skip to content

[WIP] Enhanced matrix bridge. #723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 18, 2022
Merged

Conversation

ro-i
Copy link
Contributor

@ro-i ro-i commented Sep 21, 2021

Hi :)

As discussed here, I'm now creating the pull request for the new Matrix <-> Zulip bridge.
See here for further discussions about uploading an in-memory file-object to a Zulip server.

Feedback would be greatly appreciated! :)

@rht
Copy link
Contributor

rht commented Sep 21, 2021

Also reminder to update https://zulip.com/integrations/doc/matrix once this PR is merged.

@ro-i ro-i force-pushed the matrix-multiple-bridges branch from be900bb to 0c91e8e Compare September 22, 2021 07:59
@rht
Copy link
Contributor

rht commented Sep 23, 2021

There are some Mypy errors in the CI.

@ro-i
Copy link
Contributor Author

ro-i commented Sep 23, 2021

Yeah, I'm about to fix them. Weirdly, I cannot reproduce them with tools/lint or tools/run-mypy...

@ro-i ro-i force-pushed the matrix-multiple-bridges branch 2 times, most recently from 190a7da to 2a06c70 Compare September 28, 2021 22:34
@ro-i
Copy link
Contributor Author

ro-i commented Sep 28, 2021

There are some Mypy errors in the CI.

static analysis is fixed now - I didn't realize that we are stuck to python 3.6 👀

@ro-i ro-i force-pushed the matrix-multiple-bridges branch 7 times, most recently from 67741ba to c4e9bac Compare September 29, 2021 20:49
@ro-i ro-i force-pushed the matrix-multiple-bridges branch from 6b48c3f to c36bf7e Compare July 28, 2022 06:48
@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

Hi :) I'm sorry that I forgot to update this. 🙈 This semester, I have been rather busy (after august 15, which is the deadline of my bachelor's thesis, this will improve :) ). However, I would also love to get this finally merged! 🎉

Have some of you tried the new bridge for yourselves? Because the problem I was still struggling with has been the CPU load. When the bridge should be idling (at least in my opinion), it actually has a CPU load of 10% to 13% (observed using top with an interval of 1s). Did you experience this, too? I once asked about this on the nio matrix channel (on October 1, 2021 actually ^^). At this time, the CPU load was even higher (~30%). But I didn't manage to resolve this. I'm not sure why the CPU load has improved. However, I still think that it is too much. What do you think?

@rht
Copy link
Contributor

rht commented Jul 28, 2022

I once asked about this on the nio matrix channel (on October 1, 2021 actually ^^)

What did they say about it? If it is nio's perf problem, there is nothing we can do about it, and shouldn't be a blocker to this PR getting merged.

@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

They told me to use a profiler, which I did:
Screenshot from 2022-07-28 09-57-35

Then they said nothing more 🤔

@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

However, my code changed since then. So maybe I should give it another try?

@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

Just to be sure that I did not mess up with asyncio 🤔

@rht
Copy link
Contributor

rht commented Jul 28, 2022

More effective to use this lib instead: https://pypi.org/project/line-profiler/ gives line-by-line profile, and has been used in investigating Zulip code, e.g. the Markdown parsing.

@ro-i ro-i force-pushed the matrix-multiple-bridges branch from c36bf7e to d07efc7 Compare July 28, 2022 09:34
@ro-i ro-i force-pushed the matrix-multiple-bridges branch from d07efc7 to 251a558 Compare July 28, 2022 10:51
@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

Hm, that sounds cool, thanks! However, I seem to not be able to get it to work:

$ kernprof -lv matrix_bridge.py -c ~/matrix_bridge.conf 
Starting Zulip <-> Matrix mirroring bot
Starting message handler on Matrix client
Starting message handler on Zulip client
^C

kernprof does not generate any output. Do I overlook something?

@rht
Copy link
Contributor

rht commented Jul 28, 2022

You have to decorate the function you want to profile with @profile, and then run this

kp () {
	kernprof -v -l $1 > profile.txt
}
kp your_python_file.py

@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

I did, but it didn't work. Probably because I need to terminate the script externally or per exception... Or the profiler cannot deal with async code?

@ro-i
Copy link
Contributor Author

ro-i commented Jul 28, 2022

Hm, and somehow, the requirements of the matrix bridge from requirements.txt do not get installed during the setup phase 🤔 (That is why the tests fail...)

@rht
Copy link
Contributor

rht commented Jul 29, 2022

Async should work rkern/line_profiler#103.

@rht
Copy link
Contributor

rht commented Jul 29, 2022

I think it's fine to add matrix-nio in https://github.com/zulip/python-zulip-api/blob/main/requirements.txt.

@rht
Copy link
Contributor

rht commented Jul 29, 2022

That requirements.txt is for dev-purpose only.

@ro-i ro-i force-pushed the matrix-multiple-bridges branch from 251a558 to 12b5e7a Compare July 29, 2022 07:53
@ro-i
Copy link
Contributor Author

ro-i commented Jul 29, 2022

Async should work rkern/line_profiler#103.

I got it working. I just needed to make sure that I can trigger the termination without exit (so that the main function just returns). I did not yet see any obvious issues, though...

Total time: 0.001073 s
File: ./matrix_bridge.py
Function: run at line 208

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   208                                               @profile
   209                                               async def run(self) -> None:
   210         1         16.0     16.0      1.5          print("Starting message handler on Matrix client")
   211                                           
   212                                                   # Set up event callback.
   213         1          8.0      8.0      0.7          self.matrix_client.add_event_callback(self._matrix_to_zulip, nio.Event)
   214                                           
   215         1       1049.0   1049.0     97.8          await self.matrix_client.sync_forever()

Total time: 0.001328 s
File: ./matrix_bridge.py
Function: run at line 394

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   394                                               @profile
   395                                               async def run(self) -> None:
   396         1          9.0      9.0      0.7          print("Starting message handler on Zulip client")
   397                                           
   398         1          2.0      2.0      0.2          self.loop = asyncio.get_event_loop()
   399                                           
   400         2        177.0     88.5     13.3          with ThreadPoolExecutor() as executor:
   401         2       1138.0    569.0     85.7              await asyncio.get_event_loop().run_in_executor(
   402         1          2.0      2.0      0.2                  executor, self.zulip_client.call_on_each_message, self._zulip_to_matrix
   403                                                       )

@ro-i
Copy link
Contributor Author

ro-i commented Jul 29, 2022

Total time: 0.124969 s
File: ./matrix_bridge.py
Function: _matrix_to_zulip at line 95

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    95                                               @profile
    96                                               async def _matrix_to_zulip(self, room: nio.MatrixRoom, event: nio.Event) -> None:
    97         2         66.0     33.0      0.1          logging.debug("_matrix_to_zulip; room %s, event: %s" % (str(room.room_id), str(event)))
    98                                           
    99                                                   # We do this to identify the messages generated from Zulip -> Matrix
   100                                                   # and we make sure we don't forward it again to the Zulip stream.
   101         2          9.0      4.5      0.0          if event.sender == self.matrix_config["mxid"]:
   102         1          1.0      1.0      0.0              return
   103                                           
   104         1          1.0      1.0      0.0          if room.room_id not in self.matrix_config["bridges"]:
   105                                                       return
   106         1          1.0      1.0      0.0          stream, topic = self.matrix_config["bridges"][room.room_id]
   107                                           
   108         1         19.0     19.0      0.0          content: Optional[str] = await self.get_message_content_from_event(event, room)
   109         1          1.0      1.0      0.0          if not content:
   110                                                       return
   111                                           
   112         1          1.0      1.0      0.0          try:
   113         2     123569.0  61784.5     98.9              result: Dict[str, Any] = self.zulip_client.send_message(
   114         1          1.0      1.0      0.0                  {
   115         1          1.0      1.0      0.0                      "type": "stream",
   116         1          0.0      0.0      0.0                      "to": stream,
   117         1          0.0      0.0      0.0                      "subject": topic,
   118         1          1.0      1.0      0.0                      "content": content,
   119                                                           }
   120                                                       )
   121                                                   except Exception as exception:
   122                                                       # Generally raised when user is forbidden
   123                                                       raise Bridge_FatalZulipException(exception)
   124         1          2.0      2.0      0.0          if result["result"] != "success":
   125                                                       # Generally raised when API key is invalid
   126                                                       raise Bridge_FatalZulipException(result["msg"])
   127                                           
   128                                                   # Update the bot's read marker in order to show the other users which
   129                                                   # messages are already processed by the bot.
   130         2       1294.0    647.0      1.0          await self.matrix_client.room_read_markers(
   131         1          2.0      2.0      0.0              room.room_id, fully_read_event=event.event_id, read_event=event.event_id
   132                                                   )

Total time: 0.124933 s
File: ./matrix_bridge.py
Function: _zulip_to_matrix at line 258

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   258                                               @profile
   259                                               def _zulip_to_matrix(self, msg: Dict[str, Any]) -> None:
   260         2         74.0     37.0      0.1          logging.debug("_zulip_to_matrix; msg: %s" % (str(msg),))
   261                                           
   262         2          4.0      2.0      0.0          if msg["content"] == "exit":
   263         1          3.0      3.0      0.0              raise Bridge_FatalZulipException()
   264                                           
   265         1          4.0      4.0      0.0          room_id: Optional[str] = self.get_matrix_room_for_zulip_message(msg)
   266         1          1.0      1.0      0.0          if room_id is None:
   267                                                       return
   268                                           
   269         1          1.0      1.0      0.0          sender: str = msg["sender_full_name"]
   270         2          5.0      2.5      0.0          content: str = MATRIX_MESSAGE_TEMPLATE.format(
   271         1          1.0      1.0      0.0              username=sender, uid=msg["sender_id"], message=msg["content"]
   272                                                   )
   273                                           
   274                                                   # Forward Zulip message to Matrix.
   275         2     123735.0  61867.5     99.0          self._matrix_send(
   276         1          0.0      0.0      0.0              room_id=room_id,
   277         1          0.0      0.0      0.0              message_type="m.room.message",
   278         1          1.0      1.0      0.0              content={"msgtype": "m.text", "body": content},
   279                                                   )
   280                                           
   281                                                   # Get embedded files.
   282         3        167.0     55.7      0.1          files_to_send, media_success = asyncio.run_coroutine_threadsafe(
   283         1         28.0     28.0      0.0              self.handle_media(msg["content"]), self.loop
   284         1        895.0    895.0      0.7          ).result()
   285                                           
   286         1          7.0      7.0      0.0          if files_to_send:
   287                                                       self._matrix_send(
   288                                                           room_id=room_id,
   289                                                           message_type="m.room.message",
   290                                                           content={"msgtype": "m.text", "body": "This message contains the following files:"},
   291                                                       )
   292                                                       for file in files_to_send:
   293                                                           self._matrix_send(room_id=room_id, message_type="m.room.message", content=file)
   294         1          7.0      7.0      0.0          if not media_success:
   295                                                       self._matrix_send(
   296                                                           room_id=room_id,
   297                                                           message_type="m.room.message",
   298                                                           content={
   299                                                               "msgtype": "m.text",
   300                                                               "body": "This message contained some files which could not be forwarded.",
   301                                                           },
   302                                                       )

@rht
Copy link
Contributor

rht commented Jul 29, 2022

I suppose zulip_client.send and _matrix_send are slow because of the network request. And it's unlikely that they contribute to the CPU load. Protip: you can also add @profile in the installed site-packages files of matrix-nio itself if you want to get the line profiling.

Anyway, I think we should discuss about the performance elsewhere, so that we can focus on getting this PR merged.

Everything else other than this comment (the message might be less concise with the extra ID, but I have no strong opinion on this) LGTM.

@ro-i
Copy link
Contributor Author

ro-i commented Jul 29, 2022

Ok :) I think it would be good to have something unique to identify the users from both sides. The matrix id, which is shown on the Zulip side, is of course a bit nicer than the Zulip user id shown on the Matrix side. But the internal Zulip email address wouldn't make much sense either, so I think it's not bad :)

@timabbott
Copy link
Member

This looks great, merged, thanks @ro-i and @rht!

I think we can wait for feedback from users on the precise formatting; it seems possible it's overly verbose in a way that's annoying, but I'm not really sure. Perhaps you can post some screenshots in #integrations to get more eyes on it.

(The CI is an infrastructure failure on a Windows Actions node).

The one thing that seems like a natural follow-up to me is that it'd probably be better if the Matrix bridge were a separate zulip-bridge-matrix package. It'd still be in this codebase, and built as part of the same source bundle, just be another output, like the zulip_botsserver package.

@timabbott timabbott merged commit 72ef52d into zulip:main Aug 18, 2022
andersk added a commit to andersk/python-zulip-api that referenced this pull request Aug 23, 2022
This reverts commit 72ef52d (zulip#723).

The test failure on Windows will need to be debugged before this can
be re-merged.

Signed-off-by: Anders Kaseorg <[email protected]>
@andersk
Copy link
Member

andersk commented Aug 23, 2022

(The CI is an infrastructure failure on a Windows Actions node).

Unfortunately, this was not just an infrastructure failure. I’ve had to revert the last commit in #763 to restore passing CI.

@ro-i When you get a chance, please open a new pull request restoring this commit. The failure on Windows will need to be debugged before it can be merged.

@ro-i
Copy link
Contributor Author

ro-i commented Aug 24, 2022

Will do as discussed here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants