Skip to content

model: Introduce data structures for "recent senders criterion" of user-mention autocomplete #692

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

sm-sayedi
Copy link
Collaborator

@sm-sayedi sm-sayedi commented May 20, 2024

This is the second PR in the series of PRs #608 is divided into, coming after #693. This solely focuses on bringing the data structures for the upcoming "recent senders" criterion of @-mention autocomplete. Next PR in the series: #828.

Fixes part of: #228

@sm-sayedi sm-sayedi added the maintainer review PR ready for review by Zulip maintainers label May 20, 2024
Comment on lines 382 to 410
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.handleMessage(message);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reply to your 608 comment:

Good question.

What does the dartdoc on that _messageVisible method say?

Based on that, what would be the consequence of having that condition control whether to make this call? And what would be the consequence of ignoring that condition when making this call?

I concluded that the call should be inside the if condition. We track the messages that are shown in the feed, and if they're not shown, then there's no need to show the users who sent these messages, in the autocomplete results. Thanks for the clue!!

Copy link
Member

@rajveermalviya rajveermalviya May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we look at web impl, recent_senders.process_stream_message in process_new_message doesn't seem to care about topic visibility policy. Not sure if that's intended or a bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good comparison.

Here, _messageVisible says whether the message should appear in this particular message list:

  /// Whether [message] should actually appear in this message list,
  /// given that it does belong to the narrow.
  ///
  /// This depends in particular on whether the message is muted in
  /// one way or another.
  ///
  /// See also [_allMessagesVisible].
  bool _messageVisible(Message message) {

But if the message doesn't appear in this message list, it might still appear in a different one later. A message hidden from the combined feed because the stream is muted will appear if you navigate to that stream; a message hidden because the stream and/or topic is muted will appear if you navigate to the topic.

The recent-senders data structure isn't specific to a particular message list; it's shared across them all (within an account). So let's have the data it gets not be controlled by the message list's _messageVisible.

@gnprice
Copy link
Member

gnprice commented May 22, 2024

Let's start applying the "buddy review" system on this PR, now that the start of the official GSoC period is imminent and it seems like everyone's actively engaged again.

@rajveermalviya, would you do the first review on this PR? Some quick bits of context to note:

Meanwhile we'll proceed with maintainer review for the first PR in this series, #693, because after the previous rounds of review (on #608) that one is further along.

As a reminder since we're just starting to use this system: when you get a buddy review request, please prioritize it right after important bugs and your regressions, ahead of most of your own PRs. (In particular I don't think anyone is currently working on an issue that would come ahead of doing buddy reviews.)

@gnprice gnprice added buddy review GSoC buddy review needed. and removed maintainer review PR ready for review by Zulip maintainers labels May 22, 2024
@gnprice gnprice requested a review from rajveermalviya May 22, 2024 19:23
Copy link
Member

@rajveermalviya rajveermalviya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, otherwise looks great!
Thanks!


/// Extracts and keeps track of the necessary data from a [message] only
/// if it is a stream message.
void handleMessage(Message message) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void handleMessage(Message message) {
void handleMessageEvent(MessageEvent event) {
final message = event.message;

Matches RecentDmConversationsView.handleMessageEvent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would match it with other somewhat similar parts of the codebase, but there are places this method is called where there is no MessageEvent available, such as in model/message_list.dart file. In contrast, RecentDmConversationsView.handleMessageEvent is only called in PerAccountStore.handleEvent where the MessageEvent is available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense.

Comment on lines 73 to 75
/// Whether stream senders and topic senders are both not empty.
@visibleForTesting
bool get debugIsNotEmpty => _streamSenders.isNotEmpty && _topicSenders.isNotEmpty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Whether stream senders and topic senders are both not empty.
@visibleForTesting
bool get debugIsNotEmpty => _streamSenders.isNotEmpty && _topicSenders.isNotEmpty;

remove, it's not used.

Comment on lines 382 to 410
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.handleMessage(message);
}
Copy link
Member

@rajveermalviya rajveermalviya May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we look at web impl, recent_senders.process_stream_message in process_new_message doesn't seem to care about topic visibility policy. Not sure if that's intended or a bug.

@gnprice gnprice added mentor review GSoC mentor review needed. and removed mentor review GSoC mentor review needed. labels May 23, 2024
@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch from 6dac3d8 to 0d28448 Compare May 29, 2024 15:35
@sm-sayedi
Copy link
Collaborator Author

Thanks, @rajveermalviya for the review! Pushed the revision. Please have a look.

@sm-sayedi sm-sayedi requested a review from rajveermalviya May 29, 2024 15:48
Comment on lines 408 to 415
// test('fetching older messages', () async {
// final users = generateUsers(count: 1500);
// final store = eg.store();
// final stream = eg.stream();
// final narrow = StreamNarrow(stream.streamId);
// final subscription = eg.subscription(stream);
// await store.addStream(stream);
// await store.addSubscription(subscription);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnprice I have commented out this test for now as it wouldn't pass. I spent a significant amount of time to diagnose what was the problem, but unfortunately, I wasn't able to figure it out. I would appreciate it if you can have a look at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think fundamentally the issue here is that these tests using await Future(() {}); are fragile.

They're trying to make several different things happen in a certain order, but those things are asynchronous in the code. The await Future(() {}) basically allows each thread of async control flow to advance from one await to the next await… but what that means in terms of actual behavior of the models depends in a very brittle way on the exact details of how the code being tested is written. Small differences in that code will cause things in the test to happen in a different order and break the test.

That's also why the existing tests of this flavor have so many checks like check(done).isFalse() — those are there to make sure that the test's assumptions about what order things happen in are satisfied. That way, if those assumptions are broken by a refactor or by a change to the test, the test fails instead of silently ceasing to successfully test what it's meant to test.


It looks like this test is failing at the check(done).isFalse() here:

      await act(store);

      await Future(() {});
      check(done).isFalse();
      await Future(() {});
      check(done).isTrue();
      expect(view.results);

so IOW the search is done sooner than the test expects. Presumably it's due to some difference between the fetchOlder in this test, and the variations on handleEvent found in the neighboring tests.


The test passes if you make this change:

@@ -293,7 +296,6 @@ void main() {
       view.addListener(() { done = true; });
       view.query = MentionAutocompleteQuery(rawQuery);
 
-      await Future(() {});
       check(done).isFalse();
 
       await act(store);

But in that case is the test still successfully testing what it's supposed to test? I'm not sure what the assumptions are of your doCheck method.

(Probably a chat thread would be a good place to go into further detail.)

Copy link
Member

@rajveermalviya rajveermalviya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from that test, it looks great :)
Thanks!

@sm-sayedi sm-sayedi added mentor review GSoC mentor review needed. and removed buddy review GSoC buddy review needed. labels May 30, 2024
@sm-sayedi sm-sayedi requested a review from hackerkid May 30, 2024 14:37
Copy link
Member

@hackerkid hackerkid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do any review of the code. But approving since this seems to be part of Zulip's this year's GSOC process.

@sm-sayedi sm-sayedi removed the mentor review GSoC mentor review needed. label Jun 3, 2024
@gnprice
Copy link
Member

gnprice commented Jun 4, 2024

@hackerkid Putting this back in your queue after we discussed today how mentor review is meant to work 🙂

(#713 would be a good PR to review first, though — it's smaller, and also this one is stacked atop #693 which is still in progress, so there's no rush for this one to get through review before #693 is ready.)

@gnprice gnprice added the mentor review GSoC mentor review needed. label Jun 4, 2024
@gnprice gnprice requested a review from hackerkid June 4, 2024 02:06
@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch 3 times, most recently from d6d337c to 2fa9e68 Compare June 11, 2024 19:08
@sm-sayedi
Copy link
Collaborator Author

@hackerkid Now that #693 is merged, this PR is now ready for the mentor review!

Comment on lines 26 to 30
// TODO: remove

/// The maximum id in the tracker list.
///
/// Returns -1 if the tracker list is empty.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these comments still relevant?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think they're relevant, specifically the "Returns null if the tracker list is empty" part which will help the reader what to expect.

@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch from ef93c67 to 3f96e33 Compare July 19, 2024 20:00
@sm-sayedi sm-sayedi changed the title autocomplete: Add "recent senders criterion" to user-mention autocomplete autocomplete: Introduce data structures for "recent senders criterion" of user-mention autocomplete Jul 19, 2024
@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch from 3f96e33 to ddcba66 Compare July 19, 2024 20:15
@sm-sayedi
Copy link
Collaborator Author

Thanks @gnprice for the review. Updates pushed. Please have a look.

}
}

class MessageIdTracker {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this class private would cause the following lint message, thus failing the CI:

Invalid use of a private type in a public API.
Try making the private type public, or making the API that uses the private type also be private.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, because the type appears in RecentSenders.streamSenders and RecentSenders.topicSenders.

Still we can keep a @visibleForTesting annotation on this, right? It looks like that went away in the latest revision.

@sm-sayedi sm-sayedi changed the title autocomplete: Introduce data structures for "recent senders criterion" of user-mention autocomplete model: Introduce data structures for "recent senders criterion" of user-mention autocomplete Jul 19, 2024
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! This is very close now. Comments below.

For the remaining comments on the tests, I decided it'd be clearest to demonstrate rather than describe, so I'll push to this branch some added commits that make those changes. Then please leave those in the branch, after your main commits, while acting on the several other comments below. For the two comments that request followup commits, those can go after my test commits.

/// [newIds] should be sorted ascendingly.
void addAll(List<int> newIds) {
if (ids.isEmpty) {
ids = QueueList.from(newIds);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't improve over calling addAll (following up on #692 (comment) ), because it's still copying the list.

This isn't any worse than that either, though, and I guess it makes it more parallel with the setUnion case below — now in both cases we copy the input (and in one case the existing data) into a new QueueList.

To do better would require having this method take a QueueList so that it can actually say ids = newIds;, and having that input list be constructed as a QueueList in the first place. If you'd like to try doing that, let's leave it as a separate followup commit, in the interest of letting the rest of these changes converge toward merging.


/// Add the message ID to the tracker list at the proper place, if not present.
///
/// Optimized, taking O(log n) time for the case where that place is the end,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is accurate now, but I actually did want the O(1) behavior 🙂 (following up on #692 (comment) ).

That can be a followup commit, though, again to help the other changes converge toward merging. I might squash it in, or I might leave it out and make that optimization myself.

Comment on lines 80 to 81
(messagesByUserInStream[(streamId, senderId)] ??= []).add(id);
(messagesByUserInTopic[(streamId, topic, senderId)] ??= []).add(id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is simplified slightly in this revision, but it can be simplified a lot more. There's only one channel/stream ID and one topic in the whole method. So in particular:

Suggested change
(messagesByUserInStream[(streamId, senderId)] ??= []).add(id);
(messagesByUserInTopic[(streamId, topic, senderId)] ??= []).add(id);
(messagesByUser[senderId] ??= []).add(id);

Comment on lines 71 to 74
if (event.messageType != MessageType.stream) return;
final messagesByUserInStream = <(int, int), List<int>>{};
final messagesByUserInTopic = <(int, String, int), List<int>>{};

final DeleteMessageEvent(:streamId!, :topic!) = event;
for (final id in event.messageIds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: group closely-related lines into stanzas:

Suggested change
if (event.messageType != MessageType.stream) return;
final messagesByUserInStream = <(int, int), List<int>>{};
final messagesByUserInTopic = <(int, String, int), List<int>>{};
final DeleteMessageEvent(:streamId!, :topic!) = event;
for (final id in event.messageIds) {
if (event.messageType != MessageType.stream) return;
final DeleteMessageEvent(:streamId!, :topic!) = event;
final messagesByUserInStream = <(int, int), List<int>>{};
final messagesByUserInTopic = <(int, String, int), List<int>>{};
for (final id in event.messageIds) {

}
}

class MessageIdTracker {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, because the type appears in RecentSenders.streamSenders and RecentSenders.topicSenders.

Still we can keep a @visibleForTesting annotation on this, right? It looks like that went away in the latest revision.

Comment on lines 146 to 150
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 200),
];
setupModel(existingMessages);

final dmMessage = eg.dmMessage(from: eg.otherUser, to: [eg.selfUser], id: 200);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad test data: message ID collision

(and can't be the same message seen twice, because stream/channel messages can't turn into DMs or vice versa)

Comment on lines 70 to 85
group('single tracker', () {
test('batch goes before the existing messages', () {
final existingMessages = [
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 300),
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 400),
];
setupModel(existingMessages);

final messages = [
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 100),
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 200),
];
model.handleMessages(messages);

checkMatchesMessages(model,
[...existingMessages, ...messages]..sort((m1, m2) => m1.id.compareTo(m2.id)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These handleMessages tests are pretty verbose, with a lot of stuff in them that's the same from test to test and isn't what the test is about. I'll push a couple of added commits that factor that out, so that it's a lot easier for the reader to scan these tests and see what they're saying.

Comment on lines 163 to 164
group('multiple trackers', () {
test('batch goes before the existing messages', () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's all that helpful to take the single-tracker tests and make a copy of each of them that distributes across a couple of different topics. What I asked for at #692 (comment) was:

there should also be a couple of test cases that show that handleMessages correctly handles taking messages that belong in several different trackers, and distributing them across those.

These test cases don't really do that. They don't test that handleMessages correctly handles:

  • having messages in several streams
  • or messages from several senders
  • or creating a new tracker where there wasn't one already;

really the one thing they add is testing that it handles getting messages in multiple topics of one stream.

Conversely there isn't a need to repeat the testing of all the different ways the old and new messages in a given tracker can be related; that logic all operates within each tracker separately, so the single-tracker tests cover it.

These tests do add a lot of lines of code. One effect of that is that it takes some work to scan through and see all the things the tests are saying, in order to get the picture of what scenarios are covered and what scenarios aren't. One reason making tests concise and focused (as discussed in comments just above this one) is important is that it helps with detecting missing scenarios, like here.

I'll push another commit replacing these with tests that cover those.

gnprice added a commit to sm-sayedi/zulip-flutter that referenced this pull request Jul 19, 2024
This replaces the "multiple trackers" group with test cases that
are written to exercise the various scenarios relevant to the
logic in RecentSenders.handleMessages.

See also discussion here:
  zulip#692 (comment)
sm-sayedi pushed a commit to sm-sayedi/zulip-flutter that referenced this pull request Jul 22, 2024
This replaces the "multiple trackers" group with test cases that
are written to exercise the various scenarios relevant to the
logic in RecentSenders.handleMessages.

See also discussion here:
  zulip#692 (comment)
@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch from 0c1edc4 to a208d56 Compare July 22, 2024 20:05
@sm-sayedi
Copy link
Collaborator Author

Thank you @gnprice for the review and the added commits; they're really helpful. New revision pushed. PTAL.

@sm-sayedi sm-sayedi requested a review from gnprice July 22, 2024 20:10
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! No remaining comments on the behavior; just three small comments below, and then this will be ready for merge.

Comment on lines 84 to 88
final sendersByStream = streamSenders[streamId];
final messagesBySenderInStream = sendersByStream?[senderId];
messagesBySenderInStream?.removeAll(messages);
if (messagesBySenderInStream?.maxId == null) sendersByStream?.remove(senderId);
if (sendersByStream?.isEmpty ?? false) streamSenders.remove(streamId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first and last lines of this stanza can move outside the loop, since they don't depend on senderId or messages.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also moved the lines related to the topic outside of the loop as they were not dependent on senderId or messages either.

/// because that's the common case for a message that is received through
/// [PerAccountStore.handleEvent]. May take O(n) time in some rare cases.
void add(int id) {
if (ids.isEmpty || id > maxId!) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
if (ids.isEmpty || id > maxId!) {
if (ids.isEmpty || id > ids.last) {

The behavior is the same, and since the rest of this function is in terms of ids it makes it a bit easier to reason about how the different pieces of logic within the function relate to each other.

ids = setUnion(ids, newIds);
}

void removeAll(QueueList<int> idsToRemove) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this doesn't use that it's a QueueList, so might as well take a general List (and the caller might as well make plain default lists with [], rather than QueueLists)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation was that we use QueueList internally as well as in addAll, so I thought passing a QueueList in removeAll would align with them. Anyway, changed it back to List. 🙂

sm-sayedi and others added 8 commits July 23, 2024 10:03
…necessary

In this specific case, this change is not crucial for the test performance
as the loop iterations are way less to cause an issue, but it is a good
habit to add it as one would copy this part of the code in the future
and repurpose it with large loop iterations; in which it would be a
test performance issue.
… structures

MessageIdTracker data structure is used to keep track of message ids in
an ascending sorted list. It is used in RecentSenders data structure.
RecentSenders data structure is used to keep track of user messages in topics
and streams.

Much of this code is transcribed from Zulip web; in particular,
from:
  https://github.com/zulip/zulip/blob/bd04a30bbc6dc5bd7c20940a3d1d34cf8c8c6f28/web/src/recent_senders.ts
The function we're passing this to doesn't notice the order anyway --
it just puts these into sets.
This replaces the "multiple trackers" group with test cases that
are written to exercise the various scenarios relevant to the
logic in RecentSenders.handleMessages.

See also discussion here:
  zulip#692 (comment)
…ferent

Most of these tests don't use the specific stream IDs, user IDs,
or message IDs, so cut them out to reduce the number of details
for the reader to read.

Then also rename the streams, topics, and users so that they're
a bit more visually distinct from each other.  That helps in seeing
what's going on in the tests that are all about which bits of data
are the same as, or different from, which other ones.
This will avoid copying the list on each call to `addAll`.
@sm-sayedi sm-sayedi force-pushed the issue-228-@-mention-results-with-recent-senders-criterion branch from a208d56 to d3d3774 Compare July 23, 2024 05:38
@sm-sayedi
Copy link
Collaborator Author

Thanks @gnprice for the review. Revision pushed.

gnprice added 3 commits July 23, 2024 10:31
A map of "foos by bar" is a map where each key is a "bar", and
each value is a "foo" -- the phrase can be read as an abbreviation
of "foos indexed by bar" or "foos, where each foo is indexed by
its respective bar".

So in this function `messagesByUser` is accurately named because
a key represents a user and a value represents some messages.

But these other variables should be e.g. `topicsInStream`, not
`topicsByStream`.  That one isn't a map where the key is a stream
and the value is a topic or topics; rather it's a collection of
topics (and more data about them) which is the collection
specifically of topics in the stream we're acting on.
@gnprice gnprice merged commit dbb2dcf into zulip:main Jul 23, 2024
1 check passed
@gnprice
Copy link
Member

gnprice commented Jul 23, 2024

Looks good — merging! Thanks again for all your work on this. On to #828, the last stage, so that we can get this feature out to users 🙂

I added some small NFC commits on top; please take a look through those:

73c827c recent_senders [nfc]: Rename some foosByBar that mean foosInBar

A map of "foos by bar" is a map where each key is a "bar", and
each value is a "foo" -- the phrase can be read as an abbreviation
of "foos indexed by bar" or "foos, where each foo is indexed by
its respective bar".

So in this function messagesByUser is accurately named because
a key represents a user and a value represents some messages.

But these other variables should be e.g. topicsInStream, not
topicsByStream. That one isn't a map where the key is a stream
and the value is a topic or topics; rather it's a collection of
topics (and more data about them) which is the collection
specifically of topics in the stream we're acting on.

eca8af9 recent_senders [nfc]: Add and revise some docs

dbb2dcf recent_senders [nfc]: Write "sorted ascending"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration review Added by maintainers when PR may be ready for integration mentor review GSoC mentor review needed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants