-
Notifications
You must be signed in to change notification settings - Fork 306
model: Introduce data structures for "recent senders criterion" of user-mention autocomplete #692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
model: Introduce data structures for "recent senders criterion" of user-mention autocomplete #692
Conversation
lib/model/message_list.dart
Outdated
if (_messageVisible(message)) { | ||
_addMessage(message); | ||
store.recentSenders.handleMessage(message); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reply to your 608 comment:
Good question.
What does the dartdoc on that _messageVisible method say?
Based on that, what would be the consequence of having that condition control whether to make this call? And what would be the consequence of ignoring that condition when making this call?
I concluded that the call should be inside the if condition. We track the messages that are shown in the feed, and if they're not shown, then there's no need to show the users who sent these messages, in the autocomplete results. Thanks for the clue!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we look at web impl, recent_senders.process_stream_message
in process_new_message
doesn't seem to care about topic visibility policy. Not sure if that's intended or a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good comparison.
Here, _messageVisible
says whether the message should appear in this particular message list:
/// Whether [message] should actually appear in this message list,
/// given that it does belong to the narrow.
///
/// This depends in particular on whether the message is muted in
/// one way or another.
///
/// See also [_allMessagesVisible].
bool _messageVisible(Message message) {
But if the message doesn't appear in this message list, it might still appear in a different one later. A message hidden from the combined feed because the stream is muted will appear if you navigate to that stream; a message hidden because the stream and/or topic is muted will appear if you navigate to the topic.
The recent-senders data structure isn't specific to a particular message list; it's shared across them all (within an account). So let's have the data it gets not be controlled by the message list's _messageVisible
.
1bd73ff
to
6dac3d8
Compare
Let's start applying the "buddy review" system on this PR, now that the start of the official GSoC period is imminent and it seems like everyone's actively engaged again. @rajveermalviya, would you do the first review on this PR? Some quick bits of context to note:
Meanwhile we'll proceed with maintainer review for the first PR in this series, #693, because after the previous rounds of review (on #608) that one is further along. As a reminder since we're just starting to use this system: when you get a buddy review request, please prioritize it right after important bugs and your regressions, ahead of most of your own PRs. (In particular I don't think anyone is currently working on an issue that would come ahead of doing buddy reviews.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, otherwise looks great!
Thanks!
|
||
/// Extracts and keeps track of the necessary data from a [message] only | ||
/// if it is a stream message. | ||
void handleMessage(Message message) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void handleMessage(Message message) { | |
void handleMessageEvent(MessageEvent event) { | |
final message = event.message; |
Matches RecentDmConversationsView.handleMessageEvent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change would match it with other somewhat similar parts of the codebase, but there are places this method is called where there is no MessageEvent
available, such as in model/message_list.dart
file. In contrast, RecentDmConversationsView.handleMessageEvent
is only called in PerAccountStore.handleEvent
where the MessageEvent
is available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense.
lib/model/recent_senders.dart
Outdated
/// Whether stream senders and topic senders are both not empty. | ||
@visibleForTesting | ||
bool get debugIsNotEmpty => _streamSenders.isNotEmpty && _topicSenders.isNotEmpty; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Whether stream senders and topic senders are both not empty. | |
@visibleForTesting | |
bool get debugIsNotEmpty => _streamSenders.isNotEmpty && _topicSenders.isNotEmpty; |
remove, it's not used.
lib/model/message_list.dart
Outdated
if (_messageVisible(message)) { | ||
_addMessage(message); | ||
store.recentSenders.handleMessage(message); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we look at web impl, recent_senders.process_stream_message
in process_new_message
doesn't seem to care about topic visibility policy. Not sure if that's intended or a bug.
6dac3d8
to
0d28448
Compare
Thanks, @rajveermalviya for the review! Pushed the revision. Please have a look. |
test/model/autocomplete_test.dart
Outdated
// test('fetching older messages', () async { | ||
// final users = generateUsers(count: 1500); | ||
// final store = eg.store(); | ||
// final stream = eg.stream(); | ||
// final narrow = StreamNarrow(stream.streamId); | ||
// final subscription = eg.subscription(stream); | ||
// await store.addStream(stream); | ||
// await store.addSubscription(subscription); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gnprice I have commented out this test for now as it wouldn't pass. I spent a significant amount of time to diagnose what was the problem, but unfortunately, I wasn't able to figure it out. I would appreciate it if you can have a look at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I think fundamentally the issue here is that these tests using await Future(() {});
are fragile.
They're trying to make several different things happen in a certain order, but those things are asynchronous in the code. The await Future(() {})
basically allows each thread of async control flow to advance from one await
to the next await
… but what that means in terms of actual behavior of the models depends in a very brittle way on the exact details of how the code being tested is written. Small differences in that code will cause things in the test to happen in a different order and break the test.
That's also why the existing tests of this flavor have so many checks like check(done).isFalse()
— those are there to make sure that the test's assumptions about what order things happen in are satisfied. That way, if those assumptions are broken by a refactor or by a change to the test, the test fails instead of silently ceasing to successfully test what it's meant to test.
It looks like this test is failing at the check(done).isFalse()
here:
await act(store);
await Future(() {});
check(done).isFalse();
await Future(() {});
check(done).isTrue();
expect(view.results);
so IOW the search is done sooner than the test expects. Presumably it's due to some difference between the fetchOlder
in this test, and the variations on handleEvent
found in the neighboring tests.
The test passes if you make this change:
@@ -293,7 +296,6 @@ void main() {
view.addListener(() { done = true; });
view.query = MentionAutocompleteQuery(rawQuery);
- await Future(() {});
check(done).isFalse();
await act(store);
But in that case is the test still successfully testing what it's supposed to test? I'm not sure what the assumptions are of your doCheck
method.
(Probably a chat thread would be a good place to go into further detail.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from that test, it looks great :)
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do any review of the code. But approving since this seems to be part of Zulip's this year's GSOC process.
@hackerkid Putting this back in your queue after we discussed today how mentor review is meant to work 🙂 (#713 would be a good PR to review first, though — it's smaller, and also this one is stacked atop #693 which is still in progress, so there's no rush for this one to get through review before #693 is ready.) |
d6d337c
to
2fa9e68
Compare
@hackerkid Now that #693 is merged, this PR is now ready for the mentor review! |
lib/model/recent_senders.dart
Outdated
// TODO: remove | ||
|
||
/// The maximum id in the tracker list. | ||
/// | ||
/// Returns -1 if the tracker list is empty. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these comments still relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think they're relevant, specifically the "Returns null
if the tracker list is empty" part which will help the reader what to expect.
ef93c67
to
3f96e33
Compare
3f96e33
to
ddcba66
Compare
Thanks @gnprice for the review. Updates pushed. Please have a look. |
} | ||
} | ||
|
||
class MessageIdTracker { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this class private would cause the following lint message, thus failing the CI:
Invalid use of a private type in a public API.
Try making the private type public, or making the API that uses the private type also be private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, because the type appears in RecentSenders.streamSenders
and RecentSenders.topicSenders
.
Still we can keep a @visibleForTesting
annotation on this, right? It looks like that went away in the latest revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revision! This is very close now. Comments below.
For the remaining comments on the tests, I decided it'd be clearest to demonstrate rather than describe, so I'll push to this branch some added commits that make those changes. Then please leave those in the branch, after your main commits, while acting on the several other comments below. For the two comments that request followup commits, those can go after my test commits.
lib/model/recent_senders.dart
Outdated
/// [newIds] should be sorted ascendingly. | ||
void addAll(List<int> newIds) { | ||
if (ids.isEmpty) { | ||
ids = QueueList.from(newIds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't improve over calling addAll
(following up on #692 (comment) ), because it's still copying the list.
This isn't any worse than that either, though, and I guess it makes it more parallel with the setUnion
case below — now in both cases we copy the input (and in one case the existing data) into a new QueueList.
To do better would require having this method take a QueueList so that it can actually say ids = newIds;
, and having that input list be constructed as a QueueList in the first place. If you'd like to try doing that, let's leave it as a separate followup commit, in the interest of letting the rest of these changes converge toward merging.
lib/model/recent_senders.dart
Outdated
|
||
/// Add the message ID to the tracker list at the proper place, if not present. | ||
/// | ||
/// Optimized, taking O(log n) time for the case where that place is the end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is accurate now, but I actually did want the O(1) behavior 🙂 (following up on #692 (comment) ).
That can be a followup commit, though, again to help the other changes converge toward merging. I might squash it in, or I might leave it out and make that optimization myself.
lib/model/recent_senders.dart
Outdated
(messagesByUserInStream[(streamId, senderId)] ??= []).add(id); | ||
(messagesByUserInTopic[(streamId, topic, senderId)] ??= []).add(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is simplified slightly in this revision, but it can be simplified a lot more. There's only one channel/stream ID and one topic in the whole method. So in particular:
(messagesByUserInStream[(streamId, senderId)] ??= []).add(id); | |
(messagesByUserInTopic[(streamId, topic, senderId)] ??= []).add(id); | |
(messagesByUser[senderId] ??= []).add(id); |
lib/model/recent_senders.dart
Outdated
if (event.messageType != MessageType.stream) return; | ||
final messagesByUserInStream = <(int, int), List<int>>{}; | ||
final messagesByUserInTopic = <(int, String, int), List<int>>{}; | ||
|
||
final DeleteMessageEvent(:streamId!, :topic!) = event; | ||
for (final id in event.messageIds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: group closely-related lines into stanzas:
if (event.messageType != MessageType.stream) return; | |
final messagesByUserInStream = <(int, int), List<int>>{}; | |
final messagesByUserInTopic = <(int, String, int), List<int>>{}; | |
final DeleteMessageEvent(:streamId!, :topic!) = event; | |
for (final id in event.messageIds) { | |
if (event.messageType != MessageType.stream) return; | |
final DeleteMessageEvent(:streamId!, :topic!) = event; | |
final messagesByUserInStream = <(int, int), List<int>>{}; | |
final messagesByUserInTopic = <(int, String, int), List<int>>{}; | |
for (final id in event.messageIds) { |
} | ||
} | ||
|
||
class MessageIdTracker { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, because the type appears in RecentSenders.streamSenders
and RecentSenders.topicSenders
.
Still we can keep a @visibleForTesting
annotation on this, right? It looks like that went away in the latest revision.
test/model/recent_senders_test.dart
Outdated
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 200), | ||
]; | ||
setupModel(existingMessages); | ||
|
||
final dmMessage = eg.dmMessage(from: eg.otherUser, to: [eg.selfUser], id: 200); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad test data: message ID collision
(and can't be the same message seen twice, because stream/channel messages can't turn into DMs or vice versa)
test/model/recent_senders_test.dart
Outdated
group('single tracker', () { | ||
test('batch goes before the existing messages', () { | ||
final existingMessages = [ | ||
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 300), | ||
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 400), | ||
]; | ||
setupModel(existingMessages); | ||
|
||
final messages = [ | ||
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 100), | ||
eg.streamMessage(stream: stream1, topic: 'a', sender: user10, id: 200), | ||
]; | ||
model.handleMessages(messages); | ||
|
||
checkMatchesMessages(model, | ||
[...existingMessages, ...messages]..sort((m1, m2) => m1.id.compareTo(m2.id))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These handleMessages tests are pretty verbose, with a lot of stuff in them that's the same from test to test and isn't what the test is about. I'll push a couple of added commits that factor that out, so that it's a lot easier for the reader to scan these tests and see what they're saying.
test/model/recent_senders_test.dart
Outdated
group('multiple trackers', () { | ||
test('batch goes before the existing messages', () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's all that helpful to take the single-tracker tests and make a copy of each of them that distributes across a couple of different topics. What I asked for at #692 (comment) was:
there should also be a couple of test cases that show that
handleMessages
correctly handles taking messages that belong in several different trackers, and distributing them across those.
These test cases don't really do that. They don't test that handleMessages
correctly handles:
- having messages in several streams
- or messages from several senders
- or creating a new tracker where there wasn't one already;
really the one thing they add is testing that it handles getting messages in multiple topics of one stream.
Conversely there isn't a need to repeat the testing of all the different ways the old and new messages in a given tracker can be related; that logic all operates within each tracker separately, so the single-tracker tests cover it.
These tests do add a lot of lines of code. One effect of that is that it takes some work to scan through and see all the things the tests are saying, in order to get the picture of what scenarios are covered and what scenarios aren't. One reason making tests concise and focused (as discussed in comments just above this one) is important is that it helps with detecting missing scenarios, like here.
I'll push another commit replacing these with tests that cover those.
This replaces the "multiple trackers" group with test cases that are written to exercise the various scenarios relevant to the logic in RecentSenders.handleMessages. See also discussion here: zulip#692 (comment)
This replaces the "multiple trackers" group with test cases that are written to exercise the various scenarios relevant to the logic in RecentSenders.handleMessages. See also discussion here: zulip#692 (comment)
0c1edc4
to
a208d56
Compare
Thank you @gnprice for the review and the added commits; they're really helpful. New revision pushed. PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revision! No remaining comments on the behavior; just three small comments below, and then this will be ready for merge.
lib/model/recent_senders.dart
Outdated
final sendersByStream = streamSenders[streamId]; | ||
final messagesBySenderInStream = sendersByStream?[senderId]; | ||
messagesBySenderInStream?.removeAll(messages); | ||
if (messagesBySenderInStream?.maxId == null) sendersByStream?.remove(senderId); | ||
if (sendersByStream?.isEmpty ?? false) streamSenders.remove(streamId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first and last lines of this stanza can move outside the loop, since they don't depend on senderId
or messages
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also moved the lines related to the topic outside of the loop as they were not dependent on senderId
or messages
either.
lib/model/recent_senders.dart
Outdated
/// because that's the common case for a message that is received through | ||
/// [PerAccountStore.handleEvent]. May take O(n) time in some rare cases. | ||
void add(int id) { | ||
if (ids.isEmpty || id > maxId!) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
if (ids.isEmpty || id > maxId!) { | |
if (ids.isEmpty || id > ids.last) { |
The behavior is the same, and since the rest of this function is in terms of ids
it makes it a bit easier to reason about how the different pieces of logic within the function relate to each other.
lib/model/recent_senders.dart
Outdated
ids = setUnion(ids, newIds); | ||
} | ||
|
||
void removeAll(QueueList<int> idsToRemove) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this doesn't use that it's a QueueList, so might as well take a general List (and the caller might as well make plain default lists with []
, rather than QueueLists)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation was that we use QueueList
internally as well as in addAll
, so I thought passing a QueueList
in removeAll
would align with them. Anyway, changed it back to List
. 🙂
…necessary In this specific case, this change is not crucial for the test performance as the loop iterations are way less to cause an issue, but it is a good habit to add it as one would copy this part of the code in the future and repurpose it with large loop iterations; in which it would be a test performance issue.
… structures MessageIdTracker data structure is used to keep track of message ids in an ascending sorted list. It is used in RecentSenders data structure. RecentSenders data structure is used to keep track of user messages in topics and streams. Much of this code is transcribed from Zulip web; in particular, from: https://github.com/zulip/zulip/blob/bd04a30bbc6dc5bd7c20940a3d1d34cf8c8c6f28/web/src/recent_senders.ts
The function we're passing this to doesn't notice the order anyway -- it just puts these into sets.
This replaces the "multiple trackers" group with test cases that are written to exercise the various scenarios relevant to the logic in RecentSenders.handleMessages. See also discussion here: zulip#692 (comment)
…ferent Most of these tests don't use the specific stream IDs, user IDs, or message IDs, so cut them out to reduce the number of details for the reader to read. Then also rename the streams, topics, and users so that they're a bit more visually distinct from each other. That helps in seeing what's going on in the tests that are all about which bits of data are the same as, or different from, which other ones.
This will avoid copying the list on each call to `addAll`.
a208d56
to
d3d3774
Compare
Thanks @gnprice for the review. Revision pushed. |
A map of "foos by bar" is a map where each key is a "bar", and each value is a "foo" -- the phrase can be read as an abbreviation of "foos indexed by bar" or "foos, where each foo is indexed by its respective bar". So in this function `messagesByUser` is accurately named because a key represents a user and a value represents some messages. But these other variables should be e.g. `topicsInStream`, not `topicsByStream`. That one isn't a map where the key is a stream and the value is a topic or topics; rather it's a collection of topics (and more data about them) which is the collection specifically of topics in the stream we're acting on.
Looks good — merging! Thanks again for all your work on this. On to #828, the last stage, so that we can get this feature out to users 🙂 I added some small NFC commits on top; please take a look through those: 73c827c recent_senders [nfc]: Rename some foosByBar that mean foosInBar A map of "foos by bar" is a map where each key is a "bar", and So in this function But these other variables should be e.g. eca8af9 recent_senders [nfc]: Add and revise some docs dbb2dcf recent_senders [nfc]: Write "sorted ascending" |
This is the second PR in the series of PRs #608 is divided into, coming after #693. This solely focuses on bringing the data structures for the upcoming "recent senders" criterion of @-mention autocomplete. Next PR in the series: #828.
Fixes part of: #228