Commit b24987f

fix(txnames): Revert high threshold for running the clusterer (#49087)
As part of getsentry/team-ingest#93, we merged #46503 to ensure that we would not run the clusterer for fresh projects until they had collected a large number of unique transaction names. This was based on a suspicion that we would otherwise prematurely declare all URL transactions as sanitized. However, we did not have any data to back up this decision, and there is no reason to impose this threshold from the algorithm's point of view: there is already the (lower) `MERGE_THRESHOLD`, which should prevent low-quality replacement rules. What we _do_ know is that we've seen a decline in the number of transactions changed by clustering rules (see metric `event.transaction_name_changes`), which might be because we are now too strict about when we run the clusterer.
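To illustrate why a merge threshold on its own can guard against low-quality rules, here is a minimal toy sketch. It is not Sentry's `TreeClusterer`: the function `toy_rules`, the rule format, and the threshold values are assumptions made up for this example. It proposes a wildcard rule for a path prefix only when that prefix has at least `merge_threshold` distinct child segments.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def toy_rules(names: Iterable[str], merge_threshold: int) -> List[str]:
    """Hypothetical helper: emit a wildcard rule for every parent path that
    has at least `merge_threshold` distinct child segments."""
    children: Dict[str, Set[str]] = defaultdict(set)
    for name in names:
        parts = name.strip("/").split("/")
        for i, segment in enumerate(parts):
            parent = "/" + "/".join(parts[:i])  # "/" for the first segment
            children[parent].add(segment)

    rules = []
    for parent, kids in sorted(children.items()):
        if len(kids) >= merge_threshold:
            # Illustrative rule syntax; the real rule format may differ.
            rules.append(parent.rstrip("/") + "/*/**")
    return rules


names = ["/users/1", "/users/2", "/users/3", "/health"]
print(toy_rules(names, merge_threshold=3))  # "/users" has 3 distinct children
print(toy_rules(names, merge_threshold=4))  # nothing reaches the threshold
```

With few distinct names, no prefix reaches the threshold and no rule is produced, which is the property the commit message relies on when it drops the separate, higher gate.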
1 parent aaa7d66 commit b24987f

2 files changed: +2 -2 lines changed

src/sentry/ingest/transaction_clusterer/tasks.py

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ def cluster_projects(projects: Sequence[Project]) -> None:
 span.set_data("project_id", project.id)
 tx_names = list(redis.get_transaction_names(project))
 new_rules = []
-if len(tx_names) >= redis.MAX_SET_SIZE:
+if len(tx_names) >= MERGE_THRESHOLD:
     clusterer = TreeClusterer(merge_threshold=MERGE_THRESHOLD)
     clusterer.add_input(tx_names)
     new_rules = clusterer.get_rules()
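The effect of swapping the gate in this hunk can be sketched in isolation. This is a hedged illustration, not the code in `tasks.py`: `maybe_cluster` and `fake_get_rules` are hypothetical stand-ins, and the two constants use made-up values chosen only so that `MERGE_THRESHOLD` is the lower one, as the commit message states.

```python
from typing import Callable, List, Sequence

# Made-up values for illustration; the real constants are defined in Sentry.
MAX_SET_SIZE = 10
MERGE_THRESHOLD = 3


def maybe_cluster(
    tx_names: Sequence[str],
    gate: int,
    get_rules: Callable[[Sequence[str]], List[str]],
) -> List[str]:
    """Run the (stand-in) clusterer only if enough names were collected."""
    if len(tx_names) >= gate:
        return get_rules(tx_names)
    return []


def fake_get_rules(tx_names: Sequence[str]) -> List[str]:
    # Stand-in for clusterer.get_rules(); always returns one fixed rule.
    return ["/users/*/**"]


names = [f"/users/{i}" for i in range(5)]  # 5 unique transaction names

old_rules = maybe_cluster(names, MAX_SET_SIZE, fake_get_rules)     # gate before this commit
new_rules = maybe_cluster(names, MERGE_THRESHOLD, fake_get_rules)  # gate after this commit
print(old_rules, new_rules)
```

With five names, the old gate skips clustering entirely, while the new gate lets the clusterer run and leaves rule quality to the merge threshold inside it.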

tests/sentry/ingest/test_transaction_clusterer.py

Lines changed: 1 addition & 1 deletion
@@ -204,7 +204,7 @@ def _add_mock_data(proj, number):
 project2 = Project(id=223, name="project2", organization_id=default_organization.id)
 for project in (project1, project2):
     project.save()
-    _add_mock_data(project, 10)
+    _add_mock_data(project, 4)

 spawn_clusterers()
