Commit 4ba6f15

update the prompt to include needs-info (#296)
- update the prompt to include triaging issues w/ `needs-info` labels
- `[breaking change]` issues should have `breaking-change-request` labels but don't necessarily need area labels
- add 5 'few-shot' examples in the classification prompt
- update benchmark expectations
- classification performance has generally improved from 56% to 65%

---

- [x] I’ve reviewed the contributor guide and applied the relevant portions to this PR.

<details>
<summary>Contribution guidelines:</summary><br>

- See our [contributor guide](https://github.com/dart-lang/.github/blob/main/CONTRIBUTING.md) for general expectations for PRs.
- Larger or significant changes should be discussed in an issue before creating a PR.
- Contributions to our repos should follow the [Dart style guide](https://dart.dev/guides/language/effective-dart) and use `dart format`.
- Most changes should add an entry to the changelog and may need to [rev the pubspec package version](https://github.com/dart-lang/sdk/blob/main/docs/External-Package-Maintenance.md#making-a-change).
- Changes to packages require [corresponding tests](https://github.com/dart-lang/.github/blob/main/CONTRIBUTING.md#Testing).

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.
</details>
1 parent 3496f54 commit 4ba6f15

6 files changed: +188 -52 lines changed
pkgs/sdk_triage_bot/lib/src/common.dart

Lines changed: 7 additions & 4 deletions
@@ -37,11 +37,14 @@ String get geminiKey {
   return token;
 }
 
-/// Don't return more than 5k of text for an issue body.
-String trimmedBody(String body) {
-  const textLimit = 5 * 1024;
+/// Maximal length of body used for querying.
+const bodyLengthLimit = 10 * 1024;
 
-  return body.length > textLimit ? body = body.substring(0, textLimit) : body;
+/// The [body], truncated if larger than [bodyLengthLimit].
+String trimmedBody(String body) {
+  return body.length > bodyLengthLimit
+      ? body = body.substring(0, bodyLengthLimit)
+      : body;
 }
 
 class Logger {
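
To make the new limit concrete, here is a minimal standalone sketch of the truncation shown in the hunk above. It drops the redundant `body =` assignment inside the conditional, which should not change the result. Note that `String.length` in Dart counts UTF-16 code units, so the limit is 10,240 code units rather than bytes.

```dart
/// Maximal length of body used for querying (value copied from the diff).
const bodyLengthLimit = 10 * 1024;

/// The [body], truncated if larger than [bodyLengthLimit].
///
/// Sketch only: same logic as the diff, minus the `body =` assignment.
String trimmedBody(String body) =>
    body.length > bodyLengthLimit ? body.substring(0, bodyLengthLimit) : body;

void main() {
  final short = 'A short issue body.';
  final long = 'x' * (bodyLengthLimit + 500);

  // Short bodies pass through unchanged.
  print(trimmedBody(short) == short); // true

  // Long bodies are cut to exactly the limit.
  print(trimmedBody(long).length); // 10240
}
```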

pkgs/sdk_triage_bot/lib/src/gemini.dart

Lines changed: 5 additions & 4 deletions
@@ -6,7 +6,8 @@ import 'package:google_generative_ai/google_generative_ai.dart';
 import 'package:http/http.dart' as http;
 
 class GeminiService {
-  // gemini-1.5-pro-latest, gemini-1.5-flash-latest, gemini-1.0-pro-latest
+  // Possible values for models: gemini-1.5-pro-latest, gemini-1.5-flash-latest,
+  // gemini-1.0-pro-latest, gemini-1.5-flash-exp-0827.
   static const String classificationModel = 'models/gemini-1.5-flash-latest';
   static const String summarizationModel = 'models/gemini-1.5-flash-latest';
 
@@ -33,17 +34,17 @@ class GeminiService {
 
   /// Call the summarize model with the given prompt.
   ///
-  /// On failures, this will throw a `GenerativeAIException`.
+  /// On failures, this will throw a [GenerativeAIException].
   Future<String> summarize(String prompt) {
     return _query(_summarizeModel, prompt);
   }
 
   /// Call the classify model with the given prompt.
   ///
-  /// On failures, this will throw a `GenerativeAIException`.
+  /// On failures, this will throw a [GenerativeAIException].
   Future<List<String>> classify(String prompt) async {
     final result = await _query(_classifyModel, prompt);
-    final labels = result.split(',').map((l) => l.trim()).toList();
+    final labels = result.split(',').map((l) => l.trim()).toList()..sort();
     return labels;
   }
 
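
The label parsing in `classify` can be tried in isolation. The sketch below assumes a hypothetical helper name, `parseLabels`, and a made-up model response. Only the split/trim/sort chain is taken from the diff.

```dart
/// Hypothetical helper mirroring the parsing step inside `classify`:
/// split the comma-separated response, trim each entry, then sort.
List<String> parseLabels(String response) =>
    response.split(',').map((l) => l.trim()).toList()..sort();

void main() {
  // A made-up model response with uneven spacing and unsorted labels.
  print(parseLabels('type-bug,  area-vm')); // [area-vm, type-bug]
}
```

Sorting here gives the benchmark and the log output a stable label order regardless of the order the model returns.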

pkgs/sdk_triage_bot/lib/src/prompts.dart

Lines changed: 119 additions & 9 deletions
@@ -2,6 +2,8 @@
 // for details. All rights reserved. Use of this source code is governed by a
 // BSD-style license that can be found in the LICENSE file.
 
+// TODO(devoncarew): Add additional prompt instructions for `area-pkg` issues.
+
 String assignAreaPrompt({
   required String title,
   required String body,
@@ -45,13 +47,110 @@ If the issue is clearly a bug report, then also apply the label 'type-bug'.
 If the issue is mostly a question, then also apply the label 'type-question'.
 Otherwise don't apply a 'type-' label.
 
+If the issue title starts with "[breaking change]" it was likely created using
+existing issue template; do not assign an area label. IMPORTANT: only do this if
+the issue title starts with "[breaking change]".
+
+If the issue was largely unchanged from our default issue template, then apply
+the 'needs-info' label and don't assign an area label. These issues will
+generally have a title of "Create an issue" and the body will start with "Thank
+you for taking the time to file an issue!".
+
+If the issue title is "Analyzer Feedback from IntelliJ", these are generally not
+well qualified. For these issues, apply the 'needs-info' label but don't assign
+an area label.
+
 Return the labels as comma separated text.
 
-Issue follows:
+Here are a series of few-shot examples:
+
+<EXAMPLE>
+INPUT: title: Create an issue
+
+body: Thank you for taking the time to file an issue!
+
+This tracker is for issues related to:
+
+Dart analyzer and linter
+Dart core libraries (dart:async, dart:io, etc.)
+Dart native and web compilers
+Dart VM
+
+OUTPUT: needs-info
+</EXAMPLE>
+
+<EXAMPLE>
+INPUT: title: Analyzer Feedback from IntelliJ
+
+body: ## Version information
+
+- `IDEA AI-202.7660.26.42.7351085`
+- `3.4.4`
+- `AI-202.7660.26.42.7351085, JRE 11.0.8+10-b944.6842174x64 JetBrains s.r.o, OS Windows 10(amd64) v10.0 , screens 1600x900`
+
+OUTPUT: needs-info
+</EXAMPLE>
+
+<EXAMPLE>
+INPUT: title: Support likely() and unlikely() hints for AOT code optimization
+
+body: ```dart
+// Tell the compiler which branches are going to be taken most of the time.
+
+if (unlikely(n == 0)) {
+  // This branch is known to be taken rarely.
+} else {
+  // This branch is expected to be in the hot path.
+}
+
+final result = likely(s == null) ? commonPath() : notTakenOften();
+```
+
+Please add support for the `likely()` and `unlikely()` optimization hints within branching conditions. The AOT compiler can use these hints to generate faster code in a hot path that contains multiple branches.
 
-$title
+OUTPUT: area-vm, type-enhancement, type-performance
+</EXAMPLE>
 
-$body
+<EXAMPLE>
+INPUT: title: Analyzer doesn't notice incorrect return type of generic method
+
+body: dart analyze gives no errors on the follow code:
+
+```dart
+void main() {
+  method(getB());
+}
+
+void method(String b) => print(b);
+
+B getB<B extends A>() {
+  return A() as B;
+}
+
+class A {}
+```
+I would have suspected it to say something along the line of **The argument type 'A' can't be assigned to the parameter type 'String'.**
+
+OUTPUT: area-analyzer, type-enhancement
+</EXAMPLE>
+
+<EXAMPLE>
+INPUT: title: DDC async function stepping improvements
+
+body: Tracking issue to monitor progress on improving debugger stepping through async function bodies.
+
+The new DDC async semantics expand async function bodies into complex state machines. The normal JS stepping semantics don't map cleanly to steps through Dart code given this lowering. There are a couple potential approaches to fix this:
+1) Add more logic to the Dart debugger to perform custom stepping behavior when stepping through async code.
+2) Modify the async lowering in such a way that stepping more closely resembles stepping through Dart. For example, rather than returning multiple times, the state machine function might be able to yield. Stepping over a yield might allow the debugger to stay within the function body.
+
+OUTPUT: area-web
+</EXAMPLE>
+
+The issue to triage follows:
+
+title: $title
+
+body: $body
 
 ${lastComment ?? ''}'''
       .trim();
@@ -60,15 +159,26 @@ ${lastComment ?? ''}'''
 String summarizeIssuePrompt({
   required String title,
   required String body,
+  required bool needsInfo,
 }) {
+  const needsMoreInfo = '''
+Our classification model determined that we'll need more information to triage
+this issue. Thank them for their contribution and gently prompt them to provide
+more information.
+''';
+
+  final responseLimit = needsInfo ? '' : ' (1-2 sentences, 24 words or less)';
+
   return '''
-You are a software engineer on the Dart team at Google. You are responsible for
-triaging incoming issues from users. For each issue, briefly summarize the issue
-(1-2 sentences, 24 words or less).
+You are a software engineer on the Dart team at Google.
+You are responsible for triaging incoming issues from users.
+For each issue, briefly summarize the issue $responseLimit.
+
+${needsInfo ? needsMoreInfo : ''}
 
-Issue follows:
+The issue to triage follows:
 
-$title
+title: $title
 
-$body''';
+body: $body''';
 }
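
As a quick check of how the `$responseLimit` interpolation renders, the sketch below reproduces only the one affected sentence of `summarizeIssuePrompt`. The function name `summaryInstruction` is hypothetical. Going by the diff as shown, the rendered sentence appears to contain a doubled space when the limit is present and a space before the period when it is empty. Both are probably harmless to the model, but they are easy to miss in review.

```dart
/// Hypothetical helper that renders just the sentence containing
/// `$responseLimit`; the rest of the prompt is omitted.
String summaryInstruction({required bool needsInfo}) {
  final responseLimit = needsInfo ? '' : ' (1-2 sentences, 24 words or less)';
  return 'For each issue, briefly summarize the issue $responseLimit.';
}

void main() {
  // Without needs-info: the length limit is appended.
  print(summaryInstruction(needsInfo: false));

  // With needs-info: the limit is dropped so the model can write a longer,
  // friendlier request for details.
  print(summaryInstruction(needsInfo: true));
}
```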

pkgs/sdk_triage_bot/lib/triage.dart

Lines changed: 34 additions & 20 deletions
@@ -70,37 +70,45 @@ ${trimmedBody(comment.body ?? '')}
     }
   }
 
-  // ask for the summary
   var bodyTrimmed = trimmedBody(issue.body);
-  String summary;
+
+  // ask for the 'area-' classification
+  List<String> newLabels;
   try {
-    summary = await geminiService.summarize(
-      summarizeIssuePrompt(title: issue.title, body: bodyTrimmed),
+    newLabels = await geminiService.classify(
+      assignAreaPrompt(
+        title: issue.title,
+        body: bodyTrimmed,
+        lastComment: lastComment,
+      ),
     );
   } on GenerativeAIException catch (e) {
     // Failures here can include things like gemini safety issues, ...
     stderr.writeln('gemini: $e');
     exit(1);
   }
 
-  logger.log('## gemini summary');
-  logger.log('');
-  logger.log(summary);
-  logger.log('');
-
-  // ask for the 'area-' classification
-  List<String> newLabels;
+  // ask for the summary
+  String summary;
   try {
-    newLabels = await geminiService.classify(
-      assignAreaPrompt(
-          title: issue.title, body: bodyTrimmed, lastComment: lastComment),
+    summary = await geminiService.summarize(
+      summarizeIssuePrompt(
+        title: issue.title,
+        body: bodyTrimmed,
+        needsInfo: newLabels.contains('needs-info'),
+      ),
     );
   } on GenerativeAIException catch (e) {
     // Failures here can include things like gemini safety issues, ...
     stderr.writeln('gemini: $e');
     exit(1);
   }
 
+  logger.log('## gemini summary');
+  logger.log('');
+  logger.log(summary);
+  logger.log('');
+
   logger.log('## gemini classification');
   logger.log('');
   logger.log(newLabels.toString());
@@ -123,9 +131,9 @@ ${trimmedBody(comment.body ?? '')}
   // create github comment
   await githubService.createComment(sdkSlug, issueNumber, comment);
 
-  final allRepoLabels = (await githubService.getAllLabels(sdkSlug)).toSet();
-  final labelAdditions = newLabels.toSet().intersection(allRepoLabels).toList()
-    ..sort();
+  final allRepoLabels = await githubService.getAllLabels(sdkSlug);
+  final labelAdditions =
+      filterLegalLabels(newLabels, allRepoLabels: allRepoLabels);
   if (labelAdditions.isNotEmpty) {
     labelAdditions.add('triage-automation');
   }
@@ -141,7 +149,13 @@ ${trimmedBody(comment.body ?? '')}
   logger.log('Triaged ${issue.htmlUrl}');
 }
 
-List<String> filterExistingLabels(
-    List<String> allLabels, List<String> newLabels) {
-  return newLabels.toSet().intersection(allLabels.toSet()).toList();
+List<String> filterLegalLabels(
+  List<String> labels, {
+  required List<String> allRepoLabels,
+}) {
+  final validLabels = allRepoLabels.toSet();
+  return [
+    for (var label in labels)
+      if (validLabels.contains(label)) label,
+  ]..sort();
 }
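
A small standalone run of `filterLegalLabels` shows one behavioral difference from the set-intersection code it replaces: labels the repo does not define are still dropped, but a label the model repeats is no longer deduplicated. The function body is copied from the diff. The repo labels and predictions in `main` are invented for illustration.

```dart
List<String> filterLegalLabels(
  List<String> labels, {
  required List<String> allRepoLabels,
}) {
  final validLabels = allRepoLabels.toSet();
  return [
    for (var label in labels)
      if (validLabels.contains(label)) label,
  ]..sort();
}

void main() {
  // Illustrative data only; not taken from the real repository.
  final repoLabels = ['area-vm', 'area-web', 'needs-info', 'type-bug'];
  final predicted = ['type-bug', 'area-vm', 'area-made-up', 'type-bug'];

  // 'area-made-up' is filtered out; the repeated 'type-bug' is kept twice.
  print(filterLegalLabels(predicted, allRepoLabels: repoLabels));
  // [area-vm, type-bug, type-bug]
}
```

Because the result is built from a list literal, it is growable, so the later `labelAdditions.add('triage-automation')` call continues to work.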

pkgs/sdk_triage_bot/tool/bench.dart

Lines changed: 19 additions & 10 deletions
@@ -51,7 +51,7 @@ void main(List<String> args) async {
         await githubService.fetchIssue(sdkSlug, expectation.issueNumber);
     final bodyTrimmed = trimmedBody(issue.body);
 
-    print('#${issue.number}');
+    print('#${issue.number}: ${expectation.expectedLabels.join(', ')}');
 
     try {
       final labels = await geminiService.classify(
@@ -60,12 +60,7 @@ void main(List<String> args) async {
       if (expectation.satisfiedBy(labels)) {
         predicted++;
       } else {
-        var title = issue.title.length > 100
-            ? '${issue.title.substring(0, 100)}...'
-            : issue.title;
-        print(' "$title"');
-        print(' labeled: ${expectation.expectedLabels.join(', ')}');
-        print(' prediction: ${labels.join(', ')}');
+        stderr.writeln(' bot: ${labels.join(', ')}');
       }
     } on GenerativeAIException catch (e) {
       // Failures here can include things like gemini safety issues, ...
@@ -114,8 +109,22 @@ class ClassificationResults {
   }
 
   bool satisfiedBy(List<String> labels) {
-    final filtered = labels.where((l) => !l.startsWith('type-')).toSet();
-    final expected = expectedLabels.where((l) => !l.startsWith('type-'));
-    return expected.every(filtered.contains);
+    // Handle a `needs-info` label.
+    if (expectedLabels.contains('needs-info')) {
+      return labels.contains('needs-info');
+    }
+
+    // Handle a `breaking-change-request` label.
+    if (expectedLabels.contains('breaking-change-request')) {
+      return labels.contains('breaking-change-request');
+    }
+
+    for (final label in expectedLabels.where((l) => l.startsWith('area-'))) {
+      if (!labels.contains(label)) {
+        return false;
+      }
+    }
+
+    return true;
   }
 }
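
The new matching rules in `satisfiedBy` are easier to see with a few concrete cases. The sketch below uses a hypothetical `Expectation` class that models only the `expectedLabels` field; the real benchmark class has more to it. The final loop is written with `every`, which should be equivalent to the loop in the diff. One consequence worth noting: only `area-` labels are now compared, so a `pkg-` label in the expectation no longer has to be predicted.

```dart
/// Hypothetical stand-in for the benchmark's expectation class; only the
/// field needed for the check is modeled.
class Expectation {
  final List<String> expectedLabels;

  const Expectation(this.expectedLabels);

  bool satisfiedBy(List<String> labels) {
    // A `needs-info` expectation only requires `needs-info`.
    if (expectedLabels.contains('needs-info')) {
      return labels.contains('needs-info');
    }

    // Likewise for `breaking-change-request`; area labels are not required.
    if (expectedLabels.contains('breaking-change-request')) {
      return labels.contains('breaking-change-request');
    }

    // Otherwise every expected `area-` label must have been predicted.
    return expectedLabels
        .where((l) => l.startsWith('area-'))
        .every(labels.contains);
  }
}

void main() {
  print(Expectation(['needs-info']).satisfiedBy(['needs-info', 'area-vm']));
  // true

  print(Expectation(['area-vm', 'breaking-change-request'])
      .satisfiedBy(['breaking-change-request'])); // true

  print(Expectation(['area-pkg', 'pkg-json', 'type-enhancement'])
      .satisfiedBy(['area-pkg'])); // true

  print(Expectation(['area-vm']).satisfiedBy(['area-web', 'type-bug']));
  // false
}
```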

pkgs/sdk_triage_bot/tool/bench.md

Lines changed: 4 additions & 5 deletions
@@ -20,7 +20,6 @@ materially improve the classification performance.
 | #56354 | `area-web`, `type-bug` |
 | #56353 | `area-dart2wasm` |
 | #56350 | `area-analyzer`, `type-enhancement` |
-| #56348 | `area-intellij` |
 | #56347 | `area-dart-cli`, `type-bug` |
 | #56346 | `area-pkg`, `pkg-json`, `type-enhancement` |
 | #56345 | `area-analyzer`, `type-enhancement` |
@@ -44,7 +43,7 @@ materially improve the classification performance.
 | #56316 | `area-web` |
 | #56315 | `area-web` |
 | #56314 | `area-web`, `type-bug` |
-| #56308 | `area-vm` |
+| #56308 | `area-vm` `breaking-change-request` |
 | #56306 | `area-vm`, `type-bug` |
 | #56305 | `area-front-end`, `type-bug`, `type-question` |
 | #56304 | `area-core-library`, `type-enhancement` |
@@ -55,13 +54,12 @@ materially improve the classification performance.
 | #56283 | `area-dart2wasm` |
 | #56256 | `area-front-end`, `type-bug` |
 | #56254 | `area-pkg`, `pkg-vm-service`, `type-bug` |
-| #56246 | `area-intellij` |
-| #56240 | `area-intellij` |
+| #56240 | `needs-info` |
 | #56229 | `area-infrastructure` |
 | #56227 | `area-native-interop` |
 | #56220 | `area-infrastructure`, `type-code-health` |
 | #56217 | `area-meta` |
-| #56216 | `area-intellij` |
+| #56216 | `needs-info` |
 | #56214 | `area-native-interop` |
 | #56208 | `area-google3`, `type-enhancement` |
 | #56207 | `area-google3` |
@@ -108,3 +106,4 @@ We need more information from the user before we can triage these issues.
 
 ## Results
 2024-08-27: 55.6% using gemini-1.5-flash-latest
+2024-08-30: 64.8% using gemini-1.5-flash-latest
