Commit c96fbaf

gary-huang and Yun-Kim committed
address comments
Update releasenotes/notes/llmobs-dne-experiments-multi-run-ef099e98a5827e49.yaml
Co-authored-by: Yun Kim <[email protected]>

Update ddtrace/llmobs/_llmobs.py
Co-authored-by: Yun Kim <[email protected]>
1 parent 922008a commit c96fbaf

3 files changed: 7 additions & 14 deletions

ddtrace/llmobs/_experiment.py

Lines changed: 3 additions & 7 deletions
@@ -403,8 +403,6 @@ def run(self, jobs: int = 1, raise_errors: bool = False, sample_size: Optional[i
         self._run_name = experiment_run_name
         run_results = []
         # for backwards compatibility
-        first_run_rows = []
-        first_run_summary_evals = {}
         for run_iteration in range(self._runs):
             run = _ExperimentRunInfo(run_iteration)
             self._tags["run_id"] = str(run._id)
@@ -418,13 +416,11 @@ def run(self, jobs: int = 1, raise_errors: bool = False, sample_size: Optional[i
                 self._id, experiment_evals, convert_tags_dict_to_list(self._tags)
             )
             run_results.append(run_result)
-            if run_iteration == 0:
-                first_run_rows = run_result.rows
-                first_run_summary_evals = run_result.summary_evaluations
 
         experiment_result: ExperimentResult = {
-            "summary_evaluations": first_run_summary_evals,
-            "rows": first_run_rows,
+            # for backwards compatibility, the first result fills the old fields of rows and summary evals
+            "summary_evaluations": run_results[0].summary_evaluations if len(run_results) > 0 else {},
+            "rows": run_results[0].rows if len(run_results) > 0 else [],
             "runs": run_results,
         }
         return experiment_result
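
The backwards-compatibility pattern in this hunk can be sketched in isolation. This is a minimal illustration, not the ddtrace implementation: `RunResult` and `build_experiment_result` are hypothetical stand-ins, and only the field names `rows`, `summary_evaluations`, and `runs` come from the diff above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class RunResult:
    """Hypothetical stand-in for one run's result (not the ddtrace class)."""

    run_iteration: int
    rows: List[Dict[str, Any]] = field(default_factory=list)
    summary_evaluations: Dict[str, Any] = field(default_factory=dict)


def build_experiment_result(run_results: List[RunResult]) -> Dict[str, Any]:
    """Mirror the first run into the legacy fields and keep every run under "runs"."""
    first = run_results[0] if run_results else None
    return {
        # Legacy fields: filled from the first run only, empty when there are no runs.
        "summary_evaluations": first.summary_evaluations if first is not None else {},
        "rows": first.rows if first is not None else [],
        # New field: one entry per run iteration.
        "runs": run_results,
    }
```

Deriving the legacy fields from `run_results[0]` when the result is built removes the need for separate first-run variables updated inside the loop, which is the simplification this commit makes.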

ddtrace/llmobs/_llmobs.py

Lines changed: 1 addition & 3 deletions
@@ -456,9 +456,7 @@ def _llmobs_tags(span: Span, ml_app: str, session_id: Optional[str] = None) -> L
 
         # set experiment tags on children spans if the tags do not already exist
         experiment_id = span.context.get_baggage_item(EXPERIMENT_ID_KEY)
-        if experiment_id:
-            # the children spans of an experiment span should be tagged by the experiment ID as well
-            if "experiment_id" not in tags:
+        if experiment_id and "experiment_id" not in tags:
                 tags["experiment_id"] = experiment_id
 
         run_id = span.context.get_baggage_item(EXPERIMENT_RUN_ID_KEY)
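
The "set only if not already present" rule in this hunk can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the ddtrace code: `apply_experiment_tags` is a hypothetical helper, the `baggage` dict stands in for the span context's baggage, and applying the same rule to `run_id` and `run_iteration` is an assumption, since the diff shows only the `experiment_id` case in full.

```python
from typing import Dict, Mapping

# Hypothetical tag names; only "experiment_id" appears verbatim in the diff.
EXPERIMENT_TAG_KEYS = ("experiment_id", "run_id", "run_iteration")


def apply_experiment_tags(tags: Dict[str, str], baggage: Mapping[str, str]) -> Dict[str, str]:
    """Copy experiment tags from baggage onto tags without overwriting existing values."""
    for key in EXPERIMENT_TAG_KEYS:
        value = baggage.get(key)
        # The diff tests truthiness; this sketch checks for None so that a
        # falsy but valid value such as "0" or 0 would still be copied.
        if value is not None and key not in tags:
            tags[key] = value
    return tags
```

A tag that a caller has already set explicitly is kept, and only missing tags are inherited from the enclosing experiment.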

releasenotes/notes/llmobs-dne-experiments-multi-run-ef099e98a5827e49.yaml

Lines changed: 3 additions & 4 deletions
@@ -2,11 +2,10 @@
 features:
   - |
     LLM Observability: Experiments can now be run multiple times by using the optional ``runs`` argument,
-    to assess the true performance of an experiment in the face of the non determinism of LLMs
+    to assess the true performance of an experiment in the face of the non determinism of LLMs. Use the new ``ExperimentResult`` class' ``runs`` attribute to access the results by run iteration.
 deprecations:
   - |
-    LLM Observability: The ``ExperimentResult`` class now has a new ``runs`` attribute to store the results of
-    every experiment run. The ``rows`` and ``summary_evaluations`` attributes will only store the results from the first run
+    LLM Observability: The ``ExperimentResult`` class' ``rows`` and ``summary_evaluations`` attributes are deprecated and will be removed in the next major release. ``ExperimentResult.rows/summary_evaluations`` attributes will only store the results of the first run iteration for multi-run experiments. Use the ``ExperimentResult.runs`` attribute instead to access experiment results and summary evaluations.
 fixes:
   - |
-    LLM Observability: experiment children span now have experiment related tags
+    LLM Observability: Non-root experiment spans are now tagged with experiment ID, run ID, and run iteration tags.
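
A rough sketch of what the migration described in the deprecation note might look like for calling code: read per-run results from `runs` rather than from the legacy `rows` field. The data below is made up, and the `score` key, the `run_iteration` attribute on each run, and the `mean_score_per_run` helper are assumptions for illustration only.

```python
from statistics import mean
from types import SimpleNamespace
from typing import Any, Dict

# Hypothetical result: one entry per run iteration, each exposing `rows`.
result: Dict[str, Any] = {
    "runs": [
        SimpleNamespace(run_iteration=0, rows=[{"score": 0.5}, {"score": 1.0}]),
        SimpleNamespace(run_iteration=1, rows=[{"score": 1.0}, {"score": 1.0}]),
    ],
}
# Legacy field: mirrors the first run only, as the release note states.
result["rows"] = result["runs"][0].rows


def mean_score_per_run(experiment_result: Dict[str, Any]) -> Dict[int, float]:
    """Average the (hypothetical) "score" value of each run's rows, keyed by run iteration."""
    return {
        run.run_iteration: mean(row["score"] for row in run.rows)
        for run in experiment_result["runs"]
    }


per_run = mean_score_per_run(result)
```

Code that reads only `result["rows"]` sees the first iteration and nothing else, which is why the note directs callers to `runs` for multi-run experiments.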
