@slister1001
Member

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Nagkumar Arkalgud and others added 6 commits March 21, 2025 22:07
Modified logging_utils.py to accept output_dir parameter for scan-specific log files.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…g_testing

Enhancement/parallelism logging testing
@nagkumar91 nagkumar91 requested a review from Copilot March 24, 2025 18:18
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces the RedTeamAgent by updating various autogenerated SDK modules and enhancing the safety evaluation code to support red team scenarios. Key changes include:

  • Adding a new error target "RedTeamAgent" in the exceptions module.
  • Updates to client, configuration, and patch files in both synchronous and asynchronous modules.
  • Significant modifications in the safety evaluation functions to refine simulation output handling and defect rate calculations.

Reviewed Changes

Copilot reviewed 57 out of 57 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| azure/ai/evaluation/autogen/raiclient/{aio/,}init.py, _client.py, _configuration.py, _version.py | Updates to generated client code with minor adjustments to imports and initialization logic. |
| azure/ai/evaluation/_safety_evaluation/_safety_evaluation.py | Major updates to simulation, evaluation, and defect rate calculation logic. |
| azure/ai/evaluation/_exceptions.py | Introduces a new error target "RedTeamAgent". |
| All *_patch.py files | Standard patch files with no functional changes. |

Comments suppressed due to low confidence (2)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_safety_evaluation/_safety_evaluation.py:698

  • The variable 'data_paths' is referenced before initialization. Ensure that 'data_paths' is defined (e.g., initialize it to an empty value) before using it in the condition.
if not data_paths and data_path is None and jailbreak_data_path is None and isinstance(target, Callable):
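
A minimal sketch of the fix this comment suggests, assuming `data_paths` is meant to be a list that may legitimately stay empty. The helper name `needs_simulation` and its parameters are hypothetical and are not taken from the SDK; only the shape of the condition mirrors the flagged line.

```python
from typing import List, Optional


def needs_simulation(
    target: object,
    data_path: Optional[str] = None,
    jailbreak_data_path: Optional[str] = None,
) -> bool:
    """Hypothetical helper mirroring the flagged condition, not the SDK's code."""
    # Bind the name before any read, so the condition below can never
    # hit an unbound local variable.
    data_paths: List[str] = []
    if data_path is not None:
        data_paths.append(data_path)

    # Same shape as the flagged condition; callable() stands in for the
    # isinstance(target, Callable) check in the original line.
    return (
        not data_paths
        and data_path is None
        and jailbreak_data_path is None
        and callable(target)
    )
```

With the name bound up front, the empty-list case and the populated case both evaluate cleanly, regardless of which branches ran earlier in the function.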

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_safety_evaluation/_safety_evaluation.py:620

  • There is a duplicate dictionary key 'content_safety.violence_defect_rate'; this causes the first value to be overwritten. Use distinct keys (for example, one for hate unfairness and one for violence) to preserve both defect rates.
evaluation_result["metrics"] = { "content_safety.violence_defect_rate": hate_unfairness_defect_rate, "content_safety.violence_defect_rate": violence_defect_rate, "content_safety.sexual_defect_rate": sexual_defect_rate, "content_safety.self_harm_defect_rate": self_harm_defect_rate }
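
The overwrite described in this comment can be reproduced in isolation. The sketch below uses placeholder numbers rather than real results, and the key `content_safety.hate_unfairness_defect_rate` is an assumed name that follows the pattern of the other keys; the PR conversation does not confirm which key the authors chose.

```python
# Placeholder values for illustration only; these are not real defect rates.
hate_unfairness_defect_rate = 0.10
violence_defect_rate = 0.25

# Repeated literal key: Python keeps a single entry holding the last value,
# so the hate/unfairness rate is silently dropped.
buggy_metrics = {
    "content_safety.violence_defect_rate": hate_unfairness_defect_rate,
    "content_safety.violence_defect_rate": violence_defect_rate,
}

# Distinct keys preserve both rates. The first key name is an assumption.
fixed_metrics = {
    "content_safety.hate_unfairness_defect_rate": hate_unfairness_defect_rate,
    "content_safety.violence_defect_rate": violence_defect_rate,
}

print(len(buggy_metrics), len(fixed_metrics))
```

Python raises no error for a repeated key in a dictionary literal, which is why this kind of bug tends to surface only in review or through a linter.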

@slister1001 slister1001 changed the title from "Introducing RedTeamAgent" to "Introducing RedTeam" Mar 25, 2025
@slister1001 slister1001 enabled auto-merge (squash) March 26, 2025 00:22
@slister1001 slister1001 disabled auto-merge March 26, 2025 17:27
@slister1001 slister1001 merged commit d4042ff into Azure:main Mar 26, 2025
26 checks passed
Labels

Evaluation Issues related to the client library for Azure AI Evaluation

7 participants