Commit d4042ff
Introducing RedTeam (#39898)
* remove redundant quotes
* Fix typo
* pylint fix
* Update broken tests
* Include the grounding json in the manifest
* Fix typo
* Come on package
* Release 1.0.0b5
* Notice from Chang
* Remove adv_conv template parameters from the outputs
* Update changelog
* Experimental tags on adv scenarios
* Readme fix on breaking change
* Add the category and both user and assistant context to the response of qr_json_lines
* Update changelog
* Rename _kwargs to _options
* _options as prefix
* update troubleshooting for simulator
* Rename according to suggestions
* Clean up readme
* more links
* Bugfix: zip_longest created null parameters
* Updated changelog
* zip does the job
* remove unused import
* Fix changelog merge
* Remove print statements
* Adding pyrit dependency
* updates
* updates
* Make pyrit extra
* Set up pyrit as extra correctly
* Limit the number of turns
* adding sample
* baseline sample with evals
* updates
* Update setup to use roman's branch
* callback chat target
* sample update
* add local callback chat target
* Revert to adding main pyrit as extra
* add option for simulation only
* budget first pass error on model target
* low budget working (without many shot)
* Adding RedTeamAgent
* updates for parallelism and cleanup
* add different converters, dispose of memory
* add typespec autogen files
* updates
* removing unused orchestrators and ensuring baseline always included
* first attempt to get attack objectives
* retrieve attack objectives successfully
* caching to retrieve multiple objectives for each of the attack strategies
* Add generated client
* Add content
* Remove params argument from get method call
* scorecard and more targets
* I did mess up that merge oops
* Make attack objective generator mandatory
* Remove debug statement from _red_team_agent.py
* get rai call working
* nits
* remove change not needed
* whitespace
* mock call to evaluate
* Remove caching for objectives, get num_objectives from the attack objective generator
* update the sample with attack objective generator
* scorecard updates
* remove baseline from detailed_joint_risk_attack_asr
* smaller changes
* Update all the content safety evaluators to have a pass/fail result and threshold
* Update groundedness service based
* Binary results for prompt based evaluators
* Update changelog
* new typespec gen and readme
* Newly generated client responds with objectives properly
* mlflow run
* Pass -> pass Fail -> fail
* Add thresholds to NLP evals
* mlflow updates
* init idea, waiting for tunnels or int to work
* Update sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_gleu/_gleu.py
Co-authored-by: Copilot <[email protected]>
* Update sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_common/_base_eval.py
Co-authored-by: Copilot <[email protected]>
* print scorecard init call to evaluate and mock attackobjectives until int is fixed
* Make a call to jailbreak and prepend response
* flip call to evaluate instead of mock
* Binarization in rouge
* Adding threshold to all evaluators
* adding progress bar
* fixing and suppressing errors
* more updates
* syntax error
* More syntax fixes
* Typo fixes
* print a message if exception occurs for binary result calc
* Final typo
* start MLFlow run earlier
* Update built in evals test
* updates for call to evaluate
* draft of binarization
* RE add the previously removed _label
* Trying a fix for the test
* Why are we checking len of keys instead of the keys themselves
* refactoring
* Update sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team_agent/red_team_agent_result.py
Co-authored-by: Nagkumar Arkalgud <[email protected]>
* Update sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team_agent/red_team_agent_result.py
Co-authored-by: Nagkumar Arkalgud <[email protected]>
* address feedback
* Update redundant comment and change to
* Yaay tests passed
* Fix bug
* uncomment recording
* updates
* Update
* use the values for accessing keys
* updates
* updates
* parsing risk category
* Handle data only scenario
* update
* Fix the keys for risk category to be lower case in result and show int url for ai studio
* Fix the way we render results
* updates
* updates
* Updates to use the new generated client
* updates
* version pin pyrit
* Update to include autogen files in setup
* Add init files and update setup
* Removed mocks
* risk category to output and update to asr calc
* revert unnecessary changes
* debug content filter error for open ai target
* minor updates
* mark strategies that are not supported
* update jailbreak retrieval
* update objective filtering logic
* fix safety eval unit tests
* remove empty utils subfolder
* init attempt to move things to diff files and add tests
* Fix the cspell error
* feat(security): Add RedTeamAgent for AI system vulnerability assessment
Implements a comprehensive RedTeamAgent feature for systematically
testing AI system security vulnerabilities using various attack strategies.
Key additions:
- Red Team Agent class with support for multiple attack strategies (Base64, ROT13, Jailbreak, etc.)
- Risk category assessment across Violence, SelfHarm, Sexual, and HateUnfairness domains
- MLflow integration for experiment tracking and result visualization
- Comprehensive scoring metrics including Attack Success Rate (ASR)
- Detailed test coverage for all major components
- Updated CHANGELOG.md with feature documentation
The RedTeamAgent helps security teams and AI developers evaluate system robustness
against potential attacks and provides detailed analytics on vulnerabilities.
* Revert all the changes to difficulty
* refactor(redteam): Simplify scorecard formatting output
Removed redundant 'Scorecard:' header and studio URL from output for cleaner display.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
* test(redteam): Update formatting_utils tests
Updated test_formatting_utils.py to match the simplified scorecard format by removing assertions for removed elements.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
* test(redteam): Fix Red Team Agent unit tests
Updated test_red_team_agent.py to properly mock logging, file handlers, and tempfile operations to support scan-specific output folders and MLflow integration.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
* feat(redteam): Add output directory support to logging
Modified logging_utils.py to accept output_dir parameter for scan-specific log files.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
* making it work, thanks claude
* Allow custom attack objectives
* keeping a count of timeout, add a test
* Update init param and test
* update to aog and scan name
* Skip work in progress tests
* fix timeout and other small tweaks
* add init file to utils
* Make redteam agent tests optional CI stage
* trying to update ci to require installation of redteam extra
* update naming, unit tests
* minor updates
* update sample with release installation
* remove sample notebook
* Please pass CI analyze
* Again CI analyze please pass
* Again CI analyze please pass
---------
Co-authored-by: Nagkumar Arkalgud <[email protected]>
Co-authored-by: Miles Holland <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Claude <[email protected]>
1 parent 093efca commit d4042ff
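The commit message above lists Base64 and ROT13 among the attack strategies and Attack Success Rate (ASR) among the scoring metrics, broken down by risk category. The following is a minimal sketch of those ideas using only the Python standard library. It is not the SDK's implementation: the names `to_base64`, `to_rot13`, `AttackOutcome`, and `attack_success_rate` are hypothetical, and ASR is assumed here to mean the percentage of attack attempts in a group that succeeded.

```python
import base64
import codecs
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


def to_base64(prompt: str) -> str:
    """Encode a prompt as Base64 text (illustrative converter, not the SDK's)."""
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


def to_rot13(prompt: str) -> str:
    """Apply the ROT13 letter substitution (illustrative converter, not the SDK's)."""
    return codecs.encode(prompt, "rot13")


@dataclass
class AttackOutcome:
    """One attack attempt. Hypothetical record shape, assumed for this sketch."""
    strategy: str        # e.g. "baseline", "base64", "rot13"
    risk_category: str   # e.g. "violence", "self_harm"
    succeeded: bool      # True if the target produced harmful output


def attack_success_rate(outcomes: Iterable[AttackOutcome], key: str) -> Dict[str, float]:
    """Return ASR as a percentage per group, grouping by the attribute named `key`."""
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for outcome in outcomes:
        group = getattr(outcome, key)
        totals[group] += 1
        successes[group] += int(outcome.succeeded)
    return {group: 100.0 * successes[group] / totals[group] for group in totals}


# Made-up outcomes, for illustration only.
outcomes = [
    AttackOutcome("baseline", "violence", False),
    AttackOutcome("base64", "violence", True),
    AttackOutcome("base64", "self_harm", False),
    AttackOutcome("rot13", "violence", False),
]

asr_by_strategy = attack_success_rate(outcomes, "strategy")
asr_by_risk = attack_success_rate(outcomes, "risk_category")
```

Grouping by an attribute name keeps one function usable for both the per-strategy and per-risk-category views of a scorecard. A real scan would fill `succeeded` from an evaluator's pass/fail result rather than from hand-written values.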
File tree
58 files changed, +13332 -163 lines changed
- sdk/evaluation/azure-ai-evaluation
  - azure/ai/evaluation
    - _safety_evaluation
    - autogen
      - raiclient
        - aio
          - operations
        - models
        - operations
    - red_team
      - utils
    - simulator/_model_tools
  - samples
  - tests/unittests
    - test_redteam
      - data
Lines changed: 1 addition & 0 deletions
Lines changed: 1 addition & 1 deletion
Lines changed: 142 additions & 147 deletions
Large diffs are not rendered by default.
Lines changed: 38 additions & 0 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 34 additions & 0 deletions
Lines changed: 128 additions & 0 deletions
Lines changed: 87 additions & 0 deletions