diff --git a/.github/agent-sops/task-implementer.sop.md b/.github/agent-sops/task-implementer.sop.md
index 34aee7e5..cc7aa333 100644
--- a/.github/agent-sops/task-implementer.sop.md
+++ b/.github/agent-sops/task-implementer.sop.md
@@ -140,6 +140,8 @@ Outline the high-level structure of the implementation and create an implementat
Write test cases based on the outlines, following strict TDD principles.

**Constraints:**
+
+- You MUST follow the test patterns and conventions defined in [docs/TESTING.md](../../docs/TESTING.md)
- You MUST validate that the task environment is set up properly
  - If you already created a commit, ensure the latest commit matches the expected hash
  - If not, ensure the correct branch is checked out
@@ -197,13 +199,12 @@ Write implementation code to pass the tests, focusing on simplicity and correctn
- You MUST otherwise continue automatically after verifying test results
- You MUST follow the Build Output Management practices defined in the Best Practices section

-#### 4.3 Review, Refactor, and Optimize
+#### 4.3 Review and Refactor Implementation

-If the implementation is complete, proceed with review of the implementation to identify opportunities for simplification or improvement.
+If the implementation is complete, proceed with a self-review of the implementation code to identify opportunities for simplification or improvement.

**Constraints:**

-- You MAY reply to user review threads with a concise response
-  - You MUST keep your response to less than 3 sentences
+
- You MUST check that all tasks are complete before proceeding
  - if tests fail, you MUST identify the issue and implement a fix
  - if builds fail, you MUST identify the issue and implement a fix
@@ -211,8 +212,40 @@ If the implementation is complete, proceed with review of the implementation to
- You MUST maintain test passing status throughout refactoring
- You SHOULD make note of simplifications in your progress notes
- You SHOULD record significant refactorings in your progress notes
+- You MUST return to step 4.2 if refactoring reveals additional implementation needs
+
+#### 4.4 Review and Refactor Tests
+
+After reviewing the implementation, review the test code to ensure it follows established patterns and provides adequate coverage.
+
+**Constraints:**
+
+- You MUST review your test code according to the guidelines in [docs/TESTING.md](../../docs/TESTING.md)
+- You MUST verify tests conform to the testing documentation standards
+- You MUST verify tests are readable and maintainable
+- You SHOULD refactor tests that are overly complex or duplicative
+- You MUST return to step 4.1 if tests need significant restructuring
+
+**Testing Checklist Verification (REQUIRED):**
+
+You MUST copy the checklist from [docs/TESTING.md](../../docs/TESTING.md) into your progress notes and explicitly verify each item. For each checklist item, you MUST:
+
+1. Copy the checklist item verbatim
+2. Mark it as `[x]` (pass) or `[-]` (fail)
+3. If failed, provide a brief explanation and fix the issue before proceeding
+
+Example format in your notes:
+
+```markdown
+## Testing Checklist Verification

-#### 4.4 Validate Implementation

+- [x] Do the tests use relevant helpers from `src/__fixtures__/` as noted in the "Test Fixtures Quick Reference" table above?
+- [-] Are tests asserting on the entire object instead of specific fields? → FAILED: test on line 45 asserts individual properties, refactoring now
+```
+
+You MUST NOT proceed to step 4.5 until ALL checklist items pass.
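+
+For reference, a verification block in which every item passes, using the complete checklist from [docs/TESTING.md](../../docs/TESTING.md), would look like:
+
+```markdown
+## Testing Checklist Verification
+
+- [x] Do the tests use relevant helpers from `src/__fixtures__/` as noted in the "Test Fixtures Quick Reference" table above?
+- [x] Are recurring code or patterns extracted to functions for better reusability/readability?
+- [x] Are tests focused on verifying one or two things only?
+- [x] Are tests written concisely enough that the bulk of each test is important to the test instead of boilerplate code?
+- [x] Are tests asserting on the entire object instead of specific fields?
+```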
+
+#### 4.5 Validate Implementation

If the implementation meets all requirements and follows established patterns, proceed with this step. Otherwise, return to step 4.2 to fix any issues.
@@ -230,6 +263,24 @@ If the implementation meets all requirements and follows established patterns, p
- You MUST verify that all dependencies are satisfied
- You MUST follow the Build Output Management practices defined in the Best Practices section

+#### 4.6 Respond to Review Feedback
+
+If you have received feedback from user reviews or PR comments, address it before proceeding to the commit phase.
+
+**Constraints:**
+
+- You MAY skip this step if no user feedback has been received yet
+- You MUST reply to user review threads with a concise response
+  - You MUST keep your response to less than 3 sentences
+- You MUST categorize each piece of feedback as:
+  - Actionable code changes that can be implemented immediately
+  - Clarifying questions that require user input
+  - Suggestions to consider for future iterations
+- You MUST implement actionable code changes before proceeding
+- You MUST re-run tests after addressing feedback to ensure nothing is broken
+- You MUST return to step 4.3 after implementing changes to review the updated code
+- You MUST use the handoff_to_user tool if clarification is needed before you can proceed
+
### 5. Commit and Pull Request Phase

If all tests are passing, draft a conventional commit message, perform the git commit, and create/update the pull request.
@@ -239,7 +290,7 @@ If all tests are passing, draft a conventional commit message, perform the git c
Before creating or updating a PR, you MUST copy the checklist from [docs/PR.md](../../docs/PR.md) into your progress notes and explicitly verify each item. For each checklist item, you MUST:

1. Copy the checklist item verbatim
-2. Mark it as `[x]` (pass) or `[ ]` (fail)
+2. Mark it as `[x]` (pass) or `[-]` (fail)
3. If failed, revise the PR description until the item passes

Example format in your notes:
@@ -401,6 +452,8 @@ If builds fail during implementation:
- Use appropriate package managers and dependency files for the project type

### Testing Best Practices
+
+- You MUST follow the comprehensive testing guidelines in [docs/TESTING.md](../../docs/TESTING.md)
- Follow TDD principles: RED → GREEN → REFACTOR
- Write tests that fail initially, then implement to make them pass
- Use appropriate testing frameworks for the project type or as specified in DEVELOPMENT.md
@@ -416,7 +469,7 @@ If builds fail during implementation:

### Checklist Verification Pattern

-When documentation files contain checklists (e.g., `docs/PR.md`), you MUST:
+When documentation files contain checklists (e.g., `docs/TESTING.md`, `docs/PR.md`), you MUST:

1. Copy the entire checklist into your progress notes
-2. Explicitly verify each item by marking `[x]` or `[ ]`
+2. Explicitly verify each item by marking `[x]` (pass) or `[-]` (fail)
diff --git a/AGENTS.md b/AGENTS.md
index 98a83a8c..373a8ad5 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -162,8 +162,21 @@ See [CONTRIBUTING.md - Development Environment](CONTRIBUTING.md#development-envi
3. **Run quality checks** before committing (pre-commit hooks will run automatically)
4. **Commit with conventional commits**: `feat:`, `fix:`, `refactor:`, `docs:`, etc.
5. **Push to remote**: `git push origin agent-tasks/{ISSUE_NUMBER}`
+6. **Create pull request** following [PR.md](docs/PR.md) guidelines

-### 3. Quality Gates
+### 3. Pull Request Guidelines
+
+When creating pull requests, you **MUST** follow the guidelines in [PR.md](docs/PR.md). Key principles:
+
+- **Focus on WHY**: Explain motivation and user impact, not implementation details
+- **Document public API changes**: Show before/after code examples
+- **Be concise**: Use prose over bullet lists; avoid exhaustive checklists
+- **Target senior engineers**: Assume familiarity with the SDK
+- **Exclude implementation details**: Leave these to code comments and diffs
+
+See [PR.md](docs/PR.md) for the complete guidance and template.
+
+### 4. Quality Gates

Pre-commit hooks automatically run:

- Unit tests (via npm test)
@@ -173,6 +186,18 @@ Pre-commit hooks automatically run:

All checks must pass before commit is allowed.

+### 5. Testing Guidelines
+
+When writing tests, you **MUST** follow the guidelines in [docs/TESTING.md](docs/TESTING.md). Key topics covered:
+
+- Test organization and file location
+- Test batching strategy
+- Object assertion best practices
+- Test coverage requirements
+- Multi-environment testing (Node.js and browser)
+
+See [TESTING.md](docs/TESTING.md) for the complete testing reference.
+
## Coding Patterns and Best Practices

### Logging Style Guide
@@ -288,102 +313,6 @@ test/integ/
└── feature.test.ts # Tests public API
```

-### Test Organization Pattern
-
-Follow this nested describe pattern for consistency:
-
-**For functions**:
-```typescript
-import { describe, it, expect } from 'vitest'
-import { functionName } from '../module'
-
-describe('functionName', () => {
-  describe('when called with valid input', () => {
-    it('returns expected result', () => {
-      const result = functionName('input')
-      expect(result).toBe('expected')
-    })
-  })
-
-  describe('when called with edge case', () => {
-    it('handles gracefully', () => {
-      const result = functionName('')
-      expect(result).toBeDefined()
-    })
-  })
-})
-```
-
-**For classes**:
-```typescript
-import { describe, it, expect } from 'vitest'
-import { ClassName } from '../module'
-
-describe('ClassName', () => {
-  describe('methodName', () => {
-    it('returns expected result', () => {
-      const instance = new ClassName()
-      const result = instance.methodName()
-      expect(result).toBe('expected')
-    })
-
-    it('handles error case', () => {
-      const instance = new ClassName()
-      expect(() => instance.methodName()).toThrow()
-    })
-  })
-
-  describe('anotherMethod', () => {
-    it('performs expected action', () => {
-      // Test implementation
-    })
-  })
-})
-```
-
-**Key principles**:
-- Top-level `describe` uses the function/class name
-- Nested `describe` blocks group related test scenarios
-- Use descriptive test names without "should" prefix
-- Group tests by functionality or scenario
-
-### Test Batching Strategy
-
-**Rule**: When test setup cost exceeds test logic cost, you MUST batch related assertions into a single test.
-
-**You MUST batch when**:
-- Setup complexity > test logic complexity
-- Multiple assertions verify the same object state
-- Related behaviors share expensive context
-
-**You SHOULD keep separate tests for**:
-- Distinct behaviors or execution paths
-- Error conditions
-- Different input scenarios
-
-**Bad - Redundant setup**:
-```typescript
-it('has correct tool name', () => {
-  const tool = createComplexTool({ /* expensive setup */ })
-  expect(tool.toolName).toBe('testTool')
-})
-
-it('has correct description', () => {
-  const tool = createComplexTool({ /* same expensive setup */ })
-  expect(tool.description).toBe('Test description')
-})
-```
-
-**Good - Batched properties**:
-```typescript
-it('creates tool with correct properties', () => {
-  const tool = createComplexTool({ /* setup once */ })
-  expect(tool.toolName).toBe('testTool')
-  expect(tool.description).toBe('Test description')
-  expect(tool.toolSpec.name).toBe('testTool')
-})
-```
-
### TypeScript Type Safety

**Strict requirements**:
@@ -652,185 +581,6 @@ export class ValidationError extends Error {
  }
}
```

-## Testing Patterns
-
-### Unit Test Location
-
-**Rule**: Unit tests files are co-located with source files, grouped in a directory named `__tests__`
-
-```
-src/subdir/
-├── agent.ts # Source file
-├── model.ts # Source file
-└── __tests__/
-    ├── agent.test.ts # Tests for agent.ts
-    └── model.test.ts # Tests for model.ts
-```
-
-### Integration Test Location
-
-**Rule**: Integration tests are separate in `test/integ/`
-
-```
-test/integ/
-├── api.test.ts # Tests public API
-└── environment.test.ts # Tests environment compatibility
-```
-
-### Test File Naming
-
-- Unit tests: `{sourceFileName}.test.ts` in `src/**/__tests__/**`
-- Integration tests: `{feature}.test.ts` in `test/integ/`
-
-### Test Coverage
-
-- **Minimum**: 80% coverage required (enforced by Vitest)
-- **Target**: Aim for high coverage on critical paths
-- **Exclusions**: Test files, config files, generated code
-
-### Writing Effective Tests
-
-```typescript
-// Good: Clear, specific test
-describe('calculateTotal', () => {
-  describe('when given valid numbers', () => {
-    it('returns the sum', () => {
-      expect(calculateTotal([1, 2, 3])).toBe(6)
-    })
-  })
-
-  describe('when given empty array', () => {
-    it('returns zero', () => {
-      expect(calculateTotal([])).toBe(0)
-    })
-  })
-})
-
-// Bad: Vague, unclear test
-describe('calculateTotal', () => {
-  it('works', () => {
-    expect(calculateTotal([1, 2, 3])).toBeTruthy()
-  })
-})
-```
-
-### Object Assertion Best Practices
-
-**Prefer testing entire objects at once** instead of individual properties for better readability and test coverage.
-
-```typescript
-// ✅ Good: Verify entire object at once
-it('returns expected user object', () => {
-  const user = getUser('123')
-  expect(user).toEqual({
-    id: '123',
-    name: 'John Doe',
-    email: 'john@example.com',
-    isActive: true
-  })
-})
-
-// ✅ Good: Verify entire array of objects
-it('yields expected stream events', async () => {
-  const events = await collectIterator(stream)
-  expect(events).toEqual([
-    { type: 'streamEvent', data: 'Starting...' },
-    { type: 'streamEvent', data: 'Processing...' },
-    { type: 'streamEvent', data: 'Complete!' },
-  ])
-})
-
-// ❌ Bad: Testing individual properties
-it('returns expected user object', () => {
-  const user = getUser('123')
-  expect(user).toBeDefined()
-  expect(user.id).toBe('123')
-  expect(user.name).toBe('John Doe')
-  expect(user.email).toBe('john@example.com')
-  expect(user.isActive).toBe(true)
-})
-
-// ❌ Bad: Testing array elements individually in a loop
-it('yields expected stream events', async () => {
-  const events = await collectIterator(stream)
-  for (const event of events) {
-    expect(event.type).toBe('streamEvent')
-    expect(event).toHaveProperty('data')
-  }
-})
-```
-
-**Benefits of testing entire objects**:
-- **More concise**: Single assertion instead of multiple
-- **Better test coverage**: Catches unexpected additional or missing properties
-- **More readable**: Clear expectation of the entire structure
-- **Easier to maintain**: Changes to the object require updating one place
-
-**Use cases**:
-- Always use `toEqual()` for object and array comparisons
-- Use `toBe()` only for primitive values and reference equality
-- When testing error objects, verify the entire structure including message and type
-
-### Testing Guidelines
-
-**Testing Approach:**
-- You **MUST** write tests for implementations (functions, classes, methods)
-- You **SHOULD NOT** write tests for interfaces since TypeScript compiler already enforces type correctness
-- You **SHOULD** write Vitest type tests (`*.test-d.ts`) for complex types to ensure backwards compatibility
-
-**Example Implementation Test:**
-```typescript
-describe('BedrockModel', () => {
-  it('streams messages correctly', async () => {
-    const provider = new BedrockModel(config)
-    const stream = provider.stream(messages)
-
-    for await (const event of stream) {
-      if (event.type === 'modelMessageStartEvent') {
-        expect(event.role).toBe('assistant')
-      }
-    }
-  })
-})
-```
-
-### Test Model Providers
-
-**When to use each test provider:**
-
-- **`MockMessageModel`**: For agent loop tests and high-level flows - focused on content blocks
-- **`TestModelProvider`**: For low-level event streaming tests where you need precise control over individual events
-
-#### MockMessageModel - Content-Focused Testing
-
-For tests focused on messages, you SHOULD use `MockMessageModel` with a content-focused API that eliminates boilerplate:
-
-```typescript
-import { MockMessageModel } from '../__fixtures__/mock-message-model'
-
-// ✅ RECOMMENDED - Single content block (most common)
-const provider = new MockMessageModel().addTurn({ type: 'textBlock', text: 'Hello' })
-
-// ✅ RECOMMENDED - Array of content blocks
-const provider = new MockMessageModel().addTurn([
-  { type: 'textBlock', text: 'Let me help' },
-  { type: 'toolUseBlock', name: 'calc', toolUseId: 'id-1', input: {} },
-])
-
-// ✅ RECOMMENDED - Multi-turn with builder pattern
-const provider = new MockMessageModel()
-  .addTurn({ type: 'toolUseBlock', name: 'calc', toolUseId: 'id-1', input: {} }) // Auto-derives 'toolUse'
-  .addTurn({ type: 'textBlock', text: 'The answer is 42' }) // Auto-derives 'endTurn'
-
-// ✅ OPTIONAL - Explicit stopReason when needed
-const provider = new MockMessageModel().addTurn({ type: 'textBlock', text: 'Partial response' }, 'maxTokens')
-
-// ✅ OPTIONAL - Error handling
-const provider = new MockMessageModel()
-  .addTurn({ type: 'textBlock', text: 'Success' })
-  .addTurn(new Error('Model failed'))
-```
-
## MCP (Model Context Protocol) Integration

The [Model Context Protocol (MCP)](https://modelcontextprotocol.io) enables agents to connect to external tools and data sources through a standardized protocol. The SDK provides `McpClient` for seamless integration with MCP servers.
@@ -928,7 +678,7 @@ Quick reference:
npm test # Run unit tests in Node.js
npm run test:browser # Run unit tests in browser (Chromium via Playwright)
npm run test:all # Run all tests in all environments
-npm run test:integ # Run integration tests 
+npm run test:integ # Run integration tests
npm run test:coverage # Run tests with coverage report
npm run lint # Check code quality
npm run format # Auto-fix formatting
@@ -936,37 +686,6 @@ npm run type-check # Verify TypeScript types
npm run build # Compile TypeScript
```

-## Multi-Environment Testing
-
-The SDK is designed to work seamlessly in both Node.js and browser environments. Our test suite validates this by running tests in both environments using Vitest's browser mode with Playwright.
-
-### Test Projects
-
-The test suite is organized into three projects:
-
-1. **unit-node** (green): Unit tests running in Node.js environment
-2. **unit-browser** (cyan): Same unit tests running in Chromium browser
-3. **integ** (magenta): Integration tests running in Node.js
-
-### Environment-Specific Test Patterns
-
-- You MUST write tests that are environment-agnostic unless they depend on Node.js features like filesystem or env-vars
-
-Some tests require Node.js-specific features (like process.env, AWS SDK) and should be skipped in browser environments:
-
-```typescript
-import { describe, it, expect } from 'vitest'
-import { isNode } from '../__fixtures__/environment'
-
-// Tests will run in Node.js, skip in browser
-describe.skipIf(!isNode)("Node.js specific features", () => {
-  it("uses environment variables", () => {
-    // This test accesses process.env
-    expect(process.env.NODE_ENV).toBeDefined()
-  })
-})
-```
-
## Troubleshooting Common Issues

If TypeScript compilation fails:
@@ -1018,6 +737,8 @@ When responding to PR feedback:

### Integration with Other Files

- **CONTRIBUTING.md**: Contains testing/setup commands and human contribution guidelines
+- **docs/TESTING.md**: Comprehensive testing guidelines (MUST follow when writing tests)
+- **docs/PR.md**: Pull request guidelines and template
- **README.md**: Public-facing documentation, links to strandsagents.com
- **package.json**: Defines all npm scripts referenced in documentation
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 0d83dbe9..06b76929 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -35,6 +35,7 @@ When proposing solutions or reviewing code, we reference these principles to gui
   ```
2. Install Playwright browsers for browser testing:
+
   ```bash
   npm run test:browser:install
   ```
@@ -94,7 +95,7 @@ npm run test:all:coverage
- **Integration Tests**: Test complete workflows in `test/integ/` directory
- **TSDoc Coverage**: All exported functions must have complete documentation

-For detailed testing patterns and examples, see [AGENTS.md - Testing Patterns](AGENTS.md#testing-patterns).
+For detailed testing patterns and guidelines, see [Testing Guidelines](docs/TESTING.md).

### Documentation Updates
@@ -119,7 +120,7 @@ When filing an issue, please check existing open, or recently closed, issues to

Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

-1. You are working against the latest source on the *main* branch.
+1. You are working against the latest source on the _main_ branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
@@ -148,6 +149,7 @@ To send us a pull request, please:
- **Formatting**: Prettier formatting applied consistently
- **Type safety**: No `any` types allowed, explicit return types required
- **Conventional commits**: Use conventional commit message format
+- **PR description**: Follow the [PR description guidelines](docs/PR.md) for writing effective descriptions

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
@@ -169,4 +171,3 @@ If you discover a potential security issue in this project we ask that you notif
## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
-
diff --git a/docs/TESTING.md b/docs/TESTING.md
new file mode 100644
index 00000000..21b74108
--- /dev/null
+++ b/docs/TESTING.md
@@ -0,0 +1,504 @@
+# Testing Guidelines - Strands TypeScript SDK
+
+> **IMPORTANT**: When writing tests, you **MUST** follow the guidelines in this document. These patterns ensure consistency, maintainability, and proper test coverage across the SDK.
+
+This document contains comprehensive testing guidelines for the Strands TypeScript SDK. For general development guidance, see [AGENTS.md](../AGENTS.md).
+
+## Test Fixtures Quick Reference
+
+All test fixtures are located in `src/__fixtures__/`. Use these helpers to reduce boilerplate and ensure consistency.
+
+| Fixture | File | When to Use | Details |
+| ---------------------- | ----------------------- | ------------------------------------------------------------------------------------ | --------------------------------------------------------------------------- |
+| `MockMessageModel` | `mock-message-model.ts` | Agent loop tests - specify content blocks, auto-generates stream events | [Model Fixtures](#model-fixtures-mock-message-modelts-model-test-helpersts) |
+| `TestModelProvider` | `model-test-helpers.ts` | Low-level model tests - precise control over individual `ModelStreamEvent` sequences | [Model Fixtures](#model-fixtures-mock-message-modelts-model-test-helpersts) |
+| `collectIterator()` | `model-test-helpers.ts` | Collect all items from any async iterable into an array | [Model Fixtures](#model-fixtures-mock-message-modelts-model-test-helpersts) |
+| `collectGenerator()` | `model-test-helpers.ts` | Collect yielded items AND final return value from async generators | [Model Fixtures](#model-fixtures-mock-message-modelts-model-test-helpersts) |
+| `MockHookProvider` | `mock-hook-provider.ts` | Record and verify hook invocations during agent execution | [Hook Fixtures](#hook-fixtures-mock-hook-providerts) |
+| `createMockTool()` | `tool-helpers.ts` | Create mock tools with custom result behavior | [Tool Fixtures](#tool-fixtures-tool-helpersts) |
+| `createRandomTool()` | `tool-helpers.ts` | Create minimal mock tools when execution doesn't matter | [Tool Fixtures](#tool-fixtures-tool-helpersts) |
+| `createMockContext()` | `tool-helpers.ts` | Create mock `ToolContext` for testing tool implementations directly | [Tool Fixtures](#tool-fixtures-tool-helpersts) |
+| `createMockAgent()` | `agent-helpers.ts` | Create minimal mock Agent with messages and state | [Agent Fixtures](#agent-fixtures-agent-helpersts) |
+| `isNode` / `isBrowser` | `environment.ts` | Environment detection for conditional test execution | [Environment Fixtures](#environment-fixtures-environmentts) |
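+
+As a usage sketch for one of the less obvious helpers above: `collectGenerator()` drains an async generator so you can assert on everything at once. The result shape destructured below (`items` plus `returnValue`) is an assumption for illustration; see `model-test-helpers.ts` for the exact signature. The `countTo` generator is likewise purely illustrative:
+
+```typescript
+import { it, expect } from 'vitest'
+import { collectGenerator } from '../__fixtures__/model-test-helpers'
+
+// Illustrative generator: yields 1..n, then returns a summary string
+async function* countTo(n: number): AsyncGenerator<number, string> {
+  for (let i = 1; i <= n; i++) {
+    yield i
+  }
+  return 'done'
+}
+
+it('collects yielded items and the return value', async () => {
+  // Assumed result shape; check model-test-helpers.ts for the real one
+  const { items, returnValue } = await collectGenerator(countTo(3))
+  expect(items).toEqual([1, 2, 3])
+  expect(returnValue).toBe('done')
+})
+```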
+
+## Test Organization
+
+### Unit Test Location
+
+**Rule**: Unit test files are co-located with source files, grouped in a directory named `__tests__`
+
+```
+src/subdir/
+├── agent.ts # Source file
+├── model.ts # Source file
+└── __tests__/
+    ├── agent.test.ts # Tests for agent.ts
+    └── model.test.ts # Tests for model.ts
+```
+
+### Integration Test Location
+
+**Rule**: Integration tests are separate in `test/integ/`
+
+```
+test/integ/
+├── api.test.ts # Tests public API
+└── environment.test.ts # Tests environment compatibility
+```
+
+### Test File Naming
+
+- Unit tests: `{sourceFileName}.test.ts` in `src/**/__tests__/**`
+- Integration tests: `{feature}.test.ts` in `test/integ/`
+
+## Test Structure Pattern
+
+Follow this nested describe pattern for consistency:
+
+### For Functions
+
+```typescript
+import { describe, it, expect } from 'vitest'
+import { functionName } from '../module'
+
+describe('functionName', () => {
+  describe('when called with valid input', () => {
+    it('returns expected result', () => {
+      const result = functionName('input')
+      expect(result).toBe('expected')
+    })
+  })
+
+  describe('when called with edge case', () => {
+    it('handles gracefully', () => {
+      const result = functionName('')
+      expect(result).toBeDefined()
+    })
+  })
+})
+```
+
+### For Classes
+
+```typescript
+import { describe, it, expect } from 'vitest'
+import { ClassName } from '../module'
+
+describe('ClassName', () => {
+  describe('methodName', () => {
+    it('returns expected result', () => {
+      const instance = new ClassName()
+      const result = instance.methodName()
+      expect(result).toBe('expected')
+    })
+
+    it('handles error case', () => {
+      const instance = new ClassName()
+      expect(() => instance.methodName()).toThrow()
+    })
+  })
+
+  describe('anotherMethod', () => {
+    it('performs expected action', () => {
+      // Test implementation
+    })
+  })
+})
+```
+
+### Key Principles
+
+- Top-level `describe` uses the function/class name
+- Nested `describe` blocks group related test scenarios
+- Use descriptive test names without "should" prefix
+- Group tests by functionality or scenario
+
+## Writing Effective Tests
+
+```typescript
+// Good: Clear, specific test
+describe('calculateTotal', () => {
+  describe('when given valid numbers', () => {
+    it('returns the sum', () => {
+      expect(calculateTotal([1, 2, 3])).toBe(6)
+    })
+  })
+
+  describe('when given empty array', () => {
+    it('returns zero', () => {
+      expect(calculateTotal([])).toBe(0)
+    })
+  })
+})
+
+// Bad: Vague, unclear test
+describe('calculateTotal', () => {
+  it('works', () => {
+    expect(calculateTotal([1, 2, 3])).toBeTruthy()
+  })
+})
+```
+
+## Test Batching Strategy
+
+**Rule**: When test setup cost exceeds test logic cost, you MUST batch related assertions into a single test.
+
+**You MUST batch when**:
+
+- Setup complexity > test logic complexity
+- Multiple assertions verify the same object state
+- Related behaviors share expensive context
+
+**You SHOULD keep separate tests for**:
+
+- Distinct behaviors or execution paths
+- Error conditions
+- Different input scenarios
+
+**Bad - Redundant setup**:
+
+```typescript
+it('has correct tool name', () => {
+  const tool = createComplexTool({
+    /* expensive setup */
+  })
+  expect(tool.toolName).toBe('testTool')
+})
+
+it('has correct description', () => {
+  const tool = createComplexTool({
+    /* same expensive setup */
+  })
+  expect(tool.description).toBe('Test description')
+})
+```
+
+**Good - Batched properties**:
+
+```typescript
+it('creates tool with correct properties', () => {
+  const tool = createComplexTool({
+    /* setup once */
+  })
+  expect(tool.toolName).toBe('testTool')
+  expect(tool.description).toBe('Test description')
+  expect(tool.toolSpec.name).toBe('testTool')
+})
+```
+
+## Object Assertion Best Practices
+
+**Prefer testing entire objects at once** instead of individual properties for better readability and test coverage.
+
+```typescript
+// ✅ Good: Verify entire object at once
+it('returns expected user object', () => {
+  const user = getUser('123')
+  expect(user).toEqual({
+    id: '123',
+    name: 'John Doe',
+    email: 'john@example.com',
+    isActive: true,
+  })
+})
+
+// ✅ Good: Verify entire array of objects
+it('yields expected stream events', async () => {
+  const events = await collectIterator(stream)
+  expect(events).toEqual([
+    { type: 'streamEvent', data: 'Starting...' },
+    { type: 'streamEvent', data: 'Processing...' },
+    { type: 'streamEvent', data: 'Complete!' },
+  ])
+})
+
+// ❌ Bad: Testing individual properties
+it('returns expected user object', () => {
+  const user = getUser('123')
+  expect(user).toBeDefined()
+  expect(user.id).toBe('123')
+  expect(user.name).toBe('John Doe')
+  expect(user.email).toBe('john@example.com')
+  expect(user.isActive).toBe(true)
+})
+
+// ❌ Bad: Testing array elements individually in a loop
+it('yields expected stream events', async () => {
+  const events = await collectIterator(stream)
+  for (const event of events) {
+    expect(event.type).toBe('streamEvent')
+    expect(event).toHaveProperty('data')
+  }
+})
+```
+
+**Benefits of testing entire objects**:
+
+- **More concise**: Single assertion instead of multiple
+- **Better test coverage**: Catches unexpected additional or missing properties
+- **More readable**: Clear expectation of the entire structure
+- **Easier to maintain**: Changes to the object require updating one place
+
+**Use cases**:
+
+- Always use `toEqual()` for object and array comparisons
+- Use `toBe()` only for primitive values and reference equality
+- When testing error objects, verify the entire structure including message and type
+
+## What to Test
+
+**Testing Approach:**
+
+- You **MUST** write tests for implementations (functions, classes, methods)
+- You **SHOULD NOT** write tests for interfaces since the TypeScript compiler already enforces type correctness
+- You **SHOULD** write Vitest type tests (`*.test-d.ts`) for complex types to ensure backwards compatibility
+
+**Example Implementation Test:**
+
+```typescript
+describe('BedrockModel', () => {
+  it('streams messages correctly', async () => {
+    const provider = new BedrockModel(config)
+    const stream = provider.stream(messages)
+
+    for await (const event of stream) {
+      if (event.type === 'modelMessageStartEvent') {
+        expect(event.role).toBe('assistant')
+      }
+    }
+  })
+})
+```
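+
+As a sketch of the type-test approach mentioned above, a `*.test-d.ts` file can use Vitest's `expectTypeOf` to pin down a public type's shape. The `AgentOptions` name, its `model` property, and the import path are illustrative assumptions, not confirmed SDK API:
+
+```typescript
+// agent.test-d.ts: type-level assertions, checked by Vitest's typecheck mode
+import { describe, it, expectTypeOf } from 'vitest'
+import type { AgentOptions } from '../agent' // hypothetical type and path
+
+describe('AgentOptions', () => {
+  it('keeps the options shape backwards compatible', () => {
+    // A `model` option is assumed here purely for illustration
+    expectTypeOf<AgentOptions>().toHaveProperty('model')
+  })
+})
+```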
+
+## Test Coverage
+
+- **Minimum**: 80% coverage required (enforced by Vitest)
+- **Target**: Aim for high coverage on critical paths
+- **Exclusions**: Test files, config files, generated code
+
+## Test Model Providers
+
+**When to use each test provider:**
+
+- **`MockMessageModel`**: For agent loop tests and high-level flows - focused on content blocks
+- **`TestModelProvider`**: For low-level event streaming tests where you need precise control over individual events
+
+### MockMessageModel - Content-Focused Testing
+
+For tests focused on messages, you SHOULD use `MockMessageModel` with a content-focused API that eliminates boilerplate:
+
+```typescript
+import { MockMessageModel } from '../__fixtures__/mock-message-model'
+
+// ✅ RECOMMENDED - Single content block (most common)
+const provider = new MockMessageModel().addTurn({ type: 'textBlock', text: 'Hello' })
+
+// ✅ RECOMMENDED - Array of content blocks
+const provider = new MockMessageModel().addTurn([
+  { type: 'textBlock', text: 'Let me help' },
+  { type: 'toolUseBlock', name: 'calc', toolUseId: 'id-1', input: {} },
+])
+
+// ✅ RECOMMENDED - Multi-turn with builder pattern
+const provider = new MockMessageModel()
+  .addTurn({ type: 'toolUseBlock', name: 'calc', toolUseId: 'id-1', input: {} }) // Auto-derives 'toolUse'
+  .addTurn({ type: 'textBlock', text: 'The answer is 42' }) // Auto-derives 'endTurn'
+
+// ✅ OPTIONAL - Explicit stopReason when needed
+const provider = new MockMessageModel().addTurn({ type: 'textBlock', text: 'Partial response' }, 'maxTokens')
+
+// ✅ OPTIONAL - Error handling
+const provider = new MockMessageModel()
+  .addTurn({ type: 'textBlock', text: 'Success' })
+  .addTurn(new Error('Model failed'))
+```
+
+## Testing Hooks
+
+When testing hook behavior, you **MUST** use `agent.hooks.addCallback()` for registering single callbacks when `agent.hooks` is available. Do NOT create inline `HookProvider` objects — this is an anti-pattern for single callbacks.
+
+```typescript
+// ✅ CORRECT - Use agent.hooks.addCallback() for single callbacks
+const agent = new Agent({ model, tools: [tool] })
+
+agent.hooks.addCallback(BeforeToolCallEvent, (event: BeforeToolCallEvent) => {
+  event.toolUse = {
+    ...event.toolUse,
+    input: { value: 42 },
+  }
+})
+
+// ✅ CORRECT - Use MockHookProvider to record and verify hook invocations
+const hookProvider = new MockHookProvider()
+const agent = new Agent({ model, hooks: [hookProvider] })
+await agent.invoke('Hi')
+expect(hookProvider.invocations).toContainEqual(new BeforeInvocationEvent({ agent }))
+
+// ❌ WRONG - Do NOT create inline HookProvider objects
+const switchToolHook = {
+  registerCallbacks: (registry: HookRegistry) => {
+    registry.addCallback(BeforeToolCallEvent, (event: BeforeToolCallEvent) => {
+      if (event.toolUse.name === 'tool1') {
+        event.tool = tool2
+      }
+    })
+  },
+}
+```
+
+**When to use each approach:**
+
+- **`agent.hooks.addCallback()`** - For adding a single callback to verify hook behavior (e.g., modifying tool input, switching tools)
+- **`MockHookProvider`** - For recording and verifying hook lifecycle behavior and that specific hook events fired during execution
+
+## Test Fixtures Reference
+
+All test fixtures are located in `src/__fixtures__/`. Use these helpers to reduce boilerplate and ensure consistency.
+
+### Model Fixtures (`mock-message-model.ts`, `model-test-helpers.ts`)
+
+- **`MockMessageModel`** - Content-focused model for agent loop tests. Use `addTurn()` with content blocks.
+- **`TestModelProvider`** - Low-level model for precise control over `ModelStreamEvent` sequences.
+- **`collectIterator(stream)`** - Collects all items from an async iterable into an array.
+- **`collectGenerator(generator)`** - Collects yielded items and final return value from an async generator.
+
+```typescript
+// MockMessageModel for agent tests
+const model = new MockMessageModel()
+  .addTurn({ type: 'toolUseBlock', name: 'calc', toolUseId: 'id-1', input: {} })
+  .addTurn({ type: 'textBlock', text: 'Done' })
+
+// collectIterator for stream results
+const events = await collectIterator(agent.stream('Hi'))
+```
+
+### Hook Fixtures (`mock-hook-provider.ts`)
+
+- **`MockHookProvider`** - Records all hook invocations for verification. Pass to `Agent({ hooks: [provider] })`.
+  - Use `{ includeModelEvents: false }` to exclude `ModelStreamEventHook` from recordings.
+  - Access `provider.invocations` to verify hook events fired.
+
+```typescript
+// Record and verify hook invocations
+const hookProvider = new MockHookProvider({ includeModelEvents: false })
+const agent = new Agent({ model, hooks: [hookProvider] })
+
+await agent.invoke('Hi')
+
+expect(hookProvider.invocations[0]).toEqual(new BeforeInvocationEvent({ agent }))
+```
+
+### Tool Fixtures (`tool-helpers.ts`)
+
+- **`createMockTool(name, resultFn)`** - Creates a mock tool with custom result behavior.
+- **`createRandomTool(name?)`** - Creates a minimal mock tool (use when tool execution doesn't matter).
+- **`createMockContext(toolUse, agentState?)`** - Creates a mock `ToolContext` for testing tool implementations directly.
+
+```typescript
+// Mock tool with custom result
+const tool = createMockTool(
+  'calculator',
+  () => new ToolResultBlock({ toolUseId: 'id', status: 'success', content: [new TextBlock('42')] })
+)
+
+// Minimal tool when execution doesn't matter
+const tool = createRandomTool('myTool')
+```
+
+**When to use fixtures vs `FunctionTool` directly:**
+
+Use `createMockTool()` or `createRandomTool()` when tools are incidental to the test. Use `FunctionTool` or `tool()` directly only when testing tool-specific behavior.
+
+```typescript
+// ✅ Use fixtures when testing agent/hook behavior
+const tool = createMockTool('testTool', () => ({
+  type: 'toolResultBlock',
+  toolUseId: 'tool-1',
+  status: 'success' as const,
+  content: [new TextBlock('Success')],
+}))
+const agent = new Agent({ model, tools: [tool] })
+
+// ❌ Don't use FunctionTool when tool behavior is irrelevant to the test
+const tool = new FunctionTool({ name: 'testTool', description: '...', inputSchema: {...}, callback: ... })
+```
+
+### Agent Fixtures (`agent-helpers.ts`)
+
+- **`createMockAgent(data?)`** - Creates a minimal mock Agent with messages and state. Use for testing components that need an Agent reference without full agent behavior.
+
+```typescript
+const agent = createMockAgent({
+  messages: [new Message({ role: 'user', content: [new TextBlock('Hi')] })],
+  state: { key: 'value' },
+})
+```
+
+### Environment Fixtures (`environment.ts`)
+
+- **`isNode`** - Boolean that detects if running in Node.js environment.
+- **`isBrowser`** - Boolean that detects if running in a browser environment.
+
+Use these for conditional test execution when tests depend on environment-specific features.
+
+```typescript
+import { isNode } from '../__fixtures__/environment'
+
+// Skip tests that require Node.js features in browser
+describe.skipIf(!isNode)('Node.js specific features', () => {
+  it('uses environment variables', () => {
+    expect(process.env.NODE_ENV).toBeDefined()
+  })
+})
+```
+
+## Multi-Environment Testing
+
+The SDK is designed to work seamlessly in both Node.js and browser environments. Our test suite validates this by running tests in both environments using Vitest's browser mode with Playwright.
+
+### Test Projects
+
+The test suite is organized into three projects:
+
+1. **unit-node** (green): Unit tests running in Node.js environment
+2. **unit-browser** (cyan): Same unit tests running in Chromium browser
+3. **integ** (magenta): Integration tests running in Node.js
+
+### Environment-Specific Test Patterns
+
+- You MUST write tests that are environment-agnostic unless they depend on Node.js features like the filesystem or environment variables
+
+Some tests require Node.js-specific features (like process.env, AWS SDK) and should be skipped in browser environments:
+
+```typescript
+import { describe, it, expect } from 'vitest'
+import { isNode } from '../__fixtures__/environment'
+
+// Tests will run in Node.js, skip in browser
+describe.skipIf(!isNode)('Node.js specific features', () => {
+  it('uses environment variables', () => {
+    // This test accesses process.env
+    expect(process.env.NODE_ENV).toBeDefined()
+  })
+})
+```
+
+## Development Commands
+
+```bash
+npm test # Run unit tests in Node.js
+npm run test:browser # Run unit tests in browser (Chromium via Playwright)
+npm run test:all # Run all tests in all environments
+npm run test:integ # Run integration tests
+npm run test:coverage # Run tests with coverage report
+```
+
+For detailed command usage, see [CONTRIBUTING.md - Testing Instructions](../CONTRIBUTING.md#testing-instructions-and-best-practices).
+
+## Checklist Items
+
+- [ ] Do the tests use relevant helpers from `src/__fixtures__/` as noted in the "Test Fixtures Quick Reference" table above?
+- [ ] Are recurring code or patterns extracted to functions for better reusability/readability?
+- [ ] Are tests focused on verifying one or two things only?
+- [ ] Are tests written concisely enough that the bulk of each test is important to the test instead of boilerplate code?
+- [ ] Are tests asserting on the entire object instead of specific fields?
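+
+### Putting It Together
+
+As a point of reference only, here is a sketch of a small test that would satisfy each checklist item above. It reuses the fixture calls and `Agent` usage shown earlier in this document; the `Agent` and `BeforeInvocationEvent` import paths are illustrative assumptions, not confirmed source locations:
+
+```typescript
+import { describe, it, expect } from 'vitest'
+import { Agent } from '../agent' // illustrative import path
+import { BeforeInvocationEvent } from '../hooks' // illustrative import path
+import { MockMessageModel } from '../__fixtures__/mock-message-model'
+import { MockHookProvider } from '../__fixtures__/mock-hook-provider'
+
+describe('Agent', () => {
+  describe('when invoked with a simple prompt', () => {
+    it('completes the turn and fires lifecycle hooks', async () => {
+      // Helpers from src/__fixtures__/ instead of hand-rolled mocks
+      const model = new MockMessageModel().addTurn({ type: 'textBlock', text: 'Hello' })
+      const hookProvider = new MockHookProvider({ includeModelEvents: false })
+      const agent = new Agent({ model, hooks: [hookProvider] })
+
+      await agent.invoke('Hi')
+
+      // One focused behavior, asserted on the entire object
+      expect(hookProvider.invocations[0]).toEqual(new BeforeInvocationEvent({ agent }))
+    })
+  })
+})
+```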