-
Notifications
You must be signed in to change notification settings - Fork 32
Description
Executive Summary
Comprehensive semantic analysis of 245 non-test Go files (68,320 lines) reveals significant refactoring opportunities driven by missing abstractions rather than poor code quality. Individual files are well-written, but systemic patterns show ~15,000 lines (22%) of duplicated code across safe-output job builders, MCP rendering, and configuration parsing.
Key Findings:
- 2,400 lines of 85-95% identical safe-output job building code across 20 files
- 1,200 lines of 75% duplicated MCP rendering with two parallel systems
- 600 lines of scattered configuration parsing logic
- 11 monolithic files over 1,000 lines requiring decomposition
- 100% duplicate functions (formatYAMLValue) in multiple locations
Potential Impact:
- Reduce codebase from 68,320 to ~53,000 lines (22% reduction)
- Eliminate 11 files over 1,000 lines
- Centralize scattered helpers from 50+ locations to <10
- Improve new feature development speed (10 lines vs 100+ for new safe-output)
Full Analysis Report
Analysis Scope
- Total Files Analyzed: 245 non-test Go files in pkg/
- Total Lines of Code: 68,320 lines
- Functions Catalogued: 1,000+ functions
- Primary Focus: pkg/workflow/ directory (148 files, ~36,000 lines)
- Analysis Method: Semantic clustering + naming pattern analysis + code similarity detection
1. MONOLITHIC FILES (>1,000 Lines)
Critical Priority Files Requiring Decomposition
| File | Lines | Issue | Recommended Split |
|---|---|---|---|
| pkg/cli/trial_command.go | 1,811 | Mixed trial execution, validation, reporting | → trial_executor.go, trial_validator.go, trial_reporter.go |
| pkg/workflow/compiler.go | 1,713 | Core compilation + parsing + utilities | → compiler.go, compiler_parser.go, compiler_config.go, compiler_helpers.go |
| pkg/cli/logs.go | 1,561 | Log processing, parsing, formatting | → logs_parser.go, logs_formatter.go, logs_analyzer.go |
| pkg/workflow/safe_outputs.go | 1,412 | Config parsing + job building + env vars | → safe_outputs_config.go, safe_outputs_builder.go, safe_outputs_env.go |
| pkg/workflow/compiler_yaml.go | 1,299 | YAML generation + step generation | → yaml_generator.go, yaml_steps.go |
| pkg/workflow/compiler_jobs.go | 1,239 | Multiple job builders | → job_builder_main.go, job_builder_safe_outputs.go |
| pkg/cli/audit_report.go | 1,228 | Audit reporting + formatting | → audit_analyzer.go, audit_formatter.go |
| pkg/workflow/copilot_engine.go | 1,178 | Engine + MCP config + logs + execution | → copilot_engine_core.go, copilot_mcp.go, copilot_logs.go |
| pkg/parser/frontmatter.go | 1,165 | Parsing + schema validation | → frontmatter_parser.go, frontmatter_validator.go |
| pkg/parser/schema.go | 1,156 | Schema definitions + validation | → schema_types.go, schema_validator.go |
| pkg/cli/compile_command.go | 1,133 | CLI command + orchestration | → compile_command.go, compile_orchestrator.go |
Total: 11 files with 15,076 lines → should become 33 focused files averaging ~450 lines each
Estimated Impact: Improved navigability, clearer responsibilities, easier testing
2. MAJOR CODE DUPLICATION CLUSTERS
Cluster A: Safe-Output Job Builders (85-95% similarity)
Affected Files: 20 files in pkg/workflow/
Pattern Identified:
Every safe-output type (create_issue, create_discussion, close_issue, update_issue, etc.) follows near-identical structure:
// Pattern repeated 20 times with 85-95% similarity:
func (c *Compiler) parseXXXConfig(outputMap map[string]any) *XXXConfig {
if configData, exists := outputMap["xxx"]; exists {
config := &XXXConfig{}
if configMap, ok := configData.(map[string]any); ok {
// Parse fields using shared helpers (90% identical)
config.TitlePrefix = parseTitlePrefixFromConfig(configMap)
config.Labels = parseLabelsFromConfig(configMap)
targetRepoSlug, _ := parseTargetRepoWithValidation(configMap)
c.parseBaseSafeOutputConfig(configMap, &config.BaseSafeOutputConfig, 1)
}
return config
}
return nil
}
func (c *Compiler) buildCreateOutputXXXJob(data *WorkflowData, mainJobName string) (*Job, error) {
// Validation (100% identical structure)
if data.SafeOutputs == nil || data.SafeOutputs.CreateXXX == nil {
return nil, fmt.Errorf("configuration required")
}
// Build env vars (85% identical)
var customEnvVars []string
customEnvVars = append(customEnvVars, buildTitlePrefixEnvVar(...))
customEnvVars = append(customEnvVars, buildLabelsEnvVar(...))
customEnvVars = append(customEnvVars, c.buildStandardSafeOutputEnvVars(...)...)
// Build outputs (90% identical)
outputs := map[string]string{
"xxx_number": "${{ steps.create_xxx.outputs.xxx_number }}",
"xxx_url": "${{ steps.create_xxx.outputs.xxx_url }}",
}
// Return job (100% identical)
return c.buildSafeOutputJob(data, SafeOutputJobConfig{...})
}Files with this pattern:
- create_issue.go (118 lines)
- create_discussion.go (109 lines)
- close_issue.go (141 lines)
- close_discussion.go (153 lines)
- update_issue.go (116 lines)
- update_pull_request.go (117 lines)
- create_pull_request.go (200 lines)
- create_pr_review_comment.go (130 lines)
- create_code_scanning_alert.go (140 lines)
- update_release.go (110 lines)
- add_comment.go (140 lines)
- add_labels.go (70 lines)
- add_reviewer.go (90 lines)
- assign_milestone.go (60 lines)
- assign_to_agent.go (60 lines)
- link_sub_issue.go (115 lines)
- publish_assets.go (135 lines)
- push_to_pull_request_branch.go (200 lines)
- missing_tool.go (80 lines)
- noop.go (30 lines)
Total Duplication: ~2,400 lines of 85-95% identical code
Refactoring Recommendation:
Create pkg/workflow/safe_output_job_factory.go:
type SafeOutputJobType string
const (
JobTypeCreateIssue SafeOutputJobType = "create_issue"
JobTypeCreateDiscussion SafeOutputJobType = "create_discussion"
// ... 18 more types
)
type SafeOutputJobSpec struct {
JobType SafeOutputJobType
StepName string
ScriptGetter func() string
Permissions *Permissions
OutputsBuilder func(*WorkflowData) map[string]string
EnvBuilder func(*WorkflowData, SafeOutputConfig) []string
ConditionBuilder func(*WorkflowData, SafeOutputConfig) ConditionNode
}
func BuildSafeOutputJob(spec SafeOutputJobSpec, data *WorkflowData, config SafeOutputConfig) (*Job, error)Estimated Impact: Reduce 2,400 lines to ~800 lines (60% reduction)
Cluster B: MCP Configuration Rendering (75% overlap)
Affected Files:
- pkg/workflow/mcp-config.go (962 lines) - Legacy system
- pkg/workflow/mcp_renderer.go (637 lines) - Unified system
Problem: Two rendering systems coexist with 75% functional overlap
Legacy System (mcp-config.go):
- renderPlaywrightMCPConfig()
- renderPlaywrightMCPConfigWithOptions()
- renderSerenaMCPConfigWithOptions()
- renderSafeOutputsMCPConfig()
- renderSafeOutputsMCPConfigWithOptions()
- renderCustomMCPConfigWrapper()
- renderSharedMCPConfig() (313 lines!)
- renderPlaywrightMCPConfigTOML()
- renderSafeOutputsMCPConfigTOML()
Unified System (mcp_renderer.go):
- MCPConfigRendererUnified.RenderGitHubMCP()
- MCPConfigRendererUnified.RenderPlaywrightMCP()
- MCPConfigRendererUnified.RenderSerenaMCP()
- MCPConfigRendererUnified.RenderSafeOutputsMCP()
- RenderGitHubMCPDockerConfig()
- RenderGitHubMCPRemoteConfig()
Code Similarity Analysis:
- renderPlaywrightMCPConfigWithOptions vs MCPConfigRendererUnified.RenderPlaywrightMCP: 85% identical
- renderSharedMCPConfig (313 lines): Should be split into separate renderers
- Both systems support TOML and JSON with duplicated logic
Total Duplication: ~1,200 lines across both files
Refactoring Recommendation:
- Delete legacy system (mcp-config.go)
- Enhance unified renderer with split files:
- mcp_renderer.go (core, 400 lines)
- mcp_renderer_github.go (200 lines)
- mcp_renderer_playwright.go (150 lines)
- mcp_renderer_serena.go (150 lines)
- mcp_renderer_helpers.go (100 lines)
Estimated Impact: Reduce 1,599 lines to ~1,000 lines (37% reduction)
Cluster C: Configuration Parsing Helpers (Scattered)
Current State: Helper functions exist in config_helpers.go but many files use inline parsing
Central Location: pkg/workflow/config_helpers.go (109 lines) ✓
Files with inline duplication:
- close_issue.go (lines 29-37, 40-54) - Label and target parsing
- close_discussion.go (lines 30-55) - Similar inline parsing
- update_issue.go (lines 72-101) - Inline field parsing
- update_pull_request.go (lines 74-102) - Inline field parsing
- create_code_scanning_alert.go - Inline parsing
- 10+ more files
Common Duplicated Patterns:
- Label Parsing (repeated 6 times):
// DUPLICATE in close_issue.go, close_discussion.go, etc.
if requiredLabels, exists := configMap["required-labels"]; exists {
if labelList, ok := requiredLabels.([]any); ok {
for _, label := range labelList {
if labelStr, ok := label.(string); ok {
config.RequiredLabels = append(config.RequiredLabels, labelStr)
}
}
}
}
// Should use: parseLabelsFromConfig(configMap)- Target Field Parsing (repeated 8 times):
// DUPLICATE in multiple files
if target, exists := configMap["target"]; exists {
if targetStr, ok := target.(string); ok {
config.Target = targetStr
}
}
// Should use: parseTargetField(configMap)- Boolean Presence Detection (repeated 15+ times):
// DUPLICATE in update_issue.go, update_pull_request.go, create_pull_request.go
if _, exists := configMap["status"]; exists {
config.Status = new(bool)
}
// Should use: parseBoolPresenceField(configMap, "status")Total Duplication: ~600 lines of scattered parsing logic
Refactoring Recommendation:
Extend config_helpers.go with:
// Add these functions to config_helpers.go:
func parseBoolPresenceField(configMap map[string]any, key string) *bool
func parseBoolValueField(configMap map[string]any, key string) *bool
func parseTargetField(configMap map[string]any) string
func parseRequiredLabels(configMap map[string]any) []string
func parseRequiredTitlePrefix(configMap map[string]any) string
func parseRequiredCategory(configMap map[string]any) string
func parseArrayField(configMap map[string]any, key string) []string
func parseIntField(configMap map[string]any, key string, defaultValue int) intEstimated Impact: Eliminate 600 lines of inline parsing, expand config_helpers.go from 109 to ~300 lines (net reduction: 400+ lines)
Cluster D: Engine Implementation Duplication (60-70%)
Affected Files:
- claude_engine.go (300 lines)
- codex_engine.go (645 lines)
- copilot_engine.go (1,178 lines)
- custom_engine.go (250 lines)
- agentic_engine.go (534 lines - base)
Common Duplication:
- Installation Steps (70% similar):
All engines generate Node.js setup with near-identical code:
- claude_engine.go:34-77 (44 lines)
- codex_engine.go:48-71 (24 lines)
- copilot_engine.go:44-129 (86 lines)
-
MCP Config Rendering (60% similar structure):
All iterate over tools with same pattern, different render methods -
Log Parsing (50% similar structure):
- claude_logs.go (565 lines) - Separate file
- codex_engine.go:408-586 (179 lines) - Embedded
- copilot_engine.go:572-658 (87 lines) - Embedded
Inconsistency: Claude has separate logs file, others embed parsing
Total Duplication: ~800 lines across engine files
Refactoring Recommendation:
Create engine_common_steps.go:
func GenerateStandardInstallationSteps(packageName, version, stepName string) []GitHubActionStep
func RenderStandardMCPConfig(yaml *strings.Builder, tools map[string]any, renderer MCPRenderer)
func ParseLogsByLine(logContent string, parsers map[string]LineParser) LogMetricsStandardize log parsing: Create *_logs.go for each engine (codex_logs.go, copilot_logs.go)
Estimated Impact: Reduce engine duplication by ~800 lines
3. EXACT DUPLICATE FUNCTIONS
Critical: 100% Identical Code
1. formatYAMLValue() - EXACT DUPLICATE
Locations:
- pkg/workflow/compiler_yaml.go:636 (50 lines)
- pkg/workflow/runtime_setup.go:636 (50 lines)
Code: Identical function, 100% duplication
Action: Create yaml_helpers.go, move function there, delete duplicates
Impact: Remove 50 lines of exact duplication
2. Boolean Presence Parsing Pattern - REPEATED 15+ TIMES
Pattern in 15+ files:
if _, exists := configMap["fieldName"]; exists {
config.FieldName = new(bool)
}Locations:
- update_issue.go (3 occurrences)
- update_pull_request.go (2 occurrences)
- create_pull_request.go (3 occurrences)
- 8+ more files with 1-2 occurrences each
Total Repetitions: 15+
Action: Create parseBoolPresenceField() helper
Impact: Remove 30+ lines of duplication
4. OUTLIER FUNCTIONS (Misplaced Code)
High Priority Misplacements
1. Expression Building in compiler.go
- Location: compiler.go:180-250
- Issue: Expression AST building doesn't belong in main compiler
- Recommendation: Move to expression_builder.go (file exists but incomplete)
2. YAML Formatting Duplication
- Issue: formatYAMLValue() exists in 2 files (compiler_yaml.go, runtime_setup.go)
- Recommendation: Create yaml_helpers.go
3. Confusing Cache File Names
- Files: action_cache.go (action resolution) vs cache.go (memory caching)
- Issue: Similar names, completely different purposes
- Recommendation: Rename cache.go → cache_memory.go
4. Permission Constructors in permissions.go
- **(redacted) permissions.go (934 lines)
- Issue: Core types + 20+ NewPermissions*() constructors mixed
- Recommendation: Split into:
- permissions_types.go (200 lines)
- permissions_constructors.go (400 lines)
- permissions_operations.go (334 lines)
5. Inconsistent Engine Log Parsing
- Claude: claude_logs.go (separate file) ✓
- Codex: Embedded in codex_engine.go ✗
- Copilot: Embedded in copilot_engine.go ✗
- Recommendation: Create codex_logs.go and copilot_logs.go for consistency
5. SCATTERED HELPER FUNCTIONS
Functions That Should Be Centralized
Target: yaml_helpers.go (NEW, ~200 lines)
Consolidate YAML formatting:
func formatYAMLValue(value any) string // From compiler_yaml.go + runtime_setup.go
func writeYAMLArray(yaml *strings.Builder, ...) // From multiple files
func writeYAMLMap(yaml *strings.Builder, ...) // From multiple files
func escapeYAMLString(value string) string // From multiple files
func quoteYAMLValue(value string) string // From multiple filesTarget: type_helpers.go (NEW, ~150 lines)
Consolidate type conversions:
func ConvertToInt(value any) (int, error) // From metrics.go
func ConvertToFloat(value any) (float64, error) // From metrics.go
func parseIntValue(value any) (int, bool) // From map_helpers.go
func parseBoolValue(value any) (bool, bool) // ScatteredTarget: error_helpers.go (NEW, ~150 lines)
Consolidate error handling:
func aggregateValidationErrors(errors []error) error
func formatValidationError(field, message string) error
func enhanceSchemaValidationError(err error, context string) errorTarget: github_helpers.go (NEW, ~200 lines)
Consolidate GitHub operations:
func parseRepoSlug(repo string) (owner, name string, err error)
func getCurrentRepository() string
func extractBaseRepo(context map[string]any) stringImpact: Centralize 50+ scattered helper locations to <10 clear locations
6. REFACTORING PRIORITIES
Priority 1: Critical (High Impact, Low Risk)
1. Fix YAML Duplication
- Impact: Remove 100% duplicate function
- Files: compiler_yaml.go, runtime_setup.go
- Effort: 2 hours
- Create: yaml_helpers.go
- Lines saved: 50
2. Extend config_helpers.go
- Impact: Remove 600 lines of inline parsing
- Files: 15+ config files
- Effort: 1-2 days
- Lines saved: 400+
3. Consolidate Boolean Presence Parsing
- Impact: Remove repeated pattern (15+ occurrences)
- Files: 10+ files
- Effort: 4 hours
- Lines saved: 30+
Priority 2: High (High Impact, Medium Risk)
4. Create Safe-Output Job Factory
- Impact: Reduce 2,400 lines to 800 lines (60% reduction)
- Files: 20 safe-output files
- Effort: 3-5 days
- Create: safe_output_job_factory.go
- Lines saved: 1,600
5. Consolidate MCP Rendering
- Impact: Reduce 1,599 lines to 1,000 lines (37% reduction)
- Files: mcp-config.go (delete), mcp_renderer.go (enhance + split)
- Effort: 5-7 days
- Lines saved: 600
6. Split Monolithic Files
- Impact: Improve maintainability significantly
- Files: 11 files over 1,000 lines
- Effort: 2-4 days per file (20-40 days total)
- Strategy: Split into focused files (3-5 per monolith)
Priority 3: Medium (Maintenance Improvement)
7. Standardize Engine Implementations
- Impact: Reduce 800 lines of duplicated setup
- Files: 4 engine files
- Effort: 3-4 days
- Create: engine_common_steps.go, codex_logs.go, copilot_logs.go
- Lines saved: 800
8. Centralize Helper Functions
- Impact: Remove scattered utilities
- Files: 30+ files
- Effort: 2-3 days
- Create: yaml_helpers.go, type_helpers.go, error_helpers.go, github_helpers.go
- Lines saved: 500
Priority 4: Low (Long-term Technical Debt)
9. Create Validation Framework
- Impact: Standardize validation patterns
- Files: 16 validation files
- Effort: 5-7 days
- Create: validation_framework.go
- Net impact: +200 lines (framework) but improved maintainability
7. ESTIMATED IMPACT SUMMARY
Quantitative Metrics
Before Refactoring:
- Total lines: 68,320
- Average file size: 280 lines
- Files >1,000 lines: 11
- Duplicated code: ~4,000 lines
- Parse functions: 30 (scattered across 20+ files)
- Build functions: 35 (scattered across 20+ files)
- Helpers: 50+ scattered locations
After Refactoring (Target):
- Total lines: ~53,000 (-22%)
- Average file size: ~220 lines (-21%)
- Files >1,000 lines: 0 (-100%)
- Duplicated code: <500 lines (-87%)
- Parse functions: 5 centralized helper files
- Build functions: 1 factory + job specs
- Helpers: <10 clear locations (-80%)
Code Reduction Breakdown
| Refactoring | Current Lines | Target Lines | Reduction |
|---|---|---|---|
| Safe-output job consolidation | 2,400 | 800 | -1,600 |
| MCP rendering consolidation | 1,599 | 1,000 | -599 |
| Config parsing consolidation | 600 scattered | 200 | -400 |
| Engine standardization | 800 duplicated | 200 | -600 |
| YAML helper consolidation | 200 | 150 | -50 |
| Duplicate function removal | 100 | 0 | -100 |
| Boolean parsing pattern | 30 | 5 | -25 |
| Monolithic file splitting | N/A | N/A | -200 (net) |
| Other improvements | N/A | N/A | -800 |
| Total | 68,320 | ~53,000 | -15,320 (-22%) |
8. IMPLEMENTATION ROADMAP
Phase 1: Quick Wins (Week 1)
- Create yaml_helpers.go
- Fix formatYAMLValue() duplication
- Extend config_helpers.go with 8 new functions
- Add parseBoolPresenceField() helper
- Update 10-15 files to use new helpers
Expected Impact: -500 lines, low risk, immediate improvement
Phase 2: Safe-Output Factory (Weeks 2-3)
- Design SafeOutputJobSpec abstraction
- Create safe_output_job_factory.go
- Migrate 3 simple job types (create_issue, create_discussion, update_issue)
- Test thoroughly
- Migrate remaining 17 job types in batches
Expected Impact: -1,600 lines, medium risk, high value
Phase 3: MCP Consolidation (Weeks 4-5)
- Analyze mcp_renderer vs mcp-config overlap
- Split mcp_renderer.go into 5 focused files
- Migrate all engines to unified renderer
- Delete legacy mcp-config.go functions
- Update all MCP tests
Expected Impact: -600 lines, medium risk, critical maintenance improvement
Phase 4: Monolithic File Splitting (Weeks 6-8)
- Split compiler.go (1,713 lines → 5 files)
- Split safe_outputs.go (1,412 lines → 3 files)
- Split copilot_engine.go (1,178 lines → 3 files)
- Split compiler_yaml.go (1,299 lines → 2 files)
- Update imports across dependent files
Expected Impact: -200 lines (net after organizing), significant navigability improvement
Phase 5: Engine Standardization (Weeks 9-10)
- Create engine_common_steps.go
- Standardize installation step generation
- Create codex_logs.go and copilot_logs.go
- Refactor all engines to use common utilities
Expected Impact: -800 lines, improved consistency
Phase 6: Helper Centralization (Week 11)
- Create type_helpers.go
- Create error_helpers.go
- Create github_helpers.go
- Move scattered helpers to appropriate locations
- Update all references
Expected Impact: -500 lines, improved discoverability
Phase 7: Final Cleanup (Week 12)
- Remove remaining inline duplication
- Standardize validation patterns
- Update documentation
- Final testing pass
Expected Impact: -300 lines, comprehensive cleanup
Total Timeline: 12 weeks for complete refactoring
9. TESTING STRATEGY
Test Coverage Requirements
For each refactoring phase:
- ✅ Ensure existing tests pass before changes
- ✅ Run
make test-unitafter each file modification - ✅ Verify
make lintpasses - ✅ Check
make buildsucceeds - ✅ No changes to public APIs (internal refactoring only)
New Tests Required:
- safe_output_job_factory_test.go (comprehensive factory testing)
- yaml_helpers_test.go (YAML formatting utilities)
- config_helpers_test.go (extend existing with new helpers)
- type_helpers_test.go (type conversion utilities)
- error_helpers_test.go (error handling utilities)
Risk Mitigation:
- All refactorings are internal with no public API changes
- Existing test coverage should catch regressions
- Changes are primarily organizational (moving code, not rewriting logic)
- Incremental migration allows for rollback at any phase
10. SUCCESS CRITERIA
Quantitative Goals
- Reduce total lines from 68,320 to ~53,000 (-22%)
- Reduce average file size from 280 to ~220 lines (-21%)
- Eliminate all files over 1,000 lines (0 files >1,000 lines)
- Reduce duplicated code from ~4,000 to <500 lines (-87%)
- Centralize helpers from 50+ to <10 locations (-80%)
Qualitative Goals
- Clear separation of concerns in all files
- Consistent naming patterns across codebase
- Single responsibility per file
- Easy discoverability of related functions
- Reduced cognitive load for developers
- New safe-output job: 10 lines (vs current 100+ lines)
- New validator: Implement interface (vs current ad-hoc)
- MCP rendering: Single unified system (vs current 2 parallel systems)
11. MIGRATION SAFETY
Low-Risk Refactorings (Start Here)
✅ YAML helper consolidation - Pure code movement
✅ Config helper extension - Adding new helpers, backward compatible
✅ Boolean presence pattern - Simple replacement with helper
Medium-Risk Refactorings (Requires Testing)
Higher-Risk Refactorings (Requires Careful Planning)
🔴 Monolithic file splitting - Many import updates across codebase
🔴 Validation framework - Architectural change, needs design review
12. KEY INSIGHTS
Root Cause Analysis
This codebase exhibits organic growth without architectural refactoring. The duplication isn't from poor coding practices - individual files are well-written. The issue is missing abstractions for common patterns that emerged over time.
Pattern Evolution:
- First safe-output type (create_issue) implemented directly ✓
- Second type (create_discussion) copy-pasted and modified ✓
- Third type (close_issue) followed same pattern ✓
- ... 17 more types added using same copy-paste approach ✗
- Result: 2,400 lines of 85-95% identical code across 20 files
Similar Evolution:
- MCP rendering: Original system + new unified renderer → both coexist with 75% overlap
- Config parsing: Helpers created but not universally adopted → inline parsing persists
- Engine implementations: Base patterns established but not extracted → duplication across engines
Recommended Approach
Start with Priority 1 (low-risk, high-impact):
- YAML helper consolidation (2 hours, -50 lines)
- Config helper extension (1-2 days, -400 lines)
- Boolean presence pattern (4 hours, -30 lines)
Then tackle Priority 2 (high-impact, medium-risk):
- Safe-output job factory (3-5 days, -1,600 lines)
- MCP consolidation (5-7 days, -600 lines)
Estimated ROI:
- Phase 1 (Quick wins): 2-3 days → -500 lines
- Phase 2 (Job factory): 3-5 days → -1,600 lines
- Phase 3 (MCP consolidation): 5-7 days → -600 lines
Total for first 3 phases: 10-15 days → -2,700 lines (4% reduction)
13. NEXT STEPS
Immediate Actions
- Review findings - Team review of analysis and priorities
- Select refactoring scope - Choose which priorities to pursue
- Create detailed implementation plan - Break down selected refactorings into tasks
- Start with Phase 1 - Low-risk quick wins to build momentum
- Establish metrics - Track lines of code, file sizes, test coverage
- Set up incremental reviews - Review after each phase
Long-Term Strategy
- Establish code review guidelines to prevent future duplication
- Create pattern library for common abstractions
- Document architectural decisions for safe-output jobs, MCP rendering, validation
- Consider automated refactoring tools for future consolidations
- Schedule quarterly architectural reviews to catch duplication early
14. CONCLUSION
This comprehensive analysis reveals a healthy codebase with systemic architectural duplication rather than poor code quality. The path forward is clear: create missing abstractions (job factory, unified MCP renderer, expanded helpers) to eliminate ~15,000 lines of duplicated code while maintaining all functionality.
Key Recommendation: Start with low-risk Phase 1 refactorings (YAML helpers, config helpers, boolean pattern) to gain confidence and momentum, then proceed to high-value Phase 2 refactorings (job factory, MCP consolidation) for maximum impact.
Expected Outcome: A 22% reduction in codebase size, elimination of all 1,000+ line files, and significantly improved maintainability through centralized patterns and clear abstractions.
Analysis Metadata
- Repository: githubnext/gh-aw
- Analysis Date: 2025-11-28
- Files Analyzed: 245 non-test Go files
- Total Lines Analyzed: 68,320 lines
- Functions Catalogued: 1,000+
- Primary Focus: pkg/workflow/ (148 files, ~36,000 lines)
- Detection Method: Semantic clustering + naming pattern analysis + code similarity detection
- Analysis Tool: Claude Code Agent with comprehensive codebase exploration
References:
AI generated by Semantic Function Refactoring