Use your entire codebase as semantically searchable context for your coding agents. CodeGraph intelligently indexes your whole project into a graph, combining ML-enhanced node, edge, and symbol discovery with regular AST parsing, and exposes intelligent context through an MCP server and agentic context-gathering tools.
- Single-pass extraction: A unified Tree-sitter + FastML walk produces nodes, edges, and symbols in one pass; lightweight pattern and symbol resolution then enriches the results.
- Node annotations: Project/organization/repo IDs, language, type, complexity, and chunk_count are recorded for retrieval and filtering.
- Chunk → embed → persist: Nodes are chunked using a model-aware tokenizer unless `CODEGRAPH_EMBEDDING_SKIP_CHUNKING=1`. Each chunk gets an embedding; writes go to the SurrealDB tables `nodes`, `edges`, `symbol_embeddings`, and `chunks`.
- Hybrid search functions: SurrealDB functions combine HNSW vector search, lexical search, and graph context. The MCP server's internal `semantic_search` tool (used by the built-in agent to gather context) chooses chunk or node search based on the env flag, preserving relevance and context.
- Incremental tracking: File/project metadata is stored to support `--watch` incremental re-indexing and dedup by `project_id` + `file_path`.
- Batched + resilient: Embedding calls and Surreal upserts are batched; sizes are tunable via `CODEGRAPH_EMBEDDINGS_BATCH_SIZE` and provider-specific env vars.
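A minimal sketch of tuning this pipeline through the environment variables named above (the values shown are illustrative assumptions, not defaults):

```bash
# Illustrative values only - tune for your embedding model and hardware
export CODEGRAPH_MAX_CHUNK_TOKENS=8192        # cap chunk size to fit the embedding model's context window
export CODEGRAPH_EMBEDDINGS_BATCH_SIZE=64     # batch size for embedding calls and Surreal upserts
# export CODEGRAPH_EMBEDDING_SKIP_CHUNKING=1  # skip chunking entirely (whole-node embeddings, more speed)

codegraph index . -r -l rust                  # produces nodes, edges, symbols, chunks, and embeddings
```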
CodeGraph follows Anthropic's MCP best practices by providing rich, pre-computed context to AI agents, eliminating the need for agents to burn tokens gathering project information. Instead of agents like Claude Code executing searches and building dependency graphs themselves, CodeGraph's MCP server handles these operations efficiently and exposes them through standardized tools.
- What they do: The MCP server ships multi-step agentic tools (e.g., `agentic_code_search`, `agentic_dependency_analysis`, `agentic_call_chain_analysis`, `agentic_architecture_analysis`, `agentic_api_surface_analysis`, `agentic_context_builder`, `agentic_semantic_question`) that plan, call graph tools, and return synthesized answers.
- When to use:
  - Need an LLM to explore unfamiliar code paths → use `agentic_call_chain_analysis`.
  - Need impact/dependency mapping before edits → use `agentic_dependency_analysis`.
  - Need a broad architectural map → use `agentic_architecture_analysis`.
  - Need quick semantic answers with citations → use `agentic_semantic_question` or `agentic_code_search`.
  - Need to understand repository API design and public surfaces → use `agentic_api_surface_analysis`.
  - Need pre-digested, semantically rich context for feature development → use `agentic_context_builder`.
- All of these tools internally run a reasoning agent (ReACT or LATS) equipped with built-in graph-analysis and semantic-search tools. The agent performs multi-step graph analysis on your indexed codebase and hybrid semantic-similarity searches (weighted 0.7 vector similarity, 0.3 lexical search) so that it gains a comprehensive understanding of your codebase before answering the client's query.
- The depth and quality of the answers is governed by a 4-tier prompt mechanism: faster, less detailed analyses for models with smaller context windows, and broader, more comprehensive analyses for models with larger context windows.
- This makes it practical to run local models on systems with limited resources, with the agent chaining tool calls to build comprehensive context, while models like grok-4-1-fast-reasoning give you the full monty in one go.
- Prerequisites: Index your codebase first (`codegraph index . -r -l <language>`) so the graph and embeddings exist. For live edits, run the MCP server with `codegraph start stdio --watch` (daemon) to keep results current.
- How to invoke in clients without MCP "instructions" support: Call the prompt tool `read_initial_instructions` once; it returns the same guidance the MCP instructions feature would have injected. Example (stdio): call the tool `read_initial_instructions`. In HTTP/SSE clients, invoke the same tool or copy its response into your system prompt.
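For HTTP/SSE clients, a hedged sketch of fetching the same instructions with a standard MCP tools/call request against the /mcp endpoint shown later in this README (your client may need to run the initialize handshake and pass session headers first):

```bash
# Assumes the server was started with: codegraph start http --port 3000
curl -X POST http://127.0.0.1:3000/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"read_initial_instructions","arguments":{}}}'
```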
Core Capabilities:
- Hybrid lexical + semantic search over code and the graph: vector embeddings, weighted BM25, and graph traversal queries combined (with optional reranking of graph DB results)
- LLM-powered code intelligence and dependency analysis
- Automatic AST + fast-ML-enhanced dependency graph construction, with SurrealDB functions enabling fast traversal
- Agentic MCP tools with tier-aware multi-step reasoning based on the model's context window
- Incremental indexing with change detection (only re-index modified files)
- Daemon mode for automatic file watching and re-indexing
- Automatic chunking (with support for disabling it for more speed)
LLM Providers:
- Anthropic Claude (Sonnet, Opus, Haiku - 4.5 model family)
- OpenAI (GPT-5.1, GPT-5.1-codex)
- Ollama (local models)
- LM Studio (Local models)
- xAI Grok (grok-4-1-fast-reasoning for true codebase understanding with a 2M context window!)
- Any OpenAI-compatible provider (e.g., Kimi-K2 Thinking via OpenRouter)
CodeGraph uses the modern Responses API by default for LM Studio and OpenAI-compatible providers. This API provides:
- Better support for reasoning models
- Improved token management
- Clearer request/response structure
Backward Compatibility: If your provider doesn't support the Responses API, enable the legacy Chat Completions API:
export CODEGRAPH_USE_COMPLETIONS_API=true
Or in ~/.codegraph/config.toml:
[llm]
use_completions_api = true
Note: Ollama is NOT affected by this setting - it uses its native API.
Embedding Providers:
- Any model with supported dimensions: 384, 768, 1024, 1536, 2048, 2560, 3072, 4096
- Ollama (local models: qwen, jina, nomic, bge, e5, minilm, etc.)
- LM Studio (local models via OpenAI-compatible API)
- Jina AI (cloud API)
- OpenAI (cloud API)
- ONNX Runtime (local CPU/GPU inference)
- Set `CODEGRAPH_EMBEDDING_DIMENSION` to match the output dimension of the model of your choosing (and one that CodeGraph supports).
- Configure `CODEGRAPH_MAX_CHUNK_TOKENS` based on your model's context window (around 80% of the window is a safe value for good chunks when chunking triggers; for example, the qwen3-embedding family with a 32k context gives great quality on codebases with clean files under 500 lines).
Reranking Providers:
- Jina AI (cloud API with jina-reranker-v3)
- Ollama (local chat models: Qwen3-Reranker family, etc.)
Vector & Graph Database:
- SurrealDB HNSW index (2-5ms query latency)
- Supports 384, 768, 1024, 1536, 2048, 2560, 3072, 4096 dimensions
- Cloud-native or local deployment
MCP Integration:
- Stdio transport (production-ready): when adding CodeGraph to a client, point to the absolute path of your built codegraph binary; passing `codegraph start stdio` as the command and arguments doesn't seem to work for most clients
- Streamable HTTP transport with SSE (experimental)
- Compatible with CLI agents: Claude Code, Gemini, Qwen, Cursor-Agent (not tested with Codex due to its extreme sandbox). Stdio won't work with e.g. Cursor, because the working project is inferred from the MCP server's process location; HTTP exists for exactly this reason - run `codegraph start http --port PORT` in your working project and add it to your client as the URL `http://localhost:PORT/mcp`
- Two agent architectures: the default ReACT (fast, single-shot reasoning) and the new LATS (Language Agent Tree Search; slower, with more comprehensive and thorough answers - uses more tokens)
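For example, to use CodeGraph from a client like Cursor over HTTP (the port number here is arbitrary):

```bash
# Run inside the project you are working on
codegraph start http --port 3000
# Then register http://localhost:3000/mcp as the MCP server URL in your client
```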
CodeGraph writes embeddings directly into SurrealDB's dimension-specific HNSW columns. Choose your provider:
Option 1: Ollama (any model)
export CODEGRAPH_EMBEDDING_PROVIDER=ollama
export CODEGRAPH_EMBEDDING_MODEL=qwen3-embedding:0.6b # Use ANY Ollama embedding model
export CODEGRAPH_EMBEDDING_DIMENSION=1024 # Match your model's output: 384, 768, 1024, 1536, 2048, 2560, 3072, or 4096
export CODEGRAPH_MAX_CHUNK_TOKENS=32000 # Match your model's context window
Option 2: LM Studio (any OpenAI-compatible model)
export CODEGRAPH_EMBEDDING_PROVIDER=lmstudio
export CODEGRAPH_LMSTUDIO_MODEL=jina-embeddings-v4 # Use ANY model loaded in LM Studio
export CODEGRAPH_EMBEDDING_DIMENSION=2048 # Match your model's output dimension
export CODEGRAPH_MAX_CHUNK_TOKENS=8196 # Match your model's context window
export CODEGRAPH_LMSTUDIO_URL=http://localhost:1234 # Default LM Studio endpoint (the /v1 path is appended automatically)
We automatically route embeddings to embedding_384, embedding_768, embedding_1024, embedding_2048, embedding_2560, or embedding_4096 columns based on your model's dimension.
CodeGraph supports text-based reranking to improve search result quality using cross-encoder models:
How Reranking Works:
- Fast semantic search retrieves initial candidates (HNSW vector search)
- Reranker scores query-document pairs using text content (not embeddings)
- Results re-ranked by relevance score for higher precision
Option 1: Jina AI Reranking (Cloud API)
- Uses jina-reranker-v3 cross-encoder model
- Highest quality, purpose-built for reranking
- Requires Jina API key
export CODEGRAPH_RERANK_PROVIDER=jina
export JINA_API_KEY=jina_...
Or in ~/.codegraph/config.toml:
[rerank]
provider = "jina"
top_n = 10 # Number of results after reranking
[rerank.jina]
model = "jina-reranker-v3"
api_key_env = "JINA_API_KEY"
Option 2: Ollama Reranking (Local, Free)
- Uses chat models for relevance scoring (e.g., Qwen3-Reranker)
- Runs locally via Ollama
- No API key required
# Pull the model first
ollama pull dengcao/Qwen3-Reranker-4B:Q5_K_M
# Configure CodeGraph
export CODEGRAPH_RERANK_PROVIDER=ollama
export CODEGRAPH_OLLAMA_RERANK_MODEL="dengcao/Qwen3-Reranker-4B:Q5_K_M"
Or in ~/.codegraph/config.toml:
[rerank]
provider = "ollama"
top_n = 10
[rerank.ollama]
model = "dengcao/Qwen3-Reranker-4B:Q5_K_M"
api_base = "http://localhost:11434"
Performance Notes:
- Jina reranking: 80-200ms per query (cloud API)
- Ollama reranking: Varies by model (local inference)
- Both significantly improve result precision over vector search alone
The agentic MCP tools (agentic_code_search, agentic_dependency_analysis, etc.) require SurrealDB for graph analysis:
Option 1: Free Cloud Instance (Recommended)
- Sign up at Surreal Cloud
- Get 1GB FREE instance - perfect for testing and small projects
- Configure connection details in environment variables
Option 2: Local Installation
# Install SurrealDB
curl -sSf https://install.surrealdb.com | sh
# Run locally
surreal start --bind 127.0.0.1:3004 --user root --pass root memory
Free Cloud Resources:
- SurrealDB Cloud: 1GB free instance at surrealdb.com/cloud
- Jina AI: 10 million free API tokens at jina.ai for embeddings and reranking
CodeGraph supports two agent architectures for agentic MCP tools, allowing you to choose between speed and quality:
- Best for: Quick queries, simple analysis tasks, rapid iterations
- Algorithm: Single-pass reasoning with action execution
- Performance: Fastest response times (30-60 seconds typical)
- Set: `CODEGRAPH_AGENT_ARCHITECTURE=react` (or leave unset)
- Best for: Complex queries requiring exploration, architectural analysis, deep dependency chains
- Algorithm: UCT-based tree search with beam width and depth control
- Performance: Slower but higher quality (60-120 seconds typical)
- Requires: `--features autoagents-lats` at build time
- Set: `CODEGRAPH_AGENT_ARCHITECTURE=lats`
LATS supports using different LLM providers for different phases of the tree search:
# Use different models for different LATS phases
export CODEGRAPH_LATS_SELECTION_PROVIDER=openai
export CODEGRAPH_LATS_SELECTION_MODEL=gpt-4o-mini
export CODEGRAPH_LATS_EXPANSION_PROVIDER=anthropic
export CODEGRAPH_LATS_EXPANSION_MODEL=claude-3-5-sonnet-20241022
export CODEGRAPH_LATS_EVALUATION_PROVIDER=openai
export CODEGRAPH_LATS_EVALUATION_MODEL=o1-preview
# Algorithm tuning
export CODEGRAPH_LATS_BEAM_WIDTH=3 # Number of best paths to keep (default: 3)
export CODEGRAPH_LATS_MAX_DEPTH=5 # Maximum search depth (default: 5)
Build with LATS support:
cargo build --release -p codegraph-mcp --features "ai-enhanced,autoagents,autoagents-lats,ollama"
Why use multi-provider LATS?
- Selection: Use fast, cheap models (gpt-5.1-codex-mini) for rapid node selection
- Expansion: Use reasoning models (grok-4-1-fast-reasoning, Claude 4.5, Opus 4.5, GPT-5.1) for generating high-quality next steps
- Evaluation: Use specialized reasoning models (Opus 4.5, GPT-5.1-codex) for accurate state evaluation
- Cost optimization: Balance quality and cost by using expensive models only where needed (local models for the cheapest and most fun combinations, or grok-4-1-fast-reasoning across the board for dirt-cheap, amazing results)
Configure timeouts and resource limits for agent execution:
# Global agent timeout (default: 300 seconds, 0 = unlimited)
export CODEGRAPH_AGENT_TIMEOUT_SECS=300
# LATS per-iteration timeout (default: 60 seconds, 0 = unlimited)
export CODEGRAPH_LATS_ITERATION_TIMEOUT_SECS=60
# Maximum tree nodes for LATS (default: auto-calculated from beam_width * max_depth * 2)
export CODEGRAPH_AGENT_MAX_TREE_NODES=0
# Agent memory window size (default: 40 messages, 0 = unlimited)
export CODEGRAPH_AGENT_MEMORY_WINDOW=40
# Enable debug logging for agent execution
export CODEGRAPH_DEBUG=1
- Choose Your Setup
- Installation
- Configuration
- Usage
- Feature Flags Reference
- Performance
- Troubleshooting
- Advanced Features
Pick the setup that matches your needs:
Best for: Privacy-conscious users, offline work, no API costs
Providers:
- Embeddings: ONNX or Ollama
- LLM: Ollama (Qwen2.5-Coder, CodeLlama, etc.)
Pros: ✅ Free, ✅ Private, ✅ No internet required after setup. Cons: ❌ Slower, ❌ Requires local GPU/CPU resources
→ Jump to Local Setup Instructions
Best for: Mac users (Apple Silicon), best local performance
Providers:
- Embeddings: LM Studio (Jina embeddings)
- LLM: LM Studio (DeepSeek Coder, etc.)
Pros: ✅ 120 embeddings/sec, ✅ MLX + Flash Attention 2, ✅ Free. Cons: ❌ Mac only, ❌ Requires LM Studio app
→ Jump to LM Studio Setup Instructions
Best for: Production use, best quality, don't want to manage local models
Providers:
- Embeddings: Jina (You get 10 million tokens for free when you just create an account!)
- LLM: Anthropic Claude or OpenAI GPT-5.1-*
- Backend: SurrealDB graph database (you get a free cloud instance up to 1 GB, or run it completely locally!)
Pros: ✅ Best quality, ✅ Fast, ✅ 1M context (sonnet[1m]). Cons: ❌ API costs, ❌ Requires internet, ❌ Data sent to cloud
→ Jump to Cloud Setup Instructions
Best for: Balancing cost and quality
Example combinations:
- Local embeddings (ONNX) + Cloud LLM (OpenAI, Claude, x.ai)
- LMStudio embeddings + Cloud LLM (OpenAI, Claude, x.ai)
- Jina AI embeddings + Local LLM (Ollama, LMStudio)
→ Jump to Hybrid Setup Instructions
# 1. Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Step 1: Install Ollama
# macOS/Linux:
curl -fsSL https://ollama.com/install.sh | sh
# Or download from: https://ollama.com/download
brew install onnx-runtime
Step 2: Pull models
# Pull embedding model
hf download qdrant/all-minillm-onnx  # via the Hugging Face CLI
# Pull LLM for code intelligence (optional)
ollama pull qwen2.5-coder:14b
Step 3: Build CodeGraph
cd codegraph-rust
# Build with ONNX embeddings and Ollama support
cargo build --release --features "onnx,ollama"
Step 4: Configure
Create ~/.codegraph/config.toml:
[embedding]
provider = "onnx" # or "ollama" if you prefer
model = "qdrant/all-minillm-onnx"
dimension = 384
[llm]
enabled = true
provider = "ollama"
model = "qwen2.5-coder:14b"
ollama_url = "http://localhost:11434"
Step 5: Index and run
# Index your project
./target/release/codegraph index /path/to/your/project
# Start MCP server
./target/release/codegraph start stdio
✅ Done! Your local setup is ready.
Step 1: Install LM Studio
- Download from lmstudio.ai
- Install and launch the app
Step 2: Download models in LM Studio
- Embedding model: `jinaai/jina-embeddings-v4`
- LLM model (optional): `lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF`
Step 3: Start LM Studio server
- In LM Studio, go to "Local Server" tab
- Click "Start Server" (runs on `http://localhost:1234`)
Step 4: Build CodeGraph
cd codegraph-rust
# Build MCP server with LM Studio support (recommended)
make build-mcp-autoagents
# Or build manually with feature flags
cargo build --release -p codegraph-mcp --features "ai-enhanced,autoagents,embeddings-lmstudio,codegraph-ai/openai-compatible"
Step 5: Configure
Option A: Config file (recommended)
Create ~/.codegraph/config.toml:
[embedding]
provider = "lmstudio"
model = "jinaai/jina-embeddings-v4"
lmstudio_url = "http://localhost:1234"
dimension = 2048
[llm]
enabled = true
provider = "lmstudio"
model = "lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF"
lmstudio_url = "http://localhost:1234"
# use_completions_api = true # Uncomment if your LM Studio version doesn't support Responses API
Note: LM Studio now uses the modern Responses API by default. If you're using an older version of LM Studio that doesn't support it, add use_completions_api = true to the [llm] section.
Option B: Environment variables
export CODEGRAPH_EMBEDDING_PROVIDER=lmstudio
export CODEGRAPH_LMSTUDIO_MODEL=jinaai/jina-embeddings-v4
export CODEGRAPH_LMSTUDIO_URL=http://localhost:1234
export CODEGRAPH_EMBEDDING_DIMENSION=2048
Step 6: Index and run
# Index your project
./target/release/codegraph index /path/to/your/project
# Start MCP server
./target/release/codegraph start stdio
✅ Done! LM Studio setup complete.
Step 1: Get API keys
- Anthropic: console.anthropic.com (Claude 4.5 models, 1M/200k ctx)
- OpenAI: platform.openai.com (GPT-5 models, 400k/200k ctx)
- xAI: x.ai (grok-4-1-fast-reasoning with 2M ctx, $0.50-$1.50/M tokens)
- Jina AI: jina.ai (for SOTA embeddings & reranking)
- SurrealDB: https://www.surrealdb.com (graph database backend, local or cloud-based setup)
Step 2: Build CodeGraph with cloud features
cd codegraph-rust
# Build with all cloud providers
cargo build --release --features "anthropic,openai-llm,openai"
# Or with Jina AI cloud embeddings
cargo build --release --features "cloud-jina,anthropic"
# Or with SurrealDB HNSW cloud/local vector backend
cargo build --release --features "cloud-surrealdb,openai"
Step 3: Run setup wizard (easiest)
./target/release/codegraph-setup
The wizard will guide you through configuration.
Or manually configure ~/.codegraph/config.toml:
For Anthropic Claude:
[embedding]
provider = "jina" # or openai
model = "jina-embeddings-v4"
openai_api_key = "sk-..." # or set OPENAI_API_KEY env var
dimension = 2048
[llm]
enabled = true
provider = "anthropic"
model = "claude-haiku"
anthropic_api_key = "sk-ant-..." # or set ANTHROPIC_API_KEY env var
context_window = 200000
For OpenAI (with reasoning models):
[embedding]
provider = "jina" # or openai
model = "jina-embeddings-v4"
openai_api_key = "sk-..."
dimension = 2048
[llm]
enabled = true
provider = "openai"
model = "gpt-5-codex-mini"
context_window=200000
openai_api_key = "sk-..."
max_completion_token = 128000
reasoning_effort = "medium" # reasoning models: "minimal", "medium", "high"
# use_completions_api = true # Uncomment if your provider doesn't support Responses API
Note: OpenAI-compatible providers now use the modern Responses API by default. Most providers support this (99%+), but if you encounter errors, add use_completions_api = true to the [llm] section.
For Jina AI (cloud embeddings + reranking):
[embedding]
provider = "jina"
model = "jina-embeddings-v4"
jina_api_key = "jina_..." # or set JINA_API_KEY env var
dimension = 2048 # or a matryoshka dimension (1024, 512, 256); adjust the HNSW vector index in schemas/*.surql to match your embedding model's dimension
[rerank]
provider = "jina" # Optional two-stage retrieval
top_n = 10
[rerank.jina]
model = "jina-reranker-v3"
api_key_env = "JINA_API_KEY"
[llm]
enabled = true
provider = "anthropic"
model = "claude-haiku"
context_window = 200000
max_completion_tokens= 25000
anthropic_api_key = "sk-ant-..."
For xAI Grok (2M context window, $0.50-$1.50/M tokens):
[embedding]
provider = "openai" # or "jina"
model = "text-embedding-3-small"
openai_api_key = "sk-..."
dimension = 1536
[llm]
enabled = true
provider = "xai"
model = "grok-4-1-fast-reasoning" # or "grok-4-turbo"
xai_api_key = "xai-..." # or set XAI_API_KEY env var
xai_base_url = "https://api.x.ai/v1" # default, can be omitted
reasoning_effort = "medium" # Options: "minimal", "medium", "high"
context_window = 2000000 # 2M tokens!
# use_completions_api = true # Uncomment if your xAI version doesn't support Responses API
Note: xAI uses the modern Responses API by default. Current xAI versions support this API, but if you encounter compatibility issues, add use_completions_api = true to the [llm] section.
For SurrealDB HNSW (graph database backend with advanced features):
[embedding]
provider = "jina" # or "openai"
model = "jina-embeddings-v4"
openai_api_key = "sk-..."
dimension = 2048
[vector_store]
backend = "surrealdb"
surrealdb_url = "ws://localhost:8000" # or cloud instance
surrealdb_namespace = "codegraph"
surrealdb_database = "production"
[llm]
enabled = true
provider = "anthropic"
model = "claude-haiku"
Step 4: Index and run
# Index your project
./target/release/codegraph index /path/to/your/project
# Start MCP server
./target/release/codegraph start stdio
✅ Done! Cloud setup complete.
Mix local and cloud providers to balance cost and quality:
Example: Local embeddings + Cloud LLM
[embedding]
provider = "onnx" # Free, local
model = "sentence-transformers/all-MiniLM-L6-v2"
dimension = 384
[llm]
enabled = true
provider = "anthropic" # Best quality for analysis
model = "sonnet[1m]"
context_window = 1000000
anthropic_api_key = "sk-ant-..."
Build with required features:
cargo build --release --features "onnx,anthropic"
Use the interactive wizard:
cargo build --release --bin codegraph-setup --features all-cloud-providers
./target/release/codegraph-setup
Configuration directory: ~/.codegraph/
All configuration files are stored in ~/.codegraph/ in TOML format.
Configuration is loaded from (in order):
- ~/.codegraph/default.toml (base configuration)
- ~/.codegraph/{environment}.toml (e.g., development.toml, production.toml)
- ~/.codegraph/local.toml (local overrides, machine-specific)
- ./config/ (fallback for backward compatibility)
- Environment variables (CODEGRAPH__* prefix)
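As a sketch of the override order, a machine-specific tweak can live in ~/.codegraph/local.toml while an environment variable still wins over every file; the double-underscore nesting shown below is an assumption about how CODEGRAPH__* keys map onto TOML paths, so check the Configuration Guide for the exact convention:

```bash
# Hypothetical override: environment variables take precedence over all config files
export CODEGRAPH__LLM__MODEL="claude-haiku"   # assumed to map to [llm].model
codegraph start stdio
```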
See Configuration Guide for complete documentation.
Full configuration example:
[embedding]
provider = "lmstudio" # or "onnx", "ollama", "openai"
model = "jinaai/jina-embeddings-v4"
dimension = 2048
batch_size = 64
[llm]
enabled = true
provider = "anthropic" # or "openai", "ollama", "lmstudio" or "xai"
model = "haiku"
anthropic_api_key = "sk-ant-..."
context_window = 200000
temperature = 0.1
max_completion_token = 25000
[performance]
num_threads = 0 # 0 = auto-detect
cache_size_mb = 512
max_concurrent_requests = 4
[logging]
level = "warn" # trace, debug, info, warn, error
format = "pretty" # pretty, json, compact
See .codegraph.toml.example for all options.
# Index a project
codegraph index -r /path/to/project
# Start MCP server (for Claude Desktop, LM Studio, etc.)
codegraph start stdio
# List available MCP tools
codegraph tools list
Keep your index up-to-date automatically by running the daemon:
# Start watching a project (runs in background)
codegraph daemon start /path/to/project
# Start in foreground for debugging
codegraph daemon start /path/to/project --foreground
# Filter by languages
codegraph daemon start /path/to/project --languages rust,typescript
# Exclude patterns
codegraph daemon start /path/to/project --exclude "**/node_modules/**" --exclude "**/target/**"
# Check daemon status
codegraph daemon status /path/to/project
codegraph daemon status /path/to/project --json # JSON output
# Stop the daemon
codegraph daemon stop /path/to/project
Features:
- Automatic re-indexing when files change (create, modify, delete, rename)
- Event coalescing to batch rapid changes
- Circuit breaker pattern for SurrealDB resilience
- Session metrics tracking (batches processed, errors, etc.)
Note: Requires the daemon feature flag:
cargo build --release -p codegraph-mcp --features "daemon,ai-enhanced"
Start the MCP server with automatic file watching - the daemon runs in the background:
# Start MCP server with file watching
codegraph start stdio --watch
# Watch a specific directory
codegraph start stdio --watch --watch-path /path/to/project
# Disable watching even if enabled in config
codegraph start stdio --no-watch
# Via environment variable
CODEGRAPH_DAEMON_AUTO_START=true codegraph start stdio
Configuration in ~/.codegraph/config.toml:
[daemon]
auto_start_with_mcp = true # Auto-start daemon when MCP server starts
debounce_ms = 30
batch_timeout_ms = 200
exclude_patterns = ["**/node_modules/**", "**/target/**"]
Note: HTTP transport is not yet implemented with the official rmcp SDK. Use STDIO transport for all MCP integrations.
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on Mac):
{
"mcpServers": {
"codegraph": {
"command": "/path/to/codegraph",
"args": ["start", "stdio"],
"env": {
"RUST_LOG": "warn"
}
}
}
}
- Start CodeGraph MCP server: `codegraph start stdio`
- In LM Studio, enable MCP support in settings
- CodeGraph tools will appear in LM Studio's tool palette
For web integrations and multi-client scenarios:
# Build with HTTP support
make build-mcp-http
# Start server
./target/release/codegraph start http
# Test endpoints
curl http://127.0.0.1:3000/health
curl -X POST http://127.0.0.1:3000/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
Features:
- ✅ Session-based stateful connections
- ✅ SSE streaming for real-time progress
- ✅ Automatic session management
- ✅ Reconnection support with Last-Event-Id
Use Cases:
- Web-based code analysis dashboards
- Multi-client collaborative environments
- API integrations
- Development/debugging (easier to inspect than STDIO)
Note: For production use with Claude Desktop, use STDIO mode.
CodeGraph provides powerful code intelligence tools via the Model Context Protocol (MCP).
After indexing your codebase, AI agents can use these agentic workflows (requires SurrealDB-backed graphs):
- `agentic_code_search` - Autonomous semantic exploration for finding and understanding code
- `agentic_dependency_analysis` - Impact and coupling analysis across forward and reverse dependencies
- `agentic_call_chain_analysis` - Execution path tracing through controllers, services, and downstream systems
- `agentic_architecture_analysis` - Architectural pattern assessment plus layer and cohesion breakdowns
- `agentic_api_surface_analysis` - Public API surface analysis with consumer mapping and change risk detection
- `agentic_context_builder` - Comprehensive context gathering ahead of implementing or modifying features
- `agentic_semantic_question` - Deep semantic Q&A that synthesizes answers across multiple subsystems
These multi-step workflows typically take 30-90 seconds to complete because they traverse the code graph and build detailed reasoning summaries.
Project scoping: MCP tool calls and SurrealDB functions are project-isolated. The server picks CODEGRAPH_PROJECT_ID (falling back to your current working directory); use a distinct value per workspace when sharing one Surreal instance, or you'll blend graph results across projects. If you update the schema, re-apply schema/codegraph.surql so the project-aware functions are available.
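A sketch of isolating two workspaces that share one SurrealDB instance (the project IDs and paths are arbitrary examples):

```bash
# Workspace A
cd ~/work/service-a
export CODEGRAPH_PROJECT_ID=service-a
codegraph start stdio --watch

# Workspace B - same Surreal instance, distinct project ID
cd ~/work/service-b
export CODEGRAPH_PROJECT_ID=service-b
codegraph start stdio --watch
```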
# 1. Index your codebase
codegraph index /path/to/your/project
# 2. Start MCP server
codegraph start stdio
# 3. Use tools from your AI agent
agentic_code_search("how does authentication work?")
agentic_dependency_analysis("what depends on AuthService?")
The following diagram shows how CodeGraph's agentic MCP tools work internally:
flowchart TD
subgraph "External: AI Agent (Claude Desktop, etc.)"
A[AI Agent] -->|MCP Tool Call| B[agentic_code_search<br/>agentic_dependency_analysis<br/>etc.]
end
subgraph "CodeGraph MCP Server"
B --> C{Tier Detection}
C -->|Read LLM Context<br/>Window Config| D[Determine Tier]
D -->|< 50K tokens| E1[Small Tier<br/>TERSE prompts<br/>5 max steps<br/>2,048 tokens]
D -->|50K-150K tokens| E2[Medium Tier<br/>BALANCED prompts<br/>10 max steps<br/>4,096 tokens]
D -->|150K-400K tokens| E3[Large Tier<br/>DETAILED prompts<br/>15 max steps<br/>8,192 tokens]
D -->|> 400K tokens| E4[Massive Tier<br/>EXPLORATORY prompts<br/>20 max steps<br/>16,384 tokens]
E1 & E2 & E3 & E4 --> F[Load Tier-Specific<br/>System Prompt]
F --> G[ReAct Agent<br/>Multi-Step Reasoning]
subgraph "Internal Graph Analysis Tools"
G -->|Step 1-N| H1[get_transitive_dependencies]
G -->|Step 1-N| H2[detect_circular_dependencies]
G -->|Step 1-N| H3[trace_call_chain]
G -->|Step 1-N| H4[calculate_coupling_metrics]
G -->|Step 1-N| H5[get_hub_nodes]
G -->|Step 1-N| H6[get_reverse_dependencies]
end
H1 & H2 & H3 & H4 & H5 & H6 --> I[SurrealDB Graph<br/>Query Execution]
I -->|Cached Results| J[LRU Cache<br/>100 entries]
I -->|Raw Data| K[Agent Reasoning]
J -->|Cache Hit| K
K -->|Iterative| G
K -->|Final Analysis| L[Structured Response]
end
L -->|Return via MCP| A
subgraph "Initial Codegraph Instructions Flow"
M[Agent Reads<br/>MCP Server Info] -->|Auto-loaded| N[Read Initial<br/>Codegraph Instructions]
N --> O[Tool Discovery:<br/>7 agentic_* tools listed]
N --> P[Tier Configuration:<br/>Context window limits]
N --> Q[Cache Settings:<br/>LRU enabled, size]
N --> R[Orchestrator Config:<br/>Max steps per tier]
O & P & Q & R --> S[Agent Ready<br/>to Use Tools]
S -.->|Invokes| B
end
style B fill:#e1f5ff
style G fill:#fff4e1
style I fill:#f0e1ff
style L fill:#e1ffe1
style N fill:#ffe1e1
Key Components:
- Tier Detection: Automatically adapts prompt complexity based on the LLM's context window
- Small (<50K): Fast, terse responses for limited-context models (e.g., local gemma3)
- Medium (50K-150K): Balanced analysis for Claude Haiku, gpt-5.1-codex-mini
- Large (150K-400K): Detailed exploration for Sonnet, Opus, gpt-5.1, qwen3:4b
- Massive (>400K): Comprehensive deep-dives for grok-4-1-fast-reasoning, gemini-3.0-pro, Sonnet[1m]
- Multi-Step Reasoning: ReAct pattern with tier-specific limits
- Each step can call internal graph analysis tools
- LRU cache prevents redundant SurrealDB queries
- Iterative refinement until analysis complete
- Internal Tools: 6 graph analysis primitives
- Zero heuristics - the LLM infers from structured data only
- Results cached transparently (100 entries default)
- Tool call logging for debugging
- Initial Instructions: Auto-loaded when the MCP server connects
- The built-in agent discovers available tools and their capabilities
- Learns tier configuration and limits
- Understands caching and orchestration settings
- Understands the context of the graph DB and its structure
- No parameters needed: External clients using the MCP server's agentic tools only need to pass semantic queries, keeping cognitive load minimal
- The server uses the MCP protocol's instructions feature to broadcast its capabilities automatically to clients that support it
CodeGraph uses feature flags to enable only the components you need. Build with the features that match your deployment.
| Feature | Description | Use Case |
|---|---|---|
| `ai-enhanced` | Agentic MCP tools | Enables 7 agentic workflows with multi-step reasoning |
| `server-http` | HTTP/SSE transport | Experimental HTTP server (use STDIO for production) |
| `autoagents` | Agent orchestration | ReAct agent architecture (default for agentic tools) |
| `autoagents-lats` | LATS algorithm | Optional higher-quality tree search architecture |
Dimension-Based Model Support:
CodeGraph supports any embedding model that outputs one of these dimensions:
- 384, 768, 1024, 1536, 2048, 2560, 3072, 4096
Set the dimension explicitly to use any model:
export CODEGRAPH_EMBEDDING_DIMENSION=2048 # Match your model's output dimension
export CODEGRAPH_MAX_CHUNK_TOKENS=2048 # Match your model's context window
| Feature | Provider | Notes |
|---|---|---|
| `embeddings-local` | ONNX Runtime | Local CPU/GPU inference. Any ONNX model with supported dimensions. |
| `embeddings-ollama` | Ollama | Any Ollama embedding model (qwen, jina, nomic, bge, e5, minilm, etc.) |
| `lmstudio` | LM Studio | Any model via OpenAI-compatible API. Auto-detects dimensions from common models. |
| `embeddings-openai` | OpenAI | Cloud API. Supports text-embedding-3-small (1536) and text-embedding-3-large (3072). |
| `embeddings-jina` | Jina AI | Cloud API. Supports jina-embeddings-v3 (1024) and jina-embeddings-v4 (2048). Use CODEGRAPH_RERANK_PROVIDER=jina for reranking. |
Configuration Priority:
- `CODEGRAPH_EMBEDDING_DIMENSION` environment variable (recommended)
- Model inference from name (fallback, limited to known models)
Chunk Size Configuration:
- `CODEGRAPH_MAX_CHUNK_TOKENS` controls how text is split before embedding
- Set this based on your embedding model's maximum context window
- Examples: 512 for small models, 2048 for medium, 8192 for large context models
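As a rough sketch, pick the line that matches your embedding model (values mirror the examples above):

```bash
export CODEGRAPH_MAX_CHUNK_TOKENS=512    # small embedding models (~512-token context)
# export CODEGRAPH_MAX_CHUNK_TOKENS=2048 # medium-context models
# export CODEGRAPH_MAX_CHUNK_TOKENS=8192 # large-context models
```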
| Feature | Provider | Models/Notes |
|---|---|---|
| `anthropic` | Anthropic Claude | Claude Sonnet 4.5, Haiku 4.5, Opus 4.5 |
| `openai-llm` | OpenAI | gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-mini |
| `openai-compatible` | LM Studio, xAI, Ollama, custom | OpenAI-compatible APIs (grok-4-1-fast-reasoning via xAI, kimi-k2-thinking via OpenRouter, local models) |
| Feature | Includes | Use Case |
|---|---|---|
| `cloud` | `embeddings-jina` + SurrealDB | Jina embeddings + cloud graph database |
| `all-cloud-providers` | `anthropic` + `openai-llm` + `openai-compatible` | All LLM providers for agentic tools |
# Local only (ONNX + Ollama)
cargo build --release --features "onnx,ollama"
# LM Studio
cargo build --release --features "openai-compatible"
# Cloud only (Anthropic + OpenAI)
cargo build --release --features "anthropic,openai-llm,openai"
# Jina AI cloud embeddings + local surrealDB
cargo build --release --features "cloud-jina"
# SurrealDB cloud vector backend
cargo build --release --features "cloud-surrealdb,openai"
# Full cloud (Jina + SurrealDB + Anthropic)
cargo build --release --features "cloud,anthropic"
# Everything (local + cloud)
cargo build --release --features "all-cloud-providers,onnx,ollama,cloud"
# HTTP server with agent orchestration
cargo build --release -p codegraph-mcp --features "ai-enhanced,autoagents,embeddings-ollama,server-http"
# With LATS tree search architecture
cargo build --release -p codegraph-mcp --features "ai-enhanced,autoagents,autoagents-lats,embeddings-ollama"
| Operation | Performance | Notes |
|---|---|---|
| Embedding generation | 120 embeddings/sec | LM Studio with MLX |
| Vector search (local) | 2-5ms latency | SurrealDB HNSW |
| Vector search (cloud) | 2-5ms latency | SurrealDB HNSW |
| Jina AI embeddings | 50-150ms per query | Cloud API call overhead |
| Jina reranking | 80-200ms for top-K | Two-stage retrieval |
| Ollama embeddings | ~1024 embeddings/30sec | all-minillm:latest (Ollama) |
| Optimization | Speedup | Memory Cost |
|---|---|---|
| Embedding cache | 10-100× | ~90 MB |
| Query cache | 100× | ~10 MB |
| Parallel search | 2-3× | Minimal |
"API key not found"
- Set environment variable: `export ANTHROPIC_API_KEY="sk-ant-..."`
- Or add to config file: `anthropic_api_key = "sk-ant-..."`
"Model not found"
- For Ollama: Run `ollama pull <model-name>` first
- For LM Studio: Download the model in the LM Studio app
- For cloud: Check your model name matches available models
"Connection refused"
- LM Studio: Make sure the local server is running
- Ollama: Check Ollama is running with `ollama list`
- Cloud: Check your internet connection
- Check docs/CLOUD_PROVIDERS.md for detailed provider setup
- See LMSTUDIO_SETUP.md for LM Studio specifics
- Open an issue on GitHub with your error message
We welcome contributions!
# Format code
cargo fmt --all
# Run linter
cargo clippy --workspace --all-targets
# Run tests
cargo test --workspace
Open an issue to discuss large changes before starting.
Dual-licensed under MIT and Apache 2.0. See LICENSE-MIT and LICENSE-APACHE for details.
- Cloud Providers Guide - Detailed cloud provider setup
- Configuration Reference - All configuration options
- Changelog - Version history and release notes
- Legacy Docs - Historical experiments and architecture notes