Multi-agent AI system for monitoring research literature, building knowledge graphs, and providing proactive intelligence to researchers.
Built with: Google Gemini API, Cloud Run, Gemini 2.0 Flash & Gemini 2.5 Pro, Firestore
Live Demo: https://frontend-up5qa34vea-uc.a.run.app (or https://frontend-338657477881.us-central1.run.app)
This platform uses 6 specialized AI agents to:
- Automatically ingest and index research papers from arXiv
- Build knowledge graphs showing paper relationships (150 relationships across 49 papers)
- Proactively alert researchers to relevant publications
- Answer questions with citations and confidence scores
- Detect contradictions and controversies in research
Key Achievement: Improved knowledge graph density from 7.7% to 12.8% (66% improvement) through multi-agent relationship detection with selective confidence thresholds.
| Service | URL | Purpose |
|---|---|---|
| Frontend | https://frontend-up5qa34vea-uc.a.run.app | React UI with D3.js graph visualization |
| API Gateway | https://api-gateway-up5qa34vea-uc.a.run.app | Request routing, service discovery |
| Orchestrator | https://orchestrator-up5qa34vea-uc.a.run.app | Coordinates ingestion & Q&A workflows |
| Graph Service | https://graph-service-up5qa34vea-uc.a.run.app | Knowledge graph queries & traversal |
| Intake Pipeline | Cloud Run Job | Paper ingestion processing |
| Graph Updater | Cloud Run Job | Relationship detection & updates |
All agents use Google ADK primitives (LlmAgent, Runner, InMemorySessionService) with Gemini 2.5 Pro:
- Entity Agent - Extracts authors, methods, datasets, and infers arXiv category
- Relationship Agent - Detects paper relationships: extends, supports, contradicts
- Answer Agent - Generates answers with citations
- Confidence Agent - Scores answer confidence
- Graph Query Agent - Translates natural language to graph queries
- Alert Matching Agent - Matches papers to user watch rules with explanations
See ARCHITECTURE.md for detailed architecture diagrams.
- Papers: 49 AI/ML research papers
- Relationships: 150 total
  - 124 "extends" relationships
  - 26 "supports" relationships
- Graph Density: 12.8% (up from 7.7%)
- Relationship Types: extends, supports, contradicts, cites, builds_on, applies
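The density figures above can be checked with a few lines of arithmetic. This is a minimal sketch, not the project's code: the function names are hypothetical, and it assumes density is measured over undirected paper pairs, n(n-1)/2. That assumption is an inference from the published numbers (150 relationships over 49 papers gives 12.8% only under that definition), not something the README states.

```python
def graph_density(num_papers: int, num_relationships: int) -> float:
    """Fraction of possible undirected paper pairs that are connected.

    Assumption: density = relationships / (n * (n - 1) / 2).
    """
    possible_pairs = num_papers * (num_papers - 1) // 2
    if possible_pairs == 0:
        return 0.0
    return num_relationships / possible_pairs


def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction."""
    return (after - before) / before


if __name__ == "__main__":
    # 49 papers -> 1176 possible pairs; 150 relationships among them.
    print(f"{graph_density(49, 150):.1%}")            # 12.8%
    # 7.7% -> 12.8% density is the reported 66% relative gain.
    print(f"{relative_improvement(7.7, 12.8):.0%}")   # 66%
```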
Optimization Story: We improved graph density by 66% through:
- Temperature increase (0.3 → 0.7) for more diverse LLM outputs
- Refined relationship detection prompt
- Selective confidence thresholds (contradicts=0.7, extends/supports=0.5)
- Union strategy to account for LLM variation
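One way to read the last two points is sketched below: each detection run yields candidate relationships, each candidate is filtered by a per-type confidence threshold, and the survivors from several runs are merged. Only the three threshold values come from this README. The `Relationship` class, the function names, the fallback threshold for other types, and the idea that "union" means merging repeated runs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Thresholds stated in this README.
THRESHOLDS: Dict[str, float] = {"contradicts": 0.7, "extends": 0.5, "supports": 0.5}
# Assumption: types without a stated threshold fall back to 0.5.
DEFAULT_THRESHOLD = 0.5


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record for one detected relationship."""
    source_id: str
    target_id: str
    rel_type: str
    confidence: float


def passes_threshold(rel: Relationship) -> bool:
    """Keep a relationship only if it clears the threshold for its type."""
    return rel.confidence >= THRESHOLDS.get(rel.rel_type, DEFAULT_THRESHOLD)


def union_runs(runs: Iterable[Iterable[Relationship]]) -> List[Relationship]:
    """Merge several detection runs, keeping the highest-confidence copy
    of each (source, target, type) triple that passes its threshold."""
    best: Dict[Tuple[str, str, str], Relationship] = {}
    for run in runs:
        for rel in run:
            if not passes_threshold(rel):
                continue
            key = (rel.source_id, rel.target_id, rel.rel_type)
            if key not in best or rel.confidence > best[key].confidence:
                best[key] = rel
    return sorted(best.values(), key=lambda r: (r.source_id, r.target_id, r.rel_type))


if __name__ == "__main__":
    run_a = [
        Relationship("p1", "p2", "extends", 0.55),
        Relationship("p1", "p3", "contradicts", 0.65),  # below 0.7, dropped
    ]
    run_b = [
        Relationship("p1", "p2", "extends", 0.80),      # same edge, higher score
        Relationship("p2", "p3", "supports", 0.50),
    ]
    for rel in union_runs([run_a, run_b]):
        print(rel)
```

The stricter bar for "contradicts" reflects that a false contradiction is more misleading to a reader than a false "extends" link, while the union across runs recovers edges that a single higher-temperature run might miss.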
- Python 3.9+
- UV - Fast Python package installer
- Google Cloud Project
- Gemini API key
# 1. Install UV (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or on macOS: brew install uv
# 2. Clone repository
git clone https://github.com/yourusername/research-intelligence-agents.git
cd research-intelligence-agents
# 3. Create virtual environment and install dependencies
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
# 4. Configure environment
cp .env.example .env
# Edit .env with your credentials:
# GOOGLE_CLOUD_PROJECT=your-project-id
# GOOGLE_API_KEY=your-gemini-api-key
# DEFAULT_MODEL=gemini-2.5-pro
# FROM_EMAIL=your-sender-address # For alert notifications
# SENDGRID_API_KEY=your-sendgrid-key # Optional, for email delivery
# 5. Verify setup
python scripts/test_setup.py
UV is a fast Python package installer written in Rust:
- 10-100x faster than pip
- Drop-in replacement for pip
- Reliable dependency resolution
- Better caching
# Deploy all services
bash scripts/deploy_all_services.sh
# Verify deployment
bash scripts/verify_services.sh
# Deploy specific service
gcloud run deploy api-gateway \
  --source ./src/services/api_gateway \
  --region us-central1 \
  --allow-unauthenticated
See DEPLOYMENT.md for detailed deployment procedures.
# Activate environment
source .venv/bin/activate
# Run API Gateway locally (from the repository root)
(cd src/services/api_gateway && python main.py)
# Run Orchestrator locally (from the repository root, in a second terminal)
(cd src/services/orchestrator && python main.py)
# Format code
black src/ tests/
# Lint code
ruff check src/ tests/
# Type checking
mypy src/
# Add demo papers
uv run python scripts/add_papers.py
# Add specific AI papers
uv run python scripts/add_ai_papers.py
# Generate relationships
uv run python scripts/populate_relationships.py
# Run all tests
pytest
# Unit tests
pytest tests/unit/
# Integration tests
pytest tests/integration/
# Test specific functionality
python scripts/test_qa_comprehensive.py
python scripts/test_relationship_detection.py
python scripts/test_graph_queries.py
- ✅ Basic PDF ingestion from arXiv
- ✅ Simple Q&A with citations
- ✅ Entity extraction
- Result: Proved concept end-to-end
- ✅ Knowledge graph relationships (150 relationships)
- ✅ Proactive alerting system with SendGrid
- ✅ Multi-agent intelligence (6 agents)
- ✅ Confidence scoring for answers
- ✅ Graph density optimization (66% improvement)
- Result: Added trust and intelligence layer
- ✅ Production deployment to Cloud Run
- ✅ Interactive graph visualization with D3.js
- ✅ Service health monitoring
- ✅ Comprehensive documentation
- Result: Production-ready for demo
See docs/planning/IMPLEMENTATION_PLAN.md for detailed phase breakdown.
- DEMO_GUIDE.md - Hackathon presentation guide
- API_REFERENCE.md - API endpoints and examples
- docs/CODEBASE_AUDIT_2025-11-08.md - Comprehensive codebase audit
- docs/guides/PHASE_0_SETUP_GUIDE.md - Environment setup guide
- docs/guides/UV_SETUP.md - UV package manager guide
- docs/guides/GCP_ARXIV_SETUP.md - GCP project & arXiv API setup
- docs/guides/GENAI_SDK_MIGRATION.md - Google GenAI SDK migration guide
- ARCHITECTURE.md - System architecture with diagrams
- docs/planning/IMPLEMENTATION_PLAN.md - Phased development plan (Crawl/Walk/Run)
- docs/planning/KNOWLEDGE_GRAPH_DESIGN.md - Graph schema & relationship types
- docs/planning/STATUS.md - Current progress tracker
- docs/reference/HackathonBrief.md - Google Cloud Run Hackathon requirements
- FUTURE_ROADMAP.md - Planned features and enhancements
- Phase 0: Environment Setup
- Phase 1: Crawl - Basic Features (PDF ingestion, Q&A, citations)
- Phase 2: Walk - Intelligence Layer (Graph, alerts, confidence scoring)
- Phase 3: Run - Production Ready (Deployment, visualization, monitoring)
Current Status: Production-ready, all services deployed and healthy ✅
Built for Google Cloud Run Hackathon - AI Agents Category
Requirements Met:
- ✅ Multi-agent application (6 specialized agents)
- ✅ Google Gemini API integration
- ✅ Deployed to Cloud Run (4 services + 3 jobs + 1 worker)
- ✅ All 3 resource types: Services, Jobs, Workers
- ✅ Solves real-world problem (research literature monitoring)
- ✅ Agent collaboration (multi-agent orchestration)
- ✅ Production-ready with monitoring
Unique Features:
- Knowledge graph with 12.8% density (150 relationships)
- Multi-agent relationship detection with selective thresholds
- Interactive graph visualization
- Proactive alerting system
- Confidence-scored Q&A with citations
- Automatic PDF download from arXiv
- Metadata fetching from the arXiv API for manually uploaded papers
- Filename-based arXiv ID extraction (e.g., 2411.04997.pdf)
- Entity extraction (authors, methods, datasets)
- LLM-based arXiv category inference
- Semantic indexing with embeddings
- Metadata enrichment
- Multi-agent relationship detection
- 6 relationship types: extends, supports, contradicts, cites, builds_on, applies
- Temporal constraint handling (papers can only reference older papers)
- Graph density optimization through LLM temperature tuning
- D3.js force-directed graph
- Node coloring by paper category
- Relationship type filtering
- Hover tooltips with paper metadata
- Click to view paper details
- User-defined interest profiles (claim-based, keyword, author, relationship)
- Semantic matching with Gemini
- Enhanced email notifications with:
  - Paper category/field display
  - Key findings excerpt
  - Match confidence percentage with color coding
  - More specific subject lines
- Email notifications via SendGrid
- Alert history tracking
- Watch rules default to FROM_EMAIL if not specified
- Natural language question answering
- Confidence scoring (0-1 scale)
- Source citations with paper IDs
- Graph-augmented retrieval
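The filename-based arXiv ID extraction listed above (e.g., 2411.04997.pdf) can be sketched with a regular expression. This is an illustration only: the function name and pattern are assumptions, it covers only the current arXiv identifier scheme (YYMM.NNNN or YYMM.NNNNN, optionally followed by a version suffix such as v2), and the project's own implementation may differ.

```python
import re
from pathlib import Path
from typing import Optional

# New-style arXiv identifiers: four digits, a dot, then four or five digits,
# optionally followed by a version suffix (v1, v2, ...).
ARXIV_ID = re.compile(r"(?<!\d)(\d{4}\.\d{4,5})(v\d+)?(?!\d)")


def extract_arxiv_id(filename: str) -> Optional[str]:
    """Return the arXiv ID found in a file name, without any version suffix.

    Hypothetical helper: returns None when the name contains no ID.
    """
    match = ARXIV_ID.search(Path(filename).name)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_arxiv_id("2411.04997.pdf"))            # 2411.04997
    print(extract_arxiv_id("uploads/2411.04997v2.pdf"))  # 2411.04997
    print(extract_arxiv_id("notes.pdf"))                 # None
```

Only the file name is searched, not the full path, so a directory that happens to contain digits does not produce a false match.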
research-intelligence-agents/
├── src/                    # Core application code
│   ├── agents/             # AI agents (entity, relationship, Q&A, confidence, alert)
│   ├── pipelines/          # Ingestion & Q&A orchestration
│   ├── services/           # 6 Cloud Run services
│   ├── jobs/               # Background jobs (arXiv watcher, graph updater)
│   ├── workers/            # Pub/Sub workers (alert worker)
│   ├── tools/              # PDF reading, retrieval, graph queries
│   ├── storage/            # Firestore client
│   └── utils/              # Config, logging, embeddings
│
├── scripts/                # 54 operational scripts
│   ├── deploy_all_services.sh
│   ├── add_papers.py
│   ├── populate_relationships.py
│   └── test_*.py
│
├── docs/                   # Comprehensive documentation
│   ├── guides/             # Setup & migration guides
│   ├── planning/           # Phase plans & design docs
│   └── reference/          # Hackathon brief
│
└── tests/                  # Test suite (pytest)
    ├── unit/               # Unit tests
    ├── integration/        # Integration tests
    └── fixtures/           # Test papers & expected outputs
This is a hackathon project. Contributions welcome after initial submission!
To contribute:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file
- Google Cloud Run Hackathon for the opportunity
- Google Gemini API for powerful LLM capabilities
- arXiv.org for open access to research papers
- D3.js for graph visualization
For questions or feedback, please open an issue on GitHub.