Contributor

@EwanTauran EwanTauran commented Oct 17, 2025

Description

This PR adds Airweave as a tool integration, enabling LlamaIndex agents to search across data automatically synced from 30+ sources.

What is Airweave?
Airweave is an open-source platform that syncs data from multiple sources (Google Drive, Notion, GitHub, databases, APIs, etc.) and provides unified search with advanced retrieval capabilities.

Implementation:

  • AirweaveToolSpec with 5 tool functions:

    • search_collection: Simple search with default settings
    • advanced_search_collection: Full control over retrieval parameters
    • search_and_generate_answer: RAG-style direct answers
    • list_collections: Discover available collections
    • get_collection_info: Get collection details
  • Advanced search features:

    • Multiple retrieval strategies (hybrid, neural, keyword)
    • Temporal relevance weighting for recent content
    • Query expansion for better recall
    • Auto-interpret filters from natural language
    • LLM-based reranking for improved relevance
    • Natural language answer generation

Testing & Documentation:

  • 13 passing unit tests with comprehensive coverage
  • Full README with usage examples
  • Jupyter notebook demonstrating all features
  • Production tested with real data

Links:

Fixes #20110

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes - Set to version 0.1.0 in pyproject.toml
  • No

Type of Change

Please delete options that are not relevant.

[x] New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • I added new unit tests to cover this change

Testing details:

cd llama-index-integrations/tools/llama-index-tools-airweave
uv run pytest tests/ -v
# Result: 13 passed, 0 failed

All tests use mocked Airweave SDK calls and cover:

  • Class inheritance and initialization
  • All 5 tool functions (search, advanced search, RAG answers, list, get info)
  • Edge cases (empty results, missing completions)
  • Both dict and object response parsing

Additionally tested with production Airweave instance and real data.
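The "dict and object response parsing" and "empty results" cases listed above can be illustrated with a small stand-alone sketch. It does not use the real Airweave SDK or the code from this PR: `get_field`, `extract_texts`, the `results` and `text` field names, and the mocked `client.search` call are all hypothetical names chosen for illustration.

```python
from types import SimpleNamespace
from typing import Any, List
from unittest.mock import MagicMock


def get_field(item: Any, name: str, default: Any = None) -> Any:
    """Read `name` from either a dict or an attribute-style object."""
    if isinstance(item, dict):
        return item.get(name, default)
    return getattr(item, name, default)


def extract_texts(response: Any) -> List[str]:
    """Collect the (hypothetical) `text` field of each result.

    Returns an empty list when `results` is missing, None, or empty.
    """
    results = get_field(response, "results") or []
    return [get_field(result, "text", "") for result in results]


# A mocked client standing in for the SDK; `search` and its return shape
# are assumptions, not the real Airweave API.
client = MagicMock()
client.search.return_value = SimpleNamespace(
    results=[{"text": "from a dict"}, SimpleNamespace(text="from an object")]
)

texts = extract_texts(client.search(query="anything"))
empty = extract_texts({"results": []})
```

Routing every field access through one helper keeps the tool functions indifferent to whether the mocked (or real) client returns plain dicts or model objects, which is what lets a single set of tests cover both shapes.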

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

- Add AirweaveToolSpec with 5 tool functions:
  * search_collection: Simple search with default settings
  * advanced_search_collection: Full control over retrieval parameters
  * search_and_generate_answer: RAG-style direct answers
  * list_collections: Discover available collections
  * get_collection_info: Get collection details

- Advanced search features:
  * Multiple retrieval strategies (hybrid, neural, keyword)
  * Temporal relevance weighting for recent content
  * Query expansion for better recall
  * Auto-interpret filters from natural language
  * LLM-based reranking for improved relevance
  * Natural language answer generation

- 13 passing unit tests with comprehensive coverage
- Full documentation with usage examples
- Jupyter notebook example demonstrating all features
- Follows LlamaIndex conventions (gpt-4o-mini, async patterns)
- Compatible with FunctionAgent
- Production tested with real data

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Oct 17, 2025
Member

@AstraBert AstraBert left a comment

Looks good! Just some minor comments; I would also recommend replacing function- and class-level imports with top-level imports.

EwanTauran and others added 2 commits December 1, 2025 16:07
…ling

- Removed redundant imports of AirweaveSDK and SearchRequest.
- Updated the return type of the search method to Optional[str].
- Added a warning when no answer can be generated from search results.
@EwanTauran EwanTauran requested a review from AstraBert December 2, 2025 00:26
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 2, 2025
@AstraBert AstraBert enabled auto-merge (squash) December 2, 2025 20:39
- Modified the mock Airweave SDK patch to use the correct import path.
- Updated the test to assert that a UserWarning is raised when no answer can be generated from the search results, and changed the expected return value to None.
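The behavior these commits describe (an `Optional[str]` return, a warning when no answer can be generated, and a test that expects `None`) might look roughly like the sketch below. This is an assumption-laden illustration and not the merged code: `answer_or_none`, the `completion` field name, and the warning text are hypothetical.

```python
import warnings
from typing import Any, Optional


def answer_or_none(response: Any) -> Optional[str]:
    """Return the generated answer, or warn and return None if there is none."""
    # `completion` is a hypothetical field name; accept dicts and objects alike.
    if isinstance(response, dict):
        completion = response.get("completion")
    else:
        completion = getattr(response, "completion", None)

    if not completion:
        warnings.warn(
            "No answer could be generated from the search results.",
            UserWarning,
            stacklevel=2,
        )
        return None
    return completion


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    missing = answer_or_none({"results": [], "completion": None})

present = answer_or_none({"completion": "An example answer."})
```

In a pytest suite the same check is usually written with `pytest.warns(UserWarning)` around the call; the standard-library `warnings.catch_warnings` form is used here only to keep the sketch dependency-free.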
auto-merge was automatically disabled December 2, 2025 21:27

Head branch was pushed to by a user without write access

@AstraBert AstraBert enabled auto-merge (squash) December 2, 2025 22:24
@AstraBert AstraBert merged commit 0cf1995 into run-llama:main Dec 2, 2025
11 checks passed

Successfully merging this pull request may close these issues.

[Feature Request]: Add Airweave Integration as a Tool
