Releases: thiswillbeyourgithub/wdoc
Release 4.1.0
What's new
This release focuses on robustness improvements, particularly around language detection, file loading, and error handling.
Features
- Task type system: Introduced dataclass-based task type storage for better type safety [7c95e3c]
- Source tag logging: Added failure count and success rate tracking to source tag logging [69dca45]
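As a rough illustration of what dataclass-based task type storage can look like, here is a minimal sketch. The class name `TaskType`, its fields, the `TASKS` registry and `get_task` are all hypothetical and are not taken from wdoc's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskType:
    """Hypothetical description of one task; not wdoc's actual class."""

    name: str
    needs_query: bool      # whether the task expects a user question
    writes_summary: bool   # whether the task produces a summary


# Illustrative registry; the field values are assumptions.
TASKS = {
    t.name: t
    for t in (
        TaskType("query", needs_query=True, writes_summary=False),
        TaskType("summarize", needs_query=False, writes_summary=True),
    )
}


def get_task(name: str) -> TaskType:
    """Look up a task, failing loudly on unknown names."""
    try:
        return TASKS[name]
    except KeyError:
        raise ValueError(f"unknown task: {name!r}") from None
```

Compared with passing bare strings around, a frozen dataclass gives type checkers something to verify and makes accidental mutation an error.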
Fixes
- PowerPoint loader: Fixed TypeError when loading PowerPoint files [ebfc66c]
- Anki loader: Resolved forward reference error [73924e1]
- Language detection: Fixed potential edge case issue [2d928ab]
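- Infinite loop detection: Adjusted the loop counter threshold [fcf9ca5], then replaced the counter with hash-based detection [bb147b3]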
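The commit log for this release mentions replacing a loop counter with hash-based infinite loop detection. The idea can be sketched as follows, assuming the loader repeatedly expands a list of pending items. The functions `state_digest` and `expand_all` are hypothetical names, not wdoc's implementation.

```python
import hashlib


def state_digest(pending):
    """Hash the sorted pending items so equal states give equal digests."""
    joined = "\n".join(sorted(pending))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def expand_all(pending, expand):
    """Repeatedly expand items until none can be expanded further.

    `expand(item)` returns a list of replacement items, or [item] if the
    item is already final. Raises RuntimeError if a state repeats.
    """
    seen = set()
    while True:
        digest = state_digest(pending)
        if digest in seen:
            raise RuntimeError("infinite loop detected: state repeated")
        seen.add(digest)
        new_pending = [out for item in pending for out in expand(item)]
        if new_pending == pending:
            return pending
        pending = new_pending
```

Unlike a fixed iteration cap, a repeated digest signals a real cycle, so long but finite expansions are not flagged by mistake.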
Enhancements
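- Language detection improvements: Better exception handling [0b9c6da], fewer debug logs [d7589cc], and general detector enhancements [c0e2ce7]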
- Batch file loader: Reduced verbosity of progress logging [d207d98]
- Testing: Improved model detection logic [5257c5a]
- Post-install: Use logger.error instead of print during installation [c0795e9]
Refactoring
- wdoc class: Added dynamic interaction_settings property [f806b98]
- Type hints: Improved type annotations across multiple modules [a94a889, 920e5d3]
Documentation
- Help text: Fixed powerpoint filetype documentation incorrectly mentioning .doc/.docx instead of .ppt/.pptx [e9b29eb]
Dependencies
- Bumped litellm to enable latest OpenRouter pricing [577e6f6]
Maintenance
- Removed debug print statement [80f7f32]
- Better warning messages [faa5d3b]
- Fixed setup.py logger usage [4a672c1]
Commits details since the last release
- [5adc87e] by @thiswillbeyourgithub, 72 seconds ago:
bump version 4.0.4 -> 4.1.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [0b9c6da] by @thiswillbeyourgithub, 2 hours ago:
enh: better exception catcher in language detection
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [80f7f32] by @thiswillbeyourgithub, 3 hours ago:
remove a debug print
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [d7589cc] by @thiswillbeyourgithub, 3 hours ago:
enh: language detector reduce debug logs
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [c0e2ce7] by @thiswillbeyourgithub, 3 hours ago:
enh: language detector
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [2d928ab] by @thiswillbeyourgithub, 3 hours ago:
fix: potential issue in edge case when detecting language
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [69dca45] by @thiswillbeyourgithub, 4 hours ago:
feat: add failure count and success rate to source tag logging
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/batch_file_loader.py
- [f806b98] by @thiswillbeyourgithub, 4 hours ago:
refactor: add dynamic interaction_settings property to wdoc class
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/wdoc.py
- [a94a889] by @thiswillbeyourgithub, 4 hours ago:
minor: type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [7c95e3c] by @thiswillbeyourgithub, 4 hours ago:
new: use a dataclass to store the type of tasks
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/batch_file_loader.py
wdoc/utils/loaders/__init__.py
wdoc/utils/misc.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/parse.py
wdoc/utils/tasks/summarize.py
wdoc/utils/tasks/types.py
wdoc/wdoc.py
- [5257c5a] by @thiswillbeyourgithub, 5 hours ago:
better way to check for testing model
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/wdoc.py
- [73924e1] by @thiswillbeyourgithub, 7 hours ago:
fix: forward reference error in anki
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/anki.py
- [ebfc66c] by @thiswillbeyourgithub, 7 hours ago:
fix: typeerror when loading powerpoint files
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/powerpoint.py
- [d207d98] by @thiswillbeyourgithub, 25 hours ago:
enh: reduce verbosity of something that looked like an infinite loop but was not
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [920e5d3] by @thiswillbeyourgithub, 25 hours ago:
fix: typehint in pdf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/pdf.py
- [bb147b3] by @thiswillbeyourgithub, 25 hours ago:
refactor: replace loop counter with hash-based infinite loop detection
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/batch_file_loader.py
- [4a672c1] by @thiswillbeyourgithub, 25 hours ago:
fix: actually no we can't use loguru in setup.py
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [faa5d3b] by @thiswillbeyourgithub, 26 hours ago:
minor: better warning
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [fcf9ca5] by @thiswillbeyourgithub, 26 hours ago:
fix: the loop counter has to be high enough to detect infinite loop
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [e9b29eb] by @thiswillbeyourgithub, 26 hours ago:
doc: powerpoint filetype doc mentioned .doc and .docx instead of .ppt and .pptx
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [577e6f6] by @thiswillbeyourgithub, 5 days ago:
bump litellm, allows using the latest openrouter price by litellm
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [c0795e9] by @thiswillbeyourgithub, 6 days ago:
enh: use logger.error instead of print during the postinstall process
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
Release 4.0.2
What's new
This release focuses on bug fixes, performance improvements, and code cleanup related to docstore filtering and retriever functionality.
🐛 Fixes
- Docstore filtering improvements
- Retriever fixes
⚡ Performance
- Do not store nor serialize the unfiltered docstore ([d29d3a5], [a9a0a35])
- Renamed filter_docstore to filter_vectorstore for clarity
✨ Features
- Added timing measurements for docstore serialization and deletion ([375c1a1])
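A timing measurement of this kind can be sketched with a small context manager. The names `timed` and `timings` and the stand-in `docstore` dictionary below are illustrative assumptions, not wdoc's code.

```python
import pickle
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink):
    """Record elapsed wall-clock seconds for the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink[label] = time.perf_counter() - start


timings = {}
docstore = {f"doc{i}": "text " * 10 for i in range(1000)}  # stand-in data

with timed("serialize", timings):
    blob = pickle.dumps(docstore)

with timed("delete", timings):
    del docstore

for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f}s")
```

`time.perf_counter` is used because it is monotonic and intended for measuring short durations.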
🧹 Chores
Commits details since the last release
- [83f23dd] by @thiswillbeyourgithub, 9 seconds ago:
bump version 4.0.1 -> 4.0.2
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [a9a0a35] by @thiswillbeyourgithub, 17 minutes ago:
rename filter_docstore to filter_vectorstore
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [d29d3a5] by @thiswillbeyourgithub, 18 minutes ago:
perf: do not store nor serialize the unfiltered docstore
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [cf9171d] by @thiswillbeyourgithub, 24 minutes ago:
fix: parent retriever when loading from embeddings
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [de18c65] by @thiswillbeyourgithub, 37 minutes ago:
remove unused import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [39951a8] by @thiswillbeyourgithub, 37 minutes ago:
fix: typehint of retrievers in an edge case
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [375c1a1] by @thiswillbeyourgithub, 45 minutes ago:
feat: add timing measurements for docstore serialization and deletion
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/filters.py
- [6525464] by @thiswillbeyourgithub, 48 minutes ago:
remove unused import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [9c1d967] by @thiswillbeyourgithub, 48 minutes ago:
fix: actually the unfiltered docstore is serialized
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [ee9cc6f] by @thiswillbeyourgithub, 67 minutes ago:
fix: wrong type hint for create_filter_metadata
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [d96c2f3] by @thiswillbeyourgithub, 69 minutes ago:
fix: wrong type hint for filter_docstore
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [1a2442d] by @thiswillbeyourgithub, 70 minutes ago:
fix: forgot to pass arguments to filter_docstore
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
Release 4.0.1
What's new
This release focuses on langfuse v3 compatibility and improved error handling.
🐛 Fixes
- Langfuse v3 compatibility
- Document loading robustness
📝 Documentation
- [56866d1] Add warning for using youtube audio backend instead of whisper or deepgram
🔧 Maintenance
- [fb49e60] Bump version 4.0.0 → 4.0.1
Commits details since the last release
- [fb49e60] by @thiswillbeyourgithub, 13 seconds ago:
bump version 4.0.0 -> 4.0.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [07257e0] by @thiswillbeyourgithub, 3 minutes ago:
fix: use langfuse opentelemetry for v3
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [56866d1] by @thiswillbeyourgithub, 11 minutes ago:
doc: add warning for using the youtube audio backend instead of whisper or deepgram
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/youtube.py
- [89f5132] by @thiswillbeyourgithub, 14 minutes ago:
fix: langfuse callback import changed for langfuse v3
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [3039bcf] by @thiswillbeyourgithub, 20 minutes ago:
fix: do not crash if no documents after transform_documents is run
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [101c7f7] by @thiswillbeyourgithub, 29 minutes ago:
add assert that docs were found
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
Release 4.0.0
What's new
This release focuses on major performance improvements through lazy loading and deferred imports, extensive code refactoring for better maintainability, and improved testing infrastructure.
⚡ Performance
- Significantly faster startup time through deferred imports and lazy loading [52985d5, dce3c24, 3ffaec3]
- Moved litellm imports to run only when needed [52985d5]
- Deferred requests import [0b4c2fb]
- Removed eager imports from __init__.py files [306d4ca]
- Moved imports in loaders, embeddings, and core modules [de1cecc, 08b9206, fd2dcba, 1838e0f, 1bd4ced, f1740c4, 2b3d9e8, 6fbe51d, 6c74d8e, f306325]
- Added lazy loading for document loaders with WDOC_LAZY_LOAD env var [7fc5fad, ce10c4b]
🔧 Fixes
- Fixed forward reference type hints across multiple modules [fd6a7e7, 22b44b4, 15a2746]
- Fixed signature wrapping for parse function [29dbf5d]
- Fixed API tests for DuckDuckGo and OpenRouter [8b9ebc2, 8f511dd, 32e036d, 048f99e]
- Fixed missing filetype handling in edge cases [0422dec]
- Fixed error for Word document loading [8cad00d]
- Fixed lazy loading logic (was reversed) [a35446f]
- Fixed query_task and search_task output handling [6f633e8, 8b95a81]
- Fixed error when summary doesn't output to file using pipe [2a85a6b]
- Fixed imports in loaders [ebd4558, af85343, 986abd2, 4e61a6f]
- Added missing audioop-lts requirement for Python 3.13+ [56bd634]
♻️ Refactoring
- Modularized loaders: Split monolithic loader file into separate modules [df1a0ad, d3ed873, f0a3fce, b249068, 984a8d3, def441f, fb421cc]
- Created dedicated files for PDF, Anki, URL, audio, HTML, and other loaders
- Enabled lazy loading of loader modules [7fc5fad]
- Extracted task-specific functions to separate modules:
- Moved parse_doc to utils/tasks/parse.py [1c7c6e4]
- Moved query/search retrieval logic to task modules [7982051, c2e6142]
- Moved evaluate_doc_chain to shared_query_search.py [8965c48]
- Extracted query splitting logic to shared utility [4bb54a5]
- Moved source_replace to query.py [0ce5f4f]
- Moved autoincrease_top_k to query.py [38e82b4]
- Split search and query task methods with better type hints [1d94644, 824f395, 319b8eb]
- Moved debug_exceptions to logger module [99cc99f]
- Moved VectorStore filtering code to filters.py [de4ce57]
- Added wdocSummary dataclass for type hinting [9fc51c0, 92f5c47]
- Added lazy caching for all_texts property [79b1661, 7b45948]
- Removed obsolete import_tricks.py [5116616]
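Lazy caching of a property such as all_texts can be sketched with `functools.cached_property`. The `Corpus` class and its `builds` counter are hypothetical and only illustrate the pattern; how wdoc implements its cache is not stated here.

```python
from functools import cached_property


class Corpus:
    """Hypothetical container; not wdoc's actual class."""

    def __init__(self, docs):
        self.docs = docs
        self.builds = 0  # counts how often the list is rebuilt

    @cached_property
    def all_texts(self):
        """Built on first access, then stored on the instance."""
        self.builds += 1
        return [d["text"] for d in self.docs]


corpus = Corpus([{"text": "alpha"}, {"text": "beta"}])
first = corpus.all_texts
second = corpus.all_texts
```

The second access returns the stored list without running the method again, so an expensive build happens at most once per instance.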
🧪 Testing
- Improved test cleanup and temp folder removal [a768642, 35ef63e, c149f5d, 913378a]
- Better verbose output in cost tests [342ad3f]
- Use Mistral for OpenRouter API tests (zero data retention) [8f511dd]
- Added shell-based CLI test script for more reliable testing [cc74a84, 4170567]
- Added check for wdoc[full] installation [7cb9a3c]
- Updated Ollama embedding test to use embeddingsgemma [4d47631]
- Improved test assertions with more info [3d0f947]
📦 Dependencies
- Bumped langchain version [98fd2cb]
- Bumped litellm version [7aa2ce1]
- Bumped langfuse version (litellm bug fix) [fc16e5e]
- Updated general dependencies [616457c]
- Added unstructured to required dependencies [c98d0e9]
- Added bumpver to dev packages [54be0e2]
✨ Features
- Added wdoc[full] installation option for all optional dependencies [6321942]
- Added beartype runtime type validation for numpy arrays [691dbff]
- Prioritize throughput and Groq when using OpenRouter [f049846]
- Enable lazy loading of imports by default [7c2e397]
📝 Documentation
- Updated default models to latest Gemini in README and help [761ddd1, 0868086, 78e562f]
- Clarified that binary embeddings are not always better [fb611c4]
- Added link explaining fixed cache of LLM issue [fdc3c64]
- Improved docstrings for summarization functions [a06f570]
- Added docstring for VectorStore filtering [2bd8dcb]
🎨 Code Quality
- PEP8 formatting improvements [559fcc0, 13dd7d1, e10fb05]
- Removed unused imports [16c518e, cb9563e, b5a42ef]
- Improved type hints [78d399f, a801718, 0875b23]
- Import logger first to set log level [2dead91]
- Removed if True statement [263a45d]
🔖 Version
- Bumped version 3.3.1 → 4.0.0 [e1548c4]
Commits details since the last release
- [e1548c4] by @thiswillbeyourgithub, 12 seconds ago:
bump version 3.3.1 -> 4.0.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [54be0e2] by @thiswillbeyourgithub, 47 seconds ago:
add bumpver to dev packages
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [37e80a7] by @thiswillbeyourgithub, 2 minutes ago:
doc: todo
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [a768642] by @thiswillbeyourgithub, 3 minutes ago:
better trash removal
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [35ef63e] by @thiswillbeyourgithub, 16 minutes ago:
less verbose test removal of temp folders
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [c149f5d] by @thiswillbeyourgithub, 30 minutes ago:
enh: delete cache dir at start of test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [ae6f28a] by @thiswillbeyourgithub, 32 minutes ago:
minor: name of a test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [342ad3f] by @thiswillbeyourgithub, 39 minutes ago:
better verbose output in cost test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8f511dd] by @thiswillbeyourgithub, 63 minutes ago:
fix: use mistral instead of openai when testing api from openrouter because it supports zero data retention
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8b9ebc2] by @thiswillbeyourgithub, 65 minutes ago:
fix: api test for ddg from the shell
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.sh
- [fd6a7e7] by @thiswillbeyourgithub, 72 minutes ago:
fix forward reference for typehinting
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [1428497] by @thiswillbeyourgithub, 72 minutes ago:
fix forgot to test api using cli script
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [22b44b4] by @thiswillbeyourgithub, 85 minutes ago:
fix forward reference type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [15a2746] by @thiswillbeyourgithub, 89 minutes ago:
fix forward reference type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
- [29dbf5d] by @thiswillbeyourgithub, 2 hours ago:
fix: signature wrapping for parse
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/wdoc.py
- [7aa2ce1] by @thiswillbeyourgithub, 2 hours ago:
bump litellm
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [98fd2cb] by @thiswillbeyourgithub, 2 hours ago:
bump langchain version
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [616457c] by @thiswillbeyourgithub, 2 hours ago:
bump dependencies
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [dce3c24] by @thiswillbeyourgithub, 2 hours ago:
actually use lazy import for litellm
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [52985d5] by @thiswillbeyourgithub, 3 hours ago:
better startup time by deferring litellm import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/litellm_embeddings.py
wdoc/utils/embeddings.py
wdoc/utils/llm.py
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [559fcc0] by @thiswillbeyourgithub, 3 hours ago:
pep8
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
- [0b4c2fb] by @thiswillbeyourgithub, 3 hours ago:
defer requests import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
- [78d399f] by @thiswillbeyourgithub, 3 hours ago:
type hint for multiquery retriever
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [13dd7d1] by @thiswillbeyourgithub, 3 hours ago:
pep8
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/customs/binary_faiss_vectorstore.py
wdoc/utils/embeddings.py
wdoc/utils/env.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/tasks/parse.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/search.py
- [6ce23d4] by @thiswillbeyourgithub, 4 hours ago:
fix import statements
Signed-off-by: thiswillbeyourgithub <[email protected]...
Release 3.3.1
What's new
This release focuses on improving code quality through comprehensive type hint fixes and enhanced testing infrastructure.
🔧 Fixes
- Type Hints: Comprehensive type hint improvements across the codebase
- Binary FAISS vectorstore type hints ([e46ed4a], [ac02a65], [da864cd], [95c3705], [81b36cb], [be3f352])
- Loader function type hints ([e6fcad8], [e65abad], [b624373])
- Semantic batching type hints ([f3a5289], [dd6ad29])
- Prompt template type hints ([1b4ec86], [bc56beb])
- General type hint fixes ([d4e99fd])
- Model Compatibility: Fixed issue where some models consider <answer> as implying </think> ([09684bb])
- Langchain Integration: Fixed callable_chain compatibility by creating runnables without decorators ([0c89cac])
✨ Enhancements
- Type Checking: Replaced manual type checking with import hook system ([56b353a], [6b3ddab])
- Logging: Reduced verbosity of litellm logging ([9a4a69c])
- Search: Added duplicate check for DuckDuckGo search results ([f68c8a4])
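A duplicate check on search results can be as simple as tracking which URLs have already been seen. The sketch below assumes each result is a dictionary with a "url" key, and `dedupe_urls` is a hypothetical helper rather than wdoc's function.

```python
def dedupe_urls(results):
    """Drop results whose URL was already seen, keeping first occurrences."""
    seen = set()
    unique = []
    for result in results:
        url = result["url"].rstrip("/")  # treat trailing-slash variants as equal
        if url in seen:
            continue
        seen.add(url)
        unique.append(result)
    return unique
```

Keeping the first occurrence preserves the search engine's ranking order for the surviving results.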
🧪 Tests
- Added comprehensive test for DuckDuckGo search functionality ([7dbd3c2])
- Fixed existing CLI tests ([781f6d6])
📦 Version
- Bumped version from 3.3.0 to 3.3.1 ([0690df9])
Commits details since the last release
- [0690df9] by @thiswillbeyourgithub, 41 seconds ago:
bump version 3.3.0 -> 3.3.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [7dbd3c2] by @thiswillbeyourgithub, 8 hours ago:
test: add test for DuckDuckGo search functionality
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_wdoc.py
- [781f6d6] by @thiswillbeyourgithub, 9 hours ago:
fix: test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [e46ed4a] by @thiswillbeyourgithub, 14 hours ago:
fix: typehint for marginal score search
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [ac02a65] by @thiswillbeyourgithub, 18 hours ago:
fix: type hint of binary faiss
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [f68c8a4] by @thiswillbeyourgithub, 19 hours ago:
add check for duplicate ddg result
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [da864cd] by @thiswillbeyourgithub, 19 hours ago:
fix: binary faiss type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [9a4a69c] by @thiswillbeyourgithub, 19 hours ago:
enh: tune down the verbosity of litellm
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [e6fcad8] by @thiswillbeyourgithub, 19 hours ago:
fix: type hint of load_one_doc
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [e65abad] by @thiswillbeyourgithub, 20 hours ago:
Revert "fix: typehint of load_one_doc"
This reverts commit f0037b54ac5ce317442e672f12e1da266b58c5c1.
wdoc/utils/loaders.py
- [b624373] by @thiswillbeyourgithub, 20 hours ago:
fix: typehint of load_one_doc
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [95c3705] by @thiswillbeyourgithub, 20 hours ago:
fix: typehints for binary faiss
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [f3a5289] by @thiswillbeyourgithub, 20 hours ago:
fix: type hints for semantic batching
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
- [d4e99fd] by @thiswillbeyourgithub, 20 hours ago:
fix: type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [81b36cb] by @thiswillbeyourgithub, 21 hours ago:
forgot some type hint for binary faiss
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [dd6ad29] by @thiswillbeyourgithub, 21 hours ago:
fix: wrong typehint in semantic_batch
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
- [09684bb] by @thiswillbeyourgithub, 21 hours ago:
fix: some models consider that <answer> implies </think>
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [0c89cac] by @thiswillbeyourgithub, 22 hours ago:
fix: actually callable_chain does not work for langchain so we have to make runnables without decorators
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/callable_runnable.py
wdoc/utils/misc.py
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [6b3ddab] by @thiswillbeyourgithub, 22 hours ago:
new: remove the ubiquitous optional_typecheck decorator
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/batch_file_loader.py
wdoc/utils/embeddings.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/loaders.py
wdoc/utils/logger.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [56b353a] by @thiswillbeyourgithub, 22 hours ago:
new: neutralize manual type checking and instead use the import hook
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
wdoc/utils/customs/callable_runnable.py
wdoc/utils/misc.py
wdoc/utils/tasks/query.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [be3f352] by @thiswillbeyourgithub, 22 hours ago:
fix: type hints in binary faiss
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [1b4ec86] by @thiswillbeyourgithub, 23 hours ago:
fix: wrong type for Prompts class
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/prompts.py
- [bc56beb] by @thiswillbeyourgithub, 23 hours ago:
add type checking to prompt template
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/prompts.py
Release 3.3.0
What's new
This release focuses on adding DuckDuckGo web search capabilities and introducing binary embeddings support for more efficient vector storage.
✨ New Features
DuckDuckGo Web Search Integration
- [372fe57] Add DuckDuckGo search support with URL extraction and metadata
- [273195e] Support wdoc wdb "your query" shorthand for web search
- [03bfe08] Add DuckDuckGo search tests and documentation
Binary Embeddings Support
- [c528bad] Add support for binary embeddings with 8x memory reduction
- [8f65197] Enable FAISS vectorstore compression by default
- [37ebd97] Create CompressedFAISS subclass with zlib compression
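To make the idea of binary embeddings concrete, here is a pure-Python sketch that keeps only the sign of each dimension and packs eight signs into one byte. Packing like this gives an 8x reduction relative to one byte per dimension; whether that is how the figure above is measured is an assumption. `binarize` and `hamming` are hypothetical helpers, not wdoc's code.

```python
def binarize(vector):
    """Pack the sign bit of each dimension into bytes (1 bit per dimension)."""
    out = bytearray((len(vector) + 7) // 8)
    for i, x in enumerate(vector):
        if x > 0:
            out[i // 8] |= 1 << (7 - i % 8)  # most significant bit first
    return bytes(out)


def hamming(a, b):
    """Count differing bits between two packed vectors of equal length."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


vec_a = [0.5, -1.0, 2.0, -3.0, 0.1, 0.2, -0.3, 0.4]
vec_b = [0.5, -1.0, 2.0, -3.0, 0.1, 0.2, -0.3, -0.4]
packed_a = binarize(vec_a)
packed_b = binarize(vec_b)
distance = hamming(packed_a, packed_b)
```

Similarity between packed vectors is then estimated with Hamming distance, which trades some accuracy for memory and speed.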
🐛 Bug Fixes
Core Functionality
- [0d72efd] Fix wrong decorator used for load_one_doc
- [edcf671] Fix ddg_region type (str not int)
- [66ab177] Fix type hints for ddg_safesearch and loading_failure
- [957936c] Use keyword arguments instead of fire when calling wdoc
Testing Environment
- [d3de58e] Fix piped input/output handling in pytest environment
- [42ff516] Prevent pipe usage in pytest environment
- [c78dc0b] Add pytest environment detection
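One common way to detect a pytest run is to check the PYTEST_CURRENT_TEST variable, which pytest sets while a test executes, optionally combined with a project-specific flag exported from conftest.py. In the sketch below, WDOC_IN_PYTEST is a made-up name for such a flag, and both functions are hypothetical.

```python
import os

# PYTEST_CURRENT_TEST is set by pytest itself while a test is running.
# WDOC_IN_PYTEST is a made-up name standing in for a project-specific flag
# that a conftest.py could export; the real variable name is not given here.
_PROJECT_FLAG = "WDOC_IN_PYTEST"


def running_under_pytest(environ=None):
    """Return True when the environment looks like a pytest run."""
    environ = os.environ if environ is None else environ
    return "PYTEST_CURRENT_TEST" in environ or environ.get(_PROJECT_FLAG) == "true"


def may_use_piped_output(stdout_is_tty, environ=None):
    """Treat stdout as a pipe only outside pytest, where capture makes it look piped."""
    if running_under_pytest(environ):
        return False
    return not stdout_is_tty
```

The guard matters because pytest captures stdout, so a naive "is stdout a terminal" check would wrongly conclude that output is being piped.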
🧪 Testing Improvements
- [1b09996] Fix the run_all_test script
- [8ed1d0c] Add comprehensive DuckDuckGo search functionality tests
- [b184177] Split CLI tests into separate test_cli.py file
- [9d7fe9c] Split parsing tests into separate test_parsing.py file
- [12b012d] Move vector store tests to dedicated test file
📚 Documentation
- [d7d6b04] Explain how to run tests in README
- [dc15001] Clarify how to disable parallel processing
- [df4b79f] Document debug mode's effect on loading_failure default
- [1832299] Add shell examples for DuckDuckGo usage
🔧 Enhancements
CLI/UX Improvements
- [7e994a6] Rename parse_file function to parse_doc
- [4aa247e] Re-ask for input when empty query provided in CLI
- [57d5d5f] Fix Fire's pager issue in CLI
Performance & Reliability
- [68d4c75] Bump LiteLLM to latest version for improved startup time
- [ab9c5e9] Add parallel processing option for Whisper audio splits
- [6b13044] Add loop counter and crash protection for recursive file processing
🔄 Version Update
- [6435133] Bump version from 3.2.5 → 3.3.0
Commits details since the last release
- [6435133] by @thiswillbeyourgithub, 36 minutes ago:
bump version 3.2.5 -> 3.3.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [1b09996] by @thiswillbeyourgithub, 24 hours ago:
test: fix the run_all_test script
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [d7d6b04] by @thiswillbeyourgithub, 24 hours ago:
doc: explain how to run the tests
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [62cc2ce] by @thiswillbeyourgithub, 24 hours ago:
fix: ddg test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [0d72efd] by @thiswillbeyourgithub, 24 hours ago:
fix: wrong decorator used for load_one_doc
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [dc15001] by @thiswillbeyourgithub, 24 hours ago:
doc: clarify how to disable parallel processing
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [e0453cb] by @thiswillbeyourgithub, 24 hours ago:
minor: mention a type hint
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [edcf671] by @thiswillbeyourgithub, 24 hours ago:
fix: ddg_region is actually a str not an int
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [df4b79f] by @thiswillbeyourgithub, 24 hours ago:
doc: mention that debug changes the default value for loading_failure
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [66ab177] by @thiswillbeyourgithub, 25 hours ago:
fix: type of ddg_safesearch and loading_failure should be Literal
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [98b0867] by @thiswillbeyourgithub, 25 hours ago:
doc: explain that loading_failure defaults to crash when parsing
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [90eacb3] by @thiswillbeyourgithub, 25 hours ago:
test: ddg should use us region by default
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [c8b1944] by @thiswillbeyourgithub, 25 hours ago:
test: less severe check for pipes
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [6e12e5c] by @thiswillbeyourgithub, 25 hours ago:
test: remove one -n auto arg
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [d3de58e] by @thiswillbeyourgithub, 2 days ago:
fix: actually inside pytest we should not bypass piped input but only piped output
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
wdoc/utils/misc.py
- [5715bc4] by @thiswillbeyourgithub, 2 days ago:
test: add env variable to detect if being called by pytest
Signed-off-by: thiswillbeyourgithub [email protected]
tests/conftest.py
- [42ff516] by @thiswillbeyourgithub, 2 days ago:
new: do not allow using pipe input or output in pytest environment
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [c78dc0b] by @thiswillbeyourgithub, 2 days ago:
new: detect when wdoc is called in pytest environment
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
wdoc/utils/env.py
wdoc/utils/misc.py
- [fca39c0] by @thiswillbeyourgithub, 2 days ago:
test: missing oneoff and failsafe when testing ddg
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [b2b4cf1] by @thiswillbeyourgithub, 2 days ago:
test: fix missing quotation sign for args
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [13409b1] by @thiswillbeyourgithub, 2 days ago:
test: fix a timeout not long enough
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [957936c] by @thiswillbeyourgithub, 2 days ago:
fix: use keyword arguments instead of fire when calling wdoc
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [b034337] by @thiswillbeyourgithub, 2 days ago:
minor
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [b44d730] by @thiswillbeyourgithub, 2 days ago:
fix: replacing ddg_max_result
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [dfcaf3b] by @thiswillbeyourgithub, 2 days ago:
fix: wrong way to replace ddg_max_result to ddg_max_results
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [adc991a] by @thiswillbeyourgithub, 2 days ago:
actually no
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [9ab4cbf] by @thiswillbeyourgithub, 2 days ago:
fix: type hint of load_one_doc can be a list of string in case of error
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [5f7fcf4] by @thiswillbeyourgithub, 2 days ago:
typo: Nvidia instead of NVidia
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
tests/test_cli.py
wdoc/docs/examples.md
- [03bfe08] by @thiswillbeyourgithub, 2 days ago:
test: add test for ddg search
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [48165fa] by @thiswillbeyourgithub, 2 days ago:
test: clearer echo
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [08cac94] by @thiswillbeyourgithub, 2 days ago:
remove unused import
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [3aabd2d] by @thiswillbeyourgithub, 2 days ago:
style: format test_cli.py with linter
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [8ed1d0c] by @thiswillbeyourgithub, 2 days ago:
feat: add test for DuckDuckGo search functionality with NVIDIA query
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [a8e3e04] by @thiswillbeyourgithub, 2 days ago:
test: add test for DuckDuckGo search with NVIDIA query
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [1832299] by @thiswillbeyourgithub, 2 days ago:
doc: add shell example for using duckduckgo
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [e6c4641] by @thiswillbeyourgithub, 2 days ago:
typo
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [917ee51] by @...
Release 3.2.5
What's new
This release brings several improvements to command-line argument handling and filetype detection, along with key bug fixes and build process enhancements.
✨ Features
- CLI & Filetype Detection:
- Build Process:
- Integrated sphinx-apidoc into the ReadTheDocs build process via a pre-build job in .readthedocs.yaml ([cc86c7b]).
🐛 Fixes
- Corrected an issue with sys.argv handling that led to duplicated arguments ([e7cf185]).
- Updated litellm dependency to resolve crashes experienced on Windows environments ([cfff0ac]), see #20.
🛠️ Improvements & Refactoring
- Filetype Detection Internals:
- Code Quality:
- Improved documentation by adding docstrings to custom exception classes ([8e6ca1a]).
Chores
- Version bumped to 3.2.5 ([82b7f81]).
Commits details since the last release
- [82b7f81] by @thiswillbeyourgithub, 19 minutes ago:
bump version 3.2.4 -> 3.2.5
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [e7cf185] by @thiswillbeyourgithub, 3 minutes ago:
fix: badly handled sys.argv was duplicating args
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [cfff0ac] by @thiswillbeyourgithub, 19 minutes ago:
fix: bump version of litellm because windows crashes
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [cc86c7b] by @thiswillbeyourgithub (aider), 2 days ago:
feat: add pre-build job to run sphinx-apidoc in .readthedocs.yaml
.readthedocs.yaml
- [ab76610] by @thiswillbeyourgithub, 2 days ago:
new: use the filetype detector to infer what to do in case of multiple implicit arguments from the cli
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [05966c6] by @thiswillbeyourgithub, 2 days ago:
enh: add debug prints to the filetype detector
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [520f4ce] by @thiswillbeyourgithub, 2 days ago:
new: use a specific exception when we can't infer the filetype
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [8e6ca1a] by @thiswillbeyourgithub, 2 days ago:
add docstring to some exceptions
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/errors.py
- [39af223] by @thiswillbeyourgithub, 2 days ago:
add an error for undetectable filetype
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/errors.py
- [b453748] by @thiswillbeyourgithub, 2 days ago:
new: put the filetype detection code in a separate function
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
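The filetype-detection commits in this release (a separate detection function plus a specific exception when the filetype cannot be inferred) follow a common pattern that can be sketched as below. This is only an illustration under stated assumptions: the exception name `UndetectableFiletype`, the function `infer_filetype`, and the extension table are hypothetical and are not taken from wdoc's code.

```python
from pathlib import Path


class UndetectableFiletype(Exception):
    """Raised when no filetype can be inferred from an input (hypothetical name)."""


# Hypothetical extension table, for illustration only.
EXTENSION_TO_FILETYPE = {
    ".pdf": "pdf",
    ".txt": "txt",
    ".epub": "epub",
    ".pptx": "powerpoint",
}


def infer_filetype(path: str) -> str:
    """Return a filetype label for `path`, or raise UndetectableFiletype."""
    if path.startswith(("http://", "https://")):
        return "url"
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_TO_FILETYPE[suffix]
    except KeyError:
        # A dedicated exception lets callers tell "unknown filetype"
        # apart from unrelated errors.
        raise UndetectableFiletype(f"Could not infer filetype of {path!r}") from None
```

A caller handling several implicit CLI arguments could catch `UndetectableFiletype` for each argument and fall back to another interpretation, instead of catching a broad `Exception`.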
Release 3.2.4
What's new
This release primarily focuses on significant documentation enhancements, crucial bug fixes for stability and build processes, and introduces updated dependencies and tokenization.
✨ New Features
- Upgraded default token estimation to use gpt-4o-mini tokenizer, replacing gpt-3.5-turbo ([6d41817]).
- Integrated the latest yt-dlp for YouTube downloads ([ab207b4]).
- Environment variable documentation is now automatically added to the EnvDataclass class __doc__ ([ed9dd38]).
🐛 Bug Fixes
- Resolved a crash on ReadTheDocs caused by missing yt-dlp dependency ([f5068a3]).
- Fixed an issue where accessing env.__class__ on ReadTheDocs could cause a crash ([4e180f0]).
- Corrected relative import paths in wdoc that were preventing Sphinx API documentation builds ([ade5930]).
- Fixed issues with the Sphinx API command in the FAQ section of the README ([38008aa], [ff093a2]).
- Ensured collapsible bars in documentation function correctly ([3cef833]).
📚 Documentation & Refinements
- Extensive updates and fixes to Sphinx documentation generation and content:
- Addressed outdated Sphinx documentation files ([90bde99]).
- Improved API autodoc parameters for clearer documentation ([243de66]).
- Excluded private and special members from documentation ([7abedd4]).
- Added Sphinx command to FAQ in README ([1e6602e]) and removed private members from it ([11ae11b]).
- Updated copyright year to 2025 ([bd7e3c5]).
- Streamlined documentation structure and configuration:
- Removed unused make files (Makefile, make.bat) for documentation ([07b0a7d]).
- Removed unused argument for theme flyout display ([17bc5e6]).
- Removed unused templates path ([6bffa20]) and CSS ([712df08]).
- Removed duplicate README from the documentation source ([2b93162]).
- Added a documentation table to the main index ([1dfe2b3]).
⚙️ Build & Chores
- Bumped version to 3.2.4 ([ed7a9c7]).
Commits details since the last release
- [ed7a9c7] by @thiswillbeyourgithub, 20 seconds ago:
bump version 3.2.3 -> 3.2.4
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [f5068a3] by @thiswillbeyourgithub, 13 minutes ago:
fix: missing yt-dlp makes readthedocs crash
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [17bc5e6] by @thiswillbeyourgithub, 19 minutes ago:
remove unused argument for theme flyout display
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [4e180f0] by @thiswillbeyourgithub, 22 minutes ago:
fix: class attribute of env is accessed by readthedocs and should not crash
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
- [243de66] by @thiswillbeyourgithub, 2 hours ago:
saner api autodoc parameters
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [ed9dd38] by @thiswillbeyourgithub, 3 hours ago:
new: add the environment variable documentation to the doc of the EnvDataclass class
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/env.py
- [07b0a7d] by @thiswillbeyourgithub, 3 hours ago:
doc: remove unused make files for doc
Signed-off-by: thiswillbeyourgithub [email protected]
docs/Makefile
docs/make.bat
- [7abedd4] by @thiswillbeyourgithub, 4 hours ago:
doc: dont include private nor special
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [38008aa] by @thiswillbeyourgithub, 2 hours ago:
fix: sphinx api command of faq
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [11ae11b] by @thiswillbeyourgithub, 4 hours ago:
remove private from sphinx command
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [90bde99] by @thiswillbeyourgithub, 4 hours ago:
fix outdated sphinx doc
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/wdoc.rst
docs/source/wdoc.utils.batch_file_loader.rst
docs/source/wdoc.utils.customs.compressed_embeddings_cache.rst
docs/source/wdoc.utils.customs.fix_llm_caching.rst
docs/source/wdoc.utils.customs.rst
docs/source/wdoc.utils.embeddings.rst
docs/source/wdoc.utils.env.rst
docs/source/wdoc.utils.errors.rst
docs/source/wdoc.utils.flags.rst
docs/source/wdoc.utils.import_tricks.rst
docs/source/wdoc.utils.interact.rst
docs/source/wdoc.utils.llm.rst
docs/source/wdoc.utils.loaders.rst
docs/source/wdoc.utils.logger.rst
docs/source/wdoc.utils.misc.rst
docs/source/wdoc.utils.prompts.rst
docs/source/wdoc.utils.retrievers.rst
docs/source/wdoc.utils.rst
docs/source/wdoc.utils.tasks.query.rst
docs/source/wdoc.utils.tasks.rst
docs/source/wdoc.utils.tasks.summarize.rst
docs/source/wdoc.utils.typechecker.rst
docs/source/wdoc.wdoc.rst
- [ff093a2] by @thiswillbeyourgithub, 4 hours ago:
fix: sphinx api command of faq
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [ade5930] by @thiswillbeyourgithub, 4 hours ago:
fix: relative wdoc imports were stopping sphinx api build
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
wdoc/main.py
wdoc/utils/__init__.py
wdoc/utils/batch_file_loader.py
wdoc/utils/customs/__init__.py
wdoc/utils/embeddings.py
wdoc/utils/env.py
wdoc/utils/import_tricks.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/loaders.py
wdoc/utils/logger.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/__init__.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [1e6602e] by @thiswillbeyourgithub, 5 hours ago:
doc: add to faq the sphinx command
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [bd7e3c5] by @thiswillbeyourgithub, 5 hours ago:
update copyright year to 2025
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [6bffa20] by @thiswillbeyourgithub, 6 hours ago:
remove unused templates path in doc
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [2b93162] by @thiswillbeyourgithub, 6 hours ago:
remove duplicate readme from the doc
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/index.rst
- [3cef833] by @thiswillbeyourgithub, 6 hours ago:
fix collapsible bar
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [712df08] by @thiswillbeyourgithub, 6 hours ago:
remove unused css from the doc
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/_static/custom.css
docs/source/conf.py
- [1dfe2b3] by @thiswillbeyourgithub, 6 hours ago:
documentation table
Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/index.rst
- [6d41817] by @thiswillbeyourgithub, 25 hours ago:
new: use gpt-4o-mini tokenizer by default to estimate tokens
previously we used the ageing gpt-3.5-turbo
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/misc.py
- [ab207b4] by @thiswillbeyourgithub, 25 hours ago:
new: use the latest yt-dl install from yt-dlp
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
Release 3.2.3
What's new
This release primarily focuses on enhancing context management for embedding models, improving debugging utilities, and updating documentation for better clarity. It also includes several important bug fixes and feature additions.
✨ Features
- Introduced a new environment variable WDOC_MAX_EMBED_CONTEXT to allow capping the context size for embedding models ([d9e200f8])
- Documentation for this new variable has been added ([a2408fd0])
- Enhanced debugging by ensuring debug prints are always active when md_printer is used. This helps in retrieving LLM answers from logs if they weren't saved to a file ([69db1916])
- Added the current date to summary metadata and headers to help reduce potential LLM hallucinations ([64ca4665])
🐛 Fixes
- Text Splitting & Context Handling:
- Addressed an issue where large language models have more context than embedding models by setting a max_tokens limit for the text splitter ([dac6802d])
- Fixed an edge case where the wdoc max chunk setting could be ignored ([196b3a00])
- Corrected an old variable name within the text splitting logic ([767bc754])
- Updated the default model to gemini 2.5 preview to reflect its renaming on OpenRouter ([22978609])
- Improved the mechanism for ignoring initial "breathing" or placeholder lines in summaries ([4dbcf158])
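The interaction between the text splitter limit and the WDOC_MAX_EMBED_CONTEXT variable can be illustrated with a minimal sketch. It assumes the effective limit is the smallest of the LLM context, the embedding model context, and the optional cap. The function name `effective_max_tokens`, its parameters, and the treatment of empty or non-positive values are hypothetical, not wdoc's actual implementation.

```python
import os
from typing import Mapping


def effective_max_tokens(
    llm_context: int,
    embed_context: int,
    env: Mapping[str, str] = os.environ,
) -> int:
    """Largest chunk size (in tokens) that fits both the LLM and the embedding model."""
    embed_limit = embed_context
    raw_cap = env.get("WDOC_MAX_EMBED_CONTEXT", "")
    if raw_cap.strip():
        cap = int(raw_cap)  # raises ValueError on a non-integer value
        if cap > 0:  # assumption: non-positive values mean "no cap"
            embed_limit = min(embed_limit, cap)
    # The splitter must not produce chunks larger than the embedding model accepts,
    # even when the LLM itself has a much larger context window.
    return min(llm_context, embed_limit)


print(effective_max_tokens(1_000_000, 8192, env={}))
print(effective_max_tokens(1_000_000, 8192, env={"WDOC_MAX_EMBED_CONTEXT": "4096"}))
```

Passing `env` explicitly keeps the function easy to test without modifying the real process environment.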
📚 Documentation
- Clarity and Enhancements:
- Clarified the usage of save and load functionalities ([9d9642d4]) and specifically advised against using them simultaneously ([5270c350])
- Made multiple clarifications to the README for better understanding ([9284ff54], [cb4cb519], [f677e5a2], [39e0da55])
- Updated Ollama examples to recommend snowflake-arctic-embed2 instead of bge-m3 ([d045702b])
- Added documentation for the WDOC_MAX_EMBED_CONTEXT environment variable ([a2408fd0])
- Removed a documentation file (summary_rag.md) that was not yet ready for release ([6d20c220])
⚙️ Chore & Maintenance
- Version bumped to 3.2.3 (following an earlier bump to 3.2.2 [71ac503c]) ([f62a2322])
- README Updates:
- Updated TODO items ([8f2cbfd7], [5d090421])
- Added a PyPI badge for better project visibility ([60ef4112])
Commits details since the last release
- [f62a232] by @thiswillbeyourgithub, 46 seconds ago:
bump version 3.2.2 -> 3.2.3
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [6d20c22] by @thiswillbeyourgithub, 76 seconds ago:
doc: removed file not yet ready
Signed-off-by: thiswillbeyourgithub [email protected]
summary_rag.md
- [71ac503] by @thiswillbeyourgithub, 4 minutes ago:
bump version 3.2.1 -> 3.2.2
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [8f2cbfd] by @thiswillbeyourgithub, 3 minutes ago:
todo
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [69db191] by @thiswillbeyourgithub, 40 minutes ago:
new: now debug print is used anyway when md_printer is used
this is to make you able to go to the logs to fetch an answer from the
LLM if you have forgotten to store it to a file
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/logger.py
wdoc/wdoc.py
- [a2408fd] by @thiswillbeyourgithub (aider), 66 minutes ago:
docs: Add documentation for WDOC_MAX_EMBED_CONTEXT variable
wdoc/docs/help.md
- [d9e200f] by @thiswillbeyourgithub, 66 minutes ago:
feat: add new env var to cap the context size for embedding models
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
wdoc/utils/misc.py
- [196b3a0] by @thiswillbeyourgithub, 72 minutes ago:
fix: edge case where wdoc max chunk would be ignored
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [dac6802] by @thiswillbeyourgithub, 76 minutes ago:
fix: set a limit to max_tokens for the text splitter as large LLM have more context than embeddings models nowadays
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [767bc75] by @thiswillbeyourgithub, 80 minutes ago:
fix: forgot to rename an old variable name
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [2297860] by @thiswillbeyourgithub, 86 minutes ago:
fix: set default model to gemini 2.5 preview without date timestamp
openrouter renamed that model apparently
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
wdoc/utils/env.py
- [9d9642d] by @thiswillbeyourgithub, 22 hours ago:
doc: clarify save and load
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [5270c35] by @thiswillbeyourgithub, 22 hours ago:
doc: clarify that load and save shouldnt be used at the same time
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [d045702] by @thiswillbeyourgithub, 23 hours ago:
doc: use snowflake-arctic-embed2 instead of bge-m3 for ollama examples
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [60ef411] by @thiswillbeyourgithub, 26 hours ago:
add a pypi badge
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [5d09042] by @thiswillbeyourgithub, 7 days ago:
update todo
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [9284ff5] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [cb4cb51] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [f677e5a] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [39e0da5] by @thiswillbeyourgithub, 7 days ago:
doc: clarify
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [64ca466] by @thiswillbeyourgithub (aider), 10 days ago:
feat: Add current date to summary metadata and header to reduce hallucinations
wdoc/wdoc.py
- [4dbcf15] by @thiswillbeyourgithub, 10 days ago:
enh: better ignoring of first line of summary if just breathing
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
Release 3.2.1
What's new
This small patch release primarily focuses on integrating OpenRouter for model pricing/metadata and refining cost calculations.
✨ Features
- Set default models to use OpenRouter ([915699c]).
- Fetch model prices and metadata automatically from OpenRouter, improving reliability ([7f840b7]).
🐛 Fixes & Enhancements
- Much improved price calculation and handling:
- Updated litellm dependency ([179b589]).
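As a rough illustration of what taking internal thinking into account can mean for price calculation, the sketch below assumes reasoning tokens are reported separately from visible completion tokens and are billed at the output rate. That billing rule, the `ModelPrice` dataclass, and `estimate_cost` are assumptions made for this example and do not describe wdoc's actual code or any provider's exact pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Per-token prices in dollars (hypothetical structure)."""
    input_per_token: float
    output_per_token: float


def estimate_cost(
    price: ModelPrice,
    prompt_tokens: int,
    completion_tokens: int,
    reasoning_tokens: int = 0,
) -> float:
    """Estimate the dollar cost of one call, counting hidden reasoning tokens."""
    # Assumption: reasoning ("thinking") tokens are billed like output tokens
    # and are not already included in completion_tokens.
    billed_output = completion_tokens + reasoning_tokens
    return (
        prompt_tokens * price.input_per_token
        + billed_output * price.output_per_token
    )


price = ModelPrice(input_per_token=1e-6, output_per_token=2e-6)
print(f"{estimate_cost(price, 1000, 500, reasoning_tokens=1500):.4f}")
```

Ignoring the reasoning tokens in this example would give 0.0020 instead of 0.0050, which shows why an estimate based only on visible output can undercount.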
🧪 Tests
- API integration tests now fail faster if an underlying API call fails ([9a0c856]).
Commits details since the last release
- [03aeab2] by @thiswillbeyourgithub, 2 minutes ago:
bump version 3.2.0 -> 3.2.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [915699c] by @thiswillbeyourgithub, 6 minutes ago:
new: set the default models to use openrouter
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
wdoc/utils/env.py
- [c0b90d8] by @thiswillbeyourgithub, 64 minutes ago:
fix: reworked how pricing are computed to take internal thinking into account
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [a17b41c] by @thiswillbeyourgithub, 80 minutes ago:
enh: better way to get the model prices
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
wdoc/wdoc.py
- [9a0c856] by @thiswillbeyourgithub, 22 minutes ago:
test: crash early if one api crash fails
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [7f840b7] by @thiswillbeyourgithub, 2 hours ago:
feat: automatically fetch the price and metadata from openrouter instead of waiting for litellm
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
wdoc/wdoc.py
- [2b29a9d] by @thiswillbeyourgithub, 2 hours ago:
fix: error message on missing model price
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [179b589] by @thiswillbeyourgithub, 2 hours ago:
bump litellm version
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py