Releases: thiswillbeyourgithub/wdoc
Release 4.1.0
What's new
This release focuses on robustness improvements, particularly around language detection, file loading, and error handling.
Features
- Task type system: Introduced dataclass-based task type storage for better type safety [7c95e3c]
- Source tag logging: Added failure count and success rate tracking to source tag logging [69dca45]
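As a rough illustration of what dataclass-based task type storage can look like, here is a minimal sketch. The names `TaskName`, `TaskType`, `make_task_type` and all fields are hypothetical and are not taken from wdoc/utils/tasks/types.py; the point is only that a raw string is validated once and a typed, immutable object is passed around afterwards.

```python
from dataclasses import dataclass
from enum import Enum


class TaskName(str, Enum):
    """Hypothetical task names, used here for illustration only."""

    QUERY = "query"
    SEARCH = "search"
    SUMMARIZE = "summarize"
    PARSE = "parse"


@dataclass(frozen=True)
class TaskType:
    """Immutable description of a task (illustrative fields, not wdoc's)."""

    name: TaskName
    needs_retriever: bool
    produces_summary: bool


def make_task_type(raw: str) -> TaskType:
    """Validate a user-supplied string once and return a typed object."""
    name = TaskName(raw.strip().lower())  # raises ValueError on unknown tasks
    return TaskType(
        name=name,
        needs_retriever=name in (TaskName.QUERY, TaskName.SEARCH),
        produces_summary=name is TaskName.SUMMARIZE,
    )
```

Compared with comparing bare strings throughout the code, a frozen dataclass lets a type checker catch misspelled task names and prevents accidental mutation.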
Fixes
- PowerPoint loader: Fixed TypeError when loading PowerPoint files [ebfc66c]
- Anki loader: Resolved forward reference error [73924e1]
- Language detection: Fixed potential edge case issue [2d928ab]
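- Infinite loop detection: Adjusted the loop counter so it is high enough to detect infinite loops [fcf9ca5], then replaced it with hash-based detection [bb147b3]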
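Commit [bb147b3] in the details below replaces a loop counter with hash-based infinite loop detection. The following is a minimal sketch of that general idea, not the code in wdoc/utils/batch_file_loader.py: the function names and the `expand` callback are assumptions made for illustration. Each pass fingerprints the pending work, and a repeated fingerprint means no progress is being made.

```python
import hashlib
import json
from collections.abc import Callable


def state_fingerprint(pending: list[str]) -> str:
    """Hash the pending work list; identical states give identical digests."""
    payload = json.dumps(sorted(pending)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def expand_until_done(
    pending: list[str], expand: Callable[[str], list[str]]
) -> list[str]:
    """Expand items until none produce children, failing on a repeated state."""
    seen: set[str] = set()
    done: list[str] = []
    while pending:
        fingerprint = state_fingerprint(pending)
        if fingerprint in seen:
            raise RuntimeError("Infinite loop detected: pending state repeated")
        seen.add(fingerprint)
        next_pending: list[str] = []
        for item in pending:
            children = expand(item)
            if children:
                next_pending.extend(children)  # still needs another pass
            else:
                done.append(item)  # leaf item, nothing left to expand
        pending = next_pending
    return done
```

Unlike a fixed iteration cap, this approach does not misfire on a long but legitimate recursion, because every pass that changes the pending set produces a new fingerprint.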
Enhancements
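- Language detection improvements: Better exception catching [0b9c6da], fewer debug logs [d7589cc], and general detector enhancements [c0e2ce7]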
- Batch file loader: Reduced verbosity of progress logging [d207d98]
- Testing: Improved model detection logic [5257c5a]
- Post-install: Use logger.error instead of print during installation [c0795e9]
Refactoring
- wdoc class: Added dynamic interaction_settings property [f806b98]
- Type hints: Improved type annotations across multiple modules [a94a889, 920e5d3]
Documentation
- Help text: Fixed powerpoint filetype documentation incorrectly mentioning .doc/.docx instead of .ppt/.pptx [e9b29eb]
Dependencies
- Bumped litellm to enable latest OpenRouter pricing [577e6f6]
Maintenance
- Removed debug print statement [80f7f32]
- Better warning messages [faa5d3b]
- Fixed setup.py logger usage [4a672c1]
Commits details since the last release
- [5adc87e] by @thiswillbeyourgithub, 72 seconds ago:
 bump version 4.0.4 -> 4.1.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [0b9c6da] by @thiswillbeyourgithub, 2 hours ago:
 enh: better exception catcher in language detection
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [80f7f32] by @thiswillbeyourgithub, 3 hours ago:
 remove a debug print
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [d7589cc] by @thiswillbeyourgithub, 3 hours ago:
 enh: language detector reduce debug logs
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [c0e2ce7] by @thiswillbeyourgithub, 3 hours ago:
 enh: language detector
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [2d928ab] by @thiswillbeyourgithub, 3 hours ago:
 fix: potential issue in edge case when detecting language
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [69dca45] by @thiswillbeyourgithub, 4 hours ago:
 feat: add failure count and success rate to source tag logging
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/batch_file_loader.py
- [f806b98] by @thiswillbeyourgithub, 4 hours ago:
 refactor: add dynamic interaction_settings property to wdoc class
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/wdoc.py
- [a94a889] by @thiswillbeyourgithub, 4 hours ago:
 minor: type hints
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [7c95e3c] by @thiswillbeyourgithub, 4 hours ago:
 new: use a dataclass to store the type of tasks
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/batch_file_loader.py
wdoc/utils/loaders/__init__.py
wdoc/utils/misc.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/parse.py
wdoc/utils/tasks/summarize.py
wdoc/utils/tasks/types.py
wdoc/wdoc.py
- [5257c5a] by @thiswillbeyourgithub, 5 hours ago:
 better way to check for testing model
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/wdoc.py
- [73924e1] by @thiswillbeyourgithub, 7 hours ago:
 fix: forward reference error in anki
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/anki.py
- [ebfc66c] by @thiswillbeyourgithub, 7 hours ago:
 fix: typeerror when loading powerpoint files
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/powerpoint.py
- [d207d98] by @thiswillbeyourgithub, 25 hours ago:
 enh: reduce verbosity of something that looked like an infinite loop but was not
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [920e5d3] by @thiswillbeyourgithub, 25 hours ago:
 fix: typehint in pdf
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/pdf.py
- [bb147b3] by @thiswillbeyourgithub, 25 hours ago:
 refactor: replace loop counter with hash-based infinite loop detection
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/batch_file_loader.py
- [4a672c1] by @thiswillbeyourgithub, 25 hours ago:
 fix: actually no we can't use loguru in setup.py
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [faa5d3b] by @thiswillbeyourgithub, 26 hours ago:
 minor: better warning
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [fcf9ca5] by @thiswillbeyourgithub, 26 hours ago:
 fix: the loop counter has to be high enough to detect infinite loop
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [e9b29eb] by @thiswillbeyourgithub, 26 hours ago:
 doc: powerpoint filetype doc mentioned .doc and .docx instead of .ppt and .pptx
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [577e6f6] by @thiswillbeyourgithub, 5 days ago:
 bump litellm, allows using the latest openrouter price by litellm
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [c0795e9] by @thiswillbeyourgithub, 6 days ago:
 enh: use logger.error instead of print during the postinstall process
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
Release 4.0.2
What's new
This release focuses on bug fixes, performance improvements, and code cleanup related to docstore filtering and retriever functionality.
🐛 Fixes
- Docstore filtering improvements
- Retriever fixes
⚡ Performance
- Do not store nor serialize the unfiltered docstore ([d29d3a5], [a9a0a35])
  - Renamed filter_docstore to filter_vectorstore for clarity
✨ Features
- Added timing measurements for docstore serialization and deletion ([375c1a1])
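Timing a serialization step is commonly done with a small context manager around `time.perf_counter`. The sketch below is only an illustration of that pattern: `timed` and `serialize_with_timing` are hypothetical names, and `pickle` stands in for whatever serializer wdoc actually uses.

```python
import pickle
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, timings: dict[str, float]):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failures are timed too.
        timings[label] = time.perf_counter() - start


def serialize_with_timing(obj, timings: dict[str, float]) -> bytes:
    """Serialize `obj` and store how long it took in `timings`."""
    with timed("serialize", timings):
        return pickle.dumps(obj)
```

Collecting durations in a dictionary rather than logging them directly keeps the measurement separate from the decision about how, or whether, to report it.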
🧹 Chores
- Removed unused imports ([de18c65], [6525464])
Commits details since the last release
- [83f23dd] by @thiswillbeyourgithub, 9 seconds ago:
 bump version 4.0.1 -> 4.0.2
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [a9a0a35] by @thiswillbeyourgithub, 17 minutes ago:
 rename filter_docstore to filter_vectorstore
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [d29d3a5] by @thiswillbeyourgithub, 18 minutes ago:
 perf: do not store nor serialize the unfiltered docstore
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [cf9171d] by @thiswillbeyourgithub, 24 minutes ago:
 fix: parent retriever when loading from embeddings
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [de18c65] by @thiswillbeyourgithub, 37 minutes ago:
 remove unused import
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [39951a8] by @thiswillbeyourgithub, 37 minutes ago:
 fix: typehint of retrievers in an edge case
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [375c1a1] by @thiswillbeyourgithub, 45 minutes ago:
 feat: add timing measurements for docstore serialization and deletion
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4.5) [email protected]
wdoc/utils/filters.py
- [6525464] by @thiswillbeyourgithub, 48 minutes ago:
 remove unused import
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [9c1d967] by @thiswillbeyourgithub, 48 minutes ago:
 fix: actually the unfiltered docstore is serialized
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [ee9cc6f] by @thiswillbeyourgithub, 67 minutes ago:
 fix: wrong type hint for create_filter_metadata
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [d96c2f3] by @thiswillbeyourgithub, 69 minutes ago:
 fix: wrong type hint for filter_docstore
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
- [1a2442d] by @thiswillbeyourgithub, 70 minutes ago:
 fix: forgot to pass arguments to filter_docstore
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
Release 4.0.1
What's new
This release focuses on langfuse v3 compatibility and improved error handling.
🐛 Fixes
- Langfuse v3 compatibility
- Document loading robustness
📝 Documentation
- [56866d1] Add warning for using youtube audio backend instead of whisper or deepgram
🔧 Maintenance
- [fb49e60] Bump version 4.0.0 → 4.0.1
Commits details since the last release
- [fb49e60] by @thiswillbeyourgithub, 13 seconds ago:
 bump version 4.0.0 -> 4.0.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [07257e0] by @thiswillbeyourgithub, 3 minutes ago:
 fix: use langfuse opentelemetry for v3
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [56866d1] by @thiswillbeyourgithub, 11 minutes ago:
 doc: add warning for using the youtube audio backend instead of whisper or deepgram
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/youtube.py
- [89f5132] by @thiswillbeyourgithub, 14 minutes ago:
 fix: langfuse callback import changed for langfuse v3
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [3039bcf] by @thiswillbeyourgithub, 20 minutes ago:
 fix: do not crash if no documents after transform_documents is ran
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [101c7f7] by @thiswillbeyourgithub, 29 minutes ago:
 add assert that docs were found
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
Release 4.0.0
What's new
This release focuses on major performance improvements through lazy loading and deferred imports, extensive code refactoring for better maintainability, and improved testing infrastructure.
⚡ Performance
- Significantly faster startup time through deferred imports and lazy loading [52985d5, dce3c24, 3ffaec3]
  - Moved litellm imports to run only when needed [52985d5]
  - Deferred requests import [0b4c2fb]
  - Removed eager imports from __init__.py files [306d4ca]
  - Moved imports in loaders, embeddings, and core modules [de1cecc, 08b9206, fd2dcba, 1838e0f, 1bd4ced, f1740c4, 2b3d9e8, 6fbe51d, 6c74d8e, f306325]
  - Added lazy loading for document loaders with WDOC_LAZY_LOAD env var [7fc5fad, ce10c4b]
🔧 Fixes
- Fixed forward reference type hints across multiple modules [fd6a7e7, 22b44b4, 15a2746]
- Fixed signature wrapping for parse function [29dbf5d]
- Fixed API tests for DuckDuckGo and OpenRouter [8b9ebc2, 8f511dd, 32e036d, 048f99e]
- Fixed missing filetype handling in edge cases [0422dec]
- Fixed error for Word document loading [8cad00d]
- Fixed lazy loading logic (was reversed) [a35446f]
- Fixed query_task and search_task output handling [6f633e8, 8b95a81]
- Fixed error when summary doesn't output to file using pipe [2a85a6b]
- Fixed imports in loaders [ebd4558, af85343, 986abd2, 4e61a6f]
- Added missing audioop-lts requirement for Python 3.13+ [56bd634]
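Forward reference errors in type hints tend to appear once imports are deferred, because a name used in an annotation is no longer available at module import time. One common remedy, shown here as a generic sketch rather than the exact change made in wdoc, is to import the type only under `typing.TYPE_CHECKING` and quote the annotation. `Decimal` is a stand-in for a type from a heavy dependency.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers, never at runtime,
    # so the import cost is not paid at startup.
    from decimal import Decimal


def double(value: "Decimal") -> "Decimal":
    """The quoted annotation is stored as a string and not evaluated here."""
    return value * 2
```

The function works at runtime with any object supporting multiplication, while a type checker still sees the precise type.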
♻️ Refactoring
- Modularized loaders: Split monolithic loader file into separate modules [df1a0ad, d3ed873, f0a3fce, b249068, 984a8d3, def441f, fb421cc]
  - Created dedicated files for PDF, Anki, URL, audio, HTML, and other loaders
  - Enabled lazy loading of loader modules [7fc5fad]
- Extracted task-specific functions to separate modules:
  - Moved parse_doc to utils/tasks/parse.py [1c7c6e4]
  - Moved query/search retrieval logic to task modules [7982051, c2e6142]
  - Moved evaluate_doc_chain to shared_query_search.py [8965c48]
  - Extracted query splitting logic to shared utility [4bb54a5]
  - Moved source_replace to query.py [0ce5f4f]
  - Moved autoincrease_top_k to query.py [38e82b4]
- Split search and query task methods with better type hints [1d94644, 824f395, 319b8eb]
- Moved debug_exceptions to logger module [99cc99f]
- Moved VectorStore filtering code to filters.py [de4ce57]
- Added wdocSummary dataclass for type hinting [9fc51c0, 92f5c47]
- Added lazy caching for all_texts property [79b1661, 7b45948]
- Removed obsolete import_tricks.py [5116616]
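Lazy caching of a property such as all_texts can be expressed in Python with `functools.cached_property`. The class below is a hypothetical stand-in and does not mirror wdoc's actual classes; it only shows that the value is computed on first access and reused afterwards.

```python
from functools import cached_property


class DocumentStore:
    """Hypothetical container used only to illustrate lazy caching."""

    def __init__(self, docs: list[dict]):
        self._docs = list(docs)
        self.compute_count = 0  # counts how often the texts are rebuilt

    @cached_property
    def all_texts(self) -> list[str]:
        """Built on first access, then served from the instance's __dict__."""
        self.compute_count += 1
        return [doc["text"] for doc in self._docs]
```

If the underlying documents change, the cached value can be dropped with `del store.all_texts`, after which the next access recomputes it.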
🧪 Testing
- Improved test cleanup and temp folder removal [a768642, 35ef63e, c149f5d, 913378a]
- Better verbose output in cost tests [342ad3f]
- Use Mistral for OpenRouter API tests (zero data retention) [8f511dd]
- Added shell-based CLI test script for more reliable testing [cc74a84, 4170567]
- Added check for wdoc[full] installation [7cb9a3c]
- Updated Ollama embedding test to use embeddingsgemma [4d47631]
- Improved test assertions with more info [3d0f947]
📦 Dependencies
- Bumped langchain version [98fd2cb]
- Bumped litellm version [7aa2ce1]
- Bumped langfuse version (litellm bug fix) [fc16e5e]
- Updated general dependencies [616457c]
- Added unstructured to required dependencies [c98d0e9]
- Added bumpver to dev packages [54be0e2]
✨ Features
- Added wdoc[full] installation option for all optional dependencies [6321942]
- Added beartype runtime type validation for numpy arrays [691dbff]
- Prioritize throughput and Groq when using OpenRouter [f049846]
- Enable lazy loading of imports by default [7c2e397]
📝 Documentation
- Updated default models to latest Gemini in README and help [761ddd1, 0868086, 78e562f]
- Clarified that binary embeddings are not always better [fb611c4]
- Added link explaining fixed cache of LLM issue [fdc3c64]
- Improved docstrings for summarization functions [a06f570]
- Added docstring for VectorStore filtering [2bd8dcb]
🎨 Code Quality
- PEP8 formatting improvements [559fcc0, 13dd7d1, e10fb05]
- Removed unused imports [16c518e, cb9563e, b5a42ef]
- Improved type hints [78d399f, a801718, 0875b23]
- Import logger first to set log level [2dead91]
- Removed if True statement [263a45d]
🔖 Version
- Bumped version 3.3.1 → 4.0.0 [e1548c4]
Commits details since the last release
- [e1548c4] by @thiswillbeyourgithub, 12 seconds ago:
 bump version 3.3.1 -> 4.0.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [54be0e2] by @thiswillbeyourgithub, 47 seconds ago:
 add bumpver to dev packages
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [37e80a7] by @thiswillbeyourgithub, 2 minutes ago:
 doc: todo
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [a768642] by @thiswillbeyourgithub, 3 minutes ago:
 better trash removal
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [35ef63e] by @thiswillbeyourgithub, 16 minutes ago:
 less verbose test removal of temp folders
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [c149f5d] by @thiswillbeyourgithub, 30 minutes ago:
 enh: delete cache dir at start of test
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [ae6f28a] by @thiswillbeyourgithub, 32 minutes ago:
 minor: name of a test
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [342ad3f] by @thiswillbeyourgithub, 39 minutes ago:
 better verbose output in cost test
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8f511dd] by @thiswillbeyourgithub, 63 minutes ago:
 fix: use mistral instead of openai when testing api from openrouter because it supports zero data retention
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8b9ebc2] by @thiswillbeyourgithub, 65 minutes ago:
 fix: api test for ddg from the shell
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.sh
- [fd6a7e7] by @thiswillbeyourgithub, 72 minutes ago:
 fix forward reference for typehinting
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [1428497] by @thiswillbeyourgithub, 72 minutes ago:
 fix forgot to test api using cli script
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [22b44b4] by @thiswillbeyourgithub, 85 minutes ago:
 fix forward reference type hints
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [15a2746] by @thiswillbeyourgithub, 89 minutes ago:
 fix forward reference type hints
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
- [29dbf5d] by @thiswillbeyourgithub, 2 hours ago:
 fix: signature wrapping for parse
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/wdoc.py
- [7aa2ce1] by @thiswillbeyourgithub, 2 hours ago:
 bump litellm
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [98fd2cb] by @thiswillbeyourgithub, 2 hours ago:
 bump langchain version
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [616457c] by @thiswillbeyourgithub, 2 hours ago:
 bump dependencies
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [dce3c24] by @thiswillbeyourgithub, 2 hours ago:
 actually use lazy import for litellm
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [52985d5] by @thiswillbeyourgithub, 3 hours ago:
 better startup time by deferring litellm import
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/litellm_embeddings.py
wdoc/utils/embeddings.py
wdoc/utils/llm.py
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [559fcc0] by @thiswillbeyourgithub, 3 hours ago:
 pep8
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
- [0b4c2fb] by @thiswillbeyourgithub, 3 hours ago:
 defer requests import
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
- [78d399f] by @thiswillbeyourgithub, 3 hours ago:
 type hint for multiquery retriever
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [13dd7d1] by @thiswillbeyourgithub, 3 hours ago:
 pep8
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/customs/binary_faiss_vectorstore.py
wdoc/utils/embeddings.py
wdoc/utils/env.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/tasks/parse.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/search.py
- [6ce23d4] by @thiswillbeyourgithub, 4 hours ago:
 fix import statements
 Signed-off-by: thiswillbeyourgithub <[email protected]...
Release 3.3.1
What's new
This release focuses on improving code quality through comprehensive type hint fixes and enhanced testing infrastructure.
🔧 Fixes
- Type Hints: Comprehensive type hint improvements across the codebase
  - Binary FAISS vectorstore type hints ([e46ed4a], [ac02a65], [da864cd], [95c3705], [81b36cb], [be3f352])
  - Loader function type hints ([e6fcad8], [e65abad], [b624373])
  - Semantic batching type hints ([f3a5289], [dd6ad29])
  - Prompt template type hints ([1b4ec86], [bc56beb])
  - General type hint fixes ([d4e99fd])
- Model Compatibility: Fixed issue where some models consider <answer> as implying </think> ([09684bb])
- Langchain Integration: Fixed callable_chain compatibility by creating runnables without decorators ([0c89cac])
✨ Enhancements
- Type Checking: Replaced manual type checking with import hook system ([56b353a], [6b3ddab])
- Logging: Reduced verbosity of litellm logging ([9a4a69c])
- Search: Added duplicate check for DuckDuckGo search results ([f68c8a4])
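A duplicate check on search results can be as simple as keeping the first result for each normalized URL. The sketch below is an assumption about how such a check might look, not the code from commit [f68c8a4]: the `href` key and both function names are illustrative, and ignoring the URL scheme is a design choice made here.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to host + path (+ query), ignoring scheme and case of host."""
    parts = urlsplit(url.strip())
    key = parts.netloc.lower() + parts.path.rstrip("/")
    return f"{key}?{parts.query}" if parts.query else key


def dedupe_results(results: list[dict]) -> list[dict]:
    """Keep the first result for each normalized URL, preserving order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for result in results:
        key = normalize_url(result["href"])  # "href" key is an assumption
        if key in seen:
            continue
        seen.add(key)
        unique.append(result)
    return unique
```

Normalizing before comparing avoids loading the same page twice when it is returned once with a trailing slash or a different scheme.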
🧪 Tests
- Added comprehensive test for DuckDuckGo search functionality ([7dbd3c2])
- Fixed existing CLI tests ([781f6d6])
📦 Version
- Bumped version from 3.3.0 to 3.3.1 ([0690df9])
Commits details since the last release
- [0690df9] by @thiswillbeyourgithub, 41 seconds ago:
 bump version 3.3.0 -> 3.3.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [7dbd3c2] by @thiswillbeyourgithub, 8 hours ago:
 test: add test for DuckDuckGo search functionality
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_wdoc.py
- [781f6d6] by @thiswillbeyourgithub, 9 hours ago:
 fix: test
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [e46ed4a] by @thiswillbeyourgithub, 14 hours ago:
 fix: typehint for marginal score search
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [ac02a65] by @thiswillbeyourgithub, 18 hours ago:
 fix: type hint of binary faiss
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [f68c8a4] by @thiswillbeyourgithub, 19 hours ago:
 add check for duplicate ddg result
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [da864cd] by @thiswillbeyourgithub, 19 hours ago:
 fix: binary faiss type hints
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [9a4a69c] by @thiswillbeyourgithub, 19 hours ago:
 enh: tune down the verbosity of litellm
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [e6fcad8] by @thiswillbeyourgithub, 19 hours ago:
 fix: type hint of load_one_doc
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [e65abad] by @thiswillbeyourgithub, 20 hours ago:
 Revert "fix: typehint of load_one_doc"
 This reverts commit f0037b54ac5ce317442e672f12e1da266b58c5c1.
wdoc/utils/loaders.py
- [b624373] by @thiswillbeyourgithub, 20 hours ago:
 fix: typehint of load_one_doc
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [95c3705] by @thiswillbeyourgithub, 20 hours ago:
 fix: typehints for binary faiss
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [f3a5289] by @thiswillbeyourgithub, 20 hours ago:
 fix: type hints for semantic batching
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
- [d4e99fd] by @thiswillbeyourgithub, 20 hours ago:
 fix: type hints
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [81b36cb] by @thiswillbeyourgithub, 21 hours ago:
 forgot some type hint for binary faiss
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [dd6ad29] by @thiswillbeyourgithub, 21 hours ago:
 fix: wrong typehint in semantic_batch
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
- [09684bb] by @thiswillbeyourgithub, 21 hours ago:
 fix: some models consider <answer> as implying </think>
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [0c89cac] by @thiswillbeyourgithub, 22 hours ago:
 fix: actually callable_chain does not work for langchain so we have to make runnables without decorators
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/callable_runnable.py
wdoc/utils/misc.py
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [6b3ddab] by @thiswillbeyourgithub, 22 hours ago:
 new: remove the ubiquitous optional_typecheck decorator
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/batch_file_loader.py
wdoc/utils/embeddings.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/loaders.py
wdoc/utils/logger.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [56b353a] by @thiswillbeyourgithub, 22 hours ago:
 new: neutralize manual type checking and instead use the import hook
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
wdoc/utils/customs/callable_runnable.py
wdoc/utils/misc.py
wdoc/utils/tasks/query.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [be3f352] by @thiswillbeyourgithub, 22 hours ago:
 fix: type hints in binary faiss
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [1b4ec86] by @thiswillbeyourgithub, 23 hours ago:
 fix: wrong type for Prompts class
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/prompts.py
- [bc56beb] by @thiswillbeyourgithub, 23 hours ago:
 add type checking to prompt template
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/prompts.py
Release 3.3.0
What's new
This release focuses on adding DuckDuckGo web search capabilities and introducing binary embeddings support for more efficient vector storage.
✨ New Features
DuckDuckGo Web Search Integration
- [372fe57] Add DuckDuckGo search support with URL extraction and metadata
- [273195e] Support wdoc wdb "your query" shorthand for web search
- [03bfe08] Add DuckDuckGo search tests and documentation
Binary Embeddings Support
- [c528bad] Add support for binary embeddings with 8x memory reduction
- [8f65197] Enable FAISS vectorstore compression by default
- [37ebd97] Create CompressedFAISS subclass with zlib compression
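To give a feel for the two ideas in this list, here is a pure-Python sketch of sign-based binarization (one bit per dimension, eight dimensions per byte) and of zlib compression of a serialized blob. It is not wdoc's binary FAISS or CompressedFAISS code; the function names are hypothetical, and how wdoc arrives at its stated memory reduction is not shown here.

```python
import zlib


def binarize(vector: list[float]) -> bytes:
    """Pack the sign of each dimension into bits, most significant bit first."""
    packed = bytearray((len(vector) + 7) // 8)
    for i, value in enumerate(vector):
        if value > 0:
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits; this replaces cosine distance for binary codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def compress_blob(serialized: bytes) -> bytes:
    """Compress a serialized index before writing it to disk."""
    return zlib.compress(serialized)


def decompress_blob(blob: bytes) -> bytes:
    """Inverse of compress_blob."""
    return zlib.decompress(blob)
```

Binary codes trade some retrieval accuracy for size and speed, which is consistent with the later note in these releases that binary embeddings are not always better.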
🐛 Bug Fixes
Core Functionality
- [0d72efd] Fix wrong decorator used for load_one_doc
- [edcf671] Fix ddg_region type (str not int)
- [66ab177] Fix type hints for ddg_safesearch and loading_failure
- [957936c] Use keyword arguments instead of fire when calling wdoc
Testing Environment
- [d3de58e] Fix piped input/output handling in pytest environment
- [42ff516] Prevent pipe usage in pytest environment
- [c78dc0b] Add pytest environment detection
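Detecting a pytest run is usually done through environment variables. The sketch below combines `PYTEST_CURRENT_TEST`, which pytest itself sets while a test is running, with a custom flag that a conftest.py could export. The name `WDOC_IN_PYTEST` is hypothetical and is not necessarily the variable wdoc uses.

```python
import os
from collections.abc import Mapping
from typing import Optional


def running_under_pytest(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the process appears to be driven by pytest."""
    env = os.environ if environ is None else environ
    if "PYTEST_CURRENT_TEST" in env:  # set by pytest during test execution
        return True
    # Hypothetical flag that a conftest.py could set before tests start.
    return env.get("WDOC_IN_PYTEST", "").lower() in ("1", "true")
```

Passing the environment as a parameter keeps the check itself easy to test without mutating `os.environ`.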
🧪 Testing Improvements
- [1b09996] Fix the run_all_test script
- [8ed1d0c] Add comprehensive DuckDuckGo search functionality tests
- [b184177] Split CLI tests into separate test_cli.py file
- [9d7fe9c] Split parsing tests into separate test_parsing.py file
- [12b012d] Move vector store tests to dedicated test file
📚 Documentation
- [d7d6b04] Explain how to run tests in README
- [dc15001] Clarify how to disable parallel processing
- [df4b79f] Document debug mode's effect on loading_failure default
- [1832299] Add shell examples for DuckDuckGo usage
🔧 Enhancements
CLI/UX Improvements
- [7e994a6] Rename parse_file function to parse_doc
- [4aa247e] Re-ask for input when empty query provided in CLI
- [57d5d5f] Fix Fire's pager issue in CLI
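Re-asking for input when the query is empty amounts to a small prompt loop. This is a generic sketch under assumed names (`ask_query`, `read`, `max_attempts`), not the code from commit [4aa247e]; the reader function is injectable so the loop can be exercised without a terminal.

```python
from collections.abc import Callable
from typing import Optional


def ask_query(
    read: Callable[[str], str] = input,
    prompt: str = "Query: ",
    max_attempts: Optional[int] = None,
) -> str:
    """Prompt until a non-blank answer is given, or attempts run out."""
    attempts = 0
    while True:
        answer = read(prompt).strip()
        if answer:
            return answer
        attempts += 1
        if max_attempts is not None and attempts >= max_attempts:
            raise ValueError("No query provided")
```

An optional attempt limit avoids spinning forever when standard input is closed or piped from an empty source.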
Performance & Reliability
- [68d4c75] Bump LiteLLM to latest version for improved startup time
- [ab9c5e9] Add parallel processing option for Whisper audio splits
- [6b13044] Add loop counter and crash protection for recursive file processing
🔄 Version Update
- [6435133] Bump version from 3.2.5 → 3.3.0
Commits details since the last release
- [6435133] by @thiswillbeyourgithub, 36 minutes ago:
 bump version 3.2.5 -> 3.3.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [1b09996] by @thiswillbeyourgithub, 24 hours ago:
 test: fix the run_all_test script
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [d7d6b04] by @thiswillbeyourgithub, 24 hours ago:
 doc: explain how to run the tests
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [62cc2ce] by @thiswillbeyourgithub, 24 hours ago:
 fix: ddg test
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [0d72efd] by @thiswillbeyourgithub, 24 hours ago:
 fix: wrong decorator used for load_one_doc
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [dc15001] by @thiswillbeyourgithub, 24 hours ago:
 doc: clarify how to disable parallel processing
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [e0453cb] by @thiswillbeyourgithub, 24 hours ago:
 minor: mention a type hint
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [edcf671] by @thiswillbeyourgithub, 24 hours ago:
 fix: ddg_region is actually a str not an int
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [df4b79f] by @thiswillbeyourgithub, 24 hours ago:
 doc: mention that debug changes the default value for loading_failure
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [66ab177] by @thiswillbeyourgithub, 25 hours ago:
 fix: type of ddg_safesearch and loading_failure should be Literal
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [98b0867] by @thiswillbeyourgithub, 25 hours ago:
 doc: explain that loading_failure defaults to crash when parsing
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [90eacb3] by @thiswillbeyourgithub, 25 hours ago:
 test: ddg should use us region by default
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [c8b1944] by @thiswillbeyourgithub, 25 hours ago:
 test: less severe check for pipes
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [6e12e5c] by @thiswillbeyourgithub, 25 hours ago:
 test: remove one -n auto arg
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [d3de58e] by @thiswillbeyourgithub, 2 days ago:
 fix: actually inside pytest we should not bypass piped input but only piped output
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
wdoc/utils/misc.py
- [5715bc4] by @thiswillbeyourgithub, 2 days ago:
 test: add env variable to detect if being called by pytest
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/conftest.py
- [42ff516] by @thiswillbeyourgithub, 2 days ago:
 new: do not allow using pipe input or output in pytest environment
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [c78dc0b] by @thiswillbeyourgithub, 2 days ago:
 new: detect when wdoc is called in pytest environment
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
wdoc/utils/env.py
wdoc/utils/misc.py
- [fca39c0] by @thiswillbeyourgithub, 2 days ago:
 test: missing oneoff and failsafe when testing ddg
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [b2b4cf1] by @thiswillbeyourgithub, 2 days ago:
 test: fix missing quotation sign for args
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [13409b1] by @thiswillbeyourgithub, 2 days ago:
 test: fix a timeout not long enough
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [957936c] by @thiswillbeyourgithub, 2 days ago:
 fix: use keyword arguments instead of fire when calling wdoc
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [b034337] by @thiswillbeyourgithub, 2 days ago:
 minor
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [b44d730] by @thiswillbeyourgithub, 2 days ago:
 fix: replacing ddg_max_result
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [dfcaf3b] by @thiswillbeyourgithub, 2 days ago:
 fix: wrong way to replace ddg_max_result to ddg_max_results
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [adc991a] by @thiswillbeyourgithub, 2 days ago:
 actually no
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [9ab4cbf] by @thiswillbeyourgithub, 2 days ago:
 fix: type hint of load_one_doc can be a list of string in case of error
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders.py
- [5f7fcf4] by @thiswillbeyourgithub, 2 days ago:
 typo: Nvidia instead of NVidia
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
tests/test_cli.py
wdoc/docs/examples.md
- [03bfe08] by @thiswillbeyourgithub, 2 days ago:
 test: add test for ddg search
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [48165fa] by @thiswillbeyourgithub, 2 days ago:
 test: clearer echo
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [08cac94] by @thiswillbeyourgithub, 2 days ago:
 remove unused import
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [3aabd2d] by @thiswillbeyourgithub, 2 days ago:
 style: format test_cli.py with linter
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [8ed1d0c] by @thiswillbeyourgithub, 2 days ago:
 feat: add test for DuckDuckGo search functionality with NVIDIA query
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [a8e3e04] by @thiswillbeyourgithub, 2 days ago:
 test: add test for DuckDuckGo search with NVIDIA query
 Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.py
- [1832299] by @thiswillbeyourgithub, 2 days ago:
 doc: add shell example for using duckduckgo
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [e6c4641] by @thiswillbeyourgithub, 2 days ago:
 typo
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [917ee51] by @...
Release 3.2.5
What's new
This release brings several improvements to command-line argument handling and filetype detection, along with key bug fixes and build process enhancements.
✨ Features
- CLI & Filetype Detection:
- Build Process:
- Integrated sphinx-apidoc into the ReadTheDocs build process via a pre-build job in .readthedocs.yaml ([cc86c7b]).
🐛 Fixes
- Corrected an issue with sys.argv handling that led to duplicated arguments ([e7cf185]).
- Updated litellm dependency to resolve crashes experienced on Windows environments ([cfff0ac]), see #20.
🛠️ Improvements & Refactoring
- Filetype Detection Internals:
- Code Quality:
- Improved documentation by adding docstrings to custom exception classes ([8e6ca1a]).
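The filetype changes in this release (detection moved into its own function, a dedicated exception when nothing can be inferred, and reuse of the detector for implicit CLI arguments) could be sketched roughly as follows. Every name here (`infer_filetype`, `UndetectableFiletype`, `split_implicit_args`, the extension table) is hypothetical and chosen for illustration; the real detector in wdoc/utils/batch_file_loader.py may use different names and rules.

```python
from pathlib import Path


class UndetectableFiletype(Exception):
    """Hypothetical error raised when no filetype can be inferred."""


# Illustrative mapping only; not wdoc's real detection table.
EXTENSION_TO_FILETYPE = {
    ".pdf": "pdf",
    ".txt": "txt",
    ".epub": "epub",
}


def infer_filetype(path: str) -> str:
    """Return a filetype for `path`, or raise UndetectableFiletype."""
    if path.startswith(("http://", "https://")):
        return "url"
    suffix = Path(path).suffix.lower()
    if suffix in EXTENSION_TO_FILETYPE:
        return EXTENSION_TO_FILETYPE[suffix]
    raise UndetectableFiletype(f"cannot infer filetype of {path!r}")


def split_implicit_args(args: list[str]) -> tuple[list[str], list[str]]:
    """Separate positional arguments into document paths and leftover words."""
    paths, rest = [], []
    for arg in args:
        try:
            infer_filetype(arg)
            paths.append(arg)
        except UndetectableFiletype:
            rest.append(arg)
    return paths, rest
```

Raising a specific exception, instead of returning None or a generic error, lets the caller tell "this argument is not a document" apart from other failures.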
 
Chores
- Version bumped to 3.2.5 ([82b7f81]).
Commits details since the last release
- [82b7f81] by @thiswillbeyourgithub, 19 minutes ago:
 bump version 3.2.4 -> 3.2.5
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [e7cf185] by @thiswillbeyourgithub, 3 minutes ago:
 fix: badly handled sys.argv was duplicating args
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [cfff0ac] by @thiswillbeyourgithub, 19 minutes ago:
 fix: bump version of litellm because windows crashes
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [cc86c7b] by @thiswillbeyourgithub (aider), 2 days ago:
 feat: add pre-build job to run sphinx-apidoc in .readthedocs.yaml
.readthedocs.yaml
- [ab76610] by @thiswillbeyourgithub, 2 days ago:
 new: use the filetype detector to infer what to do in case of multiple implicit arguments from the cli
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [05966c6] by @thiswillbeyourgithub, 2 days ago:
 enh: add debug prints to the filetype detector
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [520f4ce] by @thiswillbeyourgithub, 2 days ago:
 new: use a specific exception when we can't infer the filetype
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [8e6ca1a] by @thiswillbeyourgithub, 2 days ago:
 add docstring to some exceptions
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/errors.py
- [39af223] by @thiswillbeyourgithub, 2 days ago:
 add an error for undetectable filetype
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/errors.py
- [b453748] by @thiswillbeyourgithub, 2 days ago:
 new: put the filetype detection code in a separate function
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
Release 3.2.4
What's new
This release primarily focuses on significant documentation enhancements, crucial bug fixes for stability and build processes, and introduces updated dependencies and tokenization.
✨ New Features
- Upgraded default token estimation to use the gpt-4o-mini tokenizer, replacing gpt-3.5-turbo ([6d41817]).
- Integrated the latest yt-dlp for YouTube downloads ([ab207b4]).
- Environment variable documentation is now automatically added to the EnvDataclass class __doc__ ([ed9dd38]).
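As an illustration of the last item, one way to attach per-variable documentation to a dataclass docstring is a small decorator that reads field metadata. This is only a sketch: `document_fields`, `EnvSketch`, and the `EXAMPLE_*` variables are invented for the example and are not wdoc's EnvDataclass or its real environment variables.

```python
from dataclasses import dataclass, field, fields


def document_fields(cls):
    """Append one line per dataclass field to the class docstring."""
    lines = [cls.__doc__ or "", "", "Environment variables:"]
    for f in fields(cls):
        description = f.metadata.get("doc", "undocumented")
        lines.append(f"- {f.name} (default {f.default!r}): {description}")
    cls.__doc__ = "\n".join(lines)
    return cls


@document_fields
@dataclass
class EnvSketch:
    """Hypothetical settings container, not wdoc's EnvDataclass."""

    EXAMPLE_DEBUG: bool = field(default=False, metadata={"doc": "Enable verbose logs."})
    EXAMPLE_MAX_CHUNKS: int = field(default=10, metadata={"doc": "Upper bound on chunks."})
```

Because the text ends up in `__doc__`, tools such as `help()` or Sphinx autodoc pick it up without a separately maintained list.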
🐛 Bug Fixes
- Resolved a crash on ReadTheDocs caused by missing yt-dlp dependency ([f5068a3]).
- Fixed an issue where accessing env.__class__ on ReadTheDocs could cause a crash ([4e180f0]).
- Corrected relative import paths in wdoc that were preventing Sphinx API documentation builds ([ade5930]).
- Fixed issues with the Sphinx API command in the FAQ section of the README ([38008aa], [ff093a2]).
- Ensured collapsible bars in documentation function correctly ([3cef833]).
📚 Documentation & Refinements
- Extensive updates and fixes to Sphinx documentation generation and content:
- Addressed outdated Sphinx documentation files ([90bde99]).
- Improved API autodoc parameters for clearer documentation ([243de66]).
- Excluded private and special members from documentation ([7abedd4]).
- Added Sphinx command to FAQ in README ([1e6602e]) and removed private members from it ([11ae11b]).
- Updated copyright year to 2025 ([bd7e3c5]).
 
- Streamlined documentation structure and configuration:
- Removed unused make files (Makefile, make.bat) for documentation ([07b0a7d]).
- Removed unused argument for theme flyout display ([17bc5e6]).
- Removed unused templates path ([6bffa20]) and CSS ([712df08]).
- Removed duplicate README from the documentation source ([2b93162]).
- Added a documentation table to the main index ([1dfe2b3]).
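For readers unfamiliar with the autodoc settings mentioned above, a conf.py excerpt that documents public members while leaving private and special members out might look like the following. The values are illustrative and are not copied from the project's docs/source/conf.py.

```python
# docs/source/conf.py (excerpt, illustrative values)
extensions = ["sphinx.ext.autodoc"]

# Document public members, including those without docstrings.
# "private-members" and "special-members" are deliberately absent, so
# names such as _helper or __repr__ are left out of the generated pages.
autodoc_default_options = {
    "members": True,
    "undoc-members": True,
    "show-inheritance": True,
}
```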
 
⚙️ Build & Chores
- Bumped version to 3.2.4 ([ed7a9c7]).
Commits details since the last release
- [ed7a9c7] by @thiswillbeyourgithub, 20 seconds ago:
 bump version 3.2.3 -> 3.2.4
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [f5068a3] by @thiswillbeyourgithub, 13 minutes ago:
 fix: missing yt-dlp makes readthedock crash
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [17bc5e6] by @thiswillbeyourgithub, 19 minutes ago:
 remove unused argument for theme flyout display
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [4e180f0] by @thiswillbeyourgithub, 22 minutes ago:
 fix: class attribute of env is accessed by readthedocks and should not crash
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
- [243de66] by @thiswillbeyourgithub, 2 hours ago:
 saner api autodoc parameters
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [ed9dd38] by @thiswillbeyourgithub, 3 hours ago:
 new: add the environment variable documentation to the doc of the EnvDataclass class
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/env.py
- [07b0a7d] by @thiswillbeyourgithub, 3 hours ago:
 doc: remove unused make files for doc
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/Makefile
docs/make.bat
- [7abedd4] by @thiswillbeyourgithub, 4 hours ago:
 doc: dont include private nor special
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [38008aa] by @thiswillbeyourgithub, 2 hours ago:
 fix: sphinx api command of faq
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [11ae11b] by @thiswillbeyourgithub, 4 hours ago:
 remove private from sphinx command
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [90bde99] by @thiswillbeyourgithub, 4 hours ago:
 fix outdated sphinx doc
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/wdoc.rst
docs/source/wdoc.utils.batch_file_loader.rst
docs/source/wdoc.utils.customs.compressed_embeddings_cache.rst
docs/source/wdoc.utils.customs.fix_llm_caching.rst
docs/source/wdoc.utils.customs.rst
docs/source/wdoc.utils.embeddings.rst
docs/source/wdoc.utils.env.rst
docs/source/wdoc.utils.errors.rst
docs/source/wdoc.utils.flags.rst
docs/source/wdoc.utils.import_tricks.rst
docs/source/wdoc.utils.interact.rst
docs/source/wdoc.utils.llm.rst
docs/source/wdoc.utils.loaders.rst
docs/source/wdoc.utils.logger.rst
docs/source/wdoc.utils.misc.rst
docs/source/wdoc.utils.prompts.rst
docs/source/wdoc.utils.retrievers.rst
docs/source/wdoc.utils.rst
docs/source/wdoc.utils.tasks.query.rst
docs/source/wdoc.utils.tasks.rst
docs/source/wdoc.utils.tasks.summarize.rst
docs/source/wdoc.utils.typechecker.rst
docs/source/wdoc.wdoc.rst
- [ff093a2] by @thiswillbeyourgithub, 4 hours ago:
 fix: sphinx api command of faq
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [ade5930] by @thiswillbeyourgithub, 4 hours ago:
 fix: relative wdoc imports were stopping sphinx api build
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
wdoc/main.py
wdoc/utils/__init__.py
wdoc/utils/batch_file_loader.py
wdoc/utils/customs/__init__.py
wdoc/utils/embeddings.py
wdoc/utils/env.py
wdoc/utils/import_tricks.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/loaders.py
wdoc/utils/logger.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/__init__.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/utils/typechecker.py
wdoc/wdoc.py
- [1e6602e] by @thiswillbeyourgithub, 5 hours ago:
 doc: add to faq the sphinx command
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [bd7e3c5] by @thiswillbeyourgithub, 5 hours ago:
 update copyright year to 2025
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [6bffa20] by @thiswillbeyourgithub, 6 hours ago:
 remove unused templates path in doc
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [2b93162] by @thiswillbeyourgithub, 6 hours ago:
 remove duplicate readme from the doc
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/index.rst
- [3cef833] by @thiswillbeyourgithub, 6 hours ago:
 fix collapsible bar
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/conf.py
- [712df08] by @thiswillbeyourgithub, 6 hours ago:
 remove unused css from the doc
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/_static/custom.css
docs/source/conf.py
- [1dfe2b3] by @thiswillbeyourgithub, 6 hours ago:
 documentation table
 Signed-off-by: thiswillbeyourgithub [email protected]
docs/source/index.rst
- [6d41817] by @thiswillbeyourgithub, 25 hours ago:
 new: use gpt-4o-mini tokenizer by default to estimate tokens
 previously we used the ageing gpt-3.5-turbo
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/misc.py
- [ab207b4] by @thiswillbeyourgithub, 25 hours ago:
 new: use the latest yt-dl install from yt-dlp
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
Release 3.2.3
What's new
This release primarily focuses on enhancing context management for embedding models, improving debugging utilities, and updating documentation for better clarity. It also includes several important bug fixes and feature additions.
✨ Features
- Introduced a new environment variable WDOC_MAX_EMBED_CONTEXT to allow capping the context size for embedding models ([d9e200f8])
- Documentation for this new variable has been added ([a2408fd0])
- Enhanced debugging by ensuring debug prints are always active when md_printer is used. This helps in retrieving LLM answers from logs if they weren't saved to a file ([69db1916])
- Added the current date to summary metadata and headers to help reduce potential LLM hallucinations ([64ca4665])
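To make the context cap concrete, here is a minimal sketch of how a splitter limit could be derived from WDOC_MAX_EMBED_CONTEXT. Only the variable name comes from these notes; the helper `effective_max_tokens`, its fallback value of 8192, and the "take the smaller of the two" rule are assumptions for illustration, not wdoc's actual logic.

```python
import os


def effective_max_tokens(llm_context: int, default_cap: int = 8192) -> int:
    """Return the chunk size limit for the text splitter.

    WDOC_MAX_EMBED_CONTEXT is the variable named in these notes; the
    default of 8192 and this function are assumptions for illustration.
    """
    raw = os.environ.get("WDOC_MAX_EMBED_CONTEXT")
    cap = int(raw) if raw else default_cap
    # Never produce chunks larger than the embedding model can accept,
    # even when the LLM itself has a much larger context window.
    return min(llm_context, cap)
```

With the variable set to 4096, a model advertising a one-million-token window would still be split into chunks of at most 4096 tokens under this sketch.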
🐛 Fixes
- Text Splitting & Context Handling:
- Addressed an issue where large language models have more context than embedding models by setting a max_tokens limit for the text splitter ([dac6802d])
- Fixed an edge case where the wdoc max chunk setting could be ignored ([196b3a00])
- Corrected an old variable name within the text splitting logic ([767bc754])
- Updated the default model to gemini 2.5 preview to reflect its renaming on OpenRouter ([22978609])
- Improved the mechanism for ignoring initial "breathing" or placeholder lines in summaries ([4dbcf158])
📚 Documentation
- Clarity and Enhancements:
- Clarified the usage of save and load functionalities ([9d9642d4]) and specifically advised against using them simultaneously ([5270c350])
- Made multiple clarifications to the README for better understanding ([9284ff54], [cb4cb519], [f677e5a2], [39e0da55])
- Updated Ollama examples to recommend snowflake-arctic-embed2 instead of bge-m3 ([d045702b])
- Added documentation for the WDOC_MAX_EMBED_CONTEXT environment variable ([a2408fd0])
- Removed a documentation file (summary_rag.md) that was not yet ready for release ([6d20c220])
⚙️ Chore & Maintenance
- Version bumped to 3.2.3 ([f62a2322]), following an earlier bump to 3.2.2 ([71ac503c])
- README Updates:
- Updated TODO items ([8f2cbfd7], [5d090421])
- Added a PyPI badge for better project visibility ([60ef4112])
Commits details since the last release
- [f62a232] by @thiswillbeyourgithub, 46 seconds ago:
 bump version 3.2.2 -> 3.2.3
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [6d20c22] by @thiswillbeyourgithub, 76 seconds ago:
 doc: removed file not yet ready
 Signed-off-by: thiswillbeyourgithub [email protected]
summary_rag.md
- [71ac503] by @thiswillbeyourgithub, 4 minutes ago:
 bump version 3.2.1 -> 3.2.2
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [8f2cbfd] by @thiswillbeyourgithub, 3 minutes ago:
 todo
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [69db191] by @thiswillbeyourgithub, 40 minutes ago:
 new: now debug print is used anyway when md_printer is used
 this is to make you able to go to the logs to fetch and answer form the
 LLM if you have forgotten to store it to a file
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/logger.py
wdoc/wdoc.py
- [a2408fd] by @thiswillbeyourgithub (aider), 66 minutes ago:
 docs: Add documentation for WDOC_MAX_EMBED_CONTEXT variable
wdoc/docs/help.md
- [d9e200f] by @thiswillbeyourgithub, 66 minutes ago:
 feat: add new env var to cap the context size for embedding models
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
wdoc/utils/misc.py
- [196b3a0] by @thiswillbeyourgithub, 72 minutes ago:
 fix: edge case where wdoc max chunk would be ignored
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [dac6802] by @thiswillbeyourgithub, 76 minutes ago:
 fix: set a limit to max_tokens for the text splitter as large LLM have more context than embeddings models nowadays
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [767bc75] by @thiswillbeyourgithub, 80 minutes ago:
 fix: forgot to rename an old variable name
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [2297860] by @thiswillbeyourgithub, 86 minutes ago:
 fix: set default model to gemini 2.5 preview without date timestamp
 openrouter renamed that model apparently
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
wdoc/utils/env.py
- [9d9642d] by @thiswillbeyourgithub, 22 hours ago:
 doc: clarify save and load
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [5270c35] by @thiswillbeyourgithub, 22 hours ago:
 doc: clarify that load and save shouldnt be used at the same time
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [d045702] by @thiswillbeyourgithub, 23 hours ago:
 doc: use snowflake-arctic-embed2 instead of bge-m3 for ollama examples
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/examples.md
- [60ef411] by @thiswillbeyourgithub, 26 hours ago:
 add a pypi badge
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [5d09042] by @thiswillbeyourgithub, 7 days ago:
 update todo
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [9284ff5] by @thiswillbeyourgithub, 7 days ago:
 doc: clarify
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [cb4cb51] by @thiswillbeyourgithub, 7 days ago:
 doc: clarify
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [f677e5a] by @thiswillbeyourgithub, 7 days ago:
 doc: clarify
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [39e0da5] by @thiswillbeyourgithub, 7 days ago:
 doc: clarify
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [64ca466] by @thiswillbeyourgithub (aider), 10 days ago:
 feat: Add current date to summary metadata and header to reduce hallucinations
wdoc/wdoc.py
- [4dbcf15] by @thiswillbeyourgithub, 10 days ago:
 enh: better ignoring of first line of summary if just breathing
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
Release 3.2.1
What's new
This small patch release primarily focuses on integrating OpenRouter for model pricing/metadata and refining cost calculations.
✨ Features
- Set default models to use OpenRouter ([915699c]).
- Fetch model prices and metadata automatically from OpenRouter, improving reliability ([7f840b7]).
🐛 Fixes & Enhancements
- Much improved price calculation and handling:
- Updated litellm dependency ([179b589]).
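The pricing rework is described in the commit log as taking the model's internal thinking into account. A rough sketch of what such a calculation can look like is given below. The `Usage` class, the `estimate_cost` helper, and in particular the assumption that reasoning tokens are reported separately and billed at the output rate are illustrative guesses, not wdoc's or OpenRouter's documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    reasoning_tokens: int = 0  # internal "thinking" tokens, if reported separately


def estimate_cost(usage: Usage, price_in: float, price_out: float) -> float:
    """Dollar cost given per-token prices.

    Assumption: reasoning tokens are billed at the output rate and are
    not already included in completion_tokens.
    """
    billed_output = usage.completion_tokens + usage.reasoning_tokens
    return usage.prompt_tokens * price_in + billed_output * price_out
```

Ignoring the reasoning tokens in a call like `estimate_cost(Usage(1000, 200, 300), 1e-6, 2e-6)` would understate the output share of the bill by more than half, which is the kind of gap such a rework is meant to close.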
🧪 Tests
- API integration tests now fail faster if an underlying API call fails ([9a0c856]).
Commits details since the last release
- [03aeab2] by @thiswillbeyourgithub, 2 minutes ago:
 bump version 3.2.0 -> 3.2.1
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [915699c] by @thiswillbeyourgithub, 6 minutes ago:
 new: set the default models to use openrouter
 Signed-off-by: thiswillbeyourgithub [email protected]
README.md
wdoc/utils/env.py
- [c0b90d8] by @thiswillbeyourgithub, 64 minutes ago:
 fix: reworked how pricing are computed to take internal thinking into account
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [a17b41c] by @thiswillbeyourgithub, 80 minutes ago:
 enh: better way to get the model prices
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
wdoc/wdoc.py
- [9a0c856] by @thiswillbeyourgithub, 22 minutes ago:
 test: crash early if one api crash fails
 Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [7f840b7] by @thiswillbeyourgithub, 2 hours ago:
 feat: automatically fetch the price and metadata from openrouter instead of waiting for litellm
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
wdoc/wdoc.py
- [2b29a9d] by @thiswillbeyourgithub, 2 hours ago:
 fix: error message on missing model price
 Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [179b589] by @thiswillbeyourgithub, 2 hours ago:
 bump litellm version
 Signed-off-by: thiswillbeyourgithub [email protected]
setup.py