Release 4.0.0
What's new
This release focuses on major performance improvements through lazy loading and deferred imports, extensive code refactoring for better maintainability, and improved testing infrastructure.
⚡ Performance
- Significantly faster startup time through deferred imports and lazy loading [52985d5, dce3c24, 3ffaec3]
- Moved litellm imports to run only when needed [52985d5]
- Deferred requests import [0b4c2fb]
- Removed eager imports from `__init__.py` files [306d4ca]
- Moved imports in loaders, embeddings, and core modules [de1cecc, 08b9206, fd2dcba, 1838e0f, 1bd4ced, f1740c4, 2b3d9e8, 6fbe51d, 6c74d8e, f306325]
- Added lazy loading for document loaders with `WDOC_LAZY_LOAD` env var [7fc5fad, ce10c4b]
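The deferred-import and lazy-loading items above can be illustrated with a minimal sketch. It is not wdoc's actual code: `LOADER_MODULES`, `get_loader_module`, and the accepted values of `WDOC_LAZY_LOAD` are assumptions made for illustration, and standard-library modules stand in for the heavy loader dependencies.

```python
import importlib
import os
import sys

# Assumption: truthy strings enable lazy loading; wdoc's real parsing may differ.
LAZY_LOAD = os.environ.get("WDOC_LAZY_LOAD", "true").lower() in {"1", "true", "yes"}

# Hypothetical registry: filetype -> module implementing its loader.
# Stdlib modules stand in for heavy third-party dependencies.
LOADER_MODULES = {"json": "json", "csv": "csv", "wave": "wave"}


def get_loader_module(filetype: str):
    """Return the module for `filetype`, importing it on first use."""
    if filetype not in LOADER_MODULES:
        raise ValueError(f"Unsupported filetype: {filetype!r}")
    # import_module consults sys.modules, so repeated calls are cheap.
    return importlib.import_module(LOADER_MODULES[filetype])


if not LAZY_LOAD:
    # Eager mode: pay the whole import cost at startup.
    for _filetype in LOADER_MODULES:
        get_loader_module(_filetype)
```

With this shape, a run that only touches one filetype imports only that loader's dependencies. To see where startup time goes, CPython's `-X importtime` option prints the import cost of each module.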
🔧 Fixes
- Fixed forward reference type hints across multiple modules [fd6a7e7, 22b44b4, 15a2746]
- Fixed signature wrapping for parse function [29dbf5d]
- Fixed API tests for DuckDuckGo and OpenRouter [8b9ebc2, 8f511dd, 32e036d, 048f99e]
- Fixed missing filetype handling in edge cases [0422dec]
- Fixed error for Word document loading [8cad00d]
- Fixed lazy loading logic (was reversed) [a35446f]
- Fixed query_task and search_task output handling [6f633e8, 8b95a81]
- Fixed error when summary doesn't output to file using pipe [2a85a6b]
- Fixed imports in loaders [ebd4558, af85343, 986abd2, 4e61a6f]
- Added missing `audioop-lts` requirement for Python 3.13+ [56bd634]
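Forward-reference problems in type hints tend to appear once imports are deferred, because a name used in an annotation may no longer be bound when the module is loaded. The sketch below shows one common way to handle this, assuming string annotations combined with `typing.TYPE_CHECKING`; the commits above may have used a different approach, and `total_cost` is a hypothetical function.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; never executed at runtime,
    # so it adds nothing to startup time.
    from decimal import Decimal


def total_cost(prices: "list[Decimal]") -> "Decimal":
    """Sum prices. The quoted annotations are forward references."""
    # Deferred import: decimal is loaded on the first call, not at startup.
    from decimal import Decimal

    return sum(prices, Decimal(0))
```

The quoted annotations are stored as plain strings, so defining the function does not need `Decimal` to exist yet.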
♻️ Refactoring
- Modularized loaders: Split monolithic loader file into separate modules [df1a0ad, d3ed873, f0a3fce, b249068, 984a8d3, def441f, fb421cc]
- Created dedicated files for PDF, Anki, URL, audio, HTML, and other loaders
- Enabled lazy loading of loader modules [7fc5fad]
- Extracted task-specific functions to separate modules:
  - Moved `parse_doc` to `utils/tasks/parse.py` [1c7c6e4]
  - Moved query/search retrieval logic to task modules [7982051, c2e6142]
  - Moved `evaluate_doc_chain` to `shared_query_search.py` [8965c48]
  - Extracted query splitting logic to shared utility [4bb54a5]
  - Moved `source_replace` to query.py [0ce5f4f]
  - Moved `autoincrease_top_k` to query.py [38e82b4]
- Split search and query task methods with better type hints [1d94644, 824f395, 319b8eb]
- Moved `debug_exceptions` to logger module [99cc99f]
- Moved VectorStore filtering code to filters.py [de4ce57]
- Added `wdocSummary` dataclass for type hinting [9fc51c0, 92f5c47]
- Added lazy caching for `all_texts` property [79b1661, 7b45948]
- Removed obsolete `import_tricks.py` [5116616]
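Lazy caching of a property such as `all_texts` can be sketched with `functools.cached_property`. This is an illustration under assumptions, not the code from the commits: the `DocumentStore` class, the `documents` layout, and the `compute_count` counter are hypothetical.

```python
from functools import cached_property


class DocumentStore:
    """Hypothetical stand-in for the object that owns the loaded documents."""

    def __init__(self, documents: list):
        self.documents = documents
        self.compute_count = 0  # only here to make the caching observable

    @cached_property
    def all_texts(self) -> list:
        # Runs on first access only; the result is then stored on the instance.
        self.compute_count += 1
        return [doc["text"] for doc in self.documents]
```

The list is never built if nothing reads `all_texts`, and it is built only once if it is read repeatedly.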
🧪 Testing
- Improved test cleanup and temp folder removal [a768642, 35ef63e, c149f5d, 913378a]
- Better verbose output in cost tests [342ad3f]
- Use Mistral for OpenRouter API tests (zero data retention) [8f511dd]
- Added shell-based CLI test script for more reliable testing [cc74a84, 4170567]
- Added check for `wdoc[full]` installation [7cb9a3c]
- Updated Ollama embedding test to use `embeddingsgemma` [4d47631]
- Improved test assertions with more info [3d0f947]
📦 Dependencies
- Bumped langchain version [98fd2cb]
- Bumped litellm version [7aa2ce1]
- Bumped langfuse version (litellm bug fix) [fc16e5e]
- Updated general dependencies [616457c]
- Added unstructured to required dependencies [c98d0e9]
- Added bumpver to dev packages [54be0e2]
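The `audioop-lts` requirement listed under Fixes is conditional on the interpreter version, because the standard-library `audioop` module was removed in Python 3.13. A conditional requirement of this kind is usually written with a PEP 508 environment marker. The sketch below assumes that form and is not copied from wdoc's `setup.py`; `INSTALL_REQUIRES` and `needs_audioop_lts` are hypothetical names.

```python
import sys

# PEP 508 environment marker: installers evaluate the condition after ";"
# and skip the requirement on interpreters where it is false.
INSTALL_REQUIRES = [
    'audioop-lts; python_version >= "3.13"',
]


def needs_audioop_lts(version_info=sys.version_info) -> bool:
    """Mirror the marker's condition for a given (major, minor, ...) tuple."""
    return tuple(version_info[:2]) >= (3, 13)
```

A list like this would typically be passed as `install_requires` to `setuptools.setup()`.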
✨ Features
- Added `wdoc[full]` installation option for all optional dependencies [6321942]
- Added beartype runtime type validation for numpy arrays [691dbff]
- Prioritize throughput and Groq when using OpenRouter [f049846]
- Enable lazy loading of imports by default [7c2e397]
📝 Documentation
- Updated default models to latest Gemini in README and help [761ddd1, 0868086, 78e562f]
- Clarified that binary embeddings are not always better [fb611c4]
- Added link explaining fixed cache of LLM issue [fdc3c64]
- Improved docstrings for summarization functions [a06f570]
- Added docstring for VectorStore filtering [2bd8dcb]
🎨 Code Quality
- PEP8 formatting improvements [559fcc0, 13dd7d1, e10fb05]
- Removed unused imports [16c518e, cb9563e, b5a42ef]
- Improved type hints [78d399f, a801718, 0875b23]
- Import logger first to set log level [2dead91]
- Removed `if True` statement [263a45d]
🔖 Version
- Bumped version 3.3.1 → 4.0.0 [e1548c4]
Commit details since the last release
- [e1548c4] by @thiswillbeyourgithub, 12 seconds ago:
bump version 3.3.1 -> 4.0.0
bumpver.toml
docs/source/conf.py
setup.py
wdoc/wdoc.py
- [54be0e2] by @thiswillbeyourgithub, 47 seconds ago:
add bumpver to dev packages
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [37e80a7] by @thiswillbeyourgithub, 2 minutes ago:
doc: todo
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [a768642] by @thiswillbeyourgithub, 3 minutes ago:
better trash removal
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [35ef63e] by @thiswillbeyourgithub, 16 minutes ago:
less verbose test removal of temp folders
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [c149f5d] by @thiswillbeyourgithub, 30 minutes ago:
enh: delete cache dir at start of test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [ae6f28a] by @thiswillbeyourgithub, 32 minutes ago:
minor: name of a test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [342ad3f] by @thiswillbeyourgithub, 39 minutes ago:
better verbose output in cost test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8f511dd] by @thiswillbeyourgithub, 63 minutes ago:
fix: use mistral instead of openai when testing api from openrouter because it supports zero data retention
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [8b9ebc2] by @thiswillbeyourgithub, 65 minutes ago:
fix: api test for ddg from the shell
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.sh
- [fd6a7e7] by @thiswillbeyourgithub, 72 minutes ago:
fix forward reference for typehinting
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [1428497] by @thiswillbeyourgithub, 72 minutes ago:
fix forgot to test api using cli script
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [22b44b4] by @thiswillbeyourgithub, 85 minutes ago:
fix forward reference type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [15a2746] by @thiswillbeyourgithub, 89 minutes ago:
fix forward reference type hints
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
- [29dbf5d] by @thiswillbeyourgithub, 2 hours ago:
fix: signature wrapping for parse
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/wdoc.py
- [7aa2ce1] by @thiswillbeyourgithub, 2 hours ago:
bump litellm
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [98fd2cb] by @thiswillbeyourgithub, 2 hours ago:
bump langchain version
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [616457c] by @thiswillbeyourgithub, 2 hours ago:
bump dependencies
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [dce3c24] by @thiswillbeyourgithub, 2 hours ago:
actually use lazy import for litellm
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [52985d5] by @thiswillbeyourgithub, 3 hours ago:
better startup time by defering litellm import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/customs/litellm_embeddings.py
wdoc/utils/embeddings.py
wdoc/utils/llm.py
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
wdoc/utils/retrievers.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [559fcc0] by @thiswillbeyourgithub, 3 hours ago:
pep8
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/__init__.py
- [0b4c2fb] by @thiswillbeyourgithub, 3 hours ago:
defer requests import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/shared_audio.py
wdoc/utils/misc.py
- [78d399f] by @thiswillbeyourgithub, 3 hours ago:
type hint for multiquery retriever
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [13dd7d1] by @thiswillbeyourgithub, 3 hours ago:
pep8
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
wdoc/utils/customs/binary_faiss_vectorstore.py
wdoc/utils/embeddings.py
wdoc/utils/env.py
wdoc/utils/interact.py
wdoc/utils/llm.py
wdoc/utils/misc.py
wdoc/utils/prompts.py
wdoc/utils/tasks/parse.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/search.py
- [6ce23d4] by @thiswillbeyourgithub, 4 hours ago:
fix import statements
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [16c518e] by @thiswillbeyourgithub, 4 hours ago:
remove unused import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [3ffaec3] by @thiswillbeyourgithub, 4 hours ago:
perf: move imports for faster load time
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/shared_audio.py
wdoc/wdoc.py
- [306d4ca] by @thiswillbeyourgithub, 4 hours ago:
perf: remove the __init__.py content that was causing tons of imports
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/__init__.py
wdoc/utils/customs/__init__.py
wdoc/utils/tasks/__init__.py
- [b5a42ef] by @thiswillbeyourgithub, 4 hours ago:
minor: remove duplicate import statement
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [de1cecc] by @thiswillbeyourgithub, 4 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [08b9206] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/anki.py
- [fd2dcba] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [1838e0f] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
wdoc/utils/loaders/shared.py
wdoc/wdoc.py
- [1bd4ced] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/main.py
- [2dead91] by @thiswillbeyourgithub, 5 hours ago:
import the logger first to set the log level
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [263a45d] by @thiswillbeyourgithub, 5 hours ago:
minor: remove a if True statement
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/embeddings.py
- [f1740c4] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/embeddings.py
wdoc/wdoc.py
- [2b3d9e8] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [6fbe51d] by @thiswillbeyourgithub, 5 hours ago:
perf: import parse_doc only if needed
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [cb9563e] by @thiswillbeyourgithub, 5 hours ago:
remove unused asyncio import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [65416e6] by @thiswillbeyourgithub, 5 hours ago:
minor
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [6c74d8e] by @thiswillbeyourgithub, 5 hours ago:
perf: move imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
wdoc/utils/tasks/query.py
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [ad8691a] by @thiswillbeyourgithub, 5 hours ago:
perf: move retriever imports to increase perf
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
- [f306325] by @thiswillbeyourgithub, 5 hours ago:
perf: move import of ask_user to run only if needed
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [78e562f] by @thiswillbeyourgithub, 5 days ago:
fix: --help had outdated default model
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
- [761ddd1] by @thiswillbeyourgithub, 5 days ago:
update default models to latest gemini
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [0868086] by @thiswillbeyourgithub, 5 days ago:
update default models to latest gemini
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/env.py
- [913378a] by @thiswillbeyourgithub, 2 weeks ago:
test: better cleanup
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [fc16e5e] by @thiswillbeyourgithub, 2 weeks ago:
bump langfuse version as litellm bug got fixed
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [37ccf4b] by @thiswillbeyourgithub, 3 weeks ago:
fix test file
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [0422dec] by @thiswillbeyourgithub, 3 weeks ago:
fix: missing filetype in some edge case
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
- [8cad00d] by @thiswillbeyourgithub, 3 weeks ago:
fix: error for filetype word
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/word.py
- [c98d0e9] by @thiswillbeyourgithub, 3 weeks ago:
add unstructured to the req dep
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [d469768] by @thiswillbeyourgithub, 3 weeks ago:
doc
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [7cb9a3c] by @thiswillbeyourgithub, 3 weeks ago:
add check that wdoc[full] install works
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [a35446f] by @thiswillbeyourgithub, 3 weeks ago:
fix: lazy loading logic was reversed
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [70bcbd0] by @thiswillbeyourgithub, 3 weeks ago:
add debug print when using not lazy loading
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [d603d49] by @thiswillbeyourgithub, 3 weeks ago:
change order of dep
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [2ae30fb] by @thiswillbeyourgithub, 3 weeks ago:
add missing pandas import
Signed-off-by: thiswillbeyourgithub [email protected]
setup.py
- [5116616] by @thiswillbeyourgithub, 3 weeks ago:
fix: no more need for import_tricks file now that we only import as needed
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/__init__.py
wdoc/utils/import_tricks.py
- [56bd634] by @thiswillbeyourgithub, 3 weeks ago:
feat: add audioop-lts requirement for Python 3.13+
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
setup.py
- [6321942] by @thiswillbeyourgithub, 3 weeks ago:
new: add a wdoc[full] install
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
setup.py
- [6f633e8] by @thiswillbeyourgithub, 3 weeks ago:
fix: output of query and search task
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [8b95a81] by @thiswillbeyourgithub, 3 weeks ago:
fix: query_task and search_task handle
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [824f395] by @thiswillbeyourgithub, 3 weeks ago:
reworked the way we split search and query task
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [319b8eb] by @thiswillbeyourgithub, 3 weeks ago:
refactor: Modularize query_task method with type hints and arguments
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/wdoc.py
- [1d94644] by @thiswillbeyourgithub, 3 weeks ago:
step 1 in splitting search and query
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [8965c48] by @thiswillbeyourgithub, 3 weeks ago:
refactor: move `evaluate_doc_chain` function to `shared_query_search.py`
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/shared_query_search.py
wdoc/wdoc.py
- [3785d12] by @thiswillbeyourgithub, 3 weeks ago:
style: format code with linter
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/shared_query_search.py
- [4bb54a5] by @thiswillbeyourgithub, 3 weeks ago:
refactor: extract query splitting logic into shared utility function
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/shared_query_search.py
wdoc/wdoc.py
- [2a85a6b] by @thiswillbeyourgithub, 3 weeks ago:
fix: error when summary does not output to a file using pipe
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/logger.py
- [0a55d9f] by @thiswillbeyourgithub, 3 weeks ago:
minor: do not disable a tqdm pbar if piping
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [910a423] by @thiswillbeyourgithub, 3 weeks ago:
rewording
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [0ce5f4f] by @thiswillbeyourgithub, 3 weeks ago:
refactor: move `source_replace` function from wdoc.py to query.py
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [c2e6142] by @thiswillbeyourgithub, 3 weeks ago:
feat: add retrieve_documents_for_query function to query tasks
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/query.py
wdoc/utils/tasks/search.py
- [7982051] by @thiswillbeyourgithub, 3 weeks ago:
refactor: Extract retrieve_documents functions to separate task-specific modules
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/search.py
wdoc/wdoc.py
- [1c7c6e4] by @thiswillbeyourgithub, 3 weeks ago:
refactor: Move parse_doc function to utils/tasks/parse.py and set as static method
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/parse.py
wdoc/wdoc.py
- [38e82b4] by @thiswillbeyourgithub, 3 weeks ago:
refactor: move `autoincrease_top_k` function to query.py with updated signature
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/query.py
wdoc/wdoc.py
- [5676336] by @thiswillbeyourgithub, 3 weeks ago:
remove the test_cli.py
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [4170567] by @thiswillbeyourgithub, 3 weeks ago:
refactor: Replace timeout-based debug test with expect-driven interactive debugger handling
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.sh
- [1249cb0] by @thiswillbeyourgithub, 3 weeks ago:
fix test for cli
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.sh
- [3d0f947] by @thiswillbeyourgithub, 3 weeks ago:
more info in the asserts of test_wdoc
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [ffe46bc] by @thiswillbeyourgithub, 3 weeks ago:
try with the shell script for the cli test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/run_all_tests.sh
- [e264ef8] by @thiswillbeyourgithub, 3 weeks ago:
fix: forward reference to wdoc is not possible
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/logger.py
- [cc74a84] by @thiswillbeyourgithub, 3 weeks ago:
tests: chore: add shell-based CLI test script for more reliable testing
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/test_cli.sh
- [a801718] by @thiswillbeyourgithub, 3 weeks ago:
fix: type hint for doc_total_cost
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
- [7f08200] by @thiswillbeyourgithub, 3 weeks ago:
fix: update create_retrievers call with correct arguments in wdoc.py
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/wdoc.py
- [3d68537] by @thiswillbeyourgithub, 3 weeks ago:
more some code to create_retrievers
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/retrievers.py
wdoc/wdoc.py
- [99cc99f] by @thiswillbeyourgithub, 3 weeks ago:
move debug_exceptions function to loggers
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/logger.py
wdoc/wdoc.py
- [847a4a2] by @thiswillbeyourgithub, 3 weeks ago:
typo
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/wdoc.py
- [2bd8dcb] by @thiswillbeyourgithub, 3 weeks ago:
docs: add docstring explaining VectorStore filtering functionality
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/filters.py
- [de4ce57] by @thiswillbeyourgithub, 3 weeks ago:
move the code to filter the docstore to filters.py
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/filters.py
wdoc/wdoc.py
- [fa63119] by @thiswillbeyourgithub, 3 weeks ago:
style: fix linter formatting in wdoc.py
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/wdoc.py
- [79b1661] by @thiswillbeyourgithub, 3 weeks ago:
feat: add all_texts property to lazily cache document texts for retrieval
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/wdoc.py
- [7b45948] by @thiswillbeyourgithub, 3 weeks ago:
refactor: lazily create all_texts property for performance optimization
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/wdoc.py
- [9fc51c0] by @thiswillbeyourgithub, 3 weeks ago:
refactor: use `wdocSummary` dataclass for type hinting in summarization tasks
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [92f5c47] by @thiswillbeyourgithub, 3 weeks ago:
feat: add SummaryResult dataclass for improved summary output handling
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/summarize.py
- [bc5beb8] by @thiswillbeyourgithub, 3 weeks ago:
change order of functions in summarize
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
- [42b1c23] by @thiswillbeyourgithub, 3 weeks ago:
rename function do_summarize to _summarize
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
- [a06f570] by @thiswillbeyourgithub, 3 weeks ago:
docs: improve docstrings for document summarization functions
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/tasks/summarize.py
- [2e36623] by @thiswillbeyourgithub, 3 weeks ago:
refactor: move more of the summary code into utils/task/summarize.py
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/tasks/summarize.py
wdoc/wdoc.py
- [32e036d] by @thiswillbeyourgithub, 3 weeks ago:
fix: ddg test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
wdoc/wdoc.py
- [4d47631] by @thiswillbeyourgithub, 3 weeks ago:
the embedding test of ollama now use embeddingsgemma
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [048f99e] by @thiswillbeyourgithub, 3 weeks ago:
fix: test for ddg
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_wdoc.py
- [ebd4558] by @thiswillbeyourgithub, 3 weeks ago:
fix: imports
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/local_html.py
wdoc/utils/loaders/powerpoint.py
- [af85343] by @thiswillbeyourgithub, 3 weeks ago:
fix: local video loader
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/local_video.py
- [986abd2] by @thiswillbeyourgithub, 3 weeks ago:
fix: html loader
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/html.py
- [7c2e397] by @thiswillbeyourgithub, 3 weeks ago:
by default enable lazy loading of loaders
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/env.py
- [9a5485b] by @thiswillbeyourgithub, 3 weeks ago:
cancel a check=True in test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [3d5edf9] by @thiswillbeyourgithub, 3 weeks ago:
feat: use venv python path consistently in test script
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
tests/run_all_tests.sh
tests/test_cli.py
- [a12ed2a] by @thiswillbeyourgithub, 3 weeks ago:
better test
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
tests/test_wdoc.py
- [45008ac] by @thiswillbeyourgithub, 3 weeks ago:
Merge branch 'refactor-loaders' into dev
- [cac724c] by @thiswillbeyourgithub, 3 weeks ago:
fix: ny time parser check failed
because they return much less text than before I think
Signed-off-by: thiswillbeyourgithub [email protected]
tests/test_cli.py
- [b413edd] by @thiswillbeyourgithub, 3 weeks ago:
fix: online pdf loader should be there
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/online_pdf.py
- [dea0fb0] by @thiswillbeyourgithub, 3 weeks ago:
update the readme
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [a8fad66] by @thiswillbeyourgithub, 3 weeks ago:
run isort on imports
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/anki.py
wdoc/utils/loaders/epub.py
wdoc/utils/loaders/html.py
wdoc/utils/loaders/json_dict.py
wdoc/utils/loaders/local_audio.py
wdoc/utils/loaders/local_video.py
wdoc/utils/loaders/logseq_markdown.py
wdoc/utils/loaders/online_media.py
wdoc/utils/loaders/pdf.py
wdoc/utils/loaders/powerpoint.py
wdoc/utils/loaders/shared.py
wdoc/utils/loaders/shared_audio.py
wdoc/utils/loaders/string.py
wdoc/utils/loaders/text.py
wdoc/utils/loaders/txt.py
wdoc/utils/loaders/url.py
wdoc/utils/loaders/word.py
wdoc/utils/loaders/youtube.py
- [37eb563] by @thiswillbeyourgithub, 3 weeks ago:
fix: always use absolute import path
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/youtube.py
- [4e61a6f] by @thiswillbeyourgithub, 3 weeks ago:
fix: import inside online_media
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/online_media.py
- [20b3df3] by @thiswillbeyourgithub, 3 weeks ago:
fix: import a loader
Signed-off-by: thiswillbeyourgithub [email protected]
scripts/MediaURLFinder/media_url_finder.py
- [df1a0ad] by @thiswillbeyourgithub, 3 weeks ago:
split the rest of the loaders into files
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/epub.py
wdoc/utils/loaders/html.py
wdoc/utils/loaders/json_dict.py
wdoc/utils/loaders/local_audio.py
wdoc/utils/loaders/local_video.py
wdoc/utils/loaders/logseq_markdown.py
wdoc/utils/loaders/online_media.py
wdoc/utils/loaders/powerpoint.py
wdoc/utils/loaders/string.py
wdoc/utils/loaders/text.py
wdoc/utils/loaders/txt.py
wdoc/utils/loaders/word.py
wdoc/utils/loaders/youtube.py
- [d3ed873] by @thiswillbeyourgithub, 3 weeks ago:
move load_html to its own file
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/html.py
- [f0a3fce] by @thiswillbeyourgithub, 3 weeks ago:
split the audio loaders into files
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/batch_file_loader.py
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/local_audio.py
wdoc/utils/loaders/shared.py
wdoc/utils/loaders/shared_audio.py
wdoc/utils/loaders/youtube.py
wdoc/utils/misc.py
- [f70849d] by @thiswillbeyourgithub, 3 weeks ago:
fix: typo
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [e10fb05] by @thiswillbeyourgithub, 3 weeks ago:
pip8
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/misc.py
- [f124384] by @thiswillbeyourgithub, 3 weeks ago:
minor
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [8da4d11] by @thiswillbeyourgithub, 3 weeks ago:
fix missing import
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [51461fb] by @thiswillbeyourgithub, 3 weeks ago:
loadable filetype is now a list
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [7abeb64] by @thiswillbeyourgithub, 3 weeks ago:
rename some loader for consistency
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [b249068] by @thiswillbeyourgithub, 3 weeks ago:
move load_url to url.py
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/shared.py
wdoc/utils/loaders/url.py
- [7fc5fad] by @thiswillbeyourgithub, 3 weeks ago:
feat: add lazy loading of imports
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/docs/help.md
wdoc/utils/env.py
wdoc/utils/loaders/__init__.py
- [ce10c4b] by @thiswillbeyourgithub, 3 weeks ago:
lazy loading import function
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [984a8d3] by @thiswillbeyourgithub, 3 weeks ago:
move anki loader to anki.py
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/anki.py
- [0875b23] by @thiswillbeyourgithub, 3 weeks ago:
fix typing
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/pdf.py
wdoc/utils/loaders/shared.py
- [def441f] by @thiswillbeyourgithub, 3 weeks ago:
moved pdf loaders to loaders/pdf.py
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
wdoc/utils/loaders/pdf.py
wdoc/utils/loaders/shared.py
- [fb421cc] by @thiswillbeyourgithub, 3 weeks ago:
moved loaders.py to loaders/__init__.py
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/loaders/__init__.py
- [9516d84] by @thiswillbeyourgithub, 5 weeks ago:
typo
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [fb611c4] by @thiswillbeyourgithub, 5 weeks ago:
doc: clarify that binary embeddings are not always better
Signed-off-by: thiswillbeyourgithub [email protected]
README.md
- [fdc3c64] by @thiswillbeyourgithub, 8 weeks ago:
doc: add link to explain the fixed cache of llm issue
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py
- [691dbff] by @thiswillbeyourgithub, 8 weeks ago:
feat: add beartype runtime type validation for numpy arrays and numeric inputs
Co-authored-by: aider (openrouter/anthropic/claude-sonnet-4) [email protected]
wdoc/utils/customs/binary_faiss_vectorstore.py
- [f049846] by @thiswillbeyourgithub, 8 weeks ago:
new: prioritize throughput and groq if using openrouter
Signed-off-by: thiswillbeyourgithub [email protected]
wdoc/utils/llm.py