Release 4.0.0

@thiswillbeyourgithub thiswillbeyourgithub released this 05 Oct 16:36
· 184 commits to main since this release

What's new

This release focuses on major performance improvements through lazy loading and deferred imports, extensive code refactoring for better maintainability, and improved testing infrastructure.

⚡ Performance

🔧 Fixes

♻️ Refactoring

  • Modularized loaders: Split monolithic loader file into separate modules [df1a0ad, d3ed873, f0a3fce, b249068, 984a8d3, def441f, fb421cc]
    • Created dedicated files for PDF, Anki, URL, audio, HTML, and other loaders
    • Enabled lazy loading of loader modules [7fc5fad]
  • Extracted task-specific functions to separate modules:
    • Moved parse_doc to utils/tasks/parse.py [1c7c6e4]
    • Moved query/search retrieval logic to task modules [7982051, c2e6142]
    • Moved evaluate_doc_chain to shared_query_search.py [8965c48]
    • Extracted query splitting logic to shared utility [4bb54a5]
    • Moved source_replace to query.py [0ce5f4f]
    • Moved autoincrease_top_k to query.py [38e82b4]
  • Split search and query task methods with better type hints [1d94644, 824f395, 319b8eb]
  • Moved debug_exceptions to logger module [99cc99f]
  • Moved VectorStore filtering code to filters.py [de4ce57]
  • Added wdocSummary dataclass for type hinting [9fc51c0, 92f5c47]
  • Added lazy caching for all_texts property [79b1661, 7b45948]
  • Removed obsolete import_tricks.py [5116616]
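The lazy loading of loader modules mentioned above can be sketched roughly as follows. This is a minimal illustration, not wdoc's actual code: the registry, its keys, and the `get_loader` helper are hypothetical, and standard-library modules stand in for the per-format loader modules so the sketch runs anywhere.

```python
import importlib
from functools import lru_cache

# Hypothetical registry: filetype -> (module path, callable name).
# Stdlib modules stand in for loader modules (assumption for illustration).
LOADER_REGISTRY = {
    "json": ("json", "loads"),
    "csv": ("csv", "reader"),
}


@lru_cache(maxsize=None)
def get_loader(filetype: str):
    """Import the module for `filetype` on first use and return its callable."""
    try:
        module_path, func_name = LOADER_REGISTRY[filetype]
    except KeyError:
        raise ValueError(f"No loader registered for filetype {filetype!r}") from None
    # The import only happens here, when the loader is actually requested.
    module = importlib.import_module(module_path)
    return getattr(module, func_name)


if __name__ == "__main__":
    load_json = get_loader("json")
    print(load_json('{"a": 1}'))
```

With this layout, a module is imported only when a document of the matching type is requested, so startup does not pay for loaders that are never used.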

🧪 Testing

  • Improved test cleanup and temp folder removal [a768642, 35ef63e, c149f5d, 913378a]
  • Improved verbose output in cost tests [342ad3f]
  • Switched OpenRouter API tests to Mistral (zero data retention) [8f511dd]
  • Added shell-based CLI test script for more reliable testing [cc74a84, 4170567]
  • Added check for wdoc[full] installation [7cb9a3c]
  • Updated Ollama embedding test to use embeddinggemma [4d47631]
  • Improved test assertions with more info [3d0f947]
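A check that optional dependencies are installed, like the wdoc[full] check listed above, could look something like the sketch below. The helper name `missing_modules` and the module names passed to it are assumptions for illustration; the real list of optional packages is not given in these notes.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found, without importing top-level modules."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for dotted names whose parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list: replace with the packages an extra is expected to pull in.
    print(missing_modules(["json", "surely_not_a_real_module_xyz"]))
```

A test suite can call such a helper once at startup and skip or fail early when the list is not empty.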

📦 Dependencies

  • Bumped langchain version [98fd2cb]
  • Bumped litellm version [7aa2ce1]
  • Bumped langfuse version (litellm bug fix) [fc16e5e]
  • Updated general dependencies [616457c]
  • Added unstructured to required dependencies [c98d0e9]
  • Added bumpver to dev packages [54be0e2]

✨ Features

  • Added wdoc[full] installation option for all optional dependencies [6321942]
  • Added beartype runtime type validation for numpy arrays [691dbff]
  • Prioritized throughput and Groq when using OpenRouter [f049846]
  • Enabled lazy loading of imports by default [7c2e397]
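For readers unfamiliar with deferred imports, the general technique can be sketched with the standard library's `importlib.util.LazyLoader`. This is an illustration of the idea only, and may differ from the mechanism wdoc uses; the `lazy_import` helper is a hypothetical name.

```python
import importlib.util
import sys
from types import ModuleType


def lazy_import(name: str) -> ModuleType:
    """Return module `name`, deferring its execution until first attribute access."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only marks the module as lazy, nothing runs yet
    return module


if __name__ == "__main__":
    colorsys = lazy_import("colorsys")  # cheap: module body not executed yet
    print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first access triggers the real import
```

The trade-off is that import errors surface at first use rather than at startup, which is one reason a project might expose lazy loading as a default that can be switched off.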

📝 Documentation

  • Updated default models to latest Gemini in README and help [761ddd1, 0868086, 78e562f]
  • Clarified that binary embeddings are not always better [fb611c4]
  • Added a link explaining the fixed LLM cache issue [fdc3c64]
  • Improved docstrings for summarization functions [a06f570]
  • Added docstring for VectorStore filtering [2bd8dcb]

🎨 Code Quality

🔖 Version

  • Bumped version 3.3.1 → 4.0.0 [e1548c4]
