What's new
New Features
- **Shell Command Testing**
- Added shell command test for NYTimes parsing with content validation. [d3832f2d]
Fixes
- **Intermediate Merging**
- Fixed an error when merging batches whose intermediate answers had grown too large for the model to merge; they are now concatenated in semantic order instead. [d06cbb36]
- Allowed one more trial when merging batches. [573e15fa]
- **Parsing Reliability**
- Stopped returning intermediately parsed output when parsing fails. [78c93640]
- `eval_llm` calls are now always retried when the output is unparseable or comes back with `finish_reason=length`. [763e9b4b]
- **Backend and Output**
- Resolved backend errors and edge case issues breaking summary generation. [bf00ced0, 89c01de3]
- No longer crashes when `import_mode` is used at the same time as `out_file`. [ee416ece]
- **Testing Corrections**
- Corrected several test functions so that expected outcomes align with the API changes. [bfcba71a, 292ce90b]
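
The retry and merge fixes listed above follow a common pattern that can be sketched in a few lines. This is only an illustration under stated assumptions: `Reply`, `call_with_retry`, `merge_or_concatenate`, and the default of three trials are hypothetical names and values, not wdoc's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reply:
    """Hypothetical stand-in for an LLM response."""
    text: str
    finish_reason: str  # e.g. "stop" or "length"


def call_with_retry(ask: Callable[[], Reply],
                    parse: Callable[[str], str],
                    max_trials: int = 3) -> str:
    """Retry when the reply is truncated or cannot be parsed."""
    last_error: Exception = RuntimeError("no trial was made")
    for _ in range(max_trials):
        reply = ask()
        if reply.finish_reason == "length":
            # The reply was cut off, so parsing it would be unreliable.
            last_error = RuntimeError("reply truncated (finish_reason=length)")
            continue
        try:
            return parse(reply.text)
        except ValueError as err:
            # Never return a half-parsed result: remember the error and retry.
            last_error = err
    raise last_error


def merge_or_concatenate(answers: List[str],
                         merge: Callable[[List[str]], str]) -> str:
    """Try to merge; if merging fails, join the answers in the given order."""
    try:
        return merge(answers)
    except (RuntimeError, ValueError):
        # Assumes `answers` is already sorted in semantic order.
        return "\n\n".join(answers)
```

The fallback trades answer quality for cost: concatenation keeps every intermediate answer without another model call.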
Documentation
- **General Updates**
- Enhanced walkthrough formatting for improved readability in documentation. [2ec7fad9]
- Provided context about the origins of wdoc in README. [31e6c5d5]
- **Example and Help Docs**
- Updated examples documentation to remove confusing import_mode arguments. [cef5cdf1]
- Updated the help documentation for `out_file`, which can now be specified for queries as well as summaries. [a289759c, f1a6294b]
Improvements
- **Configuration and Setup Adjustments**
- The post-install script now tries to use `uv` and to install python-magic from git. [7a551432, 4ed0a350]
- **Performance and Debugging Enhancements**
- Bumped max_tokens for intermediate answers from 1000 to 4000 to accommodate larger outputs during processing. [df059b61]
- Stored original strings before parsing for effective debugging. [6ec39576]
- Addressed deprecation warnings to keep up with latest standards. [15e4793c]
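
As a rough illustration of preferring `uv` when it is present, the sketch below picks an install command based on whether a `uv` executable is found on `PATH`. The function name `build_install_command` and its `which` parameter are hypothetical and are not taken from wdoc's `setup.py`.

```python
import shutil
import sys
from typing import Callable, List, Optional


def build_install_command(
    package: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a `uv pip install` command if uv is found, else a pip one."""
    uv_path = which("uv")  # None when uv is not on PATH
    if uv_path:
        return [uv_path, "pip", "install", package]
    # Fall back to the pip of the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", package]
```

The returned list can be passed to `subprocess.run`; injecting `which` keeps the function easy to test without having `uv` installed.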
Minor Changes
- **Code and Debug Tune-ups**
- Moved the semantic batching test to the API section and mark, and removed an unused test argument. [7e5e4ce7, b7f1a1f0, d79b4c51]
- Emphasized the `txt` versus `text` filetype distinction in the help documentation. [d620f879]
- **Enhanced wdoc Docs Via SVG Files (WIP)**
- Improved the SVG diagram and documentation for the summary algorithm. [b5f49a96]
- Added an SVG visualization of the summary algorithm and simplified it with a more intuitive circular design. [9489a8af, 703dde08]
Commit details since the last release
- [d06cbb36] by thiswillbeyourgithub, 34 minutes ago:
fix: error when merging batch when intermediate answers got so large the model can't merge them anymore
We just concatenate them using semantic order and that will be good
enough, the alternative is too expensive
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [573e15fa] by thiswillbeyourgithub, 35 minutes ago:
fix: one more trial given to merge batches
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [7edbe1f8] by thiswillbeyourgithub, 54 minutes ago:
doc: add helpful debug message if abrupt message tail
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/misc.py
- [df059b61] by thiswillbeyourgithub, 55 minutes ago:
new: bump max_token for intermediate answer from 1000 to 4000
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [78c93640] by thiswillbeyourgithub, 3 hours ago:
fix: don't return intermediately parsed output if parsing fails
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/misc.py
- [6ec39576] by thiswillbeyourgithub, 3 hours ago:
minor: store the original string before parsing to help debugging
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/misc.py
- [ee1c8571] by thiswillbeyourgithub, 3 hours ago:
minor: better order of the output price prints
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [cc49037a] by thiswillbeyourgithub, 3 hours ago:
fix: out_file test
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [cef5cdf1] by thiswillbeyourgithub, 3 hours ago:
fix: forgot to remove import_mode args from examples
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/examples.md
- [d3832f2d] by thiswillbeyourgithub (aider), 3 hours ago:
feat: Add shell command test for NYTimes parsing with content validation
tests/test_wdoc.py
- [ee416ece] by thiswillbeyourgithub, 3 hours ago:
new: don't crash if using import_mode at the same time as out_file
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [32ecdbde] by thiswillbeyourgithub, 4 hours ago:
test: remove unused debug and verbose args
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [a13d20a2] by thiswillbeyourgithub, 4 hours ago:
new: remove confusing arg 'import_mode' and set it automatically depending on if imported or launched from cli
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
README.md
scripts/AnkiFiltered/AnkiFilteredDeckCreator.py
scripts/TheFiche/TheFiche.py
tests/test_wdoc.py
wdoc/__main__.py
wdoc/docs/help.md
wdoc/wdoc.py
- [763e9b4b] by thiswillbeyourgithub, 4 hours ago:
fix: now if eval_llm returns something unparsable or with finish_reason=length we always retry
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [15e4793c] by thiswillbeyourgithub, 4 hours ago:
minor: address deprecation warnings
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/tasks/query.py
- [b7f1a1f0] by thiswillbeyourgithub, 4 hours ago:
test: set semantic batching test to api mark
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [7e5e4ce7] by thiswillbeyourgithub, 4 hours ago:
test: move semantic batching test to the api section
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [e5530b64] by thiswillbeyourgithub, 5 hours ago:
test: add test for mistral embeddings
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [d79b4c51] by thiswillbeyourgithub, 5 hours ago:
fix: remove unused arg in tests
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [13ad2aab] by thiswillbeyourgithub, 5 hours ago:
test: ollama should be an api mark not basic
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [d3679b3a] by thiswillbeyourgithub, 5 hours ago:
minor: sort pytest by mark
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [bfcba71a] by thiswillbeyourgithub, 6 hours ago:
fix test
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [292ce90b] by thiswillbeyourgithub, 6 hours ago:
fix: test of query
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [740dc25e] by thiswillbeyourgithub, 6 hours ago:
fix: test of out_file
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [ff737110] by thiswillbeyourgithub, 6 hours ago:
fix: summary test
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [3b292ea0] by thiswillbeyourgithub, 6 hours ago:
fix: remove unused arg in tests
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [bf00ced0] by thiswillbeyourgithub, 6 hours ago:
fix: edge case was breaking summary
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [89c01de3] by thiswillbeyourgithub, 6 hours ago:
fix: backend error in one edge case
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [ebcf92fb] by thiswillbeyourgithub (aider), 6 hours ago:
feat: Change default query relevancy threshold to -0.5
wdoc/docs/help.md
wdoc/wdoc.py
- [443aab4c] by thiswillbeyourgithub, 7 hours ago:
fix: query_task arg is actually optional
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [5847cd36] by thiswillbeyourgithub, 7 hours ago:
fix: missing var if only one document present
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [31e6c5d5] by thiswillbeyourgithub (aider), 12 hours ago:
docs: Add context about medical student's motivation for creating wdoc
README.md
- [2ec7fad9] by thiswillbeyourgithub (aider), 14 hours ago:
style: Update walkthrough formatting to use triple backticks for code blocks
wdoc/docs/examples.md
- [c06a8491] by thiswillbeyourgithub, 15 hours ago:
update roadmap
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
README.md
- [7d1ba8bd] by thiswillbeyourgithub, 15 hours ago:
fix: link to examples
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
README.md
- [d003d8ed] by thiswillbeyourgithub, 25 hours ago:
fix: ongoing fix for the summary test
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [3a5f805c] by thiswillbeyourgithub, 25 hours ago:
fix: tests for api were wrong
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [17e6a0fa] by thiswillbeyourgithub, 26 hours ago:
fix: test using out_file
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [50325e99] by thiswillbeyourgithub, 26 hours ago:
fix: dont read from stdin if pytest is imported
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/misc.py
- [53370050] by thiswillbeyourgithub, 26 hours ago:
minor
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
tests/test_wdoc.py
- [14735c8b] by thiswillbeyourgithub, 26 hours ago:
minor
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/batch_file_loader.py
- [4ed0a350] by thiswillbeyourgithub, 26 hours ago:
new: the post install script now tries to install python-magic from git
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
setup.py
- [7a551432] by thiswillbeyourgithub, 26 hours ago:
new: try to use uv for the PostInstall script
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
setup.py
- [7c705562] by thiswillbeyourgithub, 26 hours ago:
docs: minor
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
README.md
- [6d0c42cb] by thiswillbeyourgithub, 26 hours ago:
feat: allow using shell pipes
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
README.md
setup.py
wdoc/__main__.py
wdoc/docs/examples.md
wdoc/utils/misc.py
- [d620f879] by thiswillbeyourgithub, 27 hours ago:
docs: insist on txt vs text filetype
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/help.md
- [a0c54951] by thiswillbeyourgithub, 29 hours ago:
minor
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/__main__.py
- [83df9c5d] by thiswillbeyourgithub, 30 hours ago:
new: use Literal type hint for audio backend
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/loaders.py
- [07128045] by thiswillbeyourgithub (aider), 2 days ago:
test: Add API test for summary out_file argument
tests/test_wdoc.py
- [a289759c] by thiswillbeyourgithub (aider), 2 days ago:
docs: Update documentation for out_file argument with query use case
wdoc/docs/help.md
- [f1a6294b] by thiswillbeyourgithub, 2 days ago:
feat: allow out_file to be specified for query too
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/wdoc.py
- [0b524a79] by thiswillbeyourgithub, 2 days ago:
harmonize the way out_file is handled
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/utils/misc.py
wdoc/wdoc.py
- [d0274309] by thiswillbeyourgithub, 2 days ago:
put unfinished svg into some folder
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/svg/query_animation.html
wdoc/docs/svg/query_rag.md
wdoc/docs/svg/summary.svg
- [44f36f08] by thiswillbeyourgithub, 2 days ago:
add
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
query_rag.md
- [4a673b09] by thiswillbeyourgithub (aider), 2 days ago:
refactor: Reposition step 4 and update process flow arrows in summary diagram
wdoc/docs/summary.svg
- [4ca2bdd2] by thiswillbeyourgithub (aider), 2 days ago:
fix: Swap step 3 and 4 labels in summary SVG diagram
wdoc/docs/summary.svg
- [87035327] by thiswillbeyourgithub, 2 days ago:
better svg
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/summary.svg
- [b5f49a96] by thiswillbeyourgithub (aider), 2 days ago:
feat: Improve wdoc summary algorithm SVG diagram and documentation
summary_rag.md
wdoc/docs/summary.svg
- [204ca0f7] by thiswillbeyourgithub (aider), 2 days ago:
refactor: Update SVG flow and remove distracting icons
wdoc/docs/summary.svg
- [703dde08] by thiswillbeyourgithub (aider), 2 days ago:
feat: Simplify SVG cycle with more intuitive circular design
wdoc/docs/summary.svg
- [fc83dd31] by thiswillbeyourgithub (aider), 2 days ago:
feat: Enhance SVG visualization with improved design, color scheme, and intuitive flow representation
wdoc/docs/summary.svg
- [9489a8af] by thiswillbeyourgithub (aider), 2 days ago:
feat: Add SVG visualization of wdoc summary algorithm
wdoc/docs/summary.svg
- [3e975752] by thiswillbeyourgithub (aider), 2 days ago:
fix: Encode '>' as HTML entity in query_animation.html
wdoc/docs/query_animation.html
- [37c8244d] by thiswillbeyourgithub (aider), 2 days ago:
refactor: Convert dynamic SVG animation to static HTML flow diagram
wdoc/docs/query_animation.html
- [37469788] by thiswillbeyourgithub, 2 days ago:
remove unused arrows
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/query_animation.html
- [c9dd00c2] by thiswillbeyourgithub, 2 days ago:
add query animation
Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithubusers.noreply.github.com>
wdoc/docs/query_animation.html
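
Two of the commits above (6d0c42cb, "allow using shell pipes", and 50325e99, "dont read from stdin if pytest is imported") suggest a small guard around reading standard input. The following is a minimal sketch of that idea, not wdoc's implementation: the function name `read_piped_input` and its parameters are hypothetical.

```python
import sys
from typing import Optional, TextIO


def read_piped_input(stream: Optional[TextIO] = None,
                     under_pytest: Optional[bool] = None) -> Optional[str]:
    """Return piped text, or None when stdin should not be read."""
    if under_pytest is None:
        # Assumption: pytest being imported means we are inside a test run.
        under_pytest = "pytest" in sys.modules
    if under_pytest:
        return None
    stream = sys.stdin if stream is None else stream
    # An interactive terminal (or a missing stdin) means nothing was piped in.
    if stream is None or stream.isatty():
        return None
    return stream.read()
```

Skipping the read under pytest avoids blocking or failing on stdin, which test runners usually capture or replace.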