Langroid

Latest version: v0.50.0

0.50.0

feat: new splitter: structure-aware markdown

Features:
- `ParsingConfig.splitter` has a new option, `Splitter.Markdown`, which is now the default. It works for plain text (which is by definition
  markdown) as well as markdown text. It implements "structure-aware" chunking, which means it:
  - tries to keep entire sections as chunks if they are not too big (relative to the chunking configs)
  - recursively splits large sections, avoiding breaking paragraphs where possible; if that is not feasible, it avoids breaking sentences,
    and only breaks sentences as a last resort
  - enriches chunks with the headers from enclosing sections to improve the match surface during retrieval
- `DocChatAgent` now uses this splitter by default
- Crawlers in `URLLoader`:
  - `TrafilaturaCrawlerConfig.format` can be set to one of 3 values:
    - `"markdown"` (default) - extracts content from the page in markdown format
    - `"txt"` - extracts content as plain text
    - `"xml"` - extracts text with html tags; the output is converted to markdown using the `markdownify` lib
  - `ExaCrawler` now extracts content as html, which is then converted to markdown using `markdownify`
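
The structure-aware chunking described above can be illustrated with a small standalone sketch. This is not Langroid's implementation: the function names (`parse_sections`, `split_text`, `chunk_markdown`), the word-based size limit, and the `" > "` header separator are all assumptions chosen for illustration. It only shows the general idea: keep a section whole when it fits, otherwise fall back to paragraph, then sentence, then word boundaries, and prefix each chunk with its enclosing headers.

```python
import re
from typing import List, Tuple

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def parse_sections(text: str) -> List[Tuple[List[str], str]]:
    """Return (header_path, body) pairs, one per non-empty markdown section."""
    sections: List[Tuple[List[str], str]] = []
    path: List[Tuple[int, str]] = []  # stack of (level, title)
    body: List[str] = []

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:
            sections.append(([title for _, title in path], content))
        body.clear()

    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Pop headers at the same or deeper level before pushing this one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return sections


def split_text(text: str, max_words: int) -> List[str]:
    """Split text into pieces of at most max_words words, preferring
    paragraph boundaries, then sentence boundaries, then word boundaries."""
    text = text.strip()
    if len(text.split()) <= max_words:
        return [text] if text else []
    for pattern, joiner in ((r"\n\s*\n", "\n\n"), (r"(?<=[.!?])\s+", " ")):
        parts = [p.strip() for p in re.split(pattern, text) if p.strip()]
        if len(parts) < 2:
            continue  # this boundary type does not occur; try a finer one
        chunks: List[str] = []
        current: List[str] = []
        count = 0
        for part in parts:
            n = len(part.split())
            if current and count + n > max_words:
                chunks.append(joiner.join(current))
                current, count = [], 0
            if n > max_words:
                # A single unit is still too big: recurse to finer boundaries.
                chunks.extend(split_text(part, max_words))
            else:
                current.append(part)
                count += n
        if current:
            chunks.append(joiner.join(current))
        return chunks
    # Last resort: break inside a sentence at word boundaries.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def chunk_markdown(text: str, max_words: int = 50) -> List[str]:
    """Chunk markdown by section, prefixing each chunk with its header path."""
    chunks: List[str] = []
    for headers, body in parse_sections(text):
        prefix = " > ".join(headers)
        for piece in split_text(body, max_words):
            chunks.append(f"{prefix}\n\n{piece}" if prefix else piece)
    return chunks
```

For example, calling `chunk_markdown(doc, max_words=5)` on a document with a `# Guide` section and a nested `## Setup` section yields chunks whose first line is `Guide` or `Guide > Setup`, so a retrieval query that mentions "setup" can match a chunk even when the chunk body itself does not contain that word. The sketch ignores fenced code blocks, so a `#` line inside a fence would be treated as a header.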

0.49.1

feat(minor):

- `OpenAIGPT`: raise an exception early when the stream response is invalid
- `DocChatAgent` source citations: include date and title if available
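
As a rough illustration of citations that include a title and date only when they are available, here is a minimal sketch. The `Meta` class and `format_citation` function are hypothetical stand-ins, not Langroid's classes or output format; the `source` field and the citation layout are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Meta:
    """Hypothetical stand-in for document metadata; not Langroid's class."""

    source: str
    title: Optional[str] = None
    published_date: Optional[str] = None


def format_citation(index: int, meta: Meta) -> str:
    """Build a citation line, adding title and date only when they are set."""
    parts = [meta.source]
    if meta.title:
        parts.append(f'"{meta.title}"')
    if meta.published_date:
        parts.append(meta.published_date)
    return f"[{index}] " + ", ".join(parts)
```

A document with only a source produces a citation with just that source, while one that also carries a title and date gets both appended.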

0.49.0

feat: easily switch to using LiteLLM proxy using `litellm-proxy/` prefix

See [docs](https://langroid.github.io/langroid/notes/litellm-proxy/)

0.48.3

fix: remove non-empty string fields enforcement in `DocMetaData`

This was done mainly to accommodate LanceDB's quirkiness where it converts empty string values to None.

So users of LanceDB should take care not to set these fields to empty values.
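
One way to follow this advice is to replace empty strings before the metadata is stored. The helper below is a hypothetical sketch, not part of Langroid or LanceDB; the function name and the `"unknown"` placeholder are assumptions.

```python
from typing import Dict, Optional


def fill_empty_fields(
    metadata: Dict[str, Optional[str]], placeholder: str = "unknown"
) -> Dict[str, Optional[str]]:
    """Return a copy where empty-string values are replaced by a placeholder.

    None values are left alone: only "" is rewritten, since that is the value
    described as being converted to None by the store.
    """
    return {
        key: placeholder if value == "" else value
        for key, value in metadata.items()
    }
```

With this, a field such as `title` that was set to `""` is stored as `"unknown"`, so it does not come back as `None` after a round trip.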

0.48.2

fix: URLLoader should set non-empty `title` and `published_date` in `DocMetaData`

0.48.1

fix: `URLLoader` sets the parser when needed

plus: new fields in `DocMetaData`: `title`, `published_date` - useful for citations
