LangCheck

Latest version: v0.8.0


0.8.0

Breaking changes
- Updated the minimum supported Python version to 3.9.
- All prompts for our built-in LLM-based metrics have been updated. Each prompt now ends with `Output your thought process first, and then provide your final answer` to ensure that LLM evaluators actually perform chain-of-thought reasoning. This may also affect output scores.
- Fixed a typo in a module name: `langcheck.utils.progess_bar` has been renamed to `langcheck.utils.progress_bar`.
- Updated the default prompts for `langcheck.en.toxicity` and `langcheck.ja.toxicity`. Refer to #136 for a comparison with the original prompts. You can fall back to the old prompts by passing `eval_prompt_version="v1"` as an argument.
- Updated the arguments for `langcheck.augment.rephrase`. It now takes an `EvalClient` instead of OpenAI parameters directly.

New Features
- Added [langcheck.metrics.custom_text_quality](https://langcheck.readthedocs.io/en/latest/langcheck.metrics.custom_text_quality.html#module-langcheck.metrics.custom_text_quality). With the functions in this module, you can build your own LLM-based metrics with custom prompts (see the first sketch after this list, and the documentation for details).
- Added support for some local LLMs as evaluators:
  - [LlamaEvalClient](https://langcheck.readthedocs.io/en/latest/langcheck.metrics.eval_clients.html#langcheck.metrics.eval_clients.LlamaEvalClient)
  - [PrometheusEvalClient](https://langcheck.readthedocs.io/en/latest/langcheck.metrics.eval_clients.html#langcheck.metrics.eval_clients.PrometheusEvalClient)
- Added new text augmentations:
  - `jailbreak_templates` augmentation with the following templates:
    - `basic`, `chatgpt_dan`, `chatgpt_good_vs_evil`, `john`, and `universal_adversarial_suffix` (EN)
    - `basic`, `chatgpt_good_vs_evil`, and `john` (JA)
  - `payload_splitting` (EN, JA)
  - `to_full_width` (EN)
  - `conv_kana` (JA)
- Added new LLM-based built-in metrics for both EN & JA:
  - `answer_correctness`
  - `answer_safety`
  - `personal_data_leakage`
  - `hate_speech`
  - `adult_content`
  - `harmful_activity`
- Added "Simulated Annotators", a confidence estimation method proposed in the paper [Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement](https://arxiv.org/abs/2407.18370). You can enable it by passing `calculated_confidence=True` to `langcheck.metrics.en.pairwise_comparison` (see the second sketch below).
- Added support for embedding-based metrics (e.g. `semantic_similarity`) with async OpenAI-based eval clients.
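For example, building a custom "simplicity" metric with the new module might look roughly like the following. This is a minimal sketch based on the linked documentation: the `custom_evaluator` name, its exact parameters, and the `simplicity_template.j2` template file are assumptions here, so check the module docs for the real interface.

```python
from langcheck.metrics.custom_text_quality import custom_evaluator
from langcheck.metrics.eval_clients import OpenAIEvalClient

# Assumes OPENAI_API_KEY is set in the environment.
eval_client = OpenAIEvalClient()

# Map each assessment the evaluator LLM can answer with to a numeric score.
score_map = {"Simple": 1.0, "Complex": 0.0}

# "simplicity_template.j2" is a hypothetical Jinja2 prompt template that asks
# the evaluator to classify each output as "Simple" or "Complex".
simplicity = custom_evaluator(
    generated_outputs=["LangCheck makes it easy to evaluate LLM outputs."],
    prompts=None,
    sources=None,
    reference_outputs=None,
    eval_model=eval_client,
    metric_name="simplicity",
    score_map=score_map,
    template_path="simplicity_template.j2",
    language="en",
)
print(simplicity)
```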

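And a sketch of the Simulated Annotators option. Only `calculated_confidence=True` and the function's location come from these notes; the remaining argument names follow LangCheck's usual pairwise-comparison interface and should be confirmed against the docs.

```python
from langcheck.metrics.en import pairwise_comparison
from langcheck.metrics.eval_clients import OpenAIEvalClient

eval_client = OpenAIEvalClient()  # assumes OPENAI_API_KEY is set

result = pairwise_comparison(
    generated_outputs_a=["Paris is the capital of France."],
    generated_outputs_b=["It might be Lyon, or possibly Paris."],
    prompts=["What is the capital of France?"],
    eval_model=eval_client,
    # Attach a Simulated Annotators confidence estimate to each judgment.
    calculated_confidence=True,
)
print(result)
```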

Bug Fixes
- Added error handling to `OpenAIEvalClient` and `GeminiEvalClient` so that they return `None` instead of raising an exception when the function calling step fails.
- Updated `langcheck.metrics.pairwise_comparison` to accept lists containing `None` as source texts.
- Fixed an error in `langcheck.augment.synonym` caused by a missing `nltk` package.
- Fixed an issue with decoding UTF-8 text in some environments.
- Fixed typos in documentation.


**Full Changelog**: https://github.com/citadel-ai/langcheck/compare/v0.7.1...v0.8.0

0.7.0

Breaking Changes

The interface for OpenAI metrics has been changed in LangCheck v0.7.0. See [Computing Metrics with Remote LLMs](https://langcheck.readthedocs.io/en/latest/metrics.html#computing-metrics-with-remote-llms) for examples.

New Features
* Added support for computing metrics with Claude & Gemini in addition to OpenAI
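For instance, computing a metric with Claude might look like this minimal sketch; the `AnthropicEvalClient` class name and the `eval_model` parameter are assumptions based on the linked guide, so confirm the exact interface there.

```python
import langcheck
from langcheck.metrics.eval_clients import AnthropicEvalClient

# Assumes ANTHROPIC_API_KEY is set in the environment.
client = AnthropicEvalClient()

# Use Claude as the evaluator for an LLM-based metric.
result = langcheck.metrics.en.fluency(
    ["LangCheck provides simple building blocks to evaluate LLM applications."],
    eval_model=client,
)
print(result)
```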

Bug Fixes
* Fixed an issue in VSCode dev containers for Linux ARM64 devices

**Full Changelog**: https://github.com/citadel-ai/langcheck/compare/v0.6.0...v0.7.0

0.6.0

Breaking Changes

* To simplify installation, LangCheck v0.6.0 now splits the package's metrics by language:
  * `pip install langcheck` installs only the English metrics
  * `pip install langcheck[all]` installs metrics for all languages
  * `pip install langcheck[ja]` installs only the Japanese and English metrics (use `[de]` for German and `[zh]` for Chinese)
  * See [the "Installation" guide](https://langcheck.readthedocs.io/en/latest/installation.html) for more details.

New Features
* [Added the `pairwise_comparison()` metric](https://langcheck.readthedocs.io/en/latest/metrics.html#pairwise-text-quality-metrics) to automatically rank the outputs of two models!
* Enabled running OpenAI metrics asynchronously. To do this, set the `use_async` parameter on OpenAI metric functions such as [`factual_consistency()`](https://langcheck.readthedocs.io/en/latest/langcheck.metrics.en.source_based_text_quality.html#langcheck.metrics.en.source_based_text_quality.factual_consistency), as sketched below.
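A rough sketch of the async option under the v0.6.0-era OpenAI metric interface (later reworked in v0.7.0). Only `use_async` comes from these notes; the `model_type="openai"` argument is an assumption about that older interface, so confirm against the v0.6.0 docs.

```python
import langcheck

# v0.6.0-era interface: select the OpenAI evaluator via model_type and let
# requests run concurrently with use_async (assumes OPENAI_API_KEY is set).
result = langcheck.metrics.factual_consistency(
    generated_outputs=["LangCheck truncates overly long inputs automatically."],
    sources=["Inputs longer than the model accepts are automatically truncated."],
    model_type="openai",
    use_async=True,
)
print(result)
```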

Bug Fixes
* Fixed the local Japanese, German, and Chinese `factual_consistency()` metrics so they no longer throw an exception for long inputs (inputs longer than the model accepts are now truncated automatically)
* Updated the "Pytest" status badge in README.md
* Refactored the implementation of local metrics

**Full Changelog**: https://github.com/citadel-ai/langcheck/compare/v0.5.0...v0.6.0

0.5.0

Breaking Changes

None

New Features
* Launched LangCheck metrics for Chinese (`langcheck.metrics.zh`) 🇨🇳 (see the sketch after this list) - thanks Vela-zz!
* New Model Manager architecture to unify the management of local evaluation models - thanks Vela-zz!
* Updated the default OpenAI embedding model to `text-embedding-3-small` - thanks bioerrorlog!
* [New tutorial for LangCheckChat](https://langcheck.readthedocs.io/en/latest/tutorial_langcheckchat.html), a self-evaluating RAG application built on the LangCheck documentation
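The Chinese metrics mirror the existing per-language interfaces. A minimal sketch, assuming `sentiment` is among the launched `zh` metrics (check the docs for the full list):

```python
import langcheck

# Run a local Chinese metric, mirroring the en/ja interface.
result = langcheck.metrics.zh.sentiment(["这个工具非常好用!"])
print(result)
```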

Bug Fixes

* Improvements to support for German (README, factual consistency benchmarking, OpenAI prompts in German) 🇩🇪 - thanks ischender!
* Improvements to the OpenAI-based metrics for Japanese
* We now have pip install tests for different Python versions
* The `nltk` data download now runs only when needed (as opposed to always running on `import langcheck`)
* Added the tqdm progress bar to more metrics

**Full Changelog**: https://github.com/citadel-ai/langcheck/compare/v0.4.0...v0.5.0

0.4.0

Breaking Changes
None

New Features
* Launched LangCheck metrics for German (`langcheck.metrics.de`) 🇩🇪 - thanks ischender!
* Added `langcheck.metrics.context_relevance`, a metric to compute the relevance of a retrieved source text to the user's prompt
* Added `langcheck.metrics.answer_relevance`, a metric to compute the relevance of an output to the user's prompt (see the sketch after this list)
* Added `langcheck.augment.ja.synonym`, a Japanese text augmentation that replaces some words with synonyms
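A minimal sketch of the two relevance metrics. The keyword names (`sources`, `prompts`, `generated_outputs`) follow LangCheck's usual conventions but are assumptions here, and these metrics may require an OpenAI evaluator in this release, so check the docs.

```python
import langcheck

prompts = ["How do I reset my password?"]

# Relevance of a retrieved source text to the user's prompt.
context = langcheck.metrics.context_relevance(
    sources=["To reset your password, open Settings and choose 'Reset password'."],
    prompts=prompts,
)

# Relevance of a model's output to the user's prompt.
answer = langcheck.metrics.answer_relevance(
    generated_outputs=["Open Settings and choose 'Reset password'."],
    prompts=prompts,
)
print(context, answer)
```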

Bug Fixes
None

**Full Changelog**: https://github.com/citadel-ai/langcheck/compare/v0.3.0...v0.4.0
