LangFair is a Python library for conducting bias and fairness assessments of LLM use cases. This repository includes a framework for [choosing bias and fairness metrics](https://github.com/cvs-health/langfair/tree/develop#choosing-bias-and-fairness-metrics-for-an-llm-use-case), [demo notebooks](https://github.com/cvs-health/langfair/tree/develop/examples), and a LLM bias and fairness [technical playbook](https://arxiv.org/pdf/2407.10853) containing a thorough discussion of LLM bias and fairness risks, evaluation metrics, and best practices. Please refer to our [documentation site](https://cvs-health.github.io/langfair/) for more details on how to use LangFair.
Highlights
Bias and fairness metrics offered by LangFair fall into one of several categories. The full suite of metrics is displayed below.
Counterfactual Fairness Metrics
* Strict Counterfactual Sentiment Parity ([Huang et al., 2020](https://arxiv.org/pdf/1911.03064))
* Weak Counterfactual Sentiment Parity ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
* Counterfactual Cosine Similarity Score ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
* Counterfactual BLEU ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
* Counterfactual ROUGE-L ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
Stereotype Metrics
* Stereotypical Associations ([Liang et al., 2023](https://arxiv.org/pdf/2211.09110))
* Co-occurrence Bias Score ([Bordia & Bowman, 2019](https://aclanthology.org/N19-3002.pdf))
* Stereotype classifier metrics ([Zekun et al., 2023](https://arxiv.org/ftp/arxiv/papers/2311/2311.14126.pdf), [Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
Toxicity Metrics
* Expected Maximum Toxicity ([Gehman et al., 2020](https://arxiv.org/pdf/2009.11462))
* Toxicity Probability ([Gehman et al., 2020](https://arxiv.org/pdf/2009.11462))
* Toxic Fraction ([Liang et al., 2023](https://arxiv.org/pdf/2211.09110))
Recommendation Fairness Metrics
* Jaccard Similarity ([Zhang et al., 2023](https://arxiv.org/pdf/2305.07609.pdf))
* Search Result Page Misinformation Score ([Zhang et al., 2023](https://arxiv.org/pdf/2305.07609.pdf))
* Pairwise Ranking Accuracy Gap ([Zhang et al., 2023](https://arxiv.org/pdf/2305.07609.pdf))
Classification Fairness Metrics
* Predicted Prevalence Rate Disparity ([Feldman et al., 2015](https://arxiv.org/abs/1412.3756); [Bellamy et al., 2018](https://arxiv.org/abs/1810.01943); [Saleiro et al., 2019](https://arxiv.org/abs/1811.05577))
* False Negative Rate Disparity ([Bellamy et al., 2018](https://arxiv.org/abs/1810.01943); [Saleiro et al., 2019](https://arxiv.org/abs/1811.05577))
* False Omission Rate Disparity ([Bellamy et al., 2018](https://arxiv.org/abs/1810.01943); [Saleiro et al., 2019](https://arxiv.org/abs/1811.05577))
* False Positive Rate Disparity ([Bellamy et al., 2018](https://arxiv.org/abs/1810.01943); [Saleiro et al., 2019](https://arxiv.org/abs/1811.05577))
* False Discovery Rate Disparity ([Bellamy et al., 2018](https://arxiv.org/abs/1810.01943); [Saleiro et al., 2019](https://arxiv.org/abs/1811.05577))