LangFair: Use-Case Level LLM Bias and Fairness Assessments

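LangFair is a comprehensive Python library for conducting bias and fairness assessments of large language model (LLM) use cases. The LangFair repository includes a comprehensive framework for choosing bias and fairness metrics, along with demo notebooks and a technical playbook that discusses LLM bias and fairness risks, evaluation metrics, and best practices.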

Explore our documentation site for detailed instructions on using LangFair.

⚡ Quick Start Guide

(Optional) Create a virtual environment for using LangFair

We recommend creating a new virtual environment with venv before installing LangFair. To do so, follow the venv setup instructions.

Install LangFair

The latest version can be installed from PyPI:

pip install langfair

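Usage Examples

The code samples below illustrate how to use LangFair to assess bias and fairness risks in text generation and summarization use cases. They assume the user has already defined prompts, a list of prompts from their use case.

Generate LLM responses

To generate responses, we can use LangFair's ResponseGenerator class. First, we must create a langchain LLM object. Below we use ChatVertexAI, but any of LangChain's LLM classes may be used instead. Note that InMemoryRateLimiter is used to avoid rate limit errors.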

from langchain_google_vertexai import ChatVertexAI
from langchain_core.rate_limiters import InMemoryRateLimiter
rate_limiter = InMemoryRateLimiter(
    requests_per_second=4.5, check_every_n_seconds=0.5, max_bucket_size=280,  
)
llm = ChatVertexAI(
    model_name="gemini-pro", temperature=0.3, rate_limiter=rate_limiter
)
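To make the rate limiter parameters more concrete, here is a minimal sketch of a token-bucket limiter, assuming that is the scheme the parameter names describe. The TokenBucket class and its try_acquire method are hypothetical names for illustration only, not LangChain's implementation, and time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Hypothetical token-bucket limiter (illustration only, not LangChain code)."""

    def __init__(self, requests_per_second, max_bucket_size, start=0.0):
        self.rate = requests_per_second      # tokens added per second
        self.capacity = max_bucket_size      # upper bound on stored tokens
        self.tokens = 0.0                    # assumption: the bucket starts empty
        self.last = start                    # time of the previous refill

    def try_acquire(self, now):
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(requests_per_second=2, max_bucket_size=3)
early = bucket.try_acquire(now=0.0)      # no tokens have accumulated yet
first = bucket.try_acquire(now=0.5)      # one token accumulated after 0.5 s
too_soon = bucket.try_acquire(now=0.75)  # only half a token since the last call
# After a long pause the bucket is capped at max_bucket_size tokens,
# so only three back-to-back requests succeed.
burst = [bucket.try_acquire(now=10.0) for _ in range(4)]
print(early, first, too_soon, burst)
```

In this sketch a larger max_bucket_size allows a bigger burst after idle periods, while requests_per_second bounds the sustained request rate.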


We can use ResponseGenerator.generate_responses to generate 25 responses for each prompt, as is the convention for toxicity evaluation.

from langfair.generator import ResponseGenerator
rg = ResponseGenerator(langchain_llm=llm)
generations = await rg.generate_responses(prompts=prompts, count=25)
responses = generations["data"]["response"]
duplicated_prompts = generations["data"]["prompt"] # so prompts correspond to responses
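The snippet above uses a top-level await, which works in a notebook but not in a plain Python script. The following sketch shows one way to drive a coroutine from a script with asyncio.run. The fake_generate_responses function is a hypothetical stand-in that only mimics the dictionary shape used above; it does not call an LLM and is not part of LangFair.

```python
import asyncio


async def fake_generate_responses(prompts, count):
    """Hypothetical stub: returns count placeholder responses per prompt."""
    data = {"prompt": [], "response": []}
    for prompt in prompts:
        for i in range(count):
            await asyncio.sleep(0)  # yield to the event loop, as an API call would
            data["prompt"].append(prompt)  # prompt repeated once per response
            data["response"].append(f"response {i} to: {prompt}")
    return {"data": data}


prompts = ["Describe a typical nurse.", "Describe a typical engineer."]

# asyncio.run creates an event loop, runs the coroutine, and closes the loop.
generations = asyncio.run(fake_generate_responses(prompts, count=3))
responses = generations["data"]["response"]
duplicated_prompts = generations["data"]["prompt"]

print(len(responses), len(duplicated_prompts))
```

Because each prompt is repeated once per generated response, the prompt and response lists stay aligned index by index.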

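Compute toxicity metrics

Toxicity metrics can be computed with ToxicityMetrics. Note that the use of torch.device is optional and is recommended when a GPU is available, to speed up toxicity computation.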

# import torch # uncomment if GPU is available
# device = torch.device("cuda") # uncomment if GPU is available
from langfair.metrics.toxicity import ToxicityMetrics
tm = ToxicityMetrics(
    # device=device, # uncomment if GPU is available,
)
tox_result = tm.evaluate(
    prompts=duplicated_prompts, 
    responses=responses, 
    return_data=True
)
tox_result['metrics']
# # Output is below
# {'Toxic Fraction': 0.0004,
# 'Expected Maximum Toxicity': 0.013845130120171235,
# 'Toxicity Probability': 0.01}
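To give some intuition for the three reported numbers, here is a minimal sketch of how they could be aggregated from per-response toxicity scores, assuming the commonly used definitions: the share of toxic responses, the mean of each prompt's maximum score, and the share of prompts with at least one toxic response. The function name, the scores, and the 0.5 threshold are illustrative assumptions, not LangFair's implementation or defaults.

```python
from collections import defaultdict


def toxicity_summary(prompts, scores, threshold=0.5):
    """Aggregate per-response toxicity scores (hypothetical helper)."""
    if len(prompts) != len(scores):
        raise ValueError("prompts and scores must have the same length")

    # Group scores by prompt so per-prompt maxima can be taken.
    per_prompt = defaultdict(list)
    for prompt, score in zip(prompts, scores):
        per_prompt[prompt].append(score)
    maxima = [max(values) for values in per_prompt.values()]

    return {
        # Share of all responses whose score reaches the threshold.
        "Toxic Fraction": sum(s >= threshold for s in scores) / len(scores),
        # Average, over prompts, of the worst score among that prompt's responses.
        "Expected Maximum Toxicity": sum(maxima) / len(maxima),
        # Share of prompts with at least one response reaching the threshold.
        "Toxicity Probability": sum(m >= threshold for m in maxima) / len(maxima),
    }


# Made-up scores for two prompts with two responses each.
example_prompts = ["p1", "p1", "p2", "p2"]
example_scores = [0.125, 0.75, 0.25, 0.125]
summary = toxicity_summary(example_prompts, example_scores)
print(summary)
```

This also shows why the duplicated prompt list is passed alongside the responses: the per-prompt metrics need to know which responses belong together.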

Compute stereotype metrics

Stereotype metrics can be computed with StereotypeMetrics.

from langfair.metrics.stereotype import StereotypeMetrics
sm = StereotypeMetrics()
stereo_result = sm.evaluate(responses=responses, categories=["gender"])
stereo_result['metrics']
# # Output is below
# {'Stereotype Association': 0.3172750176745329,
# 'Cooccurrence Bias': 0.44766333654278373,
# 'Stereotype Fraction - gender': 0.08}

Generate counterfactual responses and compute metrics

We can generate counterfactual responses with CounterfactualGenerator.

from langfair.generator.counterfactual import CounterfactualGenerator
cg = CounterfactualGenerator(langchain_llm=llm)
cf_generations = await cg.generate_responses(
    prompts=prompts, attribute='gender', count=25
)
male_responses = cf_generations['data']['male_response']
female_responses = cf_generations['data']['female_response']
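The idea behind counterfactual prompts can be sketched as a word substitution: each prompt is rewritten so that only the gendered terms change. The example below is a simplified illustration with a tiny, hypothetical word map and a hypothetical swap_words helper; it is not the word list or the substitution logic used by LangFair.

```python
import re

# Tiny illustrative mapping (assumption); real word lists are far larger.
MALE_TO_FEMALE = {
    "he": "she",
    "man": "woman",
    "father": "mother",
    "boy": "girl",
    "husband": "wife",
}
FEMALE_TO_MALE = {female: male for male, female in MALE_TO_FEMALE.items()}


def swap_words(text, mapping):
    """Replace whole words found in mapping, keeping initial capitalization."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in mapping) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        word = match.group(0)
        new_word = mapping[word.lower()]
        return new_word.capitalize() if word[0].isupper() else new_word

    return pattern.sub(replace, text)


prompt = "He asked whether the father of the boy had called."
female_prompt = swap_words(prompt, MALE_TO_FEMALE)
male_prompt = swap_words(prompt, FEMALE_TO_MALE)  # already male, so unchanged
print(female_prompt)
print(male_prompt)
```

Responses to the two prompt variants can then be compared pairwise, which is what the counterfactual metrics do.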

Counterfactual metrics can be computed easily with CounterfactualMetrics.

from langfair.metrics.counterfactual import CounterfactualMetrics
cm = CounterfactualMetrics()
cf_result = cm.evaluate(
    texts1=male_responses, 
    texts2=female_responses,
    attribute='gender'
)
cf_result['metrics']
# # Output is below
# {'Cosine Similarity': 0.8318708,
# 'RougeL Similarity': 0.5195852482361165,
# 'Bleu Similarity': 0.3278433712872481,
# 'Sentiment Bias': 0.0009947145187601957}
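As a rough illustration of what a pairwise similarity such as the RougeL value measures, here is a sketch of a ROUGE-L style F-measure based on the longest common subsequence of whitespace tokens, averaged over response pairs. The tokenization, the averaging, and the function names are simplifying assumptions; the library's own computation may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(text1, text2):
    """ROUGE-L style F1 on lowercase whitespace tokens (simplified)."""
    tokens1, tokens2 = text1.lower().split(), text2.lower().split()
    if not tokens1 or not tokens2:
        return 0.0
    lcs = lcs_length(tokens1, tokens2)
    if lcs == 0:
        return 0.0
    precision = lcs / len(tokens2)
    recall = lcs / len(tokens1)
    return 2 * precision * recall / (precision + recall)


def mean_pairwise_similarity(texts1, texts2):
    """Average the pairwise score over aligned response pairs."""
    scores = [rouge_l_f1(a, b) for a, b in zip(texts1, texts2)]
    return sum(scores) / len(scores)


male = ["the doctor said he was tired", "he went home"]
female = ["the doctor said she was tired", "she went home"]
similarity = mean_pairwise_similarity(male, female)
print(round(similarity, 4))
```

A value near 1 means the paired responses are almost identical apart from the swapped terms, while lower values indicate that the model's output changed more substantially between the two groups.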

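Alternative approach: Semi-automated evaluation with AutoEval

To streamline assessments for text generation and summarization use cases, the AutoEval class runs a multi-step process that completes all of the steps above with two lines of code.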

from langfair.auto import AutoEval
auto_object = AutoEval(
    prompts=prompts, 
    langchain_llm=llm,
    # toxicity_device=device # uncomment if GPU is available
)
results = await auto_object.evaluate()
results['metrics']
# # Output is below
# {'Toxicity': {'Toxic Fraction': 0.0004,
#   'Expected Maximum Toxicity': 0.013845130120171235,
#   'Toxicity Probability': 0.01},
#  'Stereotype': {'Stereotype Association': 0.3172750176745329,
#   'Cooccurrence Bias': 0.44766333654278373,
#   'Stereotype Fraction - gender': 0.08,
#   'Expected Maximum Stereotype - gender': 0.60355167388916,
#   'Stereotype Probability - gender': 0.27036},
#  'Counterfactual': {'male-female': {'Cosine Similarity': 0.8318708,
#    'RougeL Similarity': 0.5195852482361165,
#    'Bleu Similarity': 0.3278433712872481,
#    'Sentiment Bias': 0.0009947145187601957}}}
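Because the returned metrics are nested by category, it can be convenient to flatten them into rows for logging or tabular reports. The helper below is a hypothetical convenience function, not part of LangFair, applied to a subset of the values shown above.

```python
def flatten_metrics(metrics, prefix=()):
    """Flatten a nested dict of metrics into (path, value) rows."""
    rows = []
    for key, value in metrics.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            rows.extend(flatten_metrics(value, path))  # recurse into sub-categories
        else:
            rows.append((" / ".join(path), value))
    return rows


# Subset of the example output above, copied for illustration.
example_metrics = {
    "Toxicity": {"Toxic Fraction": 0.0004, "Toxicity Probability": 0.01},
    "Counterfactual": {"male-female": {"Cosine Similarity": 0.8318708}},
}

rows = flatten_metrics(example_metrics)
for name, value in rows:
    print(f"{name}: {value}")
```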