Co-Occurrence Metrics

This module provides classes for computing bias amplification metrics based on co-occurrence analysis between protected attributes and task labels.

BaseCoOccurMetric

class bias_amplification.metrics.CoOccurMetrics.BaseCoOccurMetric[source]

Bases: ABC

Abstract base class for co-occurrence-based bias amplification metrics.

This class provides shared helpers for computing marginal, joint, and conditional probabilities, which subclasses use to compute bias amplification.

Methods

`computeAgivenT`(A, T)	Computes conditional probability for all Attributes A given T observations.
`computeBiasAmp`(A, T, T_pred)	Abstract method to compute bias amplification.
`computePairProbs`(A, T)	Computes joint probability for given A and T observations.
`computeProbs`(vals)	Computes observed probability for each category.
`computeTgivenA`(A, T)	Computes conditional probability for all Task T given A observations.

Examples

from bias_amplification.metrics.CoOccurMetrics import BaseCoOccurMetric
import torch

# BaseCoOccurMetric is abstract - use concrete implementations
# See BA_MALS, DBA, or MDBA below

__init__()[source]

computeAgivenT(A: tensor, T: tensor) → tensor[source]

Computes the conditional probability of each attribute category in A given each task category in T, i.e., P(A|T).

Parameters:

A (torch.tensor): Binary tensor of shape (N x a), where a is the number of possible attribute categories (e.g., 2 for gender {male, female}).
T (torch.tensor): Binary tensor of shape (N x t), where t is the number of possible task categories.

Returns:

probs (torch.tensor): Tensor of shape (a x t) holding the conditional probability P(A|T) for each A-T pair.
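To make the shapes concrete, here is a minimal sketch of how P(A|T) can be estimated from one-hot inputs. The function name `a_given_t` and the clamp against division by zero are assumptions made for illustration; the library's own implementation may differ.

```python
import torch

def a_given_t(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: empirical P(A|T), shape (a x t)."""
    n = A.shape[0]
    joint = (A.float().T @ T.float()) / n   # P(A, T), shape (a x t)
    p_T = T.float().sum(dim=0) / n          # P(T), shape (t,)
    # Broadcasting divides column j by P(T_j); the clamp (an assumption)
    # avoids 0/0 when a task category never occurs.
    return joint / p_T.clamp(min=1e-12)

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=torch.float)
probs = a_given_t(A, T)  # each column is a distribution over attribute categories
```

Because A is one-hot here, each column of `probs` sums to 1.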

abstract computeBiasAmp(A: tensor, T: tensor, T_pred: tensor) → tensor[source]

Abstract method to compute bias amplification. Subclasses must implement this method to compute the bias amplification for each A-T pair.

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t)
T_pred (torch.tensor): Binary tensor of shape (N x t)

Returns:

bias_amp_combined (torch.tensor): Scalar representing mean bias amplification across all pairs
bias_amp (torch.tensor): Tensor of shape (a x t) representing bias amplification for each A-T pair

computePairProbs(A: tensor, T: tensor) → tensor[source]

Computes joint probability for given A and T observations.

Parameters:

A (torch.tensor): Binary tensor of shape (N x a), where a is the number of possible attribute categories (e.g., 2 for gender {male, female}).
T (torch.tensor): Binary tensor of shape (N x t), where t is the number of possible task categories.

Returns:

probs (torch.tensor): Tensor of shape (a x t) holding the joint probability P(A, T) for each A-T pair.
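As a rough illustration of the joint probability for one-hot inputs, the sketch below counts co-occurrences with a matrix product and divides by N. The name `joint_probs` is hypothetical, and this is not necessarily how `computePairProbs` is implemented.

```python
import torch

def joint_probs(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: empirical P(A, T) for binary (N x a) and (N x t) inputs."""
    n = A.shape[0]
    # (a x N) @ (N x t) counts the rows where attribute i and task j are both 1.
    return (A.float().T @ T.float()) / n

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
pair_probs = joint_probs(A, T)  # shape (2 x 2)
```

In this toy data the first attribute always co-occurs with the first task and the second with the second, so all probability mass sits on the diagonal.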

computeProbs(vals: tensor) → tensor[source]

Computes observed probability for each category.

Parameters:

vals (torch.tensor): Binary tensor of shape (N x v), where v is the number of possible categories (e.g., 2 for gender {male, female}).

Returns:

probs (torch.tensor): Float tensor representing probabilities for each category.

computeTgivenA(A: tensor, T: tensor) → tensor[source]

Computes the conditional probability of each task category in T given each attribute category in A, i.e., P(T|A).

Parameters:

A (torch.tensor): Binary tensor of shape (N x a), where a is the number of possible attribute categories (e.g., 2 for gender {male, female}).
T (torch.tensor): Binary tensor of shape (N x t), where t is the number of possible task categories.

Returns:

probs (torch.tensor): Tensor of shape (a x t) holding the conditional probability P(T|A) for each A-T pair.
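P(T|A) uses the same (a x t) layout as P(A|T) but normalizes along the other axis. The sketch below shows one way to estimate it; `t_given_a` is a hypothetical name and the clamp is an assumption, not documented library behavior.

```python
import torch

def t_given_a(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: empirical P(T|A), shape (a x t)."""
    n = A.shape[0]
    joint = (A.float().T @ T.float()) / n   # P(A, T), shape (a x t)
    p_A = A.float().sum(dim=0) / n          # P(A), shape (a,)
    # unsqueeze(1) makes P(A) a column so that row i is divided by P(A_i).
    return joint / p_A.clamp(min=1e-12).unsqueeze(1)

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=torch.float)
probs_t_given_a = t_given_a(A, T)  # each row is a distribution over task categories
```

With one-hot T, each row sums to 1, whereas for P(A|T) the columns do.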

BA_MALS

Bias Amplification Metric from Zhao et al. (2017). This metric computes bias amplification by comparing conditional probabilities, but considers only positively correlated A-T pairs.

class bias_amplification.metrics.CoOccurMetrics.BA_MALS[source]

Bases: BaseCoOccurMetric

Methods

`check_bias`(A, T)	Checks if each A-T pair exhibits statistical dependence (positive correlation).
`computeBiasAmp`(A, T, T_pred)	Computes bias amplification by comparing the conditional probabilities of A given T and A given T_pred.

Examples

from bias_amplification.metrics.CoOccurMetrics import BA_MALS
import torch

# Initialize BA_MALS metric
ba_mals = BA_MALS()

# Prepare data: A (attributes) and T (tasks) as binary tensors
# A: shape (N, a) where N is number of observations, a is number of attribute categories
# T: shape (N, t) where t is number of task categories
A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=torch.float)
T_pred = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

# Check which pairs exhibit bias
bias_mask = ba_mals.check_bias(A, T)

# Compute bias amplification
bias_amp_combined, bias_amp = ba_mals.computeBiasAmp(A, T, T_pred)

__init__()[source]

check_bias(A: tensor, T: tensor) → tensor[source]

Checks if each A-T pair exhibits statistical dependence (positive correlation). Uses independence test: P(A,T) > P(A)P(T)

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t) - represents ONE attribute combination

Returns:

is_biased (torch.tensor): Binary mask of shape (a x t) indicating positively correlated pairs
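The test P(A,T) > P(A)P(T) can be sketched in a few lines. The function below is a hypothetical stand-in written for illustration and assumes a strict inequality, as stated above.

```python
import torch

def positive_correlation_mask(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: True where P(A, T) > P(A) * P(T)."""
    n = A.shape[0]
    joint = (A.float().T @ T.float()) / n   # P(A, T), shape (a x t)
    p_A = A.float().sum(dim=0) / n          # P(A), shape (a,)
    p_T = T.float().sum(dim=0) / n          # P(T), shape (t,)
    # Outer product gives the joint expected under independence.
    independent = p_A.unsqueeze(1) * p_T.unsqueeze(0)
    return joint > independent

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=torch.float)
mask = positive_correlation_mask(A, T)
```

Here the first attribute is over-represented with the first task (0.5 observed against 0.375 expected), so that pair is flagged, while the pairing of the first attribute with the second task is not.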

computeBiasAmp(A: tensor, T: tensor, T_pred: tensor) → Tuple[tensor, tensor][source]

Computes bias amplification by comparing the conditional probabilities of A given T and A given T_pred.

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t)
T_pred (torch.tensor): Binary tensor of shape (N x t)

Returns:

bias_amp_combined (torch.tensor): Scalar representing mean bias amplification across all pairs
bias_amp (torch.tensor): Tensor of shape (a x t) representing bias amplification for each A-T pair
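Putting the pieces together, a minimal sketch of this style of metric follows from the description above: take the change P(A|T_pred) - P(A|T) and keep it only for positively correlated pairs. The helper names are hypothetical, and averaging over all a x t entries is an assumption; the library may aggregate differently.

```python
import torch

def _a_given_t(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    # Empirical P(A|T), shape (a x t); the clamp guards against empty task columns.
    n = A.shape[0]
    joint = (A.T @ T) / n
    return joint / (T.sum(dim=0) / n).clamp(min=1e-12)

def ba_mals_sketch(A, T, T_pred):
    """Hypothetical sketch: amplification restricted to positively correlated pairs."""
    n = A.shape[0]
    joint = (A.T @ T) / n
    expected = (A.sum(dim=0) / n).unsqueeze(1) * (T.sum(dim=0) / n).unsqueeze(0)
    is_biased = (joint > expected).float()           # mask from ground truth only
    delta = _a_given_t(A, T_pred) - _a_given_t(A, T)
    bias_amp = is_biased * delta                     # zero for unflagged pairs
    return bias_amp.mean(), bias_amp                 # mean over all pairs (assumption)

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=torch.float)
T_pred = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
bias_amp_combined, bias_amp = ba_mals_sketch(A, T, T_pred)
```

In this toy data the predictions push P(A_0|T_0) from 2/3 to 1, so the only nonzero entry of `bias_amp` is 1/3 at position (0, 0).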

DBA (Directional Bias Amplification)

Bias Amplification Metric that addresses shortcomings of BA_MALS by focusing on both positive and negative correlations, and the direction of amplification.

class bias_amplification.metrics.CoOccurMetrics.DBA[source]

Bases: BaseCoOccurMetric

Bias Amplification Metric from Directional Bias Amplification. This metric addresses the shortcomings of Zhao et al.'s metric by accounting for both positive and negative correlations and for the direction of amplification. It does so by comparing the conditional probabilities of A given T and A given T_pred.

Methods

`check_bias`(A, T)	Checks if each A-T pair exhibits statistical dependence (positive correlation).
`computeBiasAmp`(A, T, T_pred)	Computes bias amplification by comparing the conditional probabilities of A given T and A given T_pred.
`computeBiasAmpBidirectional`(A, A_pred, T, T_pred)	Computes bidirectional bias amplification for AtoT and TtoA directions.

Examples

from bias_amplification.metrics.CoOccurMetrics import DBA
import torch

# Initialize DBA metric
dba = DBA()

# Prepare data
A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=torch.float)
T_pred = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

# Check statistical dependence (positive correlation)
dependence_mask = dba.check_bias(A, T)

# Compute bias amplification (handles both positive and negative correlations)
bias_amp_combined, bias_amp = dba.computeBiasAmp(A, T, T_pred)

# Compute bidirectional bias amplification
A_pred = A.clone()  # Predicted attributes
bias_amp_bidirectional = dba.computeBiasAmpBidirectional(
    A, A_pred, T, T_pred
)
# Returns dict with keys 'AtoT' and 'TtoA'

__init__()[source]

check_bias(A: tensor, T: tensor) → tensor[source]

Checks if each A-T pair exhibits statistical dependence (positive correlation). Uses independence test: P(A,T) > P(A)P(T)

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t) - represents ONE attribute combination

Returns:

y_at (torch.tensor): Binary mask of shape (a x t) indicating positively correlated pairs

computeBiasAmp(A: tensor, T: tensor, T_pred: tensor) → Tuple[tensor, tensor][source]

Computes bias amplification by comparing the conditional probabilities of A given T and A given T_pred.

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t)
T_pred (torch.tensor): Binary tensor of shape (N x t)

Returns:

bias_amp_combined (torch.tensor): Scalar representing mean bias amplification across all pairs
bias_amp (torch.tensor): Tensor of shape (a x t) representing bias amplification for each A-T pair
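The sign handling that separates a directional metric from a positive-only one can be sketched as follows: the change in conditional probability counts as amplification when it moves further in the direction of the existing correlation, so it is kept for positively correlated pairs and negated for the rest. The function names are hypothetical. Using P(A|T_pred) - P(A|T) as the change follows the description above, and the plain mean over all pairs is an assumption.

```python
import torch

def _a_given_t(A: torch.Tensor, T: torch.Tensor) -> torch.Tensor:
    # Empirical P(A|T), shape (a x t); the clamp guards against empty task columns.
    n = A.shape[0]
    joint = (A.T @ T) / n
    return joint / (T.sum(dim=0) / n).clamp(min=1e-12)

def dba_sketch(A, T, T_pred):
    """Hypothetical sketch: y_at * delta + (1 - y_at) * (-delta)."""
    n = A.shape[0]
    joint = (A.T @ T) / n
    expected = (A.sum(dim=0) / n).unsqueeze(1) * (T.sum(dim=0) / n).unsqueeze(0)
    y_at = (joint > expected).float()                # 1 for positively correlated pairs
    delta = _a_given_t(A, T_pred) - _a_given_t(A, T)
    bias_amp = y_at * delta + (1.0 - y_at) * (-delta)
    return bias_amp.mean(), bias_amp                 # mean over all pairs (assumption)

A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=torch.float)
T_pred = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
bias_amp_combined, bias_amp = dba_sketch(A, T, T_pred)
```

In this example the pair (1, 0) is not positively correlated and its conditional probability drops by 1/3, which the sketch counts as +1/3 of amplification; a positive-only metric would report 0 for that pair.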

computeBiasAmpBidirectional(A: tensor, A_pred: tensor, T: tensor, T_pred: tensor) → Dict[str, Tuple[tensor, tensor]][source]

Computes bidirectional bias amplification for AtoT and TtoA directions.

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
A_pred (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t)
T_pred (torch.tensor): Binary tensor of shape (N x t)

Returns:

bias_amp (dict): Dictionary with keys ‘AtoT’ and ‘TtoA’, each containing (mean, variance) tuples

MDBA (Multi-Attribute Directional Bias Amplification)

Multi-Attribute Directional Bias Amplification Metric that extends DBA to handle multi-dimensional attribute combinations, computing bias amplification across all possible attribute combinations.

class bias_amplification.metrics.CoOccurMetrics.MDBA(min_attr_size: int = 1, max_attr_size: int | None = None)[source]

Bases: BaseCoOccurMetric

Multi-Attribute Directional Bias Amplification Metric. This metric addresses the shortcomings of DBA by considering multi-attribute combinations, comparing the conditional probabilities of A given T and A given T_pred.

Methods

`check_bias`(A, T)	Checks if each A-T pair exhibits statistical dependence (positive correlation).
`computeBiasAmp`(A, T, T_pred)	Computes Multi-Dimensional Bias Amplification from A to T.
`computeBiasAmpBidirectional`(A, A_pred, T, T_pred)	Computes bidirectional bias amplification.
`getAttributeCombinationStats`(T)	Get statistics about attribute combinations in the dataset.

Examples

from bias_amplification.metrics.CoOccurMetrics import MDBA
import torch

# Initialize MDBA metric with attribute size constraints
mdba = MDBA(min_attr_size=1, max_attr_size=3)

# Prepare data with multiple attributes
A = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
T = torch.tensor([[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=torch.float)
T_pred = torch.tensor([[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 1, 1]], dtype=torch.float)

# Compute multi-dimensional bias amplification
# Returns (mean, variance) tuple
bias_amp_mean, bias_amp_variance = mdba.computeBiasAmp(A, T, T_pred)

# Get statistics about attribute combinations
stats = mdba.getAttributeCombinationStats(T)
# Returns dict with 'total_combinations', 'by_size', 'examples'

# Compute bidirectional bias amplification
A_pred = A.clone()
bias_amp_bidirectional = mdba.computeBiasAmpBidirectional(
    A, A_pred, T, T_pred
)
# Returns dict with keys 'AtoT' and 'TtoA', each containing (mean, variance)

__init__(min_attr_size: int = 1, max_attr_size: int | None = None)[source]

check_bias(A: tensor, T: tensor) → tensor[source]

Checks if each A-T pair exhibits statistical dependence (positive correlation). Uses independence test: P(A,T) > P(A)P(T)

Parameters:

A (torch.tensor): Binary tensor of shape (N x a)
T (torch.tensor): Binary tensor of shape (N x t) - represents ONE attribute combination

Returns:

y_at (torch.tensor): Binary mask of shape (a x t) indicating positively correlated pairs

computeBiasAmp(A: tensor, T: tensor, T_pred: tensor) → Tuple[tensor, tensor][source]

Computes Multi-Dimensional Bias Amplification from A to T.

This implements the Multi-> directional metric from the paper (Equation 3). It iterates over all combinations of attributes M, computes bias amplification for each combination, and then aggregates the results.

The formula from the paper: Multi-> = (mean, variance), where

mean = (1 / |G||M|) * Σ_g Σ_m |y_gm * Δ_gm + (1 - y_gm) * |-Δ_gm||

Parameters:

A (torch.tensor): Ground truth group membership, shape (N x a)
T (torch.tensor): Ground truth tasks/attributes, shape (N x t)
T_pred (torch.tensor): Predicted tasks/attributes, shape (N x t)

Returns:

bias_amp_mean (torch.tensor): Scalar representing mean bias amplification across all combinations
bias_amp_variance (torch.tensor): Variance of bias amplification (shows if uniform or concentrated)
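To give a feel for what "all combinations of attributes" involves, the sketch below enumerates column subsets of T between a minimum and maximum size and builds a single indicator column for one subset. Both function names are hypothetical. Treating a combination as present only when every attribute in it is 1, and enumerating every subset rather than only those observed in the data, are assumptions rather than documented behavior.

```python
from itertools import combinations

import torch

def attribute_combinations(num_attrs, min_size=1, max_size=None):
    """Hypothetical helper: all index tuples with min_size <= len <= max_size."""
    upper = num_attrs if max_size is None else min(max_size, num_attrs)
    return [
        combo
        for size in range(min_size, upper + 1)
        for combo in combinations(range(num_attrs), size)
    ]

def combination_indicator(T: torch.Tensor, combo) -> torch.Tensor:
    """Hypothetical helper: (N x 1) column that is 1 where every attribute in combo is 1."""
    return T[:, list(combo)].prod(dim=1, keepdim=True)

T = torch.tensor([[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=torch.float)
combos = attribute_combinations(T.shape[1], min_size=1, max_size=3)
col = combination_indicator(T, (0, 2))  # attributes 0 and 2 both present
```

With three attribute columns there are 3 + 3 + 1 = 7 non-empty subsets, so the number of combinations grows quickly with t; bounding the subset size, as `min_attr_size` and `max_attr_size` appear to do, keeps the enumeration manageable.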

computeBiasAmpBidirectional(A: tensor, A_pred: tensor, T: tensor, T_pred: tensor) → Dict[str, Tuple[tensor, tensor]][source]

Computes bidirectional bias amplification.

This captures bias amplification in both directions:

- Multi_A->T (or Multi_G->M): How group membership (A) influences task predictions (T)
- Multi_T->A (or Multi_M->G): How tasks (T) influence group membership predictions (A)

Parameters:

A (torch.tensor): Ground truth group membership
A_pred (torch.tensor): Predicted group membership
T (torch.tensor): Ground truth tasks/attributes
T_pred (torch.tensor): Predicted tasks/attributes

Returns:

bias_amp (dict): Dictionary with keys ‘AtoT’ and ‘TtoA’, each containing (mean, variance) tuples

getAttributeCombinationStats(T: tensor) → Dict[source]

Get statistics about attribute combinations in the dataset. Useful for understanding the dataset structure.

Returns:

stats (dict): Dictionary containing:

- ‘total_combinations’: Total number of attribute combinations
- ‘by_size’: Number of combinations for each size
- ‘examples’: Example combinations for each size