Adversarial Explanations to Question Objective XAI Evaluation Metrics
This repository implements SHAPE (SHifted Adversaries using Pixel Elimination), a novel adversarial explanation method that exposes fundamental flaws in objective XAI evaluation metrics like insertion and deletion games.
"Are Objective Explanatory Evaluation Metrics Trustworthy? An Adversarial Analysis"
Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib, Mohamed Deriche
Published in: IEEE International Conference on Image Processing (ICIP) 2024
Key Institution: OLIVES Lab, Georgia Institute of Technology
SHAPE is an adversarial explanation method that:
- ✅ Mathematically sound - based on causal definitions of necessity
- ✅ Model-faithful - accurately captures model behavior
- ✅ Outperforms existing methods - achieves better insertion/deletion scores than GradCAM and GradCAM++, and scores comparable to RISE
- ❌ But NOT human-interpretable - generates incomprehensible saliency maps
Unlike RISE (which measures sufficiency by masking and observing predictions on visible pixels), SHAPE measures necessity by removing pixels and observing prediction drops.
Mathematical Formulation:
N_{I,f}(λ) = E_M[ f(I) - f(I ⊙ M) | M(λ) = 0 ]
Where:
- I: original image
- M: random binary mask
- f(I): model prediction on the full image
- f(I ⊙ M): model prediction on the masked image
- λ: pixel location
- M(λ) = 0: pixel λ is masked (removed)
Interpretation: Importance = Expected prediction drop when pixel is absent
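The conditional expectation above can be approximated with a toy Monte Carlo loop. The sketch below is illustrative only, not the repository's implementation: `toy_model` and `necessity_map` are hypothetical names, and it uses independent per-pixel masks, whereas the actual method presumably samples RISE-style upsampled masks via `generate_masks`.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(img):
    # Stand-in "model": mean intensity of a fixed 2x2 patch.
    return img[2:4, 2:4].mean()

def necessity_map(model, img, n_masks=2000, p1=0.5):
    # Monte Carlo estimate of N_{I,f}(lambda): the expected prediction
    # drop, conditioned on pixel lambda being masked (M(lambda) = 0).
    h, w = img.shape
    base = model(img)
    drop_sum = np.zeros((h, w))
    times_masked = np.zeros((h, w))
    for _ in range(n_masks):
        m = (rng.random((h, w)) < p1).astype(img.dtype)  # 1 = keep, 0 = remove
        drop = base - model(img * m)
        removed = 1.0 - m
        drop_sum += drop * removed      # credit the drop to the removed pixels
        times_masked += removed
    return drop_sum / np.maximum(times_masked, 1.0)

img = rng.random((8, 8)) + 0.1          # strictly positive intensities
sal = necessity_map(toy_model, img)
```

Because the toy model only reads the 2x2 patch, the pixels inside that patch receive the highest necessity scores.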
| Aspect | RISE (Sufficiency) | SHAPE (Necessity) |
|---|---|---|
| Masks | Shows visible pixels | Shows masked pixels |
| Prediction | On masked image | Baseline - masked |
| Aggregation | Weighted by masks | Weighted by inverted masks |
| Normalization | By p1 | By (1 - p1) |
| Interpretation | "Can these pixels drive prediction?" | "Are these pixels necessary?" |
# Clone repository
git clone https://github.com/yourusername/SHAPE-adversarial-explanations.git
cd SHAPE-adversarial-explanations
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

import torch
from torchvision import models
from src.shape import SHAPE
from PIL import Image
# Load pre-trained model
model = models.resnet50(pretrained=True).cuda().eval()
# Initialize SHAPE
explainer = SHAPE(model, input_size=(224, 224), gpu_batch=100)
# Generate or load masks
explainer.generate_masks(N=4000, s=8, p1=0.5, savepath='masks/shape_masks.npy')
# Load and preprocess image
image = Image.open('examples/dog.jpg')
input_tensor = preprocess_image(image).cuda()
# Generate SHAPE explanation
saliency_maps = explainer(input_tensor)
# Get saliency for predicted class
predicted_class = model(input_tensor).argmax().item()
saliency_map = saliency_maps[predicted_class].cpu().numpy()
# Visualize
visualize_saliency(image, saliency_map, save_path='results/dog_shape.png')

# Process ImageNet validation set with multiple models
python shape_batch_processor.py \
--input-dir /path/to/imagenet/val \
--output-dir shape_outputs \
--models resnet50 vgg16 densenet161 \
--p1-values 0.1 0.3 0.5 0.8 \
--N 4000 \
--gpu-batch 400

| Method | ResNet50 Insertion ↑ | ResNet50 Deletion ↓ | ResNet101 Insertion ↑ | ResNet101 Deletion ↓ |
|---|---|---|---|---|
| GradCAM | 0.684 | 0.223 | 0.687 | 0.174 |
| GradCAM++ | 0.712 | 0.209 | 0.701 | 0.166 |
| RISE | 0.769 | 0.091 | 0.788 | 0.150 |
| SHAPE (Ours) | 0.771 | 0.104 | 0.844 | 0.088 |
↑ Higher is better | ↓ Lower is better
SHAPE-adversarial-explanations/
│
├── src/
│   ├── shape.py                  # Core SHAPE implementation
│   ├── rise.py                   # RISE baseline for comparison
│   ├── evaluation.py             # Insertion & deletion metrics
│   └── visualization.py          # Visualization utilities
│
├── examples/
│   ├── basic_usage.py            # Simple example
│   ├── compare_methods.py        # Compare SHAPE vs RISE vs GradCAM
│   └── evaluate_metrics.py       # Reproduce paper results
│
├── shape_batch_processor.py      # Batch processing script
├── requirements.txt              # Python dependencies
├── README.md                     # This file
│
├── docs/
│   ├── methodology.md            # Detailed explanation
│   ├── api_reference.md          # API documentation
│   └── faq.md                    # Frequently asked questions
│
├── images/
│   └── (visualization outputs)
│
└── results/
    └── (experimental outputs)
class SHAPE(nn.Module):
    """
    SHifted Adversaries using Pixel Elimination

    Generates adversarial explanations based on necessity.
    Only 5 lines different from the RISE implementation!
    """
    def __init__(self, model, input_size, gpu_batch=100):
        super(SHAPE, self).__init__()
        self.model = model
        self.input_size = input_size
        self.gpu_batch = gpu_batch

    def forward(self, x):
        # Get baseline prediction
        baseline_pred = self.model(x)
        # Apply masks and get predictions
        masked_preds = ...  # predictions on masked images
        # Compute prediction DROP
        pred_drop = baseline_pred - masked_preds
        # Use INVERTED masks to credit MASKED pixels
        inverted_masks = 1.0 - self.masks
        # Aggregate and normalize (N masks, keep-probability p1)
        saliency = torch.matmul(pred_drop, inverted_masks) / N / (1 - p1)
        return saliency

def insertion_game(model, image, saliency_map, steps=100):
    """
    Insertion Game: add pixels in order of importance.
    Higher AUC = better explanation.
    """
    # Start with a blank image
    # Progressively add the most important pixels
    # Record the prediction probability curve
    # Return the AUC

def deletion_game(model, image, saliency_map, steps=100):
    """
    Deletion Game: remove pixels in order of importance.
    Lower AUC = better explanation.
    """
    # Start with the full image
    # Progressively remove the most important pixels
    # Record the prediction probability curve
    # Return the AUC

"Our adversarial technique outperforms existing visual explanations in these evaluation metrics. Hence, our work motivates the need for devising objective explanatory evaluation techniques that match subjective human opinions."
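The two games sketched above can be made concrete for a scalar-output model on a single-channel image. A minimal illustration, not the repository's `evaluation.py`: `deletion_auc` and `insertion_auc` are hypothetical names, and the mean of the prediction curve stands in for a normalized AUC.

```python
import numpy as np

def deletion_auc(model, img, saliency, steps=20):
    # Deletion game: zero out pixels from most to least important
    # and track the prediction; a lower curve (AUC) is better.
    order = np.argsort(saliency.ravel())[::-1]   # most important first
    work = img.ravel().copy()
    curve = [model(work.reshape(img.shape))]
    chunk = max(1, order.size // steps)
    for i in range(0, order.size, chunk):
        work[order[i:i + chunk]] = 0.0           # remove the next chunk
        curve.append(model(work.reshape(img.shape)))
    return float(np.mean(curve))                 # normalized AUC

def insertion_auc(model, img, saliency, steps=20):
    # Insertion game: reveal pixels from most to least important
    # onto a blank canvas; a higher curve (AUC) is better.
    order = np.argsort(saliency.ravel())[::-1]
    flat, canvas = img.ravel(), np.zeros(img.size)
    curve = [model(canvas.reshape(img.shape))]
    chunk = max(1, order.size // steps)
    for i in range(0, order.size, chunk):
        canvas[order[i:i + chunk]] = flat[order[i:i + chunk]]
        curve.append(model(canvas.reshape(img.shape)))
    return float(np.mean(curve))
```

With a toy mean-intensity model, a saliency map equal to the pixel values scores near-perfectly on both games, while the reversed ranking scores worse on both.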
- Mathematically grounded in causal necessity
- Model-faithful to prediction behavior
- Optimizes directly for what metrics measure
- Highlights non-semantic regions
- Produces scattered importance maps
- Not interpretable by humans
- Does NOT help understand model decisions
Causal features ≠ Pixel patterns
The paper argues:
"While predictive features may suffice for accurate predictions in normal circumstances, causal features are indispensable for maintaining accuracy even in the face of corruptions. Consequently even minor alterations can trigger substantial shifts in the representation."
Both methods use the same random masking framework, but differ in:
| Line | RISE | SHAPE |
|---|---|---|
| 1 | No baseline | Get baseline prediction |
| 2 | Use predictions directly | Compute prediction DROP |
| 3 | Use original masks | Invert masks (1 - mask) |
| 4 | Weight by masks | Weight by inverted masks |
| 5 | Normalize by p1 | Normalize by (1 - p1) |
RISE asks: "What pixels are sufficient to produce this prediction?"
- Shows visible pixels → measures sufficiency
- High importance = prediction maintained when shown
SHAPE asks: "What pixels are necessary for this prediction?"
- Shows masked pixels → measures necessity
- High importance = prediction drops when removed
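The five-line difference between the two estimators can be written out schematically in numpy. All names and values below are stand-ins for illustration, not the repository's API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: N random binary masks over a flattened H x W grid, and the
# model's scores on each masked image (values are illustrative only).
N, H, W, p1 = 2000, 4, 4, 0.5
masks = (rng.random((N, H * W)) < p1).astype(float)  # 1 = pixel visible
masked_preds = rng.random(N)                         # f(I * M) stand-ins
baseline = 0.9                                       # f(I) stand-in

# RISE: weight the raw masked predictions by the *visible* pixels.
rise_map = (masked_preds @ masks).reshape(H, W) / (N * p1)

# SHAPE: weight the prediction *drop* by the *removed* pixels.
pred_drop = baseline - masked_preds
shape_map = (pred_drop @ (1.0 - masks)).reshape(H, W) / (N * (1.0 - p1))
```

With scores independent of the masks, both maps are roughly flat: `rise_map` approximates the mean masked prediction and `shape_map` approximates the baseline minus it, so the two maps sum to about the baseline at every pixel.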
python examples/evaluate_metrics.py \
--dataset imagenet \
--models resnet50 resnet101 vgg16 \
--methods gradcam gradcam++ rise shape \
--n-samples 1000 \
--output results/table1.csv

python examples/reproduce_figure1.py \
--image examples/rooster.jpg \
--model resnet50 \
--output results/figure1/

python examples/reproduce_figure3.py \
--image examples/bull_mastiff.jpg \
--model resnet101 \
--classes "bull_mastiff" "tiger_cat" "tabby" \
--output results/figure3/

python examples/reproduce_figure4.py \
--image examples/shark.jpg \
--model resnet50 \
--methods gradcam gradcam++ rise shape \
--output results/figure4/

Prithwijit Chowdhury - pchowdhury6@gatech.edu
This project is licensed under the MIT License - see the LICENSE file for details.
If you use SHAPE in your research, please cite:
@inproceedings{chowdhury2024shape,
title={Are Objective Explanatory Evaluation Metrics Trustworthy? An Adversarial Analysis},
author={Chowdhury, Prithwijit and Prabhushankar, Mohit and AlRegib, Ghassan and Deriche, Mohamed},
booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
pages={3938--3944},
year={2024},
organization={IEEE},
doi={10.1109/ICIP51287.2024.10647779}
}