Methods

Integrated Gradients

Attribute a model's prediction to its input features by integrating gradients along a path from a baseline input to the actual input.
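
As a rough illustration of the idea, independent of the linked notebook, the sketch below approximates integrated gradients for a toy logistic model with a midpoint Riemann sum along the straight line from a baseline to the input. The model, its weights, the all-zeros baseline, and the names `model`, `model_grad`, and `integrated_gradients` are assumptions made for this example; they are not taken from the notebook or from any library.

```python
import math

# Hypothetical toy model (an assumption for illustration only):
# logistic regression with fixed weights and bias.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = 0.1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x):
    """Scalar output of the toy model for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS)


def model_grad(x):
    """Analytic gradient of the toy model's output with respect to x."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    Gradients are averaged at `steps` points on the straight path from
    `baseline` to `x`, then scaled by (x - baseline) per feature.
    """
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [bi + alpha * d for bi, d in zip(baseline, diff)]
        grad = model_grad(point)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    return [d * a for d, a in zip(diff, avg_grad)]


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed baseline: the all-zeros input
attributions = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to model(x) - model(baseline),
# up to the numerical integration error.
completeness_gap = abs(sum(attributions) - (model(x) - model(baseline)))
```

The completeness check is a convenient sanity test: if `completeness_gap` is not close to zero, the number of steps is probably too small or the gradient is wrong. With a real network, the analytic gradient would be replaced by automatic differentiation.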

notebooks/methods/integrated-gradients.ipynb

Steering Vectors

Modify model behavior by adding vectors to intermediate activations.
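
To make the mechanism concrete, here is a minimal sketch on a toy two-layer network: a vector is added to the hidden activations before the output layer is applied. Everything in it is an assumption for illustration, including the weights, the `forward` and `hidden` helpers, the `scale` parameter, and the choice to build the vector as a difference of mean activations between two example sets (one common recipe, not necessarily the one the notebook uses).

```python
import math

# Hypothetical toy network (an assumption for illustration only):
# 2 inputs -> 3 hidden units (tanh) -> 1 linear output.
W1 = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9]]
W2 = [1.0, -1.0, 0.5]


def hidden(x):
    """Intermediate activations of the toy network."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]


def forward(x, steering=None, scale=1.0):
    """Run the network, optionally adding scale * steering to the hidden layer."""
    h = hidden(x)
    if steering is not None:
        h = [hi + scale * si for hi, si in zip(h, steering)]
    return sum(w * hi for w, hi in zip(W2, h))


def mean_vec(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


# Assumed recipe: steering vector = mean activation on "positive" examples
# minus mean activation on "negative" examples.
positive = [[1.0, 0.5], [0.8, 1.0]]
negative = [[-1.0, -0.5], [-0.8, -1.0]]
steering_vector = [
    p - n
    for p, n in zip(
        mean_vec([hidden(x) for x in positive]),
        mean_vec([hidden(x) for x in negative]),
    )
]

x = [0.1, -0.2]
baseline_out = forward(x)
steered_out = forward(x, steering_vector, scale=2.0)
```

Because the output layer here is linear, the shift in the output equals `scale` times the dot product of `W2` and `steering_vector`. In a deep network the effect is generally nonlinear, and the layer and scale are typically chosen empirically.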

notebooks/methods/steering-vectors.ipynb