Methods
Integrated Gradients
Attribute a model's output to its input features using integrated gradients.
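As a rough illustration of the idea, the sketch below approximates integrated gradients with a midpoint Riemann sum along the straight line from a baseline to the input. The names `integrated_gradients`, `model`, and `model_grad` are hypothetical and are not taken from this library's API. The toy model and its hand-written gradient are assumptions chosen so the result can be checked by hand.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> List[float]:
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn is a hypothetical callable returning the gradient of the
    model output with respect to its input at a given point.
    """
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        # Midpoint of the k-th sub-interval of [0, 1].
        alpha = (k + 0.5) / steps
        point = [bi + alpha * di for bi, di in zip(baseline, diff)]
        for i, g in enumerate(grad_fn(point)):
            grad_sum[i] += g
    # Average gradient along the path, scaled by (input - baseline).
    return [di * gs / steps for di, gs in zip(diff, grad_sum)]


# Toy model (an assumption for illustration): F(x) = x0^2 + 3 * x0 * x1
def model(x: Sequence[float]) -> float:
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def model_grad(x: Sequence[float]) -> List[float]:
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(model_grad, x, baseline)

# Completeness check: attributions sum to F(x) - F(baseline).
output_delta = model(x) - model(baseline)
```

For this toy model the attributions sum to `model(x) - model(baseline)`, which is the completeness property usually used as a sanity check. With a real network, `grad_fn` would typically come from an autodiff framework rather than a hand-written derivative.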
Steering Vectors
Modify model behavior by adding vectors to intermediate activations.
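The following is a minimal sketch of the mechanism, assuming a tiny two-layer network written in plain Python. The functions `forward` and `hidden_activations`, the weights, and the choice to build the steering vector as the difference between activations on two contrasting inputs are illustrative assumptions, not this library's API.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def matvec(W: Matrix, v: Sequence[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def hidden_activations(x: Sequence[float], W1: Matrix) -> List[float]:
    # First layer followed by ReLU; this is the intermediate activation.
    return [max(0.0, h) for h in matvec(W1, x)]


def forward(
    x: Sequence[float],
    W1: Matrix,
    W2: Matrix,
    steering: Optional[Sequence[float]] = None,
    scale: float = 1.0,
) -> List[float]:
    hidden = hidden_activations(x, W1)
    if steering is not None:
        # Steering: add a (scaled) vector to the intermediate activation.
        hidden = [h + scale * s for h, s in zip(hidden, steering)]
    return matvec(W2, hidden)


# Hypothetical weights chosen so the arithmetic is easy to follow.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, -1.0]]

# One common recipe (an assumption here): take the difference between
# activations on two contrasting inputs.
pos = hidden_activations([2.0, 0.0], W1)
neg = hidden_activations([0.0, 2.0], W1)
steering_vector = [p - n for p, n in zip(pos, neg)]

x = [1.0, 1.0]
plain_output = forward(x, W1, W2)
steered_output = forward(x, W1, W2, steering=steering_vector, scale=1.0)
```

Here the unsteered output is `[0.0]` and the steered output is `[4.0]`, so the added vector shifts the result toward the "positive" input. In a real model the same addition would usually be applied through a hook on a chosen layer, and `scale` controls how strongly the behavior is pushed.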