pyvene.models.interventions

Classes

`AdditionIntervention`(**kwargs)	Intervene on the original representations with activation addition.
`AutoencoderIntervention`(**kwargs)	Intervene in the latent space of an autoencoder.
`BasisAgnosticIntervention`(**kwargs)	Intervention that will modify its basis in an uncontrolled manner.
`BoundlessRotatedSpaceIntervention`(**kwargs)	Intervention in the rotated space with boundary mask.
`CollectIntervention`(**kwargs)	Collect activations.
`ConstantSourceIntervention`(**kwargs)	Constant source.
`DistributedRepresentationIntervention`(**kwargs)	Distributed representation.
`Intervention`(**kwargs)	Intervene on the original representations.
`InterventionOutput`([output, latent])	Output of the IntervenableModel, including original outputs, intervened outputs, and collected activations.
`JumpReLUAutoencoderIntervention`(**kwargs)	Interchange intervention on JumpReLU SAE's latent subspaces.
`LocalistRepresentationIntervention`(**kwargs)	Localist representation.
`LowRankRotatedSpaceIntervention`(**kwargs)	Intervention in the rotated space.
`NoiseIntervention`(**kwargs)	Noise intervention.
`PCARotatedSpaceIntervention`(**kwargs)	Intervention in the PCA space.
`RotatedSpaceIntervention`(**kwargs)	Intervention in the rotated space.
`SharedWeightsTrainableIntervention`(**kwargs)	Intervene on the original representations.
`SigmoidMaskIntervention`(**kwargs)	Intervention in the original basis with binary mask.
`SigmoidMaskRotatedSpaceIntervention`(**kwargs)	Intervention in the rotated space with boundary mask.
`SkipIntervention`(**kwargs)	Skip the current intervening layer's computation in the hook function.
`SourcelessIntervention`(**kwargs)	No source.
`SubtractionIntervention`(**kwargs)	Intervene on the original representations with activation subtraction.
`TrainableIntervention`(**kwargs)	Intervene on the original representations.
`VanillaIntervention`(**kwargs)	Intervene on the original representations.
`ZeroIntervention`(**kwargs)	Zero-out activations.
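
The one-line summaries above are terse, so the following sketch illustrates what four of the simpler interventions are commonly understood to compute on a single activation vector. It is not pyvene code and calls nothing from the library: the function names `vanilla`, `addition`, `subtraction`, and `zero` are hypothetical stand-ins, plain Python lists stand in for tensors, and the assumed semantics (replace, add, subtract, zero out) are read off the summaries rather than taken from the implementation.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical stand-ins, NOT pyvene classes. Each takes a base activation
# and a source activation and returns the intervened activation.

def vanilla(base: Vector, source: Vector) -> Vector:
    # Assumed interchange semantics: the source replaces the base.
    return list(source)

def addition(base: Vector, source: Vector) -> Vector:
    # Assumed activation addition: base + source, elementwise.
    return [b + s for b, s in zip(base, source)]

def subtraction(base: Vector, source: Vector) -> Vector:
    # Assumed activation subtraction: base - source, elementwise.
    return [b - s for b, s in zip(base, source)]

def zero(base: Vector, source: Vector) -> Vector:
    # Zero-out: the source is ignored and the activation becomes all zeros.
    return [0.0 for _ in base]

INTERVENTIONS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "vanilla": vanilla,
    "addition": addition,
    "subtraction": subtraction,
    "zero": zero,
}

base = [1.0, 2.0, 3.0]
source = [0.5, 0.5, 0.5]

# Apply every sketch intervention to the same base/source pair.
results = {name: fn(base, source) for name, fn in INTERVENTIONS.items()}

for name, value in results.items():
    print(f"{name:12s} {value}")
```

In the library itself, interventions are classes that operate on tensors inside model hooks, so this sketch conveys only the arithmetic and none of the hook or subspace handling.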