Single sparse autoencoders (SAEs) are each trained on the residual stream activation vectors from every transformer layer simultaneously, including the transformers.
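
To make the training setup concrete, here is a minimal sketch of what "every layer simultaneously" can mean in practice: activation vectors from all layers are pooled into one dataset, and one set of SAE weights is applied to each vector regardless of the layer it came from. Everything here is an illustrative assumption rather than the source's implementation. The helper names (`stack_layers`, `topk_sae_forward`), the toy dimensions, the random weights, and the choice of a top-k activation are all hypothetical, and no training loop is shown.

```python
import random

random.seed(0)

# Toy sizes (assumptions for illustration only).
n_layers, n_tokens, d_model, n_latents, k = 3, 4, 5, 8, 2

# acts[layer][token] is a d_model-dimensional residual stream vector.
# Random stand-ins here; in practice these would be cached from a transformer.
acts = [[[random.gauss(0, 1) for _ in range(d_model)]
         for _ in range(n_tokens)]
        for _ in range(n_layers)]


def stack_layers(acts):
    """Pool vectors from all layers into one list of (layer, vector) rows."""
    return [(layer, vec)
            for layer, layer_acts in enumerate(acts)
            for vec in layer_acts]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """One forward pass of a top-k SAE: encode, keep k largest, decode."""
    centered = [xi - bd for xi, bd in zip(x, b_dec)]
    pre = [p + b for p, b in zip(matvec(W_enc, centered), b_enc)]
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    latents = [0.0] * len(pre)
    for i in keep:
        latents[i] = max(pre[i], 0.0)  # ReLU on the surviving latents
    recon = [bd + sum(W_dec[j][i] * latents[i] for i in keep)
             for j, bd in enumerate(b_dec)]
    return latents, recon


# One shared set of SAE parameters for all layers (randomly initialised).
W_enc = [[random.gauss(0, 0.1) for _ in range(d_model)] for _ in range(n_latents)]
W_dec = [[random.gauss(0, 0.1) for _ in range(n_latents)] for _ in range(d_model)]
b_enc = [0.0] * n_latents
b_dec = [0.0] * d_model

rows = stack_layers(acts)  # n_layers * n_tokens rows in total
results = [(layer, *topk_sae_forward(vec, W_enc, b_enc, W_dec, b_dec, k))
           for layer, vec in rows]
```

Keeping the layer index alongside each row is a design choice of this sketch: the SAE itself never sees it, but it lets you later ask at which layers a given latent is active.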