Layer-wise removal. Removing from layer i (the rows) and testing probing performance on layer j (the columns). Top row (3a) is non-masked version, bottom row (3b) is masked.
This site uses cookies. By continuing to use our website, you are agreeing to our privacy policy. No content on this site may be used to train artificial intelligence systems without permission in writing from the MIT Press.