Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models Paper • 2301.04213 • Published Jan 10, 2023
Post Hoc Explanations of Language Models Can Improve Language Models Paper • 2305.11426 • Published May 19, 2023
Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models Paper • 2401.06102 • Published Jan 11, 2024