Bartosz Cywiński
bcywinski
AI & ML interests
Mechanistic Interpretability
Recent Activity
Updated a collection 1 day ago: Llama-3.1-8B-Instruct-taboo
Updated a collection 1 day ago: Eliciting Secret Knowledge from Language Models
Updated a model 10 days ago: bcywinski/gemma-2-9b-it-occupation-doctor
Organizations
None yet
Eliciting Secret Knowledge from Language Models
https://arxiv.org/abs/2510.01070
gemma-2-9b-it-user-gender