I am a Ph.D. student in Theoretical Data Science and Scientific Computing at SISSA and at the Laboratory of Data Engineering (LADE) at AREA Science Park, under the supervision of Alberto Cazzaniga and Alessandro Laio. Broadly, I am interested in the interpretability of transformer models and its applications to AI safety. More specifically, my research focuses on the intersection of causally inspired methods (such as mechanistic interpretability) and geometrical tools applied to unsupervised learning. I collect notes and random thoughts in a digital garden. I’m currently living in Trieste, a beautiful city on the sea in the north-east of Italy. You can reach me at: aserra[at]sissa.it.
NeurIPS