University of Southern Denmark (SDU)

university

https://www.sdu.dk/en

Recent Activity

filo362 submitted a paper 3 days ago

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

giannor submitted a paper 8 days ago

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

giannor submitted a paper 8 days ago

BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

SDU-Denmark's papers (4)

Submitted by Filippo Tonini

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

Submitted by Gianluca Barmina

BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

Submitted by Gianluca Barmina

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Submitted by Gianluca Barmina

LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs
