arxiv:2310.08164
Abdullah
amirali1985
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Recent Activity
updated a dataset about 4 hours ago
amirali1985/high-temp-refusal-probe-artifacts
published a dataset about 14 hours ago
amirali1985/high-temp-refusal-probe-artifacts