Nicholas Crispino

ncrispino

AI & ML interests

None yet

Recent Activity

updated a dataset 4 days ago

ncrispino/tool-call-steering-data

published a dataset 4 days ago

ncrispino/tool-call-steering-data

updated a dataset 13 days ago

WangResearchLab/SteeringSafety

upvoted 2 papers about 2 months ago

Budget-aware Test-time Scaling via Discriminative Verification

Paper • 2510.14913 • Published Oct 16 • 4

Predicting Task Performance with Context-aware Scaling Laws

Paper • 2510.14919 • Published Oct 16 • 3

upvoted 2 papers 3 months ago

COSMIC: Generalized Refusal Direction Identification in LLM Activations

Paper • 2506.00085 • Published May 30 • 2

RepIt: Representing Isolated Targets to Steer Language Models

Paper • 2509.13281 • Published Sep 16 • 4

upvoted a collection 3 months ago

SteeringSafety

A benchmark for evaluating effectiveness and entanglement in representation steering across seven safety-relevant perspectives • 2 items • Updated Oct 20 • 1

upvoted a paper 3 months ago

SteeringControl: Holistic Evaluation of Alignment Steering in LLMs

Paper • 2509.13450 • Published Sep 16 • 7

upvoted a paper about 1 year ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 48