Lu Zhang

kaitou951 · https://zl9501.github.io/

AI & ML interests

None yet

Organizations

None yet

Collections 3

A Survey on LLM-as-a-Judge

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
Paper • 2406.12624 • Published Jun 18, 2024 • 37

A Survey on LLM-as-a-Judge
Paper • 2411.15594 • Published Nov 23, 2024

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
Paper • 2412.05579 • Published Dec 7, 2024 • 2

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Paper • 2411.16594 • Published Nov 25, 2024 • 41

Robust Multimodal Large Language Models Against Modality Conflict
Paper • 2507.07151 • Published Jul 9 • 5

One Token to Fool LLM-as-a-Judge
Paper • 2507.08794 • Published Jul 11 • 31

Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published Jul 2 • 106

KV Cache Steering for Inducing Reasoning in Small Language Models
Paper • 2507.08799 • Published Jul 11 • 40

models 0

None public yet

datasets 0

None public yet
