- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 23
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 1
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2
Collections
Collections including paper arxiv:2303.08896
- Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
  Paper • 2411.14257 • Published • 14
- Distinguishing Ignorance from Error in LLM Hallucinations
  Paper • 2410.22071 • Published
- DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
  Paper • 2410.18860 • Published • 11
- MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
  Paper • 2410.11779 • Published • 26

- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- miracl/miracl-corpus
  Viewer • Updated • 77.2M • 3.53k • 47
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 37
- How is ChatGPT's behavior changing over time?
  Paper • 2307.09009 • Published • 24

- RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation
  Paper • 2509.16198 • Published • 126
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
  Paper • 2509.16941 • Published • 20
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
  Paper • 2303.08896 • Published • 4
- From Local to Global: A Graph RAG Approach to Query-Focused Summarization
  Paper • 2404.16130 • Published • 6

- Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation
  Paper • 2208.05309 • Published • 1
- LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models
  Paper • 2305.13711 • Published • 2
- Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
  Paper • 2302.09664 • Published • 4
- BARTScore: Evaluating Generated Text as Text Generation
  Paper • 2106.11520 • Published • 2