CelesteChen's Collections

multimodal

updated 7 days ago

  • DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

    Paper • 2510.15110 • Published 10 days ago • 15

  • PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

    Paper • 2510.14528 • Published 11 days ago • 67

  • Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

    Paper • 2510.13795 • Published 11 days ago • 49

  • UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

    Paper • 2510.13515 • Published 12 days ago • 11

  • SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

    Paper • 2510.12709 • Published 13 days ago • 10

  • HoneyBee: Data Recipes for Vision-Language Reasoners

    Paper • 2510.12225 • Published 13 days ago • 9