Tran Nam

namtran

AI & ML interests

None yet

Recent Activity

upvoted an article 13 days ago

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

upvoted a paper 14 days ago

Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks

upvoted a paper 17 days ago

Scaling Spatial Intelligence with Multimodal Foundation Models

Organizations

None yet

upvoted an article 13 days ago

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Article • Oct 23

upvoted a paper 14 days ago

Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks

Paper • 2509.24473 • Published Sep 29 • 17

upvoted 2 papers 17 days ago

Scaling Spatial Intelligence with Multimodal Foundation Models

Paper • 2511.13719 • Published 21 days ago • 44

Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs

Paper • 2511.16664 • Published 18 days ago • 24

upvoted 3 papers 26 days ago

upvoted a paper 27 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 30 days ago • 128

upvoted a paper 4 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 263

upvoted an article 7 months ago

Vision Language Models (Better, faster, stronger)

Article • May 12 • 568

upvoted an article 8 months ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • Feb 20 • 315

upvoted 2 papers about 1 year ago

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published Oct 17, 2024 • 98

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 74

upvoted a paper over 1 year ago

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 62

upvoted 3 papers almost 2 years ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 626

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21, 2024 • 116

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 260
