oh sehun

sehun

AI & ML interests

None yet

Recent Activity

upvoted a paper about 13 hours ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 2 days ago • 40

upvoted a paper about 14 hours ago

LongVideoAgent: Multi-Agent Reasoning with Long Videos

Paper • 2512.20618 • Published 2 days ago • 44

upvoted a paper about 20 hours ago

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Paper • 2512.17206 • Published 7 days ago • 15

upvoted 3 papers 1 day ago

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published 7 days ago • 17

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 3 days ago • 7

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 3 days ago • 57

upvoted 2 papers 2 days ago

QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Paper • 2512.19134 • Published 4 days ago • 29

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published 3 days ago • 26

upvoted 5 papers 3 days ago

Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

Paper • 2512.17351 • Published 7 days ago • 20

Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published 7 days ago • 77

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Paper • 2512.17012 • Published 7 days ago • 40

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Paper • 2512.17909 • Published 6 days ago • 35

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Paper • 2512.17008 • Published 7 days ago • 10

upvoted a paper 4 days ago

Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification

Paper • 2512.16921 • Published 7 days ago • 7

upvoted an article 6 days ago

LLM based Audio models

Article • 8 days ago

upvoted 2 papers 6 days ago

Kling-Omni Technical Report

Paper • 2512.16776 • Published 7 days ago • 155

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Paper • 2512.14052 • Published 10 days ago • 39

upvoted 3 papers 7 days ago

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published 8 days ago • 17

End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

Paper • 2512.15702 • Published 8 days ago • 14

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

Paper • 2512.14681 • Published 9 days ago • 39
