Han-Bit Kang

hbkang

AI & ML interests

ML

Recent Activity

updated a collection about 13 hours ago

liked a dataset 7 days ago

tabtoyou/KoLLaVA-v1.5-Instruct-581k

Organizations

None yet

upvoted a paper about 13 hours ago

Adversarial Flow Models

Paper • 2511.22475 • Published 4 days ago • 14

upvoted 2 papers 14 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 19 days ago • 67

PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16 • 98

upvoted 2 papers 21 days ago

Optimized Table Tokenization for Table Structure Recognition

Paper • 2305.03393 • Published May 5, 2023 • 1

PubTables-1M: Towards comprehensive table extraction from unstructured documents

Paper • 2110.00061 • Published Sep 30, 2021 • 3

upvoted 3 papers about 1 month ago

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Paper • 2505.20161 • Published May 26 • 1

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20 • 67

Image-GS: Content-Adaptive Image Representation via 2D Gaussians

Paper • 2407.01866 • Published Jul 2, 2024 • 1

upvoted a paper about 2 months ago

BitNet Distillation

Paper • 2510.13998 • Published Oct 15 • 53

upvoted 2 papers 2 months ago

Sequential Diffusion Language Models

Paper • 2509.24007 • Published Sep 28 • 44

2D Gaussian Splatting with Semantic Alignment for Image Inpainting

Paper • 2509.01964 • Published Sep 2 • 6

upvoted 2 papers 3 months ago

Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30 • 17

Representing Speech Through Autoregressive Prediction of Cochlear Tokens

Paper • 2508.11598 • Published Aug 15 • 17

upvoted 2 papers 4 months ago

StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation

Paper • 2508.11203 • Published Aug 15 • 10

HPSv3: Towards Wide-Spectrum Human Preference Score

Paper • 2508.03789 • Published Aug 5 • 19

upvoted 5 papers 5 months ago

FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

Paper • 2507.12956 • Published Jul 17 • 24

FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Paper • 2507.12720 • Published Jul 17 • 9

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation

Paper • 2507.09862 • Published Jul 14 • 49

Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

Paper • 2506.19852 • Published Jun 24 • 41

Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images

Paper • 2506.22960 • Published Jun 28 • 6