Multilingual CLIP Collection Models that can be used for multilingual Smart Search in Immich. See https://immich.app/docs/features/searching/#clip-models for more info. • 6 items • Updated Mar 31 • 15
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M params to 1B. See the paper at https://arxiv.org/abs/250 • 32 items • Updated Jul 16 • 24
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11 • 79
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 268
Running • 3.35k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
Article Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers? By Kseniase and 1 other • Apr 4 • 15
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24 • 119