Kyoung-Rok Jang

joseph2324

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

BAAI/bge-m3

liked a model about 1 month ago

answerdotai/ModernBERT-base

Organizations

None yet

upvoted a collection about 2 months ago

EmbeddingGemma

3 items • Updated Sep 11 • 93

upvoted 2 collections 3 months ago

Multilingual CLIP

Models that can be used for multilingual Smart Search in Immich. See https://immich.app/docs/features/searching/#clip-models for more info. • 6 items • Updated Mar 31 • 15

Encoders vs Decoders: the Ettin Suite

A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper at https://arxiv.org/abs/250 • 32 items • Updated Jul 16 • 24

upvoted an article 3 months ago

Article

Seq vs Seq: the Ettin Suite of Paired Encoders and Decoders

Jul 16 • 74

upvoted 2 papers 4 months ago

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11 • 79

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 268

upvoted a collection 5 months ago

Qwen3-Reranker

3 items • Updated Jul 21 • 64

upvoted 2 articles 5 months ago

Article

KV Cache from scratch in nanoVLM

Jun 4 • 98

Article

Vision Language Models (Better, Faster, Stronger)

May 12 • 557

upvoted an article 7 months ago

Article

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

Apr 4 • 15

upvoted 2 papers 7 months ago

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119

Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published Mar 21 • 36

upvoted a paper 9 months ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 153

upvoted 2 articles 10 months ago

Article

Introducing smolagents: simple agents that write actions in code.

Dec 31, 2024 • 1.14k

Article

🌁#81: Key AI Concepts to Follow in 2025

Dec 23, 2024 • 24

upvoted a paper 10 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

upvoted an article 11 months ago

Article

Releasing the largest multilingual open pretraining dataset

Nov 13, 2024 • 104

upvoted an article about 1 year ago

Article

Scaling AI-based Data Processing with Hugging Face + Dask

Oct 9, 2024 • 32

upvoted an article over 1 year ago

Article

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

Mar 22, 2024 • 104

upvoted a collection over 1 year ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 862