flowpoint
This post is a good and valid reminder of the need for good science around tokenization.
However, my dislike of tokenizers stems more from their practical implications.
Tokenizers:
- Are another software component that can and does go wrong.
- Are rarely finetuned, and doing so is more problematic than finetuning the model weights.
- Mostly don't run on GPU/TPU.
...
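As an illustration of the first point, even a trivial tokenizer can silently lose information. A hypothetical minimal sketch (toy whitespace tokenizer, not any specific library):

```python
# Toy whitespace tokenizer: a hypothetical sketch of how tokenization
# can silently go wrong (here: the round trip is not lossless).
def tokenize(text: str) -> list[str]:
    return text.split()

def detokenize(tokens: list[str]) -> str:
    return " ".join(tokens)

original = "two  spaces\tand a tab"
roundtrip = detokenize(tokenize(original))
print(roundtrip == original)  # False: whitespace detail is lost
```

Real subword tokenizers are more careful than this, but normalization, special tokens, and detokenization rules are all extra places where such mismatches creep in.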
Many of these are solvable implementation problems, but the bitter lesson would imply that we should instead train/search/learn tokenization inside the networks themselves.
The increased cost of doing so can be mitigated within the network architecture and with performance optimizations.
Performance and interpretability are strong points for tokenizers, but they trade off against those implementation problems and possibly lower model quality.
Additionally, it's fair to call a model tokenizer-free when no specific component of the software is actually a tokenizer; a bare str.encode hardly deserves the name.
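For comparison, the str.encode case really is that trivial — a minimal sketch of byte-level input, where UTF-8 bytes serve directly as token IDs:

```python
# Byte-level "tokenization": UTF-8 bytes as token IDs, vocab size 256,
# no separate tokenizer component anywhere in the pipeline.
text = "héllo"
ids = list(text.encode("utf-8"))    # one ID per byte
decoded = bytes(ids).decode("utf-8")
print(ids)
print(decoded == text)  # True: trivially lossless round trip
```

There is nothing here to version, finetune, or keep in sync with the model, which is exactly why calling such models tokenizer-free seems appropriate.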