Michael Fromm

mfromm

https://fromm-m.github.io/fromm/

AI & ML interests

NLP, LLM, ConvAI

Recent Activity

updated a dataset about 16 hours ago

Eurolingua/DCLM-200-100k-unfiltered

published a dataset about 16 hours ago

Eurolingua/DCLM-200-100k-unfiltered

updated a dataset 17 days ago

Eurolingua/HPLT3-198-500k

upvoted a paper about 1 month ago

Tokenizer Choice For LLM Training: Negligible or Crucial?

Paper • 2310.08754 • Published Oct 12, 2023 • 3

upvoted an article 5 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 732

upvoted a paper 6 months ago

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Paper • 2505.22232 • Published May 28 • 18

upvoted a collection about 1 year ago

EU20-Benchmarks

Evaluation Benchmarks for 20 European languages. • 5 items • Updated Oct 11, 2024 • 9