pascalmusabyimana (pascal-maker)
18 followers · 88 following
https://pascal-maker.github.io/developedbypascalmusabyimana/
X/Twitter: PascalMusabyim1
GitHub: pascal-maker
LinkedIn: pascal-musabyimana-573b66178
AI & ML interests
Computer vision, NLP, machine learning, and deep learning
Recent Activity
Reacted to nouamanetazi's post with 🤗 about 1 hour ago:
After training SmolLM3 on 384 H100s for nearly a month, I've come to realize something most people overlook: infrastructure is the make-or-break factor in LLM training. 🔥

Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious NCCL errors, or when your expensive GPU cluster is running far below peak efficiency, the problem isn't your model. It's most probably a misuse of the hardware. 🛠️

Questions that seemed simple but had no clear answers: Why is MoE training slower than dense models? Which NCCL flags should we actually set? How often should we checkpoint without killing throughput?

That's why we built The Smol Training Playbook: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the infrastructure layer that most teams get wrong.

We validated real vs. theoretical bandwidth across the entire stack: HBM3 hitting 3 TB/s, with NVLink 4.0 and InfiniBand likewise measured against their rated speeds. Then we ran collective operations across 128 GPUs (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce bandwidth drops substantially when going from a single node to 16 nodes.

If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.

The Smol Training Playbook: https://lnkd.in/e5MKXUHS

Shared with ❤️ by the HuggingFace team
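The scaling measurement described in the post is straightforward to probe on your own cluster. Below is a minimal sketch of an all-reduce bandwidth microbenchmark using torch.distributed with the NCCL backend; the 1 GiB tensor size, the iteration counts, and the torchrun launch are illustrative assumptions, not the playbook's actual harness.

```python
import time
import torch
import torch.distributed as dist

# Launched once per GPU (e.g. via torchrun); NCCL uses NVLink within a node
# and InfiniBand across nodes, which is exactly the gap the post measures.
dist.init_process_group(backend="nccl")
rank = dist.get_rank()
world = dist.get_world_size()
torch.cuda.set_device(rank % torch.cuda.device_count())

numel = 256 * 1024 * 1024                      # 1 GiB of fp32 per rank
x = torch.ones(numel, dtype=torch.float32, device="cuda")

for _ in range(5):                             # warmup: let NCCL set up channels
    dist.all_reduce(x)
torch.cuda.synchronize()

iters = 20
start = time.perf_counter()
for _ in range(iters):
    dist.all_reduce(x)
torch.cuda.synchronize()
elapsed = (time.perf_counter() - start) / iters

size_bytes = numel * 4
algbw = size_bytes / elapsed / 1e9             # GB/s of payload per rank
busbw = algbw * 2 * (world - 1) / world        # ring all-reduce bus bandwidth
if rank == 0:
    print(f"{world} ranks: algbw {algbw:.1f} GB/s, busbw {busbw:.1f} GB/s")
dist.destroy_process_group()
```

Run it with something like `torchrun --nproc_per_node=8 bench_allreduce.py` on each node (the filename is hypothetical). The 2(n-1)/n bus-bandwidth correction follows the convention of NCCL's own benchmarks, which is what makes single-node and multi-node numbers comparable on one axis.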
Reacted to Kseniase's post with 🔥 1 day ago:
11 Fascinating new Policy Optimization techniques

Policy optimization (PO) algorithms are central to training AI models with preference-based feedback. In recent weeks, numerous new PO methods have emerged that build on or replace the popular PPO and GRPO, solving their issues. Here are 11 of them:

1. BAlanced Policy Optimization (BAPO) – https://huggingface.co/papers/2510.18927
Dynamically adjusts the clipping bounds in PPO-style updates to balance positive and negative gradients and prevent entropy collapse (a sketch of the underlying clipped objective follows this post)

2. Training-Free GRPO – https://huggingface.co/papers/2510.08191
Instead of using numeric rewards, it compares rollouts semantically to distill useful knowledge as a token prior, which is then applied during inference to guide the model's behavior

3. Asymmetric Importance Sampling Policy Optimization (ASPO) – https://huggingface.co/papers/2510.06062
Fixes imbalanced token weighting in LLM training. It flips the importance sampling ratios for positive tokens to correct over- and under-updates, and adds a soft dual-clipping step to keep gradients stable

4. In-Context Steered Policy Optimization (ICPO) – https://arxiv.org/abs/2510.26519
Uses a model's own in-context learning ability to guide training with existing data. It combines Mixed-Policy GRPO with Implicit Expert Forcing to expand exploration, and adds Expert Region Reject Sampling and Annealed Expert-Bonus Reward Shaping to ensure stability and balanced expert influence

5. Graph-Enhanced Policy Optimization (GEPO) – https://arxiv.org/abs/2510.26270
Builds a graph of an agent's experiences to understand how different states connect, guide exploration, and assign rewards more effectively

6. Information Gain-based Policy Optimization (IGPO) – https://huggingface.co/papers/2510.14967
Uses the model's own belief updates to create dense, informative feedback for smoother multi-turn learning

Read further below ⬇️

If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe
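For context on what these methods are modifying, the sketch below shows the standard PPO-style clipped surrogate loss in PyTorch. The c_low and c_high bounds are the quantities a method like BAPO adjusts dynamically during training; here they are fixed hyperparameters on toy inputs, so this is a baseline illustration, not an implementation of any paper above.

```python
import torch

def clipped_surrogate_loss(logp_new: torch.Tensor,
                           logp_old: torch.Tensor,
                           advantages: torch.Tensor,
                           c_low: float = 0.8,
                           c_high: float = 1.2) -> torch.Tensor:
    """PPO-style clipped objective; BAPO's idea is to move c_low/c_high
    dynamically to rebalance positive vs. negative gradient contributions."""
    ratio = torch.exp(logp_new - logp_old)        # importance sampling ratio
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, c_low, c_high) * advantages
    # Pessimistic bound: take the smaller objective, negate it for a loss.
    return -torch.min(unclipped, clipped).mean()

# Toy usage with random per-token log-probs and advantages:
logp_new = torch.randn(8, requires_grad=True)
loss = clipped_surrogate_loss(logp_new, torch.randn(8), torch.randn(8))
loss.backward()
```

Note how a fixed symmetric clip treats positive and negative advantages identically; that imbalance in practice is what BAPO's dynamic bounds and ASPO's flipped ratios each address in their own way.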
Liked a model 3 days ago: meituan-longcat/LongCat-Flash-Omni
Organizations
Spaces (7)
My Argilla (pinned, paused)
Agentscomparison Dashboard (sleeping) • Display project metrics with real-time updates
Medical VLM with SAM-2 and CheXagent (paused) • A comprehensive medical imaging analysis tool
Medical Imaging Analysis (paused)
medicalaiapp (paused)
luminus (paused)
Models (7)
pascal-maker/unsloth_finetune • Image-to-Text • 9B • Updated 10 days ago • 29
pascal-maker/myemoji-gemma-3-270m-it • Text Generation • 0.4B • Updated 17 days ago • 29
pascal-maker/vit_base_patch16_224.augreg2_in21k_ft_in1k.lora_ft_food101 • Updated Feb 5
pascal-maker/vit_base_patch16_224.augreg2_in21k_ft_in1k.ft_food101 • Updated Feb 5
pascal-maker/qwen2-7b-instruct-trl-sft-ChartQA • Updated Dec 17, 2024
pascal-maker/paligemma_vqav2 • Image-to-Text • 3B • Updated Nov 11, 2024 • 1
pascal-maker/qwen2-7b-instruct-amazon-description • Updated Oct 1, 2024
Datasets (2)
pascal-maker/my-single-image-dataset • Viewer • Updated May 27 • 1 • 9
pascal-maker/classification-ie-optimization • Viewer • Updated Feb 18 • 12