Hugging Face
Gandharv Patil
gp02-mcgill
1 follower · 2 following
gp1702
AI & ML interests
Reinforcement Learning, Stochastic Optimisation, Probabilistic Inference
Organizations
Papers: 1
arxiv:2506.16507
Models: 1
gp02-mcgill/zephyr-7b-dpo-qlora • Updated Jan 8
Datasets: 3
gp02-mcgill/ultrafeedback_binarised_all_max • Updated Jan 31 • 176k • 6
gp02-mcgill/ultrafeedback_binarised_rnd_max • Updated Jan 31 • 60.9k • 4
gp02-mcgill/ultrafeedback_binarised_min_max • Updated Jan 31 • 60.9k • 17