Gandharv Patil

gp02-mcgill · gp1702

AI & ML interests

Reinforcement Learning, Stochastic Optimisation, Probabilistic Inference

Papers 1

arxiv:2506.16507

Models 1

gp02-mcgill/zephyr-7b-dpo-qlora

Updated Jan 8, 2025

Datasets 3

gp02-mcgill/ultrafeedback_binarised_all_max

Updated Jan 31, 2025 • 176k • 13

gp02-mcgill/ultrafeedback_binarised_rnd_max

Updated Jan 31, 2025 • 60.9k • 7

gp02-mcgill/ultrafeedback_binarised_min_max

Updated Jan 31, 2025 • 60.9k • 6