Kernels
optimizer / torch-ext

Commit History

Support param group with various placements (#13)
e2b41e5
unverified

wyldecat github-actions[bot] committed on

fix bug in fsdp
811726c

ca1207 committed on

Update torch-ext/optimizer/muon.py
b0230e7
unverified

TaehyunKim committed on

Update torch-ext/optimizer/muon.py
ff2fcfb
unverified

TaehyunKim committed on

Update muon.py
c16b438
unverified

TaehyunKim committed on

fix assert in a2a gather scatter
3dafb3e

ca1207 committed on

delete state in split_func
15336dc

ca1207 committed on

change owner_params to owned_params
6943c45

ca1207 committed on

modify pre step (overlap step) can get from arsgs
589b763

ca1207 committed on

add doc strings + init self rank on init_assign_params
267e8a0

ca1207 committed on

license added for flash_muon
d7cd571

ca1207 committed on

misc
35894d1

ca1207 committed on

use inpalce op in update_g
6e9baad

ca1207 committed on

use COMM_DTYPE instead of hardcoded dtype
2a8631f

ca1207 committed on

apply all2all scatter gather
ff6d675

ca1207 committed on

feat(muon_clip) : add muon clip (#6)
d65066c
unverified

dongseokmotif github-actions[bot] committed on

feat(muon) : add tuned-abc-values & blfoat16 communication
f7faa93

wyldecat committed on

feat: update muon to receive paramgroups, not model (#4)
b0f46c7
unverified

junhyeok-motech leejunhyeok wyldecat committed on

fix(muon): add update_p stage and dealloc tensors properly
99e7c0c

wyldecat committed on

chore: add .gitignore
79fc8ba

wyldecat committed on

fix(optimizer): resolve bug where weight decay was multiplied by wrong lr value (#5)
671b033
verified

dongseokmotif committed on

fix(muon): handle un-distributed env
1f13dae

iamwyldecat committed on

refactor(muon): change argument adam_wd to weight_decay and handle params' type
02ac540

iamwyldecat committed on

fix(muon): free tensors that are no longer needed
64757cb

iamwyldecat committed on

chore(muon): update comment
036642a

iamwyldecat committed on

chore(muon): clean build and update doc
febdf5b

iamwyldecat committed on

fix(muon): delete intermediate tensors immediately to lower peak mem usage
bdd2678

iamwyldecat committed on

chore: initial commit
8535e80

iamwyldecat committed on