Commit History
Jason/submit only to submissions repo (#65) · 8939028 · unverified · Jason
Update table legend to use new names + styling (#66) · cdccabc · unverified · Amber Tanaka
benchmark descriptions and styling (#59) · ac15cf4 · unverified · Smita R (Smita)
Paper cuts (#64) · 7b52df4 · unverified · Amber Tanaka
Jason/dataset cfg (#54) · 0dd7833 · unverified · Jason
turn on results table filter (#60) · cbcb51a · unverified · Jason
Add Repro links to Agent column (#63) · 5064c71 · unverified · Amber Tanaka
Add Dynamic Column Sizes (#61) · c22c48e · unverified · Amber Tanaka
fix the nav bar! (#58) · 07dc6d3 · unverified · Amber Tanaka
Support old and new openness and tool usage values (#56) · b3aef2c · unverified · Chloe Anastasiades
Table Legends Refactor (#57) · 1e64d2b · unverified · Amber Tanaka
Map from new task names (#55) · b36c3c5 · unverified · Chloe Anastasiades
update About page layout (#49) · abb9f0a · unverified · Amber Tanaka
new svgs (#53) · 1c65d8c · unverified · Amber Tanaka
more DRYer config (#52) · 539a055 · unverified · Jason
DRY out config global vars (#51) · 53e8dbb · unverified · Jason
Make HF config to use configurable via a variable (#48) · fbcbf30 · unverified · Chloe Anastasiades
copy changes around graph (#47) · 02a4349 · unverified · Amber Tanaka
Bump agent-eval version to pick up nulling out model usage info in some cases (#45) · fbcb5bb · unverified · Chloe Anastasiades
Add new heading content (#46) · 9b2d66c · unverified
more eval ordering changes (#43) · 2c742e8 · unverified · Smita R (Smita)
Update README.md (#44) · d2de222 · unverified · Amber Tanaka
re-ordering evals (#41) · ad46ea8 · unverified · Smita R (Smita)
New svgs (#42) · f48fa14 · unverified · Amber Tanaka
Small tweaks (#40) · c934393 · unverified · Amber Tanaka
Update copy (#39) · 95d1ab7 · unverified · Amber Tanaka
Add feedback button to about page (#38) · c917078 · unverified
make button work better (#36) · 49b6cdd · unverified · Amber Tanaka
One more try at sticky feedback (#35) · 2481702 · unverified
Could this fix things? (#34) · 4797876 · unverified · Amber Tanaka
Revert "Move floating feedback button to document body" (#33) · c2f4381 · unverified
bump agenteval to get pretty model names (#32) · 3916685 · unverified · Regan Huff
Move floating feedback button to document body (#30) · 5f6ec58 · unverified
change id to class (#31) · f909e70 · unverified · Amber Tanaka
Add floating feedback button (#28) · afb9b4c · unverified
Reorder benchmarks (#29) · 64716c3 · unverified · Amber Tanaka
bump agenteval (#26) · ad338fc · unverified
Fix overflowing tables and cell tooltips (#27) · 0d80b6f · unverified
bump agenteval version (#25) · c99da74 · unverified · Regan Huff
Scatterplot hover (#21) · fed35d4 · unverified · Amber Tanaka
Little arrow button to take you back up (#24) · 9c18850 · unverified · Amber Tanaka
Update table cells with tooltips and LLM count (#23) · 9832707 · unverified
Submission page agent tooltips (#22) · 50fb0f5 · unverified
Bump agenteval version in leaderboard code (#20) · 18f8616 · unverified · Regan Huff
Plot Adjustments (#19) · aca1950 · unverified · Amber Tanaka
Nav bar updates (#18) · 11de2f8 · unverified · Amber Tanaka
Nav styling (#17) · cd5338b · unverified · Amber Tanaka
bump datasets version (#16) · dd9aeac · unverified · Regan Huff
Add About page (#15) · dd5281a · unverified · Amber Tanaka