In a Training Loop 🔄

Asankhaya Sharma

codelion

http://asankhaya.github.io/

AI & ML interests

Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.

Recent Activity

reacted to their post with ➕ about 10 hours ago

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models! Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperform 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
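For readers unfamiliar with the WSD (Warmup-Stable-Decay) schedule mentioned in the post, here is a minimal sketch of its general shape: a linear ramp, a constant plateau, then a decay. The post does not give the schedule's hyperparameters, so the function name `wsd_lr`, the peak and minimum learning rates, the phase fractions, and the linear decay are all illustrative assumptions, not the settings used for Dhara-70M.

```python
def wsd_lr(step, total_steps, peak_lr=3e-4, min_lr=3e-5,
           warmup_frac=0.05, decay_frac=0.20):
    """Learning rate at `step` under a warmup-stable-decay schedule.

    All default values are hypothetical; they are not taken from the post.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")

    warmup_steps = max(1, round(total_steps * warmup_frac))
    decay_steps = max(1, round(total_steps * decay_frac))
    decay_start = total_steps - decay_steps

    if step < warmup_steps:
        # Warmup: linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable: hold the peak learning rate.
        return peak_lr
    # Decay: linear interpolation from peak_lr toward min_lr.
    progress = (step - decay_start) / decay_steps
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    total = 1000
    for s in (0, 49, 500, 800, 900, 999):
        print(s, wsd_lr(s, total))
```

One reason this shape is popular is that the long constant phase lets a run be extended or branched from any stable checkpoint, with the decay applied only at the end.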

reacted to their post with 🤗 about 10 hours ago

reacted to their post with 🚀 about 10 hours ago

upvoted an article 1 day ago

Article

The Optimal Architecture for Small Language Models

about 10 hours ago

upvoted a paper 7 days ago

Universal Reasoning Model

Paper • 2512.14693 • Published 10 days ago • 37

upvoted an article 24 days ago

Article

Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement

24 days ago

upvoted a paper 26 days ago

Budget-Aware Tool-Use Enables Effective Agent Scaling

Paper • 2511.17006 • Published Nov 21 • 29

upvoted an article about 2 months ago

Article

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Nov 3

upvoted an article 2 months ago

Article

Python Is All You Need? Introducing Dria-Agent-α

Jan 10

upvoted a collection 3 months ago

Dhara Foundational Models

Collection

Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training. • 1 item • Updated about 6 hours ago • 1

upvoted a paper 3 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 500

upvoted an article 3 months ago

Article

mem-agent: Equipping LLM Agents with Memory Using RL

Oct 9

upvoted a collection 3 months ago

Mem-Agent

Collection

Small sized agents from Dria trained on interacting with an obsidian-like memory system using python tools. Trained on Qwen3-4B-Thinking-2507. • 4 items • Updated Sep 5 • 3

upvoted a paper 3 months ago

BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining

Paper • 2508.10975 • Published Aug 14 • 60

upvoted an article 4 months ago

Article

mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL

Sep 11

upvoted a collection 4 months ago

Nemotron-Pre-Training-Datasets

Collection

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 3 days ago • 81

upvoted an article 5 months ago

Article

Building Enterprise-Ready Text Classifiers in Minutes with Adaptive Learning

Aug 9

upvoted 2 papers 5 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 130

upvoted 2 articles 5 months ago

Article

Towards Open Evolutionary Agents

Aug 4

Article

Unsupervised Model Improvement via Internal Coherence Maximization: Outperforming Human-Supervised Methods Through Self-Elicitation

Aug 3

upvoted a collection 5 months ago

GLM-4.5

Collection

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11 • 250

upvoted a paper 5 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 319
