Independent benchmark verification: your Gemma 4 variant is the top pick

#2
by DreamFast - opened

I compared 13 abliterated variants of Gemma 4 E2B on four axes: weight analysis, KL divergence (Heretic methodology, full 262K vocab), HarmBench safety evaluation (400 prompts, full LLM review of all 5,600 responses), and 8 benchmark tasks on native BF16. The run took 44 GPU hours on a single RTX 5090. All 14 models (the 13 variants plus the base model) were tested with identical settings. The full comparison is at DreamFast/Gemma4-e2b-abliterlitics.

Your reported KL divergence of 0.1651 is verified: I measured 0.1673, within 1.3%. Your variant came out on top, with a 96.0% attack success rate (ASR) and capability fully preserved. It actually beats the base model on GSM8K by 1.4 points and on LAMBADA perplexity by 5.5%. The match against your reported KL value was the tightest of all four Heretic-built variants. Solid work.
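
For anyone who wants to sanity-check the 1.3% figure, here is a minimal sketch of the arithmetic, plus a toy KL divergence over a single next-token distribution. It is not the Heretic implementation or my evaluation script: the function names are hypothetical, the KL direction (base relative to variant) is an assumption, and the real measurement runs over the full 262K vocab rather than three logits.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one logit vector.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(base_logits, variant_logits):
    # KL(base || variant) in nats for one next-token distribution.
    # The direction is an assumption made for this sketch.
    log_p = log_softmax(base_logits)
    log_q = log_softmax(variant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def relative_difference(reported, measured):
    # Gap between the two values as a fraction of the reported one.
    return abs(measured - reported) / reported

reported_kl = 0.1651  # value from the model card
measured_kl = 0.1673  # value from the independent run
rel = relative_difference(reported_kl, measured_kl)
print(f"relative difference: {rel:.1%}")

# Toy check: identical logits give zero divergence, different logits give a positive one.
print(kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(kl_divergence([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

The relative difference is taken against the reported value, so a measured KL slightly above 0.1651 counts as a small positive gap rather than a mismatch.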

coder3101 pinned discussion
