
🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨

I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.

🎉 Patreon (Monthly)  |  ☕ Ko-fi (One-time)

Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.


91% fewer refusals (9/100 Uncensored vs 97/100 Original) while preserving model quality (0.0047 KL divergence).

❤️ Support My Work

Creating these models takes significant time, work, and compute. If you find them useful, consider supporting me:


| Platform | Link | What you get |
|---|---|---|
| 🎉 Patreon | Monthly support | Priority model requests |
| ☕ Ko-fi | One-time tip | My eternal gratitude |

Your help motivates me and goes toward improving my workflow and covering storage and compute fees. It may even help with uncensoring bigger models on rented cloud GPUs.


GGUF quantizations of llmfan46/Gemma-4-Harmonia-31B-it-uncensored-heretic.

This is a decensored version of virtuous7373/Gemma-4-Harmonia-31B, made using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method.

Abliteration parameters

| Parameter | Value |
|---|---|
| start_layer_index | 14 |
| end_layer_index | 55 |
| preserve_good_behavior_weight | 0.7754 |
| steer_bad_behavior_weight | 0.0001 |
| overcorrect_relative_weight | 0.9765 |
| neighbor_count | 14 |

Targeted components

  • attn.o_proj

Performance

| Metric | This model | Original model (Gemma-4-Harmonia-31B) |
|---|---|---|
| KL divergence | 0.0047 | 0 (by definition) |
| Refusals | 9/100 | 97/100 |

Fewer refusals indicate fewer content restrictions, and a lower KL divergence indicates that the model's outputs stay closer to the original model's baseline. A higher refusal count means more rejections, objections, pushback, lecturing, censorship, softening, and deflection.
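For readers unfamiliar with the two metrics, here is a minimal sketch of how they can be read. The probability values are hypothetical and only illustrate the formula; how Heretic itself samples prompts and in which direction it measures the divergence is not stated on this card, so treat `kl_divergence` as a textbook definition rather than Heretic's implementation.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    # Assumes q is nonzero wherever p is nonzero.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token probabilities, for illustration only.
original = [0.70, 0.20, 0.10]
modified = [0.68, 0.21, 0.11]

print(kl_divergence(original, original))  # identical distributions give 0.0
print(kl_divergence(original, modified))  # a small positive number

# Relative drop in refusals, using the counts from the table: 97/100 -> 9/100.
reduction = (97 - 9) / 97
print(f"{reduction:.1%}")
```

The relative drop works out to about 90.7%, which the headline rounds to 91%.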

MMLU test results:

Original:

============================================================

  • Total questions: 7021

  • Correct: 6014

  • Accuracy: 0.8566 (85.66%)

  • Parse failures: 22

============================================================

Tested subject scores:

  • professional_law: 0.7592 (596/785)
  • moral_scenarios: 0.8394 (371/442)
  • miscellaneous: 0.9243 (354/383)
  • professional_psychology: 0.8797 (278/316)
  • high_school_psychology: 0.9593 (259/270)
  • high_school_macroeconomics: 0.9137 (180/197)
  • elementary_mathematics: 0.9239 (170/184)
  • moral_disputes: 0.8678 (151/174)
  • prehistory: 0.9128 (157/172)
  • philosophy: 0.8553 (136/159)
  • high_school_biology: 0.9605 (146/152)
  • professional_accounting: 0.7902 (113/143)
  • clinical_knowledge: 0.8929 (125/140)
  • high_school_microeconomics: 0.9632 (131/136)
  • nutrition: 0.8815 (119/135)
  • professional_medicine: 0.9104 (122/134)
  • conceptual_physics: 0.9062 (116/128)
  • high_school_mathematics: 0.5669 (72/127)
  • human_aging: 0.8448 (98/116)
  • security_studies: 0.8571 (96/112)
  • high_school_statistics: 0.8649 (96/111)
  • marketing: 0.9725 (106/109)
  • high_school_world_history: 0.9528 (101/106)
  • sociology: 0.9223 (95/103)
  • high_school_government_and_politics: 0.9406 (95/101)
  • high_school_geography: 0.9596 (95/99)
  • high_school_chemistry: 0.7835 (76/97)
  • high_school_us_history: 0.9053 (86/95)
  • virology: 0.5056 (45/89)
  • college_medicine: 0.8636 (76/88)
  • world_religions: 0.9205 (81/88)
  • high_school_physics: 0.7619 (64/84)
  • electrical_engineering: 0.8395 (68/81)
  • astronomy: 0.9241 (73/79)
  • logical_fallacies: 0.8816 (67/76)
  • high_school_european_history: 0.8904 (65/73)
  • anatomy: 0.8732 (62/71)
  • college_biology: 0.9844 (63/64)
  • human_sexuality: 0.8750 (56/64)
  • formal_logic: 0.7031 (45/64)
  • public_relations: 0.7213 (44/61)
  • international_law: 0.8667 (52/60)
  • college_physics: 0.7193 (41/57)
  • college_mathematics: 0.7818 (43/55)
  • econometrics: 0.7407 (40/54)
  • jurisprudence: 0.8302 (44/53)
  • high_school_computer_science: 0.9808 (51/52)
  • machine_learning: 0.8462 (44/52)
  • medical_genetics: 0.9020 (46/51)
  • global_facts: 0.5686 (29/51)
  • management: 0.8800 (44/50)
  • us_foreign_policy: 0.9800 (49/50)
  • college_chemistry: 0.6170 (29/47)
  • abstract_algebra: 0.7447 (35/47)
  • business_ethics: 0.8478 (39/46)
  • college_computer_science: 0.9333 (42/45)
  • computer_security: 0.8605 (37/43)

Heretic:

============================================================

  • Total questions: 7021

  • Correct: 5936

  • Accuracy: 0.8455 (84.55%)

  • Parse failures: 17

============================================================

Tested subject scores:

  • professional_law: 0.7121 (559/785)
  • moral_scenarios: 0.8281 (366/442)
  • miscellaneous: 0.9191 (352/383)
  • professional_psychology: 0.8703 (275/316)
  • high_school_psychology: 0.9593 (259/270)
  • high_school_macroeconomics: 0.9188 (181/197)
  • elementary_mathematics: 0.9348 (172/184)
  • moral_disputes: 0.8448 (147/174)
  • prehistory: 0.9128 (157/172)
  • philosophy: 0.8113 (129/159)
  • high_school_biology: 0.9605 (146/152)
  • professional_accounting: 0.7902 (113/143)
  • clinical_knowledge: 0.8786 (123/140)
  • high_school_microeconomics: 0.9559 (130/136)
  • nutrition: 0.8815 (119/135)
  • professional_medicine: 0.9030 (121/134)
  • conceptual_physics: 0.8828 (113/128)
  • high_school_mathematics: 0.5433 (69/127)
  • human_aging: 0.8448 (98/116)
  • security_studies: 0.8571 (96/112)
  • high_school_statistics: 0.8559 (95/111)
  • marketing: 0.9817 (107/109)
  • high_school_world_history: 0.9528 (101/106)
  • sociology: 0.9223 (95/103)
  • high_school_government_and_politics: 0.9406 (95/101)
  • high_school_geography: 0.9596 (95/99)
  • high_school_chemistry: 0.7835 (76/97)
  • high_school_us_history: 0.8947 (85/95)
  • virology: 0.5056 (45/89)
  • college_medicine: 0.8295 (73/88)
  • world_religions: 0.9205 (81/88)
  • high_school_physics: 0.7619 (64/84)
  • electrical_engineering: 0.8148 (66/81)
  • astronomy: 0.9367 (74/79)
  • logical_fallacies: 0.8947 (68/76)
  • high_school_european_history: 0.8630 (63/73)
  • anatomy: 0.8873 (63/71)
  • college_biology: 0.9844 (63/64)
  • human_sexuality: 0.8750 (56/64)
  • formal_logic: 0.7031 (45/64)
  • public_relations: 0.6885 (42/61)
  • international_law: 0.8667 (52/60)
  • college_physics: 0.7193 (41/57)
  • college_mathematics: 0.7455 (41/55)
  • econometrics: 0.7407 (40/54)
  • jurisprudence: 0.8113 (43/53)
  • high_school_computer_science: 0.9808 (51/52)
  • machine_learning: 0.8077 (42/52)
  • medical_genetics: 0.9020 (46/51)
  • global_facts: 0.5686 (29/51)
  • management: 0.8800 (44/50)
  • us_foreign_policy: 0.9600 (48/50)
  • college_chemistry: 0.6383 (30/47)
  • abstract_algebra: 0.7447 (35/47)
  • business_ethics: 0.8478 (39/46)
  • college_computer_science: 0.9333 (42/45)
  • computer_security: 0.8372 (36/43)

MMLU (Massive Multitask Language Understanding) is a benchmark of multiple-choice questions across 57 subjects (math, history, law, medicine, etc.).
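To put the two runs side by side, the short script below recomputes the overall accuracy from the reported counts and prints the gap. It uses only the totals listed above; the variable names are illustrative.

```python
TOTAL = 7021
correct = {"Original": 6014, "Heretic": 5936}

# Overall accuracy per model, from the reported correct counts.
accuracy = {name: n / TOTAL for name, n in correct.items()}
for name, acc in accuracy.items():
    print(f"{name}: {acc:.4f}")

delta_questions = correct["Original"] - correct["Heretic"]
delta_points = (accuracy["Original"] - accuracy["Heretic"]) * 100
print(f"Difference: {delta_questions} questions, {delta_points:.2f} percentage points")
```

The decensored model answers 78 fewer questions correctly, a drop of roughly 1.11 percentage points.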


Quantizations

For the K-quants below, selected Gemma 4 attention and FFN tensors are kept at higher precision where useful.

These GGUFs preserve key Gemma 4 attention projection tensors at higher precision.

  • Q6_K, Q5_K_M, Q5_K_S, Q4_K_M, Q4_K_S, Q3_K_L, and Q3_K_M keep the main attention projection tensors as Q8_0:
    • attn_q
    • attn_k
    • attn_v
    • attn_output

This helps preserve Gemma 4’s attention path at higher precision, especially for lower-bit quants, while avoiding large file-size increases from unnecessarily up-quantizing the largest MoE expert tensors.
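The rule above can be sketched as a small lookup. This is only an illustration of the stated policy, not the script used to build these files: `tensor_type` is a hypothetical helper, and it assumes the usual llama.cpp tensor naming of the form `blk.<layer>.<component>.weight`.

```python
# Quant levels listed above as keeping attention projections at Q8_0.
PROTECTED_QUANTS = {"Q6_K", "Q5_K_M", "Q5_K_S", "Q4_K_M", "Q4_K_S", "Q3_K_L", "Q3_K_M"}
ATTENTION_TENSORS = ("attn_q", "attn_k", "attn_v", "attn_output")

def tensor_type(tensor_name: str, file_quant: str) -> str:
    """Return "Q8_0" for protected attention projections, else the file's quant label."""
    parts = tensor_name.split(".")
    # Exact component match, so e.g. "attn_q_norm" is not treated as "attn_q".
    component = parts[2] if len(parts) >= 4 and parts[0] == "blk" else ""
    if file_quant in PROTECTED_QUANTS and component in ATTENTION_TENSORS:
        return "Q8_0"
    # Placeholder: the real per-tensor mix for other tensors is chosen by the quantizer.
    return file_quant

print(tensor_type("blk.0.attn_q.weight", "Q4_K_M"))
print(tensor_type("blk.0.ffn_down.weight", "Q4_K_M"))
```

The first call reports `Q8_0` and the second falls through to the file's own quant label.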

| Filename | Quant | Description |
|---|---|---|
| Gemma-4-Harmonia-31B-uncensored-heretic-BF16.gguf | BF16 | Full precision |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q8_0.gguf | Q8_0 | Near-lossless, recommended |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q6_K.gguf | Q6_K | Excellent quality |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q5_K_M.gguf | Q5_K_M | Good balance |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q5_K_S.gguf | Q5_K_S | Smaller Q5 |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q4_K_M.gguf | Q4_K_M | Good for limited VRAM |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q4_K_S.gguf | Q4_K_S | Smaller Q4 |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q3_K_L.gguf | Q3_K_L | Low VRAM, decent quality |
| Gemma-4-Harmonia-31B-uncensored-heretic-Q3_K_M.gguf | Q3_K_M | Low VRAM, smaller |

Vision Projector

| Filename | Quant | Description |
|---|---|---|
| Gemma-4-Harmonia-31B-uncensored-heretic-mmproj-BF16.gguf | BF16 | Native precision |

A vision projector file is required for vision/multimodal capabilities. Use it alongside any quantization above.

Usage

Works with llama.cpp, LM Studio, Ollama, and other GGUF-compatible tools.
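As one way to get started with llama.cpp, the sketch below assembles a `llama-server` command line from the filenames listed above. It assumes the GGUF files have already been downloaded into a local directory and that `llama-server` is on your PATH; the helper name `llama_server_command` and the `models` directory are illustrative.

```python
import shlex
from pathlib import Path

STEM = "Gemma-4-Harmonia-31B-uncensored-heretic"

def llama_server_command(model_dir: str, quant: str = "Q4_K_M", vision: bool = False) -> list[str]:
    """Build a llama-server argument list for one of the quantizations on this card."""
    model_path = Path(model_dir) / f"{STEM}-{quant}.gguf"
    cmd = ["llama-server", "-m", model_path.as_posix()]
    if vision:
        # The vision projector is a separate file passed via --mmproj.
        mmproj_path = Path(model_dir) / f"{STEM}-mmproj-BF16.gguf"
        cmd += ["--mmproj", mmproj_path.as_posix()]
    return cmd

print(shlex.join(llama_server_command("models", quant="Q8_0", vision=True)))
```

The printed line can be pasted into a shell, or the list can be passed directly to `subprocess.run`.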


HARMONIA

The Greek goddess of harmony and concord.

Gemini Word Salad Initialization

Harmonious Synthesis

Harmonia is a high-dimensional 31-billion parameter merge of Gemma 4. By executing a meticulous three-phase fusion of seven elite foundation and specialized models, Harmonia demonstrates a targeted approach to deep neural consolidation, minimizing regression while amplifying unique capability boundaries.

Instead of simple linear blending, which often degrades logical coherence and dilutes nuanced behavior, Harmonia was sculpted using a combination of mathematical projections, covariance activation matching, and surgical synaptic pruning. The model appears pretty solid so far.

Multi-Stage Fusion Protocol

The lineage of Harmonia is constructed systematically, passing through three isolated mathematical states to layer capabilities cleanly.

Phase I

Nullspace Coherence Mapping

To anchor base capabilities, the primary Gemma-4-31B-Base is combined with the analytically rigorous GarnetV2-31B. Utilizing low-rank Singular Value Decomposition (SVD), the specialized donor features are projected entirely onto the mathematical null-space of the base weights. This prevents the creative delta vectors from distorting essential core intelligence, producing the stable platform clever-basename.

> Method: Null-Space Filtering
> Core Integrity Protection (Base Protect): Active (True)
> Targeted Active Rank Limit: 256
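A rough NumPy sketch of the idea behind this phase follows. It is one interpretation of the description above, not the merge method's actual code: the protected subspace is taken to be the top-`rank` right singular directions of the base matrix, the donor delta is projected onto the orthogonal complement of that subspace, and `nullspace_merge` is a hypothetical name. The toy matrices and `rank=3` stand in for real weight tensors and the rank limit of 256.

```python
import numpy as np

def nullspace_merge(w_base, w_donor, rank, weight=1.0):
    """Add only the part of the donor delta that avoids the base's top-`rank` input directions."""
    delta = w_donor - w_base
    # Right singular vectors of the base span its dominant input directions.
    _, _, vt = np.linalg.svd(w_base, full_matrices=False)
    v_r = vt[:rank].T                                  # shape: (in_features, rank)
    projector = np.eye(w_base.shape[1]) - v_r @ v_r.T  # removes the protected directions
    return w_base + weight * (delta @ projector), v_r

rng = np.random.default_rng(0)
w_base = rng.normal(size=(6, 8))
w_donor = w_base + 0.1 * rng.normal(size=(6, 8))
merged, v_r = nullspace_merge(w_base, w_donor, rank=3)

# On the protected directions, the merged weights behave exactly like the base.
print(np.allclose(merged @ v_r, w_base @ v_r))
```

Because the projected delta has no component along the protected directions, inputs lying in that subspace produce the same outputs before and after the merge.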
Phase II

Surgical Synaptic Gating

Next, our newly anchored base is layered with the highly independent cognitive engines MeroMero-31B and Gembrain-31B. We apply Context-Aware Binary Selection (CABS) to execute structured, localized parameter gating. By enforcing precise structural pruning ratios (retaining optimal synapses in 16:32 and 11:33 ratios), we weave complex creative reasoning directly into the core matrix without causing neural interference. The result is the highly expressive clever-intname.

> Method: Context-Aware Binary Selection (CABS)
> Structural Masking Ratio (MeroMero): 16 : 32 (Weight: 0.6)
> Structural Masking Ratio (Gembrain): 11 : 33 (Weight: 0.4)
> Default Sparse Gating Step: 8 : 32
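The N:M ratios above can be illustrated with a minimal magnitude mask: in every group of M consecutive values, keep the N with the largest absolute value and zero the rest. This sketch only shows that masking step on a flat list, with a hypothetical helper `nm_mask`. It is an assumption that the mask is applied to each donor's delta from the base, and the actual CABS implementation may differ.

```python
def nm_mask(values, n, m):
    """Keep the n largest-magnitude entries in each consecutive group of m; zero the rest."""
    out = []
    for start in range(0, len(values), m):
        group = values[start:start + m]
        # Indices of the n largest |value| within this group.
        ranked = sorted(range(len(group)), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        out.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return out

delta = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]
print(nm_mask(delta, n=2, m=4))
# -> [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.8, 0.0]
```

With the settings listed above, 16:32 keeps half of each group, 11:33 keeps one third, and the default 8:32 keeps one quarter.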
Phase III

Covariance Activation Matching

In the final harmonization phase, the expressive clever-intname is combined with the narrative mastery of Equinox-31B, the creative depth of Fabled-Gemma4, and our primary conversational core Ortenzya-The-Creative-Wordsmith. Using data-free covariance estimation via task vectors, ACTMat reconstructs layer-wise input activation properties, solving for optimal projection weights in activation space. This resolves semantic alignment anomalies and delivers the unified output model.

> Method: ACTMat Activation Matching
> Task Vector Blending Covariance Limit: 16,384
> Epsilon Solver Regularizer: 1e-06
> Output Precision Profile: bfloat16
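The description above resembles a regression-style merge in which each model's layer is weighted by an estimate of its input covariance. The sketch below is a guess at that shape, not ACTMat's actual code: it uses the Gram matrix of each task vector as a stand-in for the activation covariance, solves the resulting linear system with the `epsilon` regularizer, and falls back to a pseudo-inverse if the solve fails. The function name `gram_weighted_merge` and the toy matrices are hypothetical.

```python
import numpy as np

def gram_weighted_merge(w_base, donors, epsilon=1e-6):
    """Closed-form merge weighting each donor by a Gram matrix built from its task vector."""
    in_features = w_base.shape[1]
    gram_sum = np.zeros((in_features, in_features))
    weighted_sum = np.zeros_like(w_base)
    for w in donors:
        tau = w - w_base      # task vector for this donor
        gram = tau.T @ tau    # assumed proxy for the input activation covariance
        gram_sum += gram
        weighted_sum += w @ gram
    system = gram_sum + epsilon * np.eye(in_features)
    try:
        # Solve merged @ system = weighted_sum without forming an explicit inverse.
        return np.linalg.solve(system.T, weighted_sum.T).T
    except np.linalg.LinAlgError:
        return weighted_sum @ np.linalg.pinv(system)

rng = np.random.default_rng(0)
w_base = rng.normal(size=(8, 4))
donors = [w_base + 0.1 * rng.normal(size=(8, 4)) for _ in range(3)]
merged = gram_weighted_merge(w_base, donors)
print(merged.shape)
```

A quick sanity check on this formulation is that merging several copies of the same donor returns that donor, up to the small bias introduced by `epsilon`.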

Methodological Innovations

Nullspace Projection
Instead of destroying structural logic via linear interpolation, this method extracts the base model's essential singular values. It projects specialized donor features orthogonally, preventing core capability degradation.
Context-Aware Binary Selection
A dynamic, high-fidelity neural filter. Applying structured magnitude masking at customizable N:M fractions removes low-signal synaptic weights, seamlessly layering domain specialization into active logical paths.
Activation Covariance Matching
Using Gram matrices computed directly from task vectors, ACTMat aligns semantic representations in the activation space rather than the parameter space. It dynamically falls back to robust pseudo-inverse SVD solvers when numerical anomalies arise.

Merge Blueprint

The entire orchestration sequence is structured as a multi-stage MergeKit pipeline. The YAML recipes for the three stages follow.

MergeKit Configuration
name: clever-basename

merge_method: nullspace
base_model: ./gemma-4-31B-base

models:
  - model: ./Gemma4-GarnetV2-31B
    parameters:
      weight: 1.0

parameters:
  protect_base: true
  nr: 256

tokenizer:
  source: base
chat_template: auto

dtype: float32
out_dtype: bfloat16
---
name: clever-intname
merge_method: cabs

base_model: ./clever-basename

models:
  - model: ./clever-basename

  - model: ./G4-MeroMero-31B-uncensored-heretic
    parameters:
      weight: 0.6
      n_val: 16
      m_val: 32
  - model: ./Gemma-4-Gembrain-31B-heretic
    parameters:
      weight: 0.4
      n_val: 11
      m_val: 33

default_n_val: 8
default_m_val: 32

pruning_order:
  - ./G4-MeroMero-31B-uncensored-heretic
  - ./Gemma-4-Gembrain-31B-heretic

dtype: float32
out_dtype: bfloat16

tokenizer:
  source: union

chat_template: auto
---
name: Harmonia

merge_method: actmat

base_model: ./gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic

models:
  - model: ./gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
  - model: ./LatitudeGames-Equinox-31B
    parameters:
      weight: 1
  - model: ./clever-intname
    parameters:
      weight: 1
  - model: ./Fabled-Gemma4-31B
    parameters:
      weight: 1

parameters:
  epsilon: 1e-6

tokenizer:
  source: "union"

dtype: bfloat16
out_dtype: bfloat16

chat_template: auto

Symphony Contributors

I am grateful to the following individuals for their models, inspiration, and other contributions:

And of course, every wonderful person on:

LocalLLaMA

A big thanks to Gemini-3.5-flash for creating this README alongside the word salads found within it. A special acknowledgment is extended to Google DeepMind for their contribution of the Gemma-4 foundation family to the open-weight ecosystem, representing the structural cornerstone of this merge and its constituents.

Format: GGUF
Model size: 31B params
Architecture: gemma4