# HRM-Text1
HRM-Text1 is an experimental instruction-following text generation model based on the Hierarchical Recurrent Memory (HRM) architecture. It is trained on the databricks/databricks-dolly-15k dataset, which consists of instruction–response pairs across multiple task types.
The model utilizes the HRM structure, consisting of a "Specialist" module for low-level processing and a "Manager" module for high-level abstraction and planning. This architecture aims to handle long-range dependencies more effectively by summarizing information at different temporal scales.
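The two-timescale idea can be illustrated with a toy recurrence: a fast "Specialist" state that updates every token, and a slow "Manager" state that updates only every few steps from a summary of the Specialist. This is a hypothetical, non-neural sketch of the scheduling pattern only, not the actual HRM implementation; all names and update rules below are illustrative.

```python
def hrm_step(spec_state, mgr_state, token, t, manager_period=4):
    """One decoding step of a toy two-timescale recurrence (illustrative only).

    The Specialist updates every step; the Manager updates only every
    `manager_period` steps, from a summary of the Specialist's state.
    """
    # Specialist: fast, low-level update, conditioned on the Manager's plan.
    spec_state = 0.5 * spec_state + token + 0.1 * mgr_state
    # Manager: slow, high-level update, summarizing the Specialist.
    if (t + 1) % manager_period == 0:
        mgr_state = 0.9 * mgr_state + 0.1 * spec_state
    return spec_state, mgr_state


def run(tokens, manager_period=4):
    """Roll the toy recurrence over a token sequence."""
    spec, mgr = 0.0, 0.0
    for t, tok in enumerate(tokens):
        spec, mgr = hrm_step(spec, mgr, tok, t, manager_period)
    return spec, mgr
```

Because the Manager is updated at a coarser interval, it integrates information over longer spans, which is the mechanism the architecture relies on for long-range dependencies.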
## Model Description
- Architecture: Hierarchical Recurrent Memory (HRM)
- Training Data: databricks/databricks-dolly-15k
- Original Paper: Hierarchical Reasoning Model
- Tokenizer: t5-small (slow T5 SentencePiece), vocab size 32100
- Objective: Causal Language Modeling
 
## Latest Performance (Epoch 20)
- Validation Loss: 3.6668
- Validation Perplexity: 39.13
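The reported perplexity is simply the exponential of the validation loss, which can be checked directly:

```python
import math

val_loss = 3.6668
# Perplexity for causal LM is exp(mean negative log-likelihood).
perplexity = math.exp(val_loss)
print(round(perplexity, 2))  # → 39.13
```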
Base model: google-t5/t5-small