Mex Ivanov
MexIvanov
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
reacted to RakshitAralimatti's post with 🔥 · 18 days ago
I built something crazy you've never seen before.
Please check - https://huggingface.co/blog/RakshitAralimatti/streaming-data-rag
A real-time Streaming Data to RAG system that listens to live radio, transcribes it on-the-fly, and lets you query across TIME.
Not just "what was discussed" – but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
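[The linked blog post has the full system; as a rough illustration of the core idea, here is a minimal sketch of time-filtered retrieval: filter transcript chunks by timestamp and channel first, then rank the survivors semantically. The `Chunk` record, `temporal_query` helper, and MiniLM encoder are all assumptions for illustration, not RakshitAralimatti's implementation.]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

import numpy as np
from sentence_transformers import SentenceTransformer

@dataclass
class Chunk:
    text: str       # transcribed snippet
    channel: int    # source radio channel
    ts: datetime    # time the snippet was transcribed

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def temporal_query(chunks, question, since=None, until=None, channel=None, k=3):
    """Filter chunks by time window and channel first, then rank semantically."""
    pool = [c for c in chunks
            if (since is None or c.ts >= since)
            and (until is None or c.ts <= until)
            and (channel is None or c.channel == channel)]
    if not pool:
        return []
    q = encoder.encode(question, normalize_embeddings=True)
    embs = encoder.encode([c.text for c in pool], normalize_embeddings=True)
    scores = embs @ q  # cosine similarity on unit-normalized vectors
    return [pool[i] for i in np.argsort(scores)[::-1][:k]]

# "What happened in the last 10 minutes on channel 0?"
now = datetime.now()
chunks = [Chunk("Storm warning issued for the coast.", 0, now - timedelta(minutes=4)),
          Chunk("Traffic is backed up on route 9.", 1, now - timedelta(minutes=2))]
hits = temporal_query(chunks, "breaking weather news",
                      since=now - timedelta(minutes=10), channel=0)
print([c.text for c in hits])
```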
reacted to samerzaher80's post with 👍 · 30 days ago
AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU
I’m Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I’m releasing AetherMind_SRL – a 184M-parameter NLI model that was trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer’s dataset). It was never fine-tuned on or even shown a single MMLU question during training. Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
|---|---|---|---|
| AetherMind_SRL (me) | 184M | 36.05 % | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8 % | General pre-training |
| BERT-large | 340M | 27–30 % | General pre-training |
| LLaMA-1 7B | 7B | 34–35 % | Massive text corpus |
| LLaMA-2 7B | 7B | ~45 % | Bigger + better data |
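[An aside on methodology: the post doesn't show the evaluation harness. One common way to run an NLI-only classifier on MMLU's multiple-choice format is to treat each answer option as a hypothesis and pick the most-entailed one. The sketch below assumes that scheme plus the `[SEP]` input format from the snippet further down; the scoring scheme, label matching, and helper name are hypothetical, not the author's harness.]

```python
from transformers import pipeline

nli = pipeline("text-classification", model="samerzaher80/AetherMind_SRL")

def answer_mmlu(question: str, options: list[str]) -> int:
    """Hypothetical harness: score each option as an NLI hypothesis
    against the question and return the index of the most-entailed one."""
    entail_scores = []
    for opt in options:
        # Premise = question, hypothesis = candidate answer (assumed format)
        all_labels = nli(f"{question} [SEP] {opt}", top_k=None)
        score = next((r["score"] for r in all_labels
                      if r["label"].lower().startswith("entail")), 0.0)
        entail_scores.append(score)
    return max(range(len(options)), key=entail_scores.__getitem__)

print(answer_mmlu("The capital of France is", ["Berlin", "Paris", "Rome", "Madrid"]))
```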
Yes – my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen? I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0 (sketched in code after this list):

1. Train normally on NLI
2. Let the model predict on hard adversarial data (ANLI)
3. Log every mistake + low-confidence case
4. Build a balanced “SMART” buffer (60% errors + 40% correct anchors)
5. Fine-tune with a tiny LR and error-weighted loss
6. Repeat until stable
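[A minimal sketch of one SRL iteration for a generic PyTorch classifier, under stated assumptions: the confidence threshold, error weight, and all names (`srl_iteration`, `hard_loader`) are illustrative guesses, not the released training code.]

```python
import torch
import torch.nn.functional as F

def srl_iteration(model, hard_loader, optimizer,
                  err_ratio=0.6, err_weight=2.0, conf_threshold=0.6):
    """One self-reflective pass: log mistakes and low-confidence cases on
    hard data, build a 60/40 buffer, fine-tune with an error-weighted loss."""
    errors, anchors = [], []

    # Steps 2-3: predict on hard adversarial data, log errors / low confidence
    model.eval()
    with torch.no_grad():
        for x, y in hard_loader:                          # e.g. ANLI batches
            probs = F.softmax(model(x), dim=-1)
            conf, pred = probs.max(dim=-1)
            for xi, yi, ci, pi in zip(x, y, conf, pred):
                if pi != yi or ci < conf_threshold:
                    errors.append((xi, yi, err_weight))   # up-weighted mistakes
                else:
                    anchors.append((xi, yi, 1.0))         # correct anchors

    # Step 4: balanced "SMART" buffer – 60% errors + 40% correct anchors
    n_anchors = int(len(errors) * (1 - err_ratio) / err_ratio)
    buffer = errors + anchors[:n_anchors]

    # Step 5: fine-tune with a tiny LR (set on `optimizer`), per-example weights
    model.train()
    for xi, yi, w in buffer:
        optimizer.zero_grad()
        loss = w * F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        loss.backward()
        optimizer.step()
    # Step 6: the caller repeats until validation metrics stabilize.
```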
That’s it. No external knowledge, no MMLU data, no cluster.
Just pure reasoning transfer from entailment/contradiction patterns → real-world knowledge.

Try it yourself:

```python
from transformers import pipeline
import torch

nli_pipeline = pipeline(
    "text-classification",
    model="samerzaher80/AetherMind_SRL",
    device=0 if torch.cuda.is_available() else -1
)

# DEFINE YOUR TEST HERE
premise = "Patient shows progressive memory decline."
hypothesis = "Patient shows progressive memory decline."  # identical pair: expect entailment
input_text = f"{premise} [SEP] {hypothesis}"

result = nli_pipeline(input_text)[0]
print(f"Prediction: {result['label']}")
print(f"Confidence: {result['score']:.4f}")
```
Model: https://huggingface.co/samerzaher80/AetherMind_SRL
liked a model · about 2 months ago
MiniMaxAI/MiniMax-M2
Organizations
None yet