Mex Ivanov
MexIvanov
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
reacted to RakshitAralimatti's post with 🔥 · 18 days ago
I built something crazy you've never seen before.
Please check - https://huggingface.co/blog/RakshitAralimatti/streaming-data-rag
A real-time Streaming Data to RAG system that listens to live radio, transcribes it on-the-fly, and lets you query across TIME.
Not just "what was discussed" – but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
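[The linked blog post has the full system; as a rough illustration of the core idea, here is a minimal sketch of time-filtered retrieval: filter transcript chunks by timestamp and channel first, then rank the survivors semantically. The `Chunk` record, `temporal_query` helper, and MiniLM encoder are all assumptions for illustration, not RakshitAralimatti's implementation.]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

import numpy as np
from sentence_transformers import SentenceTransformer

@dataclass
class Chunk:
    text: str       # transcribed snippet
    channel: int    # source radio channel
    ts: datetime    # time the snippet was transcribed

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def temporal_query(chunks, question, since=None, until=None, channel=None, k=3):
    """Filter chunks by time window and channel first, then rank semantically."""
    pool = [c for c in chunks
            if (since is None or c.ts >= since)
            and (until is None or c.ts <= until)
            and (channel is None or c.channel == channel)]
    if not pool:
        return []
    q = encoder.encode(question, normalize_embeddings=True)
    embs = encoder.encode([c.text for c in pool], normalize_embeddings=True)
    scores = embs @ q  # cosine similarity on unit-normalized vectors
    return [pool[i] for i in np.argsort(scores)[::-1][:k]]

# "What happened in the last 10 minutes on channel 0?"
now = datetime.now()
chunks = [Chunk("Storm warning issued for the coast.", 0, now - timedelta(minutes=4)),
          Chunk("Traffic is backed up on route 9.", 1, now - timedelta(minutes=2))]
hits = temporal_query(chunks, "breaking weather news",
                      since=now - timedelta(minutes=10), channel=0)
print([c.text for c in hits])
```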
reacted to samerzaher80's post with 👍 · 30 days ago
AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU
I’m Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I’m releasing AetherMind_SRL – a 184M-parameter NLI model that was trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer’s dataset). It was never fine-tuned on or even shown a single MMLU question during training. Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
|---|---|---|---|
| AetherMind_SRL (me) | 184M | 36.05 % | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8 % | General pre-training |
| BERT-large | 340M | 27–30 % | General pre-training |
| LLaMA-1 7B | 7B | 34–35 % | Massive text corpus |
| LLaMA-2 7B | 7B | ~45 % | Bigger + better data |
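[An aside on methodology: the post doesn't show the evaluation harness. One common way to run an NLI-only classifier on MMLU's multiple-choice format is to treat each answer option as a hypothesis and pick the most-entailed one. The sketch below assumes that scheme plus the `[SEP]` input format from the snippet further down; the scoring scheme, label matching, and helper name are hypothetical, not the author's harness.]

```python
from transformers import pipeline

nli = pipeline("text-classification", model="samerzaher80/AetherMind_SRL")

def answer_mmlu(question: str, options: list[str]) -> int:
    """Hypothetical harness: score each option as an NLI hypothesis
    against the question and return the index of the most-entailed one."""
    entail_scores = []
    for opt in options:
        # Premise = question, hypothesis = candidate answer (assumed format)
        all_labels = nli(f"{question} [SEP] {opt}", top_k=None)
        score = next((r["score"] for r in all_labels
                      if r["label"].lower().startswith("entail")), 0.0)
        entail_scores.append(score)
    return max(range(len(options)), key=entail_scores.__getitem__)

print(answer_mmlu("The capital of France is", ["Berlin", "Paris", "Rome", "Madrid"]))
```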
Yes – my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen? I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0 (sketched in code after this list):

1. Train normally on NLI
2. Let the model predict on hard adversarial data (ANLI)
3. Log every mistake + low-confidence case
4. Build a balanced “SMART” buffer (60% errors + 40% correct anchors)
5. Fine-tune with a tiny LR and error-weighted loss
6. Repeat until stable
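[A minimal sketch of one SRL iteration for a generic PyTorch classifier, under stated assumptions: the confidence threshold, error weight, and all names (`srl_iteration`, `hard_loader`) are illustrative guesses, not the released training code.]

```python
import torch
import torch.nn.functional as F

def srl_iteration(model, hard_loader, optimizer,
                  err_ratio=0.6, err_weight=2.0, conf_threshold=0.6):
    """One self-reflective pass: log mistakes and low-confidence cases on
    hard data, build a 60/40 buffer, fine-tune with an error-weighted loss."""
    errors, anchors = [], []

    # Steps 2-3: predict on hard adversarial data, log errors / low confidence
    model.eval()
    with torch.no_grad():
        for x, y in hard_loader:                          # e.g. ANLI batches
            probs = F.softmax(model(x), dim=-1)
            conf, pred = probs.max(dim=-1)
            for xi, yi, ci, pi in zip(x, y, conf, pred):
                if pi != yi or ci < conf_threshold:
                    errors.append((xi, yi, err_weight))   # up-weighted mistakes
                else:
                    anchors.append((xi, yi, 1.0))         # correct anchors

    # Step 4: balanced "SMART" buffer – 60% errors + 40% correct anchors
    n_anchors = int(len(errors) * (1 - err_ratio) / err_ratio)
    buffer = errors + anchors[:n_anchors]

    # Step 5: fine-tune with a tiny LR (set on `optimizer`), per-example weights
    model.train()
    for xi, yi, w in buffer:
        optimizer.zero_grad()
        loss = w * F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        loss.backward()
        optimizer.step()
    # Step 6: the caller repeats until validation metrics stabilize.
```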
That’s it. No external knowledge, no MMLU data, no cluster.
Just pure reasoning transfer from entailment/contradiction patterns → real-world knowledge.

Try it yourself:

```python
from transformers import pipeline
import torch

nli_pipeline = pipeline(
    "text-classification",
    model="samerzaher80/AetherMind_SRL",
    device=0 if torch.cuda.is_available() else -1
)

# DEFINE YOUR TEST HERE
premise = "Patient shows progressive memory decline."
hypothesis = "Patient shows progressive memory decline."  # identical pair: expect entailment
input_text = f"{premise} [SEP] {hypothesis}"

result = nli_pipeline(input_text)[0]
print(f"Prediction: {result['label']}")
print(f"Confidence: {result['score']:.4f}")
```
Model: https://huggingface.co/samerzaher80/AetherMind_SRL
liked a model · about 2 months ago
MiniMaxAI/MiniMax-M2
Organizations
None yet