Proto_AGI PRO

mayafree

AI & ML interests

None yet

Recent Activity

liked a Space about 22 hours ago

upvoted an article about 22 hours ago

MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning

reacted to SeaWolf-AI's post with 🔥 about 22 hours ago

🚀 Introducing MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning

Now available on PyPI · GitHub · ClawHub · HuggingFace

AI models sense they could be wrong, but they can't actually fix what's broken.

🤗 Live A/B test: https://huggingface.co/spaces/VIDraft/MARL

We evaluated 9 SOTA models (GPT-5.2, Claude Opus 4.6, Gemini 3 Pro, etc.) across 1,800 assessments in FINAL Bench and found a 39.2 percentage-point gap between "recognizing potential errors" (MA=0.694) and "actually finding and fixing them" (ER=0.302).

MARL (Model-Agnostic Runtime Middleware for LLMs) was built to close this metacognitive gap. It decomposes a single LLM call into a 5-stage expert pipeline (Hypothesis → Solver → Auditor → Adversarial Verifier → Synthesizer), transforming "answer in one shot" into "think, doubt, correct, and rewrite."

No weight modification is needed: it works instantly with GPT-5.4, Claude, Gemini, Llama, or any OpenAI API-compatible LLM by changing one line, base_url. It ships with 9 domain-specific emergence engines (invention, pharma, genomics, chemistry, ecology, law, and more, with 5,538 expert data items), activated by a simple tag like model="gpt-5.4::pharma".

pip install marl-middleware

MARL is also officially registered on ClawHub, the skill marketplace of OpenClaw, an AI agent platform with 260K+ developers and 3,200+ skills. It's the first middleware in the Reasoning Enhancement category. One command, clawhub install marl-middleware, gives your AI agent a metacognition upgrade.

📝 Technical deep dive: https://huggingface.co/blog/FINAL-Bench/marl-middleware
📦 PyPI: https://pypi.org/project/marl-middleware/
🐙 GitHub: https://github.com/Vidraft/MARL
🦀 ClawHub: https://clawhub.ai/Cutechicken99/marl-middleware

#MARL #LLM #Hallucination #Metacognition #MultiAgent #AIMiddleware #FINALBench #OpenClaw #ClawHub #PyPI #AGI #HuggingFace #ReasoningAI #SelfCorrection #GlassBoxAI
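To make the staged idea in the post concrete, here is a minimal offline sketch of how one call might be split into the five named stages, and how a tag such as "gpt-5.4::pharma" might be separated into a model name and a domain. This is not MARL's actual code. The function names (parse_model_tag, run_pipeline, echo_llm), the prompt layout, and the assumption that "::" is the separator are all hypothetical, inferred only from the example tag and stage list in the post. The model call is replaced by a stub so the sketch runs without any network access or API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Stage names as listed in the post; everything else here is illustrative.
STAGES = ["Hypothesis", "Solver", "Auditor", "Adversarial Verifier", "Synthesizer"]


@dataclass
class ModelTag:
    model: str
    domain: Optional[str]


def parse_model_tag(tag: str) -> ModelTag:
    """Split 'model::domain' into its parts (the '::' separator is an assumption)."""
    model, sep, domain = tag.partition("::")
    return ModelTag(model=model, domain=domain if sep and domain else None)


def run_pipeline(
    question: str, tag: str, call_llm: Callable[[str, str], str]
) -> Dict[str, str]:
    """Run the five stages in order, feeding each stage the previous output."""
    parsed = parse_model_tag(tag)
    transcript: Dict[str, str] = {}
    context = question
    for stage in STAGES:
        # Hypothetical prompt layout: stage label, domain hint, then prior context.
        prompt = f"[{stage}] domain={parsed.domain or 'general'}\n{context}"
        context = call_llm(parsed.model, prompt)
        transcript[stage] = context
    return transcript


def echo_llm(model: str, prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    stage = prompt.split("]")[0].lstrip("[")
    return f"{model} output for {stage}"


if __name__ == "__main__":
    result = run_pipeline("Is compound X soluble in water?", "gpt-5.4::pharma", echo_llm)
    for stage, text in result.items():
        print(f"{stage}: {text}")
```

In a real deployment the stub would be replaced by a request to an OpenAI API-compatible endpoint, which per the post is selected by changing base_url. The exact URL and any extra parameters are not given in the post and are left out here.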

mayafree's Spaces (9)

Titan1

Score language models on 100 metacognitive benchmark tasks

Titan1

Evaluate LLMs on 100 metacognitive benchmark tasks

Titan1

Evaluate LLMs on 100 metacognitive benchmark tasks

Titan1

Evaluate LLMs on FINAL Bench metacognitive tasks

Titan1

Evaluate LLMs on the FINAL Bench Metacognitive benchmark

All Bench

a

Open NPC AI

openclaw moltbot

LightOnOCR 2 1B Demo

Extract and recognize text from images and PDFs

Humangen

humangen.ai