Proto_AGI PRO

mayafree

AI & ML interests

None yet

Recent Activity

liked a Space about 18 hours ago

upvoted an article about 18 hours ago

MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning

reacted to SeaWolf-AI's post with 🔥 about 18 hours ago

🚀 Introducing MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning

Now available on PyPI · GitHub · ClawHub · HuggingFace

AI models sense they could be wrong, but they can't actually fix what's broken. We evaluated 9 SOTA models (GPT-5.2, Claude Opus 4.6, Gemini 3 Pro, etc.) across 1,800 assessments in FINAL Bench and found a 39.2 percentage-point gap between "recognizing potential errors" (MA=0.694) and "actually finding and fixing them" (ER=0.302).

MARL (Model-Agnostic Runtime Middleware for LLMs) was built to close this metacognitive gap. It decomposes a single LLM call into a 5-stage expert pipeline (Hypothesis → Solver → Auditor → Adversarial Verifier → Synthesizer), transforming "answer in one shot" into "think, doubt, correct, and rewrite."

No weight modification: it works instantly with GPT-5.4, Claude, Gemini, Llama, or any OpenAI API-compatible LLM by changing one line, base_url. It ships with 9 domain-specific emergence engines (invention, pharma, genomics, chemistry, ecology, law, and more; 5,538 expert data items), each activated by a simple tag like model="gpt-5.4::pharma".

pip install marl-middleware

MARL is also officially registered on ClawHub, the skill marketplace of OpenClaw, an AI agent platform with 260K+ developers and 3,200+ skills. It's the first middleware in the Reasoning Enhancement category. One command, clawhub install marl-middleware, gives your AI agent a metacognition upgrade.

📝 Technical deep dive: https://huggingface.co/blog/FINAL-Bench/marl-middleware
🤗 Live A/B test: https://huggingface.co/spaces/VIDraft/MARL
📦 PyPI: https://pypi.org/project/marl-middleware/
🐙 GitHub: https://github.com/Vidraft/MARL
🦀 ClawHub: https://clawhub.ai/Cutechicken99/marl-middleware

#MARL #LLM #Hallucination #Metacognition #MultiAgent #AIMiddleware #FINALBench #OpenClaw #ClawHub #PyPI #AGI #HuggingFace #ReasoningAI #SelfCorrection #GlassBoxAI
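
The staged pipeline and the model="gpt-5.4::pharma" tag described in the post can be illustrated with a small sketch. This is not MARL's actual API or implementation, which the post does not show. Only the five stage names, the "::" tag format, and the MA/ER figures come from the post; the function names (parse_model_tag, run_pipeline, echo_llm), the prompt layout, and the stand-in model are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Stage names are taken from the post; everything else here is hypothetical.
STAGES = ["Hypothesis", "Solver", "Auditor", "Adversarial Verifier", "Synthesizer"]


def parse_model_tag(model: str) -> Tuple[str, Optional[str]]:
    """Split 'gpt-5.4::pharma' into ('gpt-5.4', 'pharma'); no tag gives None."""
    base, sep, domain = model.partition("::")
    return base, (domain if sep and domain else None)


def run_pipeline(question: str, llm: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Call `llm` once per stage, feeding each stage the earlier outputs."""
    transcript: List[Tuple[str, str]] = []
    for stage in STAGES:
        history = "\n".join(f"[{name}] {text}" for name, text in transcript)
        prompt = f"Role: {stage}\nQuestion: {question}\nPrior stages:\n{history}"
        transcript.append((stage, llm(prompt)))
    return transcript


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model: reports which role it was asked to play."""
    role = prompt.splitlines()[0].split(": ", 1)[1]
    return "output of " + role


if __name__ == "__main__":
    print(parse_model_tag("gpt-5.4::pharma"))
    for stage, text in run_pipeline("Example question", echo_llm):
        print(stage, "->", text)
    # The reported gap is the difference between the two scores from the post.
    print("MA - ER =", round(0.694 - 0.302, 3))
```

In a real deployment the echo_llm stand-in would be replaced by a call to an OpenAI API-compatible endpoint; how MARL itself routes, prompts, or combines the stages is not specified in the post.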

Organizations

upvoted an article about 18 hours ago

Article

MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning

about 19 hours ago • 10

upvoted an article 2 days ago

Article

Structural Problems in AI Benchmarking and the Case for a Unified Evaluation Framework

2 days ago • 10

upvoted an article 14 days ago

Article

Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism?

14 days ago • 17

upvoted an article 16 days ago

Article

FINAL Bench: The Real Bottleneck to AGI Is Self-Correction

17 days ago • 20

upvoted a collection 17 days ago

FINAL Bench

World's First Functional Metacognition Benchmark. "Not how much AI knows — but whether it knows what it doesn't know, and can fix it." • 2 items • Updated 17 days ago • 4