Exgentic

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

Elron authored a paper 25 days ago

AlephBERT:A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With

Elron authored a paper 25 days ago

Efficient Benchmarking (of Language Models)

Elron authored a paper 25 days ago

Quality Controlled Paraphrase Generation

View all activity

Elron

authored 7 papers 25 days ago

AlephBERT:A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With

Paper • 2104.04052 • Published Apr 8, 2021

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

Paper • 2503.01622 • Published Mar 3, 2025

General Agent Evaluation

Paper • 2602.22953 • Published 29 days ago • 11

borgr

submitted a paper to Daily Papers 27 days ago

General Agent Evaluation

Paper • 2602.22953 • Published 29 days ago • 11

evijit

posted an update 6 months ago

Post

2728

AI for Scientific Discovery Won't Work Without Fixing How We Collaborate.

My co-author @cgeorgiaw and I just published a paper challenging a core assumption: that the main barriers to AI in science are technical. They're not. They're social.

Key findings:

🚨 The "AI Scientist" myth delays progress: Waiting for AGI devalues human expertise and obscures science's real purpose: cultivating understanding, not just outputs.
📊 Wrong incentives: Datasets have 100x longer impact than models, yet data curation is undervalued.
⚠️ Broken collaboration: Domain scientists want understanding. ML researchers optimize performance. Without shared language, projects fail.
🔍 Fragmentation costs years: Harmonizing just 9 cancer files took 329 hours.

Why this matters: Upstream bottlenecks like efficient PDE solvers could accelerate discovery across multiple sciences. CASP mobilized a community around protein structure, enabling AlphaFold. We need this for dozens of challenges.

Thus, we're launching Hugging Science! A global community addressing these barriers through collaborative challenges, open toolkits, education, and community-owned infrastructure. Please find all the links below!

Paper: AI for Scientific Discovery is a Social Problem (2509.06580)
Join:

hugging-science
Discord: https://discord.com/invite/VYkdEVjJ5J

burtenshaw

authored a paper 6 months ago

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Paper • 2509.25397 • Published Sep 29, 2025 • 14

burtenshaw

posted an update 7 months ago

Post

7442

Smol course has a distinctive approach to teaching post-training, so I'm posting about how it’s different to other post-training courses, including the llm course that’s already available.

In short, the smol course is just more direct that any of the other course, and intended for semi-pro post trainers.

- It’s a minimal set of instructions on the core parts.
- It’s intended to bootstrap real projects you're working on.
- The material handsover to existing documentation for details
- Likewise, it handsover to the LLM course for basics.
- Assessment is based on a leaderboard, without reading all the material.

To start the smol course, follow here:

smol-course

burtenshaw

posted an update 7 months ago

Post

5486

new smol course

If you’re building with or learning about post training AI models right now, we have a new FREE and CERTIFIED course.

🔗 Follow the org to join in

smol-course

The course builds on smol course v1 which was the fastest way to learn to train your custom AI models. It now has:

- A leaderboard for students to submit models to
- Certification based on exams and leaderboards
- Prizes based on Leaderboards
- Up to date content on TRL and SmolLM3
- Deep integration with the Hub’s compute for model training and evaluation

We will release chapters every few weeks, so you can follow the org to stay updated.

2 replies

burtenshaw

posted an update 7 months ago

Post

3151

The open source AI community is just made of people who are passionate and care about their work. So we thought it would be cool to share our favourite icons of the community with a fun award.

Winners get free Hugging Face Pro Subscriptions, Merchandise, or compute credits for the hub.

🔗 Follow and nominate here:

community-spotlight

This is a new initiative to recognise and celebrate the incredible work being done by community members. It's all about inspiring more collaboration and innovation in the world of machine learning and AI.

They're highlighting contributors in four key areas:
- model creators: building and sharing innovative and state-of-the-art models.
- educators: sharing knowledge through posts, articles, demos, and events.
- tool builders: creating the libraries, frameworks, and applications that we all use.
- community champions: supporting and mentoring others in forums.

Know someone who deserves recognition? Nominate them by opening a post in the Hugging Face community forum.

1 reply

evijit

posted an update 8 months ago

Post

407

New blog post alert! "What is the Hugging Face Community Building?", with @yjernite and @irenesolaiman

What 1.8 Million Models Reveal About Open Source Innovation: Our latest deep dive into the Hugging Face Hub reveals patterns that challenge conventional AI narratives:

🔗 Models become platforms for innovation Qwen, Llama, and Gemma models have spawned entire ecosystems of specialized variants. Looking at derivative works shows community adoption better than any single metric.

📊 Datasets reveal the foundation layer → Most downloaded datasets are evaluation benchmarks (MMLU, Squad, GLUE) → Universities and research institutions dominate foundational data → Domain-specific datasets thrive across finance, healthcare, robotics, and science → Open actors provide the datasets that power most AI development

🏛️ Research institutions lead the charge: AI2 (Allen Institute) emerges as one of the most active contributors, alongside significant activity from IBM, NVIDIA, and international organizations. The open source ecosystem spans far beyond Big Tech.

🔍 Interactive exploration tools: We've built several tools to help you discover patterns!

ModelVerse Explorer - organizational contributions
DataVerse Explorer - dataset patterns
Organization HeatMap - activity over time
Base Model Explorer - model family trees
Semantic Search - find models by capability

📚 Academic research is thriving: Researchers are already producing valuable insights, including recent work at FAccT 2025: "The Brief and Wondrous Life of Open Models." We've also made hub datasets, weekly snapshots, and other data available for your own analysis.

The bottom line: AI development is far more distributed, diverse, and collaborative than popular narratives suggest. Real innovation happens through community collaboration across specialized domains.

Read: https://huggingface.co/blog/evijit/hf-hub-ecosystem-overview

burtenshaw

posted an update 8 months ago

Post

1616

Kimi-K2 is ready for general use! In these notebooks I walk you through use cases like function calling and structured outputs.

🔗 burtenshaw/Kimi-K2-notebooks

You can swap it into any OpenAI compatible application via Inference Providers and get to work with an open source model.

1 reply

burtenshaw

posted an update 9 months ago

Post

3155

Inference for generative ai models looks like a mine field, but there’s a simple protocol for picking the best inference:

🌍 95% of users >> If you’re using open (large) models and need fast online inference, then use Inference providers on auto mode, and let it choose the best provider for the model. https://huggingface.co/docs/inference-providers/index

👷 fine-tuners/ bespoke >> If you’ve got custom setups, use Inference Endpoints to define a configuration from AWS, Azure, GCP. https://endpoints.huggingface.co/

🦫 Locals >> If you’re trying to stretch everything you can out of a server or local machine, use Llama.cpp, Jan, LMStudio or vLLM. https://huggingface.co/settings/local-apps#local-apps

🪟 Browsers >> If you need open models running right here in the browser, use transformers.js. https://github.com/huggingface/transformers.js

Let me know what you’re using, and if you think it’s more complex than this.

burtenshaw

posted an update 9 months ago

Post

1172

You don't need remote APIs for a coding copliot, or the MCP Course! Set up a fully local IDE with MCP integration using Continue. In this tutorial Continue guides you through setting it up.

This is what you need to do to take control of your copilot:

1. Get the Continue extension from the [VS Code marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue) to serve as the AI coding assistant.

2. Serve the model with an OpenAI compatible server in Llama.cpp / LmStudio/ etc.

llama-server -hf unsloth/Devstral-Small-2505-GGUF:Q4_K_M

3. Create a .continue/models/llama-max.yaml file in your project to tell Continue how to use the local Ollama model.

name: Llama.cpp model
    version: 0.0.1
    schema: v1
    models:
      - provider: llama.cpp
        model: unsloth/Devstral-Small-2505-GGUF
        apiBase: http://localhost:8080
        defaultCompletionOptions:
          contextLength: 8192 
    # Adjust based on the model
        name: Llama.cpp Devstral-Small
        roles:
          - chat
          - edit

4. Create a .continue/mcpServers/playwright-mcp.yaml file to integrate a tool, like the Playwright browser automation tool, with your assistant.

name: Playwright mcpServer
    version: 0.0.1
    schema: v1
    mcpServers:
      - name: Browser search
        command: npx
        args:
          - "@playwright/mcp@latest"

Check out the full tutorial in the [the MCP course](https://huggingface.co/learn/mcp-course/unit2/continue-client)

1 reply

burtenshaw

posted an update 10 months ago

Post

1806

Brand new MCP Course has units are out, and now it's getting REAL! We've collaborated with Anthropic to dive deep into production ready and autonomous agents using MCP

🔗

mcp-course

This is what the new material covers and includes:

- Use Claude Code to build an autonomous PR agent
- Integrate your agent with Slack and Github to integrate it with you Team
- Get certified on your use case and share with the community
- Build an autonomous PR cleanup agent on the Hugging Face hub and deploy it with spaces

The material goes deep into these problems and helps you to build applications that work. We’re super excited to see what you build with it.

burtenshaw

posted an update 10 months ago

Post

1670

Super excited to release Autotrain MCP. This is an MCP server for training AI models, so you can use your AI tools to train your AI models 🤯.

🔗 burtenshaw/autotrain-mcp

Use this MCP server with tools like Claude Desktop, Cursor, VSCode, or Continue to do this:

- Define an ML problem like Image Classification, LLM fine-tuning, Text Classification, etc.
- The AI can retrieve models and datasets from the hub using the hub MCP.
- Training happens on a Hugging Face space, so no worries about hardware restraints.
- Models are pushed to the hub to be used inference tools like Llama.cpp, vLLM, MLX, etc.
- Built on top of the AutoTrain library, so it has full integration with transformers and other libraries.

Everything is still under active development, but I’m super excited to hear what people build, and I’m open to contributions!

1 reply

evijit

posted an update 10 months ago

Post

1751

The HF Policy Team submitted our response to the 2025 National Artificial Intelligence (AI) Research and Development (R&D) Strategic Plan.

Blog (with link to full pdf response):

https://huggingface.co/blog/evijit/us-ai-research-strategy-rfi

AI & ML interests

Recent Activity

Team members 9

Exgentic's activity