AI & ML interests

None defined yet.

Recent Activity

bgamazay updated a Space about 8 hours ago
AIEnergyScore/Label
bgamazay updated a Space about 8 hours ago
AIEnergyScore/submission_portal
sasha updated a dataset about 11 hours ago
AIEnergyScore/requests_debug

meg
posted an update 3 days ago
🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details, code, here: https://huggingface.co/blog/voice-consent-gate
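The actual design and code live in the blog post linked above; as a rough illustration of the idea, a consent gate could refuse to run a cloning pipeline unless the speaker's transcribed audio contains an explicit consent phrase. The sketch below is hypothetical (the function names and the hash-comparison approach are assumptions, not the post's API):

```python
import hashlib

def consent_fingerprint(transcript: str) -> str:
    """Normalize whitespace and case, then hash the consent phrase."""
    normalized = " ".join(transcript.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def gate_allows_cloning(spoken_transcript: str, required_phrase: str) -> bool:
    """Allow cloning only when the speaker actually uttered the consent phrase."""
    return consent_fingerprint(spoken_transcript) == consent_fingerprint(required_phrase)
```

A real gate would of course sit on top of a speech-to-text step and a liveness check; this only shows the final comparison.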
meg
posted an update about 2 months ago
🤖 As AI-generated content is shared in movies/TV/across the web, there's one simple low-hanging fruit 🍇 to help know what's real: visible watermarks. With the Gradio team, I've made sure it's trivially easy to add this disclosure to images, video, and chatbot text. See how: https://huggingface.co/blog/watermarking-with-gradio
Thanks in particular for the code collab from @abidlabs and Yuvraj Sharma.
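The blog post above shows the real Gradio integration; as a standalone sketch of the underlying idea (a visible disclosure banner stamped onto an image), something like this Pillow snippet would work — the function name and layout are illustrative assumptions, not the post's code:

```python
from PIL import Image, ImageDraw

def add_visible_watermark(img: Image.Image, text: str = "AI-generated") -> Image.Image:
    """Return a copy of `img` with a visible disclosure banner along the bottom."""
    out = img.copy()
    draw = ImageDraw.Draw(out)
    w, h = out.size
    banner_height = 14
    # Solid banner so the label stays readable on any background.
    draw.rectangle([(0, h - banner_height), (w, h)], fill=(0, 0, 0))
    # Pillow falls back to its default bitmap font, keeping the sketch dependency-free.
    draw.text((4, h - banner_height + 2), text, fill=(255, 255, 255))
    return out
```

The original image is left untouched; the watermarked copy is what you'd return to the user or save alongside the generation.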
yjernite
posted an update about 2 months ago
Tremendous quality-of-life upgrade on the Hugging Face Hub - we now have auto-complete emojis 🤗 🥳 👏 🙌 🎉

Get ready for lots more very serious analysis on a whole range of topics from yours truly now that we have unlocked this full range of expression 😄 🤔 🗣 🙊
meg
posted an update 3 months ago
🤖 ICYMI: Yesterday, Hugging Face and OpenAI partnered to bring open source GPT to the public. This is a Big Deal in "AI world".

0. Common ground setting: OpenAI is the ChatGPT people. An "open source" model is one whose weights are available: that means the model can be "yours".
1. You don't have to interact with the company directly, nor give them your interactions, to use the system. The company can't "surveil" you.
2. You can evaluate the unique contributions of their SOTA model much more rigorously than you can when there are collections of models+code behind a closed API. You can find out specifically what the model can and can't do.
3. And you can directly customize it for whatever you'd like. Fine-tuning, wherein you give the model data that's tailored to your use cases and train it some more on that data, is trivial* when you have the model weights.
*Provided you have the compute.
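The fine-tuning idea in point 3 — start from pretrained weights and keep training on your own data — can be shown with a deliberately tiny toy model (this is my illustration, not anything from the release; real LLM fine-tuning applies the same principle at vastly larger scale with libraries like transformers):

```python
# Toy illustration of fine-tuning: start from "pretrained" weights and
# continue gradient descent on a small, task-specific dataset.

def predict(w, b, x):
    """A one-parameter linear 'model' standing in for an LLM."""
    return w * x + b

def fine_tune(w, b, data, lr=0.05, epochs=300):
    """Run a few epochs of per-sample gradient descent on (x, y) pairs."""
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            w -= lr * err * x  # gradient of squared error w.r.t. w
            b -= lr * err      # gradient of squared error w.r.t. b
    return w, b

# "Pretrained" weights from some generic prior task:
w0, b0 = 0.5, 0.0
# Data tailored to *your* use case, here following y = 2x + 1:
my_data = [(0, 1), (1, 3), (2, 5), (3, 7)]
w, b = fine_tune(w0, b0, my_data)
```

After a few hundred epochs the weights drift from their generic starting point toward the values your data implies — which is all fine-tuning is, scaled down to two numbers.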
4. You can directly benchmark whatever you'd like. Biases? Energy usage? Strengths/weaknesses? Go for it. You wants it, you gots it. This transparency helps people understand SOTA *in general*, not just for this model; it points to, e.g., what's going on with closed Google models as well.
5. One of the most powerful things about "openness" that I've learned is that it cultivates ecosystems of collaborators building on top of one another's brilliance to make systems that are significantly better than they would be if created in isolation.
But, caveat wrt my own philosophy...
6. I do not take it as a given that advancing LLMs is good, and have a lot more to say wrt where I think innovation should focus more. For example, a focus on *data* -- curation, measurement, consent, credit, compensation, safety -- would deeply improve technology for everyone.
7. The transparency this release provides is massive for people who want to *learn* about LLMs. For the next generation of technologists to advance over the current, they MUST be able to learn about what's happening now. (cont...)
meg
posted an update 3 months ago
🤖 👾 Thanks so much to BBC News and the stellar Suranjana Tewari for having me on to talk about the US–China relationship in AI, and what it means for AI ethics.
IlyasMoutawwakil
posted an update 3 months ago
🚀 Optimum: The Last v1 Release 🚀
Optimum v1.27 marks the final major release in the v1 series. As we close this chapter, we're laying the groundwork for a more modular and community-driven future:
- Optimum v2: A lightweight core package for porting Transformers, Diffusers, or Sentence-Transformers to specialized AI hardware/software/accelerators.
- Optimum-ONNX: A dedicated package where the ONNX/ONNX Runtime ecosystem lives and evolves, faster-moving and decoupled from the Optimum core.

🎯 Why this matters:
- A clearer governance path for ONNX, fostering stronger community collaboration and an improved developer experience.
- Faster innovation in a more modular, open-source environment.

💡 What this means:
- More transparency, broader participation, and faster development driven by the community and key actors in the ONNX ecosystem (PyTorch, Microsoft, Joshua Lochner 👀, ...)
- A cleaner, more maintainable core Optimum, focused on extending HF libraries to special AI hardware/software/accelerators tooling and used by our partners (Intel Corporation, Amazon Web Services (AWS), AMD, NVIDIA, FuriosaAI, ...)

๐Ÿ› ๏ธ Major updates I worked on in this release:
โœ… Added support for Transformers v4.53 and SmolLM3 in ONNX/ONNXRuntime.
โœ… Solved batched inference/generation for all supported decoder model architectures (LLMs).

✨ Big shoutout to @echarlaix for leading the refactoring work that cleanly separated ONNX exporter logic and enabled the creation of Optimum-ONNX.

๐Ÿ“ Release Notes: https://lnkd.in/gXtE_qji
๐Ÿ“ฆ Optimum : https://lnkd.in/ecAezNT6
๐ŸŽ Optimum-ONNX: https://lnkd.in/gzjyAjSi
#Optimum #ONNX #OpenSource #HuggingFace #Transformers #Diffusers
yjernite
posted an update 3 months ago
๐—™๐—ถ๐—ฟ๐˜€๐˜ ๐—š๐—ฃ๐—”๐—œ ๐— ๐—ผ๐—ฑ๐—ฒ๐—น ๐˜„๐—ถ๐˜๐—ต ๐—˜๐—จ ๐——๐—ฎ๐˜๐—ฎ ๐—ง๐—ฟ๐—ฎ๐—ป๐˜€๐—ฝ๐—ฎ๐—ฟ๐—ฒ๐—ป๐—ฐ๐˜† ๐—ง๐—ฒ๐—บ๐—ฝ๐—น๐—ฎ๐˜๐—ฒ? ๐Ÿ‡ช๐Ÿ‡บ

With the release of the EU data transparency template this week, we finally got to see one of the most meaningful artifacts to come out of the AI Act implementation so far (haven't you heard? AI's all about the data! 📊📚)

The impact of the template will depend on how effectively it establishes a minimum meaningful transparency standard for companies that don't otherwise offer any transparency into their handling of e.g. personal data or (anti?-)competitive practices in commercial licensing - we'll see how those play out as new models are released after August 2nd 👀


In the meantime, I wanted to see how the template works for a fully open-source + commercially viable model, so I filled it out for SmolLM3, which my colleagues at Hugging Face released earlier this month 🤗 ICYMI, it's fully open-source with 3B parameters and performance matching the best similar-size models (I've switched all my local apps from Qwen3 to it, you should too 💡)

Verdict: congrats to the European Commission AI Office for making it so straightforward! Fully open and transparent models remain a cornerstone of informed regulation and governance, but the different organizational needs of their developers aren't always properly accounted for in new regulation. In this case, it took me all of two hours to fill out and publish the template (including reading the guidelines) - so kudos for making it feasible for smaller and distributed organizations 🙌 Definitely a step forward for transparency 🔍

To learn more have a look at:

- The SmolLM3 model: HuggingFaceTB/SmolLM3-3B
- Its filled out Public Summary of Training Content: hfmlsoc/smollm3-eu-data-transparency
- And if you're interested, some previous remarks on regulatory minimum meaningful standards for data disclosure: https://huggingface.co/blog/yjernite/naiac-data-transparency
yjernite
posted an update 5 months ago
regisss
posted an update 6 months ago
meg
posted an update 6 months ago