Cannot run on Apple Silicon (M4) due to Triton

#11, opened by Metaphorz

Working with an AI coder, I have YOLO and Google's segmenter working, but after downloading sam3.pt I am reading that Triton is tied to CUDA, so I cannot get this to work with MPS. Does anyone have a workaround? Here is Claude Code's SAM 3 report:

SAM 3 Findings:
After downloading the checkpoint and attempting setup, I discovered that SAM 3 cannot run on Apple Silicon because:

  • Hard dependency on the triton library (CUDA-only, no MPS support)
  • The dependency chain requires triton for Euclidean Distance Transform calculations
  • PyTorch 2.9+ also expects triton to be available
  • Requires an NVIDIA GPU with CUDA to function

Have you tried the transformers implementation on the main branch? (pip install git+https://github.com/huggingface/transformers)
I haven't tested the implementation on MPS, but it shouldn't have the triton dependency issue.
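If it helps, the only machine-specific part should be where you place the model and inputs. A minimal sketch of device selection on Apple Silicon (the facebook/sam3 checkpoint id and the Auto-class resolution are assumptions on my part; swap in whatever the model card documents):

import torch
from transformers import AutoModel, AutoProcessor

# Pick the best available backend: CUDA if present, Apple MPS otherwise, CPU as a fallback.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Assumed checkpoint id and Auto classes; use the SAM 3-specific classes
# from the transformers docs if these do not resolve.
processor = AutoProcessor.from_pretrained("facebook/sam3")
model = AutoModel.from_pretrained("facebook/sam3").to(device).eval()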

I just tested it: it works on Apple Silicon if we install from GitHub with pip install git+https://github.com/huggingface/transformers torchvision

@yonigozlan, @AbacusGauge: thanks much. Indeed, transformers from HF was the solution! We can close this ticket. I think I need to educate myself more on SAM 3, especially the threshold and mask settings.

@abacus: Your demo is excellent. We (Claude Code and I) borrowed this code, with attribution to Abacus. We have an emerging tool that covers three segmentation models: YOLO, Gemini, and Meta SAM 3, and we are adding Grounding DINO as a fourth right now. If there is interest, I can push it to GitHub, but I need to make sure to leave the Meta .pt checkpoint out of it so that users will need to go through the Meta agreement.

Thanks @Metaphorz

You are attributing to the wrong person. I am https://huggingface.co/AbacusGauge (https://github.com/amritsingh183).

Hi @yonigozlan, @AbacusGauge. Does your solution work only on images?

Is CUDA a must for running SAM 3 on video?

@pokijunior Not really, I am able to use it on videos as well, with a few small code changes. If you're still facing issues, share a screenshot of the error you're hitting.

Hi @sjoshi, I ran pip install git+https://github.com/huggingface/transformers torchvision and the Pre-loaded Video Inference tutorial code, but hit this error:

Traceback (most recent call last):
  File "/Users/user/sam3-huggingface-tutorial/./pcs-video.py", line 35, in <module>
    processed_outputs = processor.postprocess_outputs(inference_session, model_outputs)
  File "/Users/user/sam3-huggingface-tutorial/venv/lib/python3.13/site-packages/transformers/models/sam3_video/processing_sam3_video.py", line 343, in postprocess_outputs
    keep_idx_gpu = keep_idx.pin_memory().to(device=out_binary_masks.device, non_blocking=True)
                   ~~~~~~~~~~~~~~~~~~~^^
RuntimeError: Attempted to set the storage of a tensor on device "cpu" to a storage on different device "mps:0".  This is no longer allowed; the devices must match.

What changes did you make in your code?

Go to line 343 at the path from your error message (/Users/user/sam3-huggingface-tutorial/venv/lib/python3.13/site-packages/transformers/models/sam3_video/processing_sam3_video.py) and change it to:

keep_idx_gpu = keep_idx.to(device=out_binary_masks.device, non_blocking=True)

This should fix it.

Make sure to change the source code in the venv path.
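If you would rather keep the CUDA fast path instead of dropping pin_memory() entirely, a device-aware variant of that line should also work. This is a sketch reusing the variable names from the traceback; pinned memory only helps asynchronous CPU-to-CUDA copies and appears to be what trips the device check on MPS:

if out_binary_masks.device.type == "cuda":
    # Pinned host memory allows a fast, non-blocking copy to the CUDA device.
    keep_idx_gpu = keep_idx.pin_memory().to(device=out_binary_masks.device, non_blocking=True)
else:
    # On MPS or CPU, skip pinning; a plain copy avoids the storage/device mismatch error.
    keep_idx_gpu = keep_idx.to(device=out_binary_masks.device)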

It works! Thank you so much, @sjoshi!
