Watermark-Detection-SigLIP2 is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for binary image classification.
It detects whether an image contains a watermark, using the SiglipForImageClassification architecture.
⚠️ Note: Watermark detection works best with high-quality, crisp images. Avoid noisy inputs.
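A minimal inference sketch follows, assuming the checkpoint is loaded through the Transformers library with `AutoImageProcessor` and `SiglipForImageClassification`. The repository ID in `MODEL_ID` is a hypothetical placeholder, not the model's actual Hub path, and the helper names (`softmax`, `scores_from_logits`, `classify_image`) are illustrative. Label names are read from the checkpoint's own `id2label` config rather than hard-coded.

```python
import math

# Assumption: placeholder repo id; replace it with the actual Hub path of this checkpoint.
MODEL_ID = "your-namespace/Watermark-Detection-SigLIP2"  # hypothetical


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_from_logits(logits, id2label):
    """Map raw class logits to a {label: probability} dictionary."""
    probs = softmax(logits)
    return {id2label[i]: p for i, p in enumerate(probs)}


def classify_image(path, model_id=MODEL_ID):
    """Classify one image file. Requires torch, transformers, and Pillow."""
    # Heavy imports are kept local so the helpers above work without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, SiglipForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = SiglipForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint config, not from this sketch.
    return scores_from_logits(logits, model.config.id2label)
```

Calling `classify_image("photo.jpg")` would return one probability per class label. Downscaled or heavily compressed inputs may give less reliable scores, in line with the note above.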
Paper: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
https://arxiv.org/pdf/2502.14786
              precision    recall  f1-score   support
No Watermark     0.9290    0.9722    0.9501     12779
   Watermark     0.9622    0.9048    0.9326      9983
    accuracy                         0.9427     22762
   macro avg     0.9456    0.9385    0.9414     22762
weighted avg     0.9435    0.9427    0.9424     22762
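As a consistency check, the summary rows of the report can be recomputed from the per-class precision, recall, and support. The sketch below uses only the figures listed above. Because those inputs are already rounded to four decimals, the recomputed values should be expected to agree with the report only to within rounding. The variable names are illustrative.

```python
# Per-class precision, recall, and support copied from the report above.
report = {
    "No Watermark": {"precision": 0.9290, "recall": 0.9722, "support": 12779},
    "Watermark": {"precision": 0.9622, "recall": 0.9048, "support": 9983},
}


def f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(row["support"] for row in report.values())

# Per-class F1 scores.
f1_scores = {name: f1(row["precision"], row["recall"]) for name, row in report.items()}

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Weighted average: mean over classes, weighted by support.
weighted_f1 = sum(f1_scores[name] * row["support"] for name, row in report.items()) / total

# Accuracy equals the support-weighted recall, since recall * support
# is the number of correctly classified samples in each class.
accuracy = sum(row["recall"] * row["support"] for row in report.values()) / total

for name, score in f1_scores.items():
    print(f"{name:>12}  f1={score:.4f}")
print(f"macro f1={macro_f1:.4f}  weighted f1={weighted_f1:.4f}  accuracy={accuracy:.4f}")
```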
Base model: google/siglip2-base-patch16-224