prithivMLmods committed (verified)
Commit ed2ebdb · Parent: 23ceb7e

Update README.md (#1)

- Update README.md (2d12cf2c673224331a1eb24b5af26400730d6a36)

Files changed (1): README.md (+100, −1)
---
license: apache-2.0
datasets:
- flwrlabs/pacs
language:
- en
base_model:
- google/siglip2-base-patch16-224
pipeline_tag: image-classification
library_name: transformers
tags:
- PACS-DG
- domain generalization
- SigLIP2
---

![4.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/2M1HRenGKvzLJiAdaexKs.png)

# **PACS-DG-SigLIP2**

> **PACS-DG-SigLIP2** is a vision-language encoder model fine-tuned from **google/siglip2-base-patch16-224** for **multi-class domain generalization** classification. It is trained to distinguish visual domains such as **art paintings**, **cartoons**, **photos**, and **sketches** using the **SiglipForImageClassification** architecture.

```py
Classification Report:
              precision    recall  f1-score   support

id2label = {str(i): label for i, label in enumerate(labels)}

# Print the mapping
print(id2label)
```

---

## **Label Space: 4 Domain Categories**

The model predicts the most probable visual domain from the following:

```
Class 0: "art_painting"
Class 1: "cartoon"
Class 2: "photo"
Class 3: "sketch"
```
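
The class indices above correspond to the `id2label` mapping printed earlier in the card. As a minimal sketch, the mapping and its inverse can be rebuilt from the label list alone:

```python
# Build id2label / label2id from the four PACS domain labels above.
labels = ["art_painting", "cartoon", "photo", "sketch"]

id2label = {str(i): label for i, label in enumerate(labels)}
label2id = {label: str(i) for i, label in enumerate(labels)}

print(id2label)  # {'0': 'art_painting', '1': 'cartoon', '2': 'photo', '3': 'sketch'}
```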

---

## **Install dependencies**

```bash
pip install -q transformers torch pillow gradio
```

---

## **Inference Code**

```python
import gradio as gr
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

# Load model and processor
model_name = "prithivMLmods/PACS-DG-SigLIP2"  # Update to your actual model path on Hugging Face
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label map
id2label = {
    "0": "art_painting",
    "1": "cartoon",
    "2": "photo",
    "3": "sketch"
}

def classify_pacs_image(image):
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction

# Gradio Interface
iface = gr.Interface(
    fn=classify_pacs_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=4, label="Predicted Domain Probabilities"),
    title="PACS-DG-SigLIP2",
    description="Upload an image to classify its visual domain: Art Painting, Cartoon, Photo, or Sketch."
)

if __name__ == "__main__":
    iface.launch()
```
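
For scripted use without the Gradio UI, the same pipeline can be reduced to a small helper. This is a sketch only — `probs_to_prediction` and `classify_file` are illustrative names, not part of the model repo:

```python
# Minimal non-UI inference sketch for PACS-DG-SigLIP2.
ID2LABEL = {0: "art_painting", 1: "cartoon", 2: "photo", 3: "sketch"}

def probs_to_prediction(probs):
    """Map a probability list (class order above) to a {label: prob} dict."""
    return {ID2LABEL[i]: round(p, 3) for i, p in enumerate(probs)}

def classify_file(path, model_name="prithivMLmods/PACS-DG-SigLIP2"):
    """Classify one image file and return per-domain probabilities."""
    # Heavy deps are imported lazily so probs_to_prediction stays importable
    # without torch/transformers installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, SiglipForImageClassification

    model = SiglipForImageClassification.from_pretrained(model_name)
    processor = AutoImageProcessor.from_pretrained(model_name)
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze().tolist()
    return probs_to_prediction(probs)

# Example: classify_file("path/to/image.jpg")
```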

---

## **Intended Use**

The **PACS-DG-SigLIP2** model is designed to support tasks in **domain generalization**, particularly:

- **Cross-domain Visual Recognition** – Identify the domain style of an image.
- **Robust Representation Learning** – Aid in training or evaluating models on domain-shifted inputs.
- **Dataset Characterization** – Use as a tool to explore domain imbalance or drift.
- **Educational Tools** – Help understand how models distinguish between stylistic image variations.
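
As a sketch of the "Dataset Characterization" use above, predicted domains can be counted over a folder of images to surface imbalance. The folder-scanning helper and the `imbalance_ratio` metric are illustrative assumptions, not part of the model repo:

```python
# Count predicted PACS domains over a directory of .jpg images.
from collections import Counter
from pathlib import Path

LABELS = ["art_painting", "cartoon", "photo", "sketch"]

def imbalance_ratio(counts):
    """Ratio of the most to least frequent domain; 1.0 means perfectly balanced."""
    values = [counts.get(label, 0) for label in LABELS]
    return max(values) / max(min(values), 1)

def domain_distribution(image_dir, model_name="prithivMLmods/PACS-DG-SigLIP2"):
    """Return a Counter of the model's predicted domain per image."""
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, SiglipForImageClassification

    model = SiglipForImageClassification.from_pretrained(model_name)
    processor = AutoImageProcessor.from_pretrained(model_name)
    counts = Counter()
    for path in sorted(Path(image_dir).glob("*.jpg")):
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        with torch.no_grad():
            pred = model(**inputs).logits.argmax(dim=-1).item()
        counts[LABELS[pred]] += 1
    return counts

# Example: counts = domain_distribution("my_dataset/"); print(counts, imbalance_ratio(counts))
```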