Commit · 06ecdbb
Parent(s): 6282316
Add CLAUDE.md and localize UI to Korean

- Add comprehensive CLAUDE.md documentation for Claude Code
- Localize all UI elements and error messages to Korean

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
CLAUDE.md
ADDED
@@ -0,0 +1,112 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Hugging Face Gradio Space that applies textures to images using the Qwen-Image-Edit-2509 model enhanced with custom LoRA adapters. The application takes a content image and a texture image as inputs, then applies the texture to the content based on a text description.

## Key Architecture

### Pipeline Structure

The application uses a custom diffusion pipeline built on top of Diffusers:

- **Base Model**: `Qwen/Qwen-Image-Edit-2509` with FlowMatchEulerDiscreteScheduler
- **LoRA Adapters**: Two LoRAs are fused at startup:
  1. `tarn59/apply_texture_qwen_image_edit_2509` - texture application capability
  2. `lightx2v/Qwen-Image-Lightning` - 4-step inference acceleration
- **Custom Components** (in `qwenimage/` module):
  - `QwenImageEditPlusPipeline` - modified pipeline for dual image input
  - `QwenImageTransformer2DModel` - custom transformer implementation
  - `QwenDoubleStreamAttnProcessorFA3` - FlashAttention 3 processor for performance

### Pipeline Initialization Flow

1. Scheduler configured with exponential time shift and dynamic shifting
2. Base pipeline loaded with bfloat16 dtype
3. Both LoRAs loaded and fused (then unloaded to save memory)
4. Transformer class swapped to custom implementation
5. FlashAttention 3 processor applied
6. Pipeline moved to GPU and optimized with AOT compilation
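
A condensed sketch of this startup sequence. It assumes the class and repository names listed above; the exact scheduler kwargs, LoRA adapter handling, and helper names in `app.py` may differ:

```python
import torch
from diffusers import FlowMatchEulerDiscreteScheduler
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3

# 1. Scheduler with exponential time shift and dynamic shifting (illustrative config)
scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    {"use_dynamic_shifting": True, "time_shift_type": "exponential"}
)

# 2. Base pipeline loaded in bfloat16
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", scheduler=scheduler, torch_dtype=torch.bfloat16
)

# 3. Load and fuse both LoRAs, then drop the adapter weights to free memory
pipe.load_lora_weights("tarn59/apply_texture_qwen_image_edit_2509", adapter_name="texture")
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", adapter_name="lightning")  # may need weight_name=... to pick a checkpoint
pipe.set_adapters(["texture", "lightning"])
pipe.fuse_lora(adapter_names=["texture", "lightning"])
pipe.unload_lora_weights()

# 4-5. Swap in the custom transformer class and the FlashAttention 3 processor
pipe.transformer.__class__ = QwenImageTransformer2DModel
pipe.transformer.set_attn_processor(QwenDoubleStreamAttnProcessorFA3())

# 6. Move to the GPU; AOT compilation is applied afterwards (see optimization.py)
pipe.to("cuda")
```
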
### Optimization System

`optimization.py` implements ahead-of-time (AOT) compilation using Spaces GPU infrastructure:

- Exports transformer with torch.export using dynamic shapes for variable sequence lengths
- Compiles with TorchInductor using aggressive optimizations (max_autotune, cudagraphs, etc.)
- Applies compiled model back to pipeline transformer
- Warmup run performed during initialization with 1024x1024 dummy images
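
The general shape of that AOT path, assuming the `spaces.aoti_capture` / `spaces.aoti_compile` / `spaces.aoti_apply` helpers behave as described in the Spaces ZeroGPU documentation (dynamic-shape setup and inductor flags omitted for brevity):

```python
import torch
import spaces

def compile_transformer(pipe, example_inputs: dict):
    # Run the pipeline once and capture the actual transformer call arguments
    with spaces.aoti_capture(pipe.transformer) as call:
        pipe(**example_inputs)

    # Export the transformer; dynamic dims would let the sequence length vary at runtime
    exported = torch.export.export(pipe.transformer, args=call.args, kwargs=call.kwargs)

    # Compile with TorchInductor and swap the compiled artifact back into the pipeline
    compiled = spaces.aoti_compile(exported)
    spaces.aoti_apply(compiled, pipe.transformer)
```
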
## Running the Application

### Local Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python app.py
```

### Testing Inference

The main function is `apply_texture()` in `app.py:82`. Key parameters:
- `content_image`: PIL Image or file path - the base image
- `texture_image`: PIL Image or file path - the texture to apply
- `prompt`: Text description (e.g., "Apply wood texture to mug")
- `num_inference_steps`: Default 4 (optimized for Lightning LoRA)
- `true_guidance_scale`: Default 1.0
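
A hypothetical direct call, assuming `apply_texture()` is importable from `app.py` and that the parameters above are accepted as keyword arguments (the real signature also takes seed-related and progress arguments):

```python
from PIL import Image
from app import apply_texture  # importing app triggers model download/compilation

content = Image.open("coffee_mug.png")
texture = Image.open("wood_boxes.png")

# 4 steps matches the fused Lightning LoRA; guidance stays at 1.0
result = apply_texture(
    content_image=content,
    texture_image=texture,
    prompt="Apply wood texture to mug",
    num_inference_steps=4,
    true_guidance_scale=1.0,
)
```
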
### Image Dimension Handling

Output dimensions are calculated from the content image (`calculate_dimensions()` at `app.py:59`):
- Largest side is scaled to 1024px
- Aspect ratio preserved
- Both dimensions rounded to multiples of 8 (required by model)
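
A sketch of that sizing logic; the actual `calculate_dimensions()` in `app.py` may differ in details such as rounding direction:

```python
def calculate_dimensions(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    # Scale so the longer side becomes `target` while preserving aspect ratio
    scale = target / max(width, height)
    new_w, new_h = int(width * scale), int(height * scale)
    # Snap both sides to multiples of 8, as the model requires
    return (new_w // 8) * 8, (new_h // 8) * 8
```
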
## Important Technical Details

### Model Device and Dtype

- Uses `torch.bfloat16` for memory efficiency and H100 compatibility
- Automatically selects CUDA if available, falls back to CPU
- Pipeline optimization assumes GPU availability (uses `@spaces.GPU` decorator)
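
In code this amounts to roughly the following (a sketch; `app.py` may structure it differently):

```python
import torch

# bfloat16 keeps memory usage low and is natively supported on H100
dtype = torch.bfloat16
device = "cuda" if torch.cuda.is_available() else "cpu"
```
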
### Spaces Integration

This app is designed for Hugging Face Spaces with ZeroGPU:
- `@spaces.GPU` decorator on inference function allocates GPU on-demand
- Optimization uses `spaces.aoti_capture()`, `spaces.aoti_compile()`, and `spaces.aoti_apply()`
- Compilation happens once at startup with 1500s duration allowance
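
How the on-demand allocation typically looks; the decorator is the standard Spaces API, but the `duration` value and the pipeline call signature below are illustrative:

```python
import spaces

@spaces.GPU(duration=120)  # GPU attached only while this function runs (illustrative duration)
def infer(content_image, texture_image, prompt, num_inference_steps=4):
    # `pipe` is the globally initialized pipeline from app.py; the dual-image call
    # signature is defined in qwenimage/pipeline_qwenimage_edit_plus.py
    return pipe(
        image=[content_image, texture_image],
        prompt=prompt,
        num_inference_steps=num_inference_steps,
    ).images[0]
```
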
### Custom Module Dependencies

The `qwenimage/` module contains modified Diffusers components:
- Not installed via pip, part of the repository
- Must be kept in sync if updating base Diffusers version
- Implements dual-image input handling for texture application
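
Because the module lives in the repository rather than on PyPI, it is imported directly, assuming each class sits in the correspondingly named file:

```python
# Local modules resolved relative to the repository root (not pip-installed)
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3
```
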
## Common Development Commands

```bash
# Test the app locally (will download ~10GB of models on first run)
python app.py

# Check dependencies
pip list | grep -E "diffusers|transformers|torch"

# View GPU memory usage during inference (if running on GPU)
nvidia-smi
```

## Key Files

- `app.py` - Main Gradio interface and inference logic
- `optimization.py` - AOT compilation and quantization utilities
- `qwenimage/pipeline_qwenimage_edit_plus.py` - Custom dual-image pipeline
- `qwenimage/transformer_qwenimage.py` - Modified transformer model
- `qwenimage/qwen_fa3_processor.py` - FlashAttention 3 attention processor
- `requirements.txt` - Includes diffusers from GitHub main branch

app.py
CHANGED
@@ -90,11 +90,11 @@ def apply_texture(
     progress=gr.Progress(track_tqdm=True)
 ):
     if content_image is None:
-        raise gr.Error("
+        raise gr.Error("콘텐츠 이미지를 업로드해주세요.")
     if texture_image is None:
-        raise gr.Error("
+        raise gr.Error("텍스처 이미지를 업로드해주세요.")
     if not prompt or not prompt.strip():
-        raise gr.Error("
+        raise gr.Error("설명을 입력해주세요.")
 
     if randomize_seed:
         seed = random.randint(0, MAX_SEED)

@@ -130,46 +130,46 @@ css = '''
 
 with gr.Blocks(theme=gr.themes.Citrus(), css=css) as demo:
     with gr.Column(elem_id="col-container"):
-        gr.Markdown("#
+        gr.Markdown("# 텍스처 적용 - Qwen Image Edit")
         gr.Markdown("""
-
-
+        [tarn59의 Apply-Texture-Qwen-Image-Edit-2509 LoRA](https://huggingface.co/tarn59/apply_texture_qwen_image_edit_2509)와
+        [lightx2v/Qwen-Image-Lightning](https://huggingface.co/lightx2v/Qwen-Image-Lightning)을 사용한 4단계 추론 🎨
         """)
 
     with gr.Row():
         with gr.Column():
             with gr.Row():
-                content_image = gr.Image(label="
-                texture_image = gr.Image(label="
+                content_image = gr.Image(label="콘텐츠", type="pil")
+                texture_image = gr.Image(label="텍스처", type="pil")
-
+
             prompt = gr.Textbox(
-                label="
-                info="
-                placeholder="
+                label="설명",
+                info="...에 ... 텍스처 적용",
+                placeholder="건물 벽에 나무 사이딩 텍스처 적용"
             )
+
+            button = gr.Button("✨ 생성", variant="primary")
 
-
-
-
-            seed = gr.Slider(label="Seed", minimum=0, maximum=MAX_SEED, step=1, value=0)
-            randomize_seed = gr.Checkbox(label="Randomize Seed", value=True)
+            with gr.Accordion("⚙️ 고급 설정", open=False):
+                seed = gr.Slider(label="시드", minimum=0, maximum=MAX_SEED, step=1, value=0)
+                randomize_seed = gr.Checkbox(label="시드 무작위화", value=True)
             true_guidance_scale = gr.Slider(
-                label="
-                minimum=1.0,
-                maximum=10.0,
-                step=0.1,
+                label="실제 가이던스 스케일",
+                minimum=1.0,
+                maximum=10.0,
+                step=0.1,
                 value=1.0
             )
             num_inference_steps = gr.Slider(
-                label="
-                minimum=1,
-                maximum=40,
-                step=1,
+                label="추론 단계",
+                minimum=1,
+                maximum=40,
+                step=1,
                 value=4
             )
 
         with gr.Column():
-            output = gr.Image(label="
+            output = gr.Image(label="출력", interactive=False)
 
     # Event handlers
     button.click(

@@ -189,8 +189,8 @@ with gr.Blocks(theme=gr.themes.Citrus(), css=css) as demo:
     # Examples
     gr.Examples(
         examples=[
-            ["coffee_mug.png", "wood_boxes.png", "
-            ["720park.jpg", "black-and-white.jpg", "
+            ["coffee_mug.png", "wood_boxes.png", "머그에 나무 텍스처 적용"],
+            ["720park.jpg", "black-and-white.jpg", "건물에 흑백 물결 텍스처 적용"],
         ],
         inputs=[
            content_image,