Commit · 06ecdbb
Parent(s): 6282316
Add CLAUDE.md and localize UI to Korean

- Add comprehensive CLAUDE.md documentation for Claude Code
- Localize all UI elements and error messages to Korean

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
CLAUDE.md
ADDED
@@ -0,0 +1,112 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Hugging Face Gradio Space that applies textures to images using the Qwen-Image-Edit-2509 model enhanced with custom LoRA adapters. The application takes a content image and a texture image as inputs, then applies the texture to the content based on a text description.

## Key Architecture

### Pipeline Structure

The application uses a custom diffusion pipeline built on top of Diffusers:

- **Base Model**: `Qwen/Qwen-Image-Edit-2509` with FlowMatchEulerDiscreteScheduler
- **LoRA Adapters**: Two LoRAs are fused at startup:
  1. `tarn59/apply_texture_qwen_image_edit_2509` - texture application capability
  2. `lightx2v/Qwen-Image-Lightning` - 4-step inference acceleration
- **Custom Components** (in `qwenimage/` module):
  - `QwenImageEditPlusPipeline` - modified pipeline for dual image input
  - `QwenImageTransformer2DModel` - custom transformer implementation
  - `QwenDoubleStreamAttnProcessorFA3` - FlashAttention 3 processor for performance

### Pipeline Initialization Flow

1. Scheduler configured with exponential time shift and dynamic shifting
2. Base pipeline loaded with bfloat16 dtype
3. Both LoRAs loaded and fused (then unloaded to save memory)
4. Transformer class swapped to custom implementation
5. FlashAttention 3 processor applied
6. Pipeline moved to GPU and optimized with AOT compilation
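
A condensed sketch of this startup sequence. It assumes the class and repository names listed above; the exact scheduler kwargs, LoRA adapter handling, and helper names in `app.py` may differ:

```python
import torch
from diffusers import FlowMatchEulerDiscreteScheduler
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3

# 1. Scheduler with exponential time shift and dynamic shifting (illustrative config)
scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    {"use_dynamic_shifting": True, "time_shift_type": "exponential"}
)

# 2. Base pipeline loaded in bfloat16
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", scheduler=scheduler, torch_dtype=torch.bfloat16
)

# 3. Load and fuse both LoRAs, then drop the adapter weights to free memory
pipe.load_lora_weights("tarn59/apply_texture_qwen_image_edit_2509", adapter_name="texture")
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", adapter_name="lightning")  # may need weight_name=... to pick a checkpoint
pipe.set_adapters(["texture", "lightning"])
pipe.fuse_lora(adapter_names=["texture", "lightning"])
pipe.unload_lora_weights()

# 4-5. Swap in the custom transformer class and the FlashAttention 3 processor
pipe.transformer.__class__ = QwenImageTransformer2DModel
pipe.transformer.set_attn_processor(QwenDoubleStreamAttnProcessorFA3())

# 6. Move to the GPU; AOT compilation is applied afterwards (see optimization.py)
pipe.to("cuda")
```
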
### Optimization System

`optimization.py` implements ahead-of-time (AOT) compilation using Spaces GPU infrastructure:

- Exports transformer with torch.export using dynamic shapes for variable sequence lengths
- Compiles with TorchInductor using aggressive optimizations (max_autotune, cudagraphs, etc.)
- Applies compiled model back to pipeline transformer
- Warmup run performed during initialization with 1024x1024 dummy images
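
The general shape of that AOT path, assuming the `spaces.aoti_capture` / `spaces.aoti_compile` / `spaces.aoti_apply` helpers behave as described in the Spaces ZeroGPU documentation (dynamic-shape setup and inductor flags omitted for brevity):

```python
import torch
import spaces

def compile_transformer(pipe, example_inputs: dict):
    # Run the pipeline once and capture the actual transformer call arguments
    with spaces.aoti_capture(pipe.transformer) as call:
        pipe(**example_inputs)

    # Export the transformer; dynamic dims would let the sequence length vary at runtime
    exported = torch.export.export(pipe.transformer, args=call.args, kwargs=call.kwargs)

    # Compile with TorchInductor and swap the compiled artifact back into the pipeline
    compiled = spaces.aoti_compile(exported)
    spaces.aoti_apply(compiled, pipe.transformer)
```
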
## Running the Application

### Local Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python app.py
```

### Testing Inference

The main function is `apply_texture()` in `app.py:82`. Key parameters:
- `content_image`: PIL Image or file path - the base image
- `texture_image`: PIL Image or file path - the texture to apply
- `prompt`: Text description (e.g., "Apply wood texture to mug")
- `num_inference_steps`: Default 4 (optimized for Lightning LoRA)
- `true_guidance_scale`: Default 1.0
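
A hypothetical direct call, assuming `apply_texture()` is importable from `app.py` and that the parameters above are accepted as keyword arguments (the real signature also takes seed-related and progress arguments):

```python
from PIL import Image
from app import apply_texture  # importing app triggers model download/compilation

content = Image.open("coffee_mug.png")
texture = Image.open("wood_boxes.png")

# 4 steps matches the fused Lightning LoRA; guidance stays at 1.0
result = apply_texture(
    content_image=content,
    texture_image=texture,
    prompt="Apply wood texture to mug",
    num_inference_steps=4,
    true_guidance_scale=1.0,
)
```
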
### Image Dimension Handling

Output dimensions are calculated from the content image (`calculate_dimensions()` at `app.py:59`):
- Largest side is scaled to 1024px
- Aspect ratio preserved
- Both dimensions rounded to multiples of 8 (required by model)
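
A sketch of that sizing logic; the actual `calculate_dimensions()` in `app.py` may differ in details such as rounding direction:

```python
def calculate_dimensions(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    # Scale so the longer side becomes `target` while preserving aspect ratio
    scale = target / max(width, height)
    new_w, new_h = int(width * scale), int(height * scale)
    # Snap both sides to multiples of 8, as the model requires
    return (new_w // 8) * 8, (new_h // 8) * 8
```
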
## Important Technical Details

### Model Device and Dtype

- Uses `torch.bfloat16` for memory efficiency and H100 compatibility
- Automatically selects CUDA if available, falls back to CPU
- Pipeline optimization assumes GPU availability (uses `@spaces.GPU` decorator)
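
In code this amounts to roughly the following (a sketch; `app.py` may structure it differently):

```python
import torch

# bfloat16 keeps memory usage low and is natively supported on H100
dtype = torch.bfloat16
device = "cuda" if torch.cuda.is_available() else "cpu"
```
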
### Spaces Integration

This app is designed for Hugging Face Spaces with ZeroGPU:
- `@spaces.GPU` decorator on inference function allocates GPU on-demand
- Optimization uses `spaces.aoti_capture()`, `spaces.aoti_compile()`, and `spaces.aoti_apply()`
- Compilation happens once at startup with 1500s duration allowance
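
How the on-demand allocation typically looks; the decorator is the standard Spaces API, but the `duration` value and the pipeline call signature below are illustrative:

```python
import spaces

@spaces.GPU(duration=120)  # GPU attached only while this function runs (illustrative duration)
def infer(content_image, texture_image, prompt, num_inference_steps=4):
    # `pipe` is the globally initialized pipeline from app.py; the dual-image call
    # signature is defined in qwenimage/pipeline_qwenimage_edit_plus.py
    return pipe(
        image=[content_image, texture_image],
        prompt=prompt,
        num_inference_steps=num_inference_steps,
    ).images[0]
```
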
### Custom Module Dependencies

The `qwenimage/` module contains modified Diffusers components:
- Not installed via pip, part of the repository
- Must be kept in sync if updating base Diffusers version
- Implements dual-image input handling for texture application
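
Because the module lives in the repository rather than on PyPI, it is imported directly, assuming each class sits in the correspondingly named file:

```python
# Local modules resolved relative to the repository root (not pip-installed)
from qwenimage.pipeline_qwenimage_edit_plus import QwenImageEditPlusPipeline
from qwenimage.transformer_qwenimage import QwenImageTransformer2DModel
from qwenimage.qwen_fa3_processor import QwenDoubleStreamAttnProcessorFA3
```
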
## Common Development Commands

```bash
# Test the app locally (will download ~10GB of models on first run)
python app.py

# Check dependencies
pip list | grep -E "diffusers|transformers|torch"

# View GPU memory usage during inference (if running on GPU)
nvidia-smi
```

## Key Files

- `app.py` - Main Gradio interface and inference logic
- `optimization.py` - AOT compilation and quantization utilities
- `qwenimage/pipeline_qwenimage_edit_plus.py` - Custom dual-image pipeline
- `qwenimage/transformer_qwenimage.py` - Modified transformer model
- `qwenimage/qwen_fa3_processor.py` - FlashAttention 3 attention processor
- `requirements.txt` - Includes diffusers from GitHub main branch

app.py
CHANGED
@@ -90,11 +90,11 @@ def apply_texture(
     progress=gr.Progress(track_tqdm=True)
 ):
     if content_image is None:
-        raise gr.Error("
+        raise gr.Error("콘텐츠 이미지를 업로드해주세요.")
     if texture_image is None:
-        raise gr.Error("
+        raise gr.Error("텍스처 이미지를 업로드해주세요.")
     if not prompt or not prompt.strip():
-        raise gr.Error("
+        raise gr.Error("설명을 입력해주세요.")
 
     if randomize_seed:
         seed = random.randint(0, MAX_SEED)

@@ -130,46 +130,46 @@ css = '''
 
 with gr.Blocks(theme=gr.themes.Citrus(), css=css) as demo:
     with gr.Column(elem_id="col-container"):
-        gr.Markdown("#
+        gr.Markdown("# 텍스처 적용 - Qwen Image Edit")
         gr.Markdown("""
-
-
+        [tarn59의 Apply-Texture-Qwen-Image-Edit-2509 LoRA](https://huggingface.co/tarn59/apply_texture_qwen_image_edit_2509)와
+        [lightx2v/Qwen-Image-Lightning](https://huggingface.co/lightx2v/Qwen-Image-Lightning)을 사용한 4단계 추론 🎨
         """)
 
     with gr.Row():
         with gr.Column():
             with gr.Row():
-                content_image = gr.Image(label="
-                texture_image = gr.Image(label="
+                content_image = gr.Image(label="콘텐츠", type="pil")
+                texture_image = gr.Image(label="텍스처", type="pil")
-
+
             prompt = gr.Textbox(
-                label="
-                info="
-                placeholder="
+                label="설명",
+                info="...에 ... 텍스처 적용",
+                placeholder="건물 벽에 나무 사이딩 텍스처 적용"
             )
+
+            button = gr.Button("✨ 생성", variant="primary")
 
-
-
-
-            seed = gr.Slider(label="Seed", minimum=0, maximum=MAX_SEED, step=1, value=0)
-            randomize_seed = gr.Checkbox(label="Randomize Seed", value=True)
+            with gr.Accordion("⚙️ 고급 설정", open=False):
+                seed = gr.Slider(label="시드", minimum=0, maximum=MAX_SEED, step=1, value=0)
+                randomize_seed = gr.Checkbox(label="시드 무작위화", value=True)
             true_guidance_scale = gr.Slider(
-                label="
-                minimum=1.0,
-                maximum=10.0,
-                step=0.1,
+                label="실제 가이던스 스케일",
+                minimum=1.0,
+                maximum=10.0,
+                step=0.1,
                 value=1.0
             )
             num_inference_steps = gr.Slider(
-                label="
-                minimum=1,
-                maximum=40,
-                step=1,
+                label="추론 단계",
+                minimum=1,
+                maximum=40,
+                step=1,
                 value=4
             )
 
         with gr.Column():
-            output = gr.Image(label="
+            output = gr.Image(label="출력", interactive=False)
 
     # Event handlers
     button.click(

@@ -189,8 +189,8 @@ with gr.Blocks(theme=gr.themes.Citrus(), css=css) as demo:
     # Examples
     gr.Examples(
         examples=[
-            ["coffee_mug.png", "wood_boxes.png", "
-            ["720park.jpg", "black-and-white.jpg", "
+            ["coffee_mug.png", "wood_boxes.png", "머그에 나무 텍스처 적용"],
+            ["720park.jpg", "black-and-white.jpg", "건물에 흑백 물결 텍스처 적용"],
         ],
         inputs=[
            content_image,