Cene655 committed
Commit 64edbe4 · verified · Parent: 0383e7a

Update README.md

Files changed (1): README.md (+193 −3)

The previous README contained only the YAML front matter (`license: apache-2.0`); the updated version follows.
---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- lora
- qwen
- qwen-image
- qwen-image-edit
- image-editing
- inscene
- spatial-understanding
- scene-coherence
- computer-vision
- InScene
library_name: diffusers
---

# Qwen Image Edit Inscene LoRA

An open-source LoRA (Low-Rank Adaptation) for Qwen-Image-Edit by [FlyMy.AI](https://flymy.ai), specialized in in-scene image editing.

## 🌟 About FlyMy.AI

FlyMy.AI builds agentic infrastructure for generative AI: a B2B platform for building and running GenAI media agents.

**🔗 Useful Links:**
- 🌐 [Official Website](https://flymy.ai)
- 📚 [Documentation](https://docs.flymy.ai/intro)
- 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
- 🤗 [LoRA Training Repository](https://github.com/FlyMyAI/flymyai-lora-trainer)
- 🐦 [X (Twitter)](https://x.com/flymyai)
- 💼 [LinkedIn](https://linkedin.com/company/flymyai)
- 📺 [YouTube](https://youtube.com/@flymyai)
- 📸 [Instagram](https://www.instagram.com/flymy_ai)

---
+
42
+ ## πŸš€ Features
43
+
44
+ - LoRA-based fine-tuning for efficient in-scene image editing
45
+ - Specialized for Qwen-Image-Edit model
46
+ - Enhanced control over scene composition and object positioning
47
+ - Optimized for maintaining scene coherence during edits
48
+ - Compatible with Hugging Face `diffusers`
49
+ - Control-based image editing with improved spatial understanding
50
+
51
+ ---
52
+
53
+ ## πŸ“¦ Installation
54
+
55
+ 1. Install required packages:
56
+ ```bash
57
+ pip install torch torchvision diffusers transformers accelerate
58
+ ```
59
+
60
+ 2. Install the latest `diffusers` from GitHub:
61
+ ```bash
62
+ pip install git+https://github.com/huggingface/diffusers
63
+ ```
64
+
65
+ ---

## 🧪 Usage

### 🔧 Qwen-Image-Edit Initialization

```python
import torch
from PIL import Image
from diffusers import QwenImageEditPipeline

# Load the pipeline in bfloat16 and move it to the GPU
pipeline = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
)
pipeline.to("cuda")
```

### 🔌 Load LoRA Weights

```python
# Load the trained LoRA weights for in-scene editing
pipeline.load_lora_weights(
    "flymy-ai/qwen-image-edit-inscene-lora",
    weight_name="flymy_qwen_image_edit_inscene_lora.safetensors",
)
```

### 🎨 Edit Image with Qwen-Image-Edit Inscene LoRA

```python
# Load the input image
image = Image.open("./assets/qie2_input.jpg").convert("RGB")

# Define the in-scene editing prompt
prompt = (
    "Make a shot in the same scene of the left hand securing the edge of the "
    "cutting board while the right hand tilts it, causing the chopped tomatoes "
    "to slide off into the pan, camera angle shifts slightly to the left to "
    "center more on the pan."
)

# Generate the edited image with enhanced scene understanding
inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
}

with torch.inference_mode():
    output = pipeline(**inputs)
    output_image = output.images[0]
    output_image.save("edited_image.png")
```
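
`true_cfg_scale` sets the classifier-free guidance strength. As background, standard CFG blends the model's unconditional and prompt-conditioned predictions; the numeric sketch below illustrates that combination rule only — it is not the pipeline's internal code, and exactly how `true_cfg_scale` is applied inside Qwen-Image-Edit is an assumption here:

```python
def cfg_combine(uncond, cond, guidance_scale):
    # Classifier-free guidance: move the prediction away from the
    # unconditional output, in the direction the prompt pulls it.
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

# Toy 2-element "predictions": a scale of 1.0 would return `cond` unchanged;
# larger values exaggerate the prompt's influence.
uncond = [0.0, 1.0]
cond = [1.0, 1.0]
print(cfg_combine(uncond, cond, 4.0))  # [4.0, 1.0]
```

Higher scales follow the prompt more literally at some cost in naturalness, which is why moderate values such as 4.0 make a reasonable starting point.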

### 🖼️ Sample Output - Qwen-Image-Edit Inscene

**Input Image:**

![Input Image](./assets/qie2_input.jpg)

**Prompt:**
"Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."

**Output without LoRA:**

![Output without LoRA](./assets/qie2_orig.jpg)

**Output with Inscene LoRA:**

![Output with LoRA](./assets/qie2_lora.jpg)

---

### Workflow Features

- ✅ Pre-configured for Qwen-Image-Edit + Inscene LoRA inference
- ✅ Optimized settings for in-scene editing quality
- ✅ Enhanced spatial understanding and scene coherence
- ✅ Easy prompt and parameter adjustment
- ✅ Compatible with various input image types

---

## 🎯 What is Inscene LoRA?

This LoRA model is specifically trained to enhance Qwen-Image-Edit's ability to perform **in-scene image editing**. It focuses on:

- **Scene Coherence**: Maintaining logical spatial relationships within the scene
- **Object Positioning**: Better understanding of object placement and movement
- **Camera Perspective**: Improved handling of viewpoint changes and camera movements
- **Action Sequences**: Enhanced ability to depict sequential actions within the same scene
- **Contextual Editing**: Preserving scene context while making targeted modifications

---

## 🔧 Training Information

This LoRA model was trained using the [FlyMy.AI LoRA Trainer](https://github.com/FlyMyAI/flymyai-lora-trainer) with:

- **Base Model**: Qwen/Qwen-Image-Edit
- **Training Focus**: In-scene image editing and spatial understanding
- **Dataset**: Curated collection of scene-based editing examples (InScene dataset)
- **Optimization**: Low-rank adaptation for efficient fine-tuning

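The "low-rank" part of LoRA can be made concrete: rather than updating a full weight matrix `W`, training learns two thin matrices `A` (rank × in) and `B` (out × rank) whose product, scaled by `alpha / rank`, forms the weight update. A dependency-free sketch of the idea, with purely illustrative dimensions:

```python
def matmul(X, Y):
    # Naive matrix product, enough for this toy illustration
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

out_dim, in_dim, rank = 4, 4, 1  # rank << min(out, in) keeps the update cheap
alpha = 2.0                      # LoRA scaling; the update is scaled by alpha / rank

W = [[1.0 if i == j else 0.0 for j in range(in_dim)] for i in range(out_dim)]  # frozen base weight
B = [[1.0] for _ in range(out_dim)]  # out x rank, trained
A = [[0.5] * in_dim]                 # rank x in, trained

# Effective weight seen at inference: W + (alpha / rank) * (B @ A)
delta = matmul(B, A)
effective_W = [
    [w + (alpha / rank) * d for w, d in zip(w_row, d_row)]
    for w_row, d_row in zip(W, delta)
]

# Only (out + in) * rank parameters are trained instead of out * in
trainable = (out_dim + in_dim) * rank
full = out_dim * in_dim
print(trainable, full)  # 8 vs 16 here; the gap grows quadratically with dimension
```

This is why the shipped `.safetensors` file contains only the small `A`/`B` pairs: the frozen base model supplies `W`, and `load_lora_weights` merges the low-rank update in at load time.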

---

## 📊 Model Specifications

- **Model Type**: LoRA (Low-Rank Adaptation)
- **Base Model**: Qwen/Qwen-Image-Edit
- **File Format**: SafeTensors (.safetensors)
- **Specialization**: In-scene image editing
- **Training Framework**: Diffusers + Accelerate
- **Memory Efficiency**: Optimized for consumer GPUs

---

## 🤝 Support

If you have questions or suggestions, join our community:

- 🌐 [FlyMy.AI](https://flymy.ai)
- 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
- 🐦 [Follow us on X](https://x.com/flymyai)
- 💼 [Connect on LinkedIn](https://linkedin.com/company/flymyai)
- 📧 [Support](mailto:[email protected])

**⭐ Don't forget to star the repository if you like it!**

---

## 📄 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.