Xsong123 committed on
Commit 9b8f636 · verified · 1 Parent(s): 816d522

Update README.md

Files changed (1)
  1. README.md +110 -39
README.md CHANGED
@@ -1,3 +1,19 @@
1
  <div align="center">
2
  <h1>🎨 LucidFlux:<br/>Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer</h1>
3
 
@@ -27,8 +43,71 @@ Let us know if this works!
27
  ---
28
 
29
  ## 🌟 What is LucidFlux?
30
  LucidFlux is a framework designed to perform high-fidelity image restoration across a wide range of degradations without requiring textual captions. By combining a Flux-based DiT backbone with Light-weight Condition Module and SigLIP semantic alignment, LucidFlux enables caption-free guidance while preserving structural and semantic consistency, achieving superior restoration quality.
31
 
32
  ## 📊 Performance Benchmarks
33
 
34
  <div align="center">
@@ -195,6 +274,10 @@ LucidFlux is a framework designed to perform high-fidelity image restoration acr
195
  </tbody>
196
  </table>
197
 
198
  </div>
199
 
200
  ---
@@ -247,24 +330,25 @@ LucidFlux is a framework designed to perform high-fidelity image restoration acr
247
  <td width="200"><b>LQ</b></td>
248
  <td width="200"><b>HYPIR</b></td>
249
  <td width="200"><b>Topaz</b></td>
 
250
  <td width="200"><b>Gemini-NanoBanana</b></td>
251
  <td width="200"><b>GPT-4o</b></td>
252
  <td width="200"><b>Ours</b></td>
253
  </tr>
254
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_061.jpg" width="1200"></td></tr>
255
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_094.jpg" width="1200"></td></tr>
256
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_205.jpg" width="1200"></td></tr>
257
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_209.jpg" width="1200"></td></tr>
258
  </table>
259
 
260
  <details>
261
  <summary>Show more examples</summary>
262
 
263
  <table>
264
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_062.jpg" width="1200"></td></tr>
265
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_160.jpg" width="1200"></td></tr>
266
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_111.jpg" width="1200"></td></tr>
267
- <tr align="center"><td colspan="6"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_123.jpg" width="1200"></td></tr>
268
  </table>
269
 
270
  </details>
@@ -310,46 +394,33 @@ pip install -r requirements.txt
310
  ```
311
 
312
  ### Inference
313
- - **Flux.1 dev** → [🤗 FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
314
- Then update the model path in the `configs` for `flux-dev` in `src/flux/util.py` to your local FLUX.1-dev model path.
315
 
316
- - **T5** → [🤗 T5](https://huggingface.co/XLabs-AI/xflux_text_encoders)
317
- Then update the T5 path in the function `load_t5` in `src/flux/util.py` to your local T5 path.
 
318
 
319
- - **CLIP** → [🤗 CLIP](https://huggingface.co/openai/clip-vit-large-patch14)
320
- Then update the CLIP path in the function `load_clip` in `src/flux/util.py` to your local CLIP path.
 
321
 
322
- - **SigLIP** → [🤗 siglip2-so400m-patch16-512](https://huggingface.co/google/siglip2-so400m-patch16-512)
323
- Then set `siglip_ckpt` to the corresponding local path.
324
 
325
- - **SwinIR** → [🤗 SwinIR](https://huggingface.co/lxq007/DiffBIR/blob/main/general_swinir_v1.ckpt)
326
- Then set `swin_ir_ckpt` to the corresponding local path.
 
327
 
328
- - **LucidFlux** → [🤗 LucidFlux](https://huggingface.co/W2GenAI/LucidFlux)
329
- Then set `checkpoint` to the corresponding local path.
 
330
 
331
- ```bash
332
- inference.sh
333
 
334
- result_dir=ouput_images_folder
335
- input_folder=input_images_folder
336
- checkpoint_path=path/to/lucidflux.pth
337
- swin_ir_ckpt=path/to/swinir.ckpt
338
- siglip_ckpt=path/to/siglip.ckpt
339
 
340
- mkdir -p ${result_dir}
341
- echo "Processing checkpoint..."
342
- python inference.py \
343
- --checkpoint ${checkpoint_path} \
344
- --swinir_pretrained ${swin_ir_ckpt} \
345
- --control_image ${input_folder} \
346
- --siglip_ckpt ${siglip_ckpt} \
347
- --prompt "restore this image into high-quality, clean, high-resolution result" \
348
- --output_dir ${result_dir}/ \
349
- --width 1024 --height 1024 --num_steps 50 \
350
  ```
351
 
352
- Finially ```bash inference.sh```. You can also obtain the results of LucidFlux on RealSR and RealLQ250 from Hugging Face: [**LucidFlux**](https://huggingface.co/W2GenAI/LucidFlux).
353
 
354
  ## 🪪 License
355
 
@@ -383,4 +454,4 @@ For any questions or inquiries, please reach out to us:
383
  </details>
384
 
385
 
386
- </div>
 
1
+ ---
2
+ language:
3
+ - en
4
+ - zh
5
+ library_name: transformers
6
+ tags:
7
+ - image-restoration
8
+ - diffusion
9
+ - computer-vision
10
+ - flux
11
+ - pytorch
12
+ license: other
13
+ license_name: flux-1-dev
14
+ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
15
+ ---
16
+
17
  <div align="center">
18
  <h1>🎨 LucidFlux:<br/>Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer</h1>
19
 
 
43
  ---
44
 
45
  ## 🌟 What is LucidFlux?
46
+
47
+ <!-- <div align="center">
48
+ <img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/demo/demo2.png" alt="What is LucidFlux - Quick Prompt Demo" width="1200"/>
49
+ <br>
50
+ </div> -->
51
+
52
  LucidFlux is a framework designed to perform high-fidelity image restoration across a wide range of degradations without requiring textual captions. By combining a Flux-based DiT backbone with Light-weight Condition Module and SigLIP semantic alignment, LucidFlux enables caption-free guidance while preserving structural and semantic consistency, achieving superior restoration quality.
53
 
54
+ <!-- ## 🚀 Quick Start
55
+
56
+ ### 🔧 Installation
57
+
58
+ ```bash
59
+ # Clone the repository
60
+ git clone https://github.com/ephemeral182/LucidFlux.git
61
+ cd LucidFlux
62
+
63
+ # Create conda environment
64
+ conda create -n postercraft python=3.11
65
+ conda activate postercraft
66
+
67
+ # Install dependencies
68
+ pip install -r requirements.txt
69
+
70
+ ``` -->
71
+
72
+ <!-- ### 🚀 Quick Generation
73
+
74
+ Generate high-quality aesthetic posters from your prompt with `BF16` precision:
75
+
76
+ ```bash
77
+ python inference.py \
78
+ --prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
79
+ --enable_recap \
80
+ --num_inference_steps 28 \
81
+ --guidance_scale 3.5 \
82
+ --seed 42 \
83
+ --pipeline_path "black-forest-labs/FLUX.1-dev" \
84
+ --custom_transformer_path "LucidFlux/LucidFlux-v1_RL" \
85
+ --qwen_model_path "Qwen/Qwen3-8B"
86
+ ```
87
+
88
+ If you are running on a GPU with limited memory, you can use `inference_offload.py` to offload some components to the CPU:
89
+
90
+ ```bash
91
+ python inference_offload.py \
92
+ --prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
93
+ --enable_recap \
94
+ --num_inference_steps 28 \
95
+ --guidance_scale 3.5 \
96
+ --seed 42 \
97
+ --pipeline_path "black-forest-labs/FLUX.1-dev" \
98
+ --custom_transformer_path "LucidFlux/LucidFlux-v1_RL" \
99
+ --qwen_model_path "Qwen/Qwen3-8B"
100
+ ``` -->
101
+ <!--
102
+ ### 💻 Gradio Web UI
103
+
104
+ We provide a Gradio web UI for LucidFlux.
105
+
106
+ ```bash
107
+ python demo_gradio.py
108
+ ``` -->
109
+
110
+
111
  ## 📊 Performance Benchmarks
112
 
113
  <div align="center">
 
274
  </tbody>
275
  </table>
276
 
277
+
278
+
279
+ <!-- <img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/user_study/hpc.png" alt="User Study Results" width="1200"/> -->
280
+
281
  </div>
282
 
283
  ---
 
330
  <td width="200"><b>LQ</b></td>
331
  <td width="200"><b>HYPIR</b></td>
332
  <td width="200"><b>Topaz</b></td>
333
+ <td width="200"><b>SeeDream 4.0</b></td>
334
  <td width="200"><b>Gemini-NanoBanana</b></td>
335
  <td width="200"><b>GPT-4o</b></td>
336
  <td width="200"><b>Ours</b></td>
337
  </tr>
338
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_061.jpg" width="1400"></td></tr>
339
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_094.jpg" width="1400"></td></tr>
340
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_205.jpg" width="1400"></td></tr>
341
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_209.jpg" width="1400"></td></tr>
342
  </table>
343
 
344
  <details>
345
  <summary>Show more examples</summary>
346
 
347
  <table>
348
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_062.jpg" width="1400"></td></tr>
349
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_160.jpg" width="1400"></td></tr>
350
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_111.jpg" width="1400"></td></tr>
351
+ <tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_123.jpg" width="1400"></td></tr>
352
  </table>
353
 
354
  </details>
 
394
  ```
395
 
396
  ### Inference
 
 
397
 
398
+ Prepare models in 2 steps, then run a single command.
399
+
400
+ 1) Log in to Hugging Face (required for the gated FLUX.1-dev model). Skip this step if you are already logged in.
401
 
402
+ ```bash
403
+ python -m tools.hf_login --token "$HF_TOKEN"
404
+ ```
405
 
406
+ 2) Download required weights to fixed paths and export env vars
 
407
 
408
+ ```bash
409
+ # FLUX.1-dev (flow+ae), SwinIR prior, T5, CLIP, SigLIP and LucidFlux checkpoint to ./weights
410
+ python -m tools.download_weights --dest weights
411
 
412
+ # Exports FLUX_DEV_FLOW/FLUX_DEV_AE to your shell
413
+ source weights/env.sh
414
+ ```
415
 
 
 
416
 
417
+ Run inference (uses fixed relative paths):
 
 
 
 
418
 
419
+ ```bash
420
+ bash inference.sh
 
 
 
 
 
 
 
 
421
  ```
422
 
423
+ You can also obtain results of LucidFlux on RealSR and RealLQ250 from Hugging Face: [**LucidFlux**](https://huggingface.co/W2GenAI/LucidFlux).
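The new inference flow depends on `source weights/env.sh` having exported `FLUX_DEV_FLOW` and `FLUX_DEV_AE` before `bash inference.sh` runs. The following is a minimal pre-flight sketch, not part of the repository: the function name `check_flux_env` is hypothetical, and only the two variable names are taken from the README text above. It reports any variable that is unset or empty and returns a non-zero status in that case.

```shell
# Hypothetical helper (an assumption, not shipped with LucidFlux):
# verify that the variables exported by weights/env.sh are present.
check_flux_env() {
  missing=0
  for name in FLUX_DEV_FLOW FLUX_DEV_AE; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set; run 'source weights/env.sh' first" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (after defining the function in your shell):
#   check_flux_env && bash inference.sh
```

Guarding the call this way lets a missing export fail immediately with a clear message, rather than partway through model loading.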
424
 
425
  ## 🪪 License
426
 
 
454
  </details>
455
 
456
 
457
+ </div>