---
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
base_model: black-forest-labs/FLUX.1-Kontext-dev
instance_prompt: >-
  [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
language:
- en
pipeline_tag: image-to-image
library_name: diffusers
---

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gkn6DvNaQn14GgbhHgq5v.png)

# **Kontext-Top-Down-View**

Kontext-Top-Down-View is an experimental adapter for black-forest-labs' FLUX.1-Kontext-dev, designed to transform scenes into a top-down perspective while maintaining visual proportions, consistent lighting, and realistic spatial relationships. It aims to keep backgrounds, textures, and environmental details natural and contextually coherent from the elevated angle. It was trained on 800 image pairs (400 start images and 400 end images).

> [!note]
> [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.

> [!note]
> You can modify the prompt to alter its properties and subjective elements. This is an experimental adapter and may produce artifacts.

---

## **Sample Inferences: Demo**

---

## Parameter Settings

| Setting                  | Value                    |
| ------------------------ | ------------------------ |
| Module Type              | Adapter                  |
| Base Model               | FLUX.1 Kontext Dev - fp8 |
| Trigger Words            | [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle. |
| Image Processing Repeats | 50                       |
| Epochs                   | 25                       |
| Save Every N Epochs      | 1                        |

Labeling: DeepCaption-VLA-7B (natural language & English)

Total Images Used for Training: 800 Image Pairs (400 Start, 400 End)

## Training Parameters

| Setting                     | Value     |
| --------------------------- | --------- |
| Seed                        | -         |
| Clip Skip                   | -         |
| Text Encoder LR             | 0.00001   |
| UNet LR                     | 0.00005   |
| LR Scheduler                | constant  |
| Optimizer                   | AdamW8bit |
| Network Dimension           | 64        |
| Network Alpha               | 32        |
| Gradient Accumulation Steps | -         |

## Label Parameters

| Setting         | Value |
| --------------- | ----- |
| Shuffle Caption | -     |
| Keep N Tokens   | -     |

## Advanced Parameters

| Setting                   | Value                |
| ------------------------- | -------------------- |
| Noise Offset              | 0.03                 |
| Multires Noise Discount   | 0.1                  |
| Multires Noise Iterations | 10                   |
| Conv Dimension            | -                    |
| Conv Alpha                | -                    |
| Batch Size                | -                    |
| Steps                     | 3800 & 400 (warm up) |
| Sampler                   | euler                |

---

## Trigger words

You should use `[photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.` to trigger the image generation.
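A minimal sketch of how the trigger prompt might be assembled and passed to `diffusers`, assuming the adapter loads through the standard `load_lora_weights` path. The helper names `build_prompt` and `run_top_down`, the output file name, and the `guidance_scale` value are illustrative assumptions, not settings taken from this card. Replacing the `[photo content]` placeholder with a scene description is likewise an assumption; by default the helper keeps the placeholder literally.

```python
# Trigger text copied from this card; only the leading placeholder varies.
TRIGGER_SUFFIX = (
    "recreate the scene from a top-down perspective. "
    "Maintain all visual proportions, lighting consistency, and realistic "
    "spatial relationships. Ensure the background, textures, and "
    "environmental shadows remain naturally aligned from this elevated angle."
)


def build_prompt(photo_content: str = "[photo content]") -> str:
    """Return the full trigger prompt.

    By default the literal "[photo content]" placeholder is kept. Passing a
    short scene description instead is an assumption about intended usage.
    """
    return f"{photo_content.strip()}, {TRIGGER_SUFFIX}"


def run_top_down(image_path: str, output_path: str = "top_down.png") -> None:
    """Hypothetical helper: edit one image with the base model plus adapter."""
    # Heavy imports stay local so build_prompt works without a GPU stack.
    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    pipe = FluxKontextPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
    )
    # Assumption: the repository exposes a single LoRA weight file that
    # diffusers can resolve on its own; otherwise pass weight_name=... here.
    pipe.load_lora_weights("prithivMLmods/Kontext-Top-Down-View")
    pipe.to("cuda")

    result = pipe(
        image=load_image(image_path),
        prompt=build_prompt(),
        guidance_scale=2.5,  # illustrative value, not specified by this card
    ).images[0]
    result.save(output_path)
```

Calling `run_top_down("input.png")` would download the base model on first use and needs a CUDA-capable GPU; `build_prompt()` alone has no such requirements.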
## Download model

[Download](/prithivMLmods/Kontext-Top-Down-View/tree/main) the model weights in the Files & versions tab.
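As an alternative to the web interface, the files could be fetched with `huggingface_hub`. This is a sketch under the assumption that the adapter weights are stored as `.safetensors` files, which this card does not state; `pick_weight_files`, `download_weights`, and the local directory name are hypothetical.

```python
REPO_ID = "prithivMLmods/Kontext-Top-Down-View"


def pick_weight_files(filenames):
    """Keep only names that look like adapter weights.

    Assumption: weights use the .safetensors extension.
    """
    return sorted(name for name in filenames if name.endswith(".safetensors"))


def download_weights(local_dir="kontext-top-down-view"):
    """Hypothetical helper: download every weight file found in the repo."""
    # Imported lazily so the filter above can be used without the library.
    from huggingface_hub import hf_hub_download, list_repo_files

    paths = []
    for name in pick_weight_files(list_repo_files(REPO_ID)):
        paths.append(
            hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir)
        )
    return paths
```

`download_weights()` returns the local paths of the downloaded files, or an empty list if the repository contains no file matching the assumed extension.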