lironui
/

AeroReformer

Model card Files Files and versions

lironui commited on Feb 22

Commit

d979df1

·

verified ·

1 Parent(s): 52b7130

Update README.md

Files changed (1) hide show

README.md +19 -13

README.md CHANGED Viewed

@@ -1,23 +1,29 @@
 ---
 license: mit
 ---
-AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation
-🚀 AeroReformer is a novel framework for UAV-based referring image segmentation (RIS), designed to address the unique challenges of aerial imagery, such as complex spatial scales, occlusions, and varying object orientations.
-Our method integrates multi-head vision-language fusion (MHVLFM) and multi-scale rotation-aware fusion (MSRAFM) to achieve superior segmentation performance compared to existing RIS approaches.
-The datasets and code will be made publicly available at our GitHub repository.
-📝 Paper Status
-Our research paper detailing AeroReformer is currently in preparation and will be released soon. Stay tuned for updates!
-📌 Model Overview
-AeroReformer is a transformer-based vision-language model designed for referring segmentation in UAV imagery. It automatically localizes and segments objects based on natural language descriptions, overcoming the limitations of existing RIS models in aerial datasets.
-🔹 Key Features
-Automatic Annotation Pipeline: Utilizes open-source UAV segmentation datasets and large language models (LLMs) to generate textual descriptions.
-Multi-Head Vision-Language Fusion (MHVLFM): Enhances cross-modal understanding for precise segmentation.
-Multi-Scale Rotation-Aware Fusion (MSRAFM): Improves robustness to aerial scene variations.
-State-of-the-Art Performance: Sets a new benchmark in UAV-based referring segmentation on multiple datasets.

 ---
 license: mit
 ---
+# **AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation**
+🚀 **AeroReformer** is a novel framework for **UAV-based referring image segmentation (RIS)**, designed to address the unique challenges of aerial imagery, such as complex spatial scales, occlusions, and varying object orientations.
+Our method integrates **multi-head vision-language fusion (MHVLFM)** and **multi-scale rotation-aware fusion (MSRAFM)** to achieve superior segmentation performance compared to existing RIS approaches.
+The datasets and code will be made publicly available at our **[GitHub repository](https://github.com/lironui/AeroReformer)**.
+---
+## **📝 Paper Status**
+Our research paper detailing **AeroReformer** is currently in preparation and will be released soon. Stay tuned for updates!
+---
+## **📌 Model Overview**
+**AeroReformer** is a **transformer-based vision-language model** designed for **referring segmentation in UAV imagery**. It automatically **localizes and segments objects** based on **natural language descriptions**, overcoming the limitations of existing RIS models in aerial datasets.
+### **🔹 Key Features**
+✅ **Automatic Annotation Pipeline**: Utilizes open-source UAV segmentation datasets and large language models (LLMs) to generate textual descriptions.
+✅ **Multi-Head Vision-Language Fusion (MHVLFM)**: Enhances cross-modal understanding for precise segmentation.
+✅ **Multi-Scale Rotation-Aware Fusion (MSRAFM)**: Improves robustness to aerial scene variations.
+✅ **State-of-the-Art Performance**: Sets a new benchmark in UAV-based referring segmentation on multiple datasets.
+---