Update README.md
README.md
CHANGED
@@ -1,23 +1,29 @@
 ---
 license: mit
 ---
-AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation
 
+# **AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation**
 
-AeroReformer is a novel framework for UAV-based referring image segmentation (RIS)
+**AeroReformer** is a novel framework for **UAV-based referring image segmentation (RIS)**, designed to address the unique challenges of aerial imagery, such as complex spatial scales, occlusions, and varying object orientations.
 
-Our method integrates multi-head vision-language fusion (MHVLFM) and multi-scale rotation-aware fusion (MSRAFM) to achieve superior segmentation performance compared to existing RIS approaches.
+Our method integrates **multi-head vision-language fusion (MHVLFM)** and **multi-scale rotation-aware fusion (MSRAFM)** to achieve superior segmentation performance compared to existing RIS approaches.
 
-The datasets and code will be made publicly available at our GitHub repository.
+The datasets and code will be made publicly available at our **[GitHub repository](https://github.com/lironui/AeroReformer)**.
 
-
-
+---
+
+## **Paper Status**
+Our research paper detailing **AeroReformer** is currently in preparation and will be released soon. Stay tuned for updates!
+
+---
+
+## **Model Overview**
+**AeroReformer** is a **transformer-based vision-language model** designed for **referring segmentation in UAV imagery**. It automatically **localizes and segments objects** based on **natural language descriptions**, overcoming the limitations of existing RIS models in aerial datasets.
 
-
-
+### **🔹 Key Features**
+✅ **Automatic Annotation Pipeline**: Utilizes open-source UAV segmentation datasets and large language models (LLMs) to generate textual descriptions.
+✅ **Multi-Head Vision-Language Fusion (MHVLFM)**: Enhances cross-modal understanding for precise segmentation.
+✅ **Multi-Scale Rotation-Aware Fusion (MSRAFM)**: Improves robustness to aerial scene variations.
+✅ **State-of-the-Art Performance**: Sets a new benchmark in UAV-based referring segmentation on multiple datasets.
 
-
-Automatic Annotation Pipeline: Utilizes open-source UAV segmentation datasets and large language models (LLMs) to generate textual descriptions.
-Multi-Head Vision-Language Fusion (MHVLFM): Enhances cross-modal understanding for precise segmentation.
-Multi-Scale Rotation-Aware Fusion (MSRAFM): Improves robustness to aerial scene variations.
-State-of-the-Art Performance: Sets a new benchmark in UAV-based referring segmentation on multiple datasets.
+---