G0-VLA / README.md
---
license: cc-by-nc-sa-4.0
language:
  - en
  - zh
size_categories:
  - n>1T
tags:
  - robotics
  - real-world
  - dual-arm
  - whole body control
  - manipulation
datasets:
  - OpenGalaxea/Galaxea-Open-World-Dataset
---

# 🚀 Galaxea Open-World Dataset and G0 Dual-System VLA Model


G0-VLA architecture and training pipeline: Stage 1 pre-trains a vision-language model on cross-embodiment data in an autoregressive manner. Stage 2 and post-training share the same model structure and are trained on Galaxea open-world data with embodiment-specific views and both high-level and subtask instructions, supervising the Action Transformer's action reconstruction with a flow-matching loss.

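The flow-matching objective mentioned above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the `model` stand-in, array shapes, and the constant-velocity linear interpolation path are all assumptions for exposition.

```python
import numpy as np

def flow_matching_loss(model, x0, x1, rng):
    """Illustrative flow-matching loss (assumed formulation, not the paper's recipe).

    x0: noise samples, x1: ground-truth action chunks, both shaped (batch, dim).
    The model regresses the velocity field at an interpolated point x_t.
    """
    t = rng.uniform(size=(x0.shape[0], 1))       # t ~ U(0, 1), one per sample
    xt = (1.0 - t) * x0 + t * x1                 # linear path from noise to actions
    target = x1 - x0                             # constant velocity along that path
    pred = model(xt, t)                          # stand-in for the Action Transformer
    return float(np.mean((pred - target) ** 2))  # MSE velocity regression
```

At inference time, action chunks would then be recovered by integrating the learned velocity field from noise, e.g. with a few Euler steps.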

In this repo, you can find:

  • `G0_3B_base.pt`: model weights after Stage 2 pre-training
  • `G0_3B_base_dataset_statistics`: statistics for the dataset used in pre-training

## 📜 Citation

All data and code within this repo are released under the CC BY-NC-SA 4.0 license. If you use our dataset or models, please cite:

@article{galaxea2025,
  title={Galaxea G0: Open-World Dataset and Dual-System VLA Model},
  author={Galaxea Team},
  journal={arXiv preprint arXiv:2509.00576},
  year={2025}
}