Improve model card for Reason-RFT: Add metadata, update title, news, and usage
#1 by nielsr (HF Staff), opened
This pull request aims to enhance the model card for the Reason-RFT model by adding crucial metadata, updating descriptive sections, and providing a direct usage example.
The changes include:
- **Updated Title:** Changed the main title to "Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models" to accurately reflect the model's identity as presented in the paper.
- **Paper Link:** Added an explicit link to the Hugging Face paper page, https://huggingface.co/papers/2503.20752, directly below the title.
- **Metadata Additions:**
  - `pipeline_tag: image-text-to-text`: accurately categorizes the model as a vision-language model capable of visual reasoning, taking image and text inputs to generate text.
  - `library_name: transformers`: evidence from `config.json`, `tokenizer_config.json`, and `preprocessor_config.json` confirms compatibility with the Hugging Face Transformers library (e.g., `Qwen2VLForConditionalGeneration`, `Qwen2VLProcessor`).
  - `tags: visual-reasoning`: an optional but highly relevant tag that improves discoverability for models focused on visual reasoning tasks.
- **Sample Usage:** Included a Python code snippet taken directly from the project's GitHub README to demonstrate model inference, ensuring adherence to the "do not make up code" disclaimer. The snippet uses `longvu.builder`, as found in the original repository's usage examples.
- **Updated News Section:** Synchronized the news entries with the latest updates from the GitHub repository, including the NeurIPS acceptance and the RoboBrain 2.0 release.
- **Expanded Citation Section:** Added the additional RoboBrain 2.0 citations provided in the GitHub README for a more comprehensive reference list.
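For reference, metadata fields like those described above live in the YAML front matter at the top of the model card's `README.md`. The fragment below is a minimal sketch of that shape; the exact field ordering and any additional tags in the actual PR may differ:

```yaml
---
pipeline_tag: image-text-to-text
library_name: transformers
tags:
  - visual-reasoning
---
```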
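To illustrate what `pipeline_tag: image-text-to-text` implies for callers, the sketch below builds the kind of multimodal chat payload that Transformers vision-language processors (such as `Qwen2VLProcessor`) accept, with an image entry and a text entry in one user turn. The image path and question are placeholder values, not taken from the actual model card snippet:

```python
# Sketch of an image-text-to-text request payload. The helper name,
# image path, and question are hypothetical; the real usage snippet in
# the model card comes from the project's GitHub README.
def build_messages(image_path: str, question: str) -> list[dict]:
    """Assemble a single-turn conversation with one image and one text input."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]

messages = build_messages("demo.png", "How many objects are in the scene?")
# With a loaded processor, this payload would typically be rendered via
# processor.apply_chat_template(messages, add_generation_prompt=True)
# before being passed to the model for text generation.
```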
These improvements will make the model more discoverable, provide clearer context, and offer immediate actionable usage guidance for users on the Hugging Face Hub.