Improve model card for Reason-RFT: Add metadata, update title, news, and usage

#1 opened by nielsr (HF Staff)

This pull request aims to enhance the model card for the Reason-RFT model by adding crucial metadata, updating descriptive sections, and providing a direct usage example.

The changes include:

  • Updated Title: Changed the main title to Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models to accurately reflect the model's identity as presented in the paper.
  • Paper Link: Added an explicit link to the Hugging Face paper page: https://huggingface.co/papers/2503.20752 directly below the title.
  • Metadata Additions:
    • pipeline_tag: image-text-to-text: This accurately categorizes the model as a Vision-Language Model capable of visual reasoning, taking image and text inputs to generate text.
    • library_name: transformers: Evidence from config.json, tokenizer_config.json, and preprocessor_config.json confirms compatibility with the Hugging Face Transformers library (e.g., Qwen2VLForConditionalGeneration, Qwen2VLProcessor).
    • tags: visual-reasoning: Added an optional but highly relevant tag to improve discoverability for models focused on visual reasoning tasks.
  • Sample Usage: Included a Python inference snippet taken directly from the project's GitHub README, in keeping with the "do not make up code" guideline. The snippet uses longvu.builder, as found in the original repository's usage examples.
  • Updated News Section: Synchronized the news entries with the latest updates from the GitHub repository, including NeurIPS acceptance and RoboBrain 2.0 release.
  • Expanded Citation Section: Added additional relevant citations for RoboBrain 2.0, as provided in the GitHub README, for a more comprehensive reference list.
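The metadata additions listed above correspond to YAML front-matter at the top of the model card's README. A minimal sketch of what that front-matter block would look like, using only the field names and values stated in this description (the surrounding layout is illustrative, not copied from the actual card):

```yaml
# Model card front-matter reflecting the metadata changes described above.
# Field names and values are from this PR description; ordering is illustrative.
pipeline_tag: image-text-to-text   # Vision-Language Model: image + text in, text out
library_name: transformers         # confirmed by config.json / tokenizer_config.json
tags:
  - visual-reasoning               # optional tag to aid discoverability
```

The Hub reads these keys to populate the pipeline widget, the "Use in Transformers" button, and tag-based search filters.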

These improvements will make the model more discoverable, provide clearer context, and offer immediate actionable usage guidance for users on the Hugging Face Hub.

