Improve model card: Add metadata, paper & code links, description, and performance image

#1 opened by nielsr (HF Staff)

This PR significantly improves the model card for AdaTooler-V-7B by:

  • Adding the pipeline_tag: image-text-to-text metadata, which correctly categorizes the model as a multimodal large language model that processes visual input to generate text responses.
  • Specifying library_name: transformers in the metadata. This is supported by the model's config.json on the Hugging Face Hub, which lists Qwen2VLForConditionalGeneration as an architecture, indicating compatibility with the Transformers library. It also enables the "Use in Transformers" widget on the model page.
  • Including a direct link to the official paper: AdaTooler-V: Adaptive Tool-Use for Images and Videos.
  • Providing a link to the official GitHub repository: https://github.com/CYWang735/AdaTooler-V.
  • Adding a comprehensive description of the model, summarizing its key features, methodology, and performance highlights, based on the paper abstract and the GitHub README.
  • Including an illustrative performance image (bar.png) from the GitHub repository to visually summarize the model's capabilities.
  • Adding a BibTeX citation for easy referencing.
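
The first two bullets above correspond to a YAML front-matter fragment at the top of the model card. A minimal sketch (the actual card may carry additional fields such as license or tags, which are not specified here):

```yaml
---
pipeline_tag: image-text-to-text
library_name: transformers
---
```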

Please note that a sample usage code snippet has been omitted: the provided GitHub README does not contain a direct Python inference example for the AdaTooler-V-7B model, and the task's guidelines require that such snippets be sourced directly.

