Improve model card: Add metadata, paper & code links, description, and performance image
#1
by nielsr (HF Staff)
This PR significantly improves the model card for AdaTooler-V-7B by:
- Adding the `pipeline_tag: image-text-to-text` metadata, which correctly categorizes the model as a multimodal large language model that processes visual input to generate text responses.
- Specifying `library_name: transformers` in the metadata. This is supported by the model's `config.json` on the Hugging Face Hub, which lists `Qwen2VLForConditionalGeneration` as an architecture, indicating compatibility with the 🤗 Transformers library. This also enables the "Use in Transformers" widget on the model page.
- Including a direct link to the official paper: AdaTooler-V: Adaptive Tool-Use for Images and Videos.
- Providing a link to the official GitHub repository: https://github.com/CYWang735/AdaTooler-V.
- Adding a comprehensive description of the model, summarizing its key features, methodology, and performance highlights, based on the paper abstract and the GitHub README.
- Including an illustrative performance image (`bar.png`) from the GitHub repository to visually summarize the model's capabilities.
- Adding a BibTeX citation for easy referencing.
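For reference, the two metadata additions described above live in the YAML front matter at the top of the model card's `README.md`. A minimal sketch, showing only the fields this PR adds (any other existing fields would be preserved):

```yaml
---
pipeline_tag: image-text-to-text
library_name: transformers
---
```

The `pipeline_tag` controls how the Hub categorizes and filters the model, while `library_name` tells the Hub which library's loading snippet to surface on the model page.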
Please note that a sample usage code snippet has been omitted, as the provided GitHub README does not contain a direct Python inference example for the AdaTooler-V-7B model, in adherence to the task's strict guidelines.