Cuiunbo's Collections

VLM For OCR

updated Jun 29, 2024
Upvotes: 4

  • Qwen/Qwen-VL

    Text Generation • Updated Jan 25, 2024 • 25.8k downloads • 273 likes

  • google/pix2struct-large

    Image-to-Text • 1B params • Updated Sep 6, 2023 • 416 downloads • 34 likes

  • zai-org/cogagent-chat-hf

    Text Generation • 18B params • Updated Dec 24, 2024 • 133 downloads • 69 likes

  • openbmb/MiniCPM-Llama3-V-2_5

    Image-Text-to-Text • 9B params • Updated Jan 15, 2025 • 45k downloads • 1.41k likes

  • google/paligemma-3b-pt-896

    Image-Text-to-Text • 3B params • Updated Jun 22, 2025 • 404 downloads • 123 likes

  • UCSC-VLAA/Recap-DataComp-1B

    Dataset • Updated Jan 9, 2025 • 1.88B rows • 8.65k downloads • 193 likes

  • WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

    Paper • arXiv:2406.11069 • Published Jun 16, 2024 • 14 upvotes

  • pbevan11/synthetic-ocr-correction-gpt4o

    Dataset • Updated Jul 25, 2024 • 10k rows • 118 downloads • 5 likes

  • yifeihu/ACL-23-Paper-OCR-Markdown

    Dataset • Updated Jun 8, 2024 • 2.15k rows • 43 downloads • 19 likes

  • LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

    Paper • arXiv:2406.15319 • Published Jun 21, 2024 • 64 upvotes