# Tiny Random MiniCPM-o-2_6

A tiny (~42 MB) randomly initialized version of MiniCPM-o-2_6, intended purely for testing in the optimum-intel library.

## Purpose

This model was created to replace the existing test model at optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6 (185 MB) with a smaller alternative for CI/CD testing. Smaller test models reduce:

- Download times in CI pipelines
- Storage requirements
- Test execution time

## Size Comparison

| Model | Total Size | Model Weights |
| --- | --- | --- |
| openbmb/MiniCPM-o-2_6 (original) | 17.4 GB | ~17 GB |
| optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6 (current test model) | 185 MB | 169 MB |
| hrithik-dev8/tiny-random-MiniCPM-o-2_6 (this model) | ~42 MB | 41.55 MB |

**Result:** roughly 4.4× smaller than Intel's current test model (185 MB → ~42 MB)

## Model Configuration

| Component | This Model | Original |
| --- | --- | --- |
| Vocabulary | 5,000 tokens | 151,700 tokens |
| LLM hidden size | 128 | 3,584 |
| LLM layers | 1 | 40 |
| LLM attention heads | 8 | 28 |
| Vision hidden size | 128 | 1,152 |
| Vision layers | 1 | 27 |
| Image size | 980 (preserved) | 980 |
| Patch size | 14 (preserved) | 14 |
| Audio d_model | 64 | 1,280 |
| TTS hidden size | 128 | - |

## Parameter Breakdown

| Component | Parameters | Size (MB) |
| --- | --- | --- |
| TTS/DVAE | 19,339,766 | 36.89 |
| LLM | 1,419,840 | 2.71 |
| Vision | 835,328 | 1.59 |
| Resampler | 91,392 | 0.17 |
| Audio | 56,192 | 0.11 |
| Other | 20,736 | 0.04 |
| **Total** | **21,763,254** | **~41.5** |
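The sizes follow directly from the parameter counts: each bfloat16 parameter occupies 2 bytes, and the table's megabytes are binary (MiB, 2^20 bytes). A quick sanity check:

```python
# Parameter counts per component, copied from the table above.
PARAMS = {
    "TTS/DVAE": 19_339_766,
    "LLM": 1_419_840,
    "Vision": 835_328,
    "Resampler": 91_392,
    "Audio": 56_192,
    "Other": 20_736,
}

BYTES_PER_PARAM = 2  # bfloat16 = 2 bytes per parameter

def size_mib(n_params):
    """Weight size in MiB (2**20 bytes) for bfloat16 parameters."""
    return n_params * BYTES_PER_PARAM / 2**20

for name, n in PARAMS.items():
    print(f"{name:10s} {size_mib(n):6.2f} MiB")
print(f"{'Total':10s} {size_mib(sum(PARAMS.values())):6.2f} MiB")  # ~41.51 MiB
```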

## Technical Details

### Why Keep TTS/DVAE Components?

The TTS (Text-to-Speech) component, which includes the DVAE (Discrete Variational Auto-Encoder), accounts for approximately 37 MB (~89%) of the model size. While the optimum-intel tests do not exercise TTS functionality (they only test image+text → text generation), this component is retained because:

1. **Structural consistency:** removing TTS via `init_tts=False` causes structural differences in the model that lead to numerical divergence between PyTorch and OpenVINO outputs.
2. **Test compatibility:** `test_compare_to_transformers` compares PyTorch and OpenVINO outputs and requires an exact structural match.
3. **Architecture integrity:** the MiniCPM-o architecture expects the TTS weights to be present during model loading.

### Tokenizer Shrinking

The vocabulary was reduced from 151,700 to 5,000 tokens:

- **Base tokens:** IDs 0-4899 (the first 4,900 most common tokens)
- **Special tokens:** IDs 4900-4949 (remapped from their original high IDs)
- **BPE merges:** filtered from 151,387 to 4,644 (only merges involving retained tokens)
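The merge-filtering rule can be sketched as follows. This is a toy illustration of the principle (keep a merge only when both halves and the merged result survive in the shrunk vocabulary), not the actual script used to build this model:

```python
def filter_merges(merges, vocab):
    """Keep a BPE merge rule only if both parts and the merged
    result are still present in the shrunk vocabulary."""
    kept = []
    for left, right in merges:
        if left in vocab and right in vocab and (left + right) in vocab:
            kept.append((left, right))
    return kept

# Toy example (not the real MiniCPM-o merge table):
vocab = {"h", "e", "l", "o", "he", "ll", "hell", "hello"}
merges = [("h", "e"), ("l", "l"), ("he", "ll"), ("hell", "o"), ("x", "y")]
print(filter_merges(merges, vocab))
# [('h', 'e'), ('l', 'l'), ('he', 'll'), ('hell', 'o')]
```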

Key token mappings:

| Token | ID |
| --- | --- |
| `<unk>` | 4900 |
| `<\|endoftext\|>` | 4901 |
| `<\|im_start\|>` | 4902 |
| `<\|im_end\|>` | 4903 |
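A minimal sketch of the special-token remapping, assuming the specials are simply assigned consecutive IDs starting at 4900 (only the resulting IDs below are taken from the table; everything else is illustrative):

```python
# Remap special tokens from their original high IDs into the retained
# range. The specials occupy IDs 4900-4949 in the shrunk vocabulary.
SPECIAL_TOKENS = ["<unk>", "<|endoftext|>", "<|im_start|>", "<|im_end|>"]
FIRST_SPECIAL_ID = 4900

id_map = {tok: FIRST_SPECIAL_ID + i for i, tok in enumerate(SPECIAL_TOKENS)}
print(id_map["<|im_end|>"])  # 4903
```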

## Reproducibility

Model weights are initialized with a fixed random seed (42) to ensure:

- Reproducible outputs between runs
- Consistent behavior between PyTorch and OpenVINO
- Passing of `test_compare_to_transformers`, which compares framework outputs
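The principle can be illustrated with the standard library's `random` module (a toy stand-in for the seeded torch initialization actually used to build the model):

```python
import random

def fake_init(n, seed=42):
    """Toy stand-in for seeded weight init: same seed -> same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(n)]

# Two independent "initializations" with the same seed are identical:
assert fake_init(8) == fake_init(8)
```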

## Test Results

Tested with `pytest tests/openvino/test_seq2seq.py -k "minicpmo" -v`:

| Test | Status | Notes |
| --- | --- | --- |
| test_compare_to_transformers | ✅ PASSED | PyTorch/OpenVINO outputs match |
| test_generate_utils | ✅ PASSED | Generation pipeline works |
| test_model_can_be_loaded_after_saving | ⚠️ FAILED | Windows file-locking issue (not model-related) |

The third test failure is a Windows-specific issue where OpenVINO keeps file handles open, preventing cleanup of temporary directories. This is a known platform limitation, not a model defect. The test passes on Linux/macOS.

## Usage

### For optimum-intel Testing

```python
# In optimum-intel/tests/openvino/utils_tests.py, update MODEL_NAMES:
MODEL_NAMES = {
    # ... other models ...
    "minicpmo": "hrithik-dev8/tiny-random-MiniCPM-o-2_6",
}
```

Then run the tests:

```shell
pytest tests/openvino/test_seq2seq.py -k "minicpmo" -v
```

### Basic Model Loading

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "hrithik-dev8/tiny-random-MiniCPM-o-2_6",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "hrithik-dev8/tiny-random-MiniCPM-o-2_6",
    trust_remote_code=True,
)
```

## Files Included

| File | Size | Description |
| --- | --- | --- |
| model.safetensors | 41.55 MB | Model weights (bfloat16) |
| config.json | 5.33 KB | Model configuration |
| tokenizer.json | 338.27 KB | Shrunk tokenizer (5,000 tokens) |
| tokenizer_config.json | 12.78 KB | Tokenizer settings |
| vocab.json | 85.70 KB | Vocabulary mapping |
| merges.txt | 36.58 KB | BPE merge rules |
| preprocessor_config.json | 1.07 KB | Image processor config |
| generation_config.json | 121 B | Generation settings |
| added_tokens.json | 1.13 KB | Special tokens |
| special_tokens_map.json | 1.24 KB | Special token mappings |

## Requirements

- Python 3.8+
- `transformers >= 4.45.0, < 4.52.0`
- `torch`
- For OpenVINO testing: `optimum-intel` with the OpenVINO backend

## Limitations

⚠️ **This model is for testing only.** It is randomly initialized, produces meaningless outputs, and should not be used for real inference.
