ViGPT2 AIO Mixed One Step
VLAI-AIVN/vigpt2-aio-mixed-one-step is a Vietnamese GPT-2-style causal language model trained in a single mixed pretraining run over general Vietnamese text and a poem corpus.
This checkpoint is intended for Vietnamese text generation and research experiments around mixed-domain pretraining. It is not an instruction-tuned chat model.
Model Summary
- Architecture: GPT2LMHeadModel
- Layers: 12
- Hidden size: 768
- Attention heads: 12
- Context length: 1024 tokens
- Vocabulary size: 50,257
- Parameter count: 124,439,808
- Saved weights format: safetensors
- Framework: Hugging Face Transformers
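The parameter count in the summary can be reproduced from the other values. The sketch below assumes the standard GPT-2 layout (learned position embeddings, a 4x-wide MLP, biases on every linear layer, and an output head tied to the token embedding); `gpt2_param_count` is a helper name introduced here for illustration, not part of the project.

```python
# Parameter count for a GPT-2 style decoder, derived from the summary values.
# Assumption: standard GPT-2 layout with a tied output head (no extra lm_head weights).

def gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12):
    token_emb = vocab_size * n_embd           # token embedding matrix
    pos_emb = n_positions * n_embd            # learned position embeddings
    layer_norm = 2 * n_embd                   # weight + bias
    # Attention: fused QKV projection plus output projection, each with bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back, each with bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    block = 2 * layer_norm + attn + mlp       # two layer norms per block
    # The final layer norm follows the last block.
    return token_emb + pos_emb + n_layer * block + layer_norm

total = gpt2_param_count()
print(f"{total:,}")  # 124,439,808
```

Under these assumptions the result matches the reported count exactly, which is consistent with the head sharing weights with the token embedding.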
Training Data
The model was trained on a mixed Vietnamese corpus built from:
- Deduplicated BKAI training data
- Deduplicated Vietnamese Wikipedia articles
- A Vietnamese poem stanza corpus
Text is normalized, tokenized, concatenated, and packed into fixed 1024-token blocks. An end-of-text token is appended between samples before packing.
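The packing step can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `pack_into_blocks` is a hypothetical helper, the EOS token is appended after every sample (including the last), and the trailing partial block is dropped, which is an assumption the description above does not confirm.

```python
from typing import Iterable, List

def pack_into_blocks(
    samples: Iterable[List[int]], eos_id: int, block_size: int = 1024
) -> List[List[int]]:
    """Concatenate tokenized samples with an EOS separator and cut fixed-size blocks."""
    stream: List[int] = []
    for ids in samples:
        stream.extend(ids)
        stream.append(eos_id)  # marks the boundary between samples
    # Assumption: an incomplete final block is discarded rather than padded.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]

# Toy example with a block size of 4 instead of 1024.
blocks = pack_into_blocks([[1, 2, 3], [4, 5], [6, 7, 8, 9]], eos_id=0, block_size=4)
print(blocks)  # [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```

Note that a block can span several samples, so the model sees the EOS token as the only signal that one document ended and another began.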
Training Procedure
This model was trained with the mixed pretraining recipe implemented in src/train_mixed.py.
Important detail: the training script loads the tokenizer and config from the project checkpoint at artifacts/checkpoints/sft_poem/final, but creates the model with AutoModelForCausalLM.from_config(config). In other words, this run reuses the saved architecture and tokenizer configuration but starts from randomly initialized weights instead of loading the pretrained weights from that checkpoint.
Saved training arguments from this checkpoint:
| Setting | Value |
|---|---|
| max_steps | 18920 |
| per_device_train_batch_size | 2 |
| per_device_eval_batch_size | 2 |
| gradient_accumulation_steps | 64 |
| learning_rate | 5e-4 |
| weight_decay | 0.01 |
| warmup_ratio | 0.1 |
| lr_scheduler_type | cosine |
| bf16 | true |
| fp16 | false |
| eval_steps | 2000 |
| save_steps | 2000 |
| logging_steps | 100 |
| seed | 42 |
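To make the learning-rate settings concrete, the sketch below evaluates a cosine schedule with linear warmup at a few steps. It assumes the usual shape (linear ramp from zero to the peak over the warmup steps, then a half cosine down to zero) and rounds the warmup step count; `lr_at` is a hypothetical helper, and the exact values logged by the training run may differ slightly.

```python
import math

MAX_STEPS = 18920
PEAK_LR = 5e-4
WARMUP_RATIO = 0.1

def lr_at(step, max_steps=MAX_STEPS, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (assumed warmup + cosine shape)."""
    warmup_steps = round(max_steps * warmup_ratio)  # 1892 for these settings
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from the peak to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (0, 946, 1892, 10406, 18920):
    print(step, f"{lr_at(step):.2e}")
```

With these settings the peak of 5e-4 is reached at step 1892, and the rate falls back to half the peak around step 10406, the midpoint of the decay phase.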
The launch script in this project runs mixed training with torchrun --nproc_per_node=2.
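Combining the saved arguments with the two-process launch gives a rough effective batch size and token budget. This is back-of-the-envelope arithmetic that assumes a single node, plain data parallelism across both processes, and that every training sequence is a full 1024-token block.

```python
# Values taken from the saved training arguments and the launch command.
per_device_train_batch_size = 2
gradient_accumulation_steps = 64
num_processes = 2      # torchrun --nproc_per_node=2, assuming one node
block_size = 1024      # tokens per packed sequence
max_steps = 18920

# Sequences contributing to each optimizer step.
sequences_per_step = (
    per_device_train_batch_size * gradient_accumulation_steps * num_processes
)
tokens_per_step = sequences_per_step * block_size
total_tokens = tokens_per_step * max_steps

print(sequences_per_step)      # 256
print(f"{tokens_per_step:,}")  # 262,144
print(f"{total_tokens:,}")     # 4,959,764,480
```

Under these assumptions the run processes roughly 5 billion tokens, counting repeated passes over the corpus if the data is smaller than that.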
Training Metrics
- Final saved training step: 18920
- Final eval loss: 2.5013
- Approximate perplexity: 12.20
- Final reported train loss: 2.8971
These numbers should be treated as run-level reference metrics, not as a full benchmark.
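The perplexity figure follows directly from the eval loss, assuming the loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training):

```python
import math

eval_loss = 2.5013  # final eval loss reported above

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")  # 12.20
```

Because this is a per-token quantity, it is only comparable with models that share the same tokenizer and evaluation split.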
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "VLAI-AIVN/vigpt2-aio-mixed-one-step"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package to be installed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Hà Nội là thủ đô của Việt Nam. "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation. EOS is reused as the pad token id, since GPT-2 style
# tokenizers often define no dedicated pad token.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
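Because the context length is 1024 tokens, the prompt and the generated tokens together must fit in that window. The helpers below are a hypothetical sketch (neither `prompt_budget` nor `truncate_left` is part of the project) showing how to compute the prompt budget and keep only the most recent tokens of an overlong prompt.

```python
CONTEXT_LENGTH = 1024  # from the model summary

def prompt_budget(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Maximum number of prompt tokens that leaves room for generation."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens

def truncate_left(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep only the last tokens of the prompt so generation still fits."""
    budget = prompt_budget(max_new_tokens, context_length)
    return token_ids[-budget:]

print(prompt_budget(120))                            # 904
print(len(truncate_left(list(range(2000)), 120)))    # 904
```

Truncating from the left keeps the text closest to the point where generation starts; whether that is the right choice depends on the task.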
Intended Uses
- Vietnamese language modeling experiments
- Vietnamese text generation baselines
- Research on mixed-domain pretraining
- Further fine-tuning for downstream Vietnamese generation tasks
Out-of-Scope Uses
- Safety-critical decision making
- Factual question answering without verification
- Use as a chat assistant without additional instruction tuning
- Deployment in production without task-specific evaluation and filtering
Limitations
- The model can generate incorrect, biased, repetitive, or low-quality text.
- The training mixture includes general web-like and Wikipedia-style text as well as poem data, so style may drift depending on the prompt.
- This is a base generative model, not an aligned assistant model.
- The local project snapshot used to produce this checkpoint does not declare a license. Confirm licensing before broad redistribution or commercial use.
Repository Context
This checkpoint comes from the Vietnamese GPT-2 pretraining project in this repository, which includes:
- Tokenizer training
- Deduplication and corpus preparation
- Base pretraining
- Poem-domain continued pretraining
- Mixed one-step pretraining
Citation
If you use this model, cite the repository or link back to the Hugging Face model page.