ViGPT2 AIO Mixed One Step

VLAI-AIVN/vigpt2-aio-mixed-one-step is a Vietnamese GPT-2-style causal language model trained in a single mixed pretraining run over general Vietnamese text and a poem corpus.

This checkpoint is intended for Vietnamese text generation and research experiments around mixed-domain pretraining. It is not an instruction-tuned chat model.

Model Summary

Architecture: GPT2LMHeadModel
Layers: 12
Hidden size: 768
Attention heads: 12
Context length: 1024 tokens
Vocabulary size: 50,257
Parameter count: 124,439,808
Saved weights format: safetensors
Framework: Hugging Face Transformers
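
As a sanity check, the parameter count above can be reproduced from the listed dimensions. The sketch below assumes two things this card does not state explicitly: the MLP inner width is 4x the hidden size (the GPT-2 default) and the LM head shares its weights with the token embedding. The helper name `gpt2_param_count` is made up for illustration.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size, n_positions, tie_lm_head=True):
    """Count parameters of a GPT-2 style decoder from its dimensions."""
    inner = 4 * n_embd  # assumption: GPT-2 default MLP width
    # Token embedding plus learned position embedding.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    layer_norm = 2 * n_embd  # weight and bias
    # Fused QKV projection plus output projection, both with biases.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two linear layers with biases.
    mlp = (n_embd * inner + inner) + (inner * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    total = embeddings + n_layer * block + layer_norm  # plus final layer norm
    if not tie_lm_head:
        total += vocab_size * n_embd  # separate output projection
    return total


total = gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, n_positions=1024)
print(f"{total:,}")  # 124,439,808
```

Under those assumptions the result matches the reported count, which suggests the checkpoint uses the standard GPT-2 small layout with tied embeddings.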

Training Data

The model was trained on a mixed Vietnamese corpus built from:

Deduplicated BKAI training data
Deduplicated Vietnamese Wikipedia articles
A Vietnamese poem stanza corpus

Text is normalized and tokenized, an end-of-text token is inserted between samples, and the resulting token stream is concatenated and packed into fixed 1024-token blocks.
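
The packing step can be sketched as follows. This is an illustrative reimplementation, not the project's code: the function name `pack_blocks` is hypothetical, the separator is appended after every sample (including the last), and the trailing partial block is dropped. The card does not say how the real pipeline handles either of those last two points.

```python
def pack_blocks(tokenized_samples, eos_id, block_size=1024):
    """Concatenate token-id lists with an EOS separator, then cut fixed-size blocks."""
    stream = []
    for ids in tokenized_samples:
        stream.extend(ids)
        stream.append(eos_id)  # marks the boundary between samples
    # Assumption: the incomplete tail that does not fill a block is discarded.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Toy example with a block size of 4 and EOS id 0.
blocks = pack_blocks([[1, 2, 3], [4, 5], [6]], eos_id=0, block_size=4)
print(blocks)  # [[1, 2, 3, 0], [4, 5, 0, 6]]
```

Packing this way means a single block can span several documents, so the separator token is the only signal the model gets about document boundaries.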

Training Procedure

This model was trained with the mixed pretraining recipe implemented in src/train_mixed.py.

Important detail: the training script loads the tokenizer and config from the project checkpoint at `artifacts/checkpoints/sft_poem/final`, but creates the model with `AutoModelForCausalLM.from_config(config)`. This run therefore reuses that checkpoint's architecture and tokenizer configuration but trains from freshly initialized weights; it does not load the checkpoint's pretrained weights.

Saved training arguments from this checkpoint:

| Setting | Value |
| --- | --- |
| `max_steps` | `18920` |
| `per_device_train_batch_size` | `2` |
| `per_device_eval_batch_size` | `2` |
| `gradient_accumulation_steps` | `64` |
| `learning_rate` | `5e-4` |
| `weight_decay` | `0.01` |
| `warmup_ratio` | `0.1` |
| `lr_scheduler_type` | `cosine` |
| `bf16` | `true` |
| `fp16` | `false` |
| `eval_steps` | `2000` |
| `save_steps` | `2000` |
| `logging_steps` | `100` |
| `seed` | `42` |
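
To make the `warmup_ratio` and `cosine` settings concrete, here is a minimal sketch of the learning-rate curve they imply. It assumes the common formulation of linear warmup from zero followed by a half-cosine decay to zero. The helper name `lr_at` is hypothetical, and the exact warmup step count used by the trainer may differ by one step because of rounding.

```python
import math

max_steps = 18920
peak_lr = 5e-4
warmup_ratio = 0.1
warmup_steps = round(max_steps * warmup_ratio)  # 1892 under this rounding


def lr_at(step):
    """Learning rate at a given optimizer step (assumed warmup + cosine shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Half-cosine decay from the peak down to 0 at max_steps.
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps // 2, warmup_steps, max_steps // 2, max_steps):
    print(step, f"{lr_at(step):.2e}")
```

The peak of 5e-4 is reached once, at the end of warmup, and the rate falls smoothly for the remaining roughly 90% of training.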

The launch script in this project runs mixed training with torchrun --nproc_per_node=2.
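
Combining the saved arguments with the two-process launch gives a rough token budget for the run. The arithmetic below assumes that both processes contribute a full per-device batch on every micro-step and that every sequence is a full 1024-token block, so treat the result as an upper-bound estimate rather than a logged figure.

```python
per_device_batch = 2   # per_device_train_batch_size
grad_accum = 64        # gradient_accumulation_steps
num_processes = 2      # torchrun --nproc_per_node=2
block_size = 1024      # packed sequence length
max_steps = 18920

# Sequences consumed per optimizer step across all processes.
sequences_per_step = per_device_batch * grad_accum * num_processes
tokens_per_step = sequences_per_step * block_size
total_tokens = tokens_per_step * max_steps

print(sequences_per_step)     # 256
print(f"{tokens_per_step:,}")  # 262,144
print(f"{total_tokens:,}")     # 4,959,764,480
```

Under these assumptions the run processes about 4.96 billion token positions, counting repeats if the corpus is traversed more than once.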

Training Metrics

Final saved training step: 18920
Final eval loss: 2.5013
Approximate perplexity: 12.20
Final reported train loss: 2.8971

These numbers should be treated as run-level reference metrics, not as a full benchmark.
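
The perplexity figure follows from the eval loss if that loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling but is an assumption here:

```python
import math

eval_loss = 2.5013  # final eval loss reported above
perplexity = math.exp(eval_loss)  # perplexity = exp(mean cross-entropy)
print(f"{perplexity:.2f}")  # 12.20
```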

Usage

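from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "VLAI-AIVN/vigpt2-aio-mixed-one-step"

# device_map="auto" requires the accelerate package to be installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# This is a base model, so prompt it with text to continue, not an instruction.
prompt = "Hà Nội là thủ đô của Việt Nam. "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,  # sample instead of greedy decoding
        temperature=0.8,
        top_p=0.95,
        # Reuse EOS as the padding token for generation.
        pad_token_id=tokenizer.eos_token_id,
    )

# The decoded string contains the prompt followed by the continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))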

Intended Uses

Vietnamese language modeling experiments
Vietnamese text generation baselines
Research on mixed-domain pretraining
Further fine-tuning for downstream Vietnamese generation tasks

Out-of-Scope Uses

Safety-critical decision making
Factual question answering without verification
Use as a chat assistant without additional instruction tuning
Deployment in production without task-specific evaluation and filtering

Limitations

The model can generate incorrect, biased, repetitive, or low-quality text.
The training mixture includes general web-like and Wikipedia-style text as well as poem data, so style may drift depending on the prompt.
This is a base generative model, not an aligned assistant model.
The local project snapshot used to produce this checkpoint does not declare a license. Confirm licensing before broad redistribution or commercial use.

Repository Context

This checkpoint comes from the Vietnamese GPT-2 pretraining project in this repository, which includes:

Tokenizer training
Deduplication and corpus preparation
Base pretraining
Poem-domain continued pretraining
Mixed one-step pretraining

Citation

If you use this model, cite the repository or link back to the Hugging Face model page.
