miniLLM-0.1B

A small (~109M parameters) causal language model pretrained from scratch on OpenWebText.

The generation script for this model is available at https://github.com/Cerynitius/llmTrain/raw/refs/heads/main/generate.py
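
Independently of that script, the model can also be loaded through the Transformers library. The snippet below is a minimal generation sketch, assuming the Hub repo id `Hippocrene/MiniLLM-0.1B` shown on this page; the prompt and sampling settings are illustrative choices, not values taken from generate.py.

```python
# Minimal generation sketch with Transformers (repo id and sampling
# parameters are assumptions, not taken from the repository's generate.py).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hippocrene/MiniLLM-0.1B"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "The history of the printing press"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation; a 109M pretrained-only model tends to drift,
# so keeping generations short usually gives more coherent output.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```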

Final training loss: 3.4

Model Details

| Attribute | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~109M |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Layers | 10 |
| Intermediate Size | 2048 |
| Max Sequence Length | 1024 |
| Vocabulary Size | 50257 |
| Tokenizer | GPT-2 (BPE) |
| Positional Encoding | RoPE (θ=10000) |
| Activation | SiLU |
| Tie Word Embeddings | Yes |
| Precision (training) | bfloat16 |
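
For reference, the snippet below is a minimal sketch of how the values in this table map onto a Transformers LlamaConfig; any field not listed in the table is left at its library default, which is an assumption about the actual training setup.

```python
# Sketch of a LlamaConfig built from the table above; unlisted fields keep
# their Transformers defaults (an assumption, not confirmed by the card).
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=50257,             # GPT-2 BPE vocabulary
    hidden_size=768,
    intermediate_size=2048,
    num_hidden_layers=10,
    num_attention_heads=12,
    max_position_embeddings=1024,
    rope_theta=10000.0,
    hidden_act="silu",
    tie_word_embeddings=True,
)

model = LlamaForCausalLM(config)
# With tied input/output embeddings this lands at roughly 109M parameters,
# consistent with the figure in the table.
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```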

Limitations

This is a small-scale pretrained model intended for research and educational purposes.

The training script is available at https://github.com/Cerynitius/llmTrain

It is not suitable for production use.

Outputs may be incoherent, biased, or factually incorrect.
