FlashLM v5.2 "Nova-Ignition"

A 5.0M-parameter language model designed for environments with 2 CPU threads and 5 GB of RAM. It was trained for 2 hours on a free-tier cloud CPU. No GPU was used, either for training or for inference.

Model Details

Architecture: Standard Transformer with Rotary Positional Embeddings (RoPE)
Parameters: ~5.0M
Vocab Size: 4,096 (BPE)
Context Length: 128 tokens
d_model: 256
Layers: 6
Attention Heads: 4
FFN Hidden: 512
Activation: GELU
Weight Tying: Yes (embedding ↔ head)
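
As a rough cross-check of the model size, the sketch below counts parameters from the hyperparameters listed above. It assumes bias-free attention and FFN projections, LayerNorm with weight and bias, no learned position table (RoPE needs none), and a tied output head. None of these details are confirmed by this card, and the function name is hypothetical. Under these assumptions the count comes to about 4.2M, so the stated ~5.0M suggests the actual model has parameters this simple count does not capture.

```python
def count_params(vocab=4096, d_model=256, n_layers=6, d_ffn=512):
    """Estimate parameters for a plain pre-LN Transformer (assumed layout)."""
    embedding = vocab * d_model            # shared with the output head (weight tying)
    attention = 4 * d_model * d_model      # Q, K, V and output projections, no bias (assumption)
    ffn = 2 * d_model * d_ffn              # 256 -> 512 -> 256, no bias (assumption)
    layer_norms = 2 * 2 * d_model          # two LayerNorms per block, weight + bias each
    block = attention + ffn + layer_norms
    final_norm = 2 * d_model
    return embedding + n_layers * block + final_norm


total = count_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```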

Architecture

Embedding (4K × 256, float, weight-tied)
  → 6 × NovaBlock:
      LayerNorm → MultiHeadAttention (RoPE) + residual
      LayerNorm → FFN (GELU, 256→512→256) + residual
  → LayerNorm → Output Head (tied to embedding)
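
The block layout above can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the author's model.py: the names `rope` and `SketchBlock`, the fused QKV projection, the half-split RoPE layout, the base of 10000 and the causal mask are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary embeddings to x of shape (batch, heads, seq, d_head)."""
    seq_len, d_head = x.shape[-2], x.shape[-1]
    half = d_head // 2
    # One rotation frequency per channel pair (assumed base of 10000).
    inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()  # each (seq, half)
    x1, x2 = x[..., :half], x[..., half:]
    # Rotate every (x1, x2) pair by its position-dependent angle.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class SketchBlock(nn.Module):
    """Pre-LN block: LayerNorm -> attention + residual, LayerNorm -> FFN + residual."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, d_ffn: int = 512):
        super().__init__()
        self.n_heads = n_heads
        self.ln1 = nn.LayerNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.proj = nn.Linear(d_model, d_model, bias=False)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ffn), nn.GELU(), nn.Linear(d_ffn, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(self.ln1(x)).chunk(3, dim=-1)
        # (batch, seq, d_model) -> (batch, heads, seq, d_head)
        q, k, v = (z.view(b, t, self.n_heads, -1).transpose(1, 2) for z in (q, k, v))
        q, k = rope(q), rope(k)  # positions enter only through queries and keys
        att = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.proj(att.transpose(1, 2).reshape(b, t, d))
        return x + self.ffn(self.ln2(x))


block = SketchBlock()
x = torch.randn(2, 16, 256)
y = block(x)  # same shape as the input: (2, 16, 256)
```

Because RoPE is a pure rotation, it leaves the norm of each query and key vector unchanged and adds no learned parameters.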

Training

Dataset: TinyStories V2 (validation split)
Training Time: 2 hours
Hardware: Free-tier cloud CPU (2 threads, 5GB RAM)
Speed: ~3,500 tokens/sec
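
For a sense of scale, the throughput and training time above imply the token budget computed below. The ~20M-token dataset size is taken from the Limitations section, and the estimate assumes the quoted speed was sustained for the full 2 hours, which this card does not state.

```python
tokens_per_sec = 3_500          # quoted training speed
train_seconds = 2 * 3600        # 2 hours
dataset_tokens = 20_000_000     # approximate dataset size from Limitations

tokens_seen = tokens_per_sec * train_seconds
epochs = tokens_seen / dataset_tokens
print(f"{tokens_seen / 1e6:.1f}M tokens seen, about {epochs:.2f} passes over the data")
```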

Benchmark Results

Model	Params	BPC	PPL	Hardware
FlashLM v5.2	5.0M	0.78	10.56	2-thread CPU
FlashLM v4 "Bolt"	4.3M	0.88	15.05	2-thread CPU
TinyStories-1M	3.7M	0.62	6.72	V100 GPU

v5.2 improves on v4 by about 11% relative in BPC (from 0.88 to 0.78) with the same 2-hour training budget.
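
The relative gain follows directly from the table values. A short check, using only the numbers reported above (the helper name is illustrative):

```python
def relative_gain(old: float, new: float) -> float:
    """Fractional reduction of a lower-is-better metric."""
    return (old - new) / old


bpc_gain = relative_gain(0.88, 0.78)    # v4 -> v5.2 bits per character
ppl_gain = relative_gain(15.05, 10.56)  # v4 -> v5.2 perplexity
print(f"BPC: {bpc_gain:.1%} lower, PPL: {ppl_gain:.1%} lower")
```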

Usage

import torch
from tokenizers import Tokenizer

# Architecture definition; assumes model.py from the repository is importable
from model import NovaIgnitionLM

# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Load model weights on CPU and switch to inference mode
model = NovaIgnitionLM(vocab=4096, d_model=256, n_layers=6,
                       n_heads=4, d_head=64, d_ffn=512)
model.load_state_dict(torch.load("best.pt", map_location="cpu", weights_only=True))
model.eval()

# Generate (prompt plus new tokens should stay within the 128-token context)
prompt = "Once upon a time"
ids = tokenizer.encode(prompt).ids
x = torch.tensor([ids])
with torch.no_grad():
    out = model.generate(x, max_new_tokens=80, temperature=0.8, top_k=40)
text = tokenizer.decode(out[0].tolist())
print(text)
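
The `temperature` and `top_k` arguments passed to `generate` conventionally control a sampling step like the one sketched below. How `generate` is actually implemented is not shown in this card, so this is an assumption about its behavior, and `sample_next` is a hypothetical helper.

```python
import torch


def sample_next(logits: torch.Tensor, temperature: float = 0.8, top_k: int = 40) -> torch.Tensor:
    """Sample one token id per row from logits of shape (batch, vocab)."""
    # Temperatures below 1 sharpen the distribution, above 1 flatten it.
    logits = logits / temperature
    # Keep the top_k largest logits per row and mask out everything else.
    k = min(top_k, logits.size(-1))
    kth_value = torch.topk(logits, k, dim=-1).values[..., -1, None]
    logits = logits.masked_fill(logits < kth_value, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)  # shape (batch, 1)


torch.manual_seed(0)
fake_logits = torch.randn(1, 4096)  # stand-in for the model's last-position logits
next_id = sample_next(fake_logits)
```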

Files

best.pt - Best model checkpoint
latest.pt - Latest checkpoint
config.json - Training configuration

Limitations

Small context window (128 tokens)
Trained on limited data (~20M tokens)
Not suitable for complex reasoning tasks
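
Given the 128-token window, one way to stay within it is to trim the prompt before generation, as in the minimal sketch below. `fit_prompt` is a hypothetical helper. Whether `generate` already crops its input is not stated here, so this step may be redundant.

```python
def fit_prompt(ids: list[int], max_new_tokens: int, context_len: int = 128) -> list[int]:
    """Keep the most recent prompt tokens so prompt + new tokens fit the context."""
    budget = context_len - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context length")
    # A short prompt is returned unchanged; a long one keeps only its tail.
    return ids[-budget:]


long_prompt = list(range(200))  # stand-in for tokenizer.encode(prompt).ids
trimmed = fit_prompt(long_prompt, max_new_tokens=80)
print(len(trimmed))
```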

License

MIT

Citation

@misc{flashlm-v52,
  author = {Chang Cheng},
  title = {FlashLM v5.2 Nova-Ignition},
  year = {2026},
  url = {https://github.com/changcheng967/FlashLM}
}
