haykgrigorian/TimeCapsuleLLM-v2-London-1800-1875: Llama-Architecture 1.2B Model
Model Overview
This is the v2 model: a Llama-based causal language model trained from scratch on 112 GB of London texts from 1800-1875.
| Detail | Value |
|---|---|
| Model Architecture | LlamaForCausalLM (Decoder-Only Transformer) |
| Parameter Count | ~1.22B |
| Training Type | Trained from Scratch (Random Initialization) |
| Tokenizer | Custom BPE, Vocab Size 32,000 |
| Sequence Length | 2048 tokens |
| Attention Type | Grouped Query Attention (GQA) |
Configuration Details
This model is a custom size and configuration based on Llama:
| Parameter | Value |
|---|---|
| Number of Layers | 22 |
| Hidden Size (d) | 2048 |
| Intermediate Size ($\text{d}_{\text{ff}}$) | 5504 |
| Attention Heads | 16 (Query) / 8 (Key/Value) |
| Activation Function | SiLU (silu) |
| Normalization | RMS Norm (rms_norm_eps: 1e-06) |
| Position Embeddings | Rotary Positional Embeddings (RoPE) |
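To make the effect of Grouped Query Attention concrete, the sketch below derives the per-head dimension and the key/value cache size from the values in the tables above. It assumes the head dimension equals hidden size divided by the number of query heads (the usual Llama default) and a 2-byte cache element such as fp16 or bf16; neither assumption is stated in this card.

```python
# Values taken from the tables in this model card.
NUM_LAYERS = 22
HIDDEN_SIZE = 2048
NUM_QUERY_HEADS = 16
NUM_KV_HEADS = 8
SEQ_LEN = 2048

# Assumption: the cache is stored in fp16/bf16 (2 bytes per value).
BYTES_PER_VALUE = 2

# Assumption: head_dim = hidden_size / num_query_heads (usual Llama default).
head_dim = HIDDEN_SIZE // NUM_QUERY_HEADS
queries_per_kv_head = NUM_QUERY_HEADS // NUM_KV_HEADS


def kv_cache_bytes(num_kv_heads: int, seq_len: int = SEQ_LEN) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # The leading 2 counts one K tensor and one V tensor per layer.
    return 2 * NUM_LAYERS * num_kv_heads * head_dim * seq_len * BYTES_PER_VALUE


gqa_bytes = kv_cache_bytes(NUM_KV_HEADS)     # this model: 8 KV heads
mha_bytes = kv_cache_bytes(NUM_QUERY_HEADS)  # hypothetical: one KV head per query head

print(f"head_dim = {head_dim}, query heads per KV head = {queries_per_kv_head}")
print(f"KV cache with GQA: {gqa_bytes / 2**20:.0f} MiB per {SEQ_LEN}-token sequence")
print(f"KV cache without GQA: {mha_bytes / 2**20:.0f} MiB per {SEQ_LEN}-token sequence")
```

Under these assumptions, sharing each key/value head between two query heads halves the cache relative to standard multi-head attention.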
Training Info
This model was trained for 182,000 steps, about 0.5 epochs.
Training Metrics:
- Final Training Loss: 3.3951
- Start Training Loss: 10.7932
- Training Steps: 182,000
- Epochs: 0.4997
- Gradient Norm Stability: Consistently stable between 0.50 and 0.60 in later stages.
- Training time: 117 hours 51 minutes
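A few derived figures can help interpret these metrics. The sketch below assumes the reported loss is the mean cross-entropy per token in nats (the common convention, not confirmed by this card), so that its exponential is the training perplexity, and compares the start loss with the loss of a uniform guess over the 32,000-token vocabulary.

```python
import math

# Values reported in this model card.
final_loss = 3.3951
start_loss = 10.7932
steps = 182_000
epochs = 0.4997
train_seconds = 117 * 3600 + 51 * 60  # 117 hours 51 minutes
VOCAB_SIZE = 32_000

# Assumption: loss is mean cross-entropy per token, measured in nats.
final_perplexity = math.exp(final_loss)

# Loss of a model that guesses uniformly over the vocabulary.
uniform_loss = math.log(VOCAB_SIZE)

seconds_per_step = train_seconds / steps
steps_per_full_epoch = steps / epochs

print(f"final training perplexity: {final_perplexity:.1f}")
print(f"uniform-guess loss: {uniform_loss:.4f} (reported start loss: {start_loss})")
print(f"average time per step: {seconds_per_step:.2f} s")
print(f"steps for one full epoch: {steps_per_full_epoch:,.0f}")
```

A start loss close to the uniform-guess value is what one would expect from a randomly initialized model.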
Cost
This model was trained on an H100 SXM GPU from RunPod.
Total: $340.97
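For budgeting a similar run, the total can be turned into an implied average rate. This is a rough sketch that assumes the total covers exactly the reported 117 hours 51 minutes of training and nothing else (storage, idle time, and failed runs are not itemized in this card).

```python
# Values reported in this model card.
total_cost_usd = 340.97
train_hours = 117 + 51 / 60  # 117 hours 51 minutes
steps = 182_000

# Assumption: the total reflects only the reported training time.
implied_usd_per_hour = total_cost_usd / train_hours
usd_per_1000_steps = total_cost_usd / (steps / 1000)

print(f"implied average rate: ${implied_usd_per_hour:.2f} per hour")
print(f"cost per 1,000 steps: ${usd_per_1000_steps:.2f}")
```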
How to Load and Run the Model
Download all of the model files into a local folder and run the test script. You will need to adjust the run script first, for example by updating the config/file paths and the test prompts.
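As an alternative to the project's run script, the model can likely be loaded with the standard Transformers auto classes. The following is a minimal sketch, not the author's `run_v2.py`: the helper names `resolve_model_source` and `generate` are hypothetical, the sampling settings are illustrative, and it assumes `transformers` and PyTorch are installed and that the downloaded folder contains a `config.json`.

```python
from pathlib import Path
from typing import Optional

# Repository id of this model on the Hugging Face Hub.
MODEL_ID = "haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B"


def resolve_model_source(local_dir: Optional[str]) -> str:
    """Use the local folder if it holds a config.json, else the Hub repo id."""
    if local_dir and (Path(local_dir) / "config.json").is_file():
        return local_dir
    return MODEL_ID


def generate(prompt: str, local_dir: Optional[str] = None,
             max_new_tokens: int = 100) -> str:
    """Generate a continuation of `prompt` (hypothetical helper)."""
    # Imported here so the module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    source = resolve_model_source(local_dir)
    tokenizer = AutoTokenizer.from_pretrained(source)
    model = AutoModelForCausalLM.from_pretrained(source)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Pass only the tensors the model needs, in case the custom
    # tokenizer also returns extra fields.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=True,      # illustrative sampling settings,
        temperature=0.5,     # not the author's recommended values
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate("Once upon a time,"))` would download the weights from the Hub on first use; passing `local_dir` pointing at the downloaded folder avoids that.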
Test script
A run file for testing and evaluating this model is available on the main project repository:
- Test Script Link: run_v2.py on GitHub