Instructions to use retronic/colox-v1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use retronic/colox-v1 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="retronic/colox-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("retronic/colox-v1", dtype="auto")
```

- Notebooks
- Google Colab
- Kaggle
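The Transformers pipeline above accepts a list of chat messages, so a multi-turn conversation is just a longer list. A minimal sketch of assembling that list (the `build_messages` helper is hypothetical, for illustration only — not part of Transformers; the pipeline call itself is left commented out since it downloads the model weights):

```python
# Chat messages are plain dicts with "role" and "content" keys.
# build_messages is a hypothetical helper, not a Transformers API.

def build_messages(history, user_msg, system_prompt=None):
    """Assemble the messages list expected by a chat pipeline."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history)  # earlier user/assistant turns, in order
    messages.append({"role": "user", "content": user_msg})
    return messages

history = [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am a language model."},
]
messages = build_messages(history, "What can you help me with?")
print(len(messages))  # 3 turns so far

# With the model downloaded, the list goes straight to the pipeline:
# from transformers import pipeline
# pipe = pipeline("text-generation", model="retronic/colox-v1")
# pipe(messages)
```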
- Local Apps
- vLLM
How to use retronic/colox-v1 with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "retronic/colox-v1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "retronic/colox-v1",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker

```shell
docker model run hf.co/retronic/colox-v1
```
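The curl call above targets the server's OpenAI-compatible `/v1/chat/completions` endpoint; the same request can be made from Python with only the standard library. A minimal sketch, assuming the vLLM server is running locally on port 8000 (the SGLang server below exposes the same API on port 30000) — `build_chat_request` and `send_chat` are illustrative helpers, not part of any library:

```python
# Minimal stdlib client for an OpenAI-compatible chat completions endpoint.
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    """Return the URL and JSON payload for one chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{base_url}/v1/chat/completions", payload

def send_chat(base_url, model, prompt):
    """POST the request and return the assistant's reply text."""
    url, payload = build_chat_request(base_url, model, prompt)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

url, payload = build_chat_request(
    "http://localhost:8000", "retronic/colox-v1", "What is the capital of France?"
)
print(url)  # http://localhost:8000/v1/chat/completions

# With the server running:
# print(send_chat("http://localhost:8000", "retronic/colox-v1", "Hello!"))
```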
- SGLang
How to use retronic/colox-v1 with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "retronic/colox-v1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "retronic/colox-v1",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "retronic/colox-v1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "retronic/colox-v1",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

- Docker Model Runner
How to use retronic/colox-v1 with Docker Model Runner:
```shell
docker model run hf.co/retronic/colox-v1
```
```yaml
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
datasets:
- Quardo/gsm8k-thinking
```
Colox, a special thinking AI.
Introducing Colox! Colox is a reasoning model fine-tuned from LLaMA 2, created by me and trained to mimic the way humans think through hard problems. It is free and open source, and smarter than GPT o1. And it was made for around $25, unlike most companies that spend billions. :)
150 Epochs
This model went through 150 epochs of training to reason and think like humans do, producing a lot of explicit thinking before giving an answer.
Trained on 4.5K Patterns
It has been trained on 4.5K patterns of human speech and thought.
Notes
Colox does not know its own identity: it thinks it is LLaMA made by Meta AI, because of its base model's training data, and has no data about its new identity. Also, it may sometimes skip the thinking step; that is an issue I am currently fixing.