How to use Leogrin/eleuther-pythia1.4b-hh-sft with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Leogrin/eleuther-pythia1.4b-hh-sft")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Leogrin/eleuther-pythia1.4b-hh-sft")
model = AutoModelForCausalLM.from_pretrained("Leogrin/eleuther-pythia1.4b-hh-sft")

How to use Leogrin/eleuther-pythia1.4b-hh-sft with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Leogrin/eleuther-pythia1.4b-hh-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1.4b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
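
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are illustrative, and the response layout (`choices[0].text`) assumes the server follows the OpenAI completions format, as the comment in the curl example suggests.

```python
import json
import urllib.request

MODEL_ID = "Leogrin/eleuther-pythia1.4b-hh-sft"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumes an OpenAI-style response body: {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. Passing `base_url="http://localhost:30000"` would target a server listening on port 30000 instead.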
How to use Leogrin/eleuther-pythia1.4b-hh-sft with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Leogrin/eleuther-pythia1.4b-hh-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1.4b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'

# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Leogrin/eleuther-pythia1.4b-hh-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1.4b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'

How to use Leogrin/eleuther-pythia1.4b-hh-sft with Docker Model Runner:
docker model run hf.co/Leogrin/eleuther-pythia1.4b-hh-sft
Pythia-1.4b, fine-tuned with supervised fine-tuning (SFT) on the Anthropic hh-rlhf dataset for 1 epoch.
See the Pythia-1.4b model card and the Pythia paper for model details.
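
Because the hh-rlhf transcripts are written as alternating `\n\nHuman:` and `\n\nAssistant:` turns, prompts in that shape may suit this model better than free-form text. This card does not document the prompt template used during fine-tuning, so the sketch below is an assumption based on the dataset format, and `format_hh_prompt` is a hypothetical helper name.

```python
def format_hh_prompt(turns):
    """Render (speaker, text) turns in hh-rlhf dialogue style.

    The returned string ends with an open "Assistant:" cue so that the
    model's continuation is the next assistant reply. The template is an
    assumption taken from the dataset format, not from this model card.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in ("Human", "Assistant"):
            raise ValueError(f"unknown speaker: {speaker!r}")
        parts.append(f"\n\n{speaker}: {text.strip()}")
    parts.append("\n\nAssistant:")
    return "".join(parts)
```

The resulting string can be passed as the prompt to the `pipe` object from the Transformers example, or as the `"prompt"` field in the server requests.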
Results for the base model are taken from the Pythia paper.

Zero-shot:

| Task | 1.4B_base | 1.4B_sft |
|---|---|---|
| Lambada (OpenAI) | 0.616 ± 0.007 | 0.5977 ± 0.0068 |
| PIQA | 0.711 ± 0.011 | 0.7133 ± 0.0106 |
| WinoGrande | 0.573 ± 0.014 | 0.5793 ± 0.0139 |
| WSC | 0.365 ± 0.047 | 0.3654 ± 0.0474 |
| ARC - Easy | 0.606 ± 0.010 | 0.6098 ± 0.0100 |
| ARC - Challenge | 0.260 ± 0.013 | 0.2696 ± 0.0130 |
| SciQ | 0.865 ± 0.011 | 0.8540 ± 0.0112 |
| LogiQA | 0.210 ± 0.016 | N/A |

Five-shot:

| Task | 1.4B_base | 1.4B_sft |
|---|---|---|
| Lambada (OpenAI) | 0.578 ± 0.007 | 0.5201 ± 0.007 |
| PIQA | 0.705 ± 0.011 | 0.7176 ± 0.0105 |
| WinoGrande | 0.580 ± 0.014 | 0.5793 ± 0.0139 |
| WSC | 0.365 ± 0.047 | 0.5288 ± 0.0492 |
| ARC - Easy | 0.643 ± 0.010 | 0.6376 ± 0.0099 |
| ARC - Challenge | 0.290 ± 0.013 | 0.2935 ± 0.0133 |
| SciQ | 0.92 ± 0.009 | 0.9180 ± 0.0087 |
| LogiQA | 0.240 ± 0.017 | N/A |
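
To judge which differences between the base and SFT columns are larger than the reported uncertainty, a rough z-score can be computed from the table values. This sketch assumes the ± figures are standard errors and treats the two evaluations as independent, which is only an approximation since both models are scored on the same test items. The three rows are copied from the second table above; the function name `z_score` is illustrative.

```python
import math


def z_score(base, base_se, sft, sft_se):
    """Difference (sft - base) in units of the combined standard error.

    Assumes the two scores are independent and the +/- values are
    standard errors; treat the result as a rough guide only.
    """
    return (sft - base) / math.sqrt(base_se ** 2 + sft_se ** 2)


# (base, base_se, sft, sft_se) for selected rows of the second table
second_table = {
    "Lambada (OpenAI)": (0.578, 0.007, 0.5201, 0.007),
    "PIQA": (0.705, 0.011, 0.7176, 0.0105),
    "WSC": (0.365, 0.047, 0.5288, 0.0492),
}

for task, row in second_table.items():
    print(f"{task}: z = {z_score(*row):+.2f}")
```

Under these assumptions, rows with |z| well below 2 (such as PIQA) are within noise, while the Lambada drop and the WSC gain stand out.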