Libraries

How to use qqggez/deepseek-parlay-6.7b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="qqggez/deepseek-parlay-6.7b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("qqggez/deepseek-parlay-6.7b")
model = AutoModelForCausalLM.from_pretrained("qqggez/deepseek-parlay-6.7b")
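
The snippets above stop once the model is loaded. Below is a minimal sketch of one way to run a generation with it. The prompt wording, the helper names `build_prompt` and `generate_code`, and the generation settings are illustrative assumptions; the model card does not document a prompt format.

```python
MODEL_ID = "qqggez/deepseek-parlay-6.7b"


def build_prompt(task: str) -> str:
    """Wrap a task description in a plain-text prompt.

    Hypothetical format: the model card does not specify one.
    """
    return (
        "// Task: " + task.strip() + "\n"
        "// Write a C++ solution using ParlayLib primitives.\n"
        "#include <parlay/primitives.h>\n"
    )


def generate_code(task: str, max_new_tokens: int = 256) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" needs the `accelerate` package; drop it to load on CPU.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )

    inputs = tokenizer(build_prompt(task), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Slice off the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate_code("Compute the prefix sums of a sequence of integers."))
```

Greedy decoding (`do_sample=False`) is chosen here only to make the output reproducible; sampling settings may work better in practice.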

Local Apps

vLLM

How to use qqggez/deepseek-parlay-6.7b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "qqggez/deepseek-parlay-6.7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qqggez/deepseek-parlay-6.7b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
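
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl payload; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes the server returns the usual OpenAI-style completions response with a `choices[0].text` field.

```python
import json
import urllib.request

MODEL_ID = "qqggez/deepseek-parlay-6.7b"


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = MODEL_ID,
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response body.
    return body["choices"][0]["text"]


# Example call (requires a running server):
# print(complete("Once upon a time,"))
```

For a server listening on another port, pass `base_url`, for example `complete(prompt, base_url="http://localhost:30000")`.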

SGLang

How to use qqggez/deepseek-parlay-6.7b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "qqggez/deepseek-parlay-6.7b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qqggez/deepseek-parlay-6.7b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "qqggez/deepseek-parlay-6.7b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qqggez/deepseek-parlay-6.7b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use qqggez/deepseek-parlay-6.7b with Docker Model Runner:
```
docker model run hf.co/qqggez/deepseek-parlay-6.7b
```

Model Card for deepseek-parlay-6.7b

This model is part of the ParEVO framework, introduced in the paper ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution.

Project Website: https://quanquancliu.com/ParEVO/index.html
GitHub Repository: https://github.com/WildAlg/ParEVO

Model Details

Base Model: deepseek-ai/deepseek-coder-6.7b-base
Model Type: C++ Parallel Code Generation Model
Language: C++
Parameters: 6.7B

Intended Use

The model is fine-tuned to generate high-performance parallel algorithms for irregular data structures in C++. It targets the composable primitives of ParlayLib, a library of parallel algorithms and data structures (e.g., filter, pack, scan, sort, reduce), with the aim of producing scalable and safe parallel code.
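
To make "composable primitives" concrete, here is a sequential Python sketch of what scan, pack, filter, and reduce compute, and how filter can be built from pack, which in turn can be built from scan. This is a reference for the semantics only, under the assumption of an exclusive scan that also returns the total; it is not ParlayLib's C++ API, and the function names are chosen for illustration.

```python
import operator
from functools import reduce as fold
from itertools import accumulate


def scan(xs, op=operator.add, identity=0):
    """Exclusive scan: element i holds the combination of xs[0..i-1].

    Returns (prefixes, total).
    """
    prefixes = list(accumulate(xs, op, initial=identity))
    return prefixes[:-1], prefixes[-1]


def pack(xs, flags):
    """Keep xs[i] where flags[i] is true, preserving order.

    The scan over the 0/1 flags gives each kept element its output slot,
    which is what lets a parallel version write all slots independently.
    """
    offsets, count = scan([1 if f else 0 for f in flags])
    out = [None] * count
    for i, (x, f) in enumerate(zip(xs, flags)):
        if f:
            out[offsets[i]] = x
    return out


def filter_seq(xs, pred):
    """Filter expressed as a map to flags followed by pack."""
    return pack(xs, [pred(x) for x in xs])


def reduce_seq(xs, op=operator.add, identity=0):
    """Combine all elements with an associative operator."""
    return fold(op, xs, identity)


evens = filter_seq([3, 8, 1, 4, 6], lambda x: x % 2 == 0)
print(evens, reduce_seq(evens))
```

The loop in `pack` has no dependency between iterations once the offsets are known, which is why this composition parallelizes well.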

Training Data

The model was trained on the Parlay-Instruct Corpus, a dataset of 13,820 verified tasks synthesized via an evolutionary "Teacher-Student-Critic" pipeline. The training dataset includes:

Ground-truth samples covering ParlayLib's core primitives.
DMOJ "slow-fast" code comparison pairs, constructed to identify optimal performance transformations rather than just functional correctness.
Code validated with execution-based verification against a ground-truth C++ compiler oracle.

The training data is available in the GitHub repository: https://github.com/WildAlg/ParEVO

Training Procedure

Algorithm: Single-stage Supervised Fine-Tuning (SFT)
Method: LoRA ($r=8$, $\alpha=16$) targeting the query and value projections
Learning Rate: $2 \times 10^{-4}$
Precision: FP16
Hardware: NVIDIA RTX 5000 Ada
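
As a rough illustration of what these LoRA settings imply, the sketch below computes the adapter scaling factor $\alpha / r$ and an estimate of the number of trainable parameters, and shows how the configuration might be expressed with the `peft` library. Several things here are assumptions not stated in the card: a hidden size of 4096 and 32 layers for the base model, square query and value projections, and the Llama-style module names `q_proj` and `v_proj`. Other hyperparameters (dropout, batch size, epochs) are not given in the card and are left at library defaults.

```python
LORA_RANK = 8
LORA_ALPHA = 16


def lora_scaling(rank: int = LORA_RANK, alpha: int = LORA_ALPHA) -> float:
    """Factor applied to the low-rank update: alpha / r."""
    return alpha / rank


def lora_trainable_params(
    hidden_size: int = 4096,      # assumed base-model hidden size
    num_layers: int = 32,         # assumed number of transformer layers
    modules_per_layer: int = 2,   # query and value projections
    rank: int = LORA_RANK,
) -> int:
    """Each adapted (hidden x hidden) matrix adds A (r x hidden) and B (hidden x r)."""
    per_module = rank * (hidden_size + hidden_size)
    return per_module * modules_per_layer * num_layers


def make_lora_config():
    """Build a peft LoraConfig matching the listed rank and alpha."""
    # Imported lazily so the arithmetic helpers work without peft installed.
    from peft import LoraConfig

    return LoraConfig(
        r=LORA_RANK,
        lora_alpha=LORA_ALPHA,
        target_modules=["q_proj", "v_proj"],  # assumed module names
        task_type="CAUSAL_LM",
    )


print(lora_scaling(), lora_trainable_params())
```

Under these assumptions the adapters add about 4.2 million trainable parameters, a small fraction of the 6.7B base weights.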

License

The ParEVO framework and datasets use a modular licensing structure to maximize open-source adoption, while the fine-tuned model weights inherit the license of their base model.

1. Model Weights License

The fine-tuned deepseek-parlay-6.7b model weights are a derivative work of deepseek-ai/deepseek-coder-6.7b-base. As such, the model weights and inference outputs are governed by the DeepSeek License. Users must comply with the original use-case restrictions and terms set by DeepSeek when using this model.

2. Software License (MIT License)

3. Dataset License (CC BY 4.0)

The Parlay-Instruct Corpus, ParEval evaluation trajectories, and DMOJ problem-solution datasets are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

Citation

If you use this model or the ParEVO framework in your research, please cite:

@inproceedings{yang2026parevo,
  title={ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution},
  author={Yang, Liu and Nie, Zeyu and Liu, Andrew and Zou, Felix and Alt{\i}nb{\"u}ken, Deniz and Yazdanbakhsh, Amir and Liu, Quanquan C.},
  booktitle={arXiv Preprint},
  year={2026}
}