Instructions to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
model = AutoModelForCausalLM.from_pretrained("Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
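
The same request can also be issued from Python. The block below is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style response body (`choices[0].message.content`). The helper names `build_chat_request` and `chat` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.request

MODEL_ID = "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000", model=MODEL_ID, timeout=60):
    """Send the request and return the text of the first choice.

    Assumes an OpenAI-compatible server is already running at base_url.
    """
    request = build_chat_request(prompt, base_url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `print(chat("What is the capital of France?"))` would send the same payload as the curl command. Passing a different `base_url` (for example `http://localhost:30000`) targets a server on another port.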

Use Docker

docker model run hf.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B

SGLang

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with Docker Model Runner:
```
docker model run hf.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
```

Update pipeline tag to image-text-to-text, add ReSum paper link and citation, and enhance content

by nielsr HF Staff - opened Sep 18, 2025

base: refs/heads/main ← from: refs/pr/4

Files changed: +93 -7

- initial commit (1b24d8d2)
- Upload folder using huggingface_hub (f991b90a)
- Update README.md (a92f778c)
- Delete mergekit_config.yml (b958b177)
- Update README.md (991ea16d)
- Upload vocab.json (e65d87ae)
- Update README.md (cbb31948)
- Update README.md (70fd758d)
- Adding `transformers` as the library tag for better visibility. (#1) (99d23cdd)

nielsr

Sep 18, 2025

This PR significantly improves the model card for Alibaba-NLP/Tongyi-DeepResearch-30B-A3B by:

- Updating the `pipeline_tag` from `text-generation` to `image-text-to-text`. This change accurately reflects the model's multimodal capabilities, as evidenced by its use as a web agent that processes visual environments and the presence of vision-related tokens in its tokenizer configuration (`tokenizer_config.json`). This will improve the model's discoverability under the correct pipeline on the Hugging Face Hub (e.g., at https://huggingface.co/models?pipeline_tag=image-text-to-text).
- Adding a direct link to the associated paper, "ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization", at the top of the model card for better visibility and context.
- Updating the BibTeX citation section to include the specific citation for the ReSum paper, in addition to the existing project citation.
- Integrating additional valuable information from the GitHub repository's README, such as badges, "News", "Deep Research Benchmark Results", "Deep Research Agent Family", "Misc", "Talent Recruitment", and "Contact Information", to provide a more complete overview of the model and its ecosystem. Image links from the GitHub README have been converted to raw URLs for proper rendering.
- Updating the "Download" section to "Download and Usage" to clearly direct users to the GitHub repository for detailed setup and inference instructions, as no standalone Python code snippet for direct inference via the `transformers` library was found in the GitHub README.

These enhancements aim to provide a more accurate, informative, and user-friendly model card, aligning it with Hugging Face Hub best practices.
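
The tokenizer-configuration point in the list above can be checked mechanically. The following is a rough sketch, assuming the file follows the common Hugging Face layout in which added tokens are listed under an `added_tokens_decoder` mapping with a `content` field per entry. The helper `find_vision_tokens`, the hint list, and the sample dictionary are all hypothetical and for illustration; the sample is not copied from this model's actual `tokenizer_config.json`.

```python
import json

# Substrings treated as "vision-related" here; this list is an assumption.
VISION_HINTS = ("vision", "image", "video")


def find_vision_tokens(tokenizer_config):
    """Return added-token strings whose text contains a vision-related hint."""
    added = tokenizer_config.get("added_tokens_decoder", {})
    matches = []
    for entry in added.values():
        content = entry.get("content", "")
        if any(hint in content.lower() for hint in VISION_HINTS):
            matches.append(content)
    return sorted(matches)


# Illustrative input only, shaped like a tokenizer_config.json fragment.
sample_config = {
    "added_tokens_decoder": {
        "0": {"content": "<|endoftext|>"},
        "1": {"content": "<|vision_start|>"},
        "2": {"content": "<|image_pad|>"},
    }
}

print(find_vision_tokens(sample_config))
```

To run it against a real file, load a locally downloaded copy with `json.load(open("tokenizer_config.json"))` and pass the resulting dictionary to `find_vision_tokens`.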

Update pipeline tag to image-text-to-text, add ReSum paper link and citation, and enhance content (3caa20f4)

Cannot merge

This branch has merge conflicts in the following files:

README.md
