Yang Zhou

nbzy1995

AI & ML interests

Artificial General Intelligence, AI for Science, AI for society

Recent Activity

updated a model 3 months ago

nbzy1995/Qwen2-0-5B-GRPO-vllm-trl

Updated Nov 17, 2025

updated a Space 3 months ago

Qwen2 0 5B GRPO Vllm Trl

Trackio Dashboard: Monitor and analyze project runs

published a Space 3 months ago

Qwen2 0 5B GRPO Vllm Trl

Trackio Dashboard: Monitor and analyze project runs

published a model 3 months ago

nbzy1995/Qwen2-0-5B-GRPO-vllm-trl

Updated Nov 17, 2025

updated a Space 3 months ago

Trl Trackio

Display tracking information

published a Space 3 months ago

Trl Trackio

Display tracking information

updated a Space 3 months ago

Trackio

Track and visualize project run metrics

published a Space 3 months ago

Trackio

Track and visualize project run metrics

published a model 3 months ago

nbzy1995/Qwen3-VL-4B-Instruct-trl-grpo

Updated Nov 13, 2025

upvoted a paper 4 months ago

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Paper • 2509.22601 • Published Sep 26, 2025 • 30

liked 2 models 6 months ago

openai/clip-vit-base-patch32

Zero-Shot Image Classification • Updated Feb 29, 2024 • 14.6M • 850

Qwen/Qwen2.5-1.5B

Text Generation • 2B • Updated Oct 8, 2024 • 420k • 160

liked a dataset 8 months ago

gaia-benchmark/GAIA

Viewer • Updated Oct 28, 2025 • 932 • 14.5k • 603

updated 2 models 8 months ago

nbzy1995/Reinforce-Cartpole-v1

Reinforcement Learning • Updated Jun 7, 2025

nbzy1995/dqn_rl_zoo3_atari

Reinforcement Learning • Updated Jun 6, 2025 • 1

published a model 8 months ago

nbzy1995/dqn_rl_zoo3_atari

Reinforcement Learning • Updated Jun 6, 2025 • 1

liked a Space 8 months ago

Interactive DeepRL Demo

Run and customize deep reinforcement learning simulations

updated a model 8 months ago

nbzy1995/rl_course_vizdoom_health_gathering_supreme

Reinforcement Learning • Updated Jun 4, 2025

published a model 8 months ago

nbzy1995/rl_course_vizdoom_health_gathering_supreme

Reinforcement Learning • Updated Jun 4, 2025

updated a model 8 months ago

nbzy1995/ppo-LunarLander-v2

Reinforcement Learning • Updated Jun 1, 2025