dfuhoiysOHSVFh82934gfjklb

huba-buba

AI & ML interests

None yet

Recent Activity

upvoted a paper 15 days ago

Universal Reasoning Model

liked a dataset 18 days ago

d0rj/OpenOrca-ru

liked a dataset 18 days ago

d0rj/ru-instruct


Organizations

None yet

upvoted a paper 15 days ago

Universal Reasoning Model

Paper • 2512.14693 • Published 18 days ago • 40

upvoted a collection 18 days ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 147

upvoted a paper 24 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 282

upvoted 2 papers about 1 month ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 85

GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published Nov 17, 2025 • 118

upvoted 2 papers 2 months ago

Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

Paper • 2510.19600 • Published Oct 22, 2025 • 68

Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings

Paper • 2508.18733 • Published Aug 26, 2025 • 9

upvoted a collection 3 months ago

Qwen3-VL

Collection

37 items • Updated 4 days ago • 555

upvoted 7 papers 3 months ago

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Paper • 2510.02209 • Published Oct 2, 2025 • 53

No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping

Paper • 2509.21880 • Published Sep 26, 2025 • 52

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Paper • 2509.25849 • Published Sep 30, 2025 • 47

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 134

upvoted an article 3 months ago

Gaia2 and ARE: Empowering the community to study agents

Article • Sep 22, 2025 • 125

upvoted 4 papers 4 months ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10, 2025 • 56

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 660

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 70
