MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments Paper • 2605.09131 • Published 8 days ago • 38
Article Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents ibm-research • Apr 15 • 28
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 18 items • Updated 3 days ago • 16
From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents Paper • 2603.22386 • Published Mar 23 • 57
Time Series Models Collection A collection of time series models trained by IBM • 4 items • Updated Feb 25 • 1
Granite Time Series Collection Time series models for forecasting, anomaly detection, classification, and more. • 9 items • Updated 17 days ago • 51
Article IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST ibm-research • Feb 18 • 19
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks Paper • 2502.05352 • Published Feb 7, 2025 • 2