Zhichen Zeng

CharyZeng

Zhichenzzz

AI & ML interests

None yet

Recent Activity

updated a model 3 days ago

CharyZeng/Kimi-K2.5-4layer

56B • Updated 3 days ago • 34

published a model 3 days ago

CharyZeng/Kimi-K2.5-4layer

56B • Updated 3 days ago • 34

authored 2 papers 3 days ago

Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression

Paper • 2510.01450 • Published Oct 1, 2025 • 2

Parallax: Parameterized Local Linear Attention for Language Modeling

Paper • 2605.29157 • Published 9 days ago • 11

upvoted a paper 5 days ago

Parallax: Parameterized Local Linear Attention for Language Modeling

Paper • 2605.29157 • Published 9 days ago • 11

updated 2 models about 1 month ago

CharyZeng/DeepSeek-V4-Flash-4layer

Text Generation • 15B • Updated Apr 25 • 153

CharyZeng/Kimi-K2.5-2layer

21B • Updated Apr 25 • 5

published 2 models about 1 month ago

CharyZeng/DeepSeek-V4-Flash-4layer

Text Generation • 15B • Updated Apr 25 • 153

CharyZeng/Kimi-K2.5-2layer

21B • Updated Apr 25 • 5

authored a paper 3 months ago

ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning

Paper • 2603.10160 • Published Mar 10 • 26

liked a model 3 months ago

zai-org/GLM-5

Text Generation • 754B • Updated Apr 5 • 109k • 2.09k

liked a Space 4 months ago

HLE Leaderboard for Agents with Tools

🥇

Humanity's Last Exam Leaderboard for LLM Agents with Tools

liked a dataset 10 months ago

UW-FMRL2/MMMG

Viewer • Updated May 27, 2025 • 937 • 135 • 13

authored a paper about 1 year ago

Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs

Paper • 2503.06342 • Published Mar 8, 2025 • 1

upvoted a paper about 1 year ago

Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs

Paper • 2503.06342 • Published Mar 8, 2025 • 1

liked 2 models over 1 year ago

SeerAttention/SeerAttention-Llama-3.1-8B-AttnGates

Text Generation • Updated Mar 3, 2025 • 7.49k • 4

SeerAttention/SeerAttention-Llama-3.1-8B

Text Generation • 8B • Updated Feb 16, 2025 • 18 • 4

authored 3 papers over 1 year ago

EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology

Paper • 2404.11887 • Published Apr 18, 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Paper • 2408.06003 • Published Aug 12, 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Paper • 2410.13276 • Published Oct 17, 2024 • 29
