KDEformer: Accelerating Transformers via Kernel Density Estimation • Paper • 2302.02451 • Published Feb 5, 2023
SubGen: Token Generation in Sublinear Time and Memory • Paper • 2402.06082 • Published Feb 8, 2024
HyperAttention: Long-context Attention in Near-Linear Time • Paper • 2310.05869 • Published Oct 9, 2023