GSA-FT-Qwen2-7B-Instruct-chunk8-chunk4

This model is a supervised fine-tune of gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk8-chunk4, trained with Gist Sparse Attention (GSA) using the chunk8-chunk4 chunk-size configuration.

Paper

GSA: Gist Sparse Attention via Learnable Compression and Selective Unfolding

Field	Value
Base model	gist-sparse-attention/GSA-PT-Qwen2-7B-Instruct-chunk8-chunk4
Training type	Supervised Fine-Tuning
Chunk size	chunk8-chunk4
Architecture	Qwen2-7B

Safetensors

Model size	333k params
Tensor type	BF16
