tomg-group-umd's Collections: Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper
• 2511.07384
• Published
• 19
smcleish/Recurrent-Llama-3.2-train-recurrence-32
Text Generation
• 1B • Updated
• 384
• 1
smcleish/Recurrent-Llama-3.2-train-recurrence-16
Text Generation
• 1B • Updated
• 2
smcleish/Recurrent-Llama-3.2-train-recurrence-8
Text Generation
• 1B • Updated
• 1
smcleish/Recurrent-Llama-3.2-train-recurrence-4
Text Generation
• 1B • Updated
• 6
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32
Text Generation
• 0.8B • Updated
• 228
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16
Text Generation
• 0.8B • Updated
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8
Text Generation
• 0.8B • Updated
• 7
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4
Text Generation
• 0.8B • Updated
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32
Text Generation
• 1B • Updated
• 307
• 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16
Text Generation
• 1B • Updated
• 1
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8
Text Generation
• 1B • Updated
• 4
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4
Text Generation
• 1B • Updated
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase
Text Generation
• 0.8B • Updated
• 2
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase
Text Generation
• 0.8B • Updated
smcleish/Recurrent-Llama-3.2-untrained
Text Generation
• 1B • Updated
smcleish/Recurrent-TinyLlama-3T-untrained
Text Generation
• 0.8B • Updated
• 2
smcleish/Recurrent-OLMo-2-0425-untrained
Text Generation
• 1B • Updated
• 24
smcleish/Recurrent-Llama-3.2-2-4-2-untrained
Text Generation
• 1B • Updated
• 1
smcleish/retrofitting-llama-fineweb-edu-tokenized
• Updated
• 332M • 279