Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2404.14047

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 630
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107
TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14, 2024 • 43

Papers - University - Beihang University

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models

Paper • 2109.10282 • Published Sep 21, 2021 • 13
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45
McEval: Massively Multilingual Code Evaluation

Paper • 2406.07436 • Published Jun 11, 2024 • 41

Papers - University - Hong Kong University of Science and Te

Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss

Paper • 2404.02731 • Published Apr 3, 2024 • 1
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 19
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Paper • 2404.03204 • Published Apr 4, 2024 • 9
Adapting LLaMA Decoder to Vision Transformer

Paper • 2404.06773 • Published Apr 10, 2024 • 19

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Paper • 2312.13964 • Published Dec 21, 2023 • 19
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 264
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 76
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 19

LLaMA3-Quantization

This is the official quantized models collection of “How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study”

Efficient-ML/LLaMA-3-8B-GPTQ-4bit-b128

Updated Apr 21, 2024 • 3
Efficient-ML/LLaMA-3-8B-SmoothQuant-4bit-4bit

Text Generation • 8B • Updated Apr 22, 2024 • 2
Efficient-ML/LLaMA-3-8B-AWQ-4bit-b128

Text Generation • Updated Apr 28, 2024 • 3
Efficient-ML/LLaMA-3-8B-SmoothQuant-8bit-8bit

Text Generation • 8B • Updated Apr 22, 2024 • 2

Papers - ETH Zurich

I-Design: Personalized LLM Interior Designer

Paper • 2404.02838 • Published Apr 3, 2024 • 2
Scaling MLPs: A Tale of Inductive Bias

Paper • 2306.13575 • Published Jun 23, 2023 • 17
Fast Feedforward Networks

Paper • 2308.14711 • Published Aug 28, 2023 • 3
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13, 2024 • 15
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 25
RLVF: Learning from Verbal Feedback without Overgeneralization

Paper • 2402.10893 • Published Feb 16, 2024 • 12
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21, 2024 • 13

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 630
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107
TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14, 2024 • 43

LLaMA3-Quantization

This is the official quantized models collection of “How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study”

Efficient-ML/LLaMA-3-8B-GPTQ-4bit-b128

Updated Apr 21, 2024 • 3
Efficient-ML/LLaMA-3-8B-SmoothQuant-4bit-4bit

Text Generation • 8B • Updated Apr 22, 2024 • 2
Efficient-ML/LLaMA-3-8B-AWQ-4bit-b128

Text Generation • Updated Apr 28, 2024 • 3
Efficient-ML/LLaMA-3-8B-SmoothQuant-8bit-8bit

Text Generation • 8B • Updated Apr 22, 2024 • 2

Papers - University - Beihang University

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models

Paper • 2109.10282 • Published Sep 21, 2021 • 13
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45
McEval: Massively Multilingual Code Evaluation

Paper • 2406.07436 • Published Jun 11, 2024 • 41

Papers - ETH Zurich

I-Design: Personalized LLM Interior Designer

Paper • 2404.02838 • Published Apr 3, 2024 • 2
Scaling MLPs: A Tale of Inductive Bias

Paper • 2306.13575 • Published Jun 23, 2023 • 17
Fast Feedforward Networks

Paper • 2308.14711 • Published Aug 28, 2023 • 3
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45

Papers - University - Hong Kong University of Science and Te

Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss

Paper • 2404.02731 • Published Apr 3, 2024 • 1
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 19
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Paper • 2404.03204 • Published Apr 4, 2024 • 9
Adapting LLaMA Decoder to Vision Transformer

Paper • 2404.06773 • Published Apr 10, 2024 • 19

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13, 2024 • 15
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 25
RLVF: Learning from Verbal Feedback without Overgeneralization

Paper • 2402.10893 • Published Feb 16, 2024 • 12
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21, 2024 • 13

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Paper • 2312.13964 • Published Dec 21, 2023 • 19
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 264
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 76
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 19

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs