Collections
Discover the best community collections!
Collections including paper arxiv:2402.09668

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 42
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 74
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
  Paper • 2407.07523 • Published • 6
- Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
  Paper • 2407.12327 • Published • 79

- Pre-training Small Base LMs with Fewer Tokens
  Paper • 2404.08634 • Published • 35
- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 20
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 42
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
  Paper • 2404.06395 • Published • 24

- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 126
- Scaling Laws for Downstream Task Performance of Large Language Models
  Paper • 2402.04177 • Published • 20
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 77
- Orca-Math: Unlocking the potential of SLMs in Grade School Math
  Paper • 2402.14830 • Published • 25

- Watermarking Makes Language Models Radioactive
  Paper • 2402.14904 • Published • 24
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 22
- GPTVQ: The Blessing of Dimensionality for LLM Quantization
  Paper • 2402.15319 • Published • 22
- DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
  Paper • 2402.11929 • Published • 11

- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 81
- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 41
- Context-Aware Meta-Learning
  Paper • 2310.10971 • Published • 17

- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 42
- A Survey on Data Selection for LLM Instruction Tuning
  Paper • 2402.05123 • Published • 3
- Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
  Paper • 2409.12941 • Published • 24
- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
  Paper • 2503.24290 • Published • 62

- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 81
- TinyGSM: achieving >80% on GSM8k with small language models
  Paper • 2312.09241 • Published • 40
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 42
- Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation
  Paper • 2305.14386 • Published