Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2203.02155

paper digestion

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 108
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation

Paper • 2510.23581 • Published Oct 27, 2025 • 41

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 108
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 46

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248
The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 117

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 93
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 15

Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 46
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

LLM Tech Reports

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 321
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 126
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Paper • 2006.03654 • Published Jun 5, 2020 • 3
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 9
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

Papers (I want) To Read

A list of papers on my reading list.

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

Paper • 2304.09842 • Published Apr 19, 2023 • 2
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 31
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 5
Reflexion: Language Agents with Verbal Reinforcement Learning

Paper • 2303.11366 • Published Mar 20, 2023 • 5

paper digestion

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 108
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation

Paper • 2510.23581 • Published Oct 27, 2025 • 41

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 627
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 46
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 108
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 46

LLM Tech Reports

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 321
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 126
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248
The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 117

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Paper • 2006.03654 • Published Jun 5, 2020 • 3
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 9
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 93
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 15

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

Papers (I want) To Read

A list of papers on my reading list.

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

Paper • 2304.09842 • Published Apr 19, 2023 • 2
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 31
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 5
Reflexion: Language Agents with Verbal Reinforcement Learning

Paper • 2303.11366 • Published Mar 20, 2023 • 5

Previous
1
2
3
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs