LLM Timeline

Timeline Color Key

Version: 0.42.0 (Work in Progress)

Foundation
Decoder-Only Models
Encoder-Only Models
Encoder-Decoder Models
Mixture-of-Experts
Open-Source Models
Alignment Techniques
Theoretical Advances
Multimodal Models
Hybrid Approaches
2017 2018 2019 2020 2021 2022 2023 2024 2025

"Attention Is All You Nee...
GPT-1: Generative Pre-Tra...
BERT: Bidirectional Encod...
GPT-2: Scaling Up
T5: Text-to-Text Transfer...
GPT-3: Few-Shot Learning
Retrieval-Augmented Gener...
Scaling Laws for Neural L...
Switch Transformer (MoE)
FLAN: Instruction Tuning
InstructGPT: RLHF Alignme...
Chinchilla: Compute-Optim...
ChatGPT: Conversational I...
Constitutional AI
Mamba: State Space Models
DALL-E 3
Gemini Nano: On-Device Mo...
Megatron-Turing NLG
PaLM 2
Claude 1
Falcon LLM
Midjourney
Meta's OPT: First Major O...
Google's PaLM
LLaMA 1: Meta's Open Rese...
Stanford Alpaca
GPT-4: Multimodal Capabil...
Anthropic Claude 2
LLaMA 2: Commercial Open-...
Google's Gemini
LLaMA 3: Continued Scalin...
Claude 3 Family: Opus, So...
GPT-4o: Omni Model
Claude 3.5: Sonnet and Ha...
GPT-4 Turbo with Vision
Mistral AI Models
Microsoft Phi Series
Claude 3.7 Sonnet
CLIP: Contrastive Languag...
OpenAI Sora: World Simula...
LLaMA 4: Mixture-of-Exper...