Implied Future

AGI is coming.

There are no secrets left, only straight lines. Yet few have the clarity to measure them and the conviction to follow their predictions.


We are a measurement and prediction company tracking the arrival of AGI.