Attention Is All You Need | NVDA $875.42 | GPT-4 | MFU: 54.2% | BERT: Pre-training | AVGO $1,234.56 | Claude-3 | FLOPS: 2.1e23 | Language Models are Few-Shot Learners | AMD $142.56 | LLaMA-2 | Perplexity: 2.87 | Attention Is All You Need | MRVL $78.34 | GPT-4 | MFU: 54.2% | BERT: Pre-training | GEV $45.67 | Claude-3 | FLOPS: 2.1e23 | Language Models are Few-Shot Learners | SMNEY $89.23 | LLaMA-2 | Perplexity: 2.87
PaLM-2 | Tokens/sec: 1.2M | Scaling Laws for Neural Language Models | SMNEY $89.23 | Gemini | BF16 Precision | Training language models | AMZN $178.45 | Mixtral-8x7B | Latency: 45ms | Constitutional AI | GOOG $167.89 | PaLM-2 | Tokens/sec: 1.2M | Scaling Laws | ORCL $134.23 | Gemini | BF16 Precision | Training language models | META $312.94 | Mixtral-8x7B | Latency: 45ms | Constitutional AI | MSFT $425.67