Meta Llama 3 AI Models With 8B and 70B Parameters Launched, Said to Outperform Google’s Gemini 1.5 Pro

Meta will launch its large Llama 3 AI models with more than 400 billion parameters later this year.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 19 April 2024 13:06 IST

Highlights

Meta Llama 3 AI models have been integrated with Meta AI
The AI models will be available in Pre-trained and Instruct variants
Meta Llama 3 uses a tokeniser with a vocabulary of 128K tokens

Meta said Llama 3 is trained on publicly available data

Photo Credit: Meta

Meta introduced the next generation of its artificial intelligence (AI) models, Llama 3 8B and 70B, on Thursday. Short for Large Language Model Meta AI, Llama 3 comes with improved capabilities over its predecessor. The company also adopted new training methods to optimise the efficiency of the models. Interestingly, the largest Llama 2 model had 70 billion parameters, but this time the company said its largest models will contain more than 400 billion parameters. Notably, a report last week said that Meta would unveil its smaller AI models in April and its larger models later in the summer.

Meta Llama 3 availability

Those interested in trying out the new AI models are in luck, as Meta is taking a community-first approach with Llama 3. The new foundation models will be openly available, just like previous models. Meta stated in its blog post, “Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.”

The list covers the major cloud, hosting, and hardware platforms, which should make it easier for enthusiasts to get their hands on the AI models. Meta has also integrated Llama 3 with its own Meta AI assistant, which can be accessed via Facebook Messenger, Instagram, and WhatsApp in supported countries.

Meta Llama 3 performance and architecture

On performance, the social media giant shared benchmark scores for both the pre-trained and instruct models of Llama 3. For reference, a pre-trained (base) model is trained to predict the next token on large amounts of text, whereas an instruct model is further fine-tuned to follow instructions and hold conversations. According to data shared by the company, the pre-trained Llama 3 70B model outscored Google's Gemini 1.0 Pro in the MMLU (79.5 vs 71.8), BIG-Bench Hard (81.3 vs 75.0), and DROP (79.7 vs 74.1) benchmarks, whereas the 70B Instruct model outscored the Gemini 1.5 Pro model in the MMLU, HumanEval, and GSM-8K benchmarks.

Meta has opted for a decoder-only transformer architecture for the new AI models but has made several improvements over the predecessor. Llama 3 now uses a tokeniser with a vocabulary of 128K tokens, and the company has adopted grouped query attention (GQA) to improve inference efficiency. In GQA, groups of query heads share a single key and value head, which reduces the memory and bandwidth the model needs while generating text. The social media giant has pre-trained the models on more than 15T tokens, which it claims to have sourced from publicly available data.
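
As a rough illustration of why grouped query attention can help at inference time, the Python sketch below compares the size of the key/value cache when every query head has its own key/value head against a setup where several query heads share one. The layer count, head counts, head size, sequence length, and 16-bit storage are illustrative assumptions and are not figures from the article or from Meta's announcement; the function names are hypothetical.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head that a given query head reads from."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads  # query heads per shared key/value head
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size of the key/value cache for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative configuration (assumed values, not taken from the article).
N_LAYERS, N_Q_HEADS, N_KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 8, 128, 8192

# Standard multi-head attention: one key/value head per query head.
mha = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention: 32 query heads share 8 key/value heads.
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"Multi-head cache:    {mha / 2**30:.1f} GiB")   # 4.0 GiB
print(f"Grouped query cache: {gqa / 2**30:.1f} GiB")   # 1.0 GiB

# The first eight query heads map onto key/value heads 0 and 1.
print([kv_head_for_query_head(h, N_Q_HEADS, N_KV_HEADS) for h in range(8)])
# [0, 0, 0, 0, 1, 1, 1, 1]
```

Under these assumed numbers, sharing key/value heads cuts the cache to a quarter of its size, which is the kind of saving that makes serving long prompts cheaper.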
