Apple Partners With Nvidia to Improve Performance Speed of Its AI Models

Apple used Nvidia’s inference acceleration framework for its open-source Recurrent Drafter technique for AI models.

Highlights
  • Apple published a paper on Recurrent Drafter earlier this year
  • Nvidia’s TensorRT-LLM acceleration framework was used to speed it up
  • Apple claims the process resulted in 2.7x faster token generation
Apple had earlier stated the Recurrent Drafter can improve token generation by up to 3.5 tokens per step

Photo Credit: Reuters

Apple is partnering with Nvidia in an effort to speed up artificial intelligence (AI) models. On Wednesday, the Cupertino-based tech giant announced that it has been researching inference acceleration on Nvidia's platform to see whether the efficiency and latency of a large language model (LLM) can be improved at the same time. The iPhone maker used a technique dubbed Recurrent Drafter (ReDrafter), which it published in a research paper earlier this year, and combined it with the Nvidia TensorRT-LLM inference acceleration framework.

Apple Uses Nvidia Platform to Improve AI Performance

In a blog post, Apple researchers detailed the new collaboration with Nvidia on LLM performance and the results it achieved. The company highlighted that it has been researching how to improve inference efficiency without increasing latency in AI models.

Inference in machine learning refers to the process of using a trained model to make predictions, decisions, or conclusions from a given input. Put simply, it is the step in which an AI model processes a prompt it has not seen before and produces an output.

Earlier this year, Apple published and open-sourced the ReDrafter technique, which brings a new approach to speculative decoding, where a smaller draft model proposes several tokens ahead and the main LLM verifies them. ReDrafter uses a recurrent neural network (RNN) draft model and combines beam search (a mechanism that explores several candidate sequences instead of only one) with dynamic tree attention (an attention mechanism applied to tree-structured data). The researchers stated that it can generate up to 3.5 tokens per generation step.
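
To make the idea of speculative decoding concrete, here is a minimal Python sketch of the general draft-and-verify loop. It is not Apple's ReDrafter implementation: it omits the RNN draft head, beam search, and dynamic tree attention, and the two "models" (`target_next` and `draft_next`) are hypothetical toy functions invented for illustration. It only shows why several tokens can be accepted per step while the output stays identical to what the large model would produce on its own.

```python
from typing import List, Tuple

def target_next(ctx: List[int]) -> int:
    """Hypothetical 'large' model: a deterministic toy rule, not a real LLM."""
    return sum(ctx) % 7

def draft_next(ctx: List[int]) -> int:
    """Hypothetical 'small' draft model: agrees with the target most of the
    time, but is deliberately wrong whenever the context length is a multiple of 4."""
    guess = sum(ctx) % 7
    return guess if len(ctx) % 4 else (guess + 1) % 7

def greedy_decode(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: the target model emits one token per step."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_next(out))
    return out[len(prompt):]

def speculative_decode(prompt: List[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Draft k tokens, keep the prefix the target agrees with, then add one
    token from the target itself. Returns (new tokens, number of steps)."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < n_new:
        # 1. The draft model proposes k tokens autoregressively.
        ctx, proposal = list(out), []
        for _ in range(k):
            tok = draft_next(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2. The target checks each proposed token. A real system does this
        #    in one batched forward pass; a loop is used here for clarity.
        ctx, accepted = list(out), []
        for tok in proposal:
            expected = target_next(ctx)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong token
                break
            accepted.append(tok)
            ctx.append(tok)
        else:
            accepted.append(target_next(ctx))  # all accepted: one bonus token
        out.extend(accepted)
        steps += 1
    return out[len(prompt):len(prompt) + n_new], steps

tokens, steps = speculative_decode([1, 2, 3], n_new=12)
print(tokens == greedy_decode([1, 2, 3], 12), steps)
```

In this toy setup the speculative loop needs fewer steps than the 12 the baseline takes, because each step can accept several draft tokens at once. How many tokens are accepted per step in practice depends on how often the draft model agrees with the main model.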

While the company was able to improve efficiency to a certain degree by combining the two techniques, Apple highlighted that there was no significant boost to speed. To solve this, researchers integrated ReDrafter into the Nvidia TensorRT-LLM inference acceleration framework.

As part of the collaboration, Nvidia added new operators and exposed existing ones to improve the speculative decoding process. The post claimed that using ReDrafter on the Nvidia platform produced a 2.7x speed-up in generated tokens per second for greedy decoding (a decoding strategy that picks the single most probable token at each step).
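
The difference between greedy decoding and the beam search mentioned earlier can be sketched in a few lines of Python. This is an illustrative toy only: the `PROBS` table is a made-up stand-in for a model's next-token probabilities, and neither function reflects how TensorRT-LLM or ReDrafter implements these strategies.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy "model": next-token probabilities depend only on the last token.
PROBS: Dict[str, Dict[str, float]] = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}

def greedy_decode(start: str = "<s>", max_len: int = 5) -> Tuple[List[str], float]:
    """Pick the single most probable next token at every step."""
    seq, logp = [start], 0.0
    while seq[-1] != "</s>" and len(seq) < max_len:
        tok, p = max(PROBS[seq[-1]].items(), key=lambda kv: kv[1])
        seq.append(tok)
        logp += math.log(p)
    return seq, logp

def beam_search(start: str = "<s>", beam_width: int = 2,
                max_len: int = 5) -> Tuple[List[str], float]:
    """Keep the beam_width best partial sequences at every step."""
    beams: List[Tuple[List[str], float]] = [([start], 0.0)]
    for _ in range(max_len - 1):
        candidates = []
        for seq, logp in beams:
            if seq[-1] == "</s>":
                candidates.append((seq, logp))  # finished sequences carry over
                continue
            for tok, p in PROBS[seq[-1]].items():
                candidates.append((seq + [tok], logp + math.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]

greedy_seq, greedy_logp = greedy_decode()
beam_seq, beam_logp = beam_search()
print(greedy_seq, round(math.exp(greedy_logp), 2))
print(beam_seq, round(math.exp(beam_logp), 2))
```

In this toy table, greedy decoding commits to "a" because it looks best at the first step, while beam search keeps "b" alive and ends up with a more probable full sequence. Greedy decoding is cheaper per step, which is one reason it is a common baseline for throughput figures such as tokens per second.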

Apple highlighted that this technology can be used to reduce the latency of AI processing while also using fewer GPUs and consuming less power.

Akash Dutta

Akash Dutta is a Senior Sub Editor at Gadgets 360. He is particularly interested in the social impact of techn... more
