Exclusive: AI startup Tenyx’s fine-tuned open-source Llama 3 model outperforms GPT-4

May 7, 2024 | Technology

In an exclusive interview with VentureBeat, Itamar Arel, founder and CEO of AI startup Tenyx, revealed a breakthrough in natural language processing: Tenyx has fine-tuned Meta’s open-source Llama 3 language model (released as Llama3-TenyxChat-70B) to outperform OpenAI’s GPT-4 in certain domains, which the company says marks the first time an open-source model has surpassed the proprietary gold standard.

“We developed this fine-tuning technology that allows us to take a foundational model and sort of polish it or train it beyond what it was trained on,” Arel explained. “What we’ve been getting more and more excited about is that we could take that technology, which allows us essentially to exploit some redundancy in these large models, to allow for what’s probably better called continual learning or incremental learning.”

A radial chart shows the Tenyx-optimized Llama 3 model outperforming GPT-4 in math and coding while surpassing the base Llama 3 model across all capabilities, a first for an open-source AI model according to Tenyx founder Itamar Arel. (Image Credit: Tenyx)

Overcoming ‘catastrophic forgetting’

Tenyx’s novel approach to fine-tuning tackles the issue of “catastrophic forgetting,” where a model loses previously learned knowledge when exposed to new data. By selectively updating only a small portion of the model’s parameters, Tenyx can efficiently train the model on new information without compromising its existing capabilities.

“If you end up changing, say, just 5% of the model parameters, and everything else stays the same, you’re able to do so more aggressively without running the risk that you’re going to distort other things,” Arel said. This selective parameter updating method has also enabled Tenyx to achieve remarkably fast training times, fine-tuning the 70-billion-parameter Llama-3 model in just 15 hours using 100 GPUs.
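Tenyx has not published which parameters it updates or how they are selected, but the idea Arel describes can be sketched generically. The minimal PyTorch example below freezes a model and then unfreezes roughly 5% of its parameters so that only that subset is trained; the 5% budget comes from Arel’s example above, while the `freeze_all_but_fraction` helper and its last-layers-first heuristic are illustrative assumptions, not Tenyx’s method.

```python
# Illustrative sketch of selective parameter updating in PyTorch.
# Assumptions: the 5% budget follows Arel's example; the
# "unfreeze last-registered tensors first" heuristic and the
# helper name are hypothetical, not Tenyx's published technique.
import torch
from torch import nn

def freeze_all_but_fraction(model: nn.Module, fraction: float = 0.05) -> int:
    """Freeze every parameter, then unfreeze tensors (last-registered
    first) until roughly `fraction` of all elements are trainable."""
    params = list(model.parameters())
    budget = int(sum(p.numel() for p in params) * fraction)

    for p in params:
        p.requires_grad = False  # frozen weights stay bit-identical

    unfrozen = 0
    for p in reversed(params):
        if unfrozen + p.numel() > budget:
            continue  # tensor exceeds the remaining budget; try smaller ones
        p.requires_grad = True
        unfrozen += p.numel()
    return unfrozen

# Toy usage: only the unfrozen subset receives gradient updates.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
print(freeze_all_but_fraction(model, fraction=0.05), "trainable elements")
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5)
```

Because the optimizer never touches the frozen 95%, those weights remain exactly as the base model shipped them, which is what guards against catastrophic forgetting while letting the small trainable subset be updated aggressively.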

At the time of release, Llama3-TenyxChat-70B is the highest-ranked open-source model on the MT-Bench evaluation available for download …
