Sakana AI’s evolutionary algorithm discovers new architectures for generative models

Mar 26, 2024 | Technology

A new technique developed by the much-hyped Tokyo-based startup Sakana AI automatically creates generative models. The technique, called Evolutionary Model Merge, is inspired by natural selection and combines parts of existing models to create more capable ones.

Sakana AI announced its existence in August 2023. Its co-founders include esteemed AI researchers and former Googlers David Ha and Llion Jones, a co-author of “Attention Is All You Need,” the paper that launched the current generative AI era.

Sakana’s Evolutionary Model Merge technique lets developers and organizations create and discover new models cost-effectively, without spending huge amounts to train and fine-tune their own.

Sakana has released a large language model (LLM) and a vision-language model (VLM) created through Evolutionary Model Merge.

In a post on March 21, 2024, Sakana AI (@SakanaAILabs) wrote: “Introducing Evolutionary Model Merge: A new approach bringing us closer to automating foundation model development. We use evolution to find great ways of combining open-source models, building new powerful foundation models with user-specified abilities!” https://t.co/G0EyM7pztr

Model merging

Training generative models is an expensive and complicated process that most organizations can’t afford. But with the release of open models such as Llama 2 and Mistral, developers have found innovative ways to improve them at low cost.

One of these methods is “model merging,” where different components of two or more pre-trained models are combined to create a new one. If done correctly, the merged model can potentially inherit the strengths and capabilities of the source models.
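To make the idea concrete, here is a minimal sketch of one simple way to merge: linearly interpolating the parameters of two models that share the same architecture. This is an illustration only and is not described in the article as Sakana AI’s method. The function name `merge_weights`, the `alpha` mixing coefficient, and the toy parameter values are all assumptions made for the example.

```python
from typing import Dict, List

# Assumed toy representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def merge_weights(model_a: Weights, model_b: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two models' parameters: alpha * a + (1 - alpha) * b."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same parameter names")
    merged: Weights = {}
    for name, values_a in model_a.items():
        values_b = model_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [alpha * a + (1.0 - alpha) * b for a, b in zip(values_a, values_b)]
    return merged


# Two hypothetical "models" with identical parameter layouts (values are made up).
math_model = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
chat_model = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}

merged = merge_weights(math_model, chat_model, alpha=0.5)
print(merged)  # {'layer0.weight': [2.0, 3.0], 'layer0.bias': [0.5]}
```

No gradient step appears anywhere in the sketch: the merged parameters are computed directly from the source models. Choosing a good `alpha`, or deciding which components to take from which model, is the part that is usually tuned by hand.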

Interestingly, merged models do not need additional training, which makes merging very cost-effective. In fact, many of the top-performing models on Open LLM leaderboards are merged versions of popular base models.

“What we are witnessing is a large community of researchers, hackers, enthusiasts and artists alike going about their own ways of developing new foundation models by fine-tuning existing models on specialized datasets, or merging existing models together,” Sakana AI’s researchers write on the company’s blog.

With more than 500,000 models available on Hugging Face, model merging offers vast possibilities for researchers, developers, and organizations to explore and create new models at a very low cost. However, model merging relies heavily on intuition and domain knowledge. 

Evolutionary Model Merge

Sakana AI’s ne …
