How Gradient created an open LLM with a million-token context window

Jun 24, 2024 | Technology

In a recent collaboration, AI startup Gradient and cloud compute platform Crusoe extended the “context window” of Llama-3 models to 1 million tokens. The context window determines the number of input and output tokens a large language model (LLM) can process. 
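To make the definition concrete, here is a minimal Python sketch of how a context window budgets a request. The 8,192-token figure is Llama-3's base window before Gradient's extension; the tokenizer checkpoint and the helper function are illustrative assumptions, not details from the article.

```python
from transformers import AutoTokenizer

# Llama-3's base context window is 8,192 tokens; Gradient's work extends
# it to 1 million. The checkpoint name is an assumption for this sketch.
CONTEXT_WINDOW = 8192
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

def fits_in_context(prompt: str, max_new_tokens: int = 512) -> bool:
    """Return True if the prompt plus the planned output fits the window.

    Input and output tokens draw from the same budget, which is why a
    larger window allows both longer inputs and longer generations.
    """
    n_input = len(tokenizer.encode(prompt))
    return n_input + max_new_tokens <= CONTEXT_WINDOW
```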

Big tech companies and frontier AI labs are locked in a race to extend the context windows of their LLMs. In less than a year, models have gone from supporting a few thousand tokens to more than a million. However, LLMs with very long context windows are mostly limited to private models such as Anthropic Claude (200k tokens), OpenAI GPT-4 (128k tokens), and Google Gemini (1 million tokens).

The race to create open-source models with long context windows could reshuffle the LLM market and unlock applications that are not possible with private models.

The need for open-source long-context LLMs

Gradient works with enterprise customers who want to integrate LLMs into their workflows. Even before Llama-3 came out, the company was running into context-window limitations in the projects it was building for its customers.

For example, language models that help in programming tasks, often referred to as “coding copilots,” have become an important development tool in many companies. Standard coding copilots can generate small bits of code at a time, such as a function. Now, companies are looking to extend those capabilities to creating entire modules of code.

“In order to do that, the language model needs to be able to reference an entire code base or maybe multiple GitHub code repositories,” Leo Pekelis, Chief Scientist at Gradient AI, told VentureBeat. 

One way to do it would be to provide the codebase to the LLM piecemeal and make multiple …
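A minimal sketch of that piecemeal approach, in Python. Here `complete()` is a placeholder for any LLM completion API, and the chunk budget and chars-per-token heuristic are illustrative assumptions:

```python
from pathlib import Path

CHUNK_TOKENS = 6000  # illustrative budget that stays under an 8K window

def complete(prompt: str) -> str:
    """Placeholder for a call to any LLM completion API."""
    raise NotImplementedError

def answer_over_repo(repo_dir: str, question: str) -> str:
    """Feed source files to the model one chunk at a time, then merge.

    With a 1M-token window, this loop collapses to a single call that
    passes the entire repository as one prompt.
    """
    notes, chunk, size = [], [], 0
    for path in sorted(Path(repo_dir).rglob("*.py")):
        text = path.read_text(errors="ignore")
        approx_tokens = len(text) // 4  # rough chars-per-token heuristic
        if chunk and size + approx_tokens > CHUNK_TOKENS:
            notes.append(complete("\n".join(chunk) + f"\n\nQuestion: {question}"))
            chunk, size = [], 0
        chunk.append(text)
        size += approx_tokens
    if chunk:
        notes.append(complete("\n".join(chunk) + f"\n\nQuestion: {question}"))
    # A final call merges the per-chunk answers.
    return complete("Combine these partial answers:\n" + "\n---\n".join(notes))
```

Each extra call adds cost and latency, and the model never sees the whole codebase at once, which is the limitation a long context window removes.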
