A little over a year ago, using large language models (LLMs) to generate software code was a cutting-edge scientific experiment that had yet to prove its worth. Today, code generation has become one of the most successful applications of LLMs.
Many developers now use LLM-powered tools such as GitHub Copilot to improve productivity, stay in the flow and make their work more enjoyable. However, as LLM-powered coding matures, we’re also beginning to discover the challenges it must overcome, including licensing, transparency, security and control.
The Stack, a dataset of source code recently released by the BigCode project, addresses some of these pain points. It also highlights some of the known barriers that remain to be resolved as artificial intelligence (AI)-powered code generation continues to move into the mainstream.
LLMs and code licensing
“The recent introduction of code LLMs has shown that they can make developers more productive and make software engineering accessible to people with less technical backgrounds,” Leandro von Werra, machine learning engineer at Hugging Face, told VentureBeat.
These language models can serve a variety of tasks. Programmers are using tools such as Copilot and Codex to write entire classes and functions from textual descriptions. This can be very useful for automating mundane parts of programming, such as setting up web servers, pulling information from databases or even writing Python code for a neural network and its training loop. According to von Werra, in the future, software engineers will be able to use LLMs to maintain legacy code written in an unfamiliar programming language.
However, the growing use of LLMs in coding has raised several concerns, including licensing issues. Models like Copilot generate code based on patterns they have learned from their training examples, some of which might be subject to restrictive licenses.
“Questions have been raised as to whether these AI models respect current open-source licenses—both for model training …