LakeFS brings branching to data lakes

by | Jun 27, 2022 | Technology

Can enterprises find a better way to organize the relentless onslaught of data? LakeFS thinks the answer is versioning à la Git. The company lets teams create and track different versions of their data, essentially imitating the process that developers use to organize their code.

On June 27, the company announced the general availability of its service, LakeFS Cloud. Teams will be able to use it to follow the evolution of their data just as they do with different versions of their code.

“LakeFS is actually an infrastructure. It sits on top of the data,” explains Einat Orr, a cofounder and CEO of LakeFS. “It is an interface between the data lake and the applications. So any application can enjoy the Git-like operations that LakeFS offers, and the data is managed through one consistent interface for the organization.”

For a long time, developers have treated software and data differently. Programmers created versioning systems like Git to help organize software development by tracking changes both small and large. Teams rely on such tools to keep the work of different programmers separate until it’s time to merge and ship a final version. Software teams routinely work with dozens, hundreds or even thousands of different versions arranged in a metaphorical tree with branches.

Data, though, has generally been stored in separate chunks. Developers often make complete copies of different snapshots or backups taken at different times. Tracking the differences was difficult, and the proliferation of copies created confusion and large storage bills.
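
To make the contrast concrete, here is a minimal sketch of how Git-style versioning can avoid full copies. It is a toy model written for illustration only: the `ToyRepo` class and its methods are hypothetical and are not the LakeFS API or its actual implementation. The sketch assumes each file's content is stored once under its hash, and that a branch is just a pointer to a snapshot that maps paths to those hashes.

```python
import hashlib


class ToyRepo:
    """Toy Git-style versioning for data objects (hypothetical, not the LakeFS API)."""

    def __init__(self):
        self.blobs = {}                  # content hash -> bytes; each distinct content stored once
        self.commits = {}                # commit id -> {path: content hash}
        self.branches = {"main": None}   # branch name -> commit id (None means no commits yet)
        self._next_id = 0

    def commit(self, branch, changes):
        """Record a new snapshot on `branch`; `changes` maps path -> bytes."""
        parent = self.branches[branch]
        # Start from the parent's path-to-hash mapping; no file data is copied.
        snapshot = dict(self.commits[parent]) if parent is not None else {}
        for path, data in changes.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # store the bytes only if they are new
            snapshot[path] = digest
        commit_id = self._next_id
        self._next_id += 1
        self.commits[commit_id] = snapshot
        self.branches[branch] = commit_id
        return commit_id

    def branch(self, name, source):
        """Create a branch by copying a pointer, not the data."""
        self.branches[name] = self.branches[source]

    def diff(self, a, b):
        """Return the sorted paths whose content differs between two branches."""
        snap_a = self.commits.get(self.branches[a], {})
        snap_b = self.commits.get(self.branches[b], {})
        return sorted(p for p in snap_a.keys() | snap_b.keys()
                      if snap_a.get(p) != snap_b.get(p))
```

In this sketch, creating an `experiment` branch from `main` and changing one file on it stores only the changed file's new contents, and `diff("main", "experiment")` reports just that path. A snapshot-by-copy approach would instead duplicate every file for each version.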

“The cloud never warned us about the data getting clouded. As the blessing of infinite storage quickly became an unmanageable mess, there is a need for technologies like LakeFS to make data accessible again,” explained Sivan Bercovici, CTO at medical diagnostics company K …
