
Data Lineage: Understanding Data Lineage at Scale with Julien Le Dem



Big Data has exploded over the past decade as cloud computing and more efficient hardware have made scaling essentially limitless. Products like Uber revolve entirely around analyzing data to provide rides. An EMC/IDC study estimated approximately 5.2 TB of data for every person in 2020. That estimate was made before the transition to remote work, so the actual figure is likely much higher.


The term “data lineage” refers to the collection, origin, storage, transfer, and use of data over time. Given the size of the Big Data industry and related industries, maintaining a thorough data lineage, even within small companies, can be very difficult. It becomes especially challenging at scale. What innovative tools make understanding all this information possible? Can data really continue growing at this rate?


In this episode, we talk with Julien Le Dem, CTO and Co-Founder at Datakin. We discuss the challenges, available tools, and future of big data and data lineage.


Sponsorship inquiries: sponsor@softwareengineeringdaily.com


The post Data Lineage: Understanding Data Lineage at Scale with Julien Le Dem appeared first on Software Engineering Daily.


Yesterday

React Remix with Ryan Florence

Remix is a full-stack, open-source web framework that was developed by the creators of the popular R…
2025-03-18

Turing Award Special: A Conversation with Jack Dongarra

Jack Dongarra is an American computer scientist who is celebrated for his pioneering contributions t…
2025-03-13

Quantum Computing at Rigetti with David Rivas

Rigetti Computing is an American company specializing in quantum computing, founded in 2013. The com…
2025-03-11

The State of the Ethereum Blockchain with Andrew Koller

Ethereum is a decentralized blockchain platform that was created by Vitalik Buterin and Gavin Wood i…
2025-03-06

StackHawk and Shift-Left API Security with Scott Gerlach

APIs are a fundamental part of modern software systems and enable communication between services, ap…
2025-03-04

NVIDIA RAPIDS and Open Source ML Acceleration with Chris Deotte and Jean-Francois Puget

NVIDIA RAPIDS is an open-source suite of GPU-accelerated data science and AI libraries. It leverages…

Software Engineering Daily

Technical interviews about software topics.

