Sun. Mar 10, 2024

Version Your Data Lakehouse Like Your Software With Nessie

Data Engineering Podcast

Summary: Data lakehouse architectures are gaining popularity because of the flexibility and cost-effectiveness they offer. The catalog is the link that bridges the gap between data lake and warehouse capabilities. Its primary purpose is to tell the query engine what data exists and where it lives, but the Nessie project aims to go beyond that simple utility.

Data Lake 147
Understanding Data Quality and Why Teams Struggle with It

Towards Data Science

Data quality: the catch-all term for business logic, reliability, validity, and consistency

IT 71
Data Engineering Weekly #162

Data Engineering Weekly

Editor’s Note: Chennai Meetup Wrap-Up & Preparation work started for DEWCon. I am so grateful for the enthusiastic participants who made our Chennai Data Heroes (Community for Data Folks) meetup vibrant! Big thanks to our insightful speakers: Hareshkumar Selvakumar talked about his work on Data Products for PayPal. Pradheep Arjunan shared insights on AZ's journey from on-prem to cloud data warehouses.