Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.

Data Lake 262

Snowflake Ventures Invests in Anomalo for Advanced Data Quality

Snowflake

In today's data-driven world, organizations depend on high-quality data to drive accurate analytics and machine learning models. But poor data quality (gaps, inconsistencies and errors) can undermine even the most sophisticated data and AI initiatives.

How Meta discovers data flows via lineage at scale

Engineering at Meta

In order to build high-quality data lineage, we developed different techniques to collect data flow signals across different technology stacks: static code analysis for different languages (Hack, C++, Python, etc.), runtime instrumentation, and input and output data matching.

Low Code And High Quality Data Engineering For The Whole Organization With Prophecy

Data Engineering Podcast

Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. No more scripts, just SQL.

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.

SQL 173

Modern Customer Data Platform Principles

Data Engineering Podcast

In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).

Data Lake 147

X-Ray Vision For Your Flink Stream Processing With Datorios

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake.

Process 147