Zenlytic Is Building You A Better Coworker With AI Agents

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high-quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats, including Apache Iceberg, Hive, and Delta Lake.

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (a user-friendly SQL interface). Multiple open source projects and vendors have been working together to make this vision a reality.
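
To make the "SQL interface over lake storage" idea concrete, here is a minimal sketch that queries an Iceberg table through a Trino coordinator using the open source trino Python client. The host, user, catalog, schema, and table names are placeholder assumptions for illustration, not details from the episode.

import trino

# Connect to a Trino coordinator; the Iceberg connector is exposed as a catalog.
# All connection details below are assumed placeholder values.
conn = trino.dbapi.connect(
    host="localhost",
    port=8080,
    user="analyst",
    catalog="iceberg",   # assumed catalog backed by Iceberg tables on object storage
    schema="analytics",  # assumed schema
)

cur = conn.cursor()
# Ordinary warehouse-style SQL, even though the table is open-format files in the lake.
cur.execute("SELECT order_date, count(*) AS orders FROM orders GROUP BY order_date")
for row in cur.fetchall():
    print(row)

The point of the lakehouse pattern is that this query looks identical to one against a traditional warehouse, while the underlying data remains in open table formats on cheap object storage.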

Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary

Data Engineering Podcast

While there are numerous products available to provide that visibility, they each focus on different technologies and workflows. To bring observability to dbt projects, the team at Elementary embedded themselves into the workflow. How have the scope and goals of the project changed since you started working on it?

Unlocking Your dbt Projects With Practical Advice For Practitioners

Data Engineering Podcast

Summary: The dbt project has become overwhelmingly popular across analytics and data engineering teams. Dustin Dorsey and Cameron Cyr co-authored a practical guide to building your dbt project. In this episode they share their hard-won wisdom about how to build and scale your dbt projects.

Troubleshooting Kafka In Production

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high-quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics.

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

Summary: A core differentiator of Dagster in the data orchestration ecosystem is its focus on software-defined assets as a means of building declarative workflows. With the launch of Dagster+ as the redesigned commercial companion to the open source project, the team is investing in that capability with a suite of new features.
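
As a rough sketch of what software-defined assets mean in practice, the example below uses Dagster's open source @asset decorator; the asset names and toy data are made up for illustration and are not taken from the episode or from Dagster+ itself.

from dagster import asset, materialize

@asset
def raw_orders():
    # Stand-in extract step; a real asset would read from a source system.
    return [{"id": 1, "amount": 42.0}, {"id": 2, "amount": 17.5}]

@asset
def order_totals(raw_orders):
    # Dagster infers the dependency on raw_orders from the parameter name,
    # which is what makes the pipeline declarative rather than imperative.
    return sum(row["amount"] for row in raw_orders)

if __name__ == "__main__":
    # Materialize both assets in dependency order within the current process.
    result = materialize([raw_orders, order_totals])
    assert result.success

Instead of wiring tasks together by hand, you declare the data assets you want to exist and their dependencies, and the orchestrator works out the execution order.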

Making Email Better With AI At Shortwave

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high-quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics.
