In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offers the ease of use and execution speed of data warehouses with the effectively unlimited storage and scalability of data lakes.
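To make the pairing concrete, here is a minimal sketch that queries an Iceberg table through Trino's Python client. It assumes a running Trino cluster with an Iceberg catalog already configured; the host, schema, and table names are placeholders, not details from the episode.

```python
# A minimal sketch: querying an Iceberg table through Trino's Python client.
# Assumes a Trino coordinator with an Iceberg catalog named "iceberg";
# host, schema, and table names below are placeholders.
from trino.dbapi import connect

conn = connect(
    host="trino.example.com",   # hypothetical coordinator address
    port=8080,
    user="analyst",
    catalog="iceberg",          # Iceberg connector catalog
    schema="analytics",
)
cur = conn.cursor()

# Iceberg tables accept warehouse-style SQL while the data lives in object storage.
cur.execute("""
    SELECT order_date, count(*) AS orders
    FROM orders
    GROUP BY order_date
    ORDER BY order_date
""")
for row in cur.fetchall():
    print(row)
```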
In this episode Crux CTO Mark Etherington discusses the different costs involved in managing external data, how to think about the total return on investment for your data, and how the Crux platform is architected to reduce the toil involved in managing third-party data. When is Crux the wrong choice?
Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems.
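As a concrete illustration of that coordination pattern, here is a minimal sketch using the kafka-python client to publish an event and read it back; the broker address, topic, and payload are assumptions for the example, not details from the episode.

```python
# A minimal publish/subscribe sketch with kafka-python.
# Broker address, topic name, and event payload are placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("orders", {"order_id": 42, "status": "created"})
producer.flush()

consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    consumer_timeout_ms=5000,   # stop iterating if no messages arrive
)
for message in consumer:
    # Any number of downstream systems can read the same event stream independently.
    print(message.value)
    break
```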
Data lakes in various forms have been gaining significant popularity as a unified interface to an organization's analytics.
Different roles and tasks in the business need their own ways to access and analyze the data in the organization. To enable this while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem.
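As a rough illustration of the idea, the sketch below defines a metric once and compiles it to SQL for any consumer. It is a hand-rolled example rather than the API of any particular semantic-layer product, and the table, metric, and dimension names are invented.

```python
# A hand-rolled illustration of the semantic-layer idea: metrics are defined once,
# and every tool gets the same SQL regardless of how it asks. Not tied to any product;
# all names are invented.
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    table: str
    expression: str        # aggregate expression over the source table
    dimensions: list

METRICS = {
    "revenue": Metric(
        name="revenue",
        table="analytics.orders",
        expression="sum(amount)",
        dimensions=["order_date", "region"],
    )
}

def compile_metric(metric_name: str, group_by: list) -> str:
    """Turn a metric request into SQL so BI tools, notebooks, and APIs agree."""
    m = METRICS[metric_name]
    unknown = set(group_by) - set(m.dimensions)
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    cols = ", ".join(group_by)
    return (
        f"SELECT {cols}, {m.expression} AS {m.name} "
        f"FROM {m.table} GROUP BY {cols}"
    )

print(compile_metric("revenue", ["region"]))
# SELECT region, sum(amount) AS revenue FROM analytics.orders GROUP BY region
```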
In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform. Can you describe what role Trino and Iceberg play in Stripe's data architecture?
Summary Generative AI has rapidly transformed everything in the technology sector.
Nicola Askham found her way into data governance by accident and stayed because of the benefit she was able to provide by serving as a bridge between technology and the business. In this episode she shares the practical steps for implementing a data governance practice in your organization and the pitfalls to avoid.
As we approach 2025, data teams find themselves at a pivotal juncture. The rapid evolution of technology and the increasing demand for data-driven insights have placed immense pressure on these teams. In this blog post, we’ll explore key strategies that data teams should adopt to prepare for the year ahead.
This blog explores the significant advancements, challenges, and opportunities shaping data engineering in 2025, and why it is increasingly important for companies to stay up to date. In the fast-paced world of technology, data engineering services keep data-focused companies running.
He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this. Can you describe what Synq is and the story behind it?
Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. What is the overlap between knowledge graphs and "linked data products"?
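To show what tightly coupling metadata with raw information can look like, here is a small sketch that uses the rdflib library to attach type, label, lineage, and ownership metadata to a dataset as RDF triples; the namespace and dataset names are hypothetical.

```python
# A minimal linked-data sketch with rdflib: metadata about a dataset is stored as
# RDF triples so it can be linked to other entities. All names are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/data/")

g = Graph()
g.add((EX.orders, RDF.type, EX.Dataset))
g.add((EX.orders, RDFS.label, Literal("Orders fact table")))
g.add((EX.orders, EX.derivedFrom, EX.raw_orders))    # lineage as a first-class link
g.add((EX.orders, EX.owner, EX.analytics_team))

print(g.serialize(format="turtle"))
```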
In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
In this episode Pete Hunt, CEO of Dagster Labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units.
Summary Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. What do you have planned for the future of your academic research?
Identifying and accounting for those errors is a critical piece of building trust across the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they each focus on different technologies and workflows.
Summary A significant portion of data workflows involves storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.
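One common flavor of those error conditions is drift between a source table and its replicated or transformed copy. The sketch below is a hand-rolled comparison of row counts and a simple checksum between two tables; it illustrates the kind of check such tools automate and is not Datafold's own tooling. It uses an in-memory SQLite database so it runs as-is.

```python
# A hand-rolled illustration of detecting drift between a source table and its copy
# by comparing row counts and a cheap aggregate "checksum". Not Datafold's tooling;
# it just shows the category of check such tools automate.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_source (id INTEGER, amount INTEGER);
    CREATE TABLE orders_copy   (id INTEGER, amount INTEGER);
    INSERT INTO orders_source VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO orders_copy   VALUES (1, 100), (2, 250);          -- missing a row
""")

def summarize(table: str) -> tuple:
    # Row count plus a sum over a numeric column as a crude checksum.
    return conn.execute(
        f"SELECT count(*), coalesce(sum(amount), 0) FROM {table}"
    ).fetchone()

source, copy = summarize("orders_source"), summarize("orders_copy")
if source != copy:
    print(f"drift detected: source={source} copy={copy}")
else:
    print("tables match on count and checksum")
```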
Summary Artificial intelligence technologies promise to revolutionize business and produce new sources of value. Colleen Tartow has worked across all stages of the data lifecycle, and in this episode she shares her hard-earned wisdom about how to conduct an AI program for your organization.
In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process. What is the current state of the ecosystem for data sharing protocols/practices/platforms?
In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).
Can you start by sharing some of your experiences with data migration projects?
In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git.
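A rough sketch of what those Git-style semantics look like in practice is shown below, using Nessie's Spark SQL extensions from PySpark. It assumes a SparkSession already configured with an Iceberg-plus-Nessie catalog named `nessie`; the branch, schema, and table names are invented, and the exact statement syntax may vary across Nessie versions.

```python
# A sketch of Git-style branching on a lakehouse with Nessie's Spark SQL extensions.
# Assumes a SparkSession already configured with an Iceberg + Nessie catalog named
# "nessie"; table and branch names are invented, and syntax may differ by version.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("nessie-branching-sketch").getOrCreate()

# Create an isolated branch from main, analogous to `git checkout -b etl_job`.
spark.sql("CREATE BRANCH IF NOT EXISTS etl_job IN nessie FROM main")
spark.sql("USE REFERENCE etl_job IN nessie")

# Writes on the branch are invisible to readers of main until merged.
spark.sql("INSERT INTO nessie.analytics.orders VALUES (42, DATE '2024-01-01', 100.0)")

# After validation, merge the branch back, analogous to `git merge etl_job`.
spark.sql("MERGE BRANCH etl_job INTO main IN nessie")
```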
Andrei Tserakhau has dedicated his career to this problem, and in this episode he shares the lessons that he has learned and the work he is doing on his most recent data transfer system at DoubleCloud.
Advanced Data Transformation Techniques: For data engineers ready to push the boundaries, advanced data transformation techniques offer the tools to tackle complex data challenges and drive innovation. Data engineers should embrace continuous learning and explore new tools and methodologies to remain competitive.
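As one concrete illustration of the kind of technique meant here, the sketch below applies window-function-style transformations in pandas: a running revenue total per customer and each order's share of that customer's total, computed without row-by-row loops. The column names and data are invented for the example.

```python
# Window-function-style transformations in pandas: a running total per customer and
# each order's share of that customer's revenue, computed without explicit loops.
# Column names and data are invented for the illustration.
import pandas as pd

orders = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "order_date": pd.to_datetime(
        ["2024-01-01", "2024-01-05", "2024-02-01", "2024-01-03", "2024-01-20"]
    ),
    "amount": [100.0, 40.0, 60.0, 500.0, 25.0],
})

orders = orders.sort_values(["customer_id", "order_date"])

# Running revenue per customer (like SUM(...) OVER (PARTITION BY ... ORDER BY ...)).
orders["running_revenue"] = orders.groupby("customer_id")["amount"].cumsum()

# Each order's share of its customer's total revenue.
orders["customer_total"] = orders.groupby("customer_id")["amount"].transform("sum")
orders["share_of_customer"] = orders["amount"] / orders["customer_total"]

print(orders)
```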
In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team.
In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector.
In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain.
Data projects are notoriously complex. With multiple stakeholders to manage across varying backgrounds and toolchains, even simple reports can become unwieldy to maintain.
In this episode Abe Gong brings his experiences with the Great Expectations project and community to discuss the technical and organizational considerations involved in applying these constraints to your data workflows. Can you describe what your conception of a data contract is?
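To ground the idea, here is a minimal hand-rolled data contract check written with pandas. It expresses the kinds of constraints a tool like Great Expectations manages (required columns and types, nullability, allowed values) but is not the Great Expectations API, and the schema is invented for the example.

```python
# A minimal hand-rolled data contract check in pandas. It illustrates the kinds of
# constraints a tool like Great Expectations manages, but is NOT its API; the
# schema below is invented for the example.
import pandas as pd

CONTRACT = {
    "required_columns": {"order_id": "int64", "amount": "float64", "status": "object"},
    "non_nullable": ["order_id", "amount"],
    "allowed_status": {"created", "paid", "refunded"},
}

def check_contract(df: pd.DataFrame) -> list:
    violations = []
    for col, dtype in CONTRACT["required_columns"].items():
        if col not in df.columns:
            violations.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            violations.append(f"{col}: expected {dtype}, got {df[col].dtype}")
    for col in CONTRACT["non_nullable"]:
        if col in df.columns and df[col].isna().any():
            violations.append(f"{col}: contains nulls")
    if "status" in df.columns:
        bad = set(df["status"].dropna()) - CONTRACT["allowed_status"]
        if bad:
            violations.append(f"status: unexpected values {bad}")
    return violations

df = pd.DataFrame({"order_id": [1, 2], "amount": [10.0, None], "status": ["paid", "created"]})
print(check_contract(df))   # ['amount: contains nulls']
```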
Summary The life sciences industry has seen incredible growth in scale and sophistication, along with the advances in data technology that make it possible to analyze massive amounts of genomic information.
Summary The flexibility of software-oriented data workflows is useful for fulfilling complex requirements, but for simple and repetitive use cases it adds significant complexity. In this episode Satish Jayanthi explains how he is building a framework to allow enterprises to move quickly while maintaining guardrails for data workflows.
Building a DataOps workflow that incorporates fast delivery of well-defined projects, continuous testing, and open lines of communication is a proven path to success. How are typical data and analytics teams organized? Can you start by giving an outline of the ways that complexity can manifest in a data organization?
In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.