Data Engineer, Data Pipeline and SQL - Data Engineering Digest

Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh

Data Engineering Podcast

JUNE 25, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. RudderStack provides all your customer data pipelines in one platform.

Data Engineering

Data Engineering Data Engineer Python Engineering

Snowflake’s New Python API Empowers Data Engineers to Build Modern Data Pipelines with Ease

Snowflake

APRIL 17, 2024

In today’s data-driven world, developer productivity is essential for organizations to build effective and reliable products, accelerate time to value, and fuel ongoing innovation. While the Python API connector remains available for specific SQL use cases, the new API is designed to be your go-to solution.

Data Pipeline

Data Pipeline Python Data Engineering Data Engineer

How To Future-Proof Your Data Pipelines

Ascend.io

NOVEMBER 14, 2024

Why Future-Proofing Your Data Pipelines Matters Data has become the backbone of decision-making in businesses across the globe. The ability to harness and analyze data effectively can make or break a company’s competitive edge. Resilience and adaptability are the cornerstones of a future-proof data pipeline.

Data Pipeline

Data Pipeline Amazon Web Services Data Integration Data

How to improve at SQL as a data engineer

Start Data Engineering

OCTOBER 22, 2021

Introduction: SQL is the bread and butter of data engineering. Topics covered: SQL skills (data modeling, data storage, data transformation, data pipelines, data analytics), gathering requirements, exploration, modeling, and practice.

SQL

SQL Data Engineering Data Engineer Engineering

Data Engineering Weekly #198

Data Engineering Weekly

NOVEMBER 24, 2024

Editor’s Note: Launching Data & Gen-AI courses in 2025. I can’t believe DEW will soon reach its 200th edition. What I started as a fun hobby has become one of the top-rated newsletters in the data engineering industry. The blog walks through a few examples of Pipe Syntax alongside the corresponding SQL queries.

Data Engineering

Data Engineering Data Engineer Engineering Insurance

Data Engineering Weekly #222

Data Engineering Weekly

JUNE 1, 2025

Join Dagster and Neurospace to learn how to build AI pipelines with orchestration baked in, how to track data lineage for audits and traceability, and tips for designing compliant workflows under the EU AI Act. Register for the technical session. DuckDB: DuckLake - SQL as a Lakehouse Format. DuckDB announced a new open table format, DuckLake.

Data Engineering

Data Engineering Data Engineer Engineering Relational Database

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

FEBRUARY 4, 2024

Summary: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. Can you describe what RisingWave is and the story behind it?

SQL

SQL Data Lake High Quality Data Machine Learning

What is an AI Data Engineer? 4 Important Skills, Responsibilities, & Tools

Monte Carlo

OCTOBER 31, 2024

One job that has become increasingly popular across enterprise data teams is the role of the AI data engineer. Demand for AI data engineers has grown rapidly in data-driven organizations. But what does an AI data engineer do?

Data Engineering

Data Engineering Data Engineer Engineering Unstructured Data

Mirroring SQL Server Database to Microsoft Fabric

Striim

NOVEMBER 19, 2024

SQL2Fabric Mirroring is a new fully managed service offered by Striim to mirror on-premises SQL databases. It’s a collaborative service between Striim and Microsoft based on Fabric Open Mirroring that enables real-time data replication from on-premises SQL Server databases to Azure Fabric OneLake. Striim automates the rest.

SQL

SQL Database Data Warehouse Data Pipeline

Streaming Data Pipelines Made SQL With Decodable

Data Engineering Podcast

OCTOBER 28, 2021

He also explains why he started Decodable to address that limitation and the work that he and his team have done to let data engineers build streaming pipelines entirely in SQL. The data you’re looking for is already in your data warehouse and BI tools. No more scripts, just SQL.

Data Pipeline

Data Pipeline SQL Data Warehouse Data Lake

Ready-to-go sample data pipelines with Dataflow

Netflix Tech

DECEMBER 3, 2022

By Jasmine Omeke, Obi-Ike Nwoke, Olek Gorajek. Intro: This post is for all data practitioners who are interested in learning about bootstrapping, standardization, and automation of batch data pipelines at Netflix. You may remember Dataflow from the post we wrote last year titled Data pipeline asset management with Dataflow.

Data Pipeline

Data Pipeline Scala Metadata Food

The Emerging Role of AI Data Engineers - The New Strategic Role for AI-Driven Success

Data Engineering Weekly

JANUARY 15, 2025

The Critical Role of AI Data Engineers in a Data-Driven World: How does a chatbot seamlessly interpret your questions? The answer lies in unstructured data processing, a field that powers modern artificial intelligence (AI) systems. Develop modular, reusable components for end-to-end AI pipelines.

Data Engineering

Data Engineering Data Engineer Unstructured Data Engineering

Unpacking The Seven Principles Of Modern Data Pipelines

Data Engineering Podcast

AUGUST 13, 2023

Summary: Data pipelines are the core of every data product, ML model, and business intelligence dashboard. The folks at Rivery distilled the seven principles of modern data pipelines that will help you stay out of trouble and be productive with your data.

Data Pipeline

Data Pipeline BI SQL Machine Learning

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

FEBRUARY 18, 2024

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex. Visit dataengineeringpodcast.com/data-council today. Your first 30 days are free!

Data Lake

Data Lake High Quality Data Data Warehouse Google Cloud

Data Engineering Weekly #217

Data Engineering Weekly

APRIL 20, 2025

Jing Ge: Context Matters - The Vision of Data Analytics and Data Science Leveraging MCP and A2A. All aspects of software engineering are rapidly being automated with various coding AI tools, as seen in the AI technology radar. Data engineering is one aspect where I see a few startups starting to disrupt.

Data Engineering

Data Engineering Data Engineer Engineering Kafka

No Python, No SQL Templates, No YAML: Why Your Open Source Data Quality Tool Should Generate 80% Of Your Data Quality Tests Automatically

DataKitchen

FEBRUARY 17, 2025

As a data engineer, ensuring data quality is both essential and overwhelming. Even if data engineers had the resources, they lacked the full context of data use.

SQL

SQL Python Government Data Engineering

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Kafka

Kafka Data Lake High Quality Data SQL

10 Skills to Ace Your Data Engineering Interviews

Start Data Engineering

OCTOBER 11, 2021

Introduction. Skills covered: Leetcode (data structures and algorithms), data modeling, data warehousing, data pipelines, distributed system fundamentals, event streaming, system design, business questions, and cloud computing.

Data Engineering

Data Engineering Data Engineer Engineering Cloud Computing

How to learn data engineering

Christophe Blefari

JANUARY 20, 2024

Learn data engineering, all the references (credits). This is a special edition of the Data News. Right now I'm on holiday, finishing a hiking week in Corsica 🥾, so I wrote this special edition about how to learn data engineering in 2024. Who are the data engineers?

Data Engineering

Data Engineering Data Engineer Engineering Hadoop

Simplified End-to-End Development for Production-Ready Data Pipelines, Applications, and ML Models

Snowflake

JUNE 4, 2024

Streamlined development across SQL and Python: Snowflake now offers data teams a suite of intuitive tools designed to simplify development and accelerate workflows. This suite extends seamlessly across Snowflake’s offerings, including Snowpark, Native Apps, Streamlit and more, for building anything with your data.

Data Pipeline

Data Pipeline Python SQL Database

Our First Netflix Data Engineering Summit

Netflix Tech

DECEMBER 14, 2023

Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community! In this video, Sr.

Data Engineering

Data Engineering Data Engineer Engineering Metadata

How Data Engineering Teams Power Machine Learning With Feature Platforms

Data Engineering Podcast

JULY 2, 2023

In this episode Razi Raziuddin shares how data engineering teams can support the machine learning workflow through the development and support of systems that empower data scientists and ML engineers to build and maintain their own features. How is this distinct from other forms of data pipeline development and delivery?

Machine Learning

Machine Learning Data Engineering Data Engineer Engineering

Data Engineering Weekly #175

Data Engineering Weekly

JUNE 10, 2024

Experience Enterprise-Grade Apache Airflow: Astro augments Airflow with enterprise-grade features to enhance productivity, meet scalability and availability demands across your data pipelines, and more. As I have shared, its impact on data engineering is exciting.

Data Engineering

Data Engineering Data Engineer Engineering Kafka

Databricks Workspace Health SQL Toolkit

Sync Computing

FEBRUARY 4, 2025

As data engineers, understanding the intricacies of your Databricks environment is important. But raw system data can be tricky to navigate, and sometimes you just need a quick answer to that burning question. Wow the team with insights into your Jobs, SQL warehouses, APC clusters, and DLT usage.

SQL

SQL Datasets Utilities Accessible

Modern Data Engineering

Towards Data Science

NOVEMBER 4, 2023

In this article, I want to talk about crucial things that affect data engineers. We will discuss how to use this knowledge to power advanced analytics pipelines and operational excellence. I’d like to discuss some popular Data engineering questions: Modern data engineering (DE). What is it?

Data Engineering

Data Engineering Data Engineer Engineering BI

Building Data Pipelines That Run From Source To Analysis And Activation With Hevo Data

Data Engineering Podcast

SEPTEMBER 11, 2022

Building reliable data pipelines is a complex and costly undertaking with many layered requirements. To reduce the time and effort required to build pipelines that power critical insights, Manish Jethani co-founded Hevo Data. Data stacks are becoming more and more complex.

Data Pipeline

Data Pipeline Building MongoDB MySQL

Cloudera Data Engineering 2021 Year End Review

Cloudera

DECEMBER 21, 2021

Since the release of Cloudera Data Engineering (CDE) more than a year ago, our number one goal was operationalizing Spark pipelines at scale with first-class tooling designed to streamline automation and observability. Data pipelines are composed of multiple steps with dependencies and triggers. Happy New Year.

Data Engineering

Data Engineering Data Engineer Engineering Data Pipeline

Data Engineering Weekly #203

Data Engineering Weekly

JANUARY 12, 2025

With Astro, you can build, run, and observe your data pipelines in one place, ensuring your mission critical data is delivered on time. Generative AI demands the processing of vast amounts of diverse, unstructured data (e.g.,

Pipeline-centric

Pipeline-centric Data Engineering Data Engineer Engineering

Tired of your Data Engineering Role?

Towards Data Science

AUGUST 19, 2023

How I made the transition to an analytics engineer. A few years ago, I was at a point where I was feeling unfulfilled in my career. I had been working in data engineering for three years and the initial excitement of starting in the world of tech had faded.

Data Engineering

Data Engineering Data Engineer Engineering SQL

The Future of Data Engineering as a Data Engineer

Monte Carlo

JANUARY 23, 2024

In the world of data engineering, Maxime Beauchemin is someone who needs no introduction. Currently, Maxime is CEO and co-founder of Preset, a fast-growing startup that’s paving the way forward for AI-enabled data visualization for modern companies. Enter, the data engineer. What is a data engineer today?

Data Engineering

Data Engineering Data Engineer Engineering Data

Data Engineer vs Data Analyst: Key Differences and Similarities

Knowledge Hut

MAY 3, 2023

With companies increasingly relying on data-driven insights to make informed decisions, there has never been a greater need for skilled specialists who can manage and evaluate vast amounts of data. The roles of data analyst and data engineer have emerged as two of the most in-demand professions in today's job market.

Data Engineering

Data Engineering Data Engineer Engineering Data Cleanse

Scale Your Spatial Analysis By Building It In SQL With Syntax Extensions

Data Engineering Podcast

FEBRUARY 6, 2022

To make geospatial analytics more maintainable and scalable, a growing number of database engines provide extensions to their SQL syntax that support manipulation of spatial data. Once you’re up and running, your smart data pipelines are resilient to data drift.

SQL

SQL Building IT Google Cloud

Making Email Better With AI At Shortwave

Data Engineering Podcast

APRIL 21, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.

Data Lake

Data Lake High Quality Data Machine Learning Data Pipeline

Cloud Native Data Orchestration For Machine Learning And Data Engineering With Flyte

Data Engineering Podcast

MAY 22, 2022

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.

Machine Learning

Machine Learning Data Engineering Data Engineer Cloud

Exploring The Evolving Role Of Data Engineers

Data Engineering Podcast

DECEMBER 26, 2021

Summary: Data Engineering is still a relatively new field that is going through a continued evolution as new technologies are introduced and new requirements are understood. In this episode Maxime Beauchemin returns to revisit what it means to be a data engineer and how the role has changed over the past 5 years.

Data Engineering

Data Engineering Data Engineer Engineering Data Warehouse

How to Become a Data Engineer in 2024?

Knowledge Hut

DECEMBER 26, 2023

Data Engineering is typically a software engineering role that focuses deeply on data – namely, data workflows, data pipelines, and the ETL (Extract, Transform, Load) process. What is Data Science? What are the roles and responsibilities of a Data Engineer? And many more.

Data Engineering

Data Engineering Data Engineer Engineering Hadoop

Building Databricks Data Pipelines 101

Confessions of a Data Guy

MARCH 29, 2024

Have you ever wondered at a high level what it’s like to build production-level data pipelines on Databricks? What does it look like, what tools do you use?

Data Pipeline

Data Pipeline Building Data IT

Microsoft Fabric Tutorial for Beginners

Edureka

MAY 27, 2025

You won’t have to deal with siloed systems, jump between tools, or write endless lines of code to make data useful. With its ability to seamlessly integrate data engineering, analytics, and business intelligence, Microsoft Fabric stands out as the all-in-one superhero in a world where data is abundant but insights are scarce.

BI

BI Data Pipeline Business Intelligence Data Engineering

Introducing Self-Service, No-Code Airflow Authoring UI in Cloudera Data Engineering

Cloudera

OCTOBER 19, 2021

Airflow has been adopted by many Cloudera Data Platform (CDP) customers in the public cloud as the next-generation orchestration service to set up and operationalize complex data pipelines. We started out by interviewing customers to understand where the most friction exists in their pipeline development workflows today.

Coding

Coding Data Engineering Data Engineer Engineering

Data Engineering Weekly #179

Data Engineering Weekly

JULY 7, 2024

Experience Enterprise-Grade Apache Airflow: Astro augments Airflow with enterprise-grade features to enhance productivity, meet scalability and availability demands across your data pipelines, and more. The blog explains how we can programmatically measure the real cost of Azure Databricks SQL warehouse instances.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

Microsoft Fabric Architecture Explained: Core Components & Benefit

Edureka

MAY 27, 2025

Microsoft Fabric is a next-generation data platform that combines business intelligence, data warehousing, real-time analytics, and data engineering into a single integrated SaaS framework. It preserves security and governance while facilitating smooth data access across all Fabric services.

Architecture

Architecture BI Business Intelligence Raw Data

A Reflection On Learning A Lot More Than 97 Things Every Data Engineer Should Know

Data Engineering Podcast

JANUARY 30, 2022

Summary: The Data Engineering Podcast has been going for five years now and has included conversations and interviews with a huge number of guests, covering a broad range of topics. In this episode he shares some reflections on producing the podcast, compiling the book, and relevant trends in the ecosystem of data engineering.

Data Engineering

Data Engineering Data Engineer Engineering Data Pipeline

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

MARCH 24, 2024

In this episode Pete Hunt, CEO of Dagster Labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!

Data Lake

Data Lake High Quality Data Hadoop Machine Learning

10 DataOps Principles for Overcoming Data Engineer Burnout

DataKitchen

NOVEMBER 18, 2021

Add all these facts together, and it paints a picture that something is amiss in the data world. Yet, among all this, one area that hasn’t been studied is the data engineering role. We thought it would be interesting to look at how data engineers are doing under these circumstances. Blaming and finger-pointing.

Data Engineering

Data Engineering Data Engineer Engineering Government
