Many companies looking to migrate to the cloud go from SQL Server to Snowflake. One of the common reasons and benefits is that teams find it far easier to manage than SQL Server, and in almost every… Read more The post How To Migrate From SQL Server To Snowflake appeared first on Seattle Data Guy.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. Support the Data Engineering Podcast
If a corrupted, unorganized, or redundant database is used, the results of the analysis may become inconsistent and highly misleading. So, we are […] The post How to Normalize Relational Databases With SQL Code? appeared first on Analytics Vidhya.
Introduction Data engineering is the field of study that deals with the design, construction, deployment, and maintenance of data processing systems. The goal of this domain is to collect, store, and process data efficiently and effectively so that it can be used to support business decisions and power data-driven applications.
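The excerpt above does not include the post's code, so here is a minimal, hedged sketch of what normalizing a redundant table with SQL can look like, using Python's built-in sqlite3 module; the table and column names are invented for illustration.

```python
import sqlite3

# In-memory database with a deliberately denormalized table:
# customer details are repeated on every order row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_raw (
        order_id INTEGER,
        customer_name TEXT,
        customer_email TEXT,
        amount REAL
    );
    INSERT INTO orders_raw VALUES
        (1, 'Ada',   'ada@example.com',   10.0),
        (2, 'Ada',   'ada@example.com',   25.0),
        (3, 'Grace', 'grace@example.com',  5.0);

    -- Normalize: customers get their own table, orders reference them.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name  TEXT,
        email TEXT UNIQUE
    );
    INSERT INTO customers (name, email)
        SELECT DISTINCT customer_name, customer_email FROM orders_raw;

    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO orders
        SELECT r.order_id, c.customer_id, r.amount
        FROM orders_raw r
        JOIN customers c ON c.email = r.customer_email;
""")

for row in conn.execute("SELECT * FROM orders"):
    print(row)  # each order now carries a foreign key instead of repeated customer data
```

The point is simply that repeated customer details move into a single customers table, so an analysis joining orders to customers can no longer disagree with itself.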
Introduction Structured Query Language is a powerful language to manage and manipulate data stored in databases. SQL is widely used in the field of data science and is considered an essential skill to have if you work with data.
Editor’s Note: Launching Data & Gen-AI courses in 2025. I can’t believe DEW will soon reach its 200th edition. What I started as a fun hobby has become one of the top-rated newsletters in the data engineering industry. The blog narrates a few examples of Pipe Syntax in comparison with standard SQL queries.
This results in the generation of so much data daily. This generated data is stored in databases, which have to be maintained. SQL is a structured query language used to read from and write to these databases.
Introduction SQL injection is an attack in which a malicious user can insert arbitrary SQL code into a web application’s query, allowing them to gain unauthorized access to a database. Attackers can use this to steal sensitive information or make unauthorized changes to the data stored in the database.
[link] Sponsored: The Ultimate Guide to Apache Airflow® DAGs. Download this free 130+ page eBook for everything a data engineer needs to know to take their DAG-writing skills to the next level (+ plenty of example code). [link] All rights reserved, ProtoGrowth Inc.
Introduction SQL is a database programming language created for managing and retrieving data from relational databases like MySQL, Oracle, and SQL Server. SQL (Structured Query Language) is the common language for relational databases. In other words, SQL is a language for communicating with databases.
5 SQL Visualization Tools for Data Engineers • Free TensorFlow 2.0 Complete Course • The Importance of Probability in Data Science • 4 Ways to Rename Pandas Columns • 5 Statistical Paradoxes Data Scientists Should Know
Introduction. Setup. SQL tips: handy functions for common data processing scenarios, e.g. need to filter on a WINDOW function without a CTE/subquery, use QUALIFY; need the first/last row in a partition, use DISTINCT ON; STRUCT data types are sorted based on their keys from left to right.
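The post's own examples are not reproduced in this excerpt, so here is a minimal sketch of the idea in Python with the standard-library sqlite3 module (the table, rows, and input are all made up). The first query splices user input into the SQL text and can be subverted; the parameterized version binds the input as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, password TEXT);
    INSERT INTO users VALUES ('alice', 's3cret'), ('bob', 'hunter2');
""")

malicious_input = "' OR '1'='1"

# Vulnerable: user input is concatenated into the SQL text,
# so the injected OR '1'='1' clause matches every row.
query = f"SELECT * FROM users WHERE username = '{malicious_input}'"
print(conn.execute(query).fetchall())   # leaks all users

# Safe: a parameterized query passes the input as a bound value,
# so it is compared literally and matches nothing.
print(conn.execute(
    "SELECT * FROM users WHERE username = ?", (malicious_input,)
).fetchall())                           # []
```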
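As a hedged illustration of the QUALIFY and DISTINCT ON tips mentioned in that table of contents (a sketch assuming the duckdb Python package, whose SQL dialect supports both clauses; the sales table is invented):

```python
import duckdb

con = duckdb.connect()
con.execute("CREATE TABLE sales (region TEXT, sale_date DATE, amount INT)")
con.execute("""
    INSERT INTO sales VALUES
        ('east', DATE '2024-01-01', 100),
        ('east', DATE '2024-01-02', 250),
        ('west', DATE '2024-01-01', 300),
        ('west', DATE '2024-01-03', 50)
""")

# QUALIFY filters on a window function directly, no CTE or subquery needed:
# keep only the latest sale per region.
latest = con.execute("""
    SELECT region, sale_date, amount
    FROM sales
    QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY sale_date DESC) = 1
""").fetchall()
print(latest)

# DISTINCT ON is another way to take one row per partition:
# here, the earliest sale per region.
first_per_region = con.execute("""
    SELECT DISTINCT ON (region) region, sale_date, amount
    FROM sales
    ORDER BY region, sale_date
""").fetchall()
print(first_per_region)
```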
Summary Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. Can you describe what RisingWave is and the story behind it?
[link] Jing Ge: Context Matters — The Vision of Data Analytics and Data Science Leveraging MCP and A2A. All aspects of software engineering are rapidly being automated with various coding AI tools, as seen in the AI technology radar. Data engineering is one aspect where I see a few startups starting to disrupt.
Before Hoptimator, Pinot ingestion often required data producers to create and manage separate, Pinot-specific preprocessing jobs to optimize data, such as re-keying, filtering, and pre-aggregating. Hoptimator removes that step, reducing user friction, operator toil, and resource consumption on Pinot servers, while automating pipeline management.
I was wondering the other day … since Polars now has a SQL context and is getting more popular by the day, do I need DuckDB anymore? These two tools are hot. You might think […] The post DuckDB vs Polars for Data Engineering appeared first on Confessions of a Data Guy.
I mean, you have a problem if you could use Polars in Python, and you find yourself using […] It’s gotten so bad, I’ve started to use it in my Rust code on the side (Polars, that is). The post Polars – Laziness and SQL Context appeared first on Confessions of a Data Guy.
SQL2Fabric Mirroring is a new fully managed service offered by Striim to mirror on-premises SQL databases. It’s a collaborative service between Striim and Microsoft, based on Fabric Open Mirroring, that enables real-time data replication from on-premises SQL Server databases to Azure Fabric OneLake.
The Critical Role of AI Data Engineers in a Data-Driven World. How does a chatbot seamlessly interpret your questions? How does a self-driving car understand a chaotic street scene? The answer lies in unstructured data processing—a field that powers modern artificial intelligence (AI) systems.
Editor’s Note: A New Series on Data Engineering Tools Evaluation. There are plenty of data tools and vendors in the industry. Data Engineering Weekly is launching a new series on software evaluation focused on data engineering to better guide data engineering leaders in evaluating data tools.
Save Your Spot → Editor’s Note: Data Council 2025, Apr 22-24, Oakland, CA. Data Council has always been one of my favorite events to connect with and learn from the data engineering community. Data Council 2025 is set for April 22-24 in Oakland, CA. [link] BVP: Roadmap: Data 3.0
No Python, No SQL Templates, No YAML: Why Your Open Source Data Quality Tool Should Generate 80% Of Your Data Quality Tests Automatically. As a data engineer, ensuring data quality is both essential and overwhelming. Even if data engineers had the resources, they lacked the full context of data use.
One job that has become increasingly popular across enterprise data teams is the role of the AI data engineer. Demand for AI data engineers has grown rapidly in data-driven organizations. But what does an AI data engineer do? Table of Contents What Does an AI Data Engineer Do?
In that time there have been a number of generational shifts in how data engineering is done. Materialize’s PostgreSQL-compatible interface lets users leverage the tools they already use, with unsurpassed simplicity enabled by full ANSI SQL support.
[link] Instacart: Data Science Spotlight - Cracking the SQL Interview at Instacart (LLM Edition). Instacart writes about integrating LLMs into their interview process and how it helps them identify the right candidates. Swiggy recently wrote about its internal platform, Hermes, a text-to-SQL solution.
Learn data engineering, all the references (credits). This is a special edition of the Data News. But right now I'm on holiday, finishing a hiking week in Corsica 🥾 So I wrote this special edition about: how to learn data engineering in 2024. Who are the data engineers?
As data engineers, understanding the intricacies of your Databricks environment is important. But raw system data can be tricky to navigate, and sometimes you just need a quick answer to that burning question. Wow the team with insights into your Jobs, SQL warehouses, APC clusters, and DLT usage.
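For readers who have not tried the SQL context yet, here is a minimal sketch (assuming a recent Polars release; the frame and column names are invented):

```python
import polars as pl

df = pl.DataFrame({
    "city": ["Paris", "Lyon", "Paris", "Nice"],
    "sales": [120, 80, 200, 45],
})

# Register the DataFrame under a table name and query it with SQL.
ctx = pl.SQLContext(frames={"shops": df})
result = ctx.execute(
    """
    SELECT city, SUM(sales) AS total_sales
    FROM shops
    GROUP BY city
    ORDER BY total_sales DESC
    """,
    eager=True,  # collect immediately instead of returning a LazyFrame
)
print(result)
```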
In today’s data-driven world, developer productivity is essential for organizations to build effective and reliable products, accelerate time to value, and fuel ongoing innovation. While the Python API connector remains available for specific SQL use cases, the new API is designed to be your go-to solution.
In this episode Razi Raziuddin shares how data engineering teams can support the machine learning workflow through the development and support of systems that empower data scientists and ML engineers to build and maintain their own features. What is the role of the data engineer in supporting those interfaces?
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.
GetInData writes an excellent summary of adding data quality checks in a Flink streaming pipeline. [link] Fernando Borretti: Composable SQL. One of the biggest challenges in SQL is unit testing. The author highlights three key challenges in SQL.
New SQL Practice Problems. I’m trying something new. I get a lot of questions from folks about getting into the Data Engineering space, how to get better, grow, learn, etc. SQL Practice Problems. Some moons ago I wrote a Data Engineering Practice repo on GitHub for free, and some 1.2K
Summary A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex. Visit: dataengineeringpodcast.com/data-council today. Your first 30 days are free!
Experience Enterprise-Grade Apache Airflow. Astro augments Airflow with enterprise-grade features to enhance productivity, meet scalability and availability demands across your data pipelines, and more. As I have shared, its impact on data engineering is exciting.
Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community! In this video, Sr.
SQL and Python Interview Questions for Data Analysts • 5 SQL Visualization Tools for Data Engineers • 5 Free Tools For Detecting ChatGPT, GPT3, and GPT2 • Top Free Resources To Learn ChatGPT • Free TensorFlow 2.0
dbt Core is an open-source framework that helps you organise data warehouse SQL transformations. In ELT, the load happens before the transform step, without any alteration of the data, leaving the raw data ready to be transformed inside the data warehouse. dbt Labs also popularised the analytics engineer role.
Context and Motivation. dbt (Data Build Tool): A popular open-source framework that organizes SQL transformations in a modular, version-controlled, and testable way. Databricks: A platform that unifies data engineering and data science pipelines, typically with Spark (PySpark, Scala) or Spark SQL.
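The article's own proposal is not reproduced here, but one common, partial workaround is to treat a SQL query like any other function and unit-test it against a tiny in-memory fixture. A hedged sketch using the duckdb Python package and a pytest-style assertion (the query and table names are invented):

```python
import duckdb

REVENUE_QUERY = """
    SELECT customer_id, SUM(amount) AS revenue
    FROM payments
    WHERE status = 'settled'
    GROUP BY customer_id
"""

def test_revenue_ignores_unsettled_payments():
    # Build a tiny in-memory fixture, then run the query under test against it.
    con = duckdb.connect()
    con.execute("CREATE TABLE payments (customer_id INT, amount DOUBLE, status TEXT)")
    con.execute("""
        INSERT INTO payments VALUES
            (1, 10.0, 'settled'),
            (1,  5.0, 'refunded'),
            (2, 20.0, 'settled')
    """)
    rows = dict(con.execute(REVENUE_QUERY).fetchall())
    assert rows == {1: 10.0, 2: 20.0}  # refunded payment excluded

test_revenue_ignores_unsettled_payments()
print("ok")
```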
We’re excited to announce the Public Preview of LakeFlow Connect for SQL Server, Salesforce, and Workday. These ingestion connectors enable simple and efficient.
Thinking about and contemplating life and data engineering … something flitted across my […] The post Datafusion SQL CLI – Look Ma, I made a new ETL tool. appeared first on Confessions of a Data Guy.
With dbt, you can apply software engineering practices to SQL development. Managing your SQL codebase has never been easier. So, yes, dbt is cool, but there is a common pattern with it: you accumulate SQL queries. Fast forward two years and you find yourself with hundreds or thousands of SQL queries.
In the world of data engineering, Maxime Beauchemin is someone who needs no introduction. Currently, Maxime is CEO and co-founder of Preset, a fast-growing startup that’s paving the way forward for AI-enabled data visualization for modern companies. Enter the data engineer. What is a data engineer today?
In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.
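As a hedged sketch of the ELT variant described there (Python's built-in sqlite3 standing in for a warehouse; the CSV, table, and column names are invented), the raw rows are loaded untouched and the transformation happens afterwards in SQL:

```python
import csv, io, sqlite3

# Pretend this CSV arrived from a source system.
raw_csv = io.StringIO(
    "user_id,country,signup_date\n"
    "1,FR,2024-01-02\n"
    "2,US,2024-01-03\n"
    "3,FR,2024-02-10\n"
)

conn = sqlite3.connect(":memory:")

# Extract + Load: copy the raw rows as-is into a landing table.
conn.execute("CREATE TABLE raw_signups (user_id TEXT, country TEXT, signup_date TEXT)")
rows = [tuple(row) for row in csv.reader(raw_csv)][1:]  # skip header row
conn.executemany("INSERT INTO raw_signups VALUES (?, ?, ?)", rows)

# Transform: build a cleaned, aggregated table with SQL inside the "warehouse".
conn.execute("""
    CREATE TABLE signups_by_country AS
    SELECT country,
           COUNT(*) AS signups,
           MIN(signup_date) AS first_signup
    FROM raw_signups
    GROUP BY country
""")

print(conn.execute("SELECT * FROM signups_by_country ORDER BY country").fetchall())
# [('FR', 2, '2024-01-02'), ('US', 1, '2024-01-03')]
```

In an ETL flow, the aggregation would instead run before loading, and only signups_by_country would land in the target database.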