Data Management and Database - Data Engineering Digest

Azure SQL Database: The Future of Cloud Data Management

ProjectPro

JUNE 6, 2025

What makes the Azure SQL database so popular for OLTP applications? What features of Microsoft Azure SQL database give it an edge over its competitors? To get answers to all these questions, read our ultimate guide on Azure SQL Database! Table of Contents What is Azure SQL Database? How To Connect To Azure SQL Database?

Database

Database SQL Cloud Data Management

The Future of Data Management Is Agentic AI

Snowflake

APRIL 13, 2025

The vast amounts of data generated daily require advanced tools for efficient management and analysis. Enter agentic AI, a type of artificial intelligence set to transform enterprise data management. Many enterprises face overwhelming data sources, from structured databases to unstructured social media feeds.

Data Management

Data Management Management Consulting Unstructured Data

Surveying The Market Of Database Products

Data Engineering Podcast

OCTOBER 29, 2023

Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. What are the aspects of the database market that keep you interested as a VP of product?

Database

Database BI SQL Machine Learning

Back to Basics Week 2: Database, SQL, Data Management and Statistical Concepts

KDnuggets

NOVEMBER 13, 2023

This week, we delve into the vital world of Databases, SQL, Data Management, and Statistical Concepts in Data Science. Welcome back to Week 2 of KDnuggets’ "Back to Basics" series.

Database

Database SQL Data Management Management

Designing A Non-Relational Database Engine

Data Engineering Podcast

APRIL 14, 2024

Summary Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication.

Non-relational Database

Non-relational Database Relational Database Database Designing

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

MARCH 17, 2024

Summary A significant portion of data workflows involve storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.

Database

Database Data Lake High Quality Data Data Workflow

A Beginner’s Guide to Graph Databases

ProjectPro

JUNE 6, 2025

Imagine solving a complex puzzle where each piece represents a unique data point, and their connections form a vast network. Traditional databases often need help to capture these intricate relationships, leaving you with a fragmented view of your data. Table of Contents What is a Graph Database? Why Graph Databases?

Database

Database Database-centric Relational Database MongoDB

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

FEBRUARY 25, 2024

Summary Building a database engine requires a substantial amount of engineering effort and time investment. In this episode he explains how he used the combination of Apache Arrow, Flight, Datafusion, and Parquet to lay the foundation of the newest version of his time-series database. Closing Announcements Thank you for listening!

Database

Database Technology Data Lake High Quality Data

How To Choose Right AWS Databases for Your Needs

ProjectPro

JUNE 6, 2025

Explore the world of data analytics with the top AWS databases! Check out this blog to discover your ideal database and uncover the power of scalable and efficient solutions for all your data analytical requirements. Let’s understand more about AWS Databases in the following section.

AWS

AWS Database Amazon Web Services MySQL

Azure Cosmos DB: The Future of Database Management

ProjectPro

JUNE 6, 2025

Are you ready to join the database revolution? "Data is the new oil" has become the mantra of the digital age, and in this era of rapidly increasing data volumes, the need for robust and scalable database management solutions has never been more critical. FAQs on Microsoft Azure Cosmos DB What is Azure Cosmos DB?

Database

Database Management MongoDB NoSQL

How to Use Pinecone Vector Database in your AI Projects?

ProjectPro

JUNE 6, 2025

This blog will align with that vision by exploring what Pinecone Vector Database is and how to use it, and by walking through a comprehensive Pinecone Vector Database tutorial with a simple example. Table of Contents What is a Pinecone Vector Database? Pinecone is helpful in this situation.

Database

Database Project Metadata Unstructured Data

Building An Internal Database As A Service Platform At Cloudflare

Data Engineering Podcast

AUGUST 27, 2023

Summary Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products.

Database

Database Building PostgreSQL BI

Exploring Vector Databases: A Guide to Their Role in AI Tech

ProjectPro

JUNE 6, 2025

It's the magic of vector databases! To unlock the power of complex data formats such as audio files, images, etc., researchers have developed vector databases that allow users to utilize similarity search through vectors. Table of Contents Introduction to Vector Databases How Vector Databases Work?

Database

Database Algorithm Machine Learning Metadata

Amazon Aurora: The Future of Cloud Database Technology

ProjectPro

JUNE 6, 2025

Say goodbye to database downtime, and hello to Amazon Aurora! Explore the advanced features of this powerful cloud-based solution and take your data management to the next level with this comprehensive guide. It offers various cloud database services, with Amazon Aurora being one of the most popular services.

Database

Database Technology Cloud PostgreSQL

HBase vs Cassandra-The Battle of the Best NoSQL Databases

ProjectPro

JUNE 6, 2025

NoSQL databases are the new-age solution for distributed storage and processing of unstructured data. The speed, scalability, and failover safety they offer are needed today, given the rise of Big Data analytics and data science technologies.

NoSQL

NoSQL Database Hadoop Big Data

How Does AWS DocumentDB Simplify Database Management?

ProjectPro

JUNE 6, 2025

Ever wished for a database that's as easy to use as your favorite app? Say hello to AWS DocumentDB - your passport to unlocking the simplicity of data management. It's like a magic tool that makes handling data super simple. DocumentDB is everyone's favorite from startups to established enterprises alike.

AWS

AWS Database MongoDB Management

Mastering the Art of ETL on AWS for Data Management

ProjectPro

JUNE 6, 2025

With so much riding on the efficiency of ETL processes for data engineering teams, it is essential to take a deep dive into the complex world of ETL on AWS to take your data management to the next level. This is particularly useful for companies that need to process data in near-real-time. Q) What ETL does Amazon use?

AWS

AWS Data Management ETL Tools Management

Composable data management at Meta

Engineering at Meta

MAY 22, 2024

In recent years, Meta’s data management systems have evolved into a composable architecture that creates interoperability, promotes reusability, and improves engineering efficiency. Data is at the core of every product and service at Meta.

Data Management

Data Management Management Data SQL

Master Data Management: Common Misconceptions You Should Know

Precisely

OCTOBER 23, 2023

When most people think of master data management, they first think of customers and products. But master data encompasses so much more than data about customers and products. Challenges of Master Data Management A decade ago, master data management (MDM) was a much simpler proposition than it is today.

Data Management

Data Management Management Data Government

Hottest IT Certifications of 2025- NoSQL Databases (MongoDB Certification)

ProjectPro

JUNE 6, 2025

Table of Contents MongoDB NoSQL Database Certification- Hottest IT Certifications of 2025 MongoDB-NoSQL Database of the Developers and for the Developers MongoDB Certification Roles and Levels Why MongoDB Certification? One third of Fortune 100 companies are employing MongoDB NoSQL database for mission critical big data applications.

NoSQL

NoSQL MongoDB Certification Database

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Kafka

Kafka Data Lake High Quality Data SQL

Delivering the Most Enterprise-Ready Postgres, Built for Snowflake

Snowflake

JUNE 1, 2025

This will allow companies to speed up AI development and simplify data management with a secure, compliant database solution ready for enterprises across industries, including Fortune 500 financial institutions, high-scale SaaS companies and federal agencies.

PostgreSQL

PostgreSQL Database Cloud Government

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the technology underpinning. In the beginning, there was a data warehouse The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.

Data Integration

Data Integration Hadoop Data Lake Data Warehouse

An Overview Of The State Of Data Orchestration In An Increasingly Complex Data Ecosystem

Data Engineering Podcast

SEPTEMBER 10, 2023

In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

BI

BI SQL Data Machine Learning

The Future of Data Lakehouses: A Fireside Chat with Vinoth Chandar - Founder CEO Onehouse & PMC Chair of Apache Hudi

Data Engineering Weekly

JANUARY 8, 2025

What if your data lake could do more than just store information—what if it could think like a database? As data lakehouses evolve, they transform how enterprises manage, store, and analyze their data. Hudi, with its robust community and technical innovation, is well-positioned to lead this charge.

Data Lake

Data Lake Retail Data Ingestion Datasets

Building Linked Data Products With JSON-LD

Data Engineering Podcast

SEPTEMBER 17, 2023

In this episode Brian Platz explains how JSON-LD can be used as a shared representation of linked data for building semantic data products. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack You shouldn't have to throw away the database to build with fast-changing data.

Building

Building BI SQL Python

Data News — Week 24.11

Christophe Blefari

MARCH 15, 2024

Understand how BigQuery inserts, deletes and updates: once again Vu took the time to deep dive into BigQuery internals, this time to explain how data management is done. Pandera, a data validation library for dataframes, now supports Polars. Arroyo, a stream-processing platform, rebuilt their engine using DataFusion.

Metadata

Metadata Data Warehouse Software Engineer Software Engineering

Cloudera announces ‘Interoperability Ecosystem’ with founding members AWS and Snowflake

Cloudera

DECEMBER 4, 2024

All this by making it easier for customers to connect their workloads with Snowflake, Cloudera, and unique AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon Elastic Kubernetes Service (Amazon EKS) , Amazon Relational Database Service (Amazon RDS), Amazon Elastic Compute Cloud (Amazon EC2), Amazon EMR and Amazon Athena.

AWS

AWS Raw Data Relational Database Government

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

Data Engineering Podcast

OCTOBER 15, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Process

Process Building SQL BI

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

FEBRUARY 4, 2024

RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Closing Announcements Thank you for listening!

SQL

SQL Data Lake High Quality Data Kafka

Simplifying Data Architecture and Security to Accelerate Value

Snowflake

NOVEMBER 11, 2024

Unify transactional and analytical workloads in Snowflake for greater simplicity Many businesses must maintain two separate databases: one to handle transactional workloads and another for analytical workloads.

Data Architecture

Data Architecture Architecture Data Lake Kafka

AWS Glue-Unleashing the Power of Serverless ETL Effortlessly

ProjectPro

JUNE 6, 2025

This serverless data integration service can automatically and quickly discover structured or unstructured enterprise data when stored in data lakes in Amazon S3, data warehouses in Amazon Redshift, and other databases that are a component of the Amazon Relational Database Service.

AWS

AWS Scala Metadata Data Lake

Secure Data Sharing and Interoperability Powered by Iceberg REST Catalog

Cloudera

DECEMBER 3, 2024

Rich set of SQL (query, DDL, DML) commands: Create or manipulate database objects, run queries, load and modify data, perform time travel operations, and convert Hive external tables to Iceberg tables using SQL commands. Create Database and Tables: Open HUE and execute the following to create a database and tables.

Metadata

Metadata SQL Data Warehouse Database

Eliminate The Overhead In Your Data Integration With The Open Source dlt Library

Data Engineering Podcast

SEPTEMBER 3, 2023

In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

Data Integration

Data Integration BI SQL Machine Learning

Shining Some Light In The Black Box Of PostgreSQL Performance

Data Engineering Podcast

NOVEMBER 5, 2023

Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products.

PostgreSQL

PostgreSQL Data Lake High Quality Data SQL

How to Become a Data Architect in 2025?

ProjectPro

JUNE 6, 2025

According to the Data Management Body of Knowledge, a Data Architect "provides a standard common business vocabulary, expresses strategic requirements, outlines high-level integrated designs to meet those requirements, and aligns with enterprise strategy and related business architecture."

Data Architect

Data Architect Data Mining Programming Language Java

Defining A Strategy For Your Data Products

Data Engineering Podcast

OCTOBER 22, 2023

In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. With Materialize, you can!

BI

BI SQL Machine Learning Programming Language

Using Data To Illuminate The Intentionally Opaque Insurance Industry

Data Engineering Podcast

OCTOBER 8, 2023

In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual industry. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. With Materialize, you can!

Insurance

Insurance BI SQL Machine Learning

Building ETL Pipelines With Generative AI

Data Engineering Podcast

OCTOBER 1, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Building

Building BI SQL Machine Learning

7 Best Data Warehousing Tools for Efficient Data Storage Needs

ProjectPro

JUNE 6, 2025

This article will explore the top seven data warehousing tools that simplify the complexities of data storage, making it more efficient and accessible. So, read on to discover these essential tools for your data management needs. Table of Contents What are Data Warehousing Tools? Why Choose a Data Warehousing Tool?

Data Storage

Data Storage PostgreSQL Data Warehouse AWS

X-Ray Vision For Your Flink Stream Processing With Datorios

Data Engineering Podcast

JUNE 9, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. This episode is supported by Code Comments, an original podcast from Red Hat. Data observability has been gaining adoption for a number of years now, with a large focus on data warehouses.

Process

Process Data Lake High Quality Data Government

Top ETL Use Cases for BI and Analytics: Real-World Examples

ProjectPro

JUNE 6, 2025

If you're wondering how the ETL process can drive your company to a new era of success, this blog will help you discover what use cases of ETL make it a critical component in many data management and analytic systems. EHR data allows practitioners and researchers to improve patient outcomes and health-related decision-making.

BI

BI ETL Tools Retail Healthcare

Addressing The Challenges Of Component Integration In Data Platform Architectures

Data Engineering Podcast

NOVEMBER 26, 2023

In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team. With Materialize, you can! Closing Announcements Thank you for listening!

Architecture

Architecture Data Lake High Quality Data Java

Powering Vector Search With Real Time And Incremental Vector Indexes

Data Engineering Podcast

SEPTEMBER 24, 2023

In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector data. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

BI

BI SQL Machine Learning Python
