Whether it’s unifying transactional and analytical data with Hybrid Tables, improving governance for an open lakehouse with Snowflake Open Catalog, or enhancing threat detection and monitoring with Snowflake Horizon Catalog, Snowflake is reducing the number of moving parts to give customers a fully managed service that just works.
SingleStore aims to cut down on the number of database engines you need to run, and with it the amount of data copying required. By supporting fast, in-memory row-based queries alongside a columnar on-disk representation, it lets your transactional and analytical workloads run in the same database.
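A minimal sketch of what that hybrid model looks like in practice, assuming a SingleStore cluster reachable over its MySQL-compatible protocol. The ROWSTORE and SORT KEY keywords follow recent SingleStore releases and may need adjusting for yours; the table and credential names are hypothetical.

```python
import pymysql

# Connect over the MySQL wire protocol (SingleStore's default port is 3306).
conn = pymysql.connect(host="127.0.0.1", port=3306, user="root",
                       password="secret", database="app")
with conn.cursor() as cur:
    # Row-oriented, in-memory table for point lookups and OLTP-style writes.
    cur.execute("""
        CREATE ROWSTORE TABLE IF NOT EXISTS orders_live (
            order_id BIGINT PRIMARY KEY,
            customer_id BIGINT,
            amount DECIMAL(10, 2),
            created_at DATETIME
        )
    """)
    # Column-oriented, on-disk table for scans and aggregations.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS orders_history (
            order_id BIGINT,
            customer_id BIGINT,
            amount DECIMAL(10, 2),
            created_at DATETIME,
            SORT KEY (created_at)
        )
    """)
    # A transactional write and an analytical read against the same database.
    cur.execute("INSERT INTO orders_live VALUES (1, 42, 19.99, NOW())")
    cur.execute("SELECT customer_id, SUM(amount) FROM orders_history "
                "GROUP BY customer_id ORDER BY 2 DESC LIMIT 10")
    print(cur.fetchall())
conn.commit()
conn.close()
```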
From delivering event-driven predictions to powering live recommendations and dynamic chatbot conversations, AI/ML initiatives depend on the continuous movement, transformation, and synchronization of diverse datasets across clouds, applications, and databases.
Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Traditional databases excelled at structured data and transactional workloads but struggled with performance at scale as data volumes grew.
Every data-driven project calls for a review of your data architecture, and that includes embedded analytics. Before you add new dashboards and reports to your application, you need to evaluate your data architecture with analytics in mind. Here are 9 questions to ask yourself when planning your ideal architecture.
When architecting a transactional database or a data warehouse, it’s important not to forget about various types of technical columns… (Towards Data Science)
Agencies are plagued by a wide range of data formats and storage environments (legacy systems, databases, on-premises applications, citizen access portals, innumerable sensors and devices, and more) that all contribute to a siloed ecosystem and the data management challenge. Modern data architectures aim to address this fragmentation.
The answers might change how we think about data architecture. Historically, object storage like Amazon S3 or R2 was used as inexpensive, scalable storage for unstructured files, while structured data typically went to data warehouses.
Summary Databases are limited in scope to the information that they directly contain. For analytical use cases you often want to combine data across multiple sources and storage locations. This frequently requires cumbersome and time-consuming data integration.
A fundamental challenge with today’s “data explosion” is finding the best answer to the question, “So where do I put my data?” while avoiding the longer-term problem of data warehouses, […].
The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the underlying technology. In the beginning, there was a data warehouse. The data warehouse (DW) was an approach to data architecture and structured data management that really hit its stride in the early 1990s.
Each of these trends claims to be a complete model for your data architecture, one that solves the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.
CDC tools fuel analytical apps and mission-critical data feeds in banking and regulated industries, with use cases ranging from data synchronization, managing risk, and preventing fraud to driving personalization. This approach simplifies data architecture and enhances performance by reducing data movement and latency.
Key Differences Between AI Data Engineers and Traditional Data Engineers While traditional data engineers and AI data engineers have similar responsibilities, they ultimately differ in where they focus their efforts. Let’s dive into the tools necessary to become an AI data engineer.
When I heard the words ‘decentralised data architecture’, I was left utterly confused at first! In my then limited experience as a Data Engineer, I had only come across centralised data architectures and they seemed to be working very well. Organizations began to use relational databases for ‘everything’.
Additionally, the optimized query execution and data pruning features reduce the compute cost associated with querying large datasets. Scaling data infrastructure while maintaining efficiency is one of the primary challenges of modern data architecture built on cloud object storage (Amazon S3, Azure Data Lake, or Google Cloud Storage).
A streaming ETL for Snowflake approach loads data to Snowflake from diverse sources such as transactional databases, security systems logs, and IoT sensors/devices in real time , while simultaneously meeting scalability, latency, security, and reliability requirements.
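The general shape of such a loader, sketched with off-the-shelf clients rather than any particular vendor's streaming ETL product: consume events from Kafka and flush micro-batches into Snowflake with the Python connector. Topic, table, and credential values are hypothetical, and the target column is assumed to be a plain VARCHAR staging column.

```python
import json
from confluent_kafka import Consumer
import snowflake.connector

# Kafka consumer for the source event stream.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "snowflake-loader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["transactions"])

# Snowflake connection for the target staging table.
conn = snowflake.connector.connect(
    account="my_account", user="loader", password="secret",
    warehouse="LOAD_WH", database="RAW", schema="EVENTS",
)
cur = conn.cursor()

batch = []
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    batch.append(json.loads(msg.value()))
    if len(batch) >= 500:  # flush in small micro-batches to keep latency low
        cur.executemany(
            # payload is assumed to be a VARCHAR column holding raw JSON
            "INSERT INTO transactions_raw (payload) VALUES (%s)",
            [(json.dumps(r),) for r in batch],
        )
        consumer.commit()  # only commit offsets once the batch is persisted
        batch.clear()
```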
Many organizations struggle with: Inconsistent data formats: Different systems store data in varied structures, requiring extensive preprocessing before analysis. Siloed storage: Critical business data is often locked away in disconnected databases, preventing a unified view.
If you’ve been keeping track of our integrations page you might have noticed Monte Carlo is adding support for different databases faster than a nested loop server crash. By the end of 2023, we added Azure SQL database to the list as well as enterprise databases such as Oracle DB , SAP HANA and Teradata.
Together, MongoDB and Apache Kafka ® make up the heart of many modern dataarchitectures today. This API enables users to leverage ready-to-use components that can stream data from external systems into Kafka topics, as well as stream data from Kafka topics into external systems. Free MongoDB Atlas cluster.
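A hedged sketch of how that pairing is typically wired up: registering a MongoDB source connector with a Kafka Connect worker over its REST API, so changes in a collection stream into a Kafka topic. The connector class and option names follow the MongoDB Kafka connector's documented configuration as best I recall; the connection URI, database, and collection names are hypothetical.

```python
import json
import requests

connector = {
    "name": "mongo-orders-source",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://mongo:27017",
        "database": "shop",
        "collection": "orders",
        "topic.prefix": "mongo",        # change events land on mongo.shop.orders
        "output.format.value": "json",
    },
}

# Kafka Connect exposes a REST API (default port 8083) for managing connectors.
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
print(resp.json())
```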
Organizations have begun to build data warehouses and lakes to analyze large amounts of data for insights and business reports. Often they bring data from multiple data silos into their data lake, and also have data stored in particular data stores like NoSQL databases to support different use cases.
Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
After four months of testing, du Toit and his team had moved one of their databases to Snowflake for bulk data processing. Ramp fetches and delivers data into S3 buckets and uses dbt to transform data at each stage. “A VC suggested that we should look at Snowflake as part of our assessment,” du Toit recalled.
Back when I studied Computer Science in the early 2000s, databases like MS Access and Oracle ruled. Now, it's different. The rise of big data and NoSQL changed the game. Systems evolved from simple to complex, and we had to split how we find data from where we store it. What is a database? Let’s begin!
This specialist works closely with people on both business and IT sides of a company to understand the current needs of the stakeholders and help them unlock the full potential of data. To get a better understanding of a data architect’s role, let’s clear up what data architecture is.
Moving databases to the cloud can be a really challenging and risky process, and it can also interrupt business processes. This fully managed service makes it easier to migrate databases to the cloud, from on-premises, or from one cloud service to another. What is AWS Database Migration Service?
The data lifecycle model ingests data using Kafka, enriches that data with a Spark-based batch process, performs deep data analytics using Hive and Impala, and finally uses that data for data science in Cloudera Data Science Workbench to get deep insights. OS: RHEL/CentOS/OEL 7.6/7.7/7.8.
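A minimal PySpark sketch of that lifecycle: read raw events from Kafka, enrich them in a batch job, and land them in a Hive table for Hive/Impala queries and downstream data science. Topic, table, and broker names are hypothetical, and cluster-specific (CDP) configuration is omitted.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("enrich-clickstream")
         .enableHiveSupport()
         .getOrCreate())

# Batch read of a Kafka topic (a bounded snapshot, not a streaming query).
raw = (spark.read.format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")
       .option("subscribe", "clickstream")
       .option("startingOffsets", "earliest")
       .load())

# Enrichment: parse the JSON payload and join against a reference table.
value = F.col("value").cast("string")
events = raw.select(
    F.get_json_object(value, "$.user_id").alias("user_id"),
    F.get_json_object(value, "$.url").alias("url"),
    F.col("timestamp"),
)
users = spark.table("dim.users")          # existing Hive dimension table
enriched = events.join(users, "user_id", "left")

# Persist for Hive/Impala analytics and data science notebooks.
enriched.write.mode("append").saveAsTable("analytics.clickstream_enriched")
```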
In this episode Tim Ward, CEO of CluedIn, explains the idea of eventual connectivity as a new paradigm for data integration. Rather than manually defining all of the mappings ahead of time, we can rely on the power of graph databases and some strategic metadata to allow connections to occur as the data becomes available.
This architecture is valuable for organizations dealing with large volumes of diverse data sources, where maintaining accuracy and accessibility at every stage is a priority. It sounds great, but how do you prove the data is correct at each layer? How do you ensure data quality in every layer?
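One simple way to start answering that question is to reconcile layers against each other as data moves through them. A hedged sketch follows, with hypothetical bronze/silver table names; a real pipeline would typically use a dedicated data-quality framework rather than bare assertions.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("layer-checks")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.table("bronze.orders_raw")
clean = spark.table("silver.orders")

# Volume check: the cleaned layer should never contain more rows than raw.
raw_count, clean_count = raw.count(), clean.count()
assert clean_count <= raw_count, "cleaned layer has more rows than raw"

# Uniqueness check: the primary key must be unique after deduplication.
dupes = clean.groupBy("order_id").count().filter("count > 1").count()
assert dupes == 0, f"{dupes} duplicate order_id values in silver.orders"

# Completeness check: critical columns must not be null downstream.
nulls = clean.filter("customer_id IS NULL").count()
assert nulls == 0, f"{nulls} rows missing customer_id in silver.orders"

print(f"raw={raw_count}, clean={clean_count}, checks passed")
```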
Seeing the future in a modern data architecture: The key to successfully navigating these challenges lies in the adoption of a modern data architecture. The promise of a modern data architecture might seem like a distant reality, but we at Cloudera believe data can make what is impossible today, possible tomorrow.
Summary The market for data warehouse platforms is large and varied, with options for every use case. ClickHouse is an open source, column-oriented database engine built for interactive analytics with linear scalability. Coming up this fall are the combined Graphorum and Data Architecture Summit events.
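A minimal sketch of ClickHouse's column-oriented, interactive-analytics model: create a MergeTree table ordered by time, load a few rows, and run an aggregation from Python. It uses the clickhouse-driver package; the host, table, and column names are hypothetical.

```python
from datetime import datetime
from clickhouse_driver import Client

client = Client(host="localhost")

# MergeTree is ClickHouse's workhorse storage engine; ORDER BY defines the
# physical sort order used for pruning and fast scans.
client.execute("""
    CREATE TABLE IF NOT EXISTS page_views (
        ts      DateTime,
        user_id UInt64,
        url     String
    ) ENGINE = MergeTree
    ORDER BY (ts, user_id)
""")

# clickhouse-driver takes the VALUES data as a separate argument.
client.execute(
    "INSERT INTO page_views (ts, user_id, url) VALUES",
    [(datetime.utcnow(), 1, "/home"), (datetime.utcnow(), 2, "/pricing")],
)

# Typical interactive query: only the columns involved are read from disk.
rows = client.execute(
    "SELECT url, count() AS views FROM page_views GROUP BY url ORDER BY views DESC"
)
print(rows)
```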
Summary Aerospike is a database engine that is designed to provide millisecond response times for queries across terabytes or petabytes. He also discusses the technical implementation that allows for such extreme performance and how the data model contributes to the scalability of the system.
Iceberg Tables bring the easy management and great performance of Snowflake to data stored externally in an open source format. The intuitive dashboard provides seamless navigation to desired databases and schemas, offering detailed reporting on tags and policies and the ability to take immediate action to apply them. Learn more here.
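A hedged sketch of what working with a Snowflake Iceberg table can look like from the Python connector, assuming an administrator has already configured an external volume pointing at object storage. The DDL follows Snowflake's documented CREATE ICEBERG TABLE shape as best I recall; the volume, database, and table names are hypothetical.

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="analyst", password="secret",
    warehouse="XS_WH", database="LAKE", schema="PUBLIC",
)
cur = conn.cursor()

# Snowflake-managed Iceberg table whose data and metadata live in the
# external volume, in open Iceberg/Parquet format.
cur.execute("""
    CREATE ICEBERG TABLE IF NOT EXISTS events_iceberg (
        event_id    BIGINT,
        event_type  STRING,
        occurred_at TIMESTAMP_NTZ
    )
    CATALOG = 'SNOWFLAKE'
    EXTERNAL_VOLUME = 'lake_s3_volume'
    BASE_LOCATION = 'events/'
""")

# Query it like any other Snowflake table, while the files remain open format.
cur.execute("SELECT event_type, COUNT(*) FROM events_iceberg GROUP BY 1")
print(cur.fetchall())
```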
Summary Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column-oriented storage engines, to the current generation of cloud-native analytical engines. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Over the past decade, the successful deployment of large scale data platforms at our customers has acted as a big data flywheel driving demand to bring in even more data, apply more sophisticated analytics, and on-board many new data practitioners from business analysts to data scientists.
Data pipelines are the backbone of your business’s data architecture. Implementing a robust and scalable pipeline ensures you can effectively manage, analyze, and organize your growing data. Understanding the essential components of data pipelines is crucial for designing efficient and effective data architectures.
Summary Data lakes have been gaining popularity alongside an increase in their sophistication and usability. Despite improvements in performance and data architecture, they still require significant knowledge and experience to deploy and manage.
Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values before it gets merged to production. No more shipping and praying, you can now know exactly what will change in your database! Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers, you don’t want to miss out on this year’s conference season.
Efficient data management is a crucial part of an enterprise’s success. Data architecture helps streamline data management and workloads. It also enables you to establish a data governance framework that improves data quality and yields useful business insights.
The Current State of the Data Architecture: S3 Intelligent-Tiering storage provides a fine balance between cost and how long data is retained. However, getting real-time insight from the most recent data remains a big challenge. The answer is the combination of stream processing with an OLAP store like Pinot.
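A hedged sketch of the "recent data" side of that combination: querying an Apache Pinot broker over its SQL REST endpoint for the last few minutes of events. The broker address, table, and column names are hypothetical, and the timestamp column is assumed to be stored as epoch milliseconds so it can be compared with Pinot's ago() function.

```python
import requests

# Aggregate the last five minutes of clickstream events.
query = """
    SELECT url, COUNT(*) AS views
    FROM clickstream
    WHERE ts > ago('PT5M')
    GROUP BY url
    ORDER BY views DESC
    LIMIT 10
"""

# Pinot brokers expose a SQL query endpoint (default port 8099).
resp = requests.post("http://pinot-broker:8099/query/sql", json={"sql": query})
resp.raise_for_status()
print(resp.json().get("resultTable"))
```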
As the databases professor at my university used to say, it depends. Using SQL to run your search might be enough for your use case, but as your project requirements grow and more advanced features are needed—for example, enabling synonyms, multilingual search, or even machine learning—your relational database might not be enough.
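A small illustration of that point: plain SQL pattern matching versus PostgreSQL's built-in full-text search, both of which may carry you a long way before a dedicated search engine is needed. It uses psycopg2; the articles table and its columns are hypothetical.

```python
import psycopg2

conn = psycopg2.connect("dbname=app user=app password=secret host=localhost")
cur = conn.cursor()

term = "lakehouse"

# Naive substring search: simple, but no stemming, ranking, or synonyms.
cur.execute("SELECT id, title FROM articles WHERE body ILIKE %s LIMIT 10",
            (f"%{term}%",))
print(cur.fetchall())

# Full-text search: tokenized, stemmed, and rankable.
cur.execute("""
    SELECT id, title,
           ts_rank(to_tsvector('english', body),
                   plainto_tsquery('english', %s)) AS rank
    FROM articles
    WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %s)
    ORDER BY rank DESC
    LIMIT 10
""", (term, term))
print(cur.fetchall())
```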
You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council.
As lakehouse architectures (including offerings from Cloudera and IBM) become the norm for data processing and building AI applications, a robust streaming service becomes a critical building block for modern data architectures. Apache Kafka has evolved into the most widely used streaming platform, capable of ingesting and processing (…)