Data Architecture, Metadata and Structured Data

Data Architecture

Metadata

Structured Data

How Apache Iceberg Is Changing the Face of Data Lakes

Snowflake

APRIL 2, 2025

Data storage has been evolving, from databases to data warehouses and expansive data lakes, with each architecture responding to different business and data needs. Traditional databases excelled at structured data and transactional workloads but struggled with performance at scale as data volumes grew.

Data Lake

Data Lake Cloud Storage Metadata Data Warehouse

The Future Is Hybrid Data, Embrace It

Cloudera

JUNE 7, 2022

We live in a hybrid data world. In the past decade, the amount of structured data created, captured, copied, and consumed globally has grown from less than 1 ZB in 2011 to nearly 14 ZB in 2020. Impressive, but dwarfed by the amount of unstructured data, cloud data, and machine data – another 50 ZB.

IT Unstructured Data Data Architecture Government

Join 37,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

Snowflake

NOVEMBER 2, 2023

And, since historically tools and commercial platforms were often designed to align with one specific architecture pattern, organizations struggled to adapt to changing business needs – which of course has implications on data architecture. The schema of semi-structured data tends to evolve over time.

Data Lake

Data Lake Data Warehouse Cloud Unstructured Data

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

Those decentralization efforts appeared under different monikers through time, e.g., data marts versus data warehousing implementations (a popular architectural debate in the era of structured data) then enterprise-wide data lakes versus smaller, typically BU-Specific, “data ponds”.

Architecture

Architecture Metadata Kafka Government

Hands-On Introduction to Delta Lake with (py)Spark

Towards Data Science

FEBRUARY 15, 2023

In this context, data management in an organization is a key point for the success of its projects involving data. One of the main aspects of correct data management is the definition of a data architecture. show() The history object is a Spark Data Frame. delta_table.history().select("version",

Data Lake

Data Lake Data Warehouse Hadoop Architecture

The Symbiotic Relationship Between AI and Data Engineering

Ascend.io

FEBRUARY 28, 2024

Read More: AI Data Platform: Key Requirements for Fueling AI Initiatives How Data Engineering Enables AI Data engineering is the backbone of AI’s potential to transform industries , offering the essential infrastructure that powers AI algorithms.

Data Engineering

Data Engineering Data Engineer Engineering Metadata

5 Reasons Data Discovery Platforms Are Best For Data Lakes

Monte Carlo

APRIL 1, 2021

Data Catalogs Can Drown in a Data Lake Although exceptionally flexible and scalable, data lakes lack the organization necessary to facilitate proper metadata management and data governance. Data discovery tools and platforms can help. Interested in learning how to scale data discovery across your data lake?

Data Lake

Data Lake Data Warehouse Unstructured Data Government

Data Lakehouse: Concept, Key Features, and Architecture Layers

AltexSoft

NOVEMBER 10, 2021

The pun being obvious, there’s more to that than just a new term: Data lakehouses combine the best features of both data lakes and data warehouses and this post will explain this all. What is a data lakehouse? Traditional data warehouse platform architecture. Unstructured and streaming data support.

Architecture

Architecture Data Lake Data Warehouse Metadata

Unstructured Data: Examples, Tools, Techniques, and Best Practices

AltexSoft

MAY 12, 2023

What is unstructured data? Definition and examples Unstructured data , in its simplest form, refers to any data that does not have a pre-defined structure or organization. It can come in different forms, such as text documents, emails, images, videos, social media posts, sensor data, etc.

Unstructured Data

Unstructured Data NoSQL Hadoop Data Lake

Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

AltexSoft

AUGUST 29, 2023

Instead of relying on traditional hierarchical structures and predefined schemas, as in the case of data warehouses, a data lake utilizes a flat architecture. This structure is made efficient by data engineering practices that include object storage. Watch our video explaining how data engineering works.

Data Lake

Data Lake Architecture IT Amazon Web Services

Data Engineering Glossary

Silectis

JANUARY 3, 2021

Big Query Google’s cloud data warehouse. Data Architecture Data architecture is a composition of models, rules, and standards for all data systems and interactions between them. Data Catalog An organized inventory of data assets relying on metadata to help with data management.

Data Engineering

Data Engineering Data Engineer Engineering Hadoop

Three Reference Architectures for Real-Time Analytics On Streaming Data

Rockset

APRIL 26, 2023

We’ve noticed many common patterns across streaming data architectures and we’ll be sharing a blueprint for three of the most popular: anomaly detection, IoT, and recommendations. This will simplify your architecture considerably, deliver low-latency recommendation results, and enable metadata filtering.

Architecture

Architecture Transportation Data Lake Insurance

How Windward Built Real-Time Logistics Tracking and AI Insights for the Maritime Industry

Rockset

AUGUST 2, 2023

The steps Windward takes to create proprietary data and AI insights As Windward operated in a batch-based data stack, they stored raw data in S3. They used MongoDB as their metadata store to capture vessel and company data.

Database-centric

Database-centric PostgreSQL Transportation Insurance

Snowflake Architecture and It's Fundamental Concepts

ProjectPro

JANUARY 31, 2022

Snowflake in Action at Western Union Snowflake's multi-cluster shared data architecture expanded instantaneously to serve Western Union's data, users, and workloads without causing resource conflict. Snowflake saves and manages data on the cloud using a shared-disk approach, making data management simple.

Architecture

Architecture IT Data Warehouse Amazon Web Services

The Good and the Bad of Databricks Lakehouse Platform

AltexSoft

MARCH 30, 2023

What is Databricks Databricks is an analytics platform with a unified set of tools for data engineering, data management , data science, and machine learning. It combines the best elements of a data warehouse, a centralized repository for structured data, and a data lake used to host large amounts of raw data.

Scala

Scala Data Lake Machine Learning BI

Case Study: Standard Cognition Uses Rockset to Deliver Data APIs and Real-Time Metrics for Vision AI

Rockset

JANUARY 28, 2020

Aside from video data from each camera-equipped store, Standard deals with other data sets such as transactional data, store inventory data that arrive in different formats from different retailers, and metadata derived from the extensive video captured by their cameras.

Retail

Retail Google Cloud Raw Data SQL

Data Governance: Concept, Models, Framework, Tools, and Implementation Best Practices

AltexSoft

MARCH 2, 2023

Data modeling involves creating a conceptual representation of data objects and their relationships to each other, as well as the rules governing those relationships. To design an effective data governance program, it’s crucial to choose an operational model that fits your business size and structure.

Data Governance

Data Governance Government Programming Healthcare

100+ Big Data Interview Questions and Answers 2023

ProjectPro

JANUARY 31, 2023

Data Variety Hadoop stores structured, semi-structured and unstructured data. RDBMS stores structured data. Data storage Hadoop stores large data sets. RDBMS stores the average amount of data. Works with only structured data. Hardware Hadoop uses commodity hardware.

Big Data

Big Data Hadoop Relational Database AWS

Moving Past ETL and ELT: Understanding the EtLT Approach

Ascend.io

AUGUST 31, 2023

In the dynamic world of data, many professionals are still fixated on traditional patterns of data warehousing and ETL, even while their organizations are migrating to the cloud and adopting cloud-native data services. Central to this transformation are two shifts.

Data Lake

Data Lake Data Warehouse ETL Tools Data Pipeline

The Good and the Bad of Apache Spark Big Data Processing

AltexSoft

JULY 18, 2023

Spark SQL brings native support for SQL to Spark and streamlines the process of querying semistructured and structured data. When transformations are applied to RDDs, Spark records the metadata to build up a DAG, which reflects the sequence of computations performed during the execution of the Spark job.

Big Data

Big Data Data Process Process Hadoop

Data Mesh Architecture: Concept, Main Principles, and Implementation

AltexSoft

JULY 19, 2022

What data mesh is and is not. What data mesh IS. Data mesh is a set of principles for designing a modern distributed data architecture that focuses on business domains, not the technology used, and treats data as a product. For example, your organization has an HR platform that produces employee data.

Architecture

Architecture Data Lake Medical Datasets

Data Pipeline Architecture Explained: 6 Diagrams and Best Practices

Monte Carlo

JUNE 14, 2023

Data observability platforms reduce your data downtime up to 80% and make your data engineers 30% more time efficient by replacing static, cumbersome data testing with machine learning models that can help detect, resolve, and prevent data issues. Let the data drive the data pipeline architecture.

Data Pipeline

Data Pipeline Architecture Data Lake Data Warehouse

The Ultimate Modern Data Stack Migration Guide

phData: Data Engineering

JULY 18, 2023

Demands on the cloud data warehouse are also evolving to require it to become more of an all-in-one platform for an organization’s analytics needs. Enter Snowflake The Snowflake Data Cloud is one of the most popular and powerful CDW providers.

Data Warehouse

Data Warehouse Pipeline-centric Government Data

50 Artificial Intelligence Interview Questions and Answers [2023]

ProjectPro

OCTOBER 20, 2021

Data Integration at Scale Most data architectures rely on a single source of truth. Having multiple data integration routes helps optimize the operational as well as analytical use of data. A feature store is a modern, elegant solution to leverage data prep work from previous runs or other teams as well.

Machine Learning

Machine Learning Algorithm Data Science Government

Data Engineering Digest

How Apache Iceberg Is Changing the Face of Data Lakes

The Future Is Hybrid Data, Embrace It

Trending Sources

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Hands-On Introduction to Delta Lake with (py)Spark

The Symbiotic Relationship Between AI and Data Engineering

5 Reasons Data Discovery Platforms Are Best For Data Lakes

Data Lakehouse: Concept, Key Features, and Architecture Layers

Unstructured Data: Examples, Tools, Techniques, and Best Practices

Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

Data Engineering Glossary

Three Reference Architectures for Real-Time Analytics On Streaming Data

How Windward Built Real-Time Logistics Tracking and AI Insights for the Maritime Industry

Snowflake Architecture and It's Fundamental Concepts

The Good and the Bad of Databricks Lakehouse Platform

Case Study: Standard Cognition Uses Rockset to Deliver Data APIs and Real-Time Metrics for Vision AI

Data Governance: Concept, Models, Framework, Tools, and Implementation Best Practices

100+ Big Data Interview Questions and Answers 2023

Moving Past ETL and ELT: Understanding the EtLT Approach

The Good and the Bad of Apache Spark Big Data Processing

Data Mesh Architecture: Concept, Main Principles, and Implementation

Data Pipeline Architecture Explained: 6 Diagrams and Best Practices

The Ultimate Modern Data Stack Migration Guide

50 Artificial Intelligence Interview Questions and Answers [2023]

Stay Connected