Summary Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. To level up its value, a new trend of active metadata is being adopted, enabling use cases like keeping BI reports up to date, auto-scaling your warehouses, and automating data governance.
In this blog, we’ll highlight the key CDP aspects that provide data governance and lineage and show how they can be extended to incorporate metadata for non-CDP systems from across the enterprise. The SDX layer of CDP leverages the full spectrum of Atlas to automatically track and control all data assets. Assets: Files.
Summary The binding element of all data work is the metadata graph that is generated by all of the workflows that produce the assets used by teams across the organization. The DataHub project was created as a way to bring order to the scale of LinkedIn’s data needs. How is the governance of DataHub being managed?
These incidents serve as a stark reminder that legacy data governance systems, built for a bygone era, are struggling to fend off modern cyber threats. They react too slowly, too rigidly, and can't keep pace with the dynamic, sophisticated attacks occurring today, leaving hackable data exposed.
As we look towards 2025, it’s clear that data teams must evolve to meet the demands of evolving technology and opportunities. In this blog post, we’ll explore key strategies that data teams should adopt to prepare for the year ahead. The anticipated growth in data pipelines presents both challenges and opportunities.
What kinds of questions are you answering with table metadata, and what use case/team does that support? What is the comparative utility of the Iceberg REST catalog? What are the shortcomings of Trino and Iceberg? What were the requirements and selection criteria that led to the selection of that combination of technologies? Want to see Starburst in action?
TL;DR After setting up and organizing the teams, we describe four topics to make data mesh a reality. How do we build data products? How can we interoperate between the data domains? Do we want interoperability for data stored anywhere, or do we have to think about how to store the data in a specific node to optimize processing?
Canva writes about its custom solution using dbt and metadata capturing to attribute costs, monitor performance, and enable data-driven decision-making, significantly enhancing its Snowflake environment management. link] Grab: Metasense V2 - Enhancing, improving, and productionisation of LLM-powered data governance.
To finish the trilogy (DataOps, MLOps), let’s talk about DataGovOps, or how you can support your data governance initiative. In every step, we do not just read, transform, and write data; we also do that with the metadata. In the last part, the data security and privacy aspects were added. What data do we have?
Metadata is the information that provides context and meaning to data, ensuring it’s easily discoverable, organized, and actionable. It enhances data quality, governance, and automation, transforming raw data into valuable insights. This is what managing data without metadata feels like.
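To make the idea concrete, here is a minimal sketch of a metadata record that makes a raw dataset discoverable and actionable. The `DatasetMetadata` class, its fields, and the example dataset are all hypothetical illustrations, not a real catalog API.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    """Descriptive metadata that makes a raw dataset discoverable and governable."""
    name: str
    owner: str
    description: str
    schema: dict                      # column name -> type
    tags: list = field(default_factory=list)

    def matches(self, keyword: str) -> bool:
        """Simple discovery: does this dataset match a search keyword?"""
        haystack = f"{self.name} {self.description} {' '.join(self.tags)}".lower()
        return keyword.lower() in haystack

# A hypothetical dataset registered with its context attached.
orders = DatasetMetadata(
    name="orders_daily",
    owner="sales-analytics",
    description="Daily order totals per region",
    schema={"region": "string", "order_date": "date", "total": "decimal"},
    tags=["sales", "pii-free"],
)

print(orders.matches("sales"))  # discoverable by tag
```

Without the record, the same Parquet files would be anonymous bytes; with it, a search for "sales" finds the table, its owner, and its schema.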
Data governance refers to the set of policies, procedures, people, and standards that organisations put in place to manage their data assets. It involves establishing a framework for data management that ensures data quality, privacy, security, and compliance with regulatory requirements.
This mission culminates in the esteemed recognition of honorable mention in Gartner’s 2023 Magic Quadrant for Data Integration, showcasing its commitment to excellence and industry leadership in the data-driven era. Data engineering excellence Modern offers robust solutions for building, managing, and operationalizing data pipelines.
Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: It is a Game-Changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform data into a consistent format for users to consume.
Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code.
This episode is brought to you by Starburst - a data lake analytics platform for data engineers who are battling to build and scale high quality data pipelines on the data lake.
In this article, Juan Sequeda gives perhaps one of the best definitions of Data Mesh: “It is a paradigm shift towards a distributed architecture that attempts to find an ideal balance between centralization and decentralization of metadata and data management.”
To give customers flexibility for how they fit Snowflake into their data architecture, Iceberg Tables can be configured to use either Snowflake or an external service such as AWS Glue as the table’s catalog to track metadata, with an easy, one-line SQL command to convert the table’s catalog to Snowflake in a metadata-only operation.
Atlan is the metadata hub for your data ecosystem. Instead of locking all of that information into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Go to dataengineeringpodcast.com/atlan today to learn more about how you can take advantage of active metadata and escape the chaos.
Application Logic: Application logic refers to the type of data processing and can be anything from analytical or operational systems to data pipelines that ingest data inputs, apply transformations based on some business logic, and produce data outputs.
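The ingest-transform-output shape described above can be sketched in a few lines. Everything here is a hypothetical toy: real pipelines would read from files, APIs, or queues rather than an inline list, and the 10% fee rule stands in for arbitrary business logic.

```python
def ingest():
    # Stand-in for reading from files, APIs, or message queues.
    return [
        {"customer": "a", "amount": 120.0},
        {"customer": "b", "amount": -5.0},   # invalid: negative amount
        {"customer": "c", "amount": 80.0},
    ]

def transform(records):
    # Business logic: drop invalid rows and compute a 10% fee column.
    return [
        {**r, "fee": round(r["amount"] * 0.10, 2)}
        for r in records
        if r["amount"] > 0
    ]

def run_pipeline():
    # Output stage: here we just return the results; a real pipeline
    # would write them to a warehouse table or downstream topic.
    return transform(ingest())

print(run_pipeline())
```

The value of framing application logic this way is that each stage can be tested, monitored, and replaced independently.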
Understanding that the future of banking is data-driven and cloud-based, Bank of the West embraced cloud computing and its benefits, like remote capabilities, integrated processes, and flexible systems. The platform is centralizing the data, data management & governance, and building custom controls for data ingestion into the system.
Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Once you sign up and create an alert in Datafold for your company data, they will send you a cool water flask.
In this post, we will help you quickly level up your overall knowledge of data pipeline architecture by reviewing: What is data pipeline architecture? Why is data pipeline architecture important?
In this episode Grant Seward explains how he built Tree Schema to be an easy to use and cost effective option for organizations to build their data catalogs. He also shares the internal architecture, how he approached the design to make it accessible and easy to use, and how it autodiscovers the schemas and metadata for your source systems.
However, their importance has grown significantly in recent years due to the increasing complexity of data architectures and the growing need for data governance and compliance. Data lineage provides context for data, making it easier to understand and manage. In this article: Why Are Data Lineage Tools Important?
Through Cloudera’s contributions, we have extended support for Hive and Impala, delivering on the vision of a data architecture for multi-function analytics from large scale data engineering (DE) workloads and stream processing (DF) to fast BI and querying (within DW) and machine learning (ML).
Integrated data catalog for metadata support As you build out your IT ecosystem, it’s important to leverage tools that have the capabilities to support forward-looking use cases. A notable capability that achieves this is the data catalog. If so, how do you combine that metadata with other data across the enterprise?
Data architecture is the organization and design of how data is collected, transformed, integrated, stored, and used by a company. What is the main difference between a data architect and a data engineer? This privacy law must be kept in mind when building data architecture.
Watch Preparing for a Data Mesh Strategy Key pillars when preparing for a data mesh strategy include: A mature data governance strategy to manage and organize a decentralized data system. Proper governance ensures that data is uniformly accessible and the appropriate security measures are met.
Whether you’re a data scientist, data engineer, or business analyst, keeping track of your data’s origin, transformation, and movement is crucial for maintaining transparency, enforcing data governance, and ensuring data quality. Why do you need data lineage?
Grab’s Metasense , Uber’s DataK9 , and Meta’s classification systems use AI to automatically categorize vast data sets, reducing manual efforts and improving accuracy. Beyond classification, organizations now use AI for automated metadata generation and data lineage tracking, creating more intelligent data infrastructures.
With the monolithic architectures most organizations have today, business users are stuck, constantly waiting for new data pipelines to be built or amended based on their requests. Data engineers aren’t huge fans of this paradigm either. Anyone can query the metadata any time anywhere to get the information they need.
Take Astro (the fully managed Airflow solution) for a test drive today and unlock a suite of features designed to simplify, optimize, and scale your data pipelines. Try For Free → Conference Alert: Data Engineering for AI/ML This is a virtual conference at the intersection of Data and AI.
At its core, a table format is a sophisticated metadata layer that defines, organizes, and interprets multiple underlying data files. Table formats incorporate aspects like columns, rows, data types, and relationships, but can also include information about the structure of the data itself.
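A toy illustration of that idea: a metadata layer mapping one logical table onto its underlying data files, with per-file statistics an engine can use to skip files. The dictionary structure, field names, and paths below are illustrative only, loosely inspired by formats like Apache Iceberg, not any format's real specification.

```python
# A toy "table format": logical schema plus the physical files it interprets.
table = {
    "name": "events",
    "schema": [
        {"name": "event_id", "type": "long"},
        {"name": "event_type", "type": "string"},
        {"name": "ts", "type": "timestamp"},
    ],
    # The metadata layer tracks which files make up the table, plus
    # per-file column statistics that enable file pruning at query time.
    "data_files": [
        {"path": "s3://bucket/events/part-0.parquet", "rows": 1_000,
         "stats": {"ts": {"min": "2024-01-01", "max": "2024-01-31"}}},
        {"path": "s3://bucket/events/part-1.parquet", "rows": 2_500,
         "stats": {"ts": {"min": "2024-02-01", "max": "2024-02-29"}}},
    ],
}

def prune_files(table, ts_from, ts_to):
    """Use file-level min/max stats to skip files outside a time-range filter."""
    return [
        f["path"] for f in table["data_files"]
        if not (f["stats"]["ts"]["max"] < ts_from or f["stats"]["ts"]["min"] > ts_to)
    ]

print(prune_files(table, "2024-02-01", "2024-02-15"))  # only part-1 survives
```

This is why a table format is more than a file listing: the statistics it carries let engines answer queries while reading only a fraction of the data.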
Inability to maintain context – This is the worst of them all because every time a data set or workload is re-used, you must recreate its context including security, metadata, and governance. Alternatively, you can also spin up a different compute cluster and access the data by using CDP’s Shared Data Experience.
A shorter time-to-value indicates that your organization is efficient at processing and analyzing data for decision-making purposes. Monitoring this metric helps identify bottlenecks in the datapipeline and ensures timely insights are available for business users.
This category is open to organizations that have tackled transformative business use cases by connecting multiple parts of the data lifecycle to enrich, report, serve, and predict. DATA FOR ENTERPRISE AI. SECURITY AND GOVERNANCE LEADERSHIP.
At DataKitchen, we think of this as a ‘meta-orchestration’ of the code and tools acting upon the data. Data Pipeline Observability: Optimizes pipelines by monitoring data quality, detecting issues, tracing data lineage, and identifying anomalies using live and historical metadata.
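One common way to use historical metadata for anomaly detection is a simple z-score check on per-run row counts. The sketch below is a generic illustration under that assumption, not DataKitchen's actual implementation; the function name and threshold are made up.

```python
import statistics

def detect_volume_anomaly(historical_row_counts, todays_count, z_threshold=3.0):
    """Flag today's load if its row count deviates too far from the historical mean."""
    mean = statistics.mean(historical_row_counts)
    stdev = statistics.stdev(historical_row_counts)
    if stdev == 0:
        return todays_count != mean
    z = abs(todays_count - mean) / stdev
    return z > z_threshold

# Hypothetical row counts captured as metadata from the last five runs.
history = [10_100, 9_950, 10_200, 10_050, 9_900]
print(detect_volume_anomaly(history, 10_080))  # normal day -> False
print(detect_volume_anomaly(history, 2_300))   # suspicious drop -> True
```

The same pattern extends to null rates, schema drift, or freshness lag: record the metric on every run, then compare new runs against the accumulated history.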
The data teams share a common objective: to create analytics for the (internal or external) customer. Execution of this mission requires the contribution of several groups: data center/IT, data engineering, data science, data visualization, and data governance.
Tech Target defines a data silo as a repository of data controlled by one department or business unit and, therefore, not wholly or easily accessible by other departments within the same organisation. Silos often stem from the absence or poor adoption of company-wide guidelines surrounding the creation and deployment of data products.
Let’s dig in and explore the landscape of the top so-called “data quality tools” — what they are, what they’re not, and whether they’re the right first step towards more reliable data. Governance helps companies set important standards and achieve higher levels of data security, data accessibility, and data quality.
Poor data quality: The lack of automation and data governance in legacy architectures can lead to data quality issues, such as incomplete, inaccurate, or duplicate data. This requires implementing robust data integration tools and practices, such as data validation, data cleansing, and metadata management.
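A minimal sketch of what such a validation and cleansing pass might look like, targeting exactly the issues named above: incomplete, inaccurate, and duplicate records. The function, field names, and rules are hypothetical examples, not any particular tool's API.

```python
def validate_and_clean(records):
    """Split raw records into cleaned rows and rejected rows with reasons."""
    seen_ids = set()
    clean, rejected = [], []
    for r in records:
        # Completeness: required fields must be present and non-empty.
        if not r.get("id") or not r.get("email"):
            rejected.append((r, "missing required field"))
            continue
        # Accuracy: a very rough email shape check.
        if "@" not in r["email"]:
            rejected.append((r, "malformed email"))
            continue
        # Deduplication: keep only the first record per id.
        if r["id"] in seen_ids:
            rejected.append((r, "duplicate id"))
            continue
        seen_ids.add(r["id"])
        # Cleansing: normalize casing and whitespace.
        clean.append({**r, "email": r["email"].strip().lower()})
    return clean, rejected

raw = [
    {"id": 1, "email": " Alice@Example.com "},
    {"id": 2, "email": "bob-at-example.com"},   # malformed
    {"id": 1, "email": "alice@example.com"},    # duplicate
    {"id": 3, "email": ""},                     # incomplete
]
clean, rejected = validate_and_clean(raw)
print(len(clean), len(rejected))  # 1 3
```

Keeping the rejected rows with reasons, rather than silently dropping them, is what turns a one-off cleanup into auditable governance.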