Data Engineering Best Practices - #2. Metadata & Logging

Start Data Engineering

Topics covered: Introduction; Setup & Logging architecture; Data Pipeline Logging Best Practices; Metadata: information about pipeline runs and the data flowing through your pipeline; Obtain visibility into the code’s execution sequence using text logs; Monitoring UI & Traceability.

Metadata 130

Ready-to-go sample data pipelines with Dataflow

Netflix Tech

by Jasmine Omeke, Obi-Ike Nwoke, Olek Gorajek. Intro: This post is for all data practitioners who are interested in learning about bootstrapping, standardization, and automation of batch data pipelines at Netflix. You may remember Dataflow from the post we wrote last year titled "Data pipeline asset management with Dataflow."

Level Up Your Data Platform With Active Metadata

Data Engineering Podcast

Summary: Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. To increase its value, a new trend of active metadata is being implemented, enabling use cases such as keeping BI reports up to date, auto-scaling your warehouses, and automated data governance.

Metadata 130

Build your data pipelines like the Toyota Way

François Nguyen

Today, we are going to apply these principles to data pipelines. The idea is to transpose these 7 principles to data pipelines, knowing that data pipelines are 100% flexible: if you have the skills, you build the pipeline you want. What does a bad data pipeline process look like?

Data News — Week 24.11

Christophe Blefari

Attributing Snowflake cost to whom it belongs: Fernando gives ideas about metadata management to better attribute Snowflake costs. Forward thinking. Dataviz is hierarchical: Malloy, once again, provides an excellent article about a new way to see data visualisations. This is Croissant. It's inspirational.

Metadata 272

Declarative Data Pipelines with Hoptimator

LinkedIn Engineering

However, we've found that this vertical self-service model doesn't work particularly well for data pipelines, which involve wiring together many different systems into end-to-end data flows. Data pipelines power foundational parts of LinkedIn's infrastructure, including replication between data centers.

Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata

Data Engineering Podcast

Summary: A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. What are the common challenges faced by engineers and data practitioners in organizing the metadata for their systems? What are the goals of the project?

Metadata 100