Remove Data Warehouse Remove Definition Remove Metadata
article thumbnail

How to get started with dbt

Christophe Blefari

dbt Core is an open-source framework that helps you organise data warehouse SQL transformation. dbt was born out of the analysis that more and more companies were switching from on-premise Hadoop data infrastructure to cloud data warehouses. This switch has been lead by modern data stack vision.

article thumbnail

Keeping Your Data Warehouse In Order With DataForm

Data Engineering Podcast

Summary Managing a data warehouse can be challenging, especially when trying to maintain a common set of patterns. What are some of the challenges and mistakes that are common among engineers and analysts with regard to versioning and evolving schemas and the accompanying data?

article thumbnail

Optimizing data warehouse storage

Netflix Tech

By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. Some of the optimizations are prerequisites for a high-performance data warehouse.

article thumbnail

The View Below The Waterline Of Apache Iceberg And How It Fits In Your Data Lakehouse

Data Engineering Podcast

Summary Cloud data warehouses have unlocked a massive amount of innovation and investment in data applications, but they are still inherently limiting. Because of their complete ownership of your data they constrain the possibilities of what data you can store and how it can be used.

IT 147
article thumbnail

Ready-to-go sample data pipelines with Dataflow

Netflix Tech

The most commonly used one is dataflow project , which helps folks in managing their data pipeline repositories through creation, testing, deployment and few other activities. It lets you create YAML formatted mock data files based on selected tables, columns and a few rows of data from the Netflix data warehouse.

article thumbnail

Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery

Data Engineering Podcast

Select Star’s data discovery platform solves that out of the box, with an automated catalog that includes lineage from where the data originated, all the way to which dashboards rely on it and who is viewing them every day.

Metadata 100
article thumbnail

The Rise of the Data Engineer

Maxime Beauchemin

Data modeling is changing Typical data modeling techniques — like the star schema  — which defined our approach to data modeling for the analytics workloads typically associated with data warehouses, are less relevant than they once were.