How to get started with dbt

Christophe Blefari

dbt Core is an open-source framework that helps you organise SQL transformations in your data warehouse. dbt was born from the observation that more and more companies were switching from on-premise Hadoop infrastructure to cloud data warehouses, a shift led by the modern data stack vision.

Data logs: The latest evolution in Meta’s access tools

Engineering at Meta

Meta joined the Data Transfer Project and has since helped lead the development of shared technologies that enable users to port their data from one platform to another. As of 2024, users can access data logs in Download Your Information. What are data logs?


Databricks, Snowflake and the future

Christophe Blefari

Snowflake was founded in 2012 around its data warehouse product, which is still its core offering. Databricks was founded in 2013 by the academic researchers who co-created Spark, which became a top-level Apache project in 2014. Databricks' table format adds metadata plus read, write and transaction support that lets you treat a set of Parquet files as a table.
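The "treat Parquet files as a table" idea rests on a transaction log layered over immutable data files. The toy sketch below illustrates only the concept with plain files and JSON; it is not the real on-disk layout of any table format, and names like `_log.json` are made up for illustration.

```python
import json
import os
import tempfile

# Toy transaction log over immutable data files (the idea behind
# modern lakehouse table formats; NOT a real on-disk layout).
# Each commit appends a log entry naming the files it added, so
# readers reconstruct "the table" as a consistent set of files.

def commit(table_dir, added_files):
    """Record a new table version by appending to the log."""
    log_path = os.path.join(table_dir, "_log.json")
    log = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            log = json.load(f)
    log.append({"version": len(log), "add": added_files})
    with open(log_path, "w") as f:
        json.dump(log, f)

def current_files(table_dir):
    """Readers only see files named in the log, so half-written
    data files that were never committed stay invisible."""
    log_path = os.path.join(table_dir, "_log.json")
    if not os.path.exists(log_path):
        return []
    with open(log_path) as f:
        log = json.load(f)
    return [name for entry in log for name in entry["add"]]

table = tempfile.mkdtemp()
# Write an immutable data file, then make it visible with a commit.
with open(os.path.join(table, "part-0.parquet"), "w") as f:
    f.write("fake parquet bytes")
commit(table, ["part-0.parquet"])
print(current_files(table))  # ['part-0.parquet']
```

This log-then-read separation is what gives such formats transactional behaviour on top of files that are themselves never modified in place.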

The Downfall of the Data Engineer

Maxime Beauchemin

Consensus seeking: Whether you think that old-school data warehousing concepts are fading or not, the quest to achieve conformed dimensions and conformed metrics is as relevant as it ever was. The data warehouse needs to reflect the business, and the business should have clarity on how it thinks about analytics.

5 Helpful Extract & Load Practices for High-Quality Raw Data

Meltano

Setting the stage: we need E&L practices because "copying raw data" is more complex than it sounds. For instance, how would you know which orders got "canceled", when cancellation usually just modifies the same data record in place?
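The in-place update problem the teaser raises can be made concrete: a naive incremental load that only picks up new rows never sees an existing order flip to "canceled". One common mitigation, sketched here with plain dicts rather than any particular EL tool, is diffing successive snapshots by primary key to turn silent mutations into explicit change events.

```python
# Sketch: diff two snapshots of a source table, keyed by primary
# key, to recover inserts, updates (e.g. cancellations that modify
# a row in place), and deletes as explicit change events.

def diff_snapshots(old, new):
    """Compare two {pk: row} snapshots and emit change events."""
    changes = []
    for pk, row in new.items():
        if pk not in old:
            changes.append(("insert", pk, row))
        elif old[pk] != row:
            changes.append(("update", pk, row))
    for pk in old:
        if pk not in new:
            changes.append(("delete", pk, old[pk]))
    return changes

yesterday = {1: {"status": "open"}, 2: {"status": "open"}}
today = {1: {"status": "canceled"}, 2: {"status": "open"}}
print(diff_snapshots(yesterday, today))
# [('update', 1, {'status': 'canceled'})]
```

Snapshot diffing trades storage (you keep two full extracts) for correctness; change-data-capture from the database log is the heavier-weight alternative.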

Functional Data Engineering — a modern paradigm for batch data processing

Maxime Beauchemin

While business rules evolve constantly, and while corrections and adjustments to the process are more the rule than the exception, it’s important to insulate compute logic changes from data changes and have control over all of the moving parts. But how do we model this in a functional data warehouse without mutating data?
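The functional pattern the article describes can be sketched in a few lines: a task is a pure function of its inputs and execution date, and its only side effect is overwriting one immutable partition. The `warehouse` dict and function names below are illustrative stand-ins, not any framework's API.

```python
# Sketch of functional batch processing: pure compute, idempotent
# partition overwrite. Rerunning a task after a logic fix replaces
# its partition deterministically instead of appending or mutating.

def transform(rows, execution_date):
    """Pure compute step: same inputs always give the same output."""
    return [{"day": execution_date,
             "total": sum(r["amount"] for r in rows)}]

def run_partition(store, execution_date, rows):
    """Idempotent load: overwrite the partition for this date."""
    store[execution_date] = transform(rows, execution_date)
    return store

warehouse = {}
run_partition(warehouse, "2024-01-01", [{"amount": 10}, {"amount": 5}])
# A backfill after a bug fix just reruns the same date: the result
# is identical, and no duplicate or mutated rows are left behind.
run_partition(warehouse, "2024-01-01", [{"amount": 10}, {"amount": 5}])
print(warehouse["2024-01-01"])  # [{'day': '2024-01-01', 'total': 15}]
```

Because partitions are only ever replaced wholesale, compute-logic changes can be rolled out by rerunning dates, with no in-place mutation of historical data.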

The Hidden Threats in Your Data Warehouse Layers (And How to Fix Them)

Monte Carlo

Data warehouses are the centralized repositories that store and manage data from various sources. They are integral to an organization’s data strategy, ensuring data accessibility, accuracy, and utility. However, beneath their surface lies a host of invisible risks embedded within the data warehouse layers.