Remove Hadoop Remove Python Remove SQL
article thumbnail

How to get started with dbt

Christophe Blefari

dbt Core is an open-source framework that helps you organise data warehouse SQL transformation. dbt was born out of the analysis that more and more companies were switching from on-premise Hadoop data infrastructure to cloud data warehouses. Jinja templating — Jinja is a templating engine that seems to exist forever in Python.

article thumbnail

Hadoop vs Spark: Main Big Data Tools Explained

AltexSoft

Hadoop and Spark are the two most popular platforms for Big Data processing. To come to the right decision, we need to divide this big question into several smaller ones — namely: What is Hadoop? To come to the right decision, we need to divide this big question into several smaller ones — namely: What is Hadoop? scalability.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Reflecting On The Past 6 Years Of Data Engineering

Data Engineering Podcast

__init__ covers the Python language, its community, and the innovative ways it is being used. Materialize’s PostgreSQL-compatible interface lets users leverage the tools they already use, with unsurpassed simplicity enabled by full ANSI SQL support. Closing Announcements Thank you for listening!

article thumbnail

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Your first 30 days are free!

Data Lake 162
article thumbnail

Simplify Your Data Architecture With The Presto Distributed SQL Engine

Data Engineering Podcast

Your host is Tobias Macey and today I’m interviewing Martin Traverso about PrestoSQL, a distributed SQL engine that queries data in place Interview Introduction How did you get involved in the area of data management? __init__ to learn about the Python language, its community, and the innovative ways it is being used.

article thumbnail

Modern Customer Data Platform Principles

Data Engineering Podcast

For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Closing Announcements Thank you for listening!

Data Lake 147
article thumbnail

How to learn data engineering

Christophe Blefari

Hadoop initially led the way with Big Data and distributed computing on-premise to finally land on Modern Data Stack — in the cloud — with a data warehouse at the center. In order to understand today's data engineering I think that this is important to at least know Hadoop concepts and context and computer science basics.