5 Free Courses to Master Data Engineering
KDnuggets
NOVEMBER 30, 2023
Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company.
This site uses cookies to improve your experience. To help us insure we adhere to various privacy regulations, please select your country/region of residence. If you do not select a country, we will assume you are from the United States. Select your Cookie Settings or view our Privacy Policy and Terms of Use.
Cookies and similar technologies are used on this website for proper function of the website, for tracking performance analytics and for marketing purposes. We and some of our third-party providers may use cookie data for various purposes. Please review the cookie settings below and choose your preference.
Used for the proper function of the website
Used for monitoring website traffic and interactions
Cookies and similar technologies are used on this website for proper function of the website, for tracking performance analytics and for marketing purposes. We and some of our third-party providers may use cookie data for various purposes. Please review the cookie settings below and choose your preference.
KDnuggets
NOVEMBER 30, 2023
Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company.
Data Engineering Podcast
MAY 18, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management This episode is supported by Code Comments, an original podcast from Red Hat. Data lakes are notoriously complex. Data lakes are notoriously complex. My thanks to the team at Code Comments for their support.
KDnuggets
DECEMBER 6, 2023
This week on KDnuggets: Discover GitHub repositories from machine learning courses, bootcamps, books, tools, interview questions, cheat sheets, MLOps platforms, and more to master ML and secure your dream job • Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company • And much, (..)
ThoughtSpot
SEPTEMBER 6, 2023
The pathway from ETL to actionable analytics can often feel disconnected and cumbersome, leading to frustration for data teams and long wait times for business users. And even when we manage to streamline the data workflow, those insights aren’t always accessible to users unfamiliar with antiquated business intelligence tools.
Data Engineering Podcast
FEBRUARY 18, 2024
In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offer the ease of use and execution speed of data warehouses with the infinite storage and scalability of data lakes. Data lakes are notoriously complex. Your first 30 days are free!
Snowflake
APRIL 17, 2024
This traditional SQL-centric approach often challenged data engineers working in a Python environment, requiring context-switching and limiting the full potential of Python’s rich libraries and frameworks. To get started, explore the comprehensive API documentation , which will guide you through every step.
Data Engineering Podcast
DECEMBER 24, 2023
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.
Data Engineering Podcast
JANUARY 30, 2022
Summary Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. What are the main tasks that you have seen Pandas used for in a data engineering context?
Data Engineering Podcast
APRIL 7, 2024
Summary Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. Data lakes are notoriously complex. Your first 30 days are free!
Data Engineering Weekly
NOVEMBER 24, 2024
Editor’s Note: Launching Data & Gen-AI courses in 2025 I can’t believe DEW will reach almost its 200th edition soon. What I started as a fun hobby has become one of the top-rated newsletters in the data engineering industry. We are planning many exciting product lines to trial and launch in 2025.
Data Engineering Podcast
APRIL 14, 2024
In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication. Data lakes are notoriously complex.
Data Engineering Podcast
APRIL 21, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.
Data Engineering Podcast
AUGUST 28, 2022
Summary The dream of every engineer is to automate all of their tasks. For data engineers, this is a monumental undertaking. Orchestration engines are one step in that direction, but they are not a complete solution. The only thing worse than having bad data is not knowing that you have it.
Data Engineering Podcast
JUNE 23, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. What are the benefits of embedding Copilot into the data engine? What are the benefits of embedding Copilot into the data engine?
Data Engineering Podcast
FEBRUARY 4, 2024
In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex.
Data Engineering Podcast
MARCH 17, 2024
Summary A significant portion of data workflows involve storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data. Your first 30 days are free!
Data Engineering Podcast
MARCH 31, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication. Your first 30 days are free!
Data Engineering Podcast
JUNE 16, 2024
In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform. Go to [dataengineeringpodcast.com/starburst]([link] Support Data Engineering Podcast
Data Engineering Podcast
MARCH 24, 2024
In this episode Pete Hunt, CEO of Dagster labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!
Data Engineering Podcast
SEPTEMBER 17, 2023
Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. Linked data technologies provide a means of tightly coupling metadata with raw information.
Data Engineering Podcast
FEBRUARY 25, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.
Data Engineering Podcast
JUNE 30, 2024
He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. Tooling only plays a small part in SLAs and incident management.
Data Engineering Podcast
MAY 12, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management This episode is supported by Code Comments, an original podcast from Red Hat. Data lakes are notoriously complex. Support Data Engineering Podcast Summary Building a data platform is a substrantial engineering endeavor.
Data Engineering Podcast
JANUARY 7, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. What are the open questions today in technical scalability of data engines? What are the open questions today in technical scalability of data engines?
Data Engineering Podcast
MARCH 3, 2024
Colleen Tartow has worked across all stages of the data lifecycle, and in this episode she shares her hard-earned wisdom about how to conduct an AI program for your organization. Data lakes are notoriously complex. Visit dataengineeringpodcast.com/data-council and use code dataengpod20 to register today!
Data Engineering Podcast
JUNE 9, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management This episode is supported by Code Comments, an original podcast from Red Hat. Data lakes are notoriously complex. Unfortunately, reasoning about stream processing engines is complex and lacks sufficient tooling.
Data Engineering Weekly
NOVEMBER 3, 2024
Data Engineering Weekly readers get 15% discount by registering the following link, [link] Gustavo Akashi: Building data pipelines effortlessly with a DAG Builder for Apache Airflow Every code-first data workflow grew into a UI-based or Yaml-based workflow.
Data Engineering Podcast
JANUARY 21, 2024
In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).
Data Engineering Podcast
JUNE 2, 2024
In this episode she shares the practical steps to implementing a data governance practice in your organization, and the pitfalls to avoid. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. Starburst : ![Starburst
Data Engineering Podcast
APRIL 28, 2024
In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.
Data Engineering Podcast
FEBRUARY 11, 2024
In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process. Support Data Engineering Podcast Summary Sharing data is a simple concept, but complicated to implement well.
Data Engineering Podcast
JANUARY 28, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex.
Data Engineering Podcast
MAY 5, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.
Data Engineering Podcast
MARCH 10, 2024
In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git. Data lakes are notoriously complex. Visit [dataengineeringpodcast.com/data-council]([link] and use code **dataengpod20** to register today!
Data Engineering Podcast
DECEMBER 3, 2023
Andrei Tserakhau has dedicated his careeer to this problem, and in this episode he shares the lessons that he has learned and the work he is doing on his most recent data transfer system at DoubleCloud. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles.
Data Engineering Weekly
SEPTEMBER 29, 2024
The article details how bypassing intermediate storage steps reduces latency and improves data processing speed. The approach highlights the importance of streamlining data workflows for faster machine learning model training and deployment.
Data Engineering Podcast
NOVEMBER 26, 2023
In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team. Data lakes are notoriously complex.
Data Engineering Weekly
DECEMBER 25, 2023
Welcome to another insightful edition of Data Engineering Weekly. As we approach the end of 2023, it's an opportune time to reflect on the key trends and developments that have shaped the field of data engineering this year. In conclusion, 2023 has been a year of significant developments and shifts in data engineering.
Meltano
OCTOBER 5, 2022
Interested in becoming a data engineer? The need for data experts in the U.S. job market is expected to grow by 22% in this decade, and according to LinkedIn’s 2020 report , a data engineer is listed as the 8th fastest growing job today. But what is data engineering exactly and what does a data engineer do?
Data Engineering Podcast
NOVEMBER 19, 2023
Summary The dbt project has become overwhelmingly popular across analytics and data engineering teams. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data projects are notoriously complex. Data lakes are notoriously complex.
Snowflake
JUNE 6, 2024
You can now use Snowflake Notebooks to simplify the process of connecting to your data and to amplify your data engineering, analytics and machine learning workflows. Faster, easier AI/ML and data engineering workflows Explore, analyze and visualize data using Python and SQL.
Data Engineering Podcast
NOVEMBER 5, 2023
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.
Data Engineering Podcast
DECEMBER 31, 2023
In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex.
Data Engineering Weekly
OCTOBER 30, 2022
Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. The highlights are that 59% of folks think data catalogs are sometimes helpful.
Expert insights. Personalized for you.
We have resent the email to
Are you sure you want to cancel your subscriptions?
Let's personalize your content