5 Free Courses to Master Data Engineering
KDnuggets
NOVEMBER 30, 2023
Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company.
Data Engineering Weekly
NOVEMBER 3, 2024
The challenges around memory, data size, and runtime are exciting to read. Sampling is an obvious strategy for data size, but the layered approach and dynamic inclusion of dependencies are some key techniques I learned with the case study. This count helps to ensure data consistency when deleting and compacting segments.
ThoughtSpot
SEPTEMBER 6, 2023
Those coveted insights live at the end of a process lovingly known as the data pipeline. The pathway from ETL to actionable analytics can often feel disconnected and cumbersome, leading to frustration for data teams and long wait times for business users. Keep reading to see how it works. What is a SpotApp?
Snowflake
APRIL 17, 2024
In today’s data-driven world, developer productivity is essential for organizations to build effective and reliable products, accelerate time to value, and fuel ongoing innovation. This allows your applications to handle large data sets and complex workflows efficiently.
Data Engineering Weekly
SEPTEMBER 29, 2024
Airbnb: Sandcastle - data/AI apps for everyone. Product ideas powered by data and AI must go through rapid iteration on shareable, lightweight live prototypes instead of static proposals. Grab: Enabling conversational data discovery with LLMs at Grab.
Data Engineering Podcast
JANUARY 30, 2022
Summary Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. The only thing worse than having bad data is not knowing that you have it.
KDnuggets
DECEMBER 6, 2023
This week on KDnuggets: Discover GitHub repositories from machine learning courses, bootcamps, books, tools, interview questions, cheat sheets, MLOps platforms, and more to master ML and secure your dream job • Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company • And much, (..)
Data Engineering Podcast
APRIL 7, 2024
Summary Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. Dagster offers a new approach to building and running data platforms and data pipelines.
Data Engineering Podcast
FEBRUARY 18, 2024
Summary A data lakehouse is intended to combine the benefits of data lakes (cost effective, scalable storage and compute) and data warehouses (user friendly SQL interface). Data lakes are notoriously complex. Join in with the event for the global data community, Data Council Austin.
Data Engineering Weekly
DECEMBER 25, 2023
Welcome to another insightful edition of Data Engineering Weekly. As we approach the end of 2023, it's an opportune time to reflect on the key trends and developments that have shaped the field of data engineering this year. The future of MDS remains a subject of keen interest as we move into 2024.
Data Engineering Podcast
JUNE 16, 2024
Summary Stripe is a company that relies on data to power their products and business. In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform.
Data Engineering Podcast
AUGUST 28, 2022
Summary The dream of every engineer is to automate all of their tasks. For data engineers, this is a monumental undertaking. Orchestration engines are one step in that direction, but they are not a complete solution. Atlan is the metadata hub for your data ecosystem.
Data Engineering Podcast
MARCH 17, 2024
Summary A significant portion of data workflows involve storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data. Your first 30 days are free!
Data Engineering Podcast
JUNE 30, 2024
Petr shares his journey from being an engineer to founding Synq, emphasizing the importance of treating data systems with the same rigor as engineering systems. He discusses the challenges and solutions in data reliability, including the need for transparency and ownership in data systems.
Data Engineering Podcast
MAY 12, 2024
Summary Building a data platform is a substantial engineering endeavor. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. This episode is supported by Code Comments, an original podcast from Red Hat. Data lakes are notoriously complex.
Ascend.io
JUNE 10, 2024
Snowflake Summit 2024 has set the stage for exciting changes in the data landscape. As a data enthusiast and a leader in data engineering, I’m eager to share my reflections on these innovations and their implications for Ascend. Snowflake’s investment in expanding data engineering capabilities is a game-changer.
Meltano
OCTOBER 5, 2022
Interested in becoming a data engineer? The need for data experts in the U.S. job market is expected to grow by 22% in this decade, and according to LinkedIn’s 2020 report, a data engineer is listed as the 8th fastest growing job today. But what is data engineering exactly and what does a data engineer do?
Data Engineering Podcast
DECEMBER 3, 2023
Summary The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfer systems are a critical component of data enablement, and building them to support large volumes of information is a complex endeavor. Data lakes are notoriously complex.
Data Engineering Podcast
JUNE 2, 2024
Summary Modern businesses aspire to be data driven, and technologists enjoy working through the challenge of building data systems to support that goal. Data governance is the binding force between these two parts of the organization. What are some of the misconceptions that you encounter about data governance?
Data Engineering Podcast
MAY 18, 2024
Summary The purpose of business intelligence systems is to allow anyone in the business to access and decode data to help them make informed decisions. Unfortunately this often turns into an exercise in frustration for everyone involved due to complex workflows and hard-to-understand dashboards. Data lakes are notoriously complex.
Data Engineering Weekly
MARCH 11, 2023
We are back in our Data Engineering Weekly Radio for edition #120. We will take 2 or 3 articles from each week's Data Engineering Weekly edition and go through an in-depth analysis. We discuss an article by Colin Campbell highlighting the need for a data catalog and the market scope for data contract solutions.
Knowledge Hut
SEPTEMBER 25, 2023
This demonstrates how in-demand Microsoft Certified Data Engineers are becoming. They are moving their servers and on-premises data to Azure Cloud. What does all of this mean for Data Engineering professionals? Who is an Azure Data Engineer? Azure Data Engineers work with these and other solutions.
Data Engineering Podcast
JANUARY 21, 2024
A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. Want to see Starburst in action?
Data Engineering Weekly
JANUARY 15, 2023
Data Engineering Weekly Is Brought to You by RudderStack. RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Data Catalogs are moving towards a feature, not a product.
Data Engineering Weekly
OCTOBER 30, 2022
Data Engineering Weekly Is Brought to You by RudderStack. RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. The highlights are that 59% of folks think data catalogs are sometimes helpful.
Data Engineering Podcast
FEBRUARY 4, 2024
In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.
phData: Data Engineering
SEPTEMBER 29, 2023
When you think of data engineering, what comes to mind? In reality, though, if you use data (read: any information), you are most likely practicing some form of data engineering every single day. Said differently, any tools or steps we use to help us utilize data can be considered data engineering.
Data Engineering Podcast
MARCH 24, 2024
Summary A core differentiator of Dagster in the ecosystem of data orchestration is their focus on software defined assets as a means of building declarative workflows. Data lakes are notoriously complex. Your first 30 days are free! Want to see Starburst in action? What problems are you trying to solve with Dagster+?
Data Engineering Podcast
FEBRUARY 11, 2024
Summary Sharing data is a simple concept, but complicated to implement well. There are also numerous technical considerations to be made, particularly if the producer and consumer of the data aren't using the same platforms. Dagster offers a new approach to building and running data platforms and data pipelines.
Data Engineering Podcast
SEPTEMBER 17, 2023
Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!
Data Engineering Podcast
APRIL 14, 2024
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. This episode is brought to you by Datafold, a testing automation platform for data engineers that prevents data quality issues from entering every part of your data workflow, from migration to dbt deployment.
Data Engineering Podcast
MAY 26, 2024
When that system is responsible for the data layer the process becomes more challenging. Sriram Panyam has been involved in several projects that required migration of large volumes of data in high traffic environments. Can you start by sharing some of your experiences with data migration projects?
Data Engineering Podcast
JANUARY 28, 2024
Summary Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. Cliff Crosland co-founded Scanner to provide fast querying of high scale log data for security auditing. A query engine is useless without data to analyze.
Data Engineering Podcast
MARCH 10, 2024
Summary Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. Data lakes are notoriously complex. Join us at the top event for the global data community, Data Council Austin.
Data Engineering Podcast
JANUARY 7, 2024
Summary Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. What are the open questions today in technical scalability of data engines?
Knowledge Hut
MARCH 28, 2024
Data science has become one of the most trending fields today. Data engineering is one of them. According to AnalytixLabs, the data science market is expected to be worth USD 230.80 This demonstrates the increasing need for Microsoft Certified Data Engineers. That’s where data engineers are on the go.
Knowledge Hut
DECEMBER 28, 2023
Human society in 2023 is a digital world, and its fuel - its currency - is data. Today, organizations seek skilled professionals who can harness data’s power to drive informed decisions. As technology evolves, cloud platforms have emerged as the cornerstone of modern data management. Who is an Azure Data Engineer?
Knowledge Hut
NOVEMBER 17, 2023
Azure Data Engineers play an important role in building efficient, secure, and intelligent data solutions on Microsoft Azure's powerful platform. The position of Azure Data Engineers is becoming increasingly important as businesses attempt to use the power of data for strategic decision-making and innovation.
Knowledge Hut
NOVEMBER 2, 2023
Azure Data engineering projects are complicated and require careful planning and effective team participation for a successful completion. While many technologies are available to help data engineers streamline their workflows and guarantee that each aspect meets its objectives, ensuring that everything works properly takes time.
Data Engineering Podcast
JUNE 23, 2024
Summary Data lakehouse architectures have been gaining significant adoption. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. What are the benefits of embedding Copilot into the data engine?
Data Engineering Podcast
MARCH 31, 2024
Summary Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. Dagster offers a new approach to building and running data platforms and data pipelines.
Data Engineering Podcast
NOVEMBER 26, 2023
Summary Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. Data lakes are notoriously complex.
Data Engineering Podcast
DECEMBER 31, 2023
Summary Working with financial data requires a high degree of rigor due to the numerous regulations and the risks involved in security breaches. In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector. Want to see Starburst in action?
DataKitchen
FEBRUARY 27, 2024
Your LLM Needs a Data Journey: A Comprehensive Guide for Data Engineers The rise of Large Language Models (LLMs) such as GPT-4 marks a transformative era in artificial intelligence, heralding new possibilities and challenges in equal measure. Embedding: The retrieved data is encoded into embeddings that the LLM can interpret.