5 Free Courses to Master Data Engineering
KDnuggets
NOVEMBER 30, 2023
Data engineers must prepare and manage the infrastructure and tools necessary for the whole data workflow in a data-driven company.
Analytics Vidhya
NOVEMBER 26, 2023
In this Leading with Data episode, explore the analytics landscape with Dr. Swati Jain, a seasoned leader boasting over two decades of experience. From her unforeseen foray into analytics to steering EXL Analytics’ India business, Dr. Jain imparts invaluable insights into the ever-evolving world of data science. Read on to know more about her career, […] The post Unlocking the Power of Analytics with Dr.
Jesse Anderson
NOVEMBER 30, 2023
Lately, I’ve been learning how to trade options. Although there’s data and programming involved in options trading, it isn’t as technical as data engineering or software engineering. However, it reflects the current state of learning, whether that’s data engineering or options trading. It gave me a look into learning a skill using videos. Each lesson I learned will directly apply to your learning or skill improvement.
Data Engineering Podcast
NOVEMBER 26, 2023
Summary: Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode, Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being ma
KDnuggets
DECEMBER 1, 2023
The blog covers machine learning courses, bootcamps, books, tools, interview questions, cheat sheets, MLOps platforms, and more to master ML and secure your dream job.
Confessions of a Data Guy
NOVEMBER 25, 2023
Ok. Get off your high horse. You are human just like the rest of us. Just like your ancient ancestors who were throwing rocks and sticks at each other a thousand years ago … you are looking for a leg up on the competition. Isn’t that the world we live in? At the end of […] The post How to be Better Than Everyone Else appeared first on Confessions of a Data Guy.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Confluent
NOVEMBER 27, 2023
The top 7 free online courses, tutorials, get started guides, and examples for the easiest way to learn Apache Kafka.
KDnuggets
DECEMBER 1, 2023
When you are getting started with machine learning, logistic regression is one of the first algorithms you’ll add to your toolbox.
Waitingforcode
NOVEMBER 28, 2023
In March I wrote a blog post showing how to use accumulators to track how each filter statement is applied. It turns out the solution may not be perfect, as Aravind mentioned in one of the comments. I bet you already have an idea why, but if not, keep reading. Everything will be clear in the end!
Seattle Data Guy
NOVEMBER 28, 2023
Data warehousing would be easy if all data were structured and formatted in the data source. Maybe we wouldn’t even need to build a data warehouse. But as anyone who has worked with data from more than one source knows, that’s rarely the case. Businesses today need to pull data from a plethora of sources,… Read more The post Finding The Right ETL/ELT Solution – What Is Estuary And Should You Use It?
Speaker: Tamara Fingerlin, Developer Advocate
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
Confluent
NOVEMBER 28, 2023
Learn how to write code that produces messages via librdkafka, how it will behave during error situations, and how your application should detect and respond to them.
KDnuggets
DECEMBER 1, 2023
Curious about optimizing AI for everyday devices? Dive into the complete overview of MIT's TinyML and Efficient Deep Learning Computing course. Explore strategies to make AI smarter on small devices. Read the full article for an in-depth look!
ArcGIS
NOVEMBER 30, 2023
Learn how to filter coordinate systems based on a spatial extent, GCS, or projection property.
Seattle Data Guy
NOVEMBER 27, 2023
If you’re a data engineer, then you’ve likely at least heard of Airflow. Apache Airflow is one of the most popular open-source workflow orchestration solutions that gets used for data pipelines. This is what spurred me to write the article “Should You Use Airflow” because there are plenty of people who don’t enjoy Airflow or… Read more The post Common Pitfalls in Deploying Airflow for Data Teams appeared first on Seattle Data Guy.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Tweag
NOVEMBER 27, 2023
Sponsored by Antithesis (distributed systems reliability testing experts), I’ve developed a new library to filter local files in Nix which I’d like to introduce! This post requires some familiarity with Nix and its language. So if you don’t know what Nix is yet, take a look first, it’s pretty neat. In this post we’re going to look at what source filtering is, why it’s useful, why a new library was needed for it, and the basics of the new library.
KDnuggets
NOVEMBER 29, 2023
From ANI to AGI and Beyond: Deciphering AI's Evolutionary Path.
Snowflake
NOVEMBER 27, 2023
Welcome to Snowflake’s Startup Spotlight, where we learn about awesome companies building businesses on Snowflake. In this edition, find out how Hum is applying the power of AI and large language models (LLMs) to help publishers build stronger customer relationships, and how the mantra of “build what people want” helped their leadership team decide to focus on publishers as their target audience.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
databricks
NOVEMBER 30, 2023
We are excited to introduce five new integrations in Databricks Partner Connect—a one-stop portal enabling you to use partner solutions with your Databricks D.
KDnuggets
NOVEMBER 28, 2023
Want to support the behavior of built-in functions and method calls in your Python classes? Magic methods in Python let you do just that! So let’s uncover the method behind the magic.
ArcGIS
NOVEMBER 27, 2023
A suite of ArcGIS Solutions to support common workflows in the stormwater industry.
Snowflake
NOVEMBER 30, 2023
At TCS, we help companies shift their enterprise data warehouse (EDW) platforms to the cloud and offer IT services. We’re extremely familiar with just how tricky a cloud migration can be, especially when it involves moving historical business data. Choosing a migration approach involves balancing cloud strategy, architecture needs and business priorities.
databricks
NOVEMBER 29, 2023
Background: Modernizing Data Delivery. Today's enterprise data estates are vastly different from 10 years ago. Industries have transitioned their analytics from monolithic data.
KDnuggets
NOVEMBER 29, 2023
Probability is one of the foundational elements of computer science. Some bootcamps skim over the topic; however, it is integral to your computer science knowledge.
ArcGIS
DECEMBER 1, 2023
Learn how to resolve error 00374: Unique numeric IDs are not assigned.
Snowflake
NOVEMBER 28, 2023
ERP systems run the world’s businesses. These stalwart systems are great at managing records and processes for finance, operations, supply chain management and more. But their insights need an upgrade. That’s the case put forward by Maxa, an enterprise-grade startup that has made it their mission to reinvent the way companies access and use ERP data for transformational insights.
Speaker: Tamara Fingerlin, Developer Advocate
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
databricks
NOVEMBER 28, 2023
We’re thrilled to share that Databricks has won the AWS ISV Partner of the Year award for North America. This award recognizes top I.
KDnuggets
NOVEMBER 30, 2023
The blog discusses five platforms designed for data scientists with specialized capabilities in managing large datasets, models, workflows, and collaboration beyond what GitHub offers.
Cloudera
DECEMBER 1, 2023
Recent Government Initiatives on Public Sector AI Solutions: In recent years, governments across the globe have recognized the transformative potential of artificial intelligence (AI) and have embarked on initiatives to harness this technology to drive innovation and serve their citizens more effectively. These government-led efforts have had a profound impact on the development and adoption of AI solutions in the public sector, paving the way for a future where data-driven decision-making and au
Lyft Engineering
NOVEMBER 29, 2023
Written by Ritesh Varyani and Jeana Choi at Lyft. Introduction: At Lyft, we have used systems like ClickHouse and Apache Druid for near real-time and sub-second analytics. Sub-second query systems allow for near real-time data explorations and low latency, high throughput queries, which are particularly well-suited for handling time-series data.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m