The past 18 months have seen major change reshape the tech industry. What does it all mean for businesses and dev teams – and what will pragmatic software engineering approaches look like in the future? I tackled these burning questions in my conference talk, “What’s Old is New Again,” which was the keynote of the Craft Conference in May 2024.
1. Introduction
2. Data Quality (DQ) checks are run as part of your pipeline
2.1. Ensure your consumers don’t get incorrect data with output DQ checks
2.2. Catch upstream issues quickly with input DQ checks
2.3. Waiting a long time to run output DQ checks? Save time & money with mid-pipeline DQ checks.
2.4. Track incoming and outgoing row counts with Audit logs
3.
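The outline above names input checks, output checks, and audit logs of row counts. A minimal sketch of how those three pieces might fit together in plain Python follows. The column names (`order_id`, `amount_cents`), the `DataQualityError` class, and the `run_pipeline` function are hypothetical and chosen only for illustration; they are not taken from the linked article.

```python
class DataQualityError(Exception):
    """Raised when a data quality (DQ) check fails."""


def check_not_null(rows, column):
    # Fail if any row is missing a value for the column.
    bad = [row for row in rows if row.get(column) is None]
    if bad:
        raise DataQualityError(f"{len(bad)} row(s) have a null {column!r}")


def check_unique(rows, column):
    # Fail if any value of the column appears more than once.
    values = [row[column] for row in rows]
    if len(values) != len(set(values)):
        raise DataQualityError(f"duplicate values found in {column!r}")


def run_pipeline(input_rows):
    # Audit log: record the incoming row count.
    audit = {"rows_in": len(input_rows)}

    # Input DQ check: catch upstream issues before doing any work.
    check_not_null(input_rows, "order_id")

    # Transformation (hypothetical): drop empty orders, convert cents to dollars.
    output_rows = [
        {"order_id": row["order_id"], "amount_usd": round(row["amount_cents"] / 100, 2)}
        for row in input_rows
        if row["amount_cents"] > 0
    ]

    # Output DQ check: protect consumers from incorrect data.
    check_unique(output_rows, "order_id")

    # Audit log: record the outgoing row count.
    audit["rows_out"] = len(output_rows)
    return output_rows, audit


sample_rows = [
    {"order_id": 1, "amount_cents": 1250},
    {"order_id": 2, "amount_cents": 0},
    {"order_id": 3, "amount_cents": 399},
]
clean_rows, audit_log = run_pipeline(sample_rows)
```

Raising an exception on a failed check stops the run before bad data reaches consumers; a real pipeline might instead quarantine the offending rows or send an alert.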
In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate
With this blog I'm starting a follow-up series for my Data+AI Summit 2024 talk. I have missed this family of blog posts a lot, as my previous DAIS as a speaker was 4 years ago! As before, I'll be writing several blog posts that should help you remember the talk and also cover some of the topics left aside because of the time constraints.
Dear members, it has been a few weeks since I last sent you a proper Data News with a collection of links. Here we are. This week, I attended EuroPython in Prague. Since I spent most of my time at the dltHub booth in the sponsors hall, I didn't attend many talks. However, I did give a few presentations on my SQL orchestration library, yato, which pairs well with dlt.
Discover how Databricks Lakehouse Monitoring empowers you to ensure reliable, accurate forecasts by proactively detecting data drift, model degradation, and more.
The key to developer velocity across AI lies in minimizing time to first batch (TTFB) for machine learning (ML) engineers. AI Lab is a pre-production framework used internally at Meta. It allows us to continuously A/B test common ML workflows – enabling proactive improvements and automatically preventing regressions on TTFB. AI Lab prevents TTFB regressions whilst enabling experimentation to develop improvements.
All eyes are on sports this summer, with blockbuster events happening in everything from soccer and cycling to cricket and car racing. Snowflake is excited to join the action with a virtual “relay race,” where Snowflake sports and data experts, customers and partners will demonstrate how the sports industry can win big with data and AI. Industry leaders already know that sports runs on data analytics: from individual athlete performance and team statistics, to marketing and fan engagement, to ti
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
Planning a city block, a neighborhood, or maybe a whole new city is a multifaceted task with no universal recipe to use. How can Generative AI help Urban Planners?
A potential recipe for disaster proved to be the focus of every data executive’s agenda over the last year. A year ago many data leaders were caught off-guard. Employees embraced new gen AI tools with fervor, driving interest in all AI initiatives. Generative AI had penetrated the enterprise, with gen AI positioned in the Peak of Inflated Expectations segment on the Gartner® Hype Cycle for Artificial Intelligence, 2023 1.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
In this blog article, we'll explore BlueBikes data, a bike share service in bustling Boston, and uncover hidden insights through the power of visualization
Unlocking True Water Risk Assessment Across Insurance, Finance, Public Safety, and Beyond. Check out the solution accelerator to download the notebooks referred to.
Generative AI has a Sustainability problem. Generative AI, including large language models (LLMs), has taken the world by storm. Inspired by ChatGPT, many companies are racing to implement GenAI in their projects, lured by its hyped potential to revolutionise industries. However, based on my experience of applying GenAI to enterprise implementations, I am seeing first-hand the sustainability challenges threatening to implode the first generation of this technology.
Artificial intelligence (AI) has revolutionized numerous sectors, including digital marketing. This field leverages online platforms to promote products and services.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Canva: How Canva collects 25 billion events per day. Canva writes about its event collection infrastructure capabilities, handling 25 billion events per day (800 billion events per month) with 99.999% uptime. “At our team’s inception, a key decision we made, one we still believe to be a big part of our success, was that every collected event must have a machine-readable, well-documented schema.”
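To make the idea of a machine-readable schema for every collected event concrete, here is a small sketch in plain Python. It is not Canva's implementation: the `EVENT_SCHEMAS` registry, the `page_view` event, and its fields are hypothetical names invented for this example.

```python
# Hypothetical schema registry: event type -> {field name: expected Python type}.
EVENT_SCHEMAS = {
    "page_view": {"user_id": str, "url": str, "timestamp_ms": int},
}


def validate_event(event_type, payload):
    """Return a list of schema violations; an empty list means the event is valid."""
    schema = EVENT_SCHEMAS.get(event_type)
    if schema is None:
        return [f"unknown event type: {event_type}"]

    errors = []
    # Every declared field must be present with the declared type.
    for field, expected_type in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    # Fields that the schema does not declare are rejected as well.
    for field in payload:
        if field not in schema:
            errors.append(f"undeclared field: {field}")
    return errors


valid_errors = validate_event(
    "page_view",
    {"user_id": "u-1", "url": "/home", "timestamp_ms": 1700000000000},
)
invalid_errors = validate_event(
    "page_view",
    {"user_id": 42, "url": "/home", "referrer": "ad"},
)
```

Returning all violations at once, rather than stopping at the first, makes rejected events easier to diagnose at the collection boundary.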
Despite all the talk about AI replacing humans, Skynet blowing up the sun, and deep-fake celebrities parenting our children, it’s difficult to point to a generative AI use case that is demonstrably more interesting than your average run-of-the-mill chatbot. But what if instead of replacing customer support teams with chatbots, we could leverage AI to improve the performance of real human CS teams?
Uber elevates tech with the GenAI Gateway, integrating Large Language Models (LLMs) for 60+ use cases, from automation to customer support. This unified platform offers easy access to models from OpenAI, Vertex AI, and Uber’s own, ensuring efficiency and security.
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you
What happens when a team of mechanical engineers gets tired of looking at flat images of 3D models over Zoom? Meet the team behind Caddy, a new CAD app for mixed reality. They join Pascal Hartig (@passy) on the Meta Tech Podcast to talk about teaching themselves to code, disrupting the CAD software space, how they integrated Caddy with Llama 3, and so much more!
Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to financial information and investing. Together, we are building products and services that help create a financial system everyone can participate in. … Several members of our engineering, security, corporate engineering, and recruiting teams were recently in Toronto for our office opening in ea
Strings are common built-in data types in Python. But sometimes, you may need to work with bytes instead. Let’s learn how to convert bytes to string in Python.
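As a brief illustration of the conversion mentioned above, the sketch below decodes bytes with the built-in `bytes.decode` method. The helper name `bytes_to_str` and the sample values are made up for this example; the linked article may present other approaches.

```python
def bytes_to_str(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Decode a bytes object into a str using the given encoding."""
    return data.decode(encoding, errors)


# Round trip: encode a string to UTF-8 bytes, then decode it back.
raw = "café".encode("utf-8")
text = bytes_to_str(raw)

# The str() constructor with an explicit encoding gives the same result.
same_text = str(raw, "utf-8")

# Bytes that are not valid UTF-8 raise UnicodeDecodeError under errors="strict".
# errors="replace" substitutes U+FFFD for each undecodable byte instead.
lossy = bytes_to_str(b"\xff\xfeabc", errors="replace")
```

The right encoding depends on where the bytes came from; UTF-8 is only an assumption here, and decoding with the wrong encoding can silently produce garbled text.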
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
Why automated end-to-end tests? What are automated end-to-end tests? Do you need them at all? In this blog post we dive into the ugly behind automated end-to-end testing, what we struggled with at Zalando, what worked well for us and our latest solution with end-to-end test probes. Automated end-to-end tests continue to polarise the industry, with some leaders advocating for them and others rightfully questioning their return on investment and recommending investing in monitoring and alerting s
As part of our ongoing efforts to enhance product development while safeguarding app health and the consumer experience, we are introducing metric-aware rollouts for experiments. Metric-aware rollouts refer to established decision rules to flag issues with automated checks on standardized app quality metrics during the new feature rollout process. Every action DoorDash takes focuses on enhancing the consumer experience.
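One way to picture a decision rule of this kind is a threshold on how much worse a standardized metric may get in the rollout group compared with a control group. The sketch below is an assumption-laden illustration, not DoorDash's system: the metric names, the thresholds, and the `MetricRule` and `evaluate_rollout` names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRule:
    """Hypothetical rule: flag a metric whose relative increase exceeds the limit."""
    name: str
    max_relative_increase: float  # 0.05 tolerates a metric up to 5% worse


def evaluate_rollout(control, treatment, rules):
    """Return the names of metrics that breach their rule (higher is worse)."""
    flagged = []
    for rule in rules:
        baseline = control[rule.name]
        observed = treatment[rule.name]
        if baseline <= 0:
            # No meaningful baseline: flag any increase at all.
            breached = observed > baseline
        else:
            breached = (observed - baseline) / baseline > rule.max_relative_increase
        if breached:
            flagged.append(rule.name)
    return flagged


rules = [
    MetricRule("crash_rate", max_relative_increase=0.10),
    MetricRule("p95_latency_ms", max_relative_increase=0.05),
]
control_metrics = {"crash_rate": 0.010, "p95_latency_ms": 800.0}
treatment_metrics = {"crash_rate": 0.013, "p95_latency_ms": 810.0}

flagged_metrics = evaluate_rollout(control_metrics, treatment_metrics, rules)
```

A production check would normally also account for statistical noise, for example by requiring significance before flagging; this sketch compares point estimates only.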
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m