👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover one out of four topics from today’s subscriber-only The Pulse issue. To get full issues twice a week, subscribe here.
Summary: Maintaining a single source of truth for your data is one of the biggest challenges in data engineering. Different roles and tasks in the business need their own ways to access and analyze the organization's data. To enable this while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem.
Learn Python through tutorials, blogs, books, project work, and exercises. Access all of it on GitHub for free and join a supportive open-source community.
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
Data is the lifeblood of modern businesses, but unlocking its true insights often requires complex SQL queries. These queries can be time-consuming to write and challenging to maintain. At Snowflake, we believe in making the power of data accessible to all. That’s why we prioritize simplicity, governance and quality in everything we build – including our AI-powered tools.
At Databricks, we’re committed to building the most efficient and performant training tools for large-scale AI models, most recently with the release of DBRX.
The only data engineering roadmap you need for an introduction to concepts, tools, and techniques to collect, store, transform, analyze, and model data.
In 2020, Snowflake announced a new global competition to recognize the work of early-stage startups building their apps — and their businesses — on Snowflake, offering up to $250,000 in investment as the top prize. Four years later, the Snowflake Startup Challenge has grown into a premier showcase for emerging startups, garnering interest from companies in over 100 countries and offering a prize package featuring a portion of up to $1 million in potential investment opportunities.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
I recently did a post on LinkedIn and Reddit about Databricks removing the Standard Tier and forcing folks into Unity Catalog. The post got big traction and blew up more than I expected, enough for the Databricks folks to hunt me down at work and tell me I’m naughty. I will be writing a more in-depth […] The post Databricks Doubles Cost. Reddit Explodes.
Data enrichment is a crucial step in making data more usable for business users. Doing it in batch is relatively easy due to the static nature of the dataset. When it comes to streaming, the task is more challenging.
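The contrast can be sketched in a few lines: in batch, enrichment is one join over static data; in streaming, each event must be enriched as it arrives, against reference data that may change over time. This toy example uses an in-memory dict as the lookup table; a real pipeline would use a state store or changelog topic, and all names here are illustrative.

```python
# Sketch: enriching a stream of events against a reference table.
# The in-memory dict stands in for what would be a state store or
# changelog-backed table in a real streaming system.
from typing import Iterator

user_profiles = {  # reference data; may be updated while the stream runs
    "u1": {"country": "DE"},
    "u2": {"country": "US"},
}

def enrich(events: Iterator[dict]) -> Iterator[dict]:
    """Enrich each event as it arrives; unknown users pass through as-is."""
    for event in events:
        profile = user_profiles.get(event["user_id"], {})
        yield {**event, **profile}

stream = iter([
    {"user_id": "u1", "action": "click"},
    {"user_id": "u3", "action": "view"},  # no profile yet: a streaming-only problem
])
enriched = list(enrich(stream))
print(enriched[0])  # includes "country": "DE"
```

The second event illustrates why streaming is harder: the profile for "u3" may simply not have arrived yet, a situation a batch join over a static dataset never faces.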
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-driven decisions.
Large language models (LLMs) have generated interest in effective human-AI interaction through optimized prompting techniques. “Prompt engineering” is a growing methodology for tailoring prompts to a task.
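In practice, a first step toward prompt engineering is replacing ad-hoc strings with reusable, parameterized templates. The sketch below is a generic illustration; the template text and helper name are invented for the example, not taken from any specific framework.

```python
# Sketch: prompt engineering as reusable, parameterized templates
# instead of one-off strings. Template wording is illustrative.
TEMPLATE = (
    "You are a {role}.\n"
    "Task: {task}\n"
    "Answer in at most {max_sentences} sentences."
)

def build_prompt(role: str, task: str, max_sentences: int = 2) -> str:
    """Fill the shared template so every call site phrases the task the same way."""
    return TEMPLATE.format(role=role, task=task, max_sentences=max_sentences)

prompt = build_prompt("data engineer", "explain what a semantic layer is")
print(prompt)
```

Centralizing the template means a wording improvement (or a constraint like the sentence limit) propagates to every prompt at once, which is what makes the technique a methodology rather than trial and error.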
One of our key objectives at Snowflake is to help enterprises fully unlock the value of their data, and an important aspect of that is making data both accessible and actionable to as many people as possible, regardless of their role or technical skill set. We’re announcing a new investment today that will transform how teams across the business work with data in the future.
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them into complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; and scale your pipelines.
It’s been a while since I wrote about Polars on this blog; I’ve been remiss. Some time ago I wrote a very simple comparison of switching from Pandas to Polars. I didn’t put much real effort into it, yet it was popular, so this is my attempt at expanding on that topic […] The post How to JOIN datasets in Polars … compared to Pandas. appeared first on Confessions of a Data Guy.
As Snowflake continues to expand our commitment to compliance, we are pleased to announce that we have successfully completed both C5 and TISAX attestations in Germany. Cloud Computing Compliance Controls Catalog (C5) C5 is an audited standard establishing baselines for cloud security. It was initially created for government agencies and organizations that work with the government to ensure security baselines are met by their cloud service providers (CSPs).
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
Have you ever been part of a data or software project that seems stuck in a loop? Three weeks have passed, and although you arrive at work daily, exhausted, having tackled numerous issues, the project remains stagnant. Why? Then, suddenly, a new engineer or project manager steps in, reorganizes and prioritizes tasks, and just like… Read more The post Common Pitfalls of Data Analytics Projects appeared first on Seattle Data Guy.
Like many companies, Snowflake uses Outreach as a sales execution platform to help our sales teams improve prospecting efforts and efficiently follow up on leads. For Snowflake sales reps, Outreach is the central repository for almost all inbound and outbound communications with current and potential customers. For the sales development representative (SDR) leadership team, it’s an immensely valuable source of insights for sales enablement and automation.
And just like that, we’re down to the 10th episode of Unapologetically Technical! In this episode, I interview Michael Drogalis, the founder and CEO of ShadowTraffic. We talked about the early Hadoop era and how he saw the need for Kafka in the industry. He shared his journey of starting a new company in his 20s and being acquired by Confluent.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds) and enables non-LLM evaluation methods.
I never thought I would live to see the day; it’s crazy. I’m not sure whose idea it was to make it possible to write Apache Spark with Rust, Golang, or Python … but they are all geniuses. As of Apache Spark 3.4, it is now possible to use Spark Connect … a thin API […] The post Writing Apache Spark with Rust! Spark Connect Introduced. appeared first on Confessions of a Data Guy.
Project management plays a significant role in the success of every organization. It ensures that the project stays on track, aids in efficient management of resources, and keeps stakeholders informed about the project and what's happening in it. In this blog, we will look at three different project organizational structures: functional, matrix, and process.
Image-generating technologies offer significant benefits for retail and consumer goods companies by using generative models that produce both stylized and photo-realistic images.
Adding these 7 key elements to your resume will improve your odds of getting an interview call. Remember: after graduating from university, your full-time job is to find a job, so put some effort into preparing your resume.
Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.