Trending Articles

Data Engineering Weekly #222

Data Engineering Weekly

Dagster for MLOps: Deep Dive into AI Orchestration. Learn what it really takes to run production-grade ML systems without breaking your architecture or compliance efforts. Join Dagster and Neurospace to learn:
- How to build AI pipelines with orchestration baked in
- How to track data lineage for audits and traceability
- Tips for designing compliant workflows under the EU AI Act
Register for the technical session.

DuckDB: DuckLake - SQL as a Lakehouse Format. DuckDB announced a new open table format…

5 key lessons from implementing AI/BI Genie for self-service marketing insights

databricks

Marketing teams frequently encounter challenges in accessing their data, often depending on technical teams to translate that data into actionable insights.

Trending Sources

How to Write Efficient Python Code Even If You’re a Beginner

KDnuggets

You don't need to be a Python pro to write fast, clean code. Just a few smart coding habits can go a long way.
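
As a quick illustration of the kind of habit the article has in mind (this example is mine, not taken from the post): replacing an index-based accumulation loop with a comprehension and a built-in is often both faster and easier to read.

```python
# A hypothetical before/after illustrating one "smart habit":
# prefer comprehensions and built-ins over manual accumulation loops.

numbers = list(range(1_000_000))

# Slower, more verbose: manual loop with repeated appends.
squares = []
for n in numbers:
    if n % 2 == 0:
        squares.append(n * n)

# Faster and clearer: a generator expression plus the built-in sum().
total = sum(n * n for n in numbers if n % 2 == 0)
```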

Improve your geoprocessing productivity with Append To Existing in ArcGIS Pro (May 2025)

ArcGIS

In ArcGIS Pro 3.5, you can choose from three options for overwriting existing tool data, including appending and replacing data.

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You'll learn how to:
- Create a standardized process for debugging to quickly diagnose errors in your DAGs
- Identify common issues with DAGs, tasks, and connections
- Distinguish between Airflow-related…
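
One concrete debugging technique in this spirit (a minimal sketch of my own, not taken from the guide) is to run a DAG locally with `dag.test()`, which executes every task in a single process so failures surface with full tracebacks and can be stepped through with an ordinary debugger. This assumes Airflow 2.5+ and a hypothetical extract/transform pipeline.

```python
from airflow.decorators import dag, task
import pendulum


@dag(schedule=None, start_date=pendulum.datetime(2025, 1, 1), catchup=False)
def debug_demo():
    @task
    def extract() -> list[int]:
        return [1, 2, 3]

    @task
    def transform(values: list[int]) -> int:
        # Put a breakpoint() here to inspect state when the task misbehaves.
        return sum(values)

    transform(extract())


demo = debug_demo()

if __name__ == "__main__":
    # Runs all tasks in-process, without a scheduler, so errors appear
    # immediately instead of being buried in worker logs.
    demo.test()
```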

Deliver Bi-Directional Integration for Oracle Autonomous Database and Databricks

databricks

Until now, sharing data between enterprise systems often meant complex pipelines, duplication, and lock-in. With Oracle's support for Delta Sharing, that's no longer the case.
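
For context on what Delta Sharing looks like from the consuming side, here is a minimal sketch using the open source `delta-sharing` Python client; the profile file path and share/schema/table names are placeholders, not details from the article.

```python
import delta_sharing

# A share profile file is issued by the data provider (placeholder path).
profile = "oracle_databricks.share"

# Shared tables are addressed as <profile>#<share>.<schema>.<table>.
table_url = f"{profile}#sales_share.finance.orders"

# Pull the shared Delta table into a pandas DataFrame over the open protocol.
orders = delta_sharing.load_as_pandas(table_url)
print(orders.head())
```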

Smart Banking in 2025: The Intelligent Technologies Defining CX and Operations

Precisely

In a rapidly evolving financial landscape, one thing is clear: banks that prioritize agility and data-driven customer-centricity are not just staying afloat; they're thriving. During the recent American Banker webinar, "Smart Banking in 2025: Intelligent Technologies Defining CX and Operations," I had the pleasure of speaking alongside Sarah Howell about the big shifts seen in banking, particularly around digital transformation, compliance, and customer experience (CX).

More Trending

Leveraging Data Insights to Guide Marketing Strategies

RandomTrees

In today's digitally linked world, intuition is no longer sufficient to drive B2B marketing. Data analytics has emerged as a critical component of effective marketing strategies, allowing companies to make informed decisions that improve performance and deliver quantifiable results. With vast amounts of client data available across digital channels, organizations that use data analytics can gain a significant competitive edge.

Getting to production: The secrets to secure, scalable and cost-effective enterprise AI

databricks

In a sign of how quickly enterprises are moving to embrace AI, 70% have moved past the pilot stage and are preparing to release new…

Microsoft Fabric Architecture Explained: Core Components & Benefits

Edureka

Microsoft Fabric is a next-generation data platform that combines business intelligence, data warehousing, real-time analytics, and data engineering into a single integrated SaaS framework. Microsoft Fabric, which is based on the principles of governance, scalability, and simplicity, enables companies to handle their whole analytics lifecycle in one location.

Structures, containers, and content, oh my!

ArcGIS

Learn how to analyze the relationships between your network features and structures using ArcGIS Utility Network.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

10 Python One-Liners for Working with Dates and Times

KDnuggets

These ten compact, Pythonic shortcuts will speed up your date and time analysis and processing workflows. See how and why.
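
In the same spirit, a few representative one-liners built only on the standard library (examples of mine, not necessarily the ten in the article):

```python
from datetime import datetime, timedelta, timezone

# ISO-8601 string to an aware datetime in one line.
dt = datetime.fromisoformat("2025-05-28T14:30:00+00:00")

# "Days until a deadline" as a single expression.
days_left = (datetime(2025, 12, 31, tzinfo=timezone.utc) - datetime.now(timezone.utc)).days

# Current UTC timestamp formatted for filenames.
stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")

# The most recent Monday, handy for weekly aggregations.
monday = datetime.now(timezone.utc).date() - timedelta(days=datetime.now(timezone.utc).weekday())

print(dt, days_left, stamp, monday)
```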

How to Query Apache Kafka® Topics With Natural Language

Confluent

Learn how to easily extract the data you need from Apache Kafka by generating Apache Flink SQL commands with natural language prompts or questions in this step-by-step demo.
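
To make the idea concrete, here is the flavor of Flink SQL such a natural-language prompt might translate into; the topic and column names are hypothetical and the snippet is not taken from the demo itself.

```python
# Hypothetical prompt a user might type against a Kafka-backed "orders" topic.
prompt = "Show me the total order value per customer over the last hour."

# The kind of Flink SQL a natural-language interface could generate from it,
# assuming an `orders` table registered over the Kafka topic.
generated_flink_sql = """
SELECT
  customer_id,
  SUM(amount) AS total_value
FROM orders
WHERE order_time > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY customer_id
"""
print(generated_flink_sql)
```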

Microsoft Fabric Tutorial for Beginners

Edureka

Imagine entering a control room with complete control over your data ecosystem. You won’t have to deal with siloed systems, jump between tools, or write endless lines of code to make data useful. That’s how Microsoft Fabric works. With its ability to seamlessly integrate data engineering, analytics, and business intelligence, Microsoft Fabric stands out as the all-in-one superhero in a world where data is abundant but insights are scarce.

Honeydew Revolutionizes Business Intelligence with Investment from Snowflake Ventures

Snowflake

At Snowflake, our mission is to empower every enterprise to achieve its full potential through data and AI. We actively support innovative companies within our ecosystem that demonstrate clear value for our customers, which is why we're excited to invest in Honeydew, a former Snowflake Startup Challenge finalist. Honeydew's Semantic Layer revolutionizes the way data teams collaborate on business intelligence and deliver impactful data-driven insights.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Administering Performance Settings for ArcGIS Pro

ArcGIS

Discusses the enhancements introduced in ArcGIS Pro 3.5 to assist system administrators in optimizing performance settings.

How AEs Champion Long-Term Customer Success

Confluent

Discover how Account Executive Jason helps customers turn roadblocks into wins, powered by Confluent's collaborative, one-team culture.

Implementing a Dimensional Data Warehouse with Databricks SQL, Part 3

databricks

Dimensional modeling is a time-tested approach to building analytics-ready data warehouses.
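
As a toy illustration of what dimensional modeling means in practice (not code from the series), a star schema pairs a narrow fact table with descriptive dimension tables. The table and column names below are hypothetical, and the SQL is issued through a Spark session with Delta available, as it would be on Databricks.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A dimension table holds descriptive attributes keyed by a surrogate key.
spark.sql("""
CREATE TABLE IF NOT EXISTS dim_customer (
  customer_key BIGINT,
  customer_name STRING,
  segment STRING
) USING DELTA
""")

# The fact table stores measures plus foreign keys into the dimensions.
spark.sql("""
CREATE TABLE IF NOT EXISTS fact_sales (
  customer_key BIGINT,
  date_key INT,
  quantity INT,
  amount DECIMAL(10, 2)
) USING DELTA
""")

# Typical analytical query: join fact to dimension and aggregate a measure.
spark.sql("""
SELECT d.segment, SUM(f.amount) AS revenue
FROM fact_sales f
JOIN dim_customer d ON f.customer_key = d.customer_key
GROUP BY d.segment
""").show()
```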

How to Write a Dockerfile: From Basic to Advanced Techniques

Edureka

Docker completely changed the development, packaging, and deployment of applications. Docker provides consistent environments from development to production by isolating applications in containers. The Dockerfile is the foundation of this ecosystem since it serves as a guide for creating Docker images. This blog covers everything from the fundamentals to more complex subjects like comparisons, troubleshooting, and best practices.
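
For orientation, here is a minimal multi-stage Dockerfile of the sort such guides usually build up to; this is a generic sketch, not taken from the blog, and the application file and dependencies are placeholders.

```dockerfile
# Build stage: install dependencies into an isolated layer.
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage: copy only what the application needs, keeping the image small.
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY app.py .
EXPOSE 8000
CMD ["python", "app.py"]
```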

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower, not replace, your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-driven decisions.

How Natural Language Interfaces Are Transforming Marketing Workflows

Snowflake

Enterprises are navigating a complex landscape marked by evolving challenges in privacy, economics and the rapid advancement of AI. Consumer data privacy is no longer just an expectation; it's non-negotiable, the foundation of consumer trust. Economic volatility has pushed companies to do more with less, demanding greater efficiency amid ever-changing regulations.

The Components of the dbt Fusion engine and how they fit together

dbt Developer Hub

Today, we announced the dbt Fusion engine. Fusion isn't just one thing; it's a set of interconnected components working together to power the next generation of analytics engineering. This post maps out each piece of the Fusion architecture, explains how they fit together, and clarifies what's available to you whether you're compiling from source, using our pre-built binaries, or developing within a dbt Fusion-powered product experience.

Unleashing AI-Driven Innovation: ThoughtSpot’s Momentum in Australia & New Zealand

ThoughtSpot

At ThoughtSpot, we're on a mission to empower every business user to become a data champion. Over the past year, I've witnessed firsthand how organizations across Australia and New Zealand are embracing this vision, transforming the way they work, make decisions, and serve their customers. Today, I'm excited to share some of the incredible momentum we're seeing in the region and to celebrate the forward-thinking organizations leading the charge.

Real-time Streaming of Jira Data to Google BigQuery

Striim

Transferring data from Atlassian Jira to Google BigQuery enables scalable analysis of engineering metrics such as cycle time, throughput, and issue trends, and supports forecasting and planning based on historical data. Moreover, with BigQuery ML or external AI tools, teams can apply machine learning to forecast delivery delays, identify anomalies, or prioritize issues based on historical patterns.
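
To sketch the BigQuery ML piece (a hypothetical example, not from the Striim post; the dataset, table, and column names are invented), a time-series model over Jira throughput might be created and queried like this with the official Python client:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Train an ARIMA_PLUS time-series model on daily issue throughput
# (assumes a `jira.daily_throughput` table with `day` and `issues_closed`).
client.query("""
CREATE OR REPLACE MODEL jira.throughput_forecast
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'day',
  time_series_data_col = 'issues_closed'
) AS
SELECT day, issues_closed FROM jira.daily_throughput
""").result()

# Forecast the next 14 days of delivery throughput.
forecast = client.query("""
SELECT * FROM ML.FORECAST(MODEL jira.throughput_forecast,
                          STRUCT(14 AS horizon))
""").result()
for row in forecast:
    print(row)
```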

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You'll learn how to:
- Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to
- Write DAGs that adapt to your data at runtime and set up alerts and notifications
- Scale your…
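
As a taste of those building blocks (my own minimal sketch, not from the eBook), a DAG is essentially tasks plus dependencies plus a schedule:

```python
from airflow.decorators import dag, task
import pendulum


# Run every day at 06:00 UTC; catchup=False skips backfilling past runs.
@dag(schedule="0 6 * * *", start_date=pendulum.datetime(2025, 1, 1), catchup=False)
def daily_report():
    @task
    def fetch() -> int:
        return 42  # placeholder for an extraction step

    @task
    def publish(value: int) -> None:
        print(f"Publishing report with value {value}")

    # Passing the output defines the dependency: fetch runs before publish.
    publish(fetch())


daily_report()
```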

Microsoft Fabric vs Tableau 2025: Insights and Comparisons

Edureka

In the world of data analytics, Microsoft Fabric and Tableau stand out as powerful tools, but they have very different strengths. While Microsoft Fabric offers an all-in-one data platform for enterprises deeply integrated with Azure, Tableau focuses on intuitive, high-quality data visualization for users at all levels. This guide compares their features, architecture, pricing, and use cases to help you decide which is the best fit for your data strategy.

How Agentic AI Is Transforming Autonomous Networks in Telecom

Snowflake

It sounds like a cliché to say it's a transformative time in telecommunications, but that's never been more accurate. Companies across the entire ecosystem are undergoing unprecedented change and incredible innovation across every aspect of the business. Fueled by efficiency, cost, and customer experience pressures, telecoms must ensure that networks are not only highly reliable but easily adaptable to the rapidly changing needs of modern businesses and customers.

Path to GA: How the dbt Fusion engine rolls out from beta to production

dbt Developer Hub

Today, we announced that the dbt Fusion engine is available in beta. If Fusion works with your project today, great! You're in for a treat. If it's your first day using dbt, welcome! You should start on Fusion; you're in for a treat too. Today is Launch Day, the first day of a new era: the Age of Fusion. We expect many teams with existing projects will encounter at least one issue that will prevent them from adopting the dbt Fusion engine in production environments.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
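
For reference, dynamic task mapping looks roughly like this (a minimal sketch assuming Airflow 2.3+; the task names are mine): the number of mapped task instances is decided at runtime from the upstream result.

```python
from airflow.decorators import dag, task
import pendulum


@dag(schedule=None, start_date=pendulum.datetime(2025, 1, 1), catchup=False)
def dynamic_mapping_demo():
    @task
    def list_files() -> list[str]:
        # In a real pipeline this might list objects in cloud storage.
        return ["a.csv", "b.csv", "c.csv"]

    @task
    def process(path: str) -> str:
        return f"processed {path}"

    # expand() creates one mapped task instance per element at runtime.
    process.expand(path=list_files())


dynamic_mapping_demo()
```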

Introducing Apache Spark 4.0

databricks

Apache Spark 4.0 marks a major milestone in the evolution of the Spark analytics engine.

Top 30+ AWS Data Engineer Interview Questions and Answers

Edureka

In today’s data-driven world, the role of an AWS Data Engineer is more important than ever! Organizations are on the lookout for talented professionals who can design, build, and maintain strong data pipelines and infrastructure on the Amazon Web Services (AWS) platform. If you’re eager to kickstart your career in AWS data engineering or ready to take it to the next level, mastering the interview process is essential.

The keyword I would like to know before thinking about watermarks

Waitingforcode

When I was learning about watermarks in Apache Flink, I saw that they take the smallest event times, whereas Apache Spark Structured Streaming takes the biggest ones. That puzzled me: how is it possible that the pipeline doesn't go back to the past? The answer came when I reread the Streaming Systems book; there was one keyword I had missed that clarified everything.
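
The "smallest vs. biggest" contrast can be summarized in two small rules; this is my own paraphrase of the standard definitions, not the keyword the post reveals. In Flink, an operator with several inputs advances its watermark to the minimum of its inputs' watermarks; in Spark Structured Streaming, the watermark is derived from the maximum event time seen, minus the allowed lateness.

```python
# Rough sketch of the two rules, assuming event times in seconds.

def flink_operator_watermark(input_watermarks: list[int]) -> int:
    # A multi-input operator can only be as far along as its slowest input,
    # so it takes the *minimum* of the incoming watermarks.
    return min(input_watermarks)

def spark_watermark(max_event_time_seen: int, delay_threshold: int) -> int:
    # Structured Streaming tracks the *maximum* event time observed and
    # subtracts the configured delay (e.g. withWatermark("ts", "10 minutes")).
    return max_event_time_seen - delay_threshold

print(flink_operator_watermark([100, 95, 120]))  # -> 95
print(spark_watermark(120, 10))                  # -> 110
```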

How to Market Yourself as a Data Professional on LinkedIn

KDnuggets

Want recruiters and collaborators to find you? Fix your LinkedIn, even if you hate self-promotion.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds) and enables non-LLM evaluation…
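
The reproducibility levers mentioned (temperature 0 and a fixed seed) translate into request parameters like the following. This is a hedged sketch against the OpenAI Python client with a hypothetical prompt and model name; the seed parameter is only best-effort deterministic.

```python
from openai import OpenAI

client = OpenAI()

# Temperature 0 makes sampling greedy; a fixed seed asks the service for
# best-effort reproducibility across identical requests.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Classify this ticket as bug or feature: ..."}],
    temperature=0,
    seed=42,
)
print(response.choices[0].message.content)
```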