Tue. Jul 30, 2024

7 Steps to Master the Art of Data Storytelling

KDnuggets

Follow this 7-step recipe for mastering effective insight and information dissemination through compelling data story crafting.

Data 138

New with Confluent Platform: Enhanced security with OAuth Support, Confluent Platform for Apache Flink® (LA), a new Connector, and More

Confluent

Confluent Platform 7.

OKR-Centric Delivery Models for Engineering-Focused Enterprises

databricks

Introduction: An organization adopting new technologies or on a modernization journey typically focuses on upcoming tools, their features and potential performance/cost improvements under.

How to Perform Matrix Operations with NumPy

KDnuggets

Learn how to perform several of the most basic matrix operations with NumPy.

Python 114

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to:
Create a standardized process for debugging to quickly diagnose errors in your DAGs
Identify common issues with DAGs, tasks, and connections
Distinguish between Airflow-relate

The 6 Data Quality Dimensions with Examples

Monte Carlo

It’s clear that data quality is becoming more of a focus for more data teams. So why are there still so many questions about it? A quick search on subreddits for data engineers, data analysts, data scientists, and more can yield a plethora of users seeking data quality advice. And while the comment below may seem like the accepted way of doing data quality management… there’s actually a much better way.

Marshmallow: The Sweetest Python Library for Data Serialization and Validation

KDnuggets

Stop debugging data mismatches and focus on your application logic when you let Marshmallow handle serialization, deserialization and validation for you.

Python 85

More Trending

Polaris Catalog Is Now Open Source

Snowflake

In June 2024, Snowflake announced Polaris Catalog to provide organizations and the Iceberg community new levels of choice, flexibility and control over their data. It enables more open, secure lakehouse architectures with broad read-and-write interoperability and cross-engine access controls. Apache Iceberg™ has greatly improved data mobility by establishing a vast community around an open standard, and the next logical step is an open, community-driven catalog to complement Iceberg.

Mobiumata by Chris Price

Scott Logic

Mobiumata (a concatenation of Möbius strip and cellular automata) is a small interactive art piece that allows folk to play god to 1,500 LED cells wrapped into a Möbius strip. When Scott Logic needed something to act as a talking point for a conference booth, as a big fan of all things flashy, shiny and interactive, I jumped at the opportunity to create something engaging that was roughly themed around AI.

Coding 52

How to Build RAG Applications Using Snowflake Cortex?

Hevo

GPT has become a go-to search engine for many. We often use it instead of Google to get a quick solution for any query. Given its popularity, why not include a customer chatbot or a troubleshooting chatbot service in your business? Imagine having a brand-specific chatbot with expertise in answering your business-related queries.

If agile is the answer, what is the question? by Dave Ogle

Scott Logic

The other day a colleague asked this question on one of our internal Slack channels: “If you were writing headings in a document, and using the capitalisation style of capitalising the first letter of just the important words in a header, how would you capitalise ‘what we are trying to achieve?’” The answers were many and varied, ranging from serious answers with supporting documentation to tongue-in-cheek responses. The answer that won the day, though, was this: “Objective”. Clever, isn’t it?

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Change Data Capture as the Backbone of RAG AI-Driven System Resilience Strategies

Striim

Ensuring system resilience is critical for maintaining a competitive edge in today’s data-driven world. As businesses rely on real-time data to fuel decision-making, it’s essential that their systems can withstand disruptions and maintain functionality. Change Data Capture (CDC) is a key player, particularly in AI-driven systems where real-time data integration and adaptive responses are crucial.

Systems 52

Optimizing Data Warehouse Cost using Apache Iceberg

Hevo

Data warehouses deliver phenomenal results for an organization by enabling well-informed, data-driven decision-making. There was a time when only companies with large capital and substantial IT infrastructure invested time and effort, let alone money, in a data warehouse.