Tue. Nov 12, 2024

Using Pandas and SQL Together for Data Analysis

KDnuggets

In this tutorial, we’ll explore when and how SQL functionality can be integrated within the Pandas framework, as well as its limitations.

15+ Companies Using DuckDB in Production: A Comprehensive Guide

Simon Späti

From Fortune 500 companies processing trillions of security records to innovative startups building interactive data tools, DuckDB is revolutionizing how organizations handle analytical workloads. Building on our exploration of DuckDB’s core capabilities in Part 1, this guide showcases production implementations and promising experimental applications across five key categories.

AI Agent Systems: Modular Engineering for Reliable Enterprise AI Applications

databricks

Monolithic to Modular: The proof of concept (POC) of any new technology often starts with large, monolithic units that are difficult to characterize.

Netflix’s Distributed Counter Abstraction

Netflix Tech

By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction: In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction. This counting service, built on top of the TimeSeries Abstraction, enables distributed counting at scale while maintaining similar low latency performance.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

They Handle 500B Events Daily. Here’s Their Data Engineering Architecture.

Monte Carlo

A data engineering architecture is the structural framework that determines how data flows through an organization – from collection and storage to processing and analysis. It’s the big blueprint we data engineers follow in order to transform raw data into valuable insights. Before building your own data architecture from scratch though, why not steal – er, learn from – what industry leaders have already figured out?

How to Become a Software Engineer (Without a Degree)

KDnuggets

The fastest and simplest route to becoming a software engineer with little cost.

More Trending

The role of AI in changing company structures and dynamics

databricks

The most recent wave of artificial intelligence (AI), spearheaded by the advent and mass adoption of large language models (LLMs), showed the potential.

Unmatched Collaboration for Data & AI Products: What’s New

Snowflake

Getting different teams, business units and even companies to work together toward a common goal not only maximizes efficiency, but drives innovation. Effective collaboration on data and AI has never been more closely tied to success. At Snowflake, we’re removing the barriers that prevent productive cooperation while building the connections to make working together easier than ever.

Triggered Tasks in Snowflake

Cloudyard

Triggered tasks in Snowflake offer a key advantage: they only execute when new data arrives, eliminating the need to run a warehouse or cloud service constantly and reducing associated costs. By leveraging Snowflake’s stream processing and trigger-based task scheduling, we ensure data is loaded and validated as soon as it arrives, allowing for near real-time processing.

Snowflake Unistore: Hybrid Tables Now Generally Available

Snowflake

Today we're thrilled to announce the general availability of Hybrid Tables in all AWS commercial regions (with a few exceptions). As part of Snowflake Unistore, Hybrid Tables unify both transactional and analytical workloads on a single database to simplify architectures as well as governance and security. Since launching the public preview of Hybrid Tables this year, we have seen adoption across industries from customers such as Siemens, Panther, Mutual of Omaha, PowerSchool, MarketWise and

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Confluent Champion: Rising Through the Ranks in Security

Confluent

Read our latest Confluent Champion post to learn how security engineering manager Tejal Adsul is honing her leadership skills at Confluent.

Solving the most complex task of grocery shopping — meal planning

Picnic Engineering

Solving the most complex task of grocery shopping: meal planning. We’re all familiar with grocery shopping. You venture into your nearby supermarket, walk down all the aisles and start filling your basket with the things you need (and sometimes don’t need). But what you might not be aware of is that grocery shopping is a task which requires you to make a myriad of smaller decisions, ranging from easy to extremely difficult ones that leave you wandering aisles for much longer than you’d ideally like to.

How Data Teams Drive Business Success by Understanding Core Metrics

Seattle Data Guy

A key responsibility for any data team is to understand the core metrics driving their business. Starting from the top, these metrics often include figures like gross revenue and expenses. However, these high-level metrics can feel too far removed and abstract from the actual business. Many companies, therefore, break down these top-line metrics into more…
