Sat. Dec 09, 2023 - Fri. Dec 15, 2023

Enhancing LLM Reasoning: Unveiling Chain of Code Prompting

KDnuggets

Chain of Code is an approach to interacting with language models that enhances their reasoning through a blend of writing, executing, and simulating the execution of code. It extends the capabilities of language models in logic, arithmetic, and linguistic tasks, especially those requiring a combination of these.

Coding 154

Even Santa Claus has AI fever

databricks

As CEO of the North Pole, Santa Claus oversees one of the world’s most complicated supply chain, manufacturing and logistics operations. Every year, S.

Uplevel your dbt workflow with these tools and techniques

Start Data Engineering

1. Introduction 2. Setup 3. Ways to uplevel your dbt workflow 3.1. Reproducible environment 3.1.1. A virtual environment with Poetry 3.1.2. Use Docker to run your warehouse locally 3.2. Reduce feedback loop time when developing locally 3.2.1. Run only required dbt objects with selectors 3.2.2. Use prod datasets to build dev models with defer 3.2.3. Parallelize model building by increasing thread count 3.

Datasets 130

Data+AI Summit 2023, retrospective part 2

Waitingforcode

One week later than initially announced, but here it is, the second part for Data+AI Summit 2023 retrospective. I don't know how, but I managed to include some streaming-related talks here too!

Data 130

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

Evolution in ETL: How Skipping Transformation Enhances Data Management

KDnuggets

This article provides an overview of two new data preparation techniques that enable data democratization while minimizing transformation burdens.

Lakehouse Monitoring: A Unified Solution for Quality of Data and AI

databricks

Introduction: Databricks Lakehouse Monitoring allows you to monitor all your data pipelines – from data to features to ML models – without additional too.

More Trending

Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack

Data Engineering Podcast

Summary: If your business metrics looked weird tomorrow, would you know about it first? Anomaly detection is focused on identifying those outliers for you, so that you are the first to know when a business-critical dashboard isn't right. Unfortunately, it can often be complex or expensive to incorporate anomaly detection into your data platform. Andrew Maguire got tired of solving that problem for each of the different roles he has ended up in, so he created the open source Anomstack project.

Data Lake 130

Undersampling Techniques Using Python

KDnuggets

This article discusses undersampling techniques, a form of data preprocessing used to address data imbalance.
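
As a rough illustration of the idea named in this teaser (not code from the linked article), random undersampling can be sketched with the Python standard library alone. The function name `random_undersample`, the `label_of` callback, and the toy data are all hypothetical and exist only for this example.

```python
import random
from collections import defaultdict


def random_undersample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)  # avoid leaving the classes in contiguous runs
    return balanced


# Toy imbalanced dataset: 90 rows of class 0 and 10 rows of class 1.
data = [("a", 0)] * 90 + [("b", 1)] * 10
balanced = random_undersample(data, label_of=lambda row: row[1])
```

Random undersampling is the simplest variant and discards majority-class rows, so it trades data volume for balance.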

Python 149

Build GenAI Apps Faster with New Foundation Model Capabilities

databricks

Following the announcements we made last week about Retrieval Augmented Generation (RAG), we're excited to announce major updates to Model Serving. Databricks Model.

Building 133

Making Flink Serverless, With Queries for Less Than a Penny

Confluent

Dive into the serverless architecture of Confluent Cloud for Apache Flink and explore its benefits like reduced infrastructure costs, increased reliability, & seamless adoption.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Big improvements for field management in Geoprocessing in ArcGIS Pro 3.2

ArcGIS

In ArcGIS Pro 3.2, the field map parameter has been redesigned for improved usability and new capabilities.

7 Pandas Plotting Functions for Quick Data Visualization

KDnuggets

Want to visualize data in your pandas dataframes? Use these nifty pandas plotting functions.

Data 148

Offline LLM Evaluation: Step-by-Step GenAI Application Assessment on Databricks

databricks

Background: In an era where Retrieval-Augmented Generation (RAG) is revolutionizing the way we interact with AI-driven applications, ensuring the efficiency and effectiveness of.

Predictions: The Cybersecurity Challenges of AI

Snowflake

Our recently released predictions report includes a number of important considerations about the likely trajectory of cybercrime in the coming years, and the strategies and tactics that will evolve in response. Every year, the story is “Attackers are getting more sophisticated, and defenders have to keep up.” As we enter a new era of advanced AI technology, we identify some surprising wrinkles to that perennial trend.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Layout sandwich

ArcGIS

How to make a layout sandwich with two synchronized map views, some masking, and some mischief.

119

5 Free University Courses to Learn Python

KDnuggets

Looking for the best resources to learn Python programming? Check out these free university courses.

Python 145

Real-Time Field Service Optimization

Confluent

Telcos use Confluent with event-driven microservices to enable real-time communications with 3rd-party field service providers, fulfilling customer service requests more efficiently.

110

The Three Essentials to Get to Responsible AI

Snowflake

The excitement (and drama) around AI continues to escalate. Why? Because the stakes are high. The race for competitive advantage by applying AI to new use cases is on! The launch of generative AI last year added fuel to the fire, and for good reason. Whereas the existing portfolio of AI tools had targeted the more technically minded like data scientists and engineers, new tools like ChatGPT handed the keys to the kingdom to anyone who could type a question.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Tips for labeling images for object detection models

ArcGIS

In Part 1 of this two-part blog series, we share tips for labeling objects in images for object detection deep learning models.

5 Rare Data Science Skills That Can Help You Get Employed

KDnuggets

This article covers the less common data science skills that can help you get hired. While these skills are not as common as the technical ones, they are certainly worth developing.

Managing AI Security Risks: Introducing a new workshop for CISOs

databricks

Adopting AI is existentially vital for most businesses. Machine Learning (ML) and generative AI (GenAI) are revolutionizing the future of work. Organizations understand.

New Snowflake Features Released in September–November 2023

Snowflake

At our recent Snowday event, we announced a wave of Snowflake product innovations for easier application development, new AI and LLM capabilities, better cost management and more. If you missed the event or need a refresh of what was presented, watch any Snowday session on demand. Let’s dive into all new releases in September, October and November. Architecture Flexibility: Iceberg Tables – public preview. While many customers value the simplicity of fully managed storage and a single, mul

Metadata 116

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you

ArcGIS AI Models – Year in Review

ArcGIS

Learn about our recently released pretrained deep learning models available in the ArcGIS Living Atlas of the World.

AI in Intimate Roles: Girlfriends and Therapists

KDnuggets

This article is a brief overview of the field of Emotion AI, and the potential applications of its technology in intimate roles.

Our First Netflix Data Engineering Summit

Netflix Tech

Holden Karau, Elizabeth Stone, Pedro Duarte, Chris Stephens, Pallavi Phadnis, Lee Woodridge, Mark Cho, Guil Pires, Sujay Jain, Tristan Reid, Senthilnathan Athinarayanan, Bharath Mummadisetty, Abhinaya Shetty, Judit Lantos, Amanuel Kahsay, Dao Mi, Mick Dreeling, Chris Colburn, and Agata Gryzbek. Introduction: Earlier this summer Netflix held our first-ever Data Engineering Forum.

FedRAMP High Authorization on AWS GovCloud (US-West and US-East) Expands Snowflake’s Commitment to Serving the Public Sector

Snowflake

The authorization furthers Snowflake’s commitment to helping our government customers secure and mobilize their mission-critical data. It’s a milestone moment for Snowflake to have achieved FedRAMP High authorization on the AWS GovCloud (US-West and US-East Regions). This authorization, from the Federal Risk and Authorization Management Program (FedRAMP), is one of the most rigorous security endorsements a cloud service provider (CSP) can achieve.

AWS 110

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Delivering cost-effective data in real time with dbt and Databricks

databricks

As businesses grow, data volumes scale from GBs to TBs (or more), and latency demands go from hours to minutes (or less), making.

Data 105

3 Ways to Generate Hyper-Realistic Faces Using Stable Diffusion

KDnuggets

You learned how to generate images using the base model, how to upgrade to the Stable Diffusion XL model to improve image quality, and how to use a custom model to generate high quality portraits.

140


Harnessing the Data Cloud to Empower Our Own Marketing Team: Building a Digital Ads Ecosystem on Snowflake

Snowflake

You need metrics to do your job well as a marketer, but getting clear, meaningful metrics is a huge challenge. While digital advertisers and paid media professionals are on the hook to build ample sales pipeline and maximize return on ad spend (ROAS), they’re also expected to deliver personalized advertising content while navigating evolving privacy requirements and adhering to consumer expectations, all while extracting insights from siloed ad platforms.

Building 105

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m