Sat.Apr 27, 2024 - Fri.May 03, 2024

Is the “AI developer” a threat to jobs – or a marketing stunt?

The Pragmatic Engineer

This article was published on 14 March 2024 in The Pragmatic Engineer, for subscribers. I'm sharing this piece in public more than a month later, as it provides important context and analysis for the AI dev tools space. Subscribe to The Pragmatic Engineer to stay up-to-date on what is happening with software engineering, Big Tech, and startups.

Why did Golang lose to Rust for Data Engineering?

Confessions of a Data Guy

A few years ago I wasn’t sure who was going to win; Golang seemed to be popular, and still is for that matter. When I first wrote a little Golang (~2+ years ago) I was just trying to see what the hype was all about. The funny thing is, at the time, and today, it […]

Trending Sources

5 Simple Steps to Automate Data Cleaning with Python

KDnuggets

Automate your data cleaning process with a practical 5-step pipeline in Python, ideal for beginners.

Python 152

Build Your Second Brain One Piece At A Time

Data Engineering Podcast

Summary: Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. To simplify the integration of AI capabilities into developer workflows, Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use.

Building 147

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

Introducing Confluent Cloud Freight Clusters

Confluent

Confluent Cloud Freight clusters are now available in Early Access. In this blog, learn how Freight clusters can save you up to 90% at GBps+ scale.

Cloud 145

Databricks named a Leader in the 2024 Forrester Wave for Data Lakehouses

databricks

We are proud to announce that Forrester has recognized Databricks as a Leader with the highest scores in both current offering and strategy.

Data 134

More Trending

How to build a data team

Christophe Blefari

My personal collection of the best resources to bootstrap a data team and get inspired from what others are doing.

Building 130

Reaction to Data Engineering Survey for 2024

Confessions of a Data Guy

Executive Overview: The Rise of Open Foundational Models

databricks

Moving generative AI applications from the proof of concept stage into production requires control, reliability and data governance. Organizations are turning to open.

Containerize Python Apps with Docker in 5 Easy Steps

KDnuggets

Get up and running with Docker with this tutorial on containerizing Python applications.

Python 142

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Terms You Should Know If You’re Planning To Use Change Data Capture

Seattle Data Guy

If you’ve worked in data long enough, then you’ve likely come across the term change data capture. Often called CDC, change data capture involves tracking and recording changes in a database as they happen, and then transmitting these changes to designated targets. This can be crucial because some pipelines, in particular batch pipelines, don’t capture…

Database 130

Moving Beyond MTEB and BEIR: Snowflake AI Research Joins Forces with the University of Waterloo to Evolve RAG and Retrieval Benchmarks

Snowflake

To accurately answer business questions using LLMs, companies must augment models with their data. Retrieval Augmented Generation (RAG) is a popular solution to this problem, as it integrates the organization’s factual, real-time data into the prompt for the LLM. While the adoption of RAG has increased, an open question remains: How do enterprises know how effective their system is?

Cloud 128

Databricks Assistant Tips & Tricks for Data Engineers

databricks

The generative AI revolution is transforming the way that teams work, and Databricks Assistant leverages the best of these advancements. It allows you.

5 MLOps Courses from Google to Level Up Your ML Workflow

KDnuggets

Want to build and deploy robust machine learning systems to production? Start learning MLOps today with these courses from Google.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

ArcGIS Pro 3.3 Requires WebView2 Runtime (and you probably already have it)

ArcGIS

ArcGIS Pro 3.3 requires WebView2 Runtime as an installation prerequisite. Here's how to make sure you have it.

IT 125

If:Else Logic and Complexity – Hiding the Pea.

Confessions of a Data Guy

I was recently confronted with an interesting conundrum when writing a complex data pipeline. It was an interesting problem that arose from my quest to reduce complexity in part of the design, which found itself creeping into another part, reinforcing the classic idea of whether you can really make the complexity pea go away, or […]

Calibrating the Mosaic Evaluation Gauntlet

databricks

A good benchmark is one that clearly shows which models are better and which are worse. The Databricks Mosaic Research team is dedicated.

128

Avoid These 5 Common Mistakes Every Novice in AI Makes

KDnuggets

Top five mistakes made by AI beginners and practical tips to avoid them, along with an engaging "50-Day Challenge" that you cannot afford to miss.

141

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

The right words in the right place

Tweag

tl;dr: You may not believe it, but Nix documentation is getting better. Nixpkgs and NixOS still need more time. This is a retrospective of my and many other people’s work on documentation in the Nix ecosystem between October 2022 and March 2024.

Top 8 Snowflake Marketplace Questions, Answered

Snowflake

Snowflake Marketplace is designed to give customers and organizations a place to easily find, try and buy data, apps and AI products that help solve their most pressing business problems. We have more than 540 providers, offering over 2,400 live, ready-to-use data products (as of Jan 31, 2024), so there are many options to help you enrich your own data resources, build new data apps and leverage the power of AI on Snowflake.

The Modern Data Stack: How The Evolution of Data Architecture Led to The Data Intelligence Platform

databricks

Modern data stacks provide the necessary flexibility and efficiency for analytics and AI. Learn how the Databricks Data Intelligence Platform makes use of them.

The Ultimate AI Strategy Playbook

KDnuggets

Many businesses rush to adopt AI but fail due to poor strategy. This post serves as your go-to playbook for success.

136

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you

Reading and Processing JSON with Rust vs Python.

Confessions of a Data Guy

Have you ever wondered about being explicit in your code vs being vague? I think about this a lot as I’m writing code on a daily basis. I’ve found I like being explicit and verbose when writing code, rather than being vague in what I’m doing most of the time. When it comes to debugging […]

Python 100

Meet the 2024 Snowflake Startup Challenge Finalists

Snowflake

The 2024 Snowflake Startup Challenge began with over 900 applications from startups Powered by Snowflake in more than 100 countries. Our judges narrowed that long list of contenders down to 10, and after much deliberation, they’ve now pared it down to the final three. We are pleased to announce that BigGeo, Scientific Financial Systems and SignalFlare.ai by Extropy360 will advance to the Snowflake Startup Challenge finale and compete for the opportunity to receive a share of up to $1 million in

Media 105

Revolutionizing Data in Sports: The Game-Changing Impact of Databricks Marketplace and Delta Sharing

databricks

Unlock the power of advanced sports analytics with Databricks Marketplace and Delta Sharing. Discover how these platforms are transforming the sports industry by enabling seamless data access, collaboration, and real-time insights. Leverage a diverse array of data assets to optimize performance, enhance fan engagement, and gain a competitive edge. Explore the future of sports analytics, powered by Databricks.

Data 111

Data Science Degrees vs. Courses: The Value Verdict

KDnuggets

Exploring the merits of data science degrees vs courses, this analysis contrasts their depth, prestige, and practicality in job market preparation.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Google Fires Python. What Next?

Confessions of a Data Guy

What is going on? Is the world coming to an end? I thought Python was going to live forever. Well, apparently not at Google. Recently Google announced it was laying off its entire North American-based Python team that was supporting Google’s special needs with Python, in favor of cheaper offshore workers. Apparently, some of these […]

Python 100

How the EU’s Digital Operational Resilience Act (DORA) Aims To Strengthen Operational Resilience in Financial Services

Snowflake

As the cybersecurity threat landscape continues to evolve globally, organizations operating in the financial sector are seeing regulations shift to address the associated risks, and none may prove more impactful than the European Union’s (EU) Digital Operational Resilience Act (DORA). This regulation aims to strengthen the operational resilience of financial entities (FEs), and their third-party information and communication technology (ICT) providers.

Intelligently Balance Cost Optimization & Reliability on Databricks

databricks

The Databricks Data Intelligence Platform offers unparalleled flexibility, allowing users to access nearly instant, horizontally scalable compute resources. This ease of creation can.

Free Python Resources That Can Help You Become a Pro

KDnuggets

This is a collection of free courses, books, projects, repositories, cheat sheets, and online compilers on Python to help you get started and gain experience.

Python 132

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m