Trending Articles

10 GitHub Repositories to Master Math

KDnuggets

Learn math through roadmaps, courses, tutorials, Python frameworks for solving equations, guides, exercises, textbooks, and more.

Python 133

2024 retrospective on waitingforcode.com

Waitingforcode

Even though I was blogging less in the second half of the previous year, the retrospective is still the blog post I'm waiting for each year. Every year I summarize what happened in the past 12 months and share with you my future plans. It's time for the 2024 Edition!

IT 130

Top 11 GenAI Powered Data Engineering Tools to Follow in 2025

Analytics Vidhya

What will data engineering look like in 2025? How will generative AI shape the tools and processes Data Engineers rely on today? As the field evolves, Data Engineers are stepping into a future where innovation and efficiency take center stage. GenAI is already transforming how data is managed, analyzed, and utilized, paving the way for […]

Agents of Change: Navigating 2025 with AI and Data Innovation

Data Engineering Weekly

As we approach the new year, it's time to gaze into the crystal ball and ponder the future. In this post, we delve into predictions for 2025, focusing on the transformative role of AI agents, workforce dynamics, and data platforms. Join Ananth Packkildurai, Ashwin Ashish, and Rajesh as they unravel the future and guide us through the fascinating changes ahead.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Queues in Apache Kafka®: Enhancing Message Processing and Scalability

Confluent

Queue support in Apache Kafka 4.0, enabled by share groups, lets you accommodate traditional queue-type workloads through cooperative consumption.

Kafka 131

Designing a Declarative Data Stack: From Theory to Practice

Simon Späti

What started as a straightforward implementation guide for a declarative data stack quickly evolved into something more fundamental. While attempting to build a system that could define an entire data stack through a single YAML file, I encountered architectural questions that challenged my initial assumptions: Should we generate production-ready code from templates or create a boilerplate repository with best-in-class tools?

Designing 130

More Trending

2024’s Biggest Moments in AI

KDnuggets

2024 has been yet another groundbreaking year for AI, with major breakthroughs, industry shifts, and ethical challenges shaping its future. Let's look together at the key moments that defined AI in the year that is about to end.

IT 125

Guide to connecting to Excel files in ArcGIS Pro

ArcGIS

This blog provides step-by-step guidance to determine and use a silent install when configuring a driver to use Excel files in ArcGIS Pro.

Integrating Microservices with Confluent Cloud Using Micronaut® Framework

Confluent

Real-time data streaming and messaging are essential for building scalable, resilient, event-driven microservices. Explore integrating the Micronaut framework with Confluent Cloud.

Cloud 115

Secure External Access to Unity Catalog Assets via Open APIs

databricks

We're excited to announce the Public Preview of credential vending for Unity Catalog's open APIs, allowing external clients to securely access Unity Catalog.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

The Basics of SFTP: Authentication, Encryption, and File Management

Seattle Data Guy

If you’re looking to pass hundreds of GBs of data quickly, you’re likely not going to use a REST API. That’s why every day, companies share data sets of users, patient claims, financial transactions, and more via SFTP. If you’ve been in the industry for a while, you’ve probably come across automated SFTP jobs that…

How to Implement Image Captioning with Vision Transformer (ViT) and Hugging Face Transformers

KDnuggets

A beginner's guide to getting started with image captioning models with Hugging Face.

124

Simplicity in the Modern Data Stack

Confessions of a Data Guy

We have all come to live in the Modern Data Stack, and whether we like it or not, our lives are no longer as simple as they were in the days of SQL Server and SSIS. Things have changed A LOT. There are good and bad sides to that coin. The Modern Data Stack has […]

SQL 100

Generative AI Meets Data Streaming (Part I) – Data as the Engine: Building the AI Fundamentals

Confluent

Discover how data fuels Generative AI and why streaming data is key to success. Learn the fundamentals to unlock AI's true potential for your business.

Building 111

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Strategic Priorities for Data and AI Leaders in 2025

databricks

AI remains at the forefront of every business leader's plans for 2025. Overall, 70% of businesses continue to believe AI is critical to.

Data 108

Introducing Configurable Metaflow

Netflix Tech

David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*, Shashank Srikanth*, Chaoying Wang*, Regina Wang*, Darin Yu* (*: Model Development Team, Machine Learning Platform; ^: Content Demand Modeling Team). A month ago at QConSF, we showcased how Netflix utilizes Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows.

10 Pandas One-Liners for Quick Data Quality Checks

KDnuggets

Want to run some quick data quality checks? Here are 10 pandas one-liners that'll come in handy.

Data 113

Key Takeaways from AWS re:Invent 2024

Cloudera

AWS re:Invent is one of my favorite trade shows. It is one of the biggest technology conferences of the year and is an opportunity to have hundreds of conversations with customers and prospects, listen to their priorities, challenges, and hopes, and give them a Cloudera tote bag or a pair of orange sunglasses. What follows is a collection of just a few things I learned and observed during my week in Las Vegas.

AWS 75

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Indexing code at scale with Glean

Engineering at Meta

We're sharing details about Glean, Meta's open source system for collecting, deriving and working with facts about source code. In this blog post we'll talk about why a system like Glean is important, explain the rationale for Glean's design, and run through some of the ways we're using Glean to supercharge our developer tooling at Meta. In August 2021 we open-sourced our code indexing system Glean.

Coding 70

Introducing Git Support for Queries in Databricks

databricks

We're excited to announce the Public Preview of Query Git integration as part of the new SQL Editor. Git support for queries.

SQL 107

Generative AI Meets Data Streaming (Part III) – Scaling AI in Real Time: Data Streaming and Event-Driven Architecture

Confluent

Learn how data streaming platforms and event-driven architecture enable real-time, scalable AI solutions to power smarter, faster business decisions.

The Most Popular KDnuggets Articles of 2024

KDnuggets

Let's have a look at the most popular articles on KDnuggets this past year. How many have you read?

108

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Top 10 Data & AI Trends for 2025

Towards Data Science

Agentic AI, small data, and the search for value in the age of the unstructured data stack. According to industry experts, 2024 was destined to be a banner year for generative AI. Operational use cases were rising to the surface, technology was reducing barriers to entry, and general artificial intelligence was obviously right around the corner.

Translating Java to Kotlin at Scale

Engineering at Meta

Meta has been on a years-long undertaking to translate our entire Android codebase from Java to Kotlin. Today, despite having one of the largest Android codebases in the world, we’re well past the halfway point and still going. We’re sharing some of the tradeoffs we’ve made to support automating our transition to Kotlin, seemingly simple transformations that are surprisingly tricky, and how we’re collaborating with other companies to capture hundreds more corner cases.

Java 82

Benchmarking Domain Intelligence

databricks

Large language models are improving rapidly; to date, this improvement has largely been measured via academic benchmarks. These benchmarks, such as MMLU and.

107

Part 2: A Survey of Analytics Engineering Work at Netflix

Netflix Tech

This article is the second in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. Need to catch up? Check out Part 1. In this article, we highlight a few exciting analytic business applications, and in our final article we'll go into aspects of the technical craft.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Job Hunting in 2025: What You Need to Know

KDnuggets

This is a quick shortlist to make sure you're ticking off the essentials for your job hunt in 2025.

105

Cloudera’s Take: What’s in Store for Data and AI in 2025

Cloudera

In the last year, we've seen the explosion of AI in the enterprise, leaving organizations to consider the infrastructure and processes needed for AI to deploy successfully and securely across an organization. As we head into 2025, it's clear that next year will be just as exciting as past years. Here, Cloudera experts share their insights on what to expect in data and AI for the enterprise in 2025.

How we think about Threads’ iOS performance

Engineering at Meta

How did the Threads iOS team maintain the app’s performance during its incredible growth? Here’s how Meta’s Threads team thinks about performance, including the key metrics we monitor to keep the app healthy. We’re also diving into some case studies that impact publish reliability and navigation latency. When Meta launched Threads in 2023, it became the fastest-growing app in history, gaining 100 million users in only five days.

Media 79

How HP is optimizing the 3D Printing supply chain using Delta Sharing

databricks

Javier Lagares is a Principal Data Engineer at HP, where he leads the development of data-driven solutions for the 3D printing business. With.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases, the number of use cases and corner cases that the system is expected to handle essentially explodes. Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.