Tue. Feb 25, 2025

Generative AI for Data Scientists in 2025: Beyond Text Generation

KDnuggets

Directions to become "upgraded" data scientists prepared to fully leverage generative AI technologies in the year ahead.

Revolutionizing Enterprise Data Analytics at ReaderLink: From SQL to AI-Powered Insights

databricks

In today's fast-paced business environment, the ability to quickly access and analyze data is crucial for maintaining a competitive edge. As North America's largest book.

10 Essential Docker Commands for Data Engineering

KDnuggets

Tired of 'it works on my machine' problems? Learn the top 10 Docker commands every data engineer needs to build, deploy, and scale projects like a pro!

ArcGIS CityEngine: 3D Visibility Analysis for Small Urban Wind Turbines in Brussels

ArcGIS

UC Louvain analysts used ArcGIS CityEngine for a Python-automated visibility analysis to determine suitable locations for urban wind turbines.

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: Create a standardized process for debugging to quickly diagnose errors in your DAGs; Identify common issues with DAGs, tasks, and connections; Distinguish between Airflow-relate

30 Must-Know Tools for Python Development

KDnuggets

A structured overview of the essential tools developers can use across different aspects of Python development

Databricks at MWC 2025: Telecommunications runs on the Data Intelligence Platform

databricks

Book a meeting with Databricks at MWC 2025! As we approach Mobile World Congress (MWC) 2025, the telecommunications industry is poised for a transformative leap.

More Trending

Managing Data Contracts: Helping Developers Codify “Shift Left”

Confluent

We've talked a lot about shift left, but let's apply the concept with Confluent Cloud using Terraform and Gradle to manage and maintain data contracts.

From Liberal Arts to Data Science: What to Expect on Your Journey

Elder Research

Data Scientist Henry Mead shares a practical guide on what it takes to land a job as a data scientist coming from the humanities.

Nousot and Xcel Energy: Harnessing AI and Geospatial Intelligence for Natural Disaster Mitigation

databricks

For utility companies such as Xcel Energy, wildfire mitigation is critical to protecting electrical infrastructure and minimizing the risk of utility-related ignition events. Typical mitigation.

Announcing the Migration Toolset for Utility Network

ArcGIS

Introducing the Utility Network Migration Toolset, an easy way to transition GIS data to ArcGIS Utility Network with flexibility and control.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Empowering Growth Through Training and Enablement

Snowflake

Snowflake ROI. The Value of Snowflake Training.

The Guide to Common Data Engineer Design Patterns

Monte Carlo

Data pipelines are messy. You build them to be smooth and efficient, but then reality happens: schema changes, bottlenecks, weird edge cases you didn't see coming. That's why solid design patterns matter. Data engineering design patterns are repeatable solutions that help you structure, optimize, and scale data processing, storage, and movement. They make data workflows more resilient and easier to manage when things inevitably go sideways.

How Meta is translating its Java codebase to Kotlin

Engineering at Meta

Meta has been working to shift its Android codebase from Java to Kotlin, a newer language for Android development that offers some key advantages over Java. We've even open sourced various examples and utilities we used in our migration to manipulate Kotlin code. So how do you translate roughly tens of millions of lines of Java code to Kotlin? On this episode of the Meta Tech Podcast, Pascal Hartig sits down with Eve and Jocelyn, two software engineers on Meta's Mobile Infra Codebases Team, t

What is a Healthy Lake House?

Confessions of a Data Guy

Maybe I’m the only one who thinks about it, not sure. The Lake House has become the new Data Warehouse, yet when I ask this question “What makes a healthy Lake House?” no one is sure what the answer is, or you get different answers. It seems like a pretty important question considering that Lake […]

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Empowering Growth Through Training and Enablement

Snowflake

Throughout my career, I've had the privilege of working across the full spectrum of enablement: internal enablement, partner enablement and customer enablement. Each of these domains brings unique challenges, audiences and approaches, but a common thread unites them all: the goal of fostering growth. At its core, training and enablement are not just about imparting knowledge or improving skills.

Striim 5.0 Release: Introducing Striim Copilot for AI-Driven Pipeline Creation and Troubleshooting

Striim

In Striim's latest release, version 5.0, Striim Copilot takes the stage as an AI-powered assistant designed to streamline the process of building, managing, and troubleshooting streaming data pipelines. Striim Copilot brings intelligent guidance and support directly into the Striim platform, enhancing productivity and reducing the time it takes to bring data projects from concept to reality.

CI/CD for Data Teams: A Roadmap to Reliable Data Pipelines

Ascend.io

Continuous Integration and Continuous Delivery (CI/CD) has transformed software development by enabling faster, safer deployments, and data teams are now realizing these same benefits must extend to data pipelines and analytics code. But applying CI/CD in a data context comes with unique challenges. In this guide, we’ll explore general CI/CD principles and dive into data-specific hurdles, with practical best practices to help data teams foster a robust, tool-agnostic CI/CD process.

Sentiment Analysis

WeCloudData

Sentiment analysis is the process of analyzing textual data to check its emotional tone, i.e., whether it expresses a positive, negative, or neutral sentiment. Companies have massive amounts of data about their customers, from emails and posts on X to feedback, online survey responses, reviews, and chats with customer service representatives. These data can be […]
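
As a rough illustration of the positive / negative / neutral labeling described in this teaser, here is a minimal lexicon-based sketch in Python. It is not taken from the WeCloudData post: the word lists and the `classify_sentiment` name are hypothetical, and real systems typically rely on trained models rather than hand-picked words.

```python
import re

# Hypothetical toy lexicons, chosen only for illustration.
POSITIVE = {"good", "great", "love", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def classify_sentiment(text: str) -> str:
    """Label text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Net score: positive hits minus negative hits.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `classify_sentiment` on a customer review returns one of the three labels; text with no lexicon words, or with equal positive and negative hits, falls back to neutral.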

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

5.0 Release: Unlocking the Power of Snowflake CDC for Real-Time Data Replication

Striim

What is Snowflake CDC? Snowflake CDC (Change Data Capture) is a method that enables real-time data replication from Snowflake databases by tracking and capturing changes made to tables. Using a specialized Snowflake Reader, it enables continuous replication after an initial load, ensuring that any data manipulation language (DML) changes like inserts, updates, and deletes are identified and captured in near real-time.
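
The general idea of change data capture can be sketched in a few lines of Python: after an initial load, a stream of insert, update, and delete events is applied to a replica. This is a generic illustration only, not Striim's or Snowflake's API; the event format (`op`, `key`, `row`) and the `apply_changes` function are assumptions made up for this example.

```python
from typing import Any

Row = dict[str, Any]


def apply_changes(replica: dict[int, Row], events: list[dict[str, Any]]) -> dict[int, Row]:
    """Apply captured DML events, in order, to an in-memory replica keyed by primary key."""
    for event in events:
        op, key = event["op"], event["key"]
        if op == "insert":
            # New row: store a copy so later edits to the event do not leak in.
            replica[key] = dict(event["row"])
        elif op == "update":
            # Merge only the changed columns into the existing row.
            replica.setdefault(key, {}).update(event["row"])
        elif op == "delete":
            # Remove the row if it is present.
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return replica


# Hypothetical usage: an initial load followed by three captured changes.
replica = {1: {"name": "Ada", "city": "London"}}
events = [
    {"op": "insert", "key": 2, "row": {"name": "Linus", "city": "Helsinki"}},
    {"op": "update", "key": 1, "row": {"city": "Paris"}},
    {"op": "delete", "key": 2},
]
apply_changes(replica, events)
```

Applying events strictly in capture order is the design choice that matters here: the same three events in a different order could leave the replica in a different state.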

Better Together: Data Enrichment and AI for Smarter Decision-Making

Precisely

Key Takeaways: Enrich your raw data with context to unlock its full potential and enable smarter, data-driven decision-making. Combine data enrichment and AI for more accurate predictions, personalized insights, and proactive strategies. To successfully implement data enrichment, identify current data gaps, choose the right providers, and leverage APIs and AI for maximum impact.

Striim 5.0 Release: Unlock Real-Time Insights with the JIRA Reader Integration

Striim

Striim 5.0 brings exciting new features that streamline real-time data management and empower businesses to make data-driven decisions faster. Among these, the new Atlassian JIRA Reader stands out as a key innovation, enabling seamless integration with JIRA, a powerful issue tracking system widely used for bug tracking and project management.

Striim 5.0 Release: Unlock Real-Time Marketing Insights with the Google Ads Reader

Striim

Real-time insights are crucial for making data-driven decisions and staying ahead of the competition. Striim 5.0’s latest feature, the Google Ads Reader, helps businesses unlock the full potential of their Google Ads data by providing seamless, real-time integration with analytical systems like BigQuery and Snowflake. Let's dive into what this feature can do, how to use it, and the value Striim adds to your business.

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; Write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you

Striim 5.0 Release: Streamline Data Integration with ServiceNow Reader and Writer

Striim

Striim's new 5.0 release introduces the ServiceNow Reader and Writer adapters, a game-changing feature that enhances how companies integrate and manage their data with ServiceNow. By seamlessly reading from and writing to ServiceNow's platform, Striim enables businesses to optimize workflows and improve operational efficiency like never before. What Does It Do?

Striim 5.0 Release: Supercharge Your Data Integration with Microsoft Dataverse Writer

Striim

Organizations need to ensure their systems are integrated seamlessly for efficient decision-making. Striim’s 5.0 release introduces the Microsoft Dataverse Writer, a powerful target adapter that enhances how businesses integrate and write data to Microsoft Dataverse. With its ability to handle both standard and custom objects, Striim allows companies to unlock real-time insights and drive more informed decisions across their operations.

Striim 5.0 Release: Introducing Automated Pipelines for Effortless Data Replication

Striim

The release of Striim 5.0 brings a powerful new feature, Automated Pipelines, designed to transform how businesses handle real-time data replication. Automated Pipelines empowers organizations to deploy high-performance, streaming data replication applications with just a few clicks, making complex replication processes simple and accessible for teams at any scale.

Striim 5.0 Release: Supercharge Customer Service with the Zendesk Reader

Striim

Real-time access to data is essential for delivering outstanding customer experiences. Striim’s 5.0 release introduces the Zendesk Reader, a powerful tool that enables businesses to seamlessly integrate their Zendesk data into their broader data ecosystem. This integration enhances decision-making and helps teams improve customer service efficiency by providing timely insights from their help desk management system.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

Striim 5.0 Release: Unleash Real-Time HubSpot Integration with Our Latest Connector

Striim

As businesses increasingly rely on HubSpot's customer platform for marketing, sales, customer service, and more, the ability to seamlessly move data in real time is a game changer. Striim 5.0 introduces the HubSpot Reader, a powerful connector that integrates your HubSpot CRM data with any target system. With Striim's robust real-time data movement capabilities, you can unlock the full potential of your HubSpot data to enhance analytics, streamline operations, and improve customer experiences.