Sat.Aug 24, 2024 - Fri.Aug 30, 2024

article thumbnail

Apache Spark’s Most Annoying Use Case

Confessions of a Data Guy

I still remember the good ole days when Apache Spark was fresh and hot, hardly anyone was using it, except a few poor AWS Glue and EMR users … Lord have mercy on their ragged souls. It’s funny how that GOAT of a tool went from being used by a few companies for extremely large […] The post Apache Spark’s Most Annoying Use Case appeared first on Confessions of a Data Guy.

AWS 147
article thumbnail

Data Teams Survey 2024 Results

Jesse Anderson

In the spring of 2024, I ran a new survey to gather more data for my Data Teams book and update my 2023 and 2020 surveys. In total, we had 81 respondents. This survey was designed to get information about how management uses data teams, the value they’re creating, and how they’re creating it. The survey asked about the best and worst practices that teams are using or experiencing.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Data News — Week 24.34

Christophe Blefari

News again. ( credits ) It's been 3 weeks. Summer continues and I hope this new edition finds you well, having had a great vacation and a nice break before getting back to business in September. Content and articles have been a little slow over the last few weeks and that's to be expected, but I feel it gonna get back to business as usual soon.

BI 130
article thumbnail

Project Ideas to Master Data Engineering

KDnuggets

Data engineering is best learned by doing projects. But which ones? Here are six projects focusing on different data engineering skills to ensure you have it all covered.

article thumbnail

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.

article thumbnail

How to perform change data capture (CDC) from full database snapshots using Delta Live Tables

databricks

Learn more about processing snapshots using Delta Live Tables and how you can use the new Apply changes from Snapshshot statement in DLT to build SCD Type 1 or SCD Type 2 target tables delivering incremental data and insights that would typically take months of effort on legacy platforms.

Database 104
article thumbnail

The Big Data London Guide: 2024 Edition

Monte Carlo

Another Big Data London is right around the corner, and we couldn’t be more excited. Coming in hot on September 18-19, Big Data London is easily the UK’s biggest data event of the year. And with an event as rare and prestigious as Big Data London, it’s normal to want to maximize your time. That’s why we put together our list of the top things to see and do at Big Data London this year—including the data reliability sessions we’re most excited about and the after-parties you don’t want to miss.

More Trending

article thumbnail

5 Tips for Getting Started with Language Models

KDnuggets

Break the ice and dispel any fears about this expanding branch of AI with these five pieces of advice that will help you know where to start learning

109
109
article thumbnail

Announcing Hybrid Search General Availability in Mosaic AI Vector Search

databricks

We're excited to announce the general availability of hybrid search in Mosaic AI Vector Search. Hybrid search is a powerful feature that combines.

article thumbnail

Add Flexera’s State of the Cloud Report to Your Summer Reading List

Cloudera

It’s nearing the end of the summer in North America, and one report has been a staple on my reading list for more than a decade: the Flexera State of the Cloud Report. The annual survey of hundreds of global IT decision makers assesses cloud strategies, migration trends, and important considerations for companies moving to the cloud or managing cloud environments.

Cloud 85
article thumbnail

Introducing the Rebuild Network Topology Add-In for ArcGIS Pro 2.9 and 3.1

ArcGIS

The Rebuild Network Topology Add-In provides the ability to rebuild the network topology for the current extent of an active map with ArcGIS Pro 2.9 and 3.1.

article thumbnail

Launching LLM-Based Products: From Concept to Cash in 90 Days

Speaker: Christophe Louvion, Chief Product & Technology Officer of NRC Health and Tony Karrer, CTO at Aggregage

Christophe Louvion, Chief Product & Technology Officer of NRC Health, is here to take us through how he guided his company's recent experience of getting from concept to launch and sales of products within 90 days. In this exclusive webinar, Christophe will cover key aspects of his journey, including: LLM Development & Quick Wins 🤖 Understand how LLMs differ from traditional software, identifying opportunities for rapid development and deployment.

article thumbnail

How to Use NumPy to Solve Systems of Nonlinear Equations

KDnuggets

In this article, we’ll explore how to leverage NumPy to solve systems of nonlinear equations, turning complex mathematical challenges into manageable tasks.

Systems 106
article thumbnail

Winning at GenAI: Building the right processes for the data intelligence future

databricks

Learn how companies can create repeatable and scalable workflows that enable users to quickly turn GenAI innovation from experimentation to reality.

Process 93
article thumbnail

Data Engineering Weekly #186

Data Engineering Weekly

Try Fully Managed Apache Airflow for FREE Run Airflow without the hassle and management complexity. Take Astro (the fully managed Airflow solution) for a test drive today and unlock a suite of features designed to simplify, optimize, and scale your data pipelines. For a limited time, new sign-ups will receive a complimentary Airflow Fundamentals Certification exam (normally $150).

article thumbnail

Mosaic datasets: More than the sum of its parts

ArcGIS

Mosaic datasets are the backbone of imagery layers, but provide much more to your organization than simply creating imagery layers.

article thumbnail

How To Speak The Language Of Financial Success In Product Management

Speaker: Jamie Bernard

Success in product management goes beyond delivering great features - it’s about achieving measurable financial outcomes that resonate across the organization. By connecting your product’s journey with the company’s financial success, you’ll ensure that every feature, release, and innovation contributes to the bottom line, driving both customer satisfaction and business growth.

article thumbnail

How to Build and Train a Transformer Model from Scratch with Hugging Face Transformers

KDnuggets

A step-to-step guide to navigate you through training your own transformer-based language model.

Building 131
article thumbnail

The GenAI Journey: How Enterprises are Progressing from General-Purpose to Custom LLMs

databricks

Every company's path from foundational to tailored LLMs will be different. Each will require new tooling to help developers deliver the accurate and governed GenAI that leaders are demanding.

article thumbnail

Mainframe to Cloud Migrations: Expert Insights from AWS, Confluent, and Precisely

Precisely

Key Takeaways: Enhance capabilities through partnerships: AWS, Confluent, and Precisely accelerate mainframe modernization efforts, providing you with essential tools for success. Minimize migration disruptions through phased implementation, starting with low-risk, high-value projects. A strategic and tailored approach to mainframe modernization can enhancing business agility and innovation.

AWS 64
article thumbnail

Display “Quantity by Category” Symbology in ArcGIS Pro

ArcGIS

You can replicate Quantity by Category symbology in ArcGIS Pro 3.3 by classifying a Size or Color visual variable.

85
article thumbnail

What Is Entity Resolution? How It Works & Why It Matters

Entity Resolution Sometimes referred to as data matching or fuzzy matching, entity resolution, is critical for data quality, analytics, graph visualization and AI. Learn what entity resolution is, why it matters, how it works and its benefits. Advanced entity resolution using AI is crucial because it efficiently and easily solves many of today’s data quality and analytics problems.

article thumbnail

Generative AI Specialisation Courses from IBM for Every Profession

KDnuggets

Check out these 5 IBM specialisation courses specific to those who want to learn more about generative AI.

123
123
article thumbnail

Stepping into personalized experiences for every customer with the Databricks Data Intelligence Platform

databricks

Skechers has been at the forefront of the e-commerce industry, focusing on hyperpersonalized experiences to meet customer expectations better. Following significant growth during.

Data 71
article thumbnail

Confluent Champion: The Power of a Learning Culture and Motivated Teams

Confluent

In our latest Confluent Champion post, Janis Hom, staff security GRC program manager, highlights how Confluent fosters a culture that helps her stay motivated.

article thumbnail

Comprehensive IBM i Security Requires a Multi-layered Approach

Precisely

Key Takeaways Implement a multi-layered defense to ensure robust protection for your IBM i environment against evolving cybersecurity threats. Address unique IBM i security challenges by recognizing vulnerabilities like integration issues, skilled staff shortages, and unpatched systems. Stay proactive and informed with vulnerability reports that help you understand and mitigate risks, including zero-day vulnerabilities.

article thumbnail

Provide Real Value in Your Applications with Data and Analytics

The complexity of financial data, the need for real-time insight, and the demand for user-friendly visualizations can seem daunting when it comes to analytics - but there is an easier way. With Logi Symphony, we aim to turn these challenges into opportunities. Our platform empowers you to seamlessly integrate advanced data analytics, generative AI, data visualization, and pixel-perfect reporting into your applications, transforming raw data into actionable insights.

article thumbnail

5 Tips for Optimizing Machine Learning Algorithms

KDnuggets

Embrace these five best-practices boost the effectiveness of your trained machine learning solutions, no matter their complexity

article thumbnail

Cost-effective, incremental ETL with serverless compute for Delta Live Tables pipelines

databricks

We recently announced the general availability of serverless compute for Notebooks, Workflows, and Delta Live Tables (DLT) pipelines. Today, we'd like to explain.

78
article thumbnail

Use Business Analyst’s Target Marketing Wizard to find customers in a new area

ArcGIS

Identify the most promising customers in a new area using the Business Analyst Target Marketing wizard and Esri's Tapestry Segmentation.

article thumbnail

Unlock Real-Time Value from DynamoDB Data with Confluent's CDC Source Connector

Confluent

You can simplify the transfer of data from one or more DynamoDB tables to Confluent Cloud with the fully managed, no code, Confluent CDC source connector.

Cloud 60
article thumbnail

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving everyday. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI pr

article thumbnail

How to Translate Languages with MarianMT and Hugging Face Transformers

KDnuggets

Discover how to translate text quickly and accurately between languages with just a few simple steps using MarianMT.

113
113
article thumbnail

Highlights from the Databricks Community

databricks

Within the Databricks Community, there is a technical blog where community members share best practices, tutorials and insights on data analytics, data engineering.

article thumbnail

AWS Glue Architecture: Components, Working, and Alternatives

Hevo

AWS Glue is a fully managed serverless ETL service that simplifies preparing and loading data for analytics. But how does it work? To answer that question, we need to understand its architecture. In this blog, we will discuss the AWS Glue architecture so you can fully understand how it works and optimize your data better.

AWS 52
article thumbnail

Startup Spotlight: Genesis’ Co-Worker Agents Lend AI-Powered Assistance

Snowflake

Welcome to Snowflake’s Startup Spotlight, where we ask startup founders about the problems they’re solving, the apps they’re building and the lessons they’ve learned during their startup journey. In this edition, we’ll learn why the founders of Genesis , Matt Glickman and Justin Langseth, decided to take on the challenge of creating AI-powered assistants to run generative AI workloads in Snowflake, and why “Eliza” and “Stuart” might soon be joining your team meetings.

Cloud 52
article thumbnail

The AI Superhero Approach to Product Management

Speaker: Conrado Morlan

In this engaging and witty talk, industry expert Conrado Morlan will explore how artificial intelligence can transform the daily tasks of product managers into streamlined, efficient processes. Using the lens of a superhero narrative, he’ll uncover how AI can be the ultimate sidekick, aiding in data management and reporting, enhancing productivity, and boosting innovation.