Sat. Sep 28, 2019 - Fri. Oct 04, 2019

Choosing the Right Clustering Algorithm for your Dataset

KDnuggets

Applying a clustering algorithm is much easier than selecting the best one. Each type offers pros and cons that must be considered if you’re striving for a tidy cluster structure.

Algorithm 123

Ship Faster With An Opinionated Data Pipeline Framework

Data Engineering Podcast

Summary: Building an end-to-end data pipeline for your machine learning projects is a complex task, made more difficult by the variety of ways that you can structure it. Kedro is a framework that provides an opinionated workflow that lets you focus on the parts that matter, so that you don’t waste time on gluing the steps together. In this episode Tom Goldenberg explains how it works, how it is being used at Quantum Black for customer projects, and how it can help you structure your own.

Free Apache Kafka as a Service with Confluent Cloud

Confluent

Go from zero to production on Apache Kafka® without talking to sales reps or building infrastructure. Apache Kafka is the standard for event-driven applications. But it’s not without its challenges, and the ops burden can be heavy. Organizations that successfully build and run their own Kafka environment must make significant investments in engineering and operations to account for failover and security.

Kafka 19

How to Deliver Better Business Outcomes with Predictive Modeling

Teradata

Predict the future faster with predictive modeling. Learn more about use cases and how to get more value out of your data.

Data 69

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: Create a standardized process for debugging to quickly diagnose errors in your DAGs Identify common issues with DAGs, tasks, and connections Distinguish between Airflow-relate

Data Preparation for Machine learning 101: Why it’s important and how to do it

KDnuggets

As a data scientist, one of the brains behind AI-based innovations, you need to understand the significance of data preparation to achieve the desired level of cognitive capability for your models. Let’s begin.

How We Analyze and Visualize Kubernetes Events in Real Time at Rockset

Rockset

At Rockset, we use Kubernetes (k8s) for cluster orchestration. It runs all our production microservices, from our ingest workers to our query-serving tier. In addition to hosting all the production infrastructure, each engineer has their own Kubernetes namespace and dedicated resources that we use to locally deploy and test new versions of code and configuration.

SQL 40

More Trending

Teradata Certification Program Embraces Vantage

Teradata

The Teradata Certification program is celebrating its 20th anniversary! Find out how it can advance your career by making you a certified expert on Vantage.

Know Your Data: Part 1

KDnuggets

This article introduces the different types of data sets, data objects, and attributes.

Data 123

DeepMind Has Quietly Open Sourced Three New Impressive Reinforcement Learning Frameworks

KDnuggets

Three new releases that will help researchers streamline the implementation of reinforcement learning programs.

A European Approach to Master’s Degrees in Data Science

KDnuggets

Data science education in Europe has been reevaluated and new recommendations are leading the way to the next generation of data science Master's courses to better support and train students.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

The Last SQL Guide for Data Analysis You’ll Ever Need

KDnuggets

This is it: the last SQL guide for data analysis you'll ever need! OK, maybe it’s actually the first. But it’ll give you a solid head start.

How AI will transform healthcare (and can it fix the US healthcare system?)

KDnuggets

This thorough review focuses on the impact of AI, 5G, and edge computing on the healthcare sector in the 2020s, and also looks at quantum computing's potential impact on AI, healthcare, and financial services.

Overcoming Deep Learning Stumbling Blocks

KDnuggets

Find out what was presented at the 6th annual Deep Learning Summit in London, where industry leaders, academics, researchers, and innovative startups presented the latest technological advancements and industry application methods in the field of deep learning.

Clustering Metrics Better Than the Elbow Method

KDnuggets

We show which metric to use for visualizing and determining the optimal number of clusters, one that works much better than the usual practice, the elbow method.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Sentiment and Emotion Analysis for Beginners: Types and Challenges

KDnuggets

There are three types of emotion AI, and their combinations. In this article, I’ll briefly go through these three types and the challenges of their real-life applications.

Training a Machine Learning Engineer

KDnuggets

There is no clear outline for how to study Machine Learning/Deep Learning, so many individuals apply every algorithm they have heard of and hope that one of the implemented algorithms works for the problem at hand. Below, I've listed some of the steps that one should adopt while solving a machine learning problem.

Will Machine Learning End Retail? Data Science Seattle Oct 17, 2019

KDnuggets

In advance of the Data Science Salon taking place in Seattle on Oct 17, we asked our speakers to shed some light on how Artificial Intelligence and Machine Learning are impacting one of America’s most disruptive industries. Read on for more insight, and then register with the KDnuggets exclusive link for 20% off tickets.

Research Guide for Neural Architecture Search

KDnuggets

In this guide, we will explore a range of research papers that have sought to solve the challenging task of automating neural network design.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

5 Fundamental AI Principles

KDnuggets

While AI may appear magical at times, these five principles will help guide you to avoid pitfalls when leveraging this tech.

Data 89

How to Deploy Confluent Platform on Pivotal Container Service (PKS) with Confluent Operator

Confluent

This tutorial describes how to set up an Apache Kafka® cluster on Enterprise Pivotal Container Service (Enterprise PKS) using Confluent Operator, which allows you to deploy and run Confluent Platform at scale on virtually any Kubernetes platform, including Pivotal Container Service (PKS). With Enterprise PKS, you can deploy, scale, patch, and upgrade all the Kubernetes clusters in your system without downtime.

Kafka 16

Why Scrapinghub’s AutoExtract Chose Confluent Cloud for Their Apache Kafka Needs

Confluent

We recently launched a new artificial intelligence (AI) data extraction API called Scrapinghub AutoExtract, which turns article and product pages into structured data. At Scrapinghub, we specialize in web data extraction, and our products empower everyone from programmers to CEOs to extract web data quickly and effectively. Example of article extraction on Introducing a Cloud-Native Experience for Apache Kafka® in Confluent Cloud.

Kafka 16

Multi-Task Learning – ERNIE 2.0: State-of-the-Art NLP Architecture Intuitively Explained

KDnuggets

The tech giant Baidu unveiled its state-of-the-art NLP architecture ERNIE 2.0 earlier this year, which scored significantly higher than XLNet and BERT on all tasks in the GLUE benchmark. This major breakthrough in NLP takes advantage of an innovation called “Continual Incremental Multi-Task Learning”.

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to Write DAGs that adapt to your data at runtime and set up alerts and notifications Scale you

Kafka Summit San Francisco 2019: Day 2 Recap

Confluent

If you looked at the Kafka Summits I’ve been a part of as a sequence of immutable events (and they are, unless you know something about time I don’t), it would look like this: New York City 2017, San Francisco 2017, London 2018, San Francisco 2018, New York City 2019, London 2019, San Francisco 2019. That makes this the seventh Summit I’ve attended.

Kafka 13

6 Must See Deep Learning Experts at ODSC West 2019 – 20% Off Ends Friday

KDnuggets

You won’t want to miss the opportunity to learn about the future of deep learning first-hand at ODSC West in San Francisco, Oct 29 - Nov 1. So don’t forget to register soon for 20% off.

KDnuggets™ News 19:n37, Oct 2: The Future of Analytics & Data Science! Starting NLP with spaCy & Python

KDnuggets

This week, find out what the future of analytics and data science holds; get an introduction to spaCy for natural language processing; find out how to use time series analysis for baseball; get to know your data; read 6 bits of advice for data scientists; and much, much more!

Top Stories, Sep 23-29: The Future of Analytics and Data Science; 5 Famous Deep Learning Courses/Schools of 2019

KDnuggets

Also: 12 Deep Learning Researchers and Leaders; Natural Language in Python using spaCy: An Introduction; A Single Function to Streamline Image Classification with Keras; Which Data Science Skills are core and which are hot/emerging ones?; 6 bits of advice for Data Scientists.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Top KDnuggets tweets, Sep 25 – Oct 01: Natural Language in Python using spaCy: An Introduction

KDnuggets

Also: Top KDnuggets tweets, Sep 18-24: Python Libraries for Interpretable Machine Learning; Scikit-Learn: A silver bullet for basic ML; Automatic Version Control for Data Scientists; My journey path from a Software Engineer to BI Specialist to a Data Scientist.

Python 62

Recreating Imagination: DeepMind Builds Neural Networks that Spontaneously Replay Past Experiences

KDnuggets

DeepMind researchers created a model that can replay past experiences in a way that simulates the mechanisms in the hippocampus.

Statistical Thinking for Industrial Problem Solving: a free online course

KDnuggets

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.