Sat. Nov 09, 2019 - Fri. Nov 15, 2019

How to Speed up Pandas by 4x with one line of code

KDnuggets

While Pandas is the go-to library for data processing in Python, it isn't really built for speed. Learn more about Modin, a new library developed to distribute Pandas' computation and speed up your data prep.
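
With Modin itself, the one-line change is replacing `import pandas as pd` with `import modin.pandas as pd`. The general idea of distributing a computation over partitions can be sketched with the standard library alone. This is an illustration only, not Modin's implementation: the helper names `partition` and `parallel_sum` are hypothetical, and a thread pool stands in for the worker processes a real engine would use.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split a list of rows into at most n_parts contiguous chunks."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_sum(rows, n_parts=4):
    """Sum each partition on its own worker, then combine the partial sums."""
    chunks = partition(rows, n_parts)
    # Threads keep the sketch portable; they illustrate the split/combine
    # pattern rather than deliver a real speedup for pure-Python work.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


values = list(range(1, 101))
total = parallel_sum(values)
```

The same split, apply, and combine pattern is what lets a partitioned DataFrame use more than one core at a time.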

Coding 123

Designing For Data Protection

Data Engineering Podcast

Summary: The practice of data management requires technical acumen, but there are also many policy and regulatory issues that inform and influence the design of our systems. With the introduction of legal frameworks such as the EU GDPR and California’s CCPA, it is necessary to consider how to implement data protection and data privacy principles in the technical and policy controls that govern our data platforms.

Designing 100

Trending Sources

Rich Model, Poor Model

Teradata

An integrated data foundation allows data science models to be more accurate, actionable and engage more customers. Find out how your model can positively impact your bottom line.

Workforce Analytics is Reinventing HR

U-Next

Introduction to Workforce Analytics: Today, the need to understand what attracts skilled individuals to join an organization, stay motivated, and deliver outstanding results has become more important than ever. However, this is not a task the HR team can shoulder alone; they need the right tools to deliver optimal results. Over the years, organizations around the globe have spent billions of dollars on employee performance analysis, talent recruitment, leadership training, and deve

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

How Data Analytics Can Assist in Fraud Detection

KDnuggets

A primary advantage of data analytics tools is that they can handle massive quantities of information at once. These solutions typically learn what's normal within a collection of information and how to spot anomalies.
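
As a minimal sketch of "learn what's normal, then spot anomalies", the snippet below applies a simple z-score rule. It is one common technique chosen for illustration, not the method the article describes; the function name `find_anomalies`, the threshold, and the transaction amounts are all assumptions invented for this example.

```python
from statistics import mean, stdev


def find_anomalies(history, new_values, threshold=3.0):
    """Flag values more than `threshold` standard deviations away from
    the mean of the historical (assumed normal) data."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No spread in the history: anything different counts as unusual.
        return [v for v in new_values if v != mu]
    return [v for v in new_values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts, invented for this example.
normal_amounts = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 18.5, 22.0]
incoming = [21.5, 19.5, 950.0]
flagged = find_anomalies(normal_amounts, incoming)
```

Here only the 950.0 transaction is flagged; production fraud systems typically combine many such signals instead of relying on a single threshold.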

Page Simulator

Netflix Tech

Page Simulation for Better Offline Metrics at Netflix, by David Gevorkyan, Mehmet Yilmaz, Ajinkya More, Gaurav Agrawal, Richard Wellington, Vivek Kaushal, Prasanna Padmanabhan, Justin Basilico. At Netflix, we spend a lot of effort to make it easy for our members to find content they will love. To make this happen, we personalize many aspects of our service, including which movies and TV shows we present on each member’s homepage.

More Trending

Research Guide for Depth Estimation with Deep Learning

KDnuggets

In this guide, we’ll look at papers aimed at solving the problems of depth estimation using deep learning.

Transfer Learning Made Easy: Coding a Powerful Technique

KDnuggets

While the revolution of deep learning now impacts our daily lives, these networks are expensive. Approaches in transfer learning promise to ease this burden by enabling the re-use of trained models -- and this hands-on tutorial will walk you through a transfer learning technique you can run on your laptop.

Coding 112

Python Lists and List Manipulation

KDnuggets

In Python, lists store an ordered collection of items which can be of different types. This post is an overview of lists and their manipulation.
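
A short sketch of the operations such an overview usually covers; the values are arbitrary and chosen only for illustration.

```python
# A list keeps insertion order and may mix types.
items = [3, "pandas", 2.5, True]

items.append("modin")              # add to the end
items.insert(0, "first")           # add at a given index
removed = items.pop()              # remove and return the last item
head, tail = items[0], items[1:]   # indexing and slicing

numbers = [4, 1, 3]
numbers.sort()                       # sort in place
squares = [n * n for n in numbers]   # list comprehension builds a new list
```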

Python 111

The Complete Data Science LinkedIn Profile Guide

KDnuggets

With so many Data Scientists showing up on LinkedIn, it's time to make sure your profile is top-notch because your talent is still highly sought after. Recruitment specialists want to find you fast, and this guide will help you create the best profile to feature your expertise.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Beginners Guide to the Three Types of Machine Learning

KDnuggets

The following article is an introduction to classification and regression (together known as supervised learning) and to unsupervised learning, which in machine learning applications often refers to clustering. It includes a walkthrough in the popular Python library scikit-learn.

How to Visualize Data in Python (and R)

KDnuggets

Producing accessible data visualizations is a key data science skill. The following guidelines will help you create the best representations of your data using R and Python's Pandas library.

Python 106

Tips for a cost-effective machine learning project

KDnuggets

Spoiler: you don’t need a VM running 24/7 to handle 16 requests a day.

Topics Extraction and Classification of Online Chats

KDnuggets

This article covers how to automatically identify the topics within a corpus of textual data using unsupervised topic modelling, and then apply a supervised classification algorithm that assigns a topic label to each document, using the output of the first step as target labels.

Algorithm 103

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Testing Your Machine Learning Pipelines

KDnuggets

Let’s take a look at traditional testing methodologies and how we can apply these to our data/ML pipelines.
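
One way to apply a traditional unit test to a pipeline step is sketched below. The step `clean_ages` and its rules are hypothetical, made up for this example and not taken from the article; the point is only that a data transform can be tested like any other function.

```python
import unittest


def clean_ages(records):
    """Hypothetical pipeline step: drop rows with missing or negative ages."""
    return [r for r in records if r.get("age") is not None and r["age"] >= 0]


class CleanAgesTest(unittest.TestCase):
    def test_drops_invalid_rows(self):
        raw = [{"age": 31}, {"age": None}, {"age": -4}, {}]
        self.assertEqual(clean_ages(raw), [{"age": 31}])

    def test_keeps_valid_rows_in_order(self):
        raw = [{"age": 40}, {"age": 0}]
        self.assertEqual(clean_ages(raw), raw)


# Run the tests programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanAgesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Small fixtures like these make edge cases (missing keys, nulls, out-of-range values) explicit before the step ever sees production data.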

How I Got Better at Machine Learning

KDnuggets

Check out this author's collection of tips and tricks, learned over the years, for getting better at machine learning.

On the sensationalism of artificial intelligence news

KDnuggets

With artificial intelligence and machine learning now a mainstay of our daily awareness, news organizations have been seen to overstate the reality behind progress in the field. Learn more about recent examples of media hyperbole and explore why this may be happening.

Media 88

How to Extract Google Maps Coordinates

KDnuggets

In this article, I will show you how to quickly extract Google Maps coordinates with a simple and easy method.

85

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; Write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you

Python Workout / Practices of a Python Pro / Classic Computer Science Problems in Python

KDnuggets

Whether you’re a beginner or an expert, there are always new ways to improve your Python coding. Save 40% on this trio of Manning Python books today! Just enter the code nlpropython40 at checkout when you buy from manning.com.

Python 77

Understanding NLP and Topic Modeling Part 1

KDnuggets

In this post, we seek to understand why topic modeling is important and how it helps us as data scientists.

IT 76

AI ROI: The Questions You Need To Be Asking

KDnuggets

During this free Metis Corporate Training webinar, Dec 5 @ 12pm ET, Kerstin Frailey, Senior Data Scientist and Head of Executive Corporate Training at Metis, will walk through what you need to ask before, during, and after the lifetime of a data science project to accurately assess its impact on the business.

Facebook Adds This New Framework to Its Reinforcement Learning Arsenal

KDnuggets

ReAgent is a new framework that streamlines the implementation of reasoning systems.

Systems 64

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

KDnuggets™ News 19:n43, Nov 13: Dynamic Reports in Python and R; Creating NLP Vocabularies; What is Data Science?

KDnuggets

On KDnuggets this week: Orchestrating Dynamic Reports in Python and R with Rmd Files; How to Create a Vocabulary for NLP Tasks in Python; What is Data Science?; The Complete Data Science LinkedIn Profile Guide; Set Operations Applied to Pandas DataFrames; and much, much more.

Top Stories, Nov 4-10: 10 Free Must-read Books on AI

KDnuggets

Also: Understanding Boxplots; Probability Learning: Maximum Likelihood; Designing Your Neural Networks; Facebook Has Been Quietly Open Sourcing Some Amazing Deep Learning Capabilities for PyTorch; 5 Statistical Traps Data Scientists Should Avoid.

Top KDnuggets tweets, Nov 06-12: 10 FREE must-read ebooks on AI. Things just keep getting more interesting in the field, so use these resources to stay up to speed.

KDnuggets

Also: It's time to make your Data Science LinkedIn profile ready for recruiters; Python Libraries for Interpretable Machine Learning - KDnuggets; Process your data with Pandas up to 4x faster with this new Python library; How to Extract Google Maps Coordinates.

MLOps for production-level machine learning [Nov 14 Webinar]

KDnuggets

This live webinar, Nov 14 @ 12pm EST, covers MLOps for production-level machine learning. MLOps, a compound of “machine learning” and “operations”, is a practice for collaboration and communication between data scientists and operations professionals that helps manage the production machine learning lifecycle. Register now.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m
