Sat. Nov 30, 2019 - Fri. Dec 06, 2019

Data Science Curriculum Roadmap

KDnuggets

What follows is a set of broad recommendations, and it will inevitably require a lot of adjustments in each implementation. Given that caveat, here are our curriculum recommendations.

Organizing And Empowering Data Engineers At Citadel

Data Engineering Podcast

Summary: The financial industry has long been driven by data, requiring a mature and robust capacity for discovering and integrating valuable sources of information. Citadel is no exception, and in this episode Michael Watson and Robert Krzyzanowski share their experiences managing and leading the data engineering teams that power the business. They share helpful insights into some of the challenges associated with working in a regulated industry, organizing teams to deliver value rapidly and re

Integrating Apache Kafka With Python Asyncio Web Applications

Confluent

Modern Python has very good support for cooperative multitasking. Coroutines were first added to the language in version 2.5 with PEP 342 and their use is becoming mainstream following the […].

Open-Sourcing Metaflow, a Human-Centric Framework for Data Science

Netflix Tech

By David Berg, Ravi Kiran Chirravuri, Romain Cledat, Savin Goyal, Ferras Hamad, Ville Tuulos. tl;dr: Metaflow is now open-source! Get started at metaflow.org. Netflix applies data science to hundreds of use cases across the company, including optimizing content delivery and video encoding. Data scientists at Netflix relish our culture that empowers them to work autonomously and use their judgment to solve problems independently.

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

10 Free Top Notch Machine Learning Courses

KDnuggets

Are you interested in studying machine learning over the holidays? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to improving your machine learning skills.

Data Analytics in the Cloud: It's Not Just Lift and Shift

Teradata

The cloud’s flexibility is becoming an essential success factor for businesses. But moving your data analytics to the cloud isn't just lift and shift. Read more.

More Trending

Data Compression for Large-Scale Streaming Experimentation

Netflix Tech

By Julie (Novak) Beckley, Andy Rhines, Jeffrey Wong, Matthew Wardrop, Toby Mao, Martin Tingley. Ever wonder why Netflix works so well when you’re streaming at home, on the train, or in a foreign hotel? Behind the scenes, Netflix engineers are constantly striving to improve the quality of your streaming service. The goal is to bring you joy by delivering the content you love quickly and reliably every time you watch.

Explainability: Cracking open the black box, Part 1

KDnuggets

What is Explainability in AI and how can we leverage different techniques to open the black box of AI and peek inside? This practical guide offers a review and critique of the various techniques of interpretability.

Six Ways Teradata Vantage is Moving the Cloud Forward

Teradata

Learn how Teradata Vantage and its modern cloud architecture enable companies to leverage 100% of their data to uncover real-time intelligence, at scale.

5 Techniques to Prevent Overfitting in Neural Networks

KDnuggets

In this article, I will present five techniques to prevent overfitting while training neural networks.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

The Essential Toolbox for Data Cleaning

KDnuggets

Increase your confidence to perform data cleaning with a broader perspective of what datasets typically look like, and follow this toolbox of code snippets to make your data cleaning process faster and more efficient.

A Non-Technical Reading List for Data Science

KDnuggets

The world still cannot be reduced to numbers on a page because human beings are still the ones making all the decisions. So, the best data scientists understand the numbers and the people. Check out these great data science books that will make you a better data scientist without delving into the technical details.

Enabling the Deep Learning Revolution

KDnuggets

Deep learning models are revolutionizing the business and technology world with jaw-dropping performances in one application area after another. Read this post on some of the numerous composite technologies which allow deep learning its complex nonlinearity.

Why software engineering processes and tools don’t work for machine learning

KDnuggets

While AI may be the new electricity, significant challenges remain in realizing AI's potential. Here we examine why data scientists and teams can’t rely on software engineering tools and processes for machine learning.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Google Open Sources MobileNetV3 with New Ideas to Improve Mobile Computer Vision Models

KDnuggets

The latest release of MobileNets incorporates AutoML and other novel ideas in mobile deep learning.

KDnuggets Poll: How well do current AutoML solutions work?

KDnuggets

Take part in our latest poll, asking readers their opinions on the effectiveness of current automated machine learning solutions.

The Rise of User-Generated Data Labeling

KDnuggets

Let’s say your project is humongous and needs data labeling to be done continuously, even while you’re on the go, sleeping, or eating. I’m sure you’d appreciate user-generated data labeling. I’ve got 6 interesting examples to help you understand this; let’s dive right in!

Top 7 Data Science Use Cases in Trust and Security

KDnuggets

What are trust and safety? What is the role of trust and security in the modern world? Read this overview of 7 data science application use cases in the realm of trust and security.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Vega-Lite: A grammar of interactive graphics

KDnuggets

Vega and Vega-Lite follow in a long line of work that can trace its roots back to Wilkinson’s ‘The Grammar of Graphics.’ Since then, Vega-Lite has come into existence, bringing high-level specification of interactive visualisations to the Vega-Lite world.

Popular Deep Learning Courses of 2019

KDnuggets

With deep learning and AI at the forefront of the latest applications and demands for new business directions, additional education is paramount for current machine learning engineers and data scientists. These courses are well known among peers and will help you demonstrate tangible proof of your new skills.

Accuracy Fallacy: The Media’s Coverage of AI Is Bogus

KDnuggets

As with the gross exaggerations Stanford researchers broadcast about their infamous "AI gaydar" project, a prevalent "accuracy fallacy" runs through the media's coverage of AI. Find out more about how the press constantly misleads the public into believing that machine learning can reliably predict psychosis, heart attacks, sexuality, and much more.

PyTorch in 2019 and where in Europe you can learn about PyTorch in 2020

KDnuggets

The Reinforce AI Conference is coming to Budapest again. Join us Apr 6-7 for the conference days, and optionally Apr 8 for workshops. Stefan Otte returns as a speaker, while Francois Chollet joins this time as well.

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you

Webinar: Natural Language Processing for Digital Transformation of Unstructured Text

KDnuggets

Learn how pharma and healthcare organizations are using the power of Natural Language Processing (NLP) to transform unstructured text into actionable structured data.

KDnuggets™ News 19:n46, Dec 4: The Future of Data Science Careers; Which Data Visualization Should I Use?

KDnuggets

This week: The Future of Careers in Data Science & Analysis; Task-based effectiveness of basic visualizations; Open Source Projects by Google, Uber and Facebook for Data Science and AI; Getting Started with Automated Text Summarization; A Non-Technical Reading List for Data Science; and much more!

Artificial Friend or Virtual Foe

KDnuggets

Is AI doing more good than harm?

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Statistical Thinking for Industrial Problem Solving – a free online course

KDnuggets

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m