Sat.Sep 07, 2019 - Fri.Sep 13, 2019

Building A Reliable And Performant Router For Observability Data

Data Engineering Podcast

Summary: The first stage in every data project is collecting information and routing it to a storage system for later analysis. For operational data this typically means collecting log messages and system metrics. Often a different tool is used for each class of data, increasing the overall complexity and number of moving parts. The engineers at Timber.io decided to build a new tool in the form of Vector that allows for processing both of these data types in a single framework that is reliable an

10 Great Python Resources for Aspiring Data Scientists

KDnuggets

This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.

How Artificial Intelligence & Deep Learning Change the Game

Teradata

AI & Deep Learning allow organizations to maximize player performance while minimizing player risk through better insights from performance and wellness data.

Grafana Time-Series Dashboards with the Rockset-Grafana Plugin

Rockset

What Is Grafana? Grafana is an open-source software platform for time series analytics and monitoring. You can connect Grafana to a large number of data sources, from PostgreSQL to Prometheus. Once your data source is connected, you can use a built-in query control or editor to fetch data, and build dashboards from your data source. Grafana is frequently deployed for a wide variety of use cases, including DevOps and AdTech.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Story about AWS RDS upgrade to AWS Aurora and InnoDB adaptive hash index parameter

nodeSWAT

Story about unexpected slowdown during AWS RDS upgrade to AWS Aurora and InnoDB adaptive hash index parameter TL;DR at the end. The parameter. MySQL 5.7 documentation about InnoDB adaptive hash index. Turning this parameter ON enables the database engine to analyze index searches and to automatically adapt to the queries/searches you are running. It does so by making custom indexes for these specific cases, in return making your queries run faster because they can now use the automatically gener

Train sklearn 100x Faster

KDnuggets

As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.

More Trending

Apache Kafka Rebalance Protocol for the Cloud: Static Membership

Confluent

Static Membership is an enhancement to the current rebalance protocol that aims to reduce the downtime caused by excessive and unnecessary rebalances for general Apache Kafka® client implementations. This applies to Kafka consumers, Kafka Connect, and Kafka Streams. To get a better grasp on the rebalance protocol, we’ll examine this concept in depth and explain what it means.

Reimagining Experimentation Analysis at Netflix

Netflix Tech

Toby Mao, Sri Sri Perangur, Colin McFarland. Another day, another custom script to analyze an A/B test. Maybe you’ve done this before and have an old script lying around. If it’s new, it’s probably going to take some time to set up, right? Not at Netflix. ABlaze: The standard view of analyses in the XP UI Suppose you’re running a new video encoding test and theorize that the two new encodes should reduce play delay, a metric describing how long it takes for a video to play after you press the s

Classification vs Prediction

KDnuggets

It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions.
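
One way to make the distinction concrete is the minimal Python sketch below. It is an illustration under stated assumptions, not the article's own method: the model only supplies a probability, and the decision maker supplies the costs of wrong decisions. The function names and the cost values are hypothetical.

```python
def action_threshold(cost_false_positive, cost_false_negative):
    """Probability above which acting has the lower expected cost.

    Acting costs (1 - p) * cost_false_positive in expectation and not
    acting costs p * cost_false_negative, so acting is preferable once
    p >= cost_false_positive / (cost_false_positive + cost_false_negative).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(probability, cost_false_positive, cost_false_negative):
    """Turn a predicted probability into a decision using explicit costs."""
    return probability >= action_threshold(cost_false_positive, cost_false_negative)


# Hypothetical prediction: the same probability leads to different decisions
# once the decision maker states different costs.
p = 0.3
act_when_misses_are_costly = decide(p, cost_false_positive=1, cost_false_negative=9)
act_when_costs_are_equal = decide(p, cost_false_positive=1, cost_false_negative=1)
```

A classifier with a fixed 0.5 cutoff would give the same answer in both cases, which is the premature decision the teaser warns about.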

Many Heads Are Better Than One: The Case For Ensemble Learning

KDnuggets

While ensembling techniques are notoriously hard to set up, operate, and explain, with the latest modeling, explainability and monitoring tools, they can produce more accurate and stable predictions. And better predictions can be better for business.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Scikit-Learn vs mlr for Machine Learning

KDnuggets

How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.

The 5 Graph Algorithms That Data Scientists Should Know

KDnuggets

In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.

There is No Free Lunch in Data Science

KDnuggets

There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

Common Machine Learning Obstacles

KDnuggets

In this blog, Seth DeLand of MathWorks discusses two of the most common obstacles, which relate to choosing the right classification model and eliminating data overfitting.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

The State of Transfer Learning in NLP

KDnuggets

This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP organized by Matthew Peters, Swabha Swayamdipta, Thomas Wolf, and Sebastian Ruder. This post highlights key insights and takeaways and provides updates based on recent work.

BERT is changing the NLP landscape

KDnuggets

BERT is changing the NLP landscape and making chatbots much smarter by enabling computers to better understand speech and respond intelligently in real-time.

Can graph machine learning identify hate speech in online social networks?

KDnuggets

Online hate speech is a complex subject. Follow this demonstration using state-of-the-art graph neural network models to detect hateful users based on their activities on the Twitter social network.

OpenStreetMap Data to ML Training Labels for Object Detection

KDnuggets

I am really interested in creating a tight, clean pipeline for disaster relief applications, where we can use something like crowd sourced building polygons from OSM to train a supervised object detector to discover buildings in an unmapped location.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Ensemble Methods for Machine Learning: AdaBoost

KDnuggets

It turned out that, if we ask the weak algorithm to create a whole bunch of classifiers (all weak by definition), and then combine them all, the result may be a stronger classifier.
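
The idea of combining many weak classifiers can be sketched in a few lines of plain Python. This is a minimal illustration of the standard AdaBoost procedure with one-dimensional threshold rules ("stumps") as the weak learners, not code from the linked article. The toy data set and all function names are assumptions made for the example.

```python
import math

# Toy 1-D data (hypothetical): no single threshold separates the labels.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(stump, x):
    """A stump predicts `polarity` when x >= threshold, else `-polarity`."""
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (stump, weighted_error) for the stump with the lowest weighted error."""
    best_stump, best_err = None, float("inf")
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best_stump, best_err = stump, err
    return best_stump, best_err


def adaboost(xs, ys, rounds):
    """Train `rounds` weak stumps, reweighting the examples after each one."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


single = adaboost(XS, YS, rounds=1)
combined = adaboost(XS, YS, rounds=3)
single_correct = sum(predict(single, x) == y for x, y in zip(XS, YS))
combined_correct = sum(predict(combined, x) == y for x, y in zip(XS, YS))
```

On this toy data the best single stump gets 7 of 10 training points right, while the weighted vote of three stumps gets all 10. In practice a library implementation such as scikit-learn's `AdaBoostClassifier` would normally be used instead of hand-written code.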

A 2019 Guide to Speech Synthesis with Deep Learning

KDnuggets

In this article, we’ll look at research and model architectures that have been written and developed to do just that using deep learning.

How DeepMind and Waymo are Using Evolutionary Competition to Train Self-Driving Vehicles

KDnuggets

Recently, Alphabet’s subsidiaries Waymo and DeepMind partnered to find a more efficient process to train self-driving vehicle algorithms, and their work took them back to one of the cornerstones of our history as a species: evolution.

Discover Your Path Toward Data Science with ODSC’s Mini-Bootcamp

KDnuggets

ODSC has developed a mini-bootcamp, designed to reduce the time and monetary costs of discovering which pathway into data science you should take. In this article, we’ll discuss seven reasons why ODSC’s Mini-Bootcamp might be right for you.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Data Driven Government – Agenda, Washington, DC, Sep 25

KDnuggets

Data Driven Government is coming to Washington, DC, Sep 25, and includes a stellar lineup of experts who will share the emerging trends and best practices of government agencies in the current use of data analytics to enhance mission outcomes. Use code KDNUGGETS to get 15% off.

Clearsense chooses Io-Tahoe’s Smart Data Discovery to navigate healthcare data challenges

KDnuggets

Io-Tahoe, a pioneer in Smart Data Discovery and AI-Driven Data Catalog products, has announced that Clearsense, a scalable data platform as a service built for healthcare, has chosen the smart data discovery platform to automatically discover and catalog relationships across immense amounts of medical and clinical data.

Version Control for Data Science: Tracking Machine Learning Models and Datasets

KDnuggets

I am a Git god, why do I need another version control system for Machine Learning Projects?

Top August Stories: How to Become More Marketable as a Data Scientist

KDnuggets

Also: Top Handy SQL Features for Data Scientists; 12 NLP Researchers, Practitioners & Innovators You Should Be Following; Knowing Your Neighbours: Machine Learning on Graphs.

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

A Friendly Introduction to Support Vector Machines

KDnuggets

This article explains the Support Vector Machines (SVM) algorithm in an easy way.

Top Stories, Sep 2-8: I wasn’t getting hired as a Data Scientist. So I sought data on who is.

KDnuggets

Also: Python Libraries for Interpretable Machine Learning; TensorFlow vs PyTorch vs Keras for NLP; Advice on building a machine learning career and reading research papers by Prof. Andrew Ng; Object-oriented programming for data scientists: Build your ML estimator.

KDnuggets™ News 19:n34, Sep 11: I wasn’t getting hired as a Data Scientist. So I sought data on who is

KDnuggets

How one person overcame rejections applying to Data Scientist positions by getting actual data on who is getting hired; Advice from Andrew Ng on building ML career and reading research papers; 10 Great Python resources for Data Scientists; Python Libraries for Interpretable ML,

Top KDnuggets tweets, Sep 04-10: How #AI will transform #healthcare; 10 Great Python Resources for Aspiring Data Scientists

KDnuggets

Python Libraries for Interpretable Machine Learning; How #AI will transform #healthcare (and can it fix US healthcare system?); Building Recommendation System - an overview; I wasn't getting hired as a Data Scientist. So I sought data on who is.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.