Sat. Aug 03, 2019 - Fri. Aug 09, 2019

Solving Data Discovery At Lyft

Data Engineering Podcast

Summary: Data is only valuable if you use it for something, and the first step is knowing that it is available. As organizations grow and data sources proliferate, it becomes difficult to keep track of everything, particularly for analysts and data scientists who are not involved with the collection and management of that information. Lyft has built the Amundsen platform to address the problem of data discovery, and in this episode Tao Feng and Mark Grover explain how it works, why they built it, a

Knowing Your Neighbours: Machine Learning on Graphs

KDnuggets

Graph Machine Learning uses the network structure of the underlying data to improve predictive outcomes. Learn how to use this modern machine learning method to solve challenges with connected data.

Migrating Functionality Between Large-scale Production Systems Seamlessly

Uber Engineering

A common axiom among Uber engineers states that building new features is like fixing a car’s engine while driving it. As we scaled up to our present level of support for 14 million trips per day, the car in that …

Systems 77

Simple node.JS and Slack WebHook integration

nodeSWAT

This post walks you through turning this awesome chat tool into a handy monitoring and alerting tool for your application, all without any third-party modules and with minimal code to keep the footprint small. Note: This post uses a now-outmoded integration method. Slack has introduced new ways to manage and send messages via Apps.

Coding 52

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Announcing Tutorials for Apache Kafka

Confluent

We’re excited to announce Tutorials for Apache Kafka®, a new area of our website for learning event streaming. Kafka Tutorials is a collection of common event streaming use cases, with each tutorial featuring an example scenario and several complete code solutions. It’s the fastest way to learn how to use Kafka with confidence. We’re building this because we know that event streaming is a radically different way of thinking.

Kafka 22

What is Benford’s Law and why is it important for data science?

KDnuggets

Benford’s law is a little-known gem for data analytics. Learn about how this can be used for anomaly or fraud detection in scientific or technical publications.
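
As an illustrative sketch (not taken from the linked article), Benford's law predicts that the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). The Python below compares that expectation with the leading digits of the first 200 powers of 2. That sequence is an assumption chosen here because it is known to follow the law closely, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_freq(values):
    """Observed leading-digit frequencies for positive integers."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def mean_abs_deviation(observed, expected):
    """Average absolute gap between observed and expected frequencies."""
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9


# Hypothetical sample data: the first 200 powers of 2.
sample = [2 ** k for k in range(200)]
expected = benford_expected()
observed = leading_digit_freq(sample)
deviation = mean_abs_deviation(observed, expected)

for d in range(1, 10):
    print(f"digit {d}: observed {observed[d]:.3f}, expected {expected[d]:.3f}")
print(f"mean absolute deviation: {deviation:.4f}")
```

A data set whose deviation is much larger than that of comparable clean data may deserve a closer look, although a large gap on its own is not proof of fraud.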

More Trending

Lagrange multipliers with visualizations and code

KDnuggets

In this story, we’re going to take an aerial tour of optimization with Lagrange multipliers. When do we need them? Whenever we have an optimization problem with constraints.
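
A small worked example may help make this concrete. It is a hypothetical problem chosen for illustration, not one taken from the linked article: maximize f(x, y) = xy subject to the constraint x + y = 1.

```latex
\begin{aligned}
\mathcal{L}(x, y, \lambda) &= xy - \lambda\,(x + y - 1) \\
\frac{\partial \mathcal{L}}{\partial x} &= y - \lambda = 0 \\
\frac{\partial \mathcal{L}}{\partial y} &= x - \lambda = 0 \\
\frac{\partial \mathcal{L}}{\partial \lambda} &= -(x + y - 1) = 0 \\
\Rightarrow\quad x = y = \lambda &= \tfrac{1}{2}, \qquad f\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \tfrac{1}{4}
\end{aligned}
```

Setting every partial derivative of the Lagrangian to zero turns the constrained problem into a system of equations, and the constraint itself reappears as the last of them.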

Coding 104

Deep Learning for NLP: ANNs, RNNs and LSTMs explained!

KDnuggets

Learn about Artificial Neural Networks, Deep Learning, Recurrent Neural Networks and LSTMs like never before and use NLP to build a Chatbot!

Coding Random Forests in 100 lines of code*

KDnuggets

There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.

Coding 97

Data Science: Scientific Discipline or Business Process?

KDnuggets

Simply put, data science is an attempt to understand given data using the scientific method. That's why data science is a scientific discipline. You are free (and encouraged!) to apply data science to business use cases, just as you are encouraged to apply it to many other domains.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Feature selection by random search in Python

KDnuggets

Feature selection is one of the most important tasks in machine learning. Learn how to use a simple random search in Python to get good results in less time.

Python 103

Exploratory Data Analysis Using Python

KDnuggets

In this tutorial, you’ll use Python and Pandas to explore a dataset and create visual distributions, identify and eliminate outliers, and uncover correlations between two datasets.

Introduction to Image Segmentation with K-Means clustering

KDnuggets

Image segmentation is the classification of an image into different groups. Many kinds of research have been done in the area of image segmentation using clustering. In this article, we will explore using the K-Means clustering algorithm to read an image and cluster different regions of the image.
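
The idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the linked article's code: the "image" is a tiny hand-written grid of grayscale intensities, clustering runs on intensity alone, and `kmeans_1d` is a hypothetical helper with a deterministic initialization.

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar values with k-means; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centers evenly over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    labels = [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]
    return centers, labels


# Hypothetical 3x4 grayscale image: dark region on the left, bright on the right.
image = [
    [10, 12, 200, 205],
    [11, 13, 198, 210],
    [9, 14, 202, 207],
]
width = len(image[0])
pixels = [p for row in image for p in row]

centers, labels = kmeans_1d(pixels, k=2)
# Reshape the flat label list back into the image grid.
segmented = [labels[r * width:(r + 1) * width] for r in range(len(image))]

for row in segmented:
    print(row)
```

Real images would normally be loaded with an imaging library and clustered on color vectors instead of single intensities, but the assign-then-update loop stays the same.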

Getting Started With Data Science

KDnuggets

Over the past many months, I’ve received hundreds of messages from people asking me how they could get started with Data Science. Therefore, I thought it would be useful to write down a framework for those wanting to get started with Data Science.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

25 Tricks for Pandas

KDnuggets

Check out this video (and Jupyter notebook) which outlines a number of Pandas tricks for working with and manipulating data, covering topics such as string manipulations, splitting and filtering DataFrames, combining and aggregating data, and more.

Machine Learning is Happening Now: A Survey of Organizational Adoption, Implementation, and Investment

KDnuggets

This is an excerpt from a survey that sought to evaluate the relevance of machine learning in operations today, assess the current state of machine learning adoption, and identify the tools used for machine learning. A link to the full report is inside.

Top KDnuggets tweets, Jul 31 – Aug 06: NLP vs. NLU: from Understanding a Language to Its Processing

KDnuggets

Also: Ten more random useful things in R you may not know about; 5 Probability Distributions Every Data Scientist Should Know; Machine Learning is Happening Now: A Survey of Organizational Adoption, Implementation, and Investment; Programmers rejoice! Deep TabNine offer code autocompletion with #deeplearning.

Process 90

Inside Pluribus: Facebook’s New AI That Just Mastered the World’s Most Difficult Poker Game

KDnuggets

The reasons why Pluribus represents a major breakthrough in AI systems might be confusing to many readers. After all, in recent years AI researchers have made tremendous progress across different complex games. However, six-player, no-limit Texas Hold’em remains one of the most elusive challenges for AI systems.

Systems 88

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

9 Tips For Training Lightning-Fast Neural Networks In Pytorch

KDnuggets

Who is this guide for? Anyone working on non-trivial deep learning models in Pytorch such as industrial researchers, Ph.D. students, academics, etc. The models we're talking about here might be taking you multiple days to train or even weeks or months.

[video] Introduction to Generative Adversarial Networks (for beginners and advanced Data Scientists)

KDnuggets

Generative Adversarial Networks are driving important new technologies in deep learning methods. With so much to learn, these two videos will help you jump into your exploration with GANs and the mathematics behind the modelling.

How to better manage your data science team’s workflow

KDnuggets

This workshop, Aug 14 @ 12 PM ET, will give you the proper tools and tactics to manage the entire lifecycle of your machine learning projects, from research to exploration to development and production.

Keras Callbacks Explained In Three Minutes

KDnuggets

A gentle introduction to callbacks in Keras. Learn about EarlyStopping, ModelCheckpoint, and other callback functions with code examples.

Coding 89

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Monash University: Research Fellow – Computer Vision [Melbourne, Australia]

KDnuggets

The position requires a passion for research, a proven research track record in computer vision, an ability to work independently as well as lead a team, and a willingness to work on inter-disciplinary research projects and seek external funding. The successful candidate will align with the group's goal of building a world-class computer vision team.

Project 56

KDnuggets™ News 19:n29, Aug 7: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners

KDnuggets

This week on KDnuggets: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners and Udacity Deep Learning Nanodegree; How a simple mix of object-oriented programming can sharpen your deep learning prototype; Can we trust AutoML to go on full autopilot?; Ten more random useful things in R you may not know about; 25 Tricks for Pandas; and much more!

Top Stories, Jul 29 – Aug 4: Top 10 Best Podcasts on AI, Analytics, Data Science, Machine Learning; What 70% of Data Science Learners Do Wrong

KDnuggets

Also: GPU Accelerated Data Analytics & Machine Learning; Understanding Tensor Processing Units; Top 13 Skills To Become a Rockstar Data Scientist; Five Command Line Tools for Data Science; Ten more random useful things in R you may not know about.

Waste Management: Data Scientist [Houston, TX]

KDnuggets

Waste Management is seeking a Data Scientist in Houston, TX, to support their digital marketing, customer and other business segment teams with insights gained from analyzing customer data.

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

KSQL UDFs and UDAFs Made Easy

Confluent

One of KSQL’s most powerful features is allowing users to build their own KSQL functions for processing real-time streams of data. These functions can be invoked on individual messages (user-defined functions or UDFs) or used to perform aggregations on groups of messages (user-defined aggregate functions or UDAFs). The previous blog post How to Build a UDF and/or UDAF in KSQL 5.0 discussed some key steps for building and deploying a custom KSQL UDF/UDAF.

Kafka 18

Four Steps to Drive Digital Transformation in Your Bank

Teradata

Digital transformation & regulatory requirements have long challenged banks. Teradata has deep experience in ushering them through the transformation process.

Banking 15

Cloud Analytic Migrations with Microsoft, Informatica & Teradata?

Teradata

Teradata partners Microsoft & Informatica announced that they are taking on cloud analytic migrations. Find out what this means for our on-premises customers.

Cloud 15