Sat. Jan 11, 2020 - Fri. Jan 17, 2020

Top 9 Mobile Apps for Learning and Practicing Data Science

KDnuggets

This article covers the top 9 mobile apps that help users learn and practice data science, and thereby improve their productivity.

Streams and Tables in Apache Kafka: Elasticity, Fault Tolerance, and Other Advanced Concepts

Confluent

Now that we’ve learned about the processing layer of Apache Kafka® by looking at streams and tables, as well as the architecture of distributed processing with the Kafka Streams API […].

Kafka 26

Engineering SQL Support on Apache Pinot at Uber

Uber Engineering

Uber leverages real-time analytics on aggregate data to improve the user experience across our products, from fighting fraudulent behavior on Uber Eats to forecasting demand on our platform. As Uber’s operations became more complex and we offered additional features and …

SQL 112

Planet Scale SQL For The New Generation Of Applications With YugabyteDB

Data Engineering Podcast

Summary: The modern era of software development is identified by ubiquitous access to elastic infrastructure for computation and easy automation of deployment. This has led to a class of applications that can quickly scale to serve users worldwide. This requires a new class of data storage which can accommodate that demand without having to rearchitect your system at each level of growth.

SQL 100

Top 10 Technology Trends for 2020

KDnuggets

With integrations of multiple emerging technologies just in the past year, AI development continues at a fast pace. Following the blueprint of science and technology advancements in 2019, we predict 10 trends we expect to see in 2020 and beyond.

Streams and Tables in Apache Kafka: A Primer

Confluent

This four-part series explores the core fundamentals of Kafka’s storage and processing layers and how they interrelate. In this first part, we begin with an overview of events, streams, tables, […].

Kafka 25

More Trending

Simulating Cohorts

Grouparoo

In the last post, I made a case that the way to make the biggest difference in a metric like retention is to increase how many tests you can run each month. It turns out that going from 1 to 4 tests a month makes a huge difference, especially as those cohorts build on each other over time. To prove this out, I built a spreadsheet. Because I learned even more from creating the spreadsheet itself than from writing the blog post, I thought I'd give those learnings some airtime, too.

The Future of Machine Learning

KDnuggets

This summary covers the keynote at TensorFlow World by Jeff Dean, Head of AI at Google, which considered the advancements of computer vision and language models and predicted the direction machine learning model building should follow in the future.

Streams and Tables in Apache Kafka: Topics, Partitions, and Storage Fundamentals

Confluent

Part 1 of this series discussed the basic elements of an event streaming platform: events, streams, and tables. We also introduced the stream-table duality and learned why it is a […].

Kafka 95

SQL API for Real-Time Kafka Analytics in 3 Steps

Rockset

In this blog we will set up a real-time SQL API on Kafka using AWS Lambda and Rockset. At the time of writing (in early 2020), the San Francisco 49ers are doing remarkably well! To honor their success, we will focus on answering the following question: what are the most popular hashtags in tweets that mentioned the 49ers in the last 20 minutes? Because Twitter moves fast, we will only look at very recent tweets.

Kafka 40

Handling Trees in Data Science Algorithmic Interview

KDnuggets

This post fast-tracks the study and explanation of tree concepts for data scientists, so that you breeze through the next time you are asked about them in an interview.

Algorithm 146

Math for Programmers!

KDnuggets

Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.

Decision Tree Algorithm, Explained

KDnuggets

All you need to know about decision trees and how to build and optimize a decision tree classifier.

Algorithm 123

Idiot’s Guide to Precision, Recall, and Confusion Matrix

KDnuggets

Building Machine Learning models is fun, but making sure we build the best ones is what makes a difference. Follow this quick guide to appreciate how to effectively evaluate a classification model, especially for projects where accuracy alone is not enough.

Classify A Rare Event Using 5 Machine Learning Algorithms

KDnuggets

Which algorithm works best for unbalanced data? Are there any tradeoffs?

Algorithm 122

Geovisualization with Open Data

KDnuggets

In this post I want to show how to use publicly available (open) data to create geo visualizations in Python. Maps are a great way to communicate and compare information when working with geolocation data. There are many frameworks to plot maps; here I focus on matplotlib and geopandas (and give a glimpse of mplleaflet).

Python 113

Uber Creates Generative Teaching Networks to Better Train Deep Neural Networks

KDnuggets

The new technique can really improve how deep learning models are trained at scale.

Graph Machine Learning Meets UX: An uncharted love affair

KDnuggets

When machine learning tools are developed with a technology-first approach, they risk failing to deliver on what users actually need. It can also be difficult for development teams to establish meaningful direction. This article explores the challenges of designing an interface that enables users to visualise and interact with insights from graph machine learning, and explores the very new, uncharted relationship between machine learning and UX.

Schema Evolution in Data Lakes

KDnuggets

Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility. However, this flexibility is a double-edged sword.

Streams and Tables in Apache Kafka: Processing Fundamentals with Kafka Streams and ksqlDB

Confluent

Part 2 of this series discussed in detail the storage layer of Apache Kafka: topics, partitions, and brokers, along with storage formats and event partitioning. Now that we have this […].

Kafka 18

Methods, challenges & applications of Deep Learning | Munich 11-12 May

KDnuggets

Visit Deep Learning World, 11-12 May in Munich, to broaden your knowledge, deepen your understanding and discuss your questions with other Deep Learning experts!

Top KDnuggets tweets, Jan 08-14: A Beginners Guide to Data Engineering — Part I

KDnuggets

Also: The Book to Start You on Machine Learning - KDnuggets; Top KDnuggets tweets, Jan 1-7: Introduction to #DataVisualization and Storytelling: A Guide For The #DataScientist #eBook; 7 Steps to a Job-winning Data Science Resume - KDnuggets; Tips for open-sourcing research code.

7 AI Use Cases Transforming Live Sports Production and Distribution

KDnuggets

Here are 7 powerful AI-led use cases, for both linear television and OTT apps, that are transforming the live sports production landscape.

Top Stories, Jan 6-12: Top 5 must-have Data Science skills for 2020; 7 Resources to Becoming a Data Engineer

KDnuggets

Also: The Book to Start You on Machine Learning; An Introductory Guide to NLP for Data Scientists with 7 Common Techniques; A Comprehensive Guide to Natural Language Generation; 10 Python Tips and Tricks You Should Learn Today.

Survey Segmentation Tutorial

KDnuggets

Learn the basics of verifying segmentation, analyzing the data, and creating segments in this tutorial. When reviewing survey data, you will typically be handed Likert questions (e.g., on a scale of 1 to 5), and by using a few techniques, you can verify the quality of the survey and start grouping respondents into populations.

Data 67

Statistical Thinking for Industrial Problem Solving: a free online course.

KDnuggets

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.

KDnuggets™ News 20:n02, Jan 15: Top 5 Must-have Data Science Skills; Learn Machine Learning with THIS Book

KDnuggets

This week: learn the 5 must-have data science skills for the new year; find out which book is THE book to get started learning machine learning; pick up some Python tips and tricks; learn SQL, but learn it the hard way; and find an introductory guide to learning common NLP techniques.

Disentangling disentanglement: Ideas from NeurIPS 2019

KDnuggets

The NeurIPS 2019 conference in Vancouver recently concluded and featured a dozen papers on disentanglement in deep learning. What is this idea and why is it so interesting in machine learning? This summary of these papers will give you initial insight into disentanglement as well as ideas on what you can explore next.