How Bayes’ Theorem is Applied in Machine Learning
KDnuggets
OCTOBER 28, 2019
Learn how Bayes’ Theorem is applied in Machine Learning for classification and regression!
Uber Engineering
OCTOBER 16, 2019
Michelangelo, Uber’s machine learning (ML) platform, supports the training and serving of thousands of models in production across the company. Designed to cover the end-to-end ML workflow, the system currently supports classical machine learning, time series forecasting, and deep … The post Evolving Michelangelo Model Representation for Flexibility at Scale appeared first on Uber Engineering Blog.
Confluent
OCTOBER 10, 2019
The Internet of Things (IoT) is getting more and more traction as valuable use cases come to light. A key challenge, however, is integrating devices and machines to process the data in real time and at scale. Apache Kafka® and its surrounding ecosystem, which includes Kafka Connect, Kafka Streams, and KSQL, have become the technology of choice for integrating and processing these kinds of datasets.
Data Engineering Podcast
OCTOBER 28, 2019
Summary: Despite the fact that businesses have relied on useful and accurate data to succeed for decades now, the state of the art for obtaining and maintaining that information still leaves much to be desired. In an effort to create a better abstraction for building data applications, Nick Schrock created Dagster. In this episode he explains his motivation for creating a product for data management, how the programming model simplifies the work of building testable and maintainable pipelines, and
Netflix Tech
OCTOBER 23, 2019
By Jeremy Smith, Jonathan Indig, and Faisal Siddiqi. We are pleased to announce the open-source launch of Polynote: a new, polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more. Polynote provides data scientists and machine learning researchers with a notebook environment that allows them the freedom to seamlessly integrate our JVM-based ML platform
Teradata
OCTOBER 9, 2019
Teradata Workload Management automates changes in settings across processing windows as patterns of end-user usage change and as business requirements shift.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Confluent
OCTOBER 16, 2019
Trains are an excellent source of streaming data: their movements around the network are an unbounded series of events. Using this data, Apache Kafka® and Confluent Platform can provide the foundations for both event-driven applications and an analytical platform. With tools like KSQL and Kafka Connect, the concept of streaming ETL is made accessible to a much wider audience of developers and data engineers.
Data Engineering Podcast
OCTOBER 21, 2019
Summary: The scale and complexity of the systems that we build to satisfy business requirements are increasing as the available tools become more sophisticated. In order to bridge the gap between legacy infrastructure and evolving use cases, it is necessary to create a unifying set of components. In this episode Dipti Borkar explains how the emerging category of data orchestration tools fills this need, some of the existing projects that fit in this space, and some of the ways that they can work to
Netflix Tech
OCTOBER 16, 2019
By Ammar Khaku. Introduction: In a microservice architecture such as Netflix’s, propagating datasets from a single source to multiple downstream destinations can be challenging. These datasets can represent anything from service configuration to the results of a batch job, are often needed in memory to optimize access, and must be updated as they change over time.
Teradata
OCTOBER 3, 2019
Predict the future faster with predictive modeling. Learn more about use cases and how to get more value out of your data.
KDnuggets
OCTOBER 24, 2019
In this post, you will see three different techniques for performing feature selection on your datasets and learn how to build an effective predictive model.
Dataquest
OCTOBER 16, 2019
Exciting news: we just launched a totally revamped Data Engineering path that offers from-scratch training for anyone who wants to become a data engineer or learn some data engineering skills. Looks cool, right? But it raises the question: why learn data engineering in the first place? Typically, data science teams are composed of data analysts, data scientists, and data engineers.
Confluent
OCTOBER 24, 2019
I’ve written an event sourcing bank simulation in Clojure (a Lisp built for Java virtual machines, or JVMs) called open-bank-mark, which you are welcome to read about in my previous blog post explaining the story behind this open source example. As a next step, specifically for this article, I’ve added SSL and combined some topics together, using the subject name strategy option of Confluent Schema Registry, making it more production-like, adding security, and making it possible to put multiple
Data Engineering Podcast
OCTOBER 14, 2019
Summary: Managing a data warehouse can be challenging, especially when trying to maintain a common set of patterns. Dataform is a platform that helps you apply engineering principles to your data transformations and table definitions, including unit testing SQL scripts, defining repeatable pipelines, and adding metadata to your warehouse to improve your team’s communication.
Netflix Tech
OCTOBER 18, 2019
By Faisal Siddiqi. Infrastructure for Contextual Bandits and Reinforcement Learning: theme of the ML Platform meetup hosted at Netflix, Los Gatos on Sep 12, 2019. Contextual and Multi-armed Bandits enable faster and more adaptive alternatives to traditional A/B Testing. They enable rapid learning and better decision-making for product rollouts. Broadly speaking, these approaches can be seen as a stepping stone to full-on Reinforcement Learning (RL) with closed-loop, on-policy evaluation and model objec
Teradata
OCTOBER 21, 2019
With four new offerings, Teradata is helping companies move from analytics to answers wherever they are on their cloud journey. Read more.
KDnuggets
OCTOBER 21, 2019
If you are interested in learning more about the latest YouTube recommendation algorithm paper, read this post for details on its approach and improvements.
Dataquest
OCTOBER 16, 2019
We’ve got some really exciting news: we’ve just launched a total revamp of our Data Engineering learning path! This revamped path is designed to be more like our other course paths. You can start it even if you have no prior experience with coding, and it’ll take you from total beginner to experienced practitioner with all of the core skills needed to become a data engineer.
Confluent
OCTOBER 7, 2019
This tutorial describes how to set up a sample Spring Boot application in Pivotal Application Service (PAS), which consumes and produces events to an Apache Kafka® cluster running in Pivotal Container Service (PKS). With this tutorial, you can set up your PAS and PKS configurations so that they work with Kafka. For a tutorial on how to set up a Kafka cluster in PKS, please see How to Deploy Confluent Platform on Pivotal Container Service (PKS) with Confluent Operator.
Data Engineering Podcast
OCTOBER 7, 2019
Summary: The process of exposing your data through a SQL interface has many possible pathways, each with its own complications and tradeoffs. One of the recent options is Rockset, a serverless platform for fast SQL analytics on semi-structured and structured data. In this episode CEO Venkat Venkataramani and SVP of Product Shruti Bhat explain the origins of Rockset, how it is architected to allow for fast and flexible SQL analytics on your data, and how their serverless platform can save you th
Teradata
OCTOBER 15, 2019
Find out how our UX team is going to radically simplify the Teradata user experience. To be unveiled at Teradata Universe!
KDnuggets
OCTOBER 22, 2019
For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.
Rockset
OCTOBER 21, 2019
At Rockset, our singular focus is to be the shortest (and most efficient) path from data to applications for our users. We recognize and truly believe that our success lies in the success of our users. We constantly think about improving our workflows, coming up with new ones, and iterating on them in ways that take the user experience to a whole new level.
Confluent
OCTOBER 9, 2019
Last week, the Kafka Summit hosted nearly 2,000 people from 40 different countries and 595 companies, the largest Summit yet. By the numbers, we got to enjoy four keynote speakers, 56 sessions, 75 speakers, 38 sponsors, and one big party, including the classic Apache Kafka® ice sculpture, per the traditions handed down to us. (I guess we handed those traditions down to ourselves in this case, but still.)
Teradata
OCTOBER 7, 2019
Companies must think about the future of personalization and how to modernize their current systems to combat over-choice in the market. Find out how.
KDnuggets
OCTOBER 17, 2019
This post aims to help you get started with putting your trained machine learning models into production using a Flask API.