The Most Useful Machine Learning Tools of 2020
KDnuggets
MARCH 13, 2020
This article outlines 5 sets of tools every lazy full-stack data scientist should use.
Uber Engineering
MARCH 11, 2020
The performance of Uber’s services relies on our ability to quickly and stably launch new features on our platform, regardless of where the corresponding service lives in our tech stack. Foundational to our platform’s power is its microservice-based architecture … The post Why We Leverage Multi-tenancy in Uber’s Microservice Architecture appeared first on Uber Engineering Blog.
Netflix Tech
MARCH 10, 2020
By Damir Svrtan and Sergii Makagon. As the production of Netflix Originals grows each year, so does our need to build apps that enable efficiency throughout the entire creative process. Our wider Studio Engineering Organization has built more than 30 apps that help content progress from pitch (aka screenplay) to playback: ranging from script content acquisition, deal negotiations and vendor management to scheduling, streamlining production workflows, and so on.
Confluent
MARCH 11, 2020
In the Apache Kafka® ecosystem, ksqlDB and Kafka Streams are two popular tools for building event streaming applications that are tightly integrated with Apache Kafka. While ksqlDB and Kafka Streams […].
KDnuggets
MARCH 11, 2020
Let's talk about COVID-19: the reality, the numbers, and the data science.
Data Engineering Podcast
MARCH 9, 2020
Summary: Data governance is a complex endeavor, and scaling it to meet the needs of a complex or globally distributed organization requires a well-considered and coherent strategy. In this episode Tim Ward describes an architecture that he has used successfully with multiple organizations to scale compliance. Treating governance as a graph problem, where each hub in the network has localized control and inherits higher-level controls, reduces overhead and provides greater flexibility.
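As a rough illustration of the idea of hubs with local control that inherit higher-level controls, here is a minimal sketch. It is not the architecture described in the episode: the `Hub` class, the control names, and the rule that inherited controls win on conflict are all assumptions made for this example, and a parent pointer models only a tree, the simplest case of a graph.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Hub:
    """Hypothetical governance hub: local controls plus an optional parent hub."""
    name: str
    parent: Optional["Hub"] = None
    local_controls: Dict[str, str] = field(default_factory=dict)

    def effective_controls(self) -> Dict[str, str]:
        # Collect controls from all ancestors first (empty for a root hub).
        inherited = self.parent.effective_controls() if self.parent else {}
        # Start from local controls, then apply inherited ones on top.
        # Assumption: a higher-level control cannot be overridden locally.
        merged = dict(self.local_controls)
        merged.update(inherited)
        return merged


# Illustrative hierarchy; all names and values are made up.
global_hub = Hub("global", local_controls={"pii_encryption": "required"})
eu_hub = Hub(
    "eu",
    parent=global_hub,
    local_controls={"data_residency": "eu-only", "pii_encryption": "optional"},
)
berlin_hub = Hub("berlin", parent=eu_hub, local_controls={"retention_days": "90"})

print(berlin_hub.effective_controls())
```

Each hub only declares what is specific to it; the effective policy is computed by walking up the hierarchy, which is one way the overhead of repeating global rules in every location could be avoided.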
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Confluent
MARCH 9, 2020
Every year on March 8th, Confluent is proud to celebrate International Women’s Day, a global holiday dedicated to honoring the accomplishments of women and advocating for gender equality around the […].
KDnuggets
MARCH 9, 2020
Upgrading your machine learning, AI, and Data Science skills requires practice. To practice, you need to develop models with a large amount of data. Finding good datasets to work with can be challenging, so this article discusses more than 20 great datasets along with machine learning project ideas for you to tackle today.
Teradata
MARCH 10, 2020
Monica Woolmer recaps her career in technology and business as an homage to Women's History Month.
Rockset
MARCH 12, 2020
At Rockset we strive to make building modern data applications easy and intuitive. Data-backed applications come with an inherent amount of complexity - managing the database backend, exposing a data API (often using hard-coded SQL or an ORM to write queries), keeping the data and application code in sync; the list goes on. Just as Rockset has reimagined and dramatically simplified the traditional ETL pipeline on the data-loading side, we’re now proud to release a new product feature - Query La
Confluent
MARCH 10, 2020
Given the growing concern and global impact of COVID-19 (better known as the coronavirus), we’ve made the decision to cancel the upcoming Kafka Summit London. While this decision was incredibly […].
KDnuggets
MARCH 9, 2020
This article lists some excellent data science books that cover a wide variety of topics in data science.
Teradata
MARCH 9, 2020
STC uses Teradata to serve each segment as one team, increasing response rates, customer satisfaction, and revenue as well as reducing operating and call center costs.
Preset
MARCH 12, 2020
Stay up to date on coronavirus cases with a Superset dashboard and public data.
Confluent
MARCH 13, 2020
As the impact of COVID-19 (coronavirus) continues to spread, our top priority is the health and well-being of our customers, employees, and community. We are acutely aware that these are […].
KDnuggets
MARCH 8, 2020
For International Women's Day, we feature resources to help more women enter and succeed in the AI, Big Data, Data Science, and Machine Learning fields.
Teradata
MARCH 8, 2020
In the final part of this 3-part series, Tim Miller describes how to run R and Python in-database in Vantage using SCRIPT Table Operators.
Zalando Engineering
MARCH 12, 2020
This document is heavily informed by remote work guidance from other companies and authors. Notable sources include FYI's 11 Best Practices for Working Remotely and Laurel Farrer’s How to Design Powerful Rituals for Successful Distributed Companies. Special thanks to Timo from GiantSwarm for sharing learnings in an ad-hoc phone call. Other sources are linked in the appendix.
KDnuggets
MARCH 10, 2020
Has coronavirus impacted your conference or other travel plans, and do you anticipate it causing further professional or educational disruption in the near future? Take part in the new KDnuggets poll and have your say.
KDnuggets
MARCH 9, 2020
Game theory is experiencing a renaissance driven by the evolution of AI. What are some classic and new ideas that data scientists should be aware of?
KDnuggets
MARCH 12, 2020
A new optimization framework helps to incorporate fairness constraints in machine learning models.
KDnuggets
MARCH 13, 2020
This framework, created after a lot of time spent thinking about the paths that software companies take toward ML maturity, is meant to be followed as you adopt ML and then mature as an organization. It covers every aspect of building a team, including product, process, technical, and organizational readiness, and it recognizes the importance of cross-functional expertise and process improvements for bringing AI-driven products to market.
KDnuggets
MARCH 13, 2020
I train a series of machine learning models on the iris dataset, construct synthetic data from the extreme points within the data, and use the models' predictions on that synthetic data to draw their decision boundaries in a 2D space. This is useful for illustration and for understanding how different machine learning models make predictions.
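The general recipe of spanning a grid between the extreme values of two features and classifying every grid point can be sketched as follows. This is only an illustration, not the article's code: it uses a handful of made-up 2D points instead of the real iris measurements, and it swaps in a hand-written nearest-centroid classifier so that the example needs nothing beyond the standard library.

```python
from itertools import product

# Toy 2D points with two classes (hypothetical values, not real iris data).
points = [(1.0, 0.2), (1.5, 0.3), (1.3, 0.2), (4.5, 1.5), (5.0, 1.7), (4.7, 1.4)]
labels = [0, 0, 0, 1, 1, 1]
STEPS = 20  # grid resolution per axis


def fit_centroids(points, labels):
    """Return {label: (mean_x, mean_y)} for a nearest-centroid classifier."""
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Predict the label whose centroid is closest to the given point."""
    px, py = point
    return min(
        centroids,
        key=lambda lab: (centroids[lab][0] - px) ** 2 + (centroids[lab][1] - py) ** 2,
    )


def make_grid(points, steps):
    """Build synthetic points spanning the min/max extremes of each feature."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    x_axis = [x_min + i * (x_max - x_min) / (steps - 1) for i in range(steps)]
    y_axis = [y_min + j * (y_max - y_min) / (steps - 1) for j in range(steps)]
    return list(product(x_axis, y_axis))  # x varies slowest, y fastest


centroids = fit_centroids(points, labels)
grid = make_grid(points, STEPS)
grid_labels = [predict(centroids, g) for g in grid]

# Coarse text rendering of the predicted regions; the top row is the largest y.
for j in reversed(range(STEPS)):
    print("".join(str(grid_labels[i * STEPS + j]) for i in range(STEPS)))
```

The boundary appears where the printed digits switch from one class to the other. With a plotting library, the same grid predictions would typically be drawn as a filled contour behind the training points, and repeating the procedure per model allows the boundaries to be compared.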
KDnuggets
MARCH 12, 2020
Here is how you can teach your model to learn quickly from a few examples.
KDnuggets
MARCH 10, 2020
This article contains a brief intro to Generative Adversarial Networks (GANs) and shows how to build a human face generator.
KDnuggets
MARCH 12, 2020
Automating the analysis of customer feedback will sound like a great idea after reading a couple hundred reviews. Building an NLP solution to provide in-depth analysis of what your customers are thinking is a serious undertaking, and this guide helps you scope out the entire project.
KDnuggets
MARCH 10, 2020
Just getting started with Python's Pandas library for data analysis? Or, ready for a quick refresher? These 7 steps will help you become familiar with its core features so you can begin exploring your data in no time.
KDnuggets
MARCH 11, 2020
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.
KDnuggets
MARCH 11, 2020
Also: Recreating Fingerprints using Convolutional Autoencoders; A simple and interpretable performance measure for a binary classifier; Resources for Women in AI, Data Science, and Machine Learning; Trends in Machine Learning in 2020; A Crash Course in Game Theory for Machine Learning; and much more.
KDnuggets
MARCH 10, 2020
This post presents an analysis of Berlin online real estate listings, investigating a controversial law capping rents in the state, which went into effect on February 23. Are current landlords already respecting the new rent cap?