Sat. Nov 23, 2019 - Fri. Nov 29, 2019

Open Source Projects by Google, Uber and Facebook for Data Science and AI

KDnuggets

Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook, and Uber, are open-sourcing the technologies they use in their own workflows.

Optimizing Observability with Jaeger, M3, and XYS at Uber

Uber Engineering

When something goes wrong with a piece of code, engineers want to know all the relevant details of the error immediately so they can get right to work remedying the malfunction. However, as technology has advanced, measuring system metrics and …

Building A Real Time Event Data Warehouse For Sentry

Data Engineering Podcast

Summary: The team at Sentry has built a platform for anyone in the world to send software errors and events. As they scaled the volume of customers and data, they began running into the limitations of their initial architecture. To address the needs of their business and continue to improve their capabilities, they settled on ClickHouse as the new storage and query layer.

Conquering Hybrid Cloud with Replicated Event-Driven Architectures

Confluent

Potential advantages of hybrid cloud architectures include avoiding vendor lock-in, increasing system resilience, optimizing costs, and inducing price competition among cloud providers. Hybrid cloud architectures require the ability to securely […].

Getting Started with Automated Text Summarization

KDnuggets

This article will walk through an extractive text summarization process, using a simple word frequency approach, implemented in Python.
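
A minimal sketch of what a word-frequency extractive summarizer can look like, using only the Python standard library. This is not the article's code: the function name `summarize`, the tiny `STOP_WORDS` set, and the regex-based sentence splitting are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the article).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it", "on", "was"}


def tokenize(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    # A sentence's score is the sum of the document-wide frequencies of its words.
    def score(sentence):
        return sum(freq[w] for w in tokenize(sentence))

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Summing raw frequencies favors long sentences; dividing each score by the sentence length is a common adjustment if that becomes a problem.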

The Four Types of Chief Data Officers

Teradata

Every organization, in every industry, needs a Chief Data Officer. Use these guidelines to choose the one with the optimal skills and background for success.

A Doomed Marriage of Machine Learning and Agile

KDnuggets

Sebastian Thrun, the founder of Udacity, ruined my machine learning project and wedding.

Top KDnuggets tweets, Nov 20-26: How to Speed up Pandas by 4x with one line of code

KDnuggets

Also: Deep Learning for Image Classification with Less Data; How to Speed up Pandas by 4x with one line of code; 25 Useful #Python Snippets to Help in Your Day-to-Day Work; Automated Machine Learning Project Implementation Complexities.

Markov Chains: How to Train Text Generation to Write Like George R. R. Martin

KDnuggets

Read this article on training Markov chains to generate George R. R. Martin style text.
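
For readers unfamiliar with the technique, here is a rough sketch of a word-level Markov chain text generator. It is an illustration under stated assumptions, not the article's implementation: `build_chain` and `generate` are hypothetical names, and the training text would be whatever corpus you supply.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=10, seed=None):
    """Walk the chain from a random starting state, emitting up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(list(chain.keys()))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state was only seen at the very end of the corpus.
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)
```

A higher `order` produces text that follows the source more closely, at the cost of more dead ends on small corpora.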

Content-based Recommender Using Natural Language Processing (NLP)

KDnuggets

A guide to building a content-based movie recommender model based on NLP.

Two Years In The Life of AI, Machine Learning, Deep Learning and Java

KDnuggets

Where does Java stand in the world of artificial intelligence, machine learning, and deep learning? Learn more about how to do these things in Java, and the libraries and frameworks to use.

Top 8 Data Science Use Cases in Marketing

KDnuggets

In this article, we want to highlight some key data science use cases in marketing. Let us concentrate on several instances that are of particular interest and have proven their effectiveness over time.

Can Neural Networks Develop Attention? Google Thinks they Can

KDnuggets

Google recently published some work about modeling attention mechanisms in deep neural networks.

Would you buy insights from this guy? (How to assess and manage a Data Science vendor)

KDnuggets

With all the hype from data science vendors selling "actionable insights" to boost your company's bottom line, selecting your analytics partner should proceed through the same careful process as any traditional business endeavor. Follow these questions and best practices to ensure you manage accordingly.

Lit BERT: NLP Transfer Learning In 3 Steps

KDnuggets

PyTorch Lightning is a lightweight framework that allows anyone using PyTorch to scale deep learning code easily while making it reproducible. In this tutorial we’ll use Hugging Face's implementation of BERT to do a fine-tuning task in Lightning.

Probability Learning: Naive Bayes

KDnuggets

This post describes various simplifications of Bayes' Theorem that make it more practical and applicable to real-world problems; these simplifications are known as Naive Bayes. To make this concrete, the post also walks through an illustrative example of how Naive Bayes can be applied to classification.
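
As a rough illustration of the idea (not the post's own example), the sketch below implements a small multinomial Naive Bayes text classifier. The "naive" step is treating words as independent given the label, so the score for a label is the log prior plus the sum of per-word log likelihoods. The function names `train` and `predict`, the data layout, and the add-one (Laplace) smoothing are assumptions chosen for this sketch.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count labels and per-label word occurrences. `docs` is a list of (words, label)."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def predict(model, words):
    """Return the label with the highest log posterior score for `words`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        # Log prior: fraction of training documents carrying this label.
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothing keeps unseen words from zeroing out the product.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.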

Machine Learning 101: The What, Why, and How of Weighting

KDnuggets

Weighting is a technique for improving models. In this article, learn more about what weighting is, why you should (and shouldn’t) use it, and how to choose optimal weights to minimize business costs.

The Future of Careers in Data Science & Analysis

KDnuggets

As the fields of data science and analysis continue to expand, the next crop of bright minds is always needed. Learn more about the nuances of these jobs and find where you can fit in for a rewarding and interesting career.

Spark NLP 101: LightPipeline

KDnuggets

A Pipeline is specified as a sequence of stages, and each stage is either a Transformer or an Estimator. These stages are run in order, and the input DataFrame is transformed as it passes through each stage. Now let’s see how this can be done in Spark NLP using Annotators and Transformers.

Cartoon: Thanksgiving, Big Data, and Turkey Data Science…

KDnuggets

A classic KDnuggets Thanksgiving cartoon examines the predicament of one group of fowl Data Scientists.

KDnuggets™ News 19:n45, Nov 27: Interpretable vs black box models; Advice for New and Junior Data Scientists

KDnuggets

This week: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead; Advice for New and Junior Data Scientists; Python Tuples and Tuple Methods; Can Neural Networks Develop Attention? Google Thinks they Can; Three Methods of Data Pre-Processing for Text Classification.

Task-based effectiveness of basic visualizations

KDnuggets

This is a summary of a recent paper on an age-old topic: what visualisation should I use? No prizes for guessing “it depends!” Is this the paper to finally settle the age-old debate surrounding pie-charts??

AXA, State Auto and Hippo on how to fuse innovative tech to your company

KDnuggets

Join this live webinar: Fast, Agile, Service-Driven Insurance: Fuse Innovative Tech to Your Company DNA - AI, Chatbots, Automation and More, Dec 11 at 10:00am EST, to get actionable insight to develop your strategy.

Top Stories, Nov 18-24: How to Speed up Pandas by 4x with one line of code; Python, Selenium & Google for Geocoding Automation: Free and Paid

KDnuggets

Also: Automated Machine Learning Project Implementation Complexities; Text Encoding: A Review; The Notebook Anti-Pattern; Data Science for Managers: Programming Languages; 10 Free Must-read Books on AI.