Sat. Mar 14, 2020 - Fri. Mar 20, 2020

The 4 Best Jupyter Notebook Environments for Deep Learning

KDnuggets

Many cloud providers and other third-party services see the value of a Jupyter notebook environment, which is why many companies now offer cloud-hosted notebooks. Let's have a look at 4 such environments.

Advanced Analytics for Coronavirus – Trends, Patterns, Predictions

Teradata

Advanced analytics and AI can significantly accelerate data processing required to get the insights, answers and recommendations to handle and address the COVID-19 pandemic.

Simplistic Ways to Find Interesting Data Sets

Team Data Science

I am taking you through my recent experience of finding a dataset for my project. Industry Search: To work with data, I first need to narrow down the industry, such as health care, finance, or insurance. I listed a few sources in my earlier blog post, which gives a sneak peek at techniques for identifying industries. For instance, many job listings open their description with a line like "One of the top insurance clients is looking for a Data Engineer," which exposes the industry.

Insurance 130

15 Things Every Apache Kafka Engineer Should Know About Confluent Replicator

Confluent

Single-cluster deployments of Apache Kafka® are rare. Most medium to large deployments employ more than one Kafka cluster, and even the smallest use cases include development, testing, and production clusters. […].

Kafka 122

Time Series Classification Synthetic vs Real Financial Time Series

KDnuggets

This article discusses distinguishing between real financial time series and synthetic time series using XGBoost.

Finance 160

Improving Prediction of the Unconfirmed COVID-19 Cases

Teradata

With the lack of available tests & uncertainty around the true number of COVID-19 cases, Teradata Epidemiologist Daniel Ulatowski & Data Scientist Jack McCush hypothesize how symptomatic data & the Vantage ML Engine can be utilized to predict cases.

Utilities 128

More Trending

Building a Cloud ETL Pipeline on Confluent Cloud

Confluent

As enterprises move more and more of their applications to the cloud, they are also moving their on-prem ETL (extract, transform, load) pipelines to the cloud, as well as building […].

Cloud 119

What is the most effective policy response to the new coronavirus pandemic?

KDnuggets

Where Test/Trace/Quarantine are working, the number of cases per day has declined empirically. Furthermore, this appears to be a radically superior strategy where it can be deployed. I’ll review the evidence, discuss the other strategies and their consequences, and then discuss what can be done.

IT 160

Building A New Foundation For CouchDB

Data Engineering Podcast

Summary: CouchDB is a distributed document database built for scale and ease of operation. With a built-in synchronization protocol and an HTTP interface, it has become popular as a backend for web and mobile applications. Created 15 years ago, it has accrued some technical debt, which is being addressed with a refactored architecture based on FoundationDB.

Building 100

Teradata's Response to COVID-19

Teradata

How Teradata is responding to the COVID-19 crisis for the health and well-being of its employees, customers and partners.

IT 59

Announcing ksqlDB 0.8.0

Confluent

The latest ksqlDB release introduces long-awaited features such as tunable retention and grace period for windowed aggregates, new built-in functions including LATEST_BY_OFFSET, a peek at the new server API under […].

Process 101

When Will AutoML replace Data Scientists? Poll Results and Analysis

KDnuggets

Will AI always be 5-10 years away? The majority of respondents to this poll think that AutoML will reach expert level in 5-10 years. Interestingly, it is about the same as 5 years ago. We examine the trends by AutoML experience, industry, and region.

Data 159

How to Use KSQL Stream Processing and Real-Time Databases to Analyze Streaming Data in Kafka

Rockset

Intro: In recent years, Kafka has become synonymous with “streaming,” and with features like Kafka Streams, KSQL, joins, and integrations into sinks like Elasticsearch and Druid, there are more ways than ever to build a real-time analytics application around streaming data in Kafka. With all of these stream processing and real-time data store options, though, also come questions about when each should be used and what their pros and cons are.

Kafka 40

Build an Artificial Neural Network From Scratch: Part 2

KDnuggets

The second article in this series focuses on building an Artificial Neural Network using the NumPy Python library.

Building 159

Five Interesting Data Engineering Projects

KDnuggets

As the role of the data engineer continues to grow in the field of data science, so does the number of tools being developed to support wrangling all that data. Five of these tools (along with a few bonus tools) are reviewed here, and you should pay attention to them for your data pipeline work.

A Beginner’s Guide to Data Integration Approaches in Business Intelligence

KDnuggets

An integrated BI system has a trickle-down effect on all business processes, especially reporting and analytics. Find out how integration can help you leverage the power of BI.

Nine lessons learned during my first year as a Data Scientist

KDnuggets

What is it like to be a Data Scientist? There can be many hats to wear, and so many problems to solve that are fed with data, churned by data science, and guided by business results. Find out about lessons learned from one Data Scientist about how best to work and perform in the role.

A Comprehensive Data Repository for Fake Health News Detection

KDnuggets

We introduce FakeHealth, a new data repository for fake health news detection. Following a preliminary analysis to demonstrate its features, we consider additional potential directions for better identifying fake news.

Data 147

A Top Machine Learning Algorithm Explained: Support Vector Machines (SVM)

KDnuggets

Support Vector Machines (SVMs) are powerful for solving regression and classification problems. You should have this approach in your machine learning arsenal, and this article provides all the mathematics you need to know -- it's not as hard as you might think.

Skynet Is Real: The History and Future of Factories With No Workers

KDnuggets

Let’s see whether robots will become the "grave diggers" of the proletariat, what we still lack to reach total automation, and what compromises exist.

24 Best (and Free) Books To Understand Machine Learning

KDnuggets

We have compiled a list of some of the best (and free) machine learning books that will prove helpful for everyone aspiring to build a career in the field.

Forecasting Stories: Is it seasonality or not?

KDnuggets

Kicking off a series of forecasting stories, starting with seasonality and its business applications. This first article covers course corrections that were based on weather- and calendar-driven seasonality.

IT 111

Top KDnuggets tweets, Mar 11-17: Most western countries are on the same #coronavirus trajectory

KDnuggets

Most western countries are on the same #coronavirus trajectory; The Workers Who Face the Greatest #Coronavirus Risk; #Coronavirus, a Visual Rundown; How to start building an automated NLP solution for processing customer feedback.

Building 110

Top 20 ODSC 2020 Global Virtual Conference Sessions

KDnuggets

At ODSC 2020, we are unveiling our first ever 4-day Global Virtual Conference, an online and on-demand version of ODSC. Here are our picks for 20 talks that show how diverse and thorough the ODSC East Global Virtual Conference will be this April 14-17.

Improving the partnership between Data Science and IT

KDnuggets

Friction can quickly arise as a result of these separate workflows and priorities. Given their differences, how can data science and IT more seamlessly work together in building a model-driven organization?

Salesforce Open Sources a Framework for Open Domain Question Answering Using Wikipedia

KDnuggets

The framework uses a multi-hop QA method to answer complex questions by reasoning through Wikipedia’s datasets.

Scaling Your Data Strategy

KDnuggets

This article presents a particular vision for a cohesive data strategy for addressing large-scale problems with data-driven solutions, based on prior professional experiences.

Data 76

Exploring the Adoption of Python in the Workplace – Free Metis Corporate Training Webinar

KDnuggets

Metis will break down Python for data science and analytics, explain what is driving adoption in the field, and discuss how industries and companies are reacting to the shift.

Python 75

KDnuggets™ News 20:n11, Mar 18: Covid-19, your community, and you – a data science perspective; When Will AutoML replace Data Scientists? Poll Results and Analysis

KDnuggets

A Data Science perspective on Covid-19, the novel coronavirus; The results and analysis of a previous KDnuggets Poll: When Will AutoML replace Data Scientists? How to build a mature Machine Learning team; The Most Useful Machine Learning Tools of 2020; and more.

Top Stories, Mar 9-15: New Poll: Coronavirus impact on Data Science community; Covid-19, your community, and you — a data science perspective

KDnuggets

Also: 50 Must-Read Free Books For Every Data Scientist in 2020; Decision Boundary for a Series of Machine Learning Models; 20 AI, Data Science, Machine Learning Terms You Need to Know in 2020 (Part 2).
