Sat. Nov 27, 2021 - Fri. Dec 03, 2021

How to Get Certified as a Data Scientist

KDnuggets

If you are early in your journey to becoming a Data Scientist, an interesting option is to earn certification by DataCamp, and this guide offers tips that will help beginners complete the challenges.

A Guide to Stream Processing and ksqlDB Fundamentals

Confluent

Event streaming applications are a powerful way to react to events as they happen and to take advantage of data while it is fresh. However, they can be a challenge […].

Create your Private Data Warehousing Environment Using Azure Kubernetes Service

Cloudera

For Cloudera, ensuring data security is critical because we have large customers in highly regulated industries like financial services and healthcare, where security is paramount. Also, for other industries like retail, telecom or public sector that deal with large amounts of customer data and operate multi-tenant environments, sometimes with end users who are outside of their company, securing all the data may be a very time-intensive process.

Doing DataOps For External Data Sources As A Service at Demyst

Data Engineering Podcast

Summary The data that you have access to affects the questions that you can answer. By using external data sources you can drastically increase the range of analysis that is available to your organization. The challenge comes in all of the operational aspects of finding, accessing, organizing, and serving that data. In this episode Mark Hookey discusses how he and his team at Demyst do all of the DataOps for external data sources so that you don’t have to, including the systems necessary t

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

2021: A Year Full of Amazing AI papers — A Review

KDnuggets

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

Best Tutorials for Getting Started with Apache Kafka

Confluent

Each one of the more than 50 tutorials for Apache Kafka® on Confluent Developer answers a question that you might ask a knowledgeable friend or colleague about Kafka and its […].

More Trending

Creating A Unified Experience For The Modern Data Stack At Mozart Data

Data Engineering Podcast

Summary The modern data stack has been gaining a lot of attention recently with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options it is possible to run a scalable, production grade data platform with a small team, but there are still sharp edges and integration challenges to work through.

Movie Recommendations with Spark Collaborative Filtering

KDnuggets

Not sure what movie to watch? Ask your recommender system.

Reference Data: Smoothing Out the Bumps in M&A

Teradata

M&A is an important part of an organization's growth strategy. Getting reference data right can be foundational to overcoming many challenges that come with it.

A Systematic Approach to Reducing Technical Debt

Zalando Engineering

Introduction While technical debt is a recurring issue in software engineering, the case of the Merchant Orders team within Zalando Direct was an outlier as, due to a lack of a clearly defined process, technical debt more or less only ever accumulated. When I joined this team in autumn 2020 as its new engineering lead, the technical debt backlog had entries dating back to 2018.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Eight Top DataOps Trends for 2022

DataKitchen

DataOps adoption continues to expand as a perfect storm of social, economic, and technological factors drive enterprises to invest in process-driven innovation. From our unique vantage point in the evolution toward DataOps automation, we publish an annual prediction of trends that most deeply impact the DataOps enterprise software industry as a whole.

Why Machine Learning Engineers are Replacing Data Scientists

KDnuggets

The hiring run for data scientists continues along at a strong clip around the world. But, there are other emerging roles that are demonstrating key value to organizations that you should consider based on your existing or desired skill sets.

Connecting the Data Lifecycle

Cloudera

Data transforms businesses. When done right it creates value and allows business leaders to make the most advantageous decisions, in real-time. That’s where the data lifecycle comes into play. Managing data and its flow, from the edge to the cloud, is one of the most important tasks in the process of gaining data intelligence. The Data Impact Awards 2021 aim to recognize and reward the various organizations taking advantage of the latest Big Data services to successfully manage large amounts o

Evaluation Modes in Scala Explained

Rock the JVM

Explore core Scala constructs from a fresh perspective and uncover new insights into their use and application.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Why Rockset Is My Next Job After Facebook

Rockset

“At every step in this process, I’ve been consistently impressed by the quality and caliber of the team.” On Monday, I joined Rockset. I am joining Rockset as its first director of engineering and first external manager hire. I come here from Facebook, where I spent the last 10 years building and supporting teams. Most of my work was in the core C++ libraries and distributed systems components that power Facebook’s infrastructure.

Sentiment Analysis with KNIME

KDnuggets

Check out this tutorial on how to approach sentiment classification with supervised machine learning algorithms.

The Cloudera Enterprise Data Cloud Maturity Report: Uncovering progressive steps towards a hybrid future

Cloudera

This guest blog was written by Shanice Omare, Research Manager, Vanson Bourne. Organizations’ resiliency in the wake of the pandemic. So much has changed for organizations in recent times, with the pandemic accelerating shifts toward a more digital world. Some organizations have taken this as an opportunity for positive change by moving workloads to the cloud and utilizing enterprise data strategies that are key to their business resiliency.

Maps with PostgreSQL and PostGIS

Zalando Engineering

This blog post explains to you which tools to use to serve geospatial data from a database system (PostgreSQL) to your web browser. All you need is a database server for the data, a web map application for the frontend and a small service in between to transfer user requests. I will also show you how these components can run on top of Kubernetes in a highly available cloud native fashion.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Data Mining vs Machine Learning. Here’s the Difference

ProjectPro

We all are aware of the advancements in technology; new terminologies are coming in with these advancements. Everyone wants to keep up with this, wanting to sound tech-savvy. To ensure this, it is important to understand the exact meaning of the terminologies before we use them. Data is the New Fuel. We all know this, so you might have heard terms like Artificial Intelligence (AI), Machine Learning, Data Mining, Neural Networks, etc.

5 Practical Data Science Projects That Will Help You Solve Real Business Problems for 2022

KDnuggets

This curated list of data science projects offers real-life problems that will help you master skills to demonstrate that you are technically sound and know how to conduct data science projects that add business value.

Making Data Science Responsible

Elder Research

The post Making Data Science Responsible appeared first on Elder Research.

From Strategy to Action: How to ‘Break the Code’ of Analytics at Scale in Retail and CPG

Teradata

Retail and CPG leaders of the future need to successfully leverage analytics at speed and scale to drive performance. Find out more.

Prepare Now: 2025’s Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How to Become a Deep Learning Engineer in 2023?

ProjectPro

Deep learning was developed in the early 1940s to mimic the neural networks of the human brain. However, it did not garner enough interest due to limited computation power and storage options. However, in the last few decades, deep learning has unleashed itself into the world. Its massive evolution is also the result of substantial research labs and industry players like Facebook, Google, Apple, Netflix, Microsoft, Baidu, and IBM investing in its research. 85% of data science platform vendors ha

What Percentage of Your Machine Learning Models Have Been Deployed?

KDnuggets

Take a moment to participate in the latest KDnuggets poll and let the community know what percentage of your machine learning models have been deployed.

The (Missing) Role of Design in Analytics

dbt Developer Hub

If you’ve spoken to me lately, follow me on Twitter, or have taken my order at Wendy’s, you probably know how much I hate traditional dashboards. My dad, a psychotherapist, has been working with me to get to the root of my upbringing that led to this deep-rooted feeling. As it turns out, the cause of my feelings towards traditional dashboarding is actually quite obvious.

RudderStack Secures SOC 2 Type II Certification

RudderStack

We consider security to be vital, especially when it comes to our customers’ data. We’re excited that we have attained SOC 2 Type II compliance.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

What is ETL Pipeline? Process, Considerations, and Examples

ProjectPro

If you are into Data Science or Big Data, you must be familiar with an ETL pipeline. If not, then don't worry. This guide provides definitions, a step-by-step tutorial, and a few best practices to help you understand ETL pipelines and how they differ from data pipelines. The crux of all data-driven solutions or business decision-making lies in how well the respective businesses collect, transform, and store data.

Avoid These Mistakes with Time Series Forecasting

KDnuggets

A few checks to make before training a Machine Learning model on data that could be random.

How I Study Open Source Community Growth with dbt

dbt Developer Hub

Most organizations spend at least some of their time contributing to an open source project. 100% of them, though, depend in some way on the output of open source communities. In fact we all do. The work of communities can be found everywhere - in the cell phone that wakes you up, the machine that makes your coffee, the car that drives you to get better coffee, the magic app that brings you dinner, and the television that lulls you back to sleep.

What is a Data Source?

Grouparoo

The data source is the location of the data that a processing function will consume. This can be the point of origin of the data, the place of its creation. Alternatively, this can be data generated by another process and then made available for subsequent processing. Therefore, the source data may be raw, unfiltered, and unrefined, or polished and fully formed.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.