Sat. Nov 27, 2021 - Fri. Dec 03, 2021

How to Get Certified as a Data Scientist

KDnuggets

If you are early in your journey to becoming a Data Scientist, an interesting option is to earn certification by DataCamp, and this guide offers tips that will help beginners complete the challenges.

A Guide to Stream Processing and ksqlDB Fundamentals

Confluent

Event streaming applications are a powerful way to react to events as they happen and to take advantage of data while it is fresh. However, they can be a challenge […].

Trending Sources

Create your Private Data Warehousing Environment Using Azure Kubernetes Service

Cloudera

For Cloudera, ensuring data security is critical because we have large customers in highly regulated industries like financial services and healthcare, where security is paramount. Also, for other industries like retail, telecom or public sector that deal with large amounts of customer data and operate multi-tenant environments, sometimes with end users who are outside of their company, securing all the data may be a very time-intensive process.

Doing DataOps For External Data Sources As A Service at Demyst

Data Engineering Podcast

Summary: The data that you have access to affects the questions that you can answer. By using external data sources you can drastically increase the range of analysis that is available to your organization. The challenge comes in all of the operational aspects of finding, accessing, organizing, and serving that data. In this episode Mark Hookey discusses how he and his team at Demyst do all of the DataOps for external data sources so that you don’t have to, including the systems necessary t

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

2021: A Year Full of Amazing AI papers — A Review

KDnuggets

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

Best Tutorials for Getting Started with Apache Kafka

Confluent

Each one of the more than 50 tutorials for Apache Kafka® on Confluent Developer answers a question that you might ask a knowledgeable friend or colleague about Kafka and its […].

More Trending

Creating A Unified Experience For The Modern Data Stack At Mozart Data

Data Engineering Podcast

Summary: The modern data stack has been gaining a lot of attention recently with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options it is possible to run a scalable, production-grade data platform with a small team, but there are still sharp edges and integration challenges to work through.

Movie Recommendations with Spark Collaborative Filtering

KDnuggets

Not sure what movie to watch? Ask your recommender system.

Reference Data: Smoothing Out the Bumps in M&A

Teradata

M&A is an important part of an organization's growth strategy. Getting reference data right can be foundational to overcoming many challenges that come with it.

A Systematic Approach to Reducing Technical Debt

Zalando Engineering

Introduction: While technical debt is a recurring issue in software engineering, the case of the Merchant Orders team within Zalando Direct was an outlier as, due to a lack of a clearly defined process, technical debt more or less only ever accumulated. When I joined this team in autumn 2020 as its new engineering lead, the technical debt backlog had entries dating back to 2018.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Eight Top DataOps Trends for 2022

DataKitchen

DataOps adoption continues to expand as a perfect storm of social, economic, and technological factors drive enterprises to invest in process-driven innovation. From our unique vantage point in the evolution toward DataOps automation, we publish an annual prediction of trends that most deeply impact the DataOps enterprise software industry as a whole.

Why Machine Learning Engineers are Replacing Data Scientists

KDnuggets

The hiring run for data scientists continues along at a strong clip around the world. But, there are other emerging roles that are demonstrating key value to organizations that you should consider based on your existing or desired skill sets.

Connecting the Data Lifecycle

Cloudera

Data transforms businesses. When done right it creates value and allows business leaders to make the most advantageous decisions, in real-time. That’s where the data lifecycle comes into play. Managing data and its flow, from the edge to the cloud, is one of the most important tasks in the process of gaining data intelligence. The Data Impact Awards 2021 aim to recognize and reward the various organizations taking advantage of the latest Big Data services to successfully manage large amounts o

Evaluation Modes in Scala Explained

Rock the JVM

Explore core Scala constructs from a fresh perspective and uncover new insights into their use and application

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Why Rockset Is My Next Job After Facebook

Rockset

“At every step in this process, I’ve been consistently impressed by the quality and caliber of the team.” On Monday, I joined Rockset. I am joining Rockset as its first director of engineering and first external manager hire. I come here from Facebook, where I spent the last 10 years building and supporting teams. Most of my work was in the core C++ libraries and distributed systems components that power Facebook’s infrastructure.

Sentiment Analysis with KNIME

KDnuggets

Check out this tutorial on how to approach sentiment classification with supervised machine learning algorithms.

The Cloudera Enterprise Data Cloud Maturity Report: Uncovering progressive steps towards a hybrid future

Cloudera

This guest blog was written by Shanice Omare, Research Manager, Vanson Bourne. Organizations’ resiliency in the wake of the pandemic. So much has changed for organizations in recent times, with the pandemic accelerating shifts toward a more digital world. Some organizations have taken this as an opportunity for positive change by moving workloads to the cloud and utilizing enterprise data strategies that are key to their business resiliency.

Maps with PostgreSQL and PostGIS

Zalando Engineering

This blog post explains to you which tools to use to serve geospatial data from a database system (PostgreSQL) to your web browser. All you need is a database server for the data, a web map application for the frontend and a small service in between to transfer user requests. I will also show you how these components can run on top of Kubernetes in a highly available cloud native fashion.

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you

Data Mining vs Machine Learning. Here’s the Difference

ProjectPro

We are all aware of the advancements in technology; new terminologies are coming in with these advancements. Everyone wants to keep up with this, wanting to sound tech-savvy. To ensure this, it is important to understand the exact meaning of the terminologies before we use them. Data is the New Fuel. We all know this, so you might have heard terms like Artificial Intelligence (AI), Machine Learning, Data Mining, Neural Networks, etc.

5 Practical Data Science Projects That Will Help You Solve Real Business Problems for 2022

KDnuggets

This curated list of data science projects offers real-life problems that will help you master skills to demonstrate that you are technically sound and know how to conduct data science projects that add business value.

Making Data Science Responsible

Elder Research

From Strategy to Action: How to ‘Break the Code’ of Analytics at Scale in Retail and CPG

Teradata

Retail and CPG leaders of the future need to successfully leverage analytics at speed and scale to drive performance. Find out more.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

How to Become a Deep Learning Engineer in 2023?

ProjectPro

Deep learning was developed in the early 1940s to mimic the neural networks of the human brain. However, it did not garner enough interest due to limited computation power and storage options. In the last few decades, though, deep learning has unleashed itself into the world. Its massive evolution is also the result of substantial research labs and industry players like Facebook, Google, Apple, Netflix, Microsoft, Baidu, and IBM investing in its research. 85% of data science platform vendors ha

What Percentage of Your Machine Learning Models Have Been Deployed?

KDnuggets

Take a moment to participate in the latest KDnuggets poll and let the community know what percentage of your machine learning models have been deployed.

The (Missing) Role of Design in Analytics

dbt Developer Hub

If you’ve spoken to me lately, follow me on Twitter, or have taken my order at Wendy’s, you probably know how much I hate traditional dashboards. My dad, a psychotherapist, has been working with me to get to the root of my upbringing that led to this deep-rooted feeling. As it turns out, the cause of my feelings towards traditional dashboarding is actually quite obvious.

RudderStack Secures SOC 2 Type II Certification

RudderStack

We consider security to be vital, especially when it comes to our customers’ data. We’re excited that we have attained SOC 2 Type II compliance.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

15 Business Analyst Project Ideas and Examples for Practice

ProjectPro

Your search for business analyst project examples ends here. This blog contains sample projects for business analyst beginners and professionals. So, continue reading this blog to learn more about different business analyst project ideas. Business analysts are the demand of the twenty-first century! One can easily affirm this by looking at a report by the U.S.

Avoid These Mistakes with Time Series Forecasting

KDnuggets

A few checks to make before training a Machine Learning model on data that could be random.

How I Study Open Source Community Growth with dbt

dbt Developer Hub

Most organizations spend at least some of their time contributing to an open source project. 100% of them, though, depend in some way on the output of open source communities. In fact we all do. The work of communities can be found everywhere - in the cell phone that wakes you up, the machine that makes your coffee, the car that drives you to get better coffee, the magic app that brings you dinner, and the television that lulls you back to sleep.

What is a Data Source?

Grouparoo

A data source is the location of the data that a processing function will consume. This can be the point of origin of the data, the place of its creation. Alternatively, this can be data generated by another process and then made available for subsequent processing. Therefore, the source data may be raw, unfiltered, and unrefined, or polished and fully formed.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.