Sat. Sep 26, 2020 - Fri. Oct 02, 2020

How Real-Time Stream Processing Works with ksqlDB, Animated

Confluent

ksqlDB, the event streaming database, is becoming one of the most popular ways to work with Apache Kafka®. Every day, we answer many questions about the project, but here’s a […].

Process 145

Upgrade Journey: The Path from CDH to CDP Private Cloud

Cloudera

Cloudera delivers an enterprise data cloud that enables companies to build end-to-end data pipelines for hybrid cloud, spanning edge devices to public or private cloud, with integrated security and governance underpinning it to protect customers’ data. Cloudera has found that customers have spent many years investing in their big data assets and want to continue to build on that investment by moving towards a more modern architecture that helps leverage the multiple form factors.

Cloud 132

Trending Sources

Data Engineering Project: Stream Edition

Start Data Engineering

Table of Contents: Introduction; Project description and requirements; Infrastructure overview; Apache Flink; Apache Kafka; Design; Detect fraudulent accounts; Log account actions; Prerequisites; Code; Defining dependencies; Inheritance; Server logs generator; Defining data flow in Apache Flink; Create a streaming environment; Creating a consumer to read events from Apache Kafka; Detecting fraud and generating alert events; Writing server logs to a PostgreSQL DB; Fraud detection logic; Open proces

Speed Up And Simplify Your Streaming Data Workloads With Red Panda

Data Engineering Podcast

Summary: Kafka has become a de facto standard interface for building decoupled systems and working with streaming data. Despite its widespread popularity, there are numerous accounts of the difficulty that operators face in keeping it reliable and performant, or trying to scale an installation. To make the benefits of the Kafka ecosystem more accessible and reduce the operational burden, Alexander Gallego and his team at Vectorized created the Red Panda engine.

Kafka 100

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Introducing Confluent Platform 6.0

Confluent

Each month, we’ve announced a set of Confluent features organized around what we think are the key foundational traits of cloud-native data systems as part of Project Metamorphosis. Data systems […].

Project 143

UK Government: From cloud first to cloud appropriate?

Cloudera

Since 2013 the UK Government’s flagship ‘Cloud First’ policy has been at the forefront of enabling departments to shed their legacy IT architecture in order to meaningfully embrace digital transformation. The policy outlines that the cloud (and specifically, public cloud) be the default position for any new services, unless it can be demonstrated that other alternatives offer better value for money.

More Trending

PowerBI distribution and sharing

FreshBI

Spotlight: The PowerBI Service. Lately we have been getting a lot of questions surrounding licensing and release strategy in PowerBI. This guide should serve as an internal, quick reference manual. The following is a list of topics covered in this guide, each containing a summary of how it works and what the use case is. Licensing: PowerBI Desktop / Free. Who uses this?

BI 52

ksqlDB Meets Java: An IoT-Inspired Demo of the Java Client for ksqlDB

Confluent

Stream processing applications, including streaming ETL pipelines, materialized caches, and event-driven microservices, are made easy with ksqlDB. Until recently, your options for interacting with ksqlDB were limited to its command-line […].

Java 122

Cloudera Data Warehouse outperforms Azure HDInsight in TPC-DS benchmark

Cloudera

Performance is one of the key deciding criteria, if not the most important, in choosing a Cloud Data Warehouse service. In today’s fast-changing world, enterprises have to make data-driven decisions quickly, and for that they rely heavily on their data warehouse service. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to Microsoft HDInsight (also powered by Apache Hive-LLAP) on Azure using the TPC-DS 2.9 benchmark.

Three Insights Into Delivering Value at Scale From Smart Factory Investments

Teradata

Industry 4.0 has promised productivity gains, but has not yet delivered. A large part of this has to do with the challenge of deploying analytics at scale. Find out more.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Scala 3: Traits Quickly Explained

Rock the JVM

This article delves into Scala 3's advanced trait functionalities, building on our previous explorations of the language's new features.

Scala 52

ksqlDB 0.12.0 Introduces Real-Time Query Upgrades and Automatic Query Restarts

Confluent

The ksqlDB team is pleased to announce ksqlDB 0.12.0. This release continues to improve upon the usability of ksqlDB and aims to reduce administration time. Highlights include query upgrades, which […].

Process 98

How to enable Cloudera Data Visualization in CDW

Cloudera

In our previous blog post, we introduced Cloudera Data Visualization in Cloudera Data Warehouse (CDW), available in tech preview in CDP Public Cloud. This blog will help you get started with Cloudera Data Visualization, so you can start building interesting and powerful applications on all types of data. Before you start, make sure that you have a CDP account set up (for instance, you may use our trial experience).

Demystifying the Business Continuity Space: A Two Part Series

Teradata

In part 1 of this 2 part topic, we will define some of the commonly used (& misused) terms in the business continuity space & help you navigate what they mean to your organization.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Enums in Scala 3: Quickly Explained

Rock the JVM

Scala 3 Introduces Enums: A Major Update with Significant Implications

Scala 52

Building a Real-Time Customer 360 on Kafka, MongoDB and Rockset

Rockset

Users interact with services in real time. They log in to websites, like and share posts, purchase goods and even converse, all in real time. So why is it that whenever you have a problem using a service and you reach a customer support representative, they never seem to know who you are or what you’ve been doing recently? This is likely because they haven’t built a customer 360 profile, and if they have, it certainly isn’t real-time.

MongoDB 40

Coffee with Cloudera: Meet Ali Bajwa, Partner Solutions – Engineer by Day, Rockstar by Night!

Cloudera

Meet Ali Bajwa, Director of Partner Solutions Engineering at Cloudera. For the past 6 years, Ali has been front and center in many partner field deployments, training, and discussions; he is a rockstar in the Cloudera Partner Ecosystem! We hope this interview helps you get to know the after-hours Ali. If you get a chance, follow Ali on Twitter! @abajwa_hdp.

Demystifying the Business Continuity Space: A Three Part Series

Teradata

In part 1 of this 3 part series, we will define some of the commonly used (& misused) terms in the business continuity space & help you navigate what they mean to your organization.

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you

Build a Slack Dashboard (Part 2): Loading Into Postgres & Creating Basic Charts

Preset

Build a beautiful Slack dashboard using open source tools Meltano and Superset. Part 2 of 3.

How to Solve the “You’re Using THAT Table?!” Problem

Monte Carlo

As companies increasingly rely on data to power decision making and drive innovation, it’s important that this data is timely, accurate, and reliable. When you consider that only a small fraction of the over 7.5 septillion (7,700,000,000,000,000,000,000) GB of data generated worldwide every day is usable, keeping tabs on what data assets are important has only gotten harder.

Break Out of the Data Silo!

Teradata

Marketing might be the best place to start operationalizing a bank-wide data strategy. But, to be effective, the CMO needs to dissolve data silos & create a model for data orchestration.

Banking 52