Top Data Engineering Digest AWS Metadata Content for Week of Apr 06

From Apache Kafka to Amazon S3: Exactly Once

Confluent

APRIL 11, 2019

At Confluent, we see many of our customers are on AWS, and we’ve noticed that Amazon S3 plays a particularly significant role in AWS-based architectures. Unless a use case actively requires a specific database, companies use S3 for storage and process the data with Amazon Elastic MapReduce (EMR) or Amazon Athena. But even if a use case requires a specific database such as Amazon Redshift, data will still land to S3 first and only then load to Redshift.

Kafka

Kafka AWS Metadata Architecture

Serverless Data Pipelines On DataCoral

Data Engineering Podcast

APRIL 7, 2019

Summary How much time do you spend maintaining your data pipeline? How much end user value does that provide? Raghu Murthy founded DataCoral as a way to abstract the low level details of ETL so that you can focus on the actual problem that you are trying to solve. In this episode he explains his motivation for building the DataCoral platform, how it is leveraging serverless computing, the challenges of delivering software as a service to customer environments, and the architecture that he has de

Data Pipeline

Data Pipeline Pipeline-centric Database-centric AWS

How to Analyze Data at Speed and Scale Using Pervasive Data Intelligence

Teradata

APRIL 9, 2019

Chris Twogood explains while large companies who utilize data need Pervasive Data Intelligence in order to leverage all of their data, all of the time.

Utilities

Utilities Data

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Why adopt a hybrid, multi-cloud strategy?

Cloudera

APRIL 9, 2019

Enterprises are moving to the cloud. In 2016, 60.9% of application workloads were still on-premises in enterprise data centers; by the end of 2017, less than half (47.2%) were on-premises. Enterprises plan to implement new apps primarily in the cloud while migrating 20.7% of existing apps to public cloud. Despite this trend to move to cloud, It will be rare for enterprises to deploy 100% of their apps in the cloud, let alone deploy all apps to a single cloud.

Cloud

Cloud AWS Metadata Data Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Data

Announcing Confluent Cloud for Apache Kafka as a Native Service on Google Cloud Platform

Confluent

APRIL 9, 2019

I’m excited to announce that we’re partnering with Google Cloud to make Confluent Cloud, our fully managed offering of Apache Kafka ® , available as a native offering on Google Cloud Platform (GCP). This means you will have the ability to use Confluent Cloud’s managed Apache Kafka service with familiar Google tools and processes, including integration into the Google Cloud Console and GCP Marketplace to provide a seamless sign-up experience, and integrated billing and first-line support provided

Google Cloud

Google Cloud Kafka Cloud MongoDB

End-to-end load testing Zalando’s production website

Zalando Engineering

APRIL 10, 2019

Black Friday is the busiest day of the year for us, with over 4,200 orders per minute during the event in 2018. We need to make sure we’re technically able to handle the huge influx of customers. As a part of our preparations we ask all of our teams to perform load tests to ensure their individual components will handle the expected load. In addition, and due to the distributed nature of our system's architecture , we also need to ensure it will handle the expected load once all components have

Python

Python Project Systems Engineering

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Confluent

APRIL 10, 2019

Enterprises run modern data systems and services across multiple cloud providers, private clouds and on-prem multi-datacenter deployments. Instead of having many point-to-point connections between sites, the Confluent Platform provides an integrated event streaming architecture with frictionless data replication between sites. Applications can publish streams of data to a self-hosted on-prem cluster, replicate them to another on-prem cluster or to different cloud providers, load them into data s

Kafka

Kafka Java Metadata Cloud

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Confluent

APRIL 10, 2019

Enterprises run modern data systems and services across multiple cloud providers, private clouds and on-prem multi-datacenter deployments. Instead of having many point-to-point connections between sites, the Confluent Platform provides an integrated event streaming architecture with frictionless data replication between sites. Applications can publish streams of data to a self-hosted on-prem cluster, replicate them to another on-prem cluster or to different cloud providers, load them into data s

Kafka

Kafka Java Metadata Cloud

Data Engineering Digest

Sat.Apr 06, 2019 - Fri.Apr 12, 2019

From Apache Kafka to Amazon S3: Exactly Once

Serverless Data Pipelines On DataCoral

Webinars

Trending Sources

How to Analyze Data at Speed and Scale Using Pervasive Data Intelligence

Webinars

Why adopt a hybrid, multi-cloud strategy?

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Announcing Confluent Cloud for Apache Kafka as a Native Service on Google Cloud Platform

End-to-end load testing Zalando’s production website

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Stay Connected