Top Data Engineering Digest Data Storage Data Collection Content for Week of May 04

Sat. May 04, 2019 - Fri. May 10, 2019

Using FoundationDB As The Bedrock For Your Distributed Systems

Data Engineering Podcast

MAY 6, 2019

Summary: The database market continues to expand, offering systems that are suited to virtually every use case. But what happens if you need something customized to your application? FoundationDB is a distributed key-value store that provides the primitives that you need to build a custom database platform. In this episode Ryan Worl explains how it is architected and how to use it for your applications, and provides examples of system design patterns that can be built on top of it.

Systems

Systems MongoDB NoSQL Database

Dawn of Kafka DevOps: Managing Multi-Cluster Kafka Connect and KSQL with Confluent Control Center

Confluent

MAY 8, 2019

In anything but the smallest deployment of Apache Kafka®, there are often going to be multiple clusters of Kafka Connect and KSQL. Kafka Connect is used for building event streaming data pipelines between upstream and downstream systems with Kafka, and KSQL is used for building stream processing applications declared in a SQL-like language. People will have multiple clusters of these for various reasons, including: Resource isolation.

Kafka

Kafka Management Hadoop Database

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

8 Places to Visit in Denver While Attending Teradata Universe 2019

Teradata

MAY 6, 2019

Heading to Teradata Universe 2019? Camille Schmidt lists the "8 Places to Visit in Denver" while attending the flagship conference.

A 5D model to assess your IoT readiness

Cloudera

MAY 9, 2019

The number one challenge enterprises face with their IoT implementations is being unable to measure whether they are succeeding. Most enterprises start an IoT initiative without first assessing whether they have the potential to complete it. Even if they complete it, they lack the ability to identify the success metrics and correlate them with key business goals.

Manufacturing

Manufacturing Data Ingestion Architecture Data Governance

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

Data Pipeline

Back-Pressure Strategy for a Sharded Akka Cluster

Zalando Engineering

MAY 8, 2019

AWS SQS polling from a sharded Akka Cluster running on Kubernetes. NOTE: This blog post requires the reader to have prior knowledge of AWS SQS, Akka Actors and Akka Cluster Sharding. My last post introduced Akka Cluster Sharding as a Distributed Cache running on Kubernetes. As that proof of concept (PoC) proved promising, we started building a high-throughput and low-latency system based on the experience we gained.

AWS

AWS Architecture Systems Process

Case Study: Decore Uses Rockset for Search & Analytics on DynamoDB

Rockset

MAY 6, 2019

Many early adopters of cryptocurrency were individuals at the forefront of this technology, but enterprises are now increasingly getting involved. As using cryptocurrency for business transactions becomes more commonplace, Decore aims to make accounting as streamlined as possible for companies accepting and sending crypto. Conceived as a “Quickbooks for crypto,” Decore provides accounting solutions for companies that have adopted crypto.

Banking

Banking MySQL AWS Architecture

OCR Algorithm: Improve and Automate Business Processes

InData Labs

MAY 4, 2019

Mid-size and large businesses have massive amounts of printed documents in daily use, among them invoices, receipts, corporate documents, reports, and media releases. Millions of them may be handwritten, which makes them understandable for humans but difficult for machines to read. Basic Concept of OCR: Optical character recognition (OCR) algorithms allow computers.

Algorithm

Algorithm Process Media Technology

More Trending

Apache Kafka Data Access Semantics: Consumers and Membership

Confluent

MAY 7, 2019

Every developer who uses Apache Kafka® has used a Kafka consumer at least once. Although it is the simplest way to subscribe to and access events from Kafka, behind the scenes, Kafka consumers handle tricky distributed systems challenges like data consistency, failover and load balancing. Luckily, Kafka’s consuming model is quite easy to understand.

Kafka

Kafka Accessible Accessibility Metadata

Journey to Event Driven – Part 4: Four Pillars of Event Streaming Microservices

Confluent

MAY 9, 2019

So far in this series, we have recognized that by going back to first principles, we have a new foundation to work with. Event-first thinking enables us to build a new atomic unit: the event. Storing events in a stream and connecting streams via stream processors provide a generic, data-centric, distributed application runtime that you can use to build ETL, event streaming applications, applications for recording metrics and anything else that has a real-time data requirement.

Kafka

Kafka Pipeline-centric Architecture Database-centric
