Top Data Engineering Digest Data Schemas Software Engineering Content for Week of Jun 08

Sat.Jun 08, 2019 - Fri.Jun 14, 2019

Streaming Data from the Universe with Apache Kafka

Confluent

JUNE 13, 2019

You might think that data collection in astronomy consists of a lone astronomer pointing a telescope at a single object in a static sky. While that may be true in some cases (I collected the data for my Ph.D. thesis this way), the field of astronomy is rapidly changing into a data-intensive science with real-time needs. Each night, large-scale astronomical telescope surveys detect millions of changing objects in the sky and need to stream results to scientists for time-sensitive, complementary f

Kafka

Kafka Bytes Python Data Pipeline

Managing The Machine Learning Lifecycle

Data Engineering Podcast

JUNE 9, 2019

Summary: Building a machine learning model can be difficult, but that is only half of the battle. Having a perfect model is only useful if you are able to get it into production. In this episode, Stepan Pushkarev, founder of Hydrosphere, explains why deploying and maintaining machine learning projects in production is different from regular software projects and the challenges that they bring.

Machine Learning

Machine Learning Management Scala Data Science

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

What Tableau Customers Should Expect Post-Salesforce Acquisition

Teradata

JUNE 11, 2019

Chad Meley examines how Salesforce's acquisition of Tableau will impact customer choice and flexibility.

Building a SQL Development Environment for Messy, Semi-Structured Data

Rockset

JUNE 13, 2019

Why build a new SQL development environment? We love SQL — our mission is to bring fast, real-time queries to messy, semi-structured real-world data and SQL is a core part of our effort. A SQL API allows our product to fit neatly into the stacks of our users without any workflow re-architecting. Our users can easily integrate Rockset with a multitude of existing tools for SQL development (e.g.

SQL

SQL Structured Data Building Raw Data

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

Data Pipeline

How to Connect KSQL to Confluent Cloud using Kubernetes with Helm

Confluent

JUNE 12, 2019

Confluent Cloud, a fully managed, cloud-native event streaming service that extends the value of Apache Kafka®, is simple, resilient, secure, and performant, allowing you to focus on what is important: building contextual event-driven applications, not infrastructure. If you are using Confluent Cloud as your managed Apache Kafka cluster, you probably also want to start using other Confluent Platform components like the Confluent Schema Registry, Kafka Connect, KSQL, and Confluent REST Proxy.

Cloud

Cloud Kafka Healthcare Software Engineer

Unlock the Value of Data Faster Through Modern Data Warehousing

Advancing Analytics: Data Engineering

JUNE 10, 2019

Data has value – I think we’ve finally got to the point where most people agree on this. The problem we face is how long it takes to unlock that value, and it’s a frustration that most businesses I speak to are having. Let’s think about why this is. After the horror that was the “data silo” days, with clumps of data living in Access databases, Excel spreadsheets and isolated data stores, we’ve had a pretty good run with the classic Kimball data warehouse.

Data Warehouse

Data Warehouse Data Lake Data Data Validation

New As-a-Service Offers on Vantage Bring Simplicity, Modernization

Teradata

JUNE 9, 2019

Analytics as a service lets you offload IT infrastructure tasks so you can focus on solving your toughest business problems. Learn more about options for Teradata Vantage.

More Trending

Spring for Apache Kafka Deep Dive – Part 4: Continuous Delivery of Event Streaming Pipelines

Confluent

JUNE 11, 2019

For event streaming application developers, it is important to continuously update the streaming pipeline based on the need for changes in the individual applications in the pipeline. It is also important to understand some of the common streaming topologies that streaming developers use to build an event streaming pipeline. Here in part 4 of the Spring for Apache Kafka Deep Dive blog series, we will cover: Common event streaming topology patterns supported in Spring Cloud Data Flow.

Kafka

Kafka Cloud Java MongoDB

Modern Data Warehousing with Azure Databricks at the #PASSSummit in Seattle

Advancing Analytics: Data Engineering

JUNE 10, 2019

Hey everyone, Advancing Analytics are heading to Seattle in November for the PASS Summit. We will be delivering a full-day training session on Azure Databricks - Practical Azure Databricks: Engineering & Warehousing at Scale. The session will focus on using Azure Databricks for Modern Data Warehousing. Not sure if the day is for you? Well, take a look at the video we recorded.

Data Science

Data Science Data Engineering