Sat. Feb 09, 2019 - Fri. Feb 15, 2019

Machine Learning In The Enterprise

Data Engineering Podcast

Summary: Machine learning is a class of technologies that promise to revolutionize business. Unfortunately, it can be difficult to identify and execute on ways that it can be used in large companies. Kevin Dewalt founded Prolego to help Fortune 500 companies build, launch, and maintain their first machine learning projects so that they can remain competitive in our landscape of constant change.

Journey to Event Driven – Part 2: Programming Models for the Event-Driven Architecture

Confluent

Part 1 of this series discussed why you need to embrace event-first thinking, while this article builds a rationale for different styles of event-driven architectures and compares and contrasts scaling, persistence and runtime models. Once we have settled on the event streaming approach, I’ll provide a high-level dataflow of how we design systems for payment processing at scale using this approach.

Building a Cross-platform In-app Messaging Orchestration Service

Netflix Tech

George Abraham, Devika Chawla, Chris Beaumont, and Daniel Huang. Thoughtful, relevant, and timely messaging is an integral part of a customer’s Netflix experience. The Netflix Messaging Engineering team builds the platform and the messages to communicate with Netflix customers. Messages in the Netflix App: In-app messages at Netflix fall broadly into two channels…

Cloudera announces support for Azure’s next-generation Data Lake Store

Cloudera

Today we are proud to announce our support for ADLS Gen2 as it enters general availability on Microsoft Azure. CDH 6.1 already includes support for MapReduce and Spark jobs, Hive and Impala queries, and Oozie workflows on ADLS Gen2. The Cloudera platform delivers a one-stop shop that allows you to store any kind of data, process and analyze it in many different ways in a single environment, and integrate with the rest of your data infrastructure.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

What Is Readable Code?

Pandora Engineering

Code creates interfaces. But code itself is also an interface.

Coding 52

Kafka Connect Deep Dive – JDBC Source Connector

Confluent

One of the most common integrations that people want to do with Apache Kafka® is getting data in from a database. That is because relational databases are a rich source of events. The existing data in a database, and any changes to that data, can be streamed into a Kafka topic. From there, these events can be used to drive applications, streamed to other data stores such as search replicas or caches, and streamed to storage for analytics.
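As a rough illustration of what such an integration looks like, the sketch below builds the JSON payload one might submit to the Kafka Connect REST API to create a JDBC source connector. The connector name, database URL, table, column, and topic prefix are all hypothetical, and `jdbcSourceRequest` is a helper invented for this example. Note that `incrementing` mode only picks up newly inserted rows; the connector also offers timestamp-based modes for capturing updates.

```typescript
// Sketch only: connector name, database URL, table, and column are hypothetical.
interface ConnectorRequest {
  name: string;
  config: Record<string, string>;
}

// Builds the request body for creating a JDBC source connector.
function jdbcSourceRequest(table: string, idColumn: string): ConnectorRequest {
  return {
    name: `jdbc-source-${table}`,
    config: {
      "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
      "connection.url": "jdbc:postgresql://localhost:5432/demo", // assumed database
      "table.whitelist": table,
      // "incrementing" mode detects new rows via a strictly increasing id column
      "mode": "incrementing",
      "incrementing.column.name": idColumn,
      // rows from the table are written to the topic named <prefix><table>
      "topic.prefix": "db-",
    },
  };
}

const request = jdbcSourceRequest("orders", "id");
console.log(JSON.stringify(request, null, 2));
```

With this configuration, rows from the hypothetical `orders` table would be published to a topic named `db-orders`.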

Kafka 90

More Trending

Governing for digital transformation and growth

Cloudera

Ask a CIO where their focus lies and ‘digital transformation’ as well as ‘growth’ will come into the conversation quite quickly. The former sees growing investment in data analytics to become data-driven (45% of organizations expect to increase their spending in this area) while the latter is fueled by disruptive technology and the adoption of AI (41% of organizations name it as their game changer).

Distributed Aggregation Queries - A Rockset Intern Story

Rockset

I first met with the Rockset team when they were just four people in a small office in San Francisco. I was taken aback by their experience and friendliness, but most importantly, their willingness to spend a lot of time mentoring me. I knew very little about Rockset's technologies and didn’t know what to expect from such an agile early-stage startup, but decided to join the team for a summer internship anyway.

Food 40

Typescript Best Practices

Zalando Engineering

TypeScript is becoming more and more popular. As with everything, it has good and bad sides, and how good it is depends on how you use it in your application. This article does not discuss those good and bad sides; instead, it covers some best practices that help you get the most out of TypeScript. 1. Strict configuration: Strict configuration should be mandatory and enabled by default, as there is not much value in using TypeScript without these settings.
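To make the strict-configuration point concrete, here is a minimal sketch of the kind of mistake that `"strict": true` in `tsconfig.json` (which enables `strictNullChecks`, among other checks) turns into a compile-time error. The `User` type and `displayName` function are hypothetical names chosen for illustration, not taken from the article.

```typescript
// Hypothetical example: `User` and `displayName` are illustrative names only.
interface User {
  name: string;
  nickname?: string; // optional, so its type is string | undefined
}

// Under strict mode the compiler rejects `user.nickname.toUpperCase()`,
// because `nickname` may be undefined. Handling the missing case
// explicitly satisfies the type checker.
function displayName(user: User): string {
  return (user.nickname ?? user.name).toUpperCase();
}

console.log(displayName({ name: "ada" }));
console.log(displayName({ name: "ada", nickname: "countess" }));
```

Without strict mode, the unsafe version would compile and fail only at runtime when a user has no nickname.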

Coding 40

Is There Such a Thing as Too Much Parallelism?

Teradata

In her blog, Carrie Ballinger discusses parallelism and how you can tailor it to specific needs by using the new sparse map capability.

IT 45

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Cloudera’s and Hortonworks’ data platform in the cloud named among Leaders in new Forrester Wave

Cloudera

When Cloudera was formed about 10 years ago, the founders believed that companies would jump at the chance to store, manage, and analyze their data in the cloud. Thus, they came up with the name Cloudera, which was a play on “era of cloud.” But, much to their surprise, companies weren’t ready for cloud; they were more focused on on-prem. So, Cloudera focused on helping companies with storing, managing, and analyzing data on-prem.

Cloud 60

Three Takeaways from Gartner’s 2019 Magic Quadrant for Data Management Solutions for Analytics

Cloudera

The Magic Quadrant (MQ) is an established, widely referenced series of research reports published by the analyst firm Gartner, Inc. The January 2019 “Magic Quadrant for Data Management Solutions for Analytics” provides valuable insights into the status, direction, and players in the DMSA market. A total of 19 vendors satisfied Gartner’s extensive criteria for inclusion in this year’s MQ DMSA report.

Transforming the business of communication with 5G

Cloudera

3.2 billion. That is the number of unique mobile subscribers that Asia Pacific is projected to have by 2025, which accounts for more than half of the world’s mobile subscribers. Mobile data traffic is predicted to grow at a rate of 40 to 50 percent annually, and Internet of Things (IoT) connections at 25 to 30 percent. As technology adoption increases, more service providers require 5G to support the surge of incoming data.