Sat. Jul 06, 2019 - Fri. Jul 12, 2019


Our Commitment to Open Source Software

Cloudera

Open source has been core to the missions of both Hortonworks and Cloudera and central to our values and culture. With more than 700 engineers in the new Cloudera, our company writes a prodigious amount of open source code each year that’s contributed to more than 30 different open source projects. We’re also a very innovative open source company, having collectively launched more than a dozen new open source projects since the founding of the two companies.


The Power of Integrated Data and Analytics

Teradata

Integrated data and analytics has a proven track record of helping organize operations, enhance customer experience and improve revenue and market growth.


Scale Your Analytics On The Clickhouse Data Warehouse

Data Engineering Podcast

Summary: The market for data warehouse platforms is large and varied, with options for every use case. ClickHouse is an open source, column-oriented database engine built for interactive analytics with linear scalability. In this episode Robert Hodges and Alexander Zaitsev explain how it is architected to provide these features, the various unique capabilities that it provides, and how to run it in production.


Deploying Kafka Streams and KSQL with Gradle – Part 3: KSQL User-Defined Functions and Kafka Streams

Confluent

Building off part 1 where we discussed an event streaming architecture that we implemented for a customer using Apache Kafka, KSQL, and Kafka Streams, and part 2 where we discussed how Gradle helped us address the challenges we faced developing, building, and deploying the KSQL portion of our application, here in part 3, we’ll explore using Gradle to build and deploy KSQL user-defined functions (UDFs) and Kafka Streams microservices.


Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.


Solving the Pain Points of Big Data Management

Cloudera

Every business aims to deliver products and services quickly and efficiently based upon customer wants and needs. Today, much of that speed and efficiency relies on insights driven by big data. Yet big data management often serves as a stumbling block, because many businesses continue to struggle with how to best capture and analyze their data. Unorganized data presents another roadblock.


Re-Architecting the Video Gatekeeper

Netflix Tech

By Drew Koszewnik. This is the story about how the Content Setup Engineering team used Hollow, a Netflix OSS technology, to re-architect and simplify an essential component in our content pipeline, delivering a large amount of business value in the process. The Context: Each movie and show on the Netflix service is carefully curated to ensure an optimal viewing experience.

More Trending


KSQL Training for Hands-On Learning

Confluent

I’ve been using KSQL from Confluent since its first developer preview in 2017. Reading, writing, and transforming data in Apache Kafka® using KSQL is an effective way to rapidly deliver event streaming applications for clients (e.g., streaming insurance events). Plus, I’ve also had the opportunity to deploy KSQL in some not-so-serious hobby projects (see “Noise Mapping with KSQL, a Raspberry Pi and a Software-Defined Radio” and “ML and KSQL Let Me Know When I’ve Left the Heater Running”).


What is Data Extraction and How It Can Serve Your Business

InData Labs

In the highly competitive business world of today, data reign supreme. Customer personal data, comprehensive operating statistics, sales figures, or inter-company information may play a core role in strategic decision making. It’s vital to keep an eye on the quantity and quality of data that can be captured and extracted from different web sources.


Methods for Running SQL on JSON in PostgreSQL, MySQL and Other Relational Databases

Rockset

One of the main hindrances to getting value from our data is that we have to get data into a form that’s ready for analysis. It sounds simple, but it rarely is. Consider the hoops we have to jump through when working with semi-structured data, like JSON, in relational databases such as PostgreSQL and MySQL. JSON in Relational Databases: In the past, when it came to working with JSON data, we’ve had to choose between tools and platforms that worked well with JSON or tools that provided good suppor