Sat. Dec 05, 2020 - Fri. Dec 11, 2020

Transactional Machine Learning at Scale with MAADS-VIPER and Apache Kafka

Confluent

This blog post shows how transactional machine learning (TML) integrates data streams with automated machine learning (AutoML), using Apache Kafka® as the data backbone, to create a frictionless machine learning […].

Proven Patterns For Building Successful Data Teams

Data Engineering Podcast

Summary: Building data products is complicated by the fact that there are so many different stakeholders with competing goals and priorities. It is also challenging because of the number of roles and capabilities that are necessary to go from idea to delivery. Different organizations have tried a multitude of organizational strategies to improve the success rate of these data teams, with varying levels of success.

Building 100
Insiders

3x better performance with CDP Data Warehouse compared to EMR in TPC-DS benchmark

Cloudera

In a previous blog post on CDW performance, we compared Azure HDInsight to CDW. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to EMR 6.0 (also powered by Apache Hive-LLAP) on Amazon using the TPC-DS 2.9 benchmark. Amazon recently announced their latest EMR version 6.1.0 with support for ACID transactions.

How Netflix Scales its API with GraphQL Federation (Part 2)

Netflix Tech

In our previous post and QConPlus talk, we discussed GraphQL Federation as a solution for distributing our GraphQL schema and implementation. In this post, we shift our attention to what is needed to run a federated GraphQL platform successfully, from our journey implementing it to lessons learned. Our Journey so Far: Over the past year, we’ve implemented the core infrastructure pieces necessary for a federated GraphQL architecture as described in our previous post: Studio Edge Architecture The

IT 101

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Getting Started with Scala and Apache Kafka

Confluent

If you’re getting started with Apache Kafka® and event streaming applications, you’ll be pleased to see the variety of languages available to start interacting with the event streaming platform. It […].

Kafka 122

Books to level up your data skills!

Start Data Engineering

1.

SQL 130

More Trending

Supporting content decision makers with machine learning

Netflix Tech

by Melody Dye*, Chaitanya Ekanadham*, Avneesh Saluja*, Ashish Rastogi* (*contributed equally). Netflix is pioneering content creation at an unprecedented scale. Our catalog of thousands of films and series caters to 195M+ members in over 190 countries who span a broad and diverse range of tastes. Content, marketing, and studio production executives make the key decisions that aspire to maximize each series’ or film’s potential to bring joy to our subscribers as it progresses from pitch to play o

How to Run Apache Kafka on Windows

Confluent

Is Windows your favorite development environment? Do you want to run Apache Kafka® on Windows? Thanks to the Windows Subsystem for Linux 2 (WSL 2), now you can, and with […].

Kafka 116

Booking’s Journey with Brotli

Booking.com Engineering

Booking.com’s Journey with Brotli: The challenges of improving performance in a complex environment. The Transfăgărășan road in Romania is known for its jaw-dropping views. But you’re gonna have to work for it. Photo CC BY-SA 2.0 by Antony Stanley, from Flickr. Brotli is a lossless compression algorithm, designed and released by Google for use on the web.

Bytes 52

Cost Conscious Data Warehousing with Cloudera Data Platform

Cloudera

Why worry about costs with cloud-native data warehousing? Have you been burned by the unexpected costs of a cloud data warehouse? If so, you know about the failed economics of some cloud-native solutions on the market today. If not, before adopting a cloud data warehouse, consider the true costs of a cloud-native data warehouse. Data warehouses have been broadly adopted to provide timely reports and valuable insights.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Toward a Better Quality Metric for the Video Community

Netflix Tech

by Zhi Li, Kyle Swanson, Christos Bampis, Lukáš Krasula and Anne Aaron. Over the past few years, we have been striving to make VMAF a more usable tool not just for Netflix, but for the video community at large. This tech blog highlights our recent progress toward this goal. VMAF is a video quality metric that Netflix jointly developed with a number of university collaborators and open-sourced on GitHub.

Apache Kafka Lag Monitoring at AppsFlyer

Confluent

This article covers one crucial piece of every distributed system: visibility. At AppsFlyer, we call ourselves metrics obsessed and truly believe that you cannot know what you cannot see. We […].

Kafka 111

Data.What? What Can I Buy in a Data Marketplace?

Teradata

How does a Data Marketplace relate to Data Sharing? Here's a hint: enabling both internal and external users to access integrated data on demand to bring agility to business. Read more.

Data 52

Covid Data: An anomalous blip, or the new normal?

Cloudera

COVID-19 has forced virtually every industry to embrace an acceleration in digital capabilities. While it can be argued that digital transformation was already underway, it’s hard to dispute that it has accelerated in recent months. A recent McKinsey survey, cited in CRN, shows that worldwide, 58 percent of customer interactions were digital as of July 2020.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

What I've Learned in 2020: A Technical Version

Rockset

I'm on paternity leave till the end of the year since my daughter is on the way, and since I have a little time left before getting really busy, I want to reflect on how I've grown as an engineer in 2020. I left Facebook at the end of 2019 to join Rockset, and it has been a fun year. For those who don't know, Rockset is a real-time analytics database.

Preset Getting Started Guide is Now Available

Preset

End-user documentation is focused on taking you step-by-step through the entire Preset Cloud onboarding experience, from connecting your data to building your very first chart and dashboard.

Cloud 40

Looking Forwards Not Backwards: New Ways of Working for the CFO

Teradata

The bold CFO who steps into the breach and takes ownership of the bank’s data asset can transform the way they work and add massive value. Learn more.

Data 52

Federated Learning, Machine Learning, Decentralized Data

Cloudera

Two years ago we wrote a research report about Federated Learning. We’re pleased to make the report available to everyone, for free. You can read it online here: Federated Learning. Federated Learning is a paradigm in which machine learning models are trained on decentralized data. Instead of collecting data on a single server or data lake, it remains in place — on smartphones, industrial sensing equipment, and other edge devices — and models are trained on-device.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases, the number of use cases and corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

2020 Data Impact Award Winner Spotlight: Globe Telecom

Cloudera

It’s been a few weeks since we celebrated the 2020 Data Impact Awards, and everyone at Cloudera is still on a high. It was a brilliant event, and we are so pleased we were able to celebrate our fantastic customers virtually. Thank you again to all those who tuned in! The Connect the Data Lifecycle award was our fifth award at this year’s ceremony.

How Cloudera Supports Government Data Encryption Standards

Cloudera

As part of our ongoing commitment to supporting Government regulations and standards in our enterprise solutions, including data protection, Cloudera recently introduced a version of our Cloudera Data Platform, Private Cloud Base product (7.1.5 release) that can be configured to use FIPS-compliant cryptography. We have accomplished this significant improvement through supporting the deployment of the Cloudera Data Platform (CDP) Private Cloud Base on FIPS-mode-enabled Red Hat Enterprise Linux (RH

Get to Know Your Retail Customer: Accelerating Customer Insight and Relevance

Cloudera

There are lessons to be learned from the brick and mortar or pure-play digital retailers that have been successful in the Covid-19 chaos. While the pandemic’s stress test of e-commerce, in-store insights, supply chain visibility, and fulfillment capabilities has revealed shortcomings and long-lasting consumer experiences, it has also allowed many companies to pivot to very successful strategies built on enterprise data and the digitization efforts that accompany it.

Retail 61

Global View Distributed File System with Mount Points

Cloudera

Apache Hadoop Distributed File System (HDFS) is the most popular file system in the big data world. The Apache Hadoop File System interface has provided integration to many other popular storage systems like Apache Ozone, S3, and Azure Data Lake Storage. Some HDFS users want to extend the HDFS Namenode capacity by configuring Federation of Namenodes.

Systems 60

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

How to configure clients to connect to Apache Kafka Clusters securely – Part 2: LDAP

Cloudera

In the previous post, we talked about Kerberos authentication and explained how to configure a Kafka client to authenticate using Kerberos credentials. In this post we will look into how to configure a Kafka client to authenticate using LDAP, instead of Kerberos. We will not cover the server-side configuration in this article but will add some references to it when required to make the examples clearer.

Kafka 52

The Economic Value of Supply Chain Investments

Teradata

What is the impact of adjusting various supply chain levers on a company's stock price? How do they impact shareholder value? Find out more.

52

How Much Security Is Too Much Security?

Teradata

In these budget-conscious times, how much security is too much security? That depends on how much you value your data. Read more.

Data 52

Medibank

Teradata

Teradata Vantage on AWS transforms private healthcare company to create “Better Health for Better Lives.”

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.